
Faculty Candidate Seminar
Adventures in Feature Space: Vector Similarity Measures for Machine Learning
This event is free and open to the public.

Zoom link for remote participants
Abstract: Across eras and applications of machine learning, we see an enduring theme of transforming data so that data points (such as words or images) with meaningful similarities are mapped to nearby coordinates in a low-dimensional feature space. But how do we mathematically express similarity in feature space to make predictions, or learn more effective representations of data? This talk looks at a humble but widely used operation, the dot product, and its relative, the cosine similarity, through the lenses of algebra, geometry, and code. We will also consider an application to semantic search with a pre-trained sentence transformer model. The talk will end with time for questions as well as a brief overview of Dr. O’Brien’s research on how programmers in scientific research use large language models as code assistants.
Bio: Elle O’Brien, Ph.D., is a lecturer and research investigator at the University of Michigan School of Information, where she teaches graduate courses about statistics and data science. Her research program asks how scientists who program use large language models as coding assistants and how they validate generated code. Previously, she studied auditory neuroscience using mathematical models of sensory perception at the University of Washington and worked as a developer advocate for open-source tools like Data Version Control.