Communications and Signal Processing Seminar
Learning from Multiple Biased Sources
This event is free and open to the public.
Abstract
When high-quality labeled training data are unavailable, an alternative is to learn from training sources that are biased in some way. This talk will cover my group’s recent work on three problems where a learner has access to multiple biased sources. First, we consider the problem of classification given multiple training data sets corrupted by label noise, and describe a weighted empirical risk minimization strategy where the weights are optimized according to the degree of corruption of each source. Second, we consider the sim-to-real problem in reinforcement learning, where the learning agent has access to multiple biased simulators from which it can learn before being deployed in the real world. We present a novel theoretical framework for sim-to-real, and an algorithm whose real-world sample complexity is smaller than what is currently achievable when learning without access to simulators. Finally, we consider the problem of clustering when observations are not i.i.d., but are organized into groups of realizations coming from the same (unknown) cluster. We discuss identifiability of this problem, and present a practical algorithm for inferring the components of nonparametric mixture models from paired observations under very general assumptions on the underlying data distribution (in particular, the mixture components can have substantial overlap).
Biography
Clay Scott received his PhD in Electrical Engineering from Rice University in 2004 and is currently Professor of Electrical Engineering and Computer Science at the University of Michigan. He researches statistical machine learning theory and algorithms, with an emphasis on nonparametric methods for supervised and unsupervised learning. He has also worked on a number of applications stemming from various scientific disciplines, including brain imaging, nuclear threat detection, environmental monitoring, and computational biology. In 2010, he received the CAREER Award from the National Science Foundation.