Finding low-dimensional structure in messy data
This event is free and open to the public.
To draw inferences from large, high-dimensional datasets, we often seek simple structure that models the phenomena represented in those data. Low-rank linear structure is one of the most flexible and efficient such models, enabling fast prediction, inference, and anomaly detection. However, classical techniques for learning low-rank models assume that the data have only minor corruptions that are uniform over samples. Modern research in optimization has begun to develop new techniques to handle realistic messy data, where data are missing, have wide variations in quality, and/or are observed through nonlinear measurement systems.
In this talk I will give a high-level overview of recent research in this area. Then I will focus on the problem of learning linear subspace structure from multiple data sources of varying quality. This is common in problems like sensor networks or medical imaging, where different measurements of the same phenomenon are taken with sensing of different quality (e.g., high or low radiation). In this context, learning the low-rank structure via PCA suffers from treating all data samples as if they were equally informative. I will discuss our theoretical results on weighted PCA. I will then present new algorithms for the non-convex probabilistic PCA formulation of this problem and a novel SDP relaxation.
Laura Balzano is an associate professor of Electrical Engineering and Computer Science, and of Statistics by courtesy, at the University of Michigan. She is a recipient of the NSF CAREER Award, ARO Young Investigator Award, AFOSR Young Investigator Award, and faculty fellowships from Intel and 3M. She received the Vulcans Education Excellence Award at the University of Michigan. Her main research focus is on modeling with big, messy data (highly incomplete or corrupted data, uncalibrated data, and heterogeneous data) and its applications in a wide range of scientific problems. Her expertise is in statistical signal processing, matrix factorization, and optimization. Laura received a BS from Rice University, an MS from UCLA, and a PhD from the University of Wisconsin in Electrical and Computer Engineering.