Dissertation Defense

User-Centric Machine Learning Systems

Jiachen Liu, Ph.D. Candidate
WHERE:
3941 Beyster Building

Hybrid Event: 3941 BBB / Zoom

Abstract: Over the past five years, artificial intelligence (AI) has evolved from a specialized technology confined to large corporations and research labs into a ubiquitous tool integrated into everyday life. As AI extends beyond niche domains to individual users across diverse contexts, its widespread adoption has created new demands on machine learning (ML) systems: they must balance user-centric experience (real-time responsiveness, accessibility, and personalization) with system efficiency (operational cost and resource utilization). Designing such systems is difficult because AI workloads are diverse, spanning conversational services, collaborative learning, and large-scale training, and because the underlying resources are heterogeneous, ranging from cloud data centers to resource-constrained edge devices. My research addresses these challenges and pursues both objectives through a set of design principles centered on a sophisticated resource scheduler with a server-client co-design paradigm.

Our contributions are threefold. First, we propose Andes to address the critical need for real-time responsiveness in LLM-backed conversational AI by introducing a notion of quality of experience (QoE) tailored to such text-streaming services. Its server-side token-level scheduling algorithm dynamically prioritizes token generation based on user-centric metrics, while a co-designed client-side token buffer smooths the streaming experience. This approach significantly improves user experience during peak demand and achieves substantial GPU resource savings.

Second, we propose Auxo to deliver personalized AI services to a diverse set of end users through scalable collaborative learning. Auxo introduces a novel client-clustering mechanism that adapts to statistical data heterogeneity and resource constraints, complemented by a cohort affinity mechanism that lets clients join preferred groups while preserving privacy. This approach improves personalized model performance across the varying needs and contexts of end users.

Third, we propose Venn to handle the escalating demand for efficient resource sharing in multi-job collaborative learning environments. Its resource scheduler proactively resolves complex resource contention and introduces a novel job offer abstraction that lets client devices identify the jobs they are eligible for based on their local resources. This significantly reduces job completion times and improves resource efficiency.

Organizer

CSE Graduate Programs Office

Faculty Host

Prof. Mosharaf Chowdhury