
Faculty Candidate Seminar

Aligning AI Agents with an Ideal

Silviu Pitis, Ph.D. Candidate, University of Toronto
WHERE:
3725 Beyster Building

Zoom link for remote attendees

Meeting ID: 999 4631 3276 Passcode: 123123

Abstract: AI is becoming increasingly impactful and general purpose, to the point where the intended behaviors of current and future systems have become controversial and difficult to specify. In this talk, I will discuss how we can approach this problem by reasoning about the goals of an “ideal” AI agent. I will first present an axiomatic foundation for ideal reward functions, and then use that to motivate two complementary approaches to reducing uncertainty about general purpose goals: alignment via prediction, which evaluates present actions according to their future effects, and alignment via inference, which seeks to clarify underspecified objectives. I will highlight how these approaches may be used to steer frontier AI systems, including large language models, toward less ambiguous, better understood, and more socially beneficial outcomes.

Bio: Silviu Pitis recently received his PhD from the University of Toronto, where he was advised by Jimmy Ba and continues work funded by an OpenAI Superalignment Grant. He is a graduate affiliate at the Schwartz Reisman Institute for Technology and Society and was previously a student researcher at Microsoft Research Montreal. His research broadly focuses on the design, evaluation, and alignment of general purpose AI agents, and applies methods from reinforcement learning, language modeling, and decision theory. He holds a Master's in CS from the Georgia Institute of Technology, a JD from Harvard Law School, and a BBA from the Schulich School of Business.

Organizer

Stephanie Jones

Faculty Host

Maggie Makar