Dissertation Defense

Understanding and Identifying Challenges in Design of Safety-Critical AI Systems

Ekdeep Singh Lubana
WHERE:
1180 Duderstadt

As AI systems proliferate into our society, ensuring their safe and reliable deployment has become exceedingly important. To that end, current regulatory frameworks have grounded themselves in risk regulation, assuming that potential harms from AI systems can be predicted and mitigated. The goal of this dissertation is to challenge this design choice. We first demonstrate that the unpredictable nature of model capabilities—where unexpected behaviors can suddenly emerge—renders preemptive risk assessments inadequate. Second, we discuss limitations of fine-tuning protocols, the current de facto strategy for mitigating vulnerabilities identified in a model. We show that such protocols learn minimal transformations of base capabilities that are insufficient to guarantee safety beyond the data distribution used for fine-tuning. Lastly, we explore how minor input modifications can drastically alter a model's output, relating this behavior to Bayesian hypothesis selection and arguing that establishing safe-use standards for modern, exceedingly open-ended models may be difficult. Overall, the contributions of this dissertation suggest that regulating AI systems requires exploring more nuanced paradigms that go beyond mere risk regulation.

 

Chair: Professor Robert Dick