Communications and Signal Processing Seminar

(CANCELED) Discrete Optimization for Adversarial Attacks on Large Language Models

Start: 2023-11-30T15:30:00-05:00
End: 2023-11-30T17:00:00-05:00
Location: 3427 EECS Building

Zico Kolter
Associate Professor of Computer Science, Carnegie Mellon University
Chief Scientist of AI Research, Bosch Center for AI (BCAI), Pittsburgh Office

WHERE:

3427 EECS Building

WHEN:

Thursday, November 30, 2023 @ 3:30 pm - 5:00 pm
This event is free and open to the public.

Abstract: In this talk, I’ll discuss our recent work on adversarial attacks against public large language models (LLMs), such as ChatGPT and Bard. At a high level, the attacks look for “adversarial suffix” strings that cause these models to ignore their guardrails and answer potentially harmful user queries. This talk will focus specifically on the optimization aspects of this problem, where the task involves a relatively unstructured optimization over discrete objects (the tokens in the adversarial suffix). I will describe the challenges of this problem from an optimization standpoint and present the main features of our method, which combines gradient-based information with greedy search. I will also outline potential future directions for research in such optimization settings and discuss the broader implications for LLM robustness.

Bio: Zico Kolter is an Associate Professor in the Computer Science Department at Carnegie Mellon University, and also serves as chief scientist of AI research for the Bosch Center for Artificial Intelligence. His work spans the intersection of machine learning and optimization, with a large focus on developing more robust and rigorous methods in deep learning. In addition, he has worked in a number of application areas, highlighted by work on sustainability and smart energy systems. He is a recipient of the DARPA Young Faculty Award, a Sloan Fellowship, and best paper awards at NeurIPS, ICML (honorable mention), AISTATS (test of time), IJCAI, KDD, and PESGM.

*** The event will take place in a hybrid format. The location for in-person attendance will be

Faculty Host

Liyue Shen
Assistant Professor, University of Michigan, Electrical and Computer Engineering
