
AI Seminar | Natural Language Processing Seminar
Executable and Trustworthy Planning with Large Language Models
This event is free and open to the public.

Abstract: While large language models (LLMs) can provide decent instructions, they are far from able to come up with an executable and trustworthy plan for a particular user or agent, grounded in their specific situation and needs. To address this, I advocate for the methodology of using an LLM as a code generator to create a formal representation of the planning environment. In conjunction with tools from classical AI planning, a plan can then be found deterministically and faithfully. In this talk, I will discuss two strands of work. The first tackles fully observed planning domains, where the model is given complete information and must propose a complete plan that satisfies given constraints. The second tackles partially observed planning domains, where the model makes partial observations about the environment, proposes partial plans, and iteratively acquires knowledge to complete a task. In both settings, we show that state-of-the-art models like DeepSeek-R1 and gpt-4o are heavily challenged by even the simplest tasks, like rearranging or looking for objects. When prompted to generate planning domain definition language (PDDL) as input to a solver, LLMs perform better than when generating the plans directly. Even so, both syntactic and semantic errors point to LLMs’ weakened ability to generate formal representations, especially when the language or domain is underrepresented in their pre-training.
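To make the pipeline in the abstract concrete, here is a minimal sketch of the solver side only. It assumes a STRIPS-style encoding in which states are sets of ground facts and each action has preconditions, add effects, and delete effects. The `Action` type, the `plan` function, and the toy cup domain are hypothetical illustrations, not material from the talk; in the approach described, an LLM would generate the formal representation in PDDL and an off-the-shelf planner would do the search.

```python
from collections import deque
from typing import NamedTuple


class Action(NamedTuple):
    """A ground STRIPS-style action (hypothetical stand-in for a PDDL action)."""
    name: str
    preconditions: frozenset
    add_effects: frozenset
    delete_effects: frozenset


def plan(initial, goal, actions):
    """Breadth-first search over fact sets.

    Returns a shortest list of action names that reaches the goal,
    or None if the goal is unreachable.
    """
    start = frozenset(initial)
    goal = frozenset(goal)
    frontier = deque([(start, [])])
    visited = {start}
    while frontier:
        state, steps = frontier.popleft()
        if goal <= state:  # every goal fact holds in this state
            return steps
        for action in actions:
            if action.preconditions <= state:
                next_state = (state - action.delete_effects) | action.add_effects
                if next_state not in visited:
                    visited.add(next_state)
                    frontier.append((next_state, steps + [action.name]))
    return None


# Hypothetical toy domain: move a cup from the table to the shelf.
actions = [
    Action(
        name="pick-up-cup",
        preconditions=frozenset({"cup-on-table", "hand-empty"}),
        add_effects=frozenset({"holding-cup"}),
        delete_effects=frozenset({"cup-on-table", "hand-empty"}),
    ),
    Action(
        name="place-cup-on-shelf",
        preconditions=frozenset({"holding-cup"}),
        add_effects=frozenset({"cup-on-shelf", "hand-empty"}),
        delete_effects=frozenset({"holding-cup"}),
    ),
]
initial = {"cup-on-table", "hand-empty"}
goal = {"cup-on-shelf"}

print(plan(initial, goal, actions))  # ['pick-up-cup', 'place-cup-on-shelf']
```

Because the search is deterministic and checks every precondition, any plan it returns is executable with respect to the given representation. The remaining risk lies in whether that representation is syntactically valid and semantically faithful to the real environment, which is where the abstract locates the LLM errors.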
Biography: Li “Harry” Zhang is an assistant professor at Drexel University, focusing on natural language processing (NLP) and artificial intelligence (AI). He obtained his PhD from the University of Pennsylvania, advised by Prof. Chris Callison-Burch. Before that, he obtained his Bachelor’s degree at the University of Michigan, mentored by Prof. Rada Mihalcea and Prof. Dragomir Radev. His current research uses large language models (LLMs) to reason and plan in an executable and trustworthy manner via symbolic and structured representations. He has published more than 20 peer-reviewed papers at NLP and AI conferences such as ACL, EMNLP, and NAACL, which have been cited more than 1,000 times. He also regularly serves as Area Chair, Session Chair, and reviewer for those venues. As a musician, producer, and content creator with over 50,000 subscribers, he is also passionate about research on AI music and creativity.