AI Seminar
Benchmarking LLMs’ Judgments with No Gold Standard
This event is free and open to the public.
Location: BBB 3725
Zoom: https://umich.zoom.us/j/97434198716
Meeting ID: 974 3419 8716
Passcode: aiseminar
Abstract:
With the advent of Large Language Models (LLMs), a key question is how to evaluate the text they produce. This talk will introduce GEM (Generative Estimator for Mutual Information), an evaluation metric for assessing language generation by LLMs, particularly the generation of informative judgments, without the need for a gold standard reference. GEM broadens the scenarios where we can benchmark LLM generation performance, from traditional tasks such as machine translation and summarization, where gold standard references are readily available, to subjective tasks without clear gold standards, such as academic peer review. GEM uses a generative model to estimate the mutual information between candidate and reference responses, without requiring the reference to be a gold standard. This work builds on earlier work on mechanisms for information elicitation. Although NLG evaluation may not seem related to mechanism design, this talk will make the connection clear.
Bio:
Grant Schoenebeck is an associate professor at the University of Michigan in the School of Information. His work has recently focused on developing and analyzing systems for eliciting and aggregating information from a diverse group of agents with varying information, interests, and abilities by combining ideas from theoretical computer science, machine learning, and economics (e.g., game theory, mechanism design, and information design). More generally, his recent work has been about incentives and (machine) learning in a variety of contexts. His research is supported by the NSF, including an NSF CAREER award. Before coming to the University of Michigan in 2012, he was a Postdoctoral Research Fellow at Princeton. Grant received his PhD at UC Berkeley, studied theology at Oxford University, and received his BA in mathematics and computer science from Harvard.