Natural Language Processing Seminar
Evaluation Algorithms for Extractive Summaries
We start by observing that a summary of the same document can be quite different when written by different humans. This makes comparing a machine-generated summary against a set of human-written ones an interesting problem, for which we discuss a new methodology based on weighted relatedness to reference summaries, normalized by the relatedness of the reference summaries among themselves. Comparing two summaries is also sensitive to their lengths and to the length of the document they are extracted from. To address this, we argue that the overlap between two summaries should be compared against the average intersection size of random sets. Further challenges come from comparing human-written abstractive summaries to machine-generated extractive ones. We discuss a flexible evaluation mechanism using semantic equivalence relations derived from WordNet and word2vec. We conclude with an overview of our ongoing work on building a data repository based on scientific documents, where author-written summaries provide a baseline for the evaluation of computer-generated ones.
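To make the two normalizations mentioned above concrete, the Python sketch below treats an extractive summary as a set of selected sentence indices, compares raw overlap against the expected intersection size of two uniformly random sets of the same sizes, and normalizes a candidate's average relatedness to the reference summaries by the average pairwise relatedness of the references among themselves. The function names, and the use of chance-adjusted set overlap as the relatedness measure, are illustrative assumptions rather than the exact formulas presented in the talk.

```python
from itertools import combinations

def expected_random_overlap(m, n, doc_size):
    # Expected |A ∩ B| when A (size m) and B (size n) are drawn uniformly
    # at random from a document with doc_size sentences: by linearity of
    # expectation this is m * n / doc_size.
    return m * n / doc_size

def chance_adjusted_overlap(candidate, reference, doc_size):
    # Raw overlap divided by the chance-level overlap of random sets of the
    # same sizes; a value near 1.0 means "no better than random selection".
    baseline = expected_random_overlap(len(candidate), len(reference), doc_size)
    return len(candidate & reference) / baseline if baseline else 0.0

def reference_normalized_score(candidate, references, doc_size):
    # Average relatedness of the candidate to each reference summary,
    # normalized by the average pairwise relatedness of the references
    # among themselves, so 1.0 means "as related to the references as
    # they are to each other".
    to_refs = sum(chance_adjusted_overlap(candidate, r, doc_size)
                  for r in references) / len(references)
    pairs = list(combinations(references, 2))
    among_refs = (sum(chance_adjusted_overlap(a, b, doc_size) for a, b in pairs)
                  / len(pairs)) if pairs else 1.0
    return to_refs / among_refs if among_refs else 0.0

# Usage: a 40-sentence document, two human references and one candidate,
# all given as sets of selected sentence indices.
refs = [{0, 3, 7, 12, 20}, {0, 3, 8, 12, 25}]
cand = {0, 3, 7, 12, 30}
print(reference_normalized_score(cand, refs, doc_size=40))
```

The chance baseline is what absorbs the length sensitivity noted in the abstract: longer summaries from the same document overlap more often by accident, and dividing by the expected random intersection discounts exactly that effect.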
This is joint work with Fahmida Hamid and David Haraburda.
Paul Tarau is a Professor of Computer Science at the University of North Texas. He holds a PhD in Computer Science from the University of Montreal and is the author of more than 140 peer-reviewed papers in several fields, including Logic Programming, Natural Language Processing, Theoretical Computer Science, and Computational Mathematics. He is the developer of the open-source Prolog systems BinProlog, Jinn, and Styla, as well as of several tree-based numbering systems supporting computations with giant numbers, and a co-author of the TextRank family of algorithms widely used in Graph-based Natural Language Processing.