Artificial Intelligence and Information Systems Seminar


Michael White

Towards High-Quality Paraphrasing with CCG


Research on automatic paraphrase generation has been gaining steam in recent years. Or in other words, research on generating paraphrases automatically has seen increasing progress lately. In this talk, I'll present our initial efforts on a project whose aim is to learn to generate such paraphrases using Combinatory Categorial Grammar (CCG), and to use this technology to improve MT evaluation. In the first part of the talk, I'll show how the OpenCCG surface realizer has been extended to efficiently generate paraphrases from disjunctive logical forms, or packed semantic representations, and sketch how this method enhances the prospects for learning to generate paraphrases with a reversible lexicalized grammar. In the second part of the talk, I'll describe our progress in scaling up OpenCCG for broad-coverage realization using grammars extracted from an enhanced version of the CCGbank. In particular, I'll (1) discuss the importance of our method of supertagging, or lexical category prediction, for efficient surface realization, and (2) show how our efforts to improve the CCGbank by (i) more accurately projecting Propbank roles onto it and (ii) refining the analysis of punctuation have paid off in improved realization quality, as measured by BLEU scores and exact matches.

The talk will in part cover joint work with Stephen Boxwell, Dominic Espinosa, Scott Martin, Dennis Mehay, and Rajakrishnan Rajkumar.