Abstract - Natural Language Processing
A Dataset for the Machine Comprehension of Text, and some Preliminary Results

Chris Burges

Microsoft Research

 

Abstract
I will describe MCTest, a freely available set of stories and associated questions intended for research on the machine comprehension of text. I will also describe our current work on this and other datasets. It seems likely that any complete semantic model of text will require some form of recursion. As a test bed for these ideas, we are investigating the use of selective classifiers to improve part-of-speech tagging accuracy, and I will show some early results along these lines.

Bio:
Chris Burges received his PhD in theoretical physics, on constraints on new particle mass scales and on models of early universe cosmology, at Brandeis University in 1984. He then spent two years as a postdoc in the theoretical physics department at MIT, where he worked on supersymmetry in anti-de Sitter space, cellular automata models of thermal fluids, and the gravitational Aharonov-Bohm effect. After the arrival of his family's first baby, he rather abruptly switched to a more family-friendly field and became a systems engineer for AT&T Bell Labs. There he worked on network performance and routing: AT&T still uses his algorithms to route its CCS7 signaling network (the nervous system of the long-distance network). When he saw a cool demo of neural networks reading handwritten digits, he switched fields again, and began his long descent into machine learning. He has worked on handwriting and machine print recognition (including a system now used to read millions of checks daily, and ZIP code and handwritten address recognition for the USPS), support vector machines, audio fingerprinting (his work is currently used in Xbox and Windows Media Player to identify music), speaker verification, information retrieval, and semantic modeling. His ranking algorithm is currently used in Bing for ranking web search results. Chris was program co-chair of Neural Information Processing Systems (NIPS) 2012 and general co-chair of NIPS 2013. His main current research interest is modeling meaning in text.