Themes

Our research focuses on the computational foundations of intelligent behavior. We develop theories and systems pertaining to intelligent behavior using a unified methodology -- at the heart of which is the idea that learning has a central role in intelligence.

We attempt to understand the role of learning in supporting intelligent inference and use this understanding to develop systems that learn and make intelligent inferences in complex domains. Such systems would acquire the bulk of their knowledge from raw, real-world data and behave robustly when presented with new, previously unseen situations. These systems can be studied at various levels of abstraction and from various viewpoints. We have concentrated on developing a theoretical basis for addressing some of the obstacles, and on developing an experimental paradigm in which experiments that are realistic in scale and resources can be performed to validate that theory. Our main area of concentration has been natural language understanding and intelligent access to textual information.

Our work spans several aspects of this problem -- from foundational questions in learning, knowledge representation and reasoning, to experimental paradigms and large-scale system development -- and draws on methods from theoretical computer science, probability and statistics, artificial intelligence, linguistics and experimental computer science. We are driven both by longer-term goals to understand and develop capabilities for natural language comprehension, and by challenging shorter-term applications in the area of information extraction and knowledge access.

In terms of "traditional" research areas, our work falls mostly into Natural Language Processing and Machine Learning, but also into Learning Theory and Knowledge Representation and Reasoning. In the last few years we have been developing the following interrelated lines of work, all of which build on the idea that learning is of prime importance in performing knowledge-intensive inferences, and all of which use the natural language domain as the main application area.

Theme 1:  Learning in Natural Language  

We have addressed several key problems in natural language from a unified point of view and developed both a theoretical understanding of several fundamental issues and an experimental paradigm that builds on this view. In particular, we have moved from studying stand-alone natural language prediction problems to higher-level problems such as semantic parsing and textual entailment. These efforts have also driven the development of several successful systems and natural language processing tools.

Theme 2:  Learning and Inference  

Continuing our long-standing line of work on integrating theories of learning and inference, we have developed an integrated learning and inference approach in the context of natural language processing. It addresses natural language problems that are global, in that multiple interrelated components at several levels of abstraction affect the inference.
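As a rough illustration of what "global" means here, the sketch below combines the scores of independently learned local predictors and picks the best joint assignment that satisfies a consistency constraint. It is a minimal toy under stated assumptions, not the group's actual formulation or system: the function name, the labels, the scores and the constraint are all hypothetical, and the exhaustive search stands in for whatever optimization method a real system would use.

```python
from itertools import product

def constrained_inference(scores, is_consistent):
    """Return the highest-scoring joint assignment that satisfies a constraint.

    scores: list of dicts, one per component, mapping label -> local score
            (hypothetical outputs of separately learned classifiers).
    is_consistent: function taking a tuple of labels and returning a bool.
    """
    best, best_score = None, float("-inf")
    # Enumerate every joint labeling; fine for a toy, exponential in general.
    for assignment in product(*(s.keys() for s in scores)):
        if not is_consistent(assignment):
            continue
        total = sum(s[label] for s, label in zip(scores, assignment))
        if total > best_score:
            best, best_score = assignment, total
    return best, best_score

# Hypothetical local scores for two entity mentions and the relation between them.
scores = [
    {"person": 0.6, "location": 0.4},     # first entity
    {"person": 0.55, "location": 0.45},   # second entity
    {"born_in": 0.8, "spouse_of": 0.2},   # relation
]

def is_consistent(assignment):
    """Illustrative constraint tying relation labels to entity types."""
    first, second, relation = assignment
    if relation == "born_in":
        return first == "person" and second == "location"
    if relation == "spouse_of":
        return first == "person" and second == "person"
    return True

# Each component's best label taken in isolation.
local_choice = tuple(max(s, key=s.get) for s in scores)
joint_choice, joint_score = constrained_inference(scores, is_consistent)

print("local:", local_choice, "consistent:", is_consistent(local_choice))
print("joint:", joint_choice)
```

In this toy, each component's individually best label yields an inconsistent combination, while the joint search revises the second entity's label so that the entity types and the relation agree.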

Theme 3:  Intelligent Information Access  

Developing better methods to access knowledge represented in text is one of the forces driving the integrated learning and inference paradigm above; we have addressed multiple fundamental information access problems from this perspective.

Theme 4:  Knowledge Representation and Inference  

We study intermediate knowledge representations that facilitate learning and inference in complex domains. We have also developed novel inference formulations and algorithms for complex probabilistic and relational representations including, for the first time, a relational inference algorithm for probabilistic representations.

Theme 5:  Learning Theory  

Much of the research described above builds on our work in learning theory and drives the foundational learning theory questions we address. Over the last few years we have addressed learning problems with constrained output, ranking problems and kernels over structures.

Perhaps the most significant aspect of this research program is that it helps place theoretical work in learning and inference in the context of realistic inference problems. It thus contributes to a shift in focus in the natural language research community: a move to more advanced learning paradigms and to work on higher-level problems. Along with developing the theoretical foundations, we have contributed to a practical approach to building large-scale, learning-centered intelligent systems and NLP tools, and have developed advanced tools that are used by many researchers and in industry.