

The objective of DARPA's Machine Reading Program (MRP) is to create a universal Reading System that can take any natural text and any reasoning context as input and can effectively apply the knowledge contained in that text in that reasoning context. The intention is to combine natural language processing (NLP) and Artificial Intelligence (AI) reasoning into a new technology that provides the benefits of both.
The Bootstrapped Learning (BL) program seeks to make instructable computing a reality. The "electronic student" will learn from a human teacher who uses spoken language, gestures, demonstration, and many other methods one would find in a human-mentored relationship. Furthermore, it will build upon learned concepts and apply that knowledge across different fields of study. This research is performed under sub-contract with the Stanford Research Institute.

The goal of the Multimodal Information Access and Synthesis Center (MIAS) at UIUC is to develop the fundamental theories, computational models, algorithms, and tools for analysts to access a variety of data formats and models, to integrate them with existing resources, and to transform raw data into useful and understandable information, in support of productive and efficient analysis. We aim to extend the state of the art and develop new technologies for: (1) Focused data retrieval and integration, to identify and collect relevant data from multiple modalities; (2) Semantic data enrichment, to allow navigation and search across disparate data modalities and augment knowledge bases by inferring semantics from unstructured data and images; (3) Entity identification and relationship discovery, to identify real-world entities and relate them to existing institutional resources; (4) Knowledge discovery and hypotheses generation and verification, to construct the rich semantic structure and hidden networks of entity linkages; and (5) Fundamental machine learning, database and data mining, natural language processing, and computer vision techniques required for and driven by the aforementioned problems.

IARPA’s Foresight and Understanding from Scientific Exposition (FUSE) Program seeks to develop automated methods that aid in the systematic, continuous, and comprehensive assessment of technical emergence using publicly available information found in published scientific, technical and patent literature.

Summary: A significant amount of the software written today interacts with naturally occurring (sensor) data such as text, speech, images and video, streams of financial data, and biological sequences. Frequently there exists a need to reason with respect to concepts that are complex and often difficult to define explicitly in terms of the raw data observed. Examples include determining the gender of a person in an image; determining the topic of an article; determining the role of a noun phrase in a sentence; determining whether more than three people are currently meeting in someone's office; or scheduling a computation in a grid in a way that adapts to a multitude of properties of the resources and links. Applications that require such abilities are expected to rapidly grow even more important in future years.
The Research Experiences for Undergraduates (REU) program grant, funded by the National Science Foundation, supports active research participation by undergraduate students in computer science. REU students generally participate in projects listed under the Learning Based Programming funding summary (NSF SoD-HCER-0613885).
A fundamental task in sentence comprehension involves assigning semantic roles to sentence constituents, thus determining "who does what to whom." The syntactic bootstrapping theory proposes that even very young children use precursors of the adult's knowledge of syntax to accomplish this task. The project combines experimental and computational approaches to test and refine this theory.

A fundamental task in sentence comprehension is to assign semantic roles to sentence constituents. The structure-mapping account proposes that children start with a shallow structural analysis of sentences: children treat the number of nouns in the sentence as a cue to its semantic predicate-argument structure, and represent language experience in an abstract format that permits rapid generalization to new verbs. In this project, we test the consequences of these representational assumptions via experiments with a system for automatic semantic role labeling (SRL), trained on a sample of child-directed speech.
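
The noun-counting cue described above can be illustrated with a toy sketch. This is not the project's SRL system: the tiny noun lexicon, the role labels, and the rule that the first of two nouns is the agent are all assumptions made here for illustration.

```python
# Toy illustration only. The lexicon, the role names, and the
# "first noun is the agent" rule are assumptions, not the project's model.
HYPOTHETICAL_NOUNS = {"girl", "boy", "dog", "ball", "mommy"}


def assign_roles_by_noun_count(sentence):
    """Assign coarse roles using only the number and order of nouns."""
    tokens = sentence.lower().strip(".!?").split()
    nouns = [t for t in tokens if t in HYPOTHETICAL_NOUNS]
    if len(nouns) >= 2:
        # Two nouns: treat the sentence as transitive.
        return [(nouns[0], "agent"), (nouns[1], "patient")]
    if len(nouns) == 1:
        # One noun: treat the sentence as intransitive.
        return [(nouns[0], "agent")]
    return []


# The verb is a nonsense word, so the guess cannot rely on verb knowledge.
print(assign_roles_by_noun_count("The girl is gorping the boy."))
print(assign_roles_by_noun_count("The girl is gorping."))
```

Because the rule never looks at the verb, it generalizes to verbs it has not seen, which is the property the structure-mapping account attributes to early learners.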

Effective tactical decision-making requires a quick and accurate appreciation of the current and near-future situation. In today's highly interconnected world, data relevant to situational awareness is plentiful. The objective of this project is to study a unified inference framework that can take as input models that correspond to disparate sources and modalities. The intention is to study both how to learn good models from different sources with different kinds of associated uncertainty, and how to combine these into a coherent decision, taking into account characteristics of the data as well as of its source.

This grant addresses issues relating to network science, along with associated issues of massive data handling, large-scale information mining, and the fast processing needed for rapid analysis.

SHARPS aims to develop techniques to reduce security and privacy risks that pose barriers to the meaningful use of health information technology. Areas of focus include electronic health records, health information exchanges, and telemedicine. The project includes . . .

A key recent educational priority has been to use summative assessment as an incentive to, and for measurement of, learner progress and school improvement. It is widely agreed, however, that this should be supplemented by formative assessment, or assessment for screening, monitoring and diagnosis. This project aims to tackle the challenge of effective formative assessment in the area of writing. Project goals are to: 1) develop the Assess-As-You-Go Writing Assistant using an interdisciplinary team of computer scientists, measurement specialists, content area experts, and educational practitioners and 2) refine a prototype through field tests.

For some of our projects, as noted in the project descriptions, we have collaborative funding arrangements with other Departments of the University of Illinois.

Trustworthiness of information is a problem our society needs to address. As computer scientists, we have to begin developing a principled approach to this difficult problem. There are multiple challenges here, from the problem of retrieving relevant evidence, to that of dealing with conflicting claims, to developing a level of trust that agrees with one’s subjective beliefs and common sense knowledge.
Extending from our other research activities in textual entailment, the objective of this Google-sponsored project is to investigate and develop advanced learning and reasoning technologies in support of natural language understanding-related tasks.

The goal is to develop a research program in the area of Cyber Analytics. This will allow us to join expertise in the text and information extraction area (Roth) with expertise in data mining (Han). It will also build on existing efforts and collaborations between UIUC and Boeing in the area of Entity Recognition and Tracking, Information Extraction, and Moving Object Data Mining . . .
This project is building a capability for identifying text fragments that exhibit a set of specifiable semantic properties in large text corpora. The objective is to investigate and develop advanced learning and reasoning technologies in support of natural language understanding-related tasks. We will develop an approach to focused textual entailment in the context of text anonymizing. The research is investigating both a novel inference method for focused textual entailment, and methods for acquiring appropriate declarative knowledge required to support this inference. We will apply and evaluate it in the context of anonymizing text snippets with respect to specific goals.

Current tools to help write "correctly" are very limited, and don't extend much beyond automatic spell checking against a dictionary. We are developing an authoring assistance tool that will help writers identify and correct mistakes that a spell checker cannot catch.

Dash Optimization has generously provided the XPress-MP Optimization Suite, which has served as an advanced optimization tool for the research in our group.

The ECHO Depository is a digital preservation research and development project funded by the Library of Congress under their National Digital Information Infrastructure and Preservation Program. The ECHO Depository project pulls together several streams of activities aimed at helping to answer the question of how digital resources will be identified, archived, and preserved for the future.

This research seeks to develop an integrated view - theoretical understanding, algorithms development and experimental evaluation - for learning coherent concepts. These are learning scenarios that are common in cognitive learning - where multiple learner
Recent advances in Natural Language Processing, in particular the ability to use unstructured data to answer natural language questions, are very exciting from an educational perspective. They offer the promise of systems that can automatically respond to students' questions, thus supporting not only a guided but also an open-ended, exploration-based approach to learning.
The goal of this project is to apply research in Computer Science -- particularly Natural Language Processing -- and the Learning Sciences, to developing an intelligent tutor that can provide the right kind of environment for students, one that facilitates rather than inhibits inquiry through a known knowledge space and provides a jumping-off space for trying to find or generate new knowledge.
The testbed domain in this project involves high school and undergraduate level students studying concepts in BioInformatics.
This project enables a "grid" to be used in the following four research projects: (1) Advanced Programming Environments for Cluster and Grids, (2) Parallel Applications for Clusters and Grids, (3) Dynamic Sequential Code Optimization, and (4) Architectures for Multimedia and Communications Applications.
The ability to speak and understand language is probably the most intricate skill that people possess. It is certainly our most uniquely human ability. This project investigates how such an important skill is acquired and continues to develop throughout o

The project studies a machine learning centered approach to data-intensive and computing-intensive processing for intelligent context-sensitive human-machine interfaces. The future of intelligent human-machine interaction is in the ability to perform co