Abstract - Natural Language Processing ||
Estimating the Partition Function with Random Maximum A-Posteriori Perturbations

TTI-Chicago

 

Learning and inference in complex models drives much of the research in machine learning applications, from computer vision, natural language processing, to computational biology. The inference problem in such cases involves assessing the likelihood of possible structures, whether objects, parsers, or molecular structures. Although it is often feasible to only find the most likely or maximum a-posteriori (MAP) assignment rather than considering all possible assignments, MAP inference is limited when there are other likely assignments. In a fully probabilistic treatment, all possible alternative assignments are considered thus requiring summing over the assignments with their respective weights -- evaluating the partition function -- which is considerably harder. The main surprising result of our work is that MAP inference (maximization) can be used to approximate and bound the partition function (weighted counting). Specifically, we relate the partition function to the expected MAP value of perturbations, thus we are able for the first time to directly use efficient MAP solvers such as graph-cuts in calculating the partition function. This approach leads to a new approximate inference framework, which supports efficient sampling and statistical approximations whenever the MAP assignment can be recovered efficiently. The approach excels in regimes where there are several but not exponentially many prominent assignments. For example, this happens in cases where observations carry strong signals (local evidence) but are also guided by strong consistency constraints (couplings).