Design Challenges and Misconceptions in Named Entity Recognition

Authors:

Lev Ratinov and Dan Roth

Abstract:

We analyze some of the fundamental design challenges and misconceptions that underlie the development of an efficient and robust NER system. In particular, we address issues such as the representation of text chunks, the inference approach needed to combine local NER decisions, the sources of prior knowledge and how to use them within an NER system. In the process of comparing several solutions to these challenges we reach some surprising conclusions, as well as develop an NER system that achieves 90.8 F1 score on the CoNLL-2003 NER shared task, the best reported result for this dataset.
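
As an illustration of the text-chunk representation issue mentioned above, the sketch below converts tags from the BIO scheme (Begin, Inside, Outside) to the BILOU scheme (Begin, Inside, Last, Outside, Unit-length). It is a minimal sketch, not the authors' implementation. The function name `bio_to_bilou` and the tag strings such as `B-PER` are assumptions chosen for the example, and the input is assumed to be well-formed BIO.

```python
from typing import List


def bio_to_bilou(tags: List[str]) -> List[str]:
    """Convert a BIO-tagged sequence to BILOU (hypothetical helper).

    Assumes well-formed BIO input: each entity starts with B-<type>,
    continues with I-<type> of the same type, and O marks other tokens.
    """
    bilou = []
    for i, tag in enumerate(tags):
        if tag == "O":
            bilou.append(tag)
            continue
        prefix, label = tag.split("-", 1)
        # Look one tag ahead; treat the end of the sequence as "O".
        next_tag = tags[i + 1] if i + 1 < len(tags) else "O"
        # The entity continues only if the next tag is I- with the same type.
        continues = next_tag == f"I-{label}"
        if prefix == "B":
            # A B- tag with no continuation is a single-token (Unit) entity.
            bilou.append(f"B-{label}" if continues else f"U-{label}")
        else:
            # An I- tag with no continuation is the Last token of its entity.
            bilou.append(f"I-{label}" if continues else f"L-{label}")
    return bilou


# Tokens: Dan Roth works at Illinois
print(bio_to_bilou(["B-PER", "I-PER", "O", "O", "B-ORG"]))
# ['B-PER', 'L-PER', 'O', 'O', 'U-ORG']
```

BILOU marks entity ends and single-token entities explicitly, so a tagger has to learn more label classes but gets sharper boundary information than under BIO.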

Citation:

L. Ratinov and D. Roth, Design Challenges and Misconceptions in Named Entity Recognition. CoNLL (2009)

Bibitem:

@conference{RatinovRo09,
  author = {L. Ratinov and D. Roth},
  title = {Design Challenges and Misconceptions in Named Entity Recognition},
  booktitle = {CoNLL},
  month = {6},
  year = {2009},
  url = {http://cogcomp.cs.illinois.edu/papers/RatinovRo09.pdf},
  funding = {MIAS, SoD, Library},
  projects = {IE},
  comment = {Named entity recognition; information extraction; knowledge resources; word class models; gazetteers; non-local features; global features; inference methods; BIO vs. BILOU; text chunk representation},
}