Scenario Customization for Information Extraction
DEFENSE ADVANCED RESEARCH PROJECTS AGENCY ARLINGTON VA
Pagination or Media Count:
Information Extraction IE is an emerging NLP technology, whose function is to process unstructured, natural language text, to locate specific pieces of information, or facts, in the text, and to use these facts to fill a database. IE systems today are commonly based on pattern matching. The core IE engine uses a cascade of sets of patterns of increasing linguistic complexity. Each pattern consists of a regular expression and an associated mapping from syntactic to logical form. The pattern sets are customized for each new topic, as defined by the set of facts to be extracted. Construction of a pattern base for a new topic is recognized as a time-consuming and expensive process--a principal roadblock to wider use of IE technology in the large.
- Information Science