Extracting Dynamic Evidence Networks
Final rept. Sep 2001-Oct 2004
BBN TECHNOLOGIES CAMBRIDGE MA
BBN's primary goal was to dramatically increase the accuracy of evidence extraction. Using a hybrid of statistical learning algorithms and handcrafted patterns, SERIF achieved 93% of human performance in extracting entities, events, and relations, and 96% of human performance in extracting relations given entities and events. A second performance objective was to extract entities that have names at 80% of human performance. This performance was then further improved in the relation extraction work done in 2004. An additional objective was to have a prototype robust enough to extract evidence continually, 24x7, from a daily English news feed. All objectives were achieved. BBN's SERIF system also represents a significant advance for extraction systems in architecture and implementation. Combining general linguistic models trained on preexisting corpora with domain-specific components trained for the particular task allows powerful linguistic analysis tools to be brought to bear on extracting the relations and events of a new domain. The use of propositions as an intermediate step was an important part of this strategy: propositions encapsulate the literal meaning of the text, from which the target relations can then be derived.
- Statistics and Probability