University of Sheffield: Description of the LaSIE System as Used for MUC-6

Gaizauskas, R.; Wakao, T.; Humphreys, K.; Cunningham, H.; Wilks, Y.

Active / Technical Report | Accession Number: ADA636158

Abstract:

The LaSIE (Large Scale Information Extraction) system has been developed at the University of Sheffield as part of an ongoing research effort into information extraction and, more generally, natural language engineering. LaSIE is a single, integrated system that builds up a unified model of a text, which is then used to produce outputs for all four of the MUC-6 tasks. This model may also be used for purposes other than generating MUC-6 results; for example, we currently generate natural language summaries of the MUC-6 scenario results. Put most broadly, and superficially, our approach involves compositionally constructing semantic representations of individual sentences in a text according to semantic rules attached to phrase structure constituents, which have been obtained by syntactic parsing using a corpus-derived context-free grammar. The semantic representations of successive sentences are then integrated into a discourse model which, once the entire text has been processed, may be viewed as a specialisation of the general world model with which the system sets out to process each text. LaSIE has a historical connection with the University of Sussex MUC-5 system [GCE93], from which it derives its approach to world modelling and co-reference resolution and its approach to recombining the fragmented semantic representations that result from partial grammatical coverage. However, the parser and grammar differ significantly from those used in the Sussex system. In its approach to named entity identification, LaSIE borrows to some extent from the approach adopted in the MUC-5 Diderot system [CGJ93]. Virtually all of the code in LaSIE is new and has been developed since January 1995 with about 20 person-months of effort.
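
The two stages the abstract describes (semantic rules attached to phrase structure constituents, then integration of sentence semantics into a discourse model) can be illustrated with a toy sketch. This is not LaSIE's code: the `Node` class, the `RULES` table, the `interpret` and `integrate` functions, and the example sentences are all hypothetical, and co-reference is reduced here to exact name matching.

```python
from dataclasses import dataclass, field

@dataclass
class Node:
    label: str                                    # constituent category, e.g. "S", "NP", "VP"
    children: list = field(default_factory=list)  # empty for leaf (word) nodes
    word: str = ""                                # surface word, set only on leaves

# Hypothetical semantic rules, one per constituent label. Each rule maps the
# semantics of the child constituents to the semantics of the parent.
RULES = {
    "NP": lambda kids: {"entity": " ".join(kids)},
    "VP": lambda kids: {"event": kids[0], "object": kids[1]["entity"]},
    "S": lambda kids: {"event": kids[1]["event"],
                       "subject": kids[0]["entity"],
                       "object": kids[1]["object"]},
}

def interpret(node):
    """Compositionally build the semantics of a parse tree, bottom-up."""
    if not node.children:
        return node.word
    kids = [interpret(child) for child in node.children]
    return RULES[node.label](kids)

def integrate(discourse, sem):
    """Add one sentence's semantics to the discourse model.
    Mentions with the same name are treated as the same entity (an assumption)."""
    for role in ("subject", "object"):
        discourse["entities"].setdefault(sem[role], []).append((sem["event"], role))
    discourse["events"].append(sem)
    return discourse

def sentence(subj, verb, obj):
    """Build the parse tree of a simple subject-verb-object sentence."""
    return Node("S", [
        Node("NP", [Node("N", word=subj)]),
        Node("VP", [Node("V", word=verb), Node("NP", [Node("N", word=obj)])]),
    ])

s1 = sentence("Smith", "joined", "Acme")
s2 = sentence("Smith", "left", "IBM")

discourse = {"entities": {}, "events": []}
for tree in (s1, s2):
    integrate(discourse, interpret(tree))

print(discourse["entities"]["Smith"])
```

After both sentences are processed, the single `Smith` entry in the discourse model carries the events from both sentences, which is the sense in which successive sentence representations are merged into one model of the text.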

Author(s):

Gaizauskas, R. ; Wakao, T. ; Humphreys, K. ; Cunningham, H. ; Wilks, Y.

Author Organization(s):

SHEFFIELD UNIV (UNITED KINGDOM)

Descriptive Note:

Conference paper

Supplementary Note:

Proceedings of the Sixth Message Understanding Conference (MUC-6), 6-8 Nov 1995, Columbia, MD. Sponsored by the Defense Advanced Research Projects Agency.

Pagination:

0015

Security Markings

DOCUMENT & CONTEXTUAL SUMMARY

Distribution:

Approved For Public Release

Distribution Statement:

Approved For Public Release; Distribution Is Unlimited.

RECORD

Collection: TR

Identifying Numbers

Monitor Series:

DARPA

Subject Terms

Joint Capability Areas:

JCA_5_Command and Control; JCA_5.5.2_Task; JCA_5.5_Direct; JCA_1_Force Support; JCA_5.3_Planning; JCA_1.2_Force Preparation; JCA_1.2.3_Educating; JCA_8_Building Partnerships

Modernization Areas:

AI and Machine Learning

Communities of Interest:

Materials and Manufacturing Processes

Descriptor(s):

*INFORMATION RETRIEVAL, *NATURAL LANGUAGE, COMPUTER PROGRAMS, EXTRACTION, GLOBAL, GRAMMARS, INTEGRATED SYSTEMS, MODELS, PARSERS, SCENARIOS, SEMANTICS, SYNTAX, WORDS(LANGUAGE)

Field(s)/Group(s):

Information Science, Linguistics, Computer Programming and Software

Keyword(s):

LASIE(LARGE SCALE INFORMATION EXTRACTION)

Report Date:

1995 Nov 01

Creation Date:

2016 Aug 29