Accession Number:

ADA464116

Title:

Deep Versus Broad Methods for Automatic Extraction of Intelligence Information From Text

Descriptive Note:

Conference paper

Corporate Author:

NAVAL POSTGRADUATE SCHOOL MONTEREY CA

Report Date:

2005-06-01

Pagination or Media Count:

42

Abstract:

Extraction of intelligence from text data is increasingly automated as software and network technology increases in speed and scope. However, enormous amounts of text data are often available, and one must carefully design a data mining strategy to obtain the relevant nuggets of gold from the mountains of useless dross. Two strategies can be tried. A deep approach uses a few strong clues to find reasonable sentence candidates, then applies linguistic restrictions to find and extract key information, if any, surrounding the candidates. A broad approach focuses on large numbers of weaker clues, such as specific words, whose implications can be combined to rate sentences and present those with a high likelihood of relevance. In the work reported here, we tested the deep approach on military intelligence reports about enemy positions, which were relatively short text extracts, and we tested the broad approach on news stories from the World Wide Web involving terrorism, which presented a large volume of text information.
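
The two strategies described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the clue words, their weights, the threshold, the regular expressions, and all function names below are hypothetical choices made only to show the general shape of each approach. The deep routine looks for one strong clue ("enemy" or "hostile") and then applies a strict pattern to pull out a unit and a place; the broad routine sums many weak word weights to rate each sentence.

```python
import re

# Hypothetical weak clues with hand-picked weights (not taken from the paper).
CLUE_WEIGHTS = {
    "bomb": 2.0, "attack": 1.5, "hostage": 2.0,
    "militant": 1.5, "explosion": 1.5, "weather": -1.0,
}

# Hypothetical strong clue and restriction pattern for the deep approach.
STRONG_CLUE = re.compile(r"\b(?:enemy|hostile)\b", re.IGNORECASE)
POSITION = re.compile(
    r"(?:[Ee]nemy|[Hh]ostile)\s+(?P<unit>[a-z]+)\s+"
    r"(?:observed\s+|spotted\s+)?(?:at|near)\s+"
    r"(?P<place>[A-Z][a-z]+(?:\s[A-Z][a-z]+)*)"
)


def split_sentences(text):
    """Naive sentence splitter: break after ., ! or ? followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def broad_score(sentence, weights=CLUE_WEIGHTS):
    """Broad approach: add up the weights of all weak clue words present."""
    words = re.findall(r"[a-z]+", sentence.lower())
    return sum(weights.get(w, 0.0) for w in words)


def rank_sentences(text, threshold=1.0):
    """Return (score, sentence) pairs at or above threshold, best first."""
    scored = [(broad_score(s), s) for s in split_sentences(text)]
    return sorted((p for p in scored if p[0] >= threshold), reverse=True)


def deep_extract(text):
    """Deep approach: strong clue selects candidates, strict pattern extracts."""
    results = []
    for sentence in split_sentences(text):
        if not STRONG_CLUE.search(sentence):
            continue  # no strong clue, so not a candidate
        match = POSITION.search(sentence)
        if match:  # candidate also satisfies the restriction pattern
            results.append({
                "unit": match.group("unit"),
                "place": match.group("place"),
                "sentence": sentence,
            })
    return results
```

In this sketch the deep routine returns nothing for a sentence that contains the strong clue but does not fit the pattern, which mirrors the "if any" qualifier in the abstract. The broad routine never rejects a sentence on structure; it only ranks by accumulated evidence.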

Subject Categories:

  • Information Science
  • Linguistics
  • Military Intelligence

Distribution Statement:

APPROVED FOR PUBLIC RELEASE