Accession Number : ADA437173


Title : An Integrated Suite of Text and Data Mining Tools - Phase II


Descriptive Note : Final rept. 22 May 2002-30 Aug 2005


Corporate Author : SEARCH TECHNOLOGY INC NORCROSS GA


Personal Author(s) : Frey, Paul R ; Minsk, Brian S ; Porter, Alan L


Full Text : https://apps.dtic.mil/dtic/tr/fulltext/u2/a437173.pdf


Report Date : 30 Aug 2005


Pagination or Media Count : 67


Abstract : This report summarizes the results of a three-year SBIR project to develop an integrated suite of text and data mining tools. The goal of this project is to provide tools that can help analysts find connections between requirements (as expressed in requirements documents or databases) and open-source research literature. An overall approach is outlined, and a step-by-step overview of the work is presented. The tool suite includes parsers for text data sources, metadata extraction, record combining, entity extraction, data normalization, sub- and cross-dataset analysis, multi-field analysis and visualizations, feature selection, XML importers, and indirect link analysis. A set of recommendations for expanding the use of the tools is presented.


Descriptors : *DATA BASES , *INFORMATION RETRIEVAL , *TEXT PROCESSING , *SOFTWARE TOOLS , *METADATA , MODELS , COMPUTER GRAPHICS , PARSERS , SCIENTIFIC LITERATURE , ANALYSTS , DATA FUSION , PROGRAMMING LANGUAGES , INTEGRATION


Subject Categories : Information Science
      Computer Programming and Software


Distribution Statement : APPROVED FOR PUBLIC RELEASE