Accession Number:

ADA422048

Title:

Centering Resonance Analysis: A Superior Data Mining Algorithm for Textual Data Streams

Descriptive Note:

Final rept. 1 Sep 2003-29 Feb 2004

Corporate Author:

CRAWDAD TECHNOLOGIES LLC CHANDLER AZ

Report Date:

2004-03-24

Pagination or Media Count:

81

Abstract:

Report developed under STTR contract for topic AF03T011 of STTR Program Solicitation FY 2003. Current knowledge-based intelligence systems do not perform well with streaming media because of performance shortcomings and an inability to work in storage-constrained environments. The purpose of this research was to demonstrate that Centering Resonance Analysis (CRA) provides a superior approach to performing text mining under storage constraints. CRA is a radically different approach to modeling text compared with traditional word-frequency-based approaches. The project demonstrated that a CRA-based approach is superior to word-frequency approaches: up to 15 times better at identifying relevant documents, and with up to 5 times greater precision in topic-tracking experiments. A CRA data structure requires one-third of the space required for raw compressed text, and CRA can execute on a typical desktop computer. Future R&D efforts will focus on commercializing a product with applications to government and commercial business processes.

Subject Categories:

  • Information Science
  • Computer Programming and Software

Distribution Statement:

APPROVED FOR PUBLIC RELEASE