Centering Resonance Analysis: A Superior Data Mining Algorithm for Textual Data Streams
Final rept. 1 Sep 2003-29 Feb 2004
CRAWDAD TECHNOLOGIES LLC CHANDLER AZ
Report developed under STTR contract for topic AF03T011 of STTR Program Solicitation FY 2003. Current knowledge-based intelligence systems do not perform well with streaming media because of performance shortcomings and an inability to work in storage-constrained environments. The purpose of this research was to demonstrate that Centering Resonance Analysis (CRA) provides a superior approach to performing text mining under storage constraints. CRA is a radically different approach to modeling text compared to traditional word frequency-based approaches. The project demonstrated that a CRA-based approach outperforms word frequency-based approaches: it was up to 15 times better at identifying relevant documents and achieved up to 5 times greater precision in topic tracking experiments. A CRA data structure requires one-third of the space required for raw compressed text, and the approach can execute on a typical desktop computer. Future R&D efforts will focus on commercializing a product with applications to government and commercial business processes.
- Information Science
- Computer Programming and Software
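
The abstract contrasts CRA with word frequency-based modeling but does not describe the algorithm itself. The sketch below is only a rough illustration of that contrast, based on the generally published description of CRA as a word network in which a word's influence is measured by its betweenness centrality. It is not the implementation used in this project. The stop list, the function names, and the use of simple word adjacency in place of noun-phrase parsing are all simplifying assumptions made here.

```python
from collections import Counter, deque

# Hypothetical stop list (an assumption); published CRA instead selects
# words from noun phrases, which this sketch does not attempt.
STOPWORDS = {"the", "a", "an", "of", "and", "in", "to", "is", "on", "for"}


def tokens(text):
    """Lowercase the text, strip basic punctuation, drop stop words."""
    words = [w.strip(".,;:!?").lower() for w in text.split()]
    return [w for w in words if w and w not in STOPWORDS]


def word_network(text):
    """Build an undirected graph linking each retained word to the next."""
    toks = tokens(text)
    graph = {t: set() for t in toks}
    for a, b in zip(toks, toks[1:]):
        if a != b:
            graph[a].add(b)
            graph[b].add(a)
    return graph


def betweenness(graph):
    """Brandes' algorithm for an unweighted, undirected graph."""
    score = dict.fromkeys(graph, 0.0)
    for s in graph:
        stack = []
        preds = {v: [] for v in graph}
        sigma = dict.fromkeys(graph, 0)   # number of shortest paths from s
        dist = dict.fromkeys(graph, -1)   # BFS distance from s
        sigma[s], dist[s] = 1, 0
        queue = deque([s])
        while queue:
            v = queue.popleft()
            stack.append(v)
            for w in graph[v]:
                if dist[w] < 0:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        delta = dict.fromkeys(graph, 0.0)
        while stack:                      # accumulate in reverse BFS order
            w = stack.pop()
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != s:
                score[w] += delta[w]
    # Each unordered pair was counted from both endpoints.
    return {v: c / 2 for v, c in score.items()}


if __name__ == "__main__":
    text = "The alpha of the beta and gamma."
    print("frequency:", Counter(tokens(text)))
    print("influence:", betweenness(word_network(text)))
```

In this toy input every retained word occurs once, so a frequency count cannot distinguish them, while the network view singles out the word that connects the other two. A stored network of this kind keeps only distinct words and their links, which is one plausible reading of why the abstract reports a compact data structure, though the report's actual representation is not given here.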