SRI International FASTUS System MUC-4 Test Results and Analysis
SRI INTERNATIONAL MENLO PARK CA
The system that SRI used for the MUC-4 evaluation represents a significant departure from system architectures that have been employed in the past. In MUC-2 and MUC-3, SRI used the TACITUS text processing system [1], which was based on the DIALOGIC parser and grammar and an abductive reasoner for Horn-clause logic. For MUC-4, SRI designed a new system called FASTUS (a permutation of the initial letters in "Finite State Automata-based Text Understanding System"), which we feel represents a significant advance in the state of the art of text processing. The system shares certain modules with the earlier TACITUS system, namely modules for text preprocessing and standardization, spelling correction, Hispanic name recognition, and the core lexicon. However, the DIALOGIC system and abductive reasoner, which were the heart and soul of the previous system, were replaced by a system whose architecture is based on cascaded finite-state automata. Using this system, we were able to achieve a significant level of performance on the MUC-4 task with less than one month devoted to domain-specific development. In addition, the system is extremely fast: it processes texts at a rate of approximately 3,200 words per minute, measured in CPU time on a Sun SPARC-2 processor. Measured in elapsed real time, the system is about 50% slower, but the observed time depends on the particular hardware configuration involved.
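To make the idea of a cascade of finite-state automata concrete, the following is a minimal sketch in Python. It is not the FASTUS implementation. The toy lexicon, the single-letter tags, the three stages, and all function names (`tag`, `group`, `extract`, `cascade`) are assumptions chosen for illustration. Each stage is a regular (finite-state) pattern applied to the output of the stage before it: words are tagged, tags are grouped into noun and verb groups, and groups are matched against a simple event pattern.

```python
import re

# Hypothetical toy lexicon (illustration only): word -> one-letter tag.
# D = determiner, A = adjective, N = noun, V = verb, X = unknown.
LEXICON = {
    "the": "D", "a": "D",
    "armed": "A",
    "guerrillas": "N", "embassy": "N", "bomb": "N",
    "attacked": "V", "destroyed": "V",
}

def tag(tokens):
    """Stage 1: look each token up in the lexicon."""
    return [(w, LEXICON.get(w.lower(), "X")) for w in tokens]

# Stage 2: a regular pattern over the tag string.
# Noun group = optional determiner, any adjectives, one or more nouns.
GROUP_RE = re.compile(r"(?P<NG>D?A*N+)|(?P<VG>V)")

def group(tagged):
    """Stage 2: collapse tag sequences into noun groups and verb groups."""
    tags = "".join(t for _, t in tagged)  # one character per token
    groups = []
    for m in GROUP_RE.finditer(tags):
        words = [w for w, _ in tagged[m.start():m.end()]]
        groups.append((m.lastgroup, " ".join(words)))
    return groups

# Stage 3: a regular pattern over group labels (noun group, verb group, noun group).
EVENT_RE = re.compile(r"NVN")

def extract(groups):
    """Stage 3: match group sequences against a simple event pattern."""
    labels = "".join("N" if label == "NG" else "V" for label, _ in groups)
    events = []
    for m in EVENT_RE.finditer(labels):
        subj, verb, obj = (groups[i][1] for i in range(m.start(), m.end()))
        events.append({"agent": subj, "action": verb, "target": obj})
    return events

def cascade(text):
    """Run the stages in sequence; each consumes the previous stage's output."""
    tokens = re.findall(r"[A-Za-z]+", text)
    return extract(group(tag(tokens)))

if __name__ == "__main__":
    print(cascade("The armed guerrillas attacked the embassy."))
```

In this sketch, no stage performs full parsing or logical inference. Each stage only recognizes regular patterns over a flat sequence, which is the general property that makes a cascaded finite-state design fast compared with a full parser plus reasoner.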