Accession Number : ADA463040


Title :   Entropy Based Classifier Combination for Sentence Segmentation


Corporate Author : INTERNATIONAL COMPUTER SCIENCE INST BERKELEY CA


Personal Author(s) : Magimai-Doss, M ; Hakkani-Tur, D ; Cetin, O ; Shriberg, E ; Fung, J ; Mirghafori, N


Full Text : https://apps.dtic.mil/dtic/tr/fulltext/u2/a463040.pdf


Report Date : Jan 2007


Pagination or Media Count : 5


Abstract : We describe recent extensions to our previous work, in which we explored the use of individual classifiers, namely boosting and maximum entropy models, for sentence segmentation. In this paper we extend the set of classification methods with support vector machines (SVMs). We propose a new dynamic entropy-based classifier combination approach to combine these classifiers, and compare it with traditional classifier combination techniques, namely voting, linear regression, and logistic regression. We also investigate combining hidden event language models with the output of the proposed classifier combination and with the output of the individual classifiers. Experimental studies conducted on the Mandarin TDT4 broadcast news database show that the SVM classifier, as an individual classifier, improves over our previous best system. However, the proposed entropy-based classifier combination approach shows the best improvement in F-measure, 1% absolute, and the voting approach shows the best reduction in NIST error rate, 2.7% absolute, when compared to the previous best system.
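
Illustrative note : The abstract does not spell out how the entropy-based combination is computed. The sketch below shows one common form of the idea, inverse-entropy weighting, in which each classifier's boundary posterior is weighted by the reciprocal of its entropy so that more confident classifiers count for more at each word boundary. This is an assumption for illustration only and may differ from the authors' method; the function names and the `eps` smoothing constant are hypothetical.

```python
import math


def binary_entropy(p):
    """Entropy in bits of the two-class posterior [p, 1 - p]."""
    if p <= 0.0 or p >= 1.0:
        return 0.0  # a fully certain posterior carries no uncertainty
    return -(p * math.log2(p) + (1.0 - p) * math.log2(1.0 - p))


def inverse_entropy_combine(posteriors, eps=1e-6):
    """Combine per-classifier P(boundary) estimates for one word boundary.

    Assumption: weights are proportional to 1 / (entropy + eps), then
    normalized to sum to 1. eps avoids division by zero when a classifier
    is fully certain.
    """
    weights = [1.0 / (binary_entropy(p) + eps) for p in posteriors]
    total = sum(weights)
    return sum(w * p for w, p in zip(weights, posteriors)) / total


# Hypothetical posteriors from three classifiers for a single boundary.
combined = inverse_entropy_combine([0.5, 0.99, 0.6])
is_boundary = combined >= 0.5
```

Because the weights are recomputed for every boundary, the combination is dynamic: a classifier that is uncertain at one boundary can still dominate at another where its posterior is sharp.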


Descriptors :   *WORDS(LANGUAGE) , *SPEECH RECOGNITION , CLASSIFICATION , SEGMENTED , ENTROPY , LINEAR REGRESSION ANALYSIS , VECTOR ANALYSIS


Subject Categories : Linguistics
      Voice Communications


Distribution Statement : APPROVED FOR PUBLIC RELEASE