We describe the application of classification and regression trees to several problems in speech and language. We begin with a brief overview of the technique and then describe its application to three problems:

1. End-of-sentence detection. The not-so-simple problem of deciding when a period in text marks the end of a declarative sentence rather than an abbreviation is addressed with trees built using the Brown corpus as input. The result is 99.8% correct classification.

2. Segment duration modelling in speech synthesis. 400 utterances from a single speaker and 4000 utterances from 400 speakers were used to build decision trees that predict segment durations from features such as lexical position, stress, and phonetic context. These methods accounted for over 70% of the durational variance for the single speaker and over 60% for the multiple speakers.

3. Phoneme-to-phone prediction. A lattice of possible close phonetic transcriptions, given a phonemic transcription derived from the orthography and a dictionary, is produced using the 4000-utterance TIMIT database as input. The most likely phone corresponding to a phoneme can be predicted with 83% accuracy; the five most likely phones, with 99% accuracy.

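As a rough illustration of what "durational variance accounted for" means for a regression tree, the sketch below fits a single-split tree to a handful of made-up segment durations and reports one minus the ratio of within-leaf squared error to total squared error. The data, the feature names (`stressed`, `phrase_final`), and the helper functions are hypothetical and chosen only for illustration; this is not the procedure or the data used in the paper.

```python
from statistics import fmean


def sum_sq(values):
    """Sum of squared deviations of the values from their mean."""
    m = fmean(values)
    return sum((v - m) ** 2 for v in values)


def best_split(rows, features):
    """Return (feature, sse) for the binary feature whose split gives the
    smallest within-leaf sum of squared errors, each leaf predicting its mean."""
    best = None
    for f in features:
        left = [r["dur"] for r in rows if r[f]]
        right = [r["dur"] for r in rows if not r[f]]
        if not left or not right:
            continue  # this feature does not divide the data
        sse = sum_sq(left) + sum_sq(right)
        if best is None or sse < best[1]:
            best = (f, sse)
    return best


def variance_accounted_for(rows, features):
    """Fraction of total variance in 'dur' removed by the best single split."""
    total = sum_sq([r["dur"] for r in rows])
    feature, sse = best_split(rows, features)
    return feature, 1.0 - sse / total


# Hypothetical toy durations in milliseconds (not from the paper).
ROWS = [
    {"stressed": True, "phrase_final": False, "dur": 100.0},
    {"stressed": True, "phrase_final": True, "dur": 120.0},
    {"stressed": False, "phrase_final": False, "dur": 60.0},
    {"stressed": False, "phrase_final": True, "dur": 80.0},
]

if __name__ == "__main__":
    feature, frac = variance_accounted_for(ROWS, ["stressed", "phrase_final"])
    print(f"split on {feature}: {frac:.0%} of variance accounted for")
```

A full tree would apply the same split selection recursively within each leaf and would normally be evaluated on held-out data; the single split here is only meant to make the measure concrete.
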
This article is from 'Computing Science and Statistics: Proceedings of the Symposium on the Interface Critical Applications of Scientific Computing: Biology, Engineering, Medicine, Speech Held in Seattle, Washington on 21-24 April 1991,' AD-A252 938, p1-6.