Accession Number : ADA511688


Title : The MIT Lincoln Laboratory RT-04F Diarization Systems: Applications to Broadcast Audio and Telephone Conversations


Descriptive Note : Conference paper


Corporate Author : MASSACHUSETTS INST OF TECH LEXINGTON LINCOLN LAB


Personal Author(s) : Reynolds, D A ; Torres-Carrasquillo, P


Full Text : https://apps.dtic.mil/dtic/tr/fulltext/u2/a511688.pdf


Report Date : Nov 2004


Pagination or Media Count : 11


Abstract : Audio diarization is the process of annotating an input audio channel with information that attributes (possibly overlapping) temporal regions of signal energy to their specific sources. These sources can include particular speakers, music, background noise sources, and other signal source/channel characteristics. Diarization is useful for making automatic transcripts more readable and for searching and indexing audio archives. In this paper we describe the systems developed by MITLL and used in the DARPA EARS Rich Transcription Fall 2004 (RT-04F) speaker diarization evaluation. The primary system is based on a new proxy speaker model approach, and the secondary system follows a more standard BIC-based clustering approach. We present experiments analyzing the performance of the systems and describe a cross-cluster recombination approach that significantly improves performance. In addition, we present results from applying our system to a telephone-speech, summed-channel speaker detection task.


Descriptors : *ACOUSTIC DATA , *CHANGE DETECTION , *SPEECH ANALYSIS , INFORMATION RETRIEVAL , IDENTIFICATION SYSTEMS , ACOUSTIC RECORDING SYSTEMS , TELEPHONE SIGNALS , ALGORITHMS , SEX , AUDITORY SIGNALS , CLUSTERING , AUTOMATION , WORKSHOPS


Subject Categories : Cybernetics ; Acoustic Detection and Detectors


Distribution Statement : APPROVED FOR PUBLIC RELEASE