Improving the Capacity of Language Recognition Systems to Handle Rare Languages Using Radio Broadcast Data
Final rept. 15 Oct 2008-15 Dec 2010
BRNO UNIV OF TECHNOLOGY (CZECH REPUBLIC)
The total duration of the project is divided into 2 phases. The first phase is planned for the period May 2008 to Oct 2008. The second phase is planned for Nov 2008 to April 2008. It has the following 3 work packages (WP).
This project relies on the Voice of America (VOA) data collection performed by LDC over the past several years. The VOA data will need to be completed with the available meta-information, especially about the languages contained. The following step will consist of cleaning the data and selecting relevant speech information, as we are aware that the automatically acquired data is quite dirty for the purposes of LRE:
1. Automatic segmentation into speech, music and noise segments, of which only speech will be retained. Speech/music segmentation was the topic of a diploma thesis finished at our department (Hovorka2006) and is available for use in this project.
2. Voice activity detection (VAD), performed by our phoneme recognizer (Schwarz2006) with all phoneme classes linked to the speech class. This setup has been used successfully in a wide range of applications, such as speaker recognition, language recognition, speech transcription and spoken term detection, and has been evaluated in several NIST evaluations.
3. Detection of telephone conversations in the data.
In this project, we will mainly investigate the data that is as close as possible to the target domain, conversational telephone speech (CTS). Therefore, we will concentrate on the segments with detected telephone speech (people calling in to the broadcast), as we believe these should correspond best to CTS. Initial work on Thai done for NIST LRE 2007 has shown a yield of 8 hours of telephone conversations from approximately 400 hours of VOA data downloaded from the Internet archive of VOA.
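The abstract does not say how telephone conversations are detected in the broadcast audio. As an illustration only, the minimal sketch below assumes one common heuristic: telephone channels carry little energy above roughly 3.4 kHz, so a segment of a wideband recording with almost no high-band energy is a candidate call-in. The function names, the 3400 Hz cutoff and the 1% threshold are all hypothetical choices for this example and are not taken from the project.

```python
import numpy as np


def high_band_energy_ratio(signal, sample_rate, cutoff_hz=3400.0):
    """Return the fraction of spectral energy above cutoff_hz (0.0 for silence)."""
    spectrum = np.abs(np.fft.rfft(signal)) ** 2           # power per frequency bin
    freqs = np.fft.rfftfreq(len(signal), d=1.0 / sample_rate)
    total = spectrum.sum()
    if total == 0.0:
        return 0.0
    return float(spectrum[freqs > cutoff_hz].sum() / total)


def looks_like_telephone(signal, sample_rate, cutoff_hz=3400.0, max_ratio=0.01):
    """Flag a segment as telephone-like when its high-band energy is negligible.

    Both cutoff_hz and max_ratio are assumed values chosen for illustration.
    """
    return high_band_energy_ratio(signal, sample_rate, cutoff_hz) <= max_ratio


if __name__ == "__main__":
    rate = 16000
    t = np.arange(rate) / rate                            # one second of samples
    narrowband = np.sin(2 * np.pi * 1000 * t)             # energy only at 1 kHz
    wideband = narrowband + np.sin(2 * np.pi * 6000 * t)  # adds energy at 6 kHz
    print(looks_like_telephone(narrowband, rate))
    print(looks_like_telephone(wideband, rate))
```

Such a check would only make sense on segments already kept by the speech/music segmentation and VAD steps, since silence trivially has no high-band energy. For scale, the Thai figures quoted above (8 hours out of approximately 400) correspond to a yield of about 2%.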
- Radio Communications
- Voice Communications