Accession Number:

ADA522033

Title:

2-D Processing of Speech with Application to Pitch and Formant Estimation

Descriptive Note:

Briefing charts

Corporate Author:

MASSACHUSETTS INST OF TECH LEXINGTON LINCOLN LAB

Personal Author(s):

Report Date:

2007-11-10

Pagination or Media Count:

15

Abstract:

The grating compression transform (GCT) maps harmonically related signal components to a concentrated entity in a spatial 2-D frequency plane. The GCT forms the basis of a pitch estimator that uses the radial distance to the largest peak of the GCT. The resulting pitch estimator appears robust under noise conditions and amenable to extension to two-speaker pitch estimation. The GCT also forms the basis of a formant estimator that exploits the separability of speech source and vocal tract information via changing pitch. Although the spectrogram provides a useful starting point for the GCT, alternate transforms can provide improved performance; the fan-chirp transform is one possibility. Possible GCT directions: alternate time-frequency distributions; pitch estimation (extended evaluation on a larger corpus and use of voiced/unvoiced speech, two-speaker pitch estimation); formant estimation in noise; and the GCT as a model of auditory cortical processing (Shamma, Ezzat, and Poggio).
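
Illustrative sketch (not taken from the briefing charts): the pitch-estimation idea in the abstract can be approximated by taking a 2-D Fourier transform of a localized narrowband-spectrogram patch and locating its largest off-origin peak. The code below is a minimal example under stated assumptions: a synthetic harmonic signal with constant 200 Hz pitch, a magnitude spectrogram computed with NumPy, and Hann tapering of the patch. The function names (stft_magnitude, gct_pitch_estimate), the min_radius parameter, and all analysis settings are hypothetical choices made for illustration, not the report's method or parameters.

```python
import numpy as np

def stft_magnitude(x, win_len, hop, n_fft):
    """Magnitude spectrogram, shape (n_fft // 2 + 1, n_frames). Hypothetical helper."""
    window = np.hanning(win_len)
    n_frames = 1 + (len(x) - win_len) // hop
    frames = np.stack(
        [x[i * hop : i * hop + win_len] * window for i in range(n_frames)], axis=1
    )
    return np.abs(np.fft.rfft(frames, n=n_fft, axis=0))

def gct_pitch_estimate(patch, hz_per_bin, min_radius=0.01):
    """Estimate harmonic spacing (Hz) from a patch of shape (freq bins, frames).

    Assumption: harmonics appear as a grating in the patch, so the 2-D
    spectrum has a dominant peak whose frequency-axis component (radial
    distance times the cosine of the peak angle) is the reciprocal of the
    harmonic spacing in bins.
    """
    n_freq, n_time = patch.shape
    # Remove the mean and taper the edges so DC and truncation do not dominate.
    p = (patch - patch.mean()) * np.outer(np.hanning(n_freq), np.hanning(n_time))
    G = np.abs(np.fft.fft2(p))

    kf = np.fft.fftfreq(n_freq)   # cycles per frequency bin
    kt = np.fft.fftfreq(n_time)   # cycles per frame
    KF, KT = np.meshgrid(kf, kt, indexing="ij")
    radius = np.hypot(KF, KT)

    # The input is real, so the spectrum is symmetric: search one half-plane
    # and skip the region near the origin.
    mask = (KF > 0) & (radius >= min_radius)
    idx = np.argmax(np.where(mask, G, 0.0))
    i, j = np.unravel_index(idx, G.shape)

    peak_radius = radius[i, j]
    cos_theta = KF[i, j] / peak_radius   # angle measured from the frequency axis
    f0 = hz_per_bin / (peak_radius * cos_theta)
    return f0, peak_radius

# Synthetic test signal (assumed values): 19 equal-amplitude harmonics of 200 Hz.
fs, f0_true, n_fft = 8000, 200.0, 1024
t = np.arange(int(0.5 * fs)) / fs
x = sum(np.cos(2 * np.pi * k * f0_true * t) for k in range(1, 20))

spec = stft_magnitude(x, win_len=512, hop=80, n_fft=n_fft)
patch = spec[:256, :]            # 0 to 2 kHz region, all frames
f0_est, r_peak = gct_pitch_estimate(patch, hz_per_bin=fs / n_fft)
print(f"estimated pitch: {f0_est:.1f} Hz (peak radius {r_peak:.4f} cycles/bin)")
```

For this constant-pitch input the grating is horizontal, so the peak lies on the frequency axis and the radial distance alone determines the estimate, which should land near 200 Hz. With changing pitch the grating tilts and the peak rotates off the axis, which is why the sketch keeps the angle term.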

Subject Categories:

  • Linguistics
  • Voice Communications

Distribution Statement:

APPROVED FOR PUBLIC RELEASE