A Study in Speech Recognition Using a Kohonen Neural Network, Dynamic Programming, and Multi-Feature Fusion

Recla, Wayne F.

Active / Technical Report | Accession Number: ADA216282

Abstract:

The human perception system is multi-dimensional: humans process more than just the sound of the word. Any speech recognition system that mimics human speech perception will need to be multi-dimensional. This premise formed the basis for the design approach used in this research effort. Linear predictive coefficients (LPC) and formants were used as distinct and independent inputs into a recognition system consisting of a Kohonen neural network and a dynamic programming word classifier. A feature-fusion section and a rule-based system were used to integrate the two input feature sets into one output result. This research effort involved extensive testing of the Kohonen network. Using a speech input signal, different Kohonen gain reduction methods, initial gain values, and conscience values were tested for various iteration times in an effort to quantify the response and capabilities of the Kohonen network. Three-dimensional Kohonen-Dynamic Programming surfaces were developed that graphically showed the effects of gain, conscience, and iteration time on the speech recognition response of a Kohonen neural network. A new standard iteration time, called a multiple, was used during training of the Kohonen networks. The results of the basic research on the Kohonen network produced an optimized Kohonen configuration that was used in the multiple-feature recognition system. A 70-word vocabulary of F-16 cockpit commands was used to evaluate the new feature-fusion method.
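For readers unfamiliar with the two components named in the abstract, the following is a minimal illustrative sketch, not the thesis implementation. It assumes a one-dimensional Kohonen map trained with a linearly decaying gain (one of many possible gain reduction methods) and a textbook dynamic-programming time-warping distance for comparing sequences. The conscience mechanism, the "multiple" iteration unit, and the feature-fusion rules are omitted, and all function and parameter names (train_kohonen, best_node, dtw_distance, gain0) are hypothetical.

```python
import random


def best_node(weights, x):
    """Index of the node whose weight vector is closest to x (squared Euclidean)."""
    return min(
        range(len(weights)),
        key=lambda j: sum((w - v) ** 2 for w, v in zip(weights[j], x)),
    )


def train_kohonen(vectors, n_nodes, iterations, gain0=0.5, seed=0):
    """Train a 1-D Kohonen map with linear gain decay and a shrinking neighborhood.

    The linear schedule and the 1-D topology are assumptions for illustration.
    """
    rng = random.Random(seed)
    dim = len(vectors[0])
    weights = [[rng.random() for _ in range(dim)] for _ in range(n_nodes)]
    for t in range(iterations):
        frac = 1.0 - t / iterations            # falls from 1 toward 0
        gain = gain0 * frac                    # assumed linear gain reduction
        radius = int(round((n_nodes // 2) * frac))
        x = vectors[rng.randrange(len(vectors))]
        winner = best_node(weights, x)
        lo = max(0, winner - radius)
        hi = min(n_nodes, winner + radius + 1)
        for j in range(lo, hi):                # update winner and its neighbors
            for k in range(dim):
                weights[j][k] += gain * (x[k] - weights[j][k])
    return weights


def dtw_distance(seq_a, seq_b, cost):
    """Dynamic-programming time-warping distance between two sequences."""
    inf = float("inf")
    n, m = len(seq_a), len(seq_b)
    d = [[inf] * (m + 1) for _ in range(n + 1)]
    d[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            c = cost(seq_a[i - 1], seq_b[j - 1])
            d[i][j] = c + min(d[i - 1][j], d[i][j - 1], d[i - 1][j - 1])
    return d[n][m]
```

In a pipeline of this general shape, each frame of an utterance would be mapped to its winning node with best_node, and the resulting node-index trajectory would be compared against stored word templates with dtw_distance, choosing the template with the smallest distance. The per-frame cost function is left to the caller because the abstract does not specify one.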

Author(s):

Recla, Wayne F.

Author Organization(s):

AIR FORCE INST OF TECH WRIGHT-PATTERSON AFB OH SCHOOL OF ENGINEERING

Descriptive Note:

Master's thesis

Pagination:

189

Security Markings

DOCUMENT & CONTEXTUAL SUMMARY

Distribution:

Approved For Public Release

Distribution Statement:

Approved For Public Release; Distribution Is Unlimited.

RECORD

Collection: TR

Identifying Numbers

Report Number(s):

AFIT/GE/ENG/89D-41

Monitor Series:

AFIT

Subject Terms

Joint Capability Areas:

JCA_1.2.1_Training; JCA_5_Command and Control; JCA_1.2.5_Lessons Learned; JCA_5.3_Planning; JCA_1.3.2_Personnel Management; JCA_4.6_Engineering; JCA_5.2.2_Develop Knowledge and Situational Awareness; JCA_5.2_Understand; JCA_5.4_Decide; JCA_1.2.6_Concepts; JCA_1.2.3_Educating; JCA_5.5.2_Task; JCA_5.5_Direct

Modernization Areas:

Autonomy; AI and Machine Learning

Communities of Interest:

Energy and Power Technologies

Descriptor(s):

*SPEECH RECOGNITION, *SPEECH, NEURAL NETS, HUMANS, THESES, REDUCTION, TIME, RESPONSE, SIGNALS, GAIN, SOUND, RECOGNITION, WORDS(LANGUAGE), VALUE, DYNAMIC PROGRAMMING, ITERATIONS, PERCEPTION(PSYCHOLOGY), INPUT, METHODOLOGY

Field(s)/Group(s):

Cybernetics, Voice Communications

Keyword(s):

*KOHONEN NEURAL NETWORK, PROGRAMMING

Report Date:

1989 Dec 01