Discriminating nasals and approximants in English language using zero time windowing

Authors: Ravi Shankar Prasad, Sudarsana Reddy Kadiri, Suryakanth V Gangashetty, B Yegnanarayana
Conference: Annual Conference of the International Speech Communication Association 2018 (INTERSPEECH 2018)
Location: Hyderabad, India
Date: 2018-09-02
Report no: IIIT/TR/2018/104

Abstract

Nasal and approximant consonants are often confused with each other. Despite the distinction in the production mechanism, these two sound classes exhibit similar low-frequency behavior and lack significant high-frequency content. The present study uses a spectral representation obtained from zero time windowing (ZTW) analysis of speech to distinguish between the two classes. The instantaneous spectral representation has good resolution at resonances, which helps to highlight the difference in the acoustic vocal tract system response for these sounds. The ZTW spectra around the regions of glottal closure instants are averaged to derive parameters for their classification in continuous speech. A set of parameters based on the dominant resonances, center of gravity, band energy ratio, and cumulative spectral sum in low frequencies is derived from the average spectrum. The paper proposes classification using a knowledge-based approach and a trained support vector machine. These classifiers are tested on utterances from different English speakers in the TIMIT dataset. The proposed methods result in an average classification accuracy of 90% between the two classes in continuous speech.
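
The parameter set named in the abstract can be illustrated with a short Python sketch. This is not the authors' implementation: the function name, the 1000 Hz band split, and the 500 Hz low-frequency limit are illustrative assumptions, since the listing does not give the actual values. The input is assumed to be a magnitude spectrum already averaged over ZTW frames near glottal closure instants, with bins evenly spaced from 0 Hz to the Nyquist frequency. For simplicity, only the single strongest peak is reported as the dominant resonance.

```python
def spectral_parameters(avg_spectrum, sample_rate, split_hz=1000.0, low_hz=500.0):
    """Derive simple spectral parameters from an averaged magnitude spectrum.

    split_hz and low_hz are hypothetical cutoffs chosen for illustration;
    they are not taken from the paper.
    """
    n = len(avg_spectrum)
    if n < 2:
        raise ValueError("spectrum needs at least two bins")
    total = float(sum(avg_spectrum))
    if total <= 0.0:
        raise ValueError("spectrum must contain positive energy")

    # Bin centre frequencies, evenly spaced from 0 Hz to Nyquist.
    step = (sample_rate / 2.0) / (n - 1)
    freqs = [i * step for i in range(n)]

    # Dominant resonance: frequency of the strongest bin.
    peak_index = max(range(n), key=lambda i: avg_spectrum[i])
    dominant_hz = freqs[peak_index]

    # Center of gravity: magnitude-weighted mean frequency.
    center_of_gravity_hz = sum(f * s for f, s in zip(freqs, avg_spectrum)) / total

    # Band energy ratio: magnitude below the split over magnitude at or above it.
    low_band = sum(s for f, s in zip(freqs, avg_spectrum) if f < split_hz)
    high_band = sum(s for f, s in zip(freqs, avg_spectrum) if f >= split_hz)
    band_ratio = low_band / max(high_band, 1e-12)  # guard against an empty high band

    # Cumulative spectral sum in low frequencies, as a fraction of the total.
    cumulative_low = sum(s for f, s in zip(freqs, avg_spectrum) if f <= low_hz) / total

    return {
        "dominant_hz": dominant_hz,
        "center_of_gravity_hz": center_of_gravity_hz,
        "band_ratio": band_ratio,
        "cumulative_low": cumulative_low,
    }
```

The resulting values could feed either a set of hand-chosen thresholds or a support vector machine, which are the two kinds of classifier the abstract mentions.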

Full paper: pdf

Centre for Language Technologies Research Centre

IIIT Hyderabad Publications
