INFORMATION FUSION BASED APPROACH FOR INDIAN LANGUAGE IDENTIFICATION

Author: KANIKA GUPTA
Date: 2021-08-27
Report no: IIIT/TH/2021/106
Advisor: Suryakanth V Gangashetty

Abstract

Language identification (LID) is the task of determining the language spoken in a given speech utterance. LID is an important component in the development of spoken dialogue systems, especially for a multilingual society like India. In this work, an implicit approach is used for language identification. The combination of MFCC and RCC feature vectors is explored for the Indian LID task. MFCC represents vocal tract information and RCC represents excitation source information; the two are complementary to each other. To combine these complementary features, two fusion-based approaches are explored: decision-level fusion and feature-level fusion. A deep neural network (DNN) framework is used as the classifier. DNN models are widely used for speaker identification and other speech signal processing tasks. In decision-level fusion, different fusion techniques are used: Min, Max, Average, and Min-Max. In feature-level fusion, canonical correlation analysis (CCA) is performed to combine the MFCC and RCC feature vectors. In this approach only one DNN classifier is trained, which improves both computation time and accuracy. Several experiments are performed on a data set of 13 Indian languages, consisting of 120 hours of training data and 30 hours of testing data. Equal error rate (EER) is used as the performance metric; the lower the EER, the higher the accuracy of the LID system. Results are compared with the state-of-the-art i-vector based approach. The decision-level fusion approach gives an EER of 9.64%, compared to an EER of 10.18% for the i-vector based approach. CCA-based feature-level fusion gives an EER of 9.19%, compared to the average EER of 9.64% for the decision-level fusion approach. The results show that the fusion-based approaches perform better than the approaches based on an individual feature vector.
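
As a rough illustration of the decision-level fusion rules and the EER metric named in the abstract, the following Python sketch may help. It is not the thesis implementation: the function names fuse_scores and equal_error_rate, the score-matrix layout (one row per utterance, one column per language), and the toy numbers are all assumptions made for this example. The Min-Max rule is not defined in the abstract, so it is left out rather than guessed at.

```python
import numpy as np


def fuse_scores(scores_a, scores_b, rule="average"):
    """Combine two classifiers' per-language scores element-wise.

    scores_a, scores_b: arrays of shape (n_utterances, n_languages),
    e.g. posteriors from an MFCC-based and an RCC-based DNN (assumed layout).
    """
    a = np.asarray(scores_a, dtype=float)
    b = np.asarray(scores_b, dtype=float)
    if a.shape != b.shape:
        raise ValueError("score matrices must have the same shape")
    if rule == "min":
        return np.minimum(a, b)
    if rule == "max":
        return np.maximum(a, b)
    if rule == "average":
        return (a + b) / 2.0
    raise ValueError(f"unknown fusion rule: {rule}")


def equal_error_rate(target_scores, nontarget_scores):
    """Approximate the EER by sweeping a threshold over all observed scores.

    Returns the mean of the false-acceptance and false-rejection rates at
    the threshold where the two rates are closest to each other.
    """
    tgt = np.asarray(target_scores, dtype=float)
    non = np.asarray(nontarget_scores, dtype=float)
    best_gap, eer = np.inf, 1.0
    for t in np.unique(np.concatenate([tgt, non])):
        frr = np.mean(tgt < t)    # true-language trials rejected
        far = np.mean(non >= t)   # wrong-language trials accepted
        gap = abs(far - frr)
        if gap < best_gap:
            best_gap, eer = gap, (far + frr) / 2.0
    return eer


# Toy scores for 2 utterances and 3 languages (made-up numbers).
mfcc_scores = np.array([[0.7, 0.2, 0.1], [0.3, 0.4, 0.3]])
rcc_scores = np.array([[0.5, 0.3, 0.2], [0.1, 0.6, 0.3]])
labels = np.array([0, 1])  # true language index of each utterance

fused = fuse_scores(mfcc_scores, rcc_scores, rule="average")

# Split fused scores into target (true language) and non-target trials.
is_target = np.zeros(fused.shape, dtype=bool)
is_target[np.arange(len(labels)), labels] = True
toy_eer = equal_error_rate(fused[is_target], fused[~is_target])
print("EER on toy data:", toy_eer)
```

Each rule operates independently on every (utterance, language) score pair, so the two classifiers can be trained and scored separately and combined only at the end. Feature-level fusion with CCA differs in that the features are combined before a single classifier is trained.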


Centre: Language Technologies Research Centre

IIIT Hyderabad Publications
