Towards improving the performance of language identification system for Indian languages

A language identification (LID) system automatically recognises the language of a short, unknown spoken utterance. It extracts discriminative features from the speech and reports the language to which the utterance belongs. In this paper, we build an LID system on feature vectors that concatenate Mel Frequency Cepstral Coefficients (MFCC) with pitch. We train one reference model per language as a Hidden Markov Model (HMM) over 14-dimensional feature vectors, then evaluate each test utterance against the reference models of all listed languages. The likelihood of the test feature vectors under each model decides the language of the unknown utterance. The experimental setup covers seven Indian languages. With 3 s test utterances, the average performance of the system is 89.31% for three-state HMMs and 90.63% for four-state HMMs; the four-state HMM thus gives significant results on 3 s test speech even though the procedure is simple.
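
The decision rule described in the abstract above (score the test feature vectors against one HMM per language and pick the most likely language) can be sketched as follows. This is a minimal illustration only, not the authors' implementation: it assumes diagonal-covariance Gaussian emissions, uses the standard forward algorithm in the log domain, and all model parameters, the `toy_model` helper, and the language labels `lang_a` and `lang_b` are hypothetical placeholders rather than trained values.

```python
import math

def log_gauss_diag(x, mean, var):
    """Log density of a diagonal-covariance Gaussian at feature vector x."""
    return sum(
        -0.5 * (math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v)
        for xi, m, v in zip(x, mean, var)
    )

def logsumexp(values):
    """Numerically stable log(sum(exp(v) for v in values))."""
    peak = max(values)
    if peak == -math.inf:
        return -math.inf
    return peak + math.log(sum(math.exp(v - peak) for v in values))

def hmm_log_likelihood(frames, model):
    """Forward algorithm in the log domain: log P(frames | model)."""
    log_start, log_trans = model["log_start"], model["log_trans"]
    means, variances = model["means"], model["vars"]
    n_states = len(log_start)
    # Initialisation with the first frame.
    alpha = [
        log_start[s] + log_gauss_diag(frames[0], means[s], variances[s])
        for s in range(n_states)
    ]
    # Recursion over the remaining frames.
    for x in frames[1:]:
        alpha = [
            logsumexp([alpha[p] + log_trans[p][s] for p in range(n_states)])
            + log_gauss_diag(x, means[s], variances[s])
            for s in range(n_states)
        ]
    return logsumexp(alpha)

def identify_language(frames, models):
    """Return the language whose reference model gives the highest likelihood."""
    scores = {lang: hmm_log_likelihood(frames, m) for lang, m in models.items()}
    return max(scores, key=scores.get), scores

def toy_model(center, n_states=3, dim=14):
    """Hypothetical untrained model: unit-variance states located near `center`."""
    uniform = math.log(1.0 / n_states)
    return {
        "log_start": [uniform] * n_states,
        "log_trans": [[uniform] * n_states for _ in range(n_states)],
        "means": [[center + 0.1 * s] * dim for s in range(n_states)],
        "vars": [[1.0] * dim for _ in range(n_states)],
    }

# Two placeholder "languages" and five synthetic 14-dimensional frames.
models = {"lang_a": toy_model(0.0), "lang_b": toy_model(3.0)}
test_frames = [[2.9] * 14 for _ in range(5)]
best, scores = identify_language(test_frames, models)
print(best, scores)
```

In a real system the per-language parameters would be estimated from training speech, and each frame would hold the concatenated MFCC and pitch values; the number of states (three or four in the abstract) corresponds to `n_states` here.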

A hierarchical language identification system for Indian languages

Digital Signal Processing ◽

10.1016/j.dsp.2011.11.008 ◽

2012 ◽

Vol 22 (3) ◽

pp. 544-553 ◽

Cited By ~ 27

Author(s):

S. Jothilakshmi ◽

V. Ramalingam ◽

S. Palanivel

Keyword(s):

Language Identification ◽

Identification System ◽

Indian Languages

Development of Indian Spoken Language Identification System for Two Languages using MFCC Feature with Deep Neural Network

International Journal of Innovative Technology and Exploring Engineering - Special Issue ◽

10.35940/ijitee.g1014.0597s20 ◽

2020 ◽

Vol 9 (7S) ◽

pp. 43-45

Keyword(s):

Neural Network ◽

Deep Neural Network ◽

Language Translation ◽

Spoken Language ◽

Language Identification ◽

Classification Model ◽

Identification System ◽

Speech Sample ◽

Indian Languages ◽

Language Recognition

Language is the means by which people communicate with one another. Approximately 6,500 languages are spoken in the world, and different regions have different spoken languages. Spoken language recognition is the process of identifying the language spoken in a speech sample. Most work on spoken language identification has addressed languages other than Indian ones. Language recognition has many applications, such as spoken language translation, in which the fundamental first step is to recognise the speaker's language. This system is built specifically to identify two Indian languages. It uses speech data from various news channels that is available online. Mel Frequency Cepstral Coefficients (MFCC) are extracted from each speech sample because they give the different classes of audio a distinctive identity. Identification is performed by feeding the MFCC features to a Deep Neural Network. The objective of this work is to improve the accuracy of the classification model, which is done by making changes to several layers of the Deep Neural Network.
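
As a rough illustration of the pipeline in the abstract above, the sketch below passes one MFCC feature vector through a small feed-forward network with a two-way softmax output, one output per language. Everything specific here is an assumption made for the example: the 13-dimensional input, the hidden layer sizes, the ReLU activation, and the randomly initialised (untrained) weights. The abstract does not state the architecture actually used.

```python
import math
import random

def relu(v):
    """Element-wise rectified linear unit."""
    return [max(0.0, x) for x in v]

def softmax(v):
    """Numerically stable softmax over a list of scores."""
    peak = max(v)
    exps = [math.exp(x - peak) for x in v]
    total = sum(exps)
    return [e / total for e in exps]

def dense(x, weights, bias):
    """Fully connected layer: one output per weight row."""
    return [sum(w * xi for w, xi in zip(row, x)) + b for row, b in zip(weights, bias)]

def init_network(layer_sizes, seed=0):
    """Random, untrained weights for illustration (hypothetical initialisation)."""
    rng = random.Random(seed)
    layers = []
    for n_in, n_out in zip(layer_sizes, layer_sizes[1:]):
        scale = 1.0 / math.sqrt(n_in)
        weights = [[rng.uniform(-scale, scale) for _ in range(n_in)] for _ in range(n_out)]
        layers.append((weights, [0.0] * n_out))
    return layers

def predict_proba(mfcc_vector, layers):
    """Forward pass: ReLU hidden layers, softmax over the two language classes."""
    h = mfcc_vector
    for weights, bias in layers[:-1]:
        h = relu(dense(h, weights, bias))
    weights, bias = layers[-1]
    return softmax(dense(h, weights, bias))

# Assumed shape: 13 MFCCs in, two hidden layers, 2 languages out.
layers = init_network([13, 32, 16, 2])
mfcc_vector = [0.1 * i for i in range(13)]  # synthetic stand-in for real MFCCs
probs = predict_proba(mfcc_vector, layers)
print(probs)
```

Changing the list passed to `init_network` alters the number and width of the hidden layers, which is the kind of architectural variation the abstract describes as its route to better accuracy.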

A GMM-BASED HIERARCHICAL AUTOMATIC LANGUAGE IDENTIFICATION SYSTEM FOR INDIAN LANGUAGES

Applied Artificial Intelligence ◽

10.1080/08839514.2012.687659 ◽

2012 ◽

Vol 26 (6) ◽

pp. 554-570

Author(s):

S. Jothilakshmi ◽

V. Ramalingam ◽

S. Palanivel

Keyword(s):

Language Identification ◽

Identification System ◽

Indian Languages

Multilingual Speech Corpus in Low-Resource Eastern and Northeastern Indian Languages for Speaker and Language Identification

Circuits Systems and Signal Processing ◽

10.1007/s00034-021-01704-x ◽

2021 ◽

Author(s):

Joyanta Basu ◽

Soma Khan ◽

Rajib Roy ◽

Tapan Kumar Basu ◽

Swanirbhar Majumder

Keyword(s):

Language Identification ◽

Indian Languages ◽

Speech Corpus ◽

Low Resource

Multiclass Spoken Language Identification for Indian Languages using Deep Learning

2020 IEEE Bombay Section Signature Conference (IBSSC) ◽

10.1109/ibssc51096.2020.9332161 ◽

2020 ◽

Author(s):

Lakshmana Rao Arla ◽

Sridevi Bonthu ◽

Abhinav Dayal

Keyword(s):

Deep Learning ◽

Spoken Language ◽

Language Identification ◽

Indian Languages

Performance of Speaker Independent Language Identification System Under Various Noise Environments

Advances in Intelligent Systems and Computing - Information Systems Design and Intelligent Applications ◽

10.1007/978-81-322-2755-7_33 ◽

2016 ◽

pp. 315-320

Author(s):

Phani Kumar Polasi ◽

K. Sri Rama Krishna

Keyword(s):

Language Identification ◽

Identification System ◽

Speaker Independent

Comparative analysis on the use of features and models for validating language identification system

2017 International Conference on Inventive Computing and Informatics (ICICI) ◽

10.1109/icici.2017.8365224 ◽

2017 ◽

Cited By ~ 2

Author(s):

A. Revathi ◽

C. Jeyalakshmi

Keyword(s):

Comparative Analysis ◽

Language Identification ◽

Identification System

Performance Evaluation of Language Identification on Emotional Speech Corpus of Three Indian Languages

Intelligence Enabled Research - Advances in Intelligent Systems and Computing ◽

10.1007/978-981-15-9290-4_6 ◽

2020 ◽

pp. 55-63

Author(s):

Joyanta Basu ◽

Swanirbhar Majumder

Keyword(s):

Performance Evaluation ◽

Language Identification ◽

Emotional Speech ◽

Indian Languages ◽

Speech Corpus

Deep neural network based two-stage Indian language identification system using glottal closure instants as anchor points

Journal of King Saud University - Computer and Information Sciences ◽

10.1016/j.jksuci.2019.07.001 ◽

2019 ◽

Author(s):

Chuya China Bhanja ◽

Mohammad Azharuddin Laskar ◽

Rabul Hussain Laskar ◽

Sivaji Bandyopadhyay

Keyword(s):

Neural Network ◽

Deep Neural Network ◽

Language Identification ◽

Identification System ◽

Two Stage ◽

Indian Language ◽

Glottal Closure Instants ◽

Glottal Closure ◽

Anchor Points