Perceptual Linear Prediction: Latest Research Papers

Text Formatting Based on Keyword Detection

International Journal For Innovative Engineering and Management Research ◽

10.48047/ijiemr/v09/i12/115 ◽

2020 ◽

pp. 695-700

Keyword(s):

Medical Records ◽

Linear Prediction ◽

Personal Health Record ◽

Drug Effects ◽

Personal Health ◽

Electronic Prescription ◽

Care Givers ◽

Wrong Drug ◽

Patient Will ◽

Perceptual Linear Prediction

Adverse drug effects caused by prescription errors are a major cause of death across the world each year. Many such errors involve care givers administering the wrong drug or dosage to patients because of indecipherable handwriting, drug interactions, confusing drug names, etc. Adopting a voice-based prescription system could eliminate some of these errors, because it allows prescription information to be captured and heard through voice response rather than read from the physician’s handwriting. Our project will generate an electronic prescription using a “Speech to Text converter” (Perceptual Linear Prediction (PLP)) and capture the data from the keywords spoken by the doctor(s). There will be no need to carry paper prescriptions when revisiting doctors, and a patient will be able to share his or her historical medical records with a new doctor. The project also provides a facility to sign the prescription and send it directly to the patient's phone and email ID. The system enables patients to manage the privacy of their personal health records. The project targets doctors and clinics that still use paper-based handwritten prescriptions.

Qualitative Analysis of PLP in LSTM for Bangla Speech Recognition

The International journal of Multimedia & Its Applications ◽

10.5121/ijma.2020.12501 ◽

2020 ◽

Vol 12 (5) ◽

pp. 1-8

Author(s):

Nahyan Al Mahmud ◽

Shahfida Amjad Munni

Keyword(s):

Neural Network ◽

Speech Recognition ◽

Linear Prediction ◽

Short Term Memory ◽

Acoustic Features ◽

Linear Predictive Coding ◽

Acoustic Feature ◽

Mel Frequency Cepstral Coefficients ◽

Bhattacharyya Distance ◽

Perceptual Linear Prediction

This work compares the performance of various acoustic feature extraction methods in a Bangla speech recognition system built on a Long Short-Term Memory (LSTM) neural network. The acoustic features are a series of vectors that represent the speech signal. They can be classified into either words or sub-word units such as phonemes. Linear predictive coding (LPC) is used first as the acoustic vector extraction technique, chosen for its widespread popularity. Other vector extraction techniques, namely Mel frequency cepstral coefficients (MFCC) and perceptual linear prediction (PLP), are then used as well. These two methods closely resemble the human auditory system. The LSTM neural network is then trained on these feature vectors. Finally, the resulting models of different phonemes are compared using two statistical tools, Bhattacharyya distance and Mahalanobis distance, to investigate the nature of those acoustic features.
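
The two distance measures named in this abstract can be sketched as follows. This is a minimal illustration, not the authors' code: it assumes each phoneme model is summarized as a single Gaussian with diagonal covariance (a mean vector and a variance vector), which the abstract does not state. The function names are hypothetical.

```python
import math

def bhattacharyya_distance(mean1, var1, mean2, var2):
    """Bhattacharyya distance between two diagonal-covariance Gaussians.

    Assumption: each model is one Gaussian described by per-dimension
    means and variances (all variances must be positive).
    """
    mean_term = 0.0
    log_term = 0.0
    for m1, v1, m2, v2 in zip(mean1, var1, mean2, var2):
        avg_var = 0.5 * (v1 + v2)              # averaged covariance entry
        mean_term += (m1 - m2) ** 2 / avg_var  # separation of the means
        log_term += math.log(avg_var / math.sqrt(v1 * v2))  # covariance mismatch
    return 0.125 * mean_term + 0.5 * log_term

def mahalanobis_distance(x, mean, var):
    """Mahalanobis distance from point x to a diagonal-covariance Gaussian."""
    return math.sqrt(sum((xi - mi) ** 2 / vi for xi, mi, vi in zip(x, mean, var)))
```

With identical models the Bhattacharyya distance is zero, and it grows as either the means or the variances diverge, which is what makes it usable for comparing how well different feature sets separate phoneme models.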

Automatic Content based Classification of Speech Audio using Multiple Instance Learning

International Journal of Recent Technology and Engineering - 2 ◽

10.35940/ijrte.e5616.038620 ◽

2020 ◽

Vol 8 (6) ◽

pp. 410-414

Keyword(s):

Linear Prediction ◽

Research Problem ◽

Multiple Instance Learning ◽

Audio Classification ◽

Mel Frequency Cepstral Coefficients ◽

Content Understanding ◽

Novel Approach ◽

Active Research ◽

Perceptual Linear Prediction

Audio content understanding is an active research problem in the area of speech analytics. This paper introduces a novel approach to content-based news audio classification using Multiple Instance Learning (MIL). Content-based analysis provides useful information for audio classification as well as segmentation. A key step in this direction is to propose a classifier that can predict the category of the input audio sample. Two types of features are used for audio content detection, namely Perceptual Linear Prediction (PLP) coefficients and Mel-Frequency Cepstral Coefficients (MFCC). Two MIL techniques, mi-Graph and mi-SVM, are used for classification. The results obtained with these methods are evaluated using different performance metrics. The experimental results show that MIL demonstrates excellent audio classification capability.

Perceptual Linear Prediction Feature as an Indicator of Dysphonia

Lecture Notes in Electrical Engineering - Advances in Control Instrumentation Systems ◽

10.1007/978-981-15-4676-1_5 ◽

2020 ◽

pp. 51-64 ◽

Cited By ~ 1

Author(s):

Jennifer C. Saldanha ◽

Malini Suvarna

Keyword(s):

Linear Prediction ◽

Perceptual Linear Prediction

An Encapsulation of Vital Non-Linear Frequency Features for Various Speech Applications

Journal of Computational and Theoretical Nanoscience ◽

10.1166/jctn.2020.8666 ◽

2020 ◽

Vol 17 (1) ◽

pp. 303-307

Author(s):

S. Lalitha ◽

Deepa Gupta

Keyword(s):

Speech Recognition ◽

Linear Prediction ◽

Performance Metrics ◽

Speaker Identification ◽

Frequency Estimation ◽

Mel Frequency Cepstral Coefficients ◽

Environment Type ◽

Perceptual Linear Prediction ◽

Frequency Features ◽

Selection Of

Mel Frequency Cepstral Coefficients (MFCCs) and Perceptual Linear Prediction Coefficients (PLPCs) are widely used nonlinear vocal parameters in the majority of speaker identification, speaker recognition, and speech recognition techniques, as well as in the field of emotion recognition. Since the 1980s, significant effort has gone into improving these features. Considerations such as the use of appropriate frequency estimation approaches, the design of appropriate filter banks, and the selection of preferred features play a vital part in the strength of models that employ these features. This article presents an overview of MFCC and PLPC features for different speech applications. Aspects such as accuracy metrics, background environment, type of data, and feature size are examined and summarized with the corresponding key references. In addition, the advantages and shortcomings of these features are discussed. This background work will hopefully contribute a step toward enhancing MFCC and PLPC with respect to novelty, higher accuracy, and lower complexity.
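
The "nonlinear frequency" aspect of both feature types comes from warping the frequency axis before the filter bank is applied. The sketch below shows the commonly used mel warping for MFCC and the Bark warping from the original PLP formulation, plus the cube-root intensity-to-loudness compression that PLP applies. It is an illustration under those standard formulas, not code from the article, and the function names are hypothetical.

```python
import math

def hz_to_mel(freq_hz):
    """Common mel-scale warping used when placing MFCC filter banks."""
    return 2595.0 * math.log10(1.0 + freq_hz / 700.0)

def hz_to_bark(freq_hz):
    """Bark-scale warping used in PLP: 6 * asinh(f / 600)."""
    return 6.0 * math.asinh(freq_hz / 600.0)

def cube_root_compress(power):
    """Intensity-to-loudness compression used in PLP (power must be >= 0)."""
    return power ** (1.0 / 3.0)

if __name__ == "__main__":
    # Both scales are roughly linear at low frequencies and compress high ones.
    for f in (100, 500, 1000, 4000, 8000):
        print(f, round(hz_to_mel(f), 1), round(hz_to_bark(f), 2))
```

Both warpings are monotonic, so they change how filter-bank channels are spaced over frequency without reordering them.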

Comparison of Several Acoustic Modeling Techniques for Speech Emotion Recognition

Cognitive Analytics ◽

10.4018/978-1-7998-2460-2.ch015 ◽

2020 ◽

pp. 283-293

Author(s):

Imen Trabelsi ◽

Med Salim Bouhlel

Keyword(s):

Emotion Recognition ◽

Linear Prediction ◽

Recognition Rate ◽

Gaussian Mixture Models ◽

Gaussian Mixture ◽

Recognition System ◽

Speech Emotion Recognition ◽

Support Vector ◽

Emotional States ◽

Perceptual Linear Prediction

Automatic Speech Emotion Recognition (SER) is a current research topic in the field of Human Computer Interaction (HCI) with a wide range of applications. The purpose of a speech emotion recognition system is to automatically classify a speaker's utterances into different emotional states such as disgust, boredom, sadness, neutral, and happiness. The speech samples in this paper are from the Berlin emotional database. Mel frequency cepstrum coefficients (MFCC), linear prediction coefficients (LPC), linear prediction cepstrum coefficients (LPCC), Perceptual Linear Prediction (PLP), and Relative Spectral Perceptual Linear Prediction (Rasta-PLP) features are used to characterize the emotional utterances using a combination of Gaussian mixture models (GMM) and Support Vector Machines (SVM) based on the Kullback-Leibler Divergence Kernel. The study compares the effect of feature type and feature dimension. The best results are obtained with 12-coefficient MFCC. Using the proposed features, a recognition rate of 84% has been achieved, which is close to the performance of humans on this database.
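
To give a feel for what a Kullback-Leibler divergence kernel computes, the sketch below handles the simplest case: two single Gaussians with diagonal covariance, where the divergence has a closed form. The abstract does not say how the divergence between full GMMs is computed, so this is an assumption-laden illustration rather than the authors' method. The exponential form of the kernel and the `scale` parameter are also assumptions, and the function names are hypothetical.

```python
import math

def kl_diag_gauss(mean_p, var_p, mean_q, var_q):
    """Closed-form KL(p || q) for two diagonal-covariance Gaussians."""
    total = 0.0
    for mp, vp, mq, vq in zip(mean_p, var_p, mean_q, var_q):
        total += 0.5 * (math.log(vq / vp) + (vp + (mp - mq) ** 2) / vq - 1.0)
    return total

def kl_kernel(mean_p, var_p, mean_q, var_q, scale=1.0):
    """Similarity in (0, 1] built from the symmetrized KL divergence.

    Assumption: the kernel is exp(-scale * symmetric_KL); `scale` is a
    hypothetical tuning parameter, not a value taken from the paper.
    """
    symmetric_kl = (kl_diag_gauss(mean_p, var_p, mean_q, var_q)
                    + kl_diag_gauss(mean_q, var_q, mean_p, var_p))
    return math.exp(-scale * symmetric_kl)
```

Symmetrizing the divergence matters because KL(p || q) and KL(q || p) differ in general, while an SVM kernel must give the same value regardless of argument order.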

Feature Extraction Methods Proposed for Speech Recognition Are Effective on Road Condition Monitoring Using Smartphone Inertial Sensors

Sensors ◽

10.3390/s19163481 ◽

2019 ◽

Vol 19 (16) ◽

pp. 3481 ◽

Cited By ~ 1

Author(s):

Frederico Soares Cabral ◽

Hidekazu Fukai ◽

Satoshi Tamura

Keyword(s):

Feature Extraction ◽

Speech Recognition ◽

Condition Monitoring ◽

Linear Prediction ◽

Machine Learning Techniques ◽

Feature Extraction Method ◽

Extraction Step ◽

Road Condition ◽

Road Condition Monitoring ◽

Perceptual Linear Prediction

The objective of our project is to develop an automatic survey system for road condition monitoring using smartphone devices. One of the main tasks of our project is the classification of paved and unpaved roads. Because recordings in practice will be obtained with various types of vehicle suspension systems and speeds, we use the multiple sensors found in smartphones and state-of-the-art machine learning techniques for signal processing. Although it usually receives little attention, the feature extraction step strongly influences the classification results. We therefore have to choose carefully not only the classification method but also the feature extraction method and its parameters. Simple statistics-based features are most commonly used to extract road surface information from acceleration data. In this study, we evaluated mel-frequency cepstral coefficients (MFCC) and perceptual linear prediction coefficients (PLP) as the feature extraction step to improve the accuracy of paved and unpaved road classification. Although both MFCC and PLP were developed in the field of human speech recognition, we found that modified MFCC and PLP can improve on the commonly used statistical method.
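
The "simple statistics-based features" that serve as the baseline here can be sketched as below. This is an illustration only: the abstract does not list which statistics, window length, or step size were used, so the particular features (mean, variance, RMS, peak-to-peak) and the function names are assumptions.

```python
import math

def sliding_windows(signal, size, step):
    """Split a 1-D acceleration signal into fixed-size, possibly overlapping windows."""
    return [signal[i:i + size] for i in range(0, len(signal) - size + 1, step)]

def window_features(samples):
    """Basic statistics of one non-empty window of acceleration samples."""
    n = len(samples)
    mean = sum(samples) / n
    variance = sum((s - mean) ** 2 for s in samples) / n  # population variance
    rms = math.sqrt(sum(s * s for s in samples) / n)
    return {
        "mean": mean,
        "variance": variance,
        "rms": rms,
        "peak_to_peak": max(samples) - min(samples),
    }

if __name__ == "__main__":
    accel = [0.1, 0.4, -0.2, 0.9, -0.7, 0.3, 0.0, -0.1]  # made-up sample values
    for window in sliding_windows(accel, size=4, step=2):
        print(window_features(window))
```

Features like these describe only the amplitude distribution within a window, whereas MFCC and PLP also capture how energy is spread across frequency.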

A Speech to Machine Interface Based on Perceptual Linear Prediction and Classification

2019 Advances in Science and Engineering Technology International Conferences (ASET) ◽

10.1109/icaset.2019.8714304 ◽

2019 ◽

Cited By ~ 4

Author(s):

Saeed Mian Qaisar ◽

Noofa Hainmad ◽

Raviha Khan ◽

Rawan Asfour

Keyword(s):

Linear Prediction ◽

Machine Interface ◽

Perceptual Linear Prediction

Speech/music classification using PLP and SVM

International Journal Of Engineering And Computer Science ◽

10.18535/ijecs.v8i02.4277 ◽

2019 ◽

Vol 8 (02) ◽

pp. 24469-24472

Author(s):

Thiruvengatanadhan R

Keyword(s):

Feature Extraction ◽

Linear Prediction ◽

Classification Problem ◽

Vector Model ◽

Support Vector ◽

Audio Classification ◽

Music Classification ◽

Audio Retrieval ◽

Audio Indexing ◽

Perceptual Linear Prediction

Automatic audio classification is very useful in audio indexing, content-based audio retrieval, and online audio distribution. This paper deals with the speech/music classification problem, starting from a set of features extracted directly from audio data. The accuracy of the classification relies on the strength of the features and the classification scheme. In this work, Perceptual Linear Prediction (PLP) features are extracted from the input signal. After feature extraction, classification is carried out using a Support Vector Machine (SVM) model. The proposed feature extraction and classification models result in better accuracy in speech/music classification.

Optimizing Integrated Features for Hindi Automatic Speech Recognition System

Journal of Intelligent Systems ◽

10.1515/jisys-2018-0057 ◽

2018 ◽

Vol 29 (1) ◽

pp. 959-976

Author(s):

Mohit Dua ◽

Rajesh Kumar Aggarwal ◽

Mantosh Biswas

Keyword(s):

Speech Recognition ◽

Automatic Speech Recognition ◽

Linear Prediction ◽

Optimization Methods ◽

Recognition System ◽

Extraction Methods ◽

Automatic Speech Recognition System ◽

Sequential Combination ◽

Perceptual Linear Prediction ◽

ASR System

An automatic speech recognition (ASR) system translates spoken words or utterances (isolated, connected, continuous, and spontaneous) into text format. State-of-the-art ASR systems mainly use Mel frequency (MF) cepstral coefficient (MFCC), perceptual linear prediction (PLP), and Gammatone frequency (GF) cepstral coefficient (GFCC) for extracting features in the training phase of the ASR system. Initially, the paper proposes a sequential combination of all three feature extraction methods, taking two at a time. Six combinations, MF-PLP, PLP-MFCC, MF-GFCC, GF-MFCC, GF-PLP, and PLP-GFCC, are used, and the accuracy of the proposed system is tested with each of them. The results show that the GF-MFCC and MF-GFCC integrations outperform all other proposed integrations. Further, these two feature vector integrations are optimized using three different optimization methods: particle swarm optimization (PSO), PSO with crossover, and PSO with quadratic crossover (Q-PSO). The results demonstrate that the Q-PSO-optimized GF-MFCC integration shows significant improvement over all other optimized combinations.
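
The count of six combinations follows from taking ordered pairs of three feature types (3 x 2 = 6). The sketch below enumerates those pairs and shows one plausible reading of "sequential combination", namely concatenating the two per-frame feature vectors in order. That reading is an assumption, since the abstract does not define the operation, and the function names and sample values are hypothetical.

```python
from itertools import permutations

FEATURES = ("MFCC", "PLP", "GFCC")

def ordered_pairs(names=FEATURES):
    """All ordered pairs of distinct feature types: 3 * 2 = 6 combinations."""
    return list(permutations(names, 2))

def combine(frame_features, first, second):
    """Concatenate two per-frame feature vectors in the given order.

    Assumption: 'sequential combination' means order-preserving
    concatenation; the paper may define it differently.
    """
    return list(frame_features[first]) + list(frame_features[second])

if __name__ == "__main__":
    # Made-up two-dimensional vectors standing in for one frame of real features.
    frame = {"MFCC": [0.1, 0.2], "PLP": [0.3, 0.4], "GFCC": [0.5, 0.6]}
    for first, second in ordered_pairs():
        print(f"{first}-{second}", combine(frame, first, second))
```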
