Towards real-world objective speech quality and intelligibility assessment using speech-enhancement residuals and convolutional long short-term memory networks

Terrain classification is a critical component of any autonomous mobile robot system operating in unknown real-world environments. Over the years, several proprioceptive terrain classification techniques have been introduced to increase robustness or to act as a fallback for traditional vision-based approaches. However, they have not been widely adopted because of inadequate accuracy, limited robustness, and slow run times. In this paper, we use vehicle-terrain interaction sounds as a proprioceptive modality and propose a deep long short-term memory (LSTM) based recurrent model that captures both the spatial and temporal dynamics of the problem, thereby overcoming these limitations. Our model consists of a new convolutional neural network architecture that learns deep spatial features, complemented by LSTM units that learn complex temporal dynamics. Experiments on two extensive datasets collected with different microphones on various indoor and outdoor terrains demonstrate state-of-the-art performance compared to existing techniques. We additionally evaluate performance in adverse acoustic conditions with high ambient noise and propose a noise-aware training scheme that enables the learning of more generalizable models, which are essential for robust real-world deployments.
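
The abstract does not say how noise enters the noise-aware training scheme. One common way to build noisy training examples is to add a noise recording to a clean one at a chosen signal-to-noise ratio (SNR). The following is a minimal sketch of that idea only, not the authors' method; the function names `mix_at_snr` and `measured_snr_db`, the 440 Hz test tone, and the 5 dB target are all illustrative assumptions.

```python
import math
import random


def mix_at_snr(clean, noise, snr_db):
    """Return clean + gain * noise, with gain chosen to hit the requested SNR (dB)."""
    if len(clean) != len(noise):
        raise ValueError("clean and noise must have the same length")
    clean_power = sum(x * x for x in clean) / len(clean)
    noise_power = sum(n * n for n in noise) / len(noise)
    if clean_power == 0 or noise_power == 0:
        raise ValueError("both signals must have non-zero power")
    # Solve clean_power / (gain^2 * noise_power) = 10^(snr_db / 10) for gain.
    gain = math.sqrt(clean_power / (noise_power * 10 ** (snr_db / 10)))
    return [x + gain * n for x, n in zip(clean, noise)]


def measured_snr_db(clean, mixture):
    """Recover the SNR of a mixture given the clean reference signal."""
    residual = [m - x for x, m in zip(clean, mixture)]
    clean_power = sum(x * x for x in clean) / len(clean)
    residual_power = sum(r * r for r in residual) / len(residual)
    return 10 * math.log10(clean_power / residual_power)


# Hypothetical data: a 440 Hz tone sampled at 16 kHz plus Gaussian noise.
rng = random.Random(0)
clean = [math.sin(2 * math.pi * 440 * t / 16000) for t in range(1600)]
noise = [rng.gauss(0.0, 1.0) for _ in range(1600)]
noisy = mix_at_snr(clean, noise, snr_db=5.0)
```

Sampling `snr_db` from a range during training, rather than fixing it, would expose a model to a spread of acoustic conditions. Whether the paper does this is not stated in the abstract.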

A Deep Neural Network Model for Short-Term Load Forecast Based on Long Short-Term Memory Network and Convolutional Neural Network

Energies ◽ 10.3390/en11123493 ◽ 2018 ◽ Vol 11 (12) ◽ pp. 3493 ◽ Cited By ~ 60

Author(s): Chujie Tian ◽ Jian Ma ◽ Chunhong Zhang ◽ Panpan Zhan

Keyword(s): Neural Network ◽ Real World ◽ Deep Neural Network ◽ Short Term Memory ◽ Electrical Load ◽ Short Term ◽ Term Memory ◽ Load Forecast ◽ Proposed Model ◽ Long Short Term Memory

Accurate electrical load forecasting is of great significance in helping power companies schedule and manage their operations efficiently. Because the load time series carries a high level of uncertainty, making accurate short-term load forecasts (STLF) is a challenging task. In recent years, deep learning approaches have provided better performance in predicting electrical load in real-world cases. The convolutional neural network (CNN) can extract local trends and capture recurring patterns, and the long short-term memory (LSTM) network learns relationships across time steps. In this paper, a new deep neural network framework that integrates the hidden features of the CNN model and the LSTM model is proposed to improve forecasting accuracy. The proposed model was tested in a real-world case, and detailed experiments were conducted to validate its practicality and stability. The forecasting performance of the proposed model was compared with that of the LSTM model and the CNN model. The Mean Absolute Error (MAE), Mean Absolute Percentage Error (MAPE) and Root Mean Square Error (RMSE) were used as the evaluation indexes. The experimental results demonstrate that the proposed model achieves better and more stable performance in STLF.
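
The three evaluation indexes named in the abstract have standard definitions, sketched below in plain Python. The function names and the sample load values are illustrative assumptions and do not come from the paper.

```python
import math


def mae(y_true, y_pred):
    """Mean Absolute Error: average magnitude of the forecast errors."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def mape(y_true, y_pred):
    """Mean Absolute Percentage Error, in percent. Undefined if any true value is 0."""
    return 100.0 * sum(abs((t - p) / t) for t, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true, y_pred):
    """Root Mean Square Error: penalizes large errors more heavily than MAE."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


# Hypothetical hourly loads (e.g. in MW) and the corresponding forecasts.
actual = [100.0, 200.0, 400.0]
forecast = [110.0, 190.0, 400.0]

print(f"MAE  = {mae(actual, forecast):.3f}")
print(f"MAPE = {mape(actual, forecast):.3f} %")
print(f"RMSE = {rmse(actual, forecast):.3f}")
```

MAE and RMSE are in the units of the load itself, while MAPE is scale-free, which is one reason the three are often reported together when comparing forecasting models.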

Speech enhancement using Long Short-Term Memory based recurrent Neural Networks for noise robust Speaker Verification

2016 IEEE Spoken Language Technology Workshop (SLT) ◽ 10.1109/slt.2016.7846281 ◽ 2016 ◽ Cited By ~ 10

Author(s): Morten Kolboek ◽ Zheng-Hua Tan ◽ Jesper Jensen

Keyword(s): Neural Networks ◽ Speech Enhancement ◽ Recurrent Neural Networks ◽ Short Term Memory ◽ Speaker Verification ◽ Short Term ◽ Term Memory ◽ Long Short Term Memory ◽ Noise Robust

A Two-Stage Big Data Analytics Framework with Real World Applications Using Spark Machine Learning and Long Short-Term Memory Network

Symmetry ◽ 10.3390/sym10100485 ◽ 2018 ◽ Vol 10 (10) ◽ pp. 485 ◽ Cited By ~ 6

Author(s): Muhammad Ashfaq Khan ◽ Md. Rezaul Karim ◽ Yangwoo Kim

Keyword(s): Machine Learning ◽ Big Data ◽ Real World ◽ Data Analytics ◽ Short Term Memory ◽ Big Data Analytics ◽ Short Term ◽ Two Stage ◽ Term Memory ◽ Long Short Term Memory

Every day we experience unprecedented data growth from numerous sources, which contributes to big data in terms of volume, velocity, and variability. These datasets pose great challenges for analytics frameworks and computational resources, making it difficult to extract meaningful information in a timely manner. Developing an efficient big data analytics framework is therefore an important research topic. To address these challenges by exploiting non-linear relationships in very large, high-dimensional datasets, machine learning (ML) and deep learning (DL) algorithms are being used in analytics frameworks. Apache Spark is widely used as a fast big data processing engine, and it helps to solve iterative ML tasks through its distributed ML library, Spark MLlib. For real-world research problems, DL architectures such as Long Short-Term Memory (LSTM) are an effective way to overcome practical issues found in conventional deep architectures, such as reduced accuracy, long-term sequence dependency, and vanishing and exploding gradients. In this paper, we propose an efficient analytics framework: a progressive machine learning technique that combines Spark-based linear models, a Multilayer Perceptron (MLP) and LSTM in a two-stage cascade structure in order to enhance predictive accuracy. Our proposed architecture enables us to organize big data analytics in a scalable and efficient way. To show the effectiveness of our framework, we applied the cascading structure to two different real-life datasets to solve a multiclass and a binary classification problem, respectively. Experimental results show that our analytical framework outperforms state-of-the-art approaches, achieving a high level of classification accuracy.
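
The abstract does not spell out how the two stages are connected. One common reading of a cascade is that the first-stage model's output is appended to the input features of the second-stage model. The sketch below illustrates only that generic pattern; `cascade_predict`, `linear_score`, and `threshold_classifier` are hypothetical stand-ins, not the paper's Spark, MLP, or LSTM components.

```python
from typing import Callable, List, Sequence


def cascade_predict(
    stage_one: Callable[[Sequence[float]], float],
    stage_two: Callable[[Sequence[float]], int],
    rows: Sequence[Sequence[float]],
) -> List[int]:
    """Score each row with stage one, append the score as a feature, classify with stage two."""
    predictions = []
    for row in rows:
        score = stage_one(row)
        augmented = list(row) + [score]  # original features plus the stage-one output
        predictions.append(stage_two(augmented))
    return predictions


def linear_score(row: Sequence[float]) -> float:
    """Hypothetical stage-one model: a fixed linear scorer standing in for a trained one."""
    weights = [0.5, -0.25]
    return sum(w * x for w, x in zip(weights, row))


def threshold_classifier(row: Sequence[float]) -> int:
    """Hypothetical stage-two model: thresholds the appended stage-one score."""
    return 1 if row[-1] > 0.0 else 0


rows = [[2.0, 1.0], [0.0, 4.0]]
labels = cascade_predict(linear_score, threshold_classifier, rows)
```

In a real pipeline both stages would be trained models, and the second stage would typically use the original features as well as the appended score.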

Multi-Scale Residual Convolutional Encoder Decoder with Bidirectional Long Short-Term Memory for Single Channel Speech Enhancement

2020 28th European Signal Processing Conference (EUSIPCO) ◽ 10.23919/eusipco47968.2020.9287618 ◽ 2021

Author(s): Yang Xian ◽ Yang Sun ◽ Wenwu Wang ◽ Syed Mohsen Naqvi

Keyword(s): Speech Enhancement ◽ Short Term Memory ◽ Single Channel ◽ Short Term ◽ Term Memory ◽ Multi Scale ◽ Long Short Term Memory ◽ Convolutional Encoder
