Two-level optimization approach with accelerated proximal gradient for objective measures in sparse speech reconstruction

Journal of Industrial & Management Optimization ◽

10.3934/jimo.2021131 ◽

2021 ◽

Vol 0 (0) ◽

pp. 0

Author(s):

Hai Huyen Dam ◽

Siow Yong Low ◽

Sven Nordholm

Keyword(s):

Speech Enhancement ◽

Quality Measures ◽

Optimization Method ◽

Optimization Approach ◽

Reconstruction Process ◽

Time Frequency ◽

Score Improvement ◽

Frequency Representation ◽

Objective Quality ◽

Speech Reconstruction

Compressive speech enhancement exploits the sparseness of speech and the non-sparseness of noise in the time-frequency representation to perform speech enhancement. However, reconstructing the sparsest output does not necessarily yield a good enhanced speech signal, as it risks speech distortion. This paper proposes a two-level optimization approach to incorporate objective quality measures in compressive speech enhancement. The proposed method combines the accelerated proximal gradient approach and a global one-dimensional optimization method to solve the sparse reconstruction. By incorporating objective quality measures in the optimization process, the reconstructed output is not only sparse but also maintains the highest objective quality score possible. In other words, the sparse speech reconstruction process is now a quality sparse speech reconstruction. Experimental results in compressive speech enhancement consistently show score improvements in objective measures in different noisy environments compared to the non-optimized method. Additionally, the proposed optimization yields a higher convergence rate with a lower computational complexity compared to the existing methods.
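The two-level idea in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes the inner problem is an l1-regularized least-squares reconstruction solved by accelerated proximal gradient (FISTA), and it substitutes a plain grid search over the regularization weight for the global one-dimensional method. The names `fista_lasso`, `two_level_search`, `score_fn` and `lam_grid` are hypothetical, and `score_fn` stands in for whatever objective quality measure is used.

```python
import numpy as np


def soft_threshold(v, t):
    """Proximal operator of t * ||.||_1 (element-wise shrinkage)."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)


def fista_lasso(A, y, lam, n_iter=200):
    """Inner level: minimize 0.5 * ||A x - y||^2 + lam * ||x||_1 with FISTA."""
    lipschitz = np.linalg.norm(A, 2) ** 2  # Lipschitz constant of the gradient
    x = np.zeros(A.shape[1])
    z = x.copy()  # extrapolated point
    t = 1.0
    for _ in range(n_iter):
        grad = A.T @ (A @ z - y)
        x_new = soft_threshold(z - grad / lipschitz, lam / lipschitz)
        t_new = 0.5 * (1.0 + np.sqrt(1.0 + 4.0 * t * t))
        z = x_new + ((t - 1.0) / t_new) * (x_new - x)  # momentum step
        x, t = x_new, t_new
    return x


def two_level_search(A, y, score_fn, lam_grid):
    """Outer level: pick the weight whose reconstruction scores highest.

    Assumption: a grid search replaces the global 1-D optimizer, and
    score_fn is a user-supplied quality score (higher is better).
    """
    best = None
    for lam in lam_grid:
        x = fista_lasso(A, y, lam)
        score = score_fn(x)
        if best is None or score > best[0]:
            best = (score, lam, x)
    return best  # (score, lam, x)
```

In this sketch the outer loop only ever sees a scalar score per candidate weight, which is what lets a one-dimensional search drive the sparse solver.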

Evaluation of Objective Quality Measures for Speech Enhancement

IEEE Transactions on Audio Speech and Language Processing ◽

10.1109/tasl.2007.911054 ◽

2008 ◽

Vol 16 (1) ◽

pp. 229-238 ◽

Cited By ~ 791

Author(s):

Yi Hu ◽

Philipos C. Loizou

Keyword(s):

Speech Enhancement ◽

Quality Measures ◽

Objective Quality

Telugu Speech Enhancement In Terms Of Objective Quality Measures Using Discrete Wavelet Transform With Hybrid Thresholding

International Journal of Advanced Research in Electrical Electronics and Instrumentation Engineering ◽

10.15662/ijareeie.2014.0308041 ◽

2014 ◽

Vol 03 (08) ◽

pp. 11234-11247

Author(s):

V. Harika ◽

A. SubbaRami Reddy ◽

S. China Venkateswarlu

Keyword(s):

Wavelet Transform ◽

Discrete Wavelet Transform ◽

Speech Enhancement ◽

Quality Measures ◽

Discrete Wavelet ◽

Objective Quality

Wiener Filtering in Frequency Domain to Enhance Speech Corrupted by Colored Noise

International Journal of Recent Technology and Engineering - 2 ◽

10.35940/ijrte.b1179.0982s1119 ◽

2019 ◽

Vol 8 (2S11) ◽

pp. 1058-1062

Keyword(s):

Frequency Domain ◽

Performance Measures ◽

Speech Enhancement ◽

Colored Noise ◽

Quality Measures ◽

Wiener Filtering ◽

Spectral Distortion ◽

Gain Function ◽

Objective Quality ◽

Log Likelihood

This paper presents a method for speech enhancement to predict speech quality in the presence of highly non-stationary scenarios, using basic Wiener filtering in the frequency domain with an adaptive gain function under eight different noises at three different ranges of input SNR. Its performance is evaluated in terms of objective quality measures, namely the LPC-based spectral distortion measures Cepstrum Distance, Itakura-Saito and Log Likelihood Ratio. The method was tested using the NOIZEUS database; its performance measures were compared against spectral subtractive type algorithms, and it shows improvements in terms of objective quality measures.
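For orientation, a frequency-domain Wiener gain can be sketched as below. The abstract does not specify its adaptive gain function, so this is the textbook form G = SNR / (1 + SNR) with a simple power-subtraction SNR estimate. The function name `wiener_gain`, the `floor` parameter and the per-bin power inputs are all assumptions made for illustration.

```python
import numpy as np


def wiener_gain(noisy_psd, noise_psd, floor=1e-3):
    """Per-bin Wiener gain from noisy-speech and noise power spectra.

    Assumption: the SNR is estimated by power subtraction,
    snr = max(noisy / noise - 1, 0), and the gain is floored to
    limit musical noise. Neither choice comes from the paper.
    """
    noisy_psd = np.asarray(noisy_psd, dtype=float)
    noise_psd = np.maximum(np.asarray(noise_psd, dtype=float), 1e-12)
    snr = np.maximum(noisy_psd / noise_psd - 1.0, 0.0)
    gain = snr / (1.0 + snr)
    return np.maximum(gain, floor)
```

The enhanced spectrum would then be the noisy spectrum multiplied bin by bin with this gain before the inverse transform.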

A Feature Optimization Approach Based on Inter-Class and Intra-Class Distance for Ship Type Classification

Sensors ◽

10.3390/s20185429 ◽

2020 ◽

Vol 20 (18) ◽

pp. 5429

Author(s):

Chen Li ◽

Ziyuan Liu ◽

Jiawei Ren ◽

Wenchao Wang ◽

Ji Xu

Keyword(s):

Deep Learning ◽

Objective Function ◽

State Of The Art ◽

Optimization Method ◽

Classification Algorithms ◽

Optimization Approach ◽

Time Frequency ◽

Real Environment ◽

Feature Optimization ◽

Type Classification

Deep learning based methods have achieved state-of-the-art results on the task of ship type classification. However, most existing ship type classification algorithms take time–frequency (TF) features as input, and the underlying discriminative information of these features has not been explored thoroughly. This paper proposes a novel feature optimization method designed to minimize an objective function aimed at increasing inter-class and reducing intra-class feature distance for ship type classification. The objective function is able to learn a center for each class and pull samples from the same class closer to the corresponding center. This ensures that the features maximize the underlying discriminative information in the data, particularly for targets that are usually confused by conventional manually designed features. Results on a dataset from a real environment show that the proposed feature optimization approach outperforms traditional TF features.
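The two quantities such an objective trades off can be made concrete with a small sketch. It is an illustration only, not the paper's loss: class centers are taken as plain per-class means rather than learned parameters, and the name `center_distances` is hypothetical.

```python
import numpy as np


def center_distances(features, labels):
    """Return (intra, inter) squared distances for labeled features.

    intra: mean squared distance of each sample to its class center.
    inter: mean squared distance between distinct class centers.
    Assumption: centers are per-class means, not learned.
    """
    features = np.asarray(features, dtype=float)
    labels = np.asarray(labels)
    classes = np.unique(labels)  # sorted unique labels
    centers = np.stack([features[labels == c].mean(axis=0) for c in classes])
    idx = np.searchsorted(classes, labels)  # center index of each sample
    intra = np.mean(np.sum((features - centers[idx]) ** 2, axis=1))
    diffs = centers[:, None, :] - centers[None, :, :]
    pair_d2 = np.sum(diffs ** 2, axis=-1)
    k = len(classes)
    inter = pair_d2.sum() / (k * (k - 1)) if k > 1 else 0.0
    return intra, inter
```

A training objective in this spirit would decrease `intra` while increasing `inter`, for example by minimizing their difference or ratio.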

Time-Frequency Representation Based on Robust Local Mean Decomposition

Volume 14: Emerging Technologies; Materials: Genetics to Structures; Safety Engineering and Risk Analysis ◽

10.1115/imece2016-65184 ◽

2016 ◽

Cited By ~ 1

Author(s):

Zhiliang Liu ◽

Yaqiang Jin ◽

Ming J. Zuo

Keyword(s):

Boundary Condition ◽

Moving Average ◽

Optimization Approach ◽

Stopping Criterion ◽

Improve Performance ◽

Subset Size ◽

Time Frequency ◽

Fixed Subset ◽

Frequency Representation ◽

Local Mean

Fourier-transform-based frequency representation assumes stationarity and linearity of the target signal whose spectrum is to be computed, and thus it is unable to track the time-varying characteristics of non-stationary signals, which are widespread in the physical world. Time-frequency representation (TFR) is a technique to reveal useful information included in the signals, and thus TFR methods are very attractive to the scientific and engineering world. Local mean decomposition (LMD) is a TFR technique used in many fields, e.g. machinery fault diagnosis. Similar to the Hilbert-Huang transform, it is an alternative approach to demodulate amplitude-modulation (AM) and frequency-modulation (FM) signals into a set of components, each of which is the product of an instantaneous envelope signal and a pure FM signal. The TFR can then be derived from the instantaneous envelope signal and the pure FM signal. However, the LMD-based TFR technique still has two limitations, i.e. the end effect and the mode mixing problems. Solutions for the two limitations greatly depend on three critical parameters of LMD: boundary condition, envelope estimation, and sifting stopping criterion. Most reported studies aiming to improve the performance of LMD have focused on only one parameter at a time, and thus they ignore the fact that the three parameters are not independent of each other, and that all of them are needed to address the end effect and the mode mixing problems in LMD. In this paper, a robust optimization approach is proposed to improve the performance of LMD through an integrated framework of parameter selection in terms of boundary condition, envelope estimation, and sifting stopping criterion. The proposed optimization approach includes three components. First, the mirror extending method is employed to deal with the boundary condition problem. Second, moving average is used as the smoothing algorithm for envelope estimation of local mean and local magnitude in LMD. The fixed subset size is the only parameter that usually needs to be predefined with prior knowledge. In this step, a self-adaptive method based on statistics theory is proposed to automatically determine a fixed subset size of moving average for accurate envelope estimation. Third, based on the first and the second steps, a soft sifting stopping criterion is proposed to enable LMD to achieve a self-adaptive stop for each sifting process. In this last step, we define an objective function that considers both global and local characteristics of a target signal. Based on the objective function, a heuristic mechanism is proposed to automatically determine the optimal number of sifting iterations in the sifting process. Finally, numerical simulation results show the effectiveness of the robust LMD in terms of mining time-frequency representation information.
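The first two components (mirror extension at the boundaries, then moving-average smoothing) can be sketched as follows. This is a simplified illustration under stated assumptions: the window length is passed in by hand rather than chosen by the paper's self-adaptive method, the mirroring uses NumPy's `reflect` padding, and the name `mirror_moving_average` is hypothetical.

```python
import numpy as np


def mirror_moving_average(x, window):
    """Smooth x with a centered moving average after mirror extension.

    Assumptions: window is odd and window // 2 is smaller than len(x);
    the signal is mirrored about its end points (edge sample not repeated).
    """
    if window < 1 or window % 2 == 0:
        raise ValueError("window must be a positive odd integer")
    x = np.asarray(x, dtype=float)
    half = window // 2
    padded = np.pad(x, half, mode="reflect")  # mirror extension at both ends
    kernel = np.ones(window) / window
    # 'valid' keeps only full overlaps, so the output has the same length as x
    return np.convolve(padded, kernel, mode="valid")
```

Mirroring before smoothing means the first and last samples are averaged over a full window, which is one simple way to reduce the end effect the abstract mentions.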

An Overview of Subjective and Objective Quality Measures for Noisy Speech Enhancement Algorithms

IETE Technical Review ◽

10.4103/0256-4602.83550 ◽

2011 ◽

Vol 28 (4) ◽

pp. 292 ◽

Cited By ~ 7

Author(s):

P Krishnamoorthy

Keyword(s):

Speech Enhancement ◽

Quality Measures ◽

Noisy Speech ◽

Objective Quality

Performance on Speech Enhancement Objective Quality Measures Using Hybrid Wavelet Thresholding

International Journal of Engineering and Advanced Technology - Regular Issue ◽

10.35940/ijeat.f9343.088619 ◽

2019 ◽

Vol 8 (6) ◽

pp. 3523-3533

Keyword(s):

Speech Enhancement ◽

Speech Signal ◽

Quality Measures ◽

Wavelet Thresholding ◽

Wavelet Coefficients ◽

Hard Thresholding ◽

Improvement Set ◽

Improvement Sets ◽

Objective Quality

The quality of being easily understandable is very important for speech signals in communication and other speech-related systems. In order to improve these two in the speech signal, speech improvement sets of computer instructions and devices are used so that it may be better used by other speech processing sets of computer instructions. Most speech communication requires at least one microphone, and the desired speech signal is usually contaminated by background noise and echo. As a result, the speech signal must be "cleaned" with advanced signal preparing devices before it is played out, transmitted, or put away. This work investigates the requirements and degree of improvement in the field of speech improvement using speech de-noising sets of computer instructions reported in the literature, with the main aim of concentrating on the use of the window shape limits/rules in the STSA-based speech improvement process, in which the signal corrupted by noise is divided into frames, each frame is windowed, the windowed speech frames are applied to the speech improvement set of computer instructions, and the improved speech signal is reconstructed in the time domain. In general, speech improvement methods use the Hamming window for this purpose. In this work an attempt has been made to study the effect of window shape on the speech. The Modified Improved thresholding, proposed by Asser Ghanbari and Mohammad Reza Karami, behaves like hard thresholding for wavelet coefficients whose absolute value is greater than the threshold value and like an exponential function for wavelet coefficients whose absolute value is less than the threshold value.

Multi Band Spectral Subtraction for Speech Enhancement with Different Frequency Spacing Methods and their Effect on Objective Quality Measures

International Journal of Image Graphics and Signal Processing ◽

10.5815/ijigsp.2019.05.06 ◽

2019 ◽

Vol 11 (5) ◽

pp. 54-62

Author(s):

P. Sunitha ◽

K. Satya Prasad

Keyword(s):

Speech Enhancement ◽

Quality Measures ◽

Spectral Subtraction ◽

Frequency Spacing ◽

Objective Quality ◽

Multi Band

Wiener Gain and Deep Neural Networks: A Well-Balanced Pair For Speech Enhancement

10.21203/rs.3.rs-900751/v1 ◽

2021 ◽

Author(s):

Dayana Ribas ◽

Antonio Miguel ◽

Alfonso Ortega ◽

Eduardo Lleida

Keyword(s):

Neural Network ◽

Speech Enhancement ◽

Deep Neural Network ◽

Deep Neural Networks ◽

State Of The Art ◽

Quality Measures ◽

Spectral Amplitude ◽

Gain Function ◽

Estimation Algorithms ◽

Objective Quality

This paper proposes a Deep Neural Network (DNN)-based Wiener gain estimator for speech enhancement. The proposal is in the framework of the classical spectral-domain speech enhancement algorithms. In this case, we used the Optimal Modified Log-Spectral Amplitude (OMLSA), but we consider that this proposal could fit many alternative speech estimation algorithms. We determined the best usage of the DNN approach for learning a robust instance of the Wiener gain estimator according to the characteristics of the SNR estimation and the gain function. To design a DNN architecture adjusted for the speech enhancement task, we study various configuration issues frequently used in DNN-based solutions, including speech representations, residual connections, and causal vs. non-causal designs. Thus, we provide conclusions for the use of DNN architectures for enhancement purposes. Experiments show that the proposal provides state-of-the-art results. Beyond the objective quality measures, examples of noisy vs. enhanced speech are available for listening, to demonstrate in practice the capabilities of the method on real audio.

The Application of the Combination of Monte Carlo Optimization Method based QSAR Modeling and Molecular Docking in Drug Design and Development

Mini-Reviews in Medicinal Chemistry ◽

10.2174/1389557520666200212111428 ◽

2020 ◽

Vol 20 (14) ◽

pp. 1389-1402 ◽

Cited By ~ 1

Author(s):

Maja Zivkovic ◽

Marko Zlatanovic ◽

Nevena Zlatanovic ◽

Mladjan Golubović ◽

Aleksandar M. Veselinović

Keyword(s):

Monte Carlo ◽

Molecular Docking ◽

Computer Calculation ◽

Molecular Graph ◽

Optimization Method ◽

Docking Studies ◽

Optimization Approach ◽

Qsar Modeling ◽

Monte Carlo Optimization ◽

Molecular Docking Studies

In recent years, the Monte Carlo optimization approach, a conformation-independent method, has emerged as one of the promising approaches in QSAR modeling. Monte Carlo optimization has proven to be a valuable tool in chemoinformatics, and this review presents its application in drug discovery and design. In this review, the basic principles and important features of these methods are discussed, as well as the advantages of conformation-independent optimal descriptors developed from the molecular graph and the Simplified Molecular Input Line Entry System (SMILES) notation compared to commonly used descriptors in QSAR modeling. This review summarizes the results obtained from Monte Carlo optimization-based QSAR modeling, with the further addition of molecular docking studies, applied to various pharmacologically important endpoints. SMILES-notation-based optimal descriptors, defined as molecular fragments and identified as the main contributors to the increase/decrease of biological activity, are presented; they are used further to design compounds with targeted activity based on computer calculation. In this mini-review, research papers in which molecular docking was applied as an additional method to design molecules and to validate their activity further are summarized. These papers present a very good correlation between the results obtained from Monte Carlo optimization modeling and molecular docking studies.
