Two-level optimization approach with accelerated proximal gradient for objective measures in sparse speech reconstruction

2021 ◽  
Vol 0 (0) ◽  
pp. 0
Author(s):  
Hai Huyen Dam ◽  
Siow Yong Low ◽  
Sven Nordholm

<p style='text-indent:20px;'>Compressive speech enhancement exploits the sparseness of speech and the non-sparseness of noise in the time-frequency representation to perform speech enhancement. However, reconstructing the sparsest output does not necessarily yield a good enhanced speech signal, since it risks distorting the speech. This paper proposes a two-level optimization approach that incorporates objective quality measures into compressive speech enhancement. The proposed method combines the accelerated proximal gradient approach with a global one-dimensional optimization method to solve the sparse reconstruction problem. By incorporating objective quality measures in the optimization process, the reconstructed output is not only sparse but also attains the highest objective quality score possible. In other words, the sparse speech reconstruction process becomes a quality-aware sparse speech reconstruction. Experimental results in compressive speech enhancement consistently show improved scores in objective measures across different noisy environments compared to the non-optimized method. Additionally, the proposed optimization yields a higher convergence rate with lower computational complexity than the existing methods.</p>
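The two-level idea in this abstract can be illustrated with a minimal sketch. It assumes an l1-regularized least-squares inner problem solved by an accelerated proximal gradient (FISTA-style) loop, and an outer one-dimensional search over the regularization weight `lam`. The `quality` callback and the coarse grid `lam_grid` are hypothetical stand-ins for the paper's objective quality measure and global one-dimensional optimizer, neither of which is specified in the abstract.

```python
import numpy as np

def soft_threshold(v, t):
    """Proximal operator of t * ||.||_1 (element-wise shrinkage)."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def fista_lasso(A, y, lam, n_iter=200):
    """Accelerated proximal gradient for min_x 0.5*||A x - y||^2 + lam*||x||_1."""
    L = np.linalg.norm(A, 2) ** 2  # Lipschitz constant of the smooth term's gradient
    x = np.zeros(A.shape[1])
    z, t = x.copy(), 1.0
    for _ in range(n_iter):
        grad = A.T @ (A @ z - y)                      # gradient at the extrapolated point
        x_new = soft_threshold(z - grad / L, lam / L)  # proximal (shrinkage) step
        t_new = (1.0 + np.sqrt(1.0 + 4.0 * t * t)) / 2.0
        z = x_new + ((t - 1.0) / t_new) * (x_new - x)  # momentum extrapolation
        x, t = x_new, t_new
    return x

def two_level_search(A, y, quality, lam_grid):
    """Outer 1-D search: pick the lam whose reconstruction scores best.

    `quality` is a hypothetical callable (higher is better); a grid search
    is used here only as a simple placeholder for a global 1-D optimizer.
    """
    best_lam = max(lam_grid, key=lambda lam: quality(fista_lasso(A, y, lam)))
    return best_lam, fista_lasso(A, y, best_lam)
```

In this sketch the inner solver only enforces sparsity, while the outer loop is the place where a quality score would steer the choice of `lam`.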

2019 ◽  
Vol 8 (2S11) ◽  
pp. 1058-1062

This paper presents a method for speech enhancement and speech quality prediction in highly non-stationary scenarios, using basic Wiener filtering in the frequency domain with an adaptive gain function under eight different noises at three different ranges of input SNR. Its performance is evaluated in terms of objective quality measures, namely the LPC-based spectral distortion measures Cepstrum Distance, Itakura-Saito, and Log Likelihood Ratio. The method was tested using the NOIZEUS database; its performance measures were compared against spectral-subtractive-type algorithms, and it shows improvements in terms of objective quality measures.
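As a rough illustration of one of the measures named above, the following sketch computes the Itakura-Saito divergence directly between two power spectra. This is the textbook spectral form, not necessarily the LPC-based variant the paper uses; the function name and the `eps` guard are assumptions made for the example.

```python
import numpy as np

def itakura_saito(p_ref, p_est, eps=1e-12):
    """Itakura-Saito divergence between a reference and an estimated power spectrum.

    Returns 0 when the spectra are identical and grows as they diverge.
    `eps` is an illustrative guard against division by zero and log(0).
    """
    ratio = (np.asarray(p_ref, dtype=float) + eps) / (np.asarray(p_est, dtype=float) + eps)
    return float(np.mean(ratio - np.log(ratio) - 1.0))
```

A per-frame power spectrum could be obtained, for example, as `np.abs(np.fft.rfft(frame)) ** 2`, with the divergence then averaged over frames.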


Sensors ◽  
2020 ◽  
Vol 20 (18) ◽  
pp. 5429
Author(s):  
Chen Li ◽  
Ziyuan Liu ◽  
Jiawei Ren ◽  
Wenchao Wang ◽  
Ji Xu

Deep learning based methods have achieved state-of-the-art results on the task of ship type classification. However, most existing ship type classification algorithms take time-frequency (TF) features as input, and the underlying discriminative information of these features has not been explored thoroughly. This paper proposes a novel feature optimization method designed to minimize an objective function that increases inter-class and reduces intra-class feature distance for ship type classification. The objective function we design learns a center for each class and pulls samples from the same class closer to the corresponding center. This ensures that the features maximize the underlying discriminative information in the data, particularly for targets that are usually confused when conventional manually designed features are used. Results on a dataset from a real environment show that the proposed feature optimization approach outperforms traditional TF features.
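The "learn a center for each class" part of the objective can be sketched as follows. This is a generic center-loss style formulation that covers only the intra-class term; the inter-class term and the network itself are not described in enough detail in the abstract to reproduce. The function names and the update rate `alpha` are assumptions for illustration.

```python
import numpy as np

def center_loss(features, labels, centers):
    """Half the mean squared distance between each feature and its class center."""
    diffs = features - centers[labels]
    return 0.5 * float(np.mean(np.sum(diffs ** 2, axis=1)))

def update_centers(features, labels, centers, alpha=0.5):
    """Move each class center a fraction `alpha` toward the mean of its samples."""
    new_centers = centers.copy()
    for c in np.unique(labels):
        class_mean = features[labels == c].mean(axis=0)
        new_centers[c] += alpha * (class_mean - centers[c])
    return new_centers
```

In training, a term like this would typically be added to a classification loss so that features stay both separable and compact.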


Author(s):  
Zhiliang Liu ◽  
Yaqiang Jin ◽  
Ming J. Zuo

Fourier transform based frequency representation assumes that the target signal whose spectrum is to be computed is stationary and linear, and thus it is unable to track the time-varying characteristics of non-stationary signals, which are also widespread in the physical world. Time-frequency representation (TFR) is a technique to reveal useful information contained in such signals, and thus TFR methods are very attractive to the scientific and engineering communities. Local mean decomposition (LMD) is a TFR technique used in many fields, e.g. machinery fault diagnosis. Similar to the Hilbert-Huang transform, it is an alternative approach that demodulates amplitude-modulated (AM) and frequency-modulated (FM) signals into a set of components, each of which is the product of an instantaneous envelope signal and a pure FM signal. The TFR can then be derived from the instantaneous envelope signal and the pure FM signal. However, the LMD-based TFR technique still has two limitations, i.e. the end effect and the mode mixing problems. Solutions to these two limitations depend greatly on three critical parameters of LMD: boundary condition, envelope estimation, and sifting stopping criterion. Most reported studies aiming to improve the performance of LMD have focused on only one parameter at a time, and thus they ignore the fact that the three parameters are not independent of each other and that all of them are needed to address the end effect and the mode mixing problems in LMD. In this paper, a robust optimization approach is proposed to improve the performance of LMD through an integrated framework of parameter selection in terms of boundary condition, envelope estimation, and sifting stopping criterion. The proposed optimization approach includes three components. First, the mirror extending method is employed to deal with the boundary condition problem. Second, moving average is used as the smoothing algorithm for envelope estimation of the local mean and local magnitude in LMD. The fixed subset size is the only parameter that usually needs to be predefined with prior knowledge. In this step, a self-adaptive method based on statistics theory is proposed to automatically determine the fixed subset size of the moving average for accurate envelope estimation. Third, based on the first and the second steps, a soft sifting stopping criterion is proposed to enable LMD to achieve a self-adaptive stop for each sifting process. In this last step, we define an objective function that considers both global and local characteristics of a target signal. Based on the objective function, a heuristic mechanism is proposed to automatically determine the optimal number of sifting iterations in the sifting process. Finally, numerical simulation results show the effectiveness of the robust LMD in terms of mining time-frequency representation information.
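The first two components (mirror extension at the boundaries, then moving-average smoothing) can be sketched together. This is a minimal illustration assuming a reflect-style mirror and an odd, fixed window `width` supplied by the caller; the paper's self-adaptive choice of the subset size is not reproduced here, and the function names are hypothetical.

```python
import numpy as np

def mirror_extend(x, n):
    """Extend a signal by n samples on each side by mirroring about its end points."""
    return np.pad(np.asarray(x, dtype=float), n, mode="reflect")

def moving_average(x, width):
    """Smooth x with a centered moving average of odd `width`.

    The signal is mirror-extended first so the output has the same length
    as the input and the window never runs off the ends.
    """
    if width % 2 == 0:
        raise ValueError("width must be odd for a centered window")
    extended = mirror_extend(x, width // 2)
    kernel = np.ones(width) / width
    return np.convolve(extended, kernel, mode="valid")
```

Mirroring before smoothing is one simple way to reduce distortion at the signal ends, which is where the end effect described above arises.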


The quality and intelligibility of speech signals are very important in communication and other speech-related systems. To improve both, speech enhancement algorithms and devices are used so that the signal can be better exploited by other speech processing algorithms. Most speech communication requires at least one microphone, and the desired speech signal is usually contaminated by background noise and echo. As a result, the speech signal must be "cleaned" with advanced signal processing tools before it is played out, transmitted, or stored. This work surveys the requirements and the extent of improvements in the field of speech enhancement using speech de-noising algorithms reported in the literature, with the main aim of focusing on the use of window shape parameters in the STSA-based speech enhancement process, in which the signal corrupted by noise is divided into frames, each frame is windowed, the windowed speech segments are passed to the speech enhancement algorithm, and the enhanced speech signal is reconstructed in the time domain. In general, speech enhancement methods use the Hamming window for this purpose. In this work, an attempt has been made to study the effect of window shape on the speech. The modified improved thresholding proposed by Asser Ghanbari and Mohammad Reza Karami can be used: it behaves like a hard thresholding function for wavelet coefficients whose absolute value is greater than the threshold value, and like an exponential function for wavelet coefficients whose absolute value is less than the threshold value.
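The framing, windowing, and time-domain reconstruction steps described above can be sketched as follows. This is a minimal illustration, assuming a Hamming window and a normalized overlap-add reconstruction; the enhancement step between analysis and synthesis is left out, and the function names, `frame_len`, and `hop` are hypothetical choices for the example.

```python
import numpy as np

def frame_signal(x, frame_len, hop):
    """Split x into overlapping frames and apply a Hamming window to each."""
    x = np.asarray(x, dtype=float)
    n_frames = 1 + (len(x) - frame_len) // hop
    window = np.hamming(frame_len)
    frames = np.stack([x[i * hop:i * hop + frame_len] * window for i in range(n_frames)])
    return frames, window

def overlap_add(frames, window, hop):
    """Rebuild a time-domain signal from windowed frames.

    The summed frames are divided by the summed windows, so unmodified
    frames reconstruct the covered part of the input exactly.
    """
    frame_len = frames.shape[1]
    out = np.zeros(hop * (len(frames) - 1) + frame_len)
    norm = np.zeros_like(out)
    for i, frame in enumerate(frames):
        out[i * hop:i * hop + frame_len] += frame
        norm[i * hop:i * hop + frame_len] += window
    return out / np.maximum(norm, 1e-12)
```

Swapping `np.hamming` for another window (for example `np.hanning` or `np.blackman`) is the kind of change whose effect the study examines.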


2021 ◽  
Author(s):  
Dayana Ribas ◽  
Antonio Miguel ◽  
Alfonso Ortega ◽  
Eduardo Lleida

This paper proposes a Deep Neural Network (DNN)-based Wiener gain estimator for speech enhancement. The proposal fits within the framework of the classical spectral-domain speech enhancement algorithms. In this case, we used the Optimal Modified Log-Spectral Amplitude (OMLSA), but we consider that this proposal could fit many alternative speech estimation algorithms. We determined the best use of the DNN approach for learning a robust instance of the Wiener gain estimator according to the characteristics of the SNR estimation and the gain function. To design a DNN architecture suited to the speech enhancement task, we study various configuration issues frequently encountered in DNN-based solutions, including speech representations, residual connections, and causal vs. non-causal designs. Thus, we provide conclusions for the use of DNN architectures for enhancement. Experiments show that the proposal provides results in line with the state of the art. Beyond the objective quality measures, examples of noisy vs. enhanced speech are available for listening, to demonstrate in practice the capabilities of the method on real audio.
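For context, the quantity such an estimator targets can be written down with the textbook Wiener gain, G = xi / (1 + xi), where xi is the a priori SNR per time-frequency bin. The sketch below assumes xi is already available as an array; in the paper it would come from the DNN-based estimator within an OMLSA framework, neither of which is modelled here. The gain floor `g_min` is an illustrative assumption.

```python
import numpy as np

def wiener_gain(xi, g_min=0.0):
    """Textbook Wiener gain from the a priori SNR (linear scale, not dB).

    `g_min` is an optional, illustrative floor that limits attenuation.
    """
    xi = np.maximum(np.asarray(xi, dtype=float), 0.0)
    return np.maximum(xi / (1.0 + xi), g_min)

def apply_gain(noisy_spectrum, xi, g_min=0.0):
    """Scale each bin of a noisy spectrum by its Wiener gain."""
    return wiener_gain(xi, g_min) * np.asarray(noisy_spectrum)
```

The gain tends to 1 where speech dominates (large xi) and to 0 where noise dominates, which is why the accuracy of the SNR estimate matters so much.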


2020 ◽  
Vol 20 (14) ◽  
pp. 1389-1402 ◽  
Author(s):  
Maja Zivkovic ◽  
Marko Zlatanovic ◽  
Nevena Zlatanovic ◽  
Mladjan Golubović ◽  
Aleksandar M. Veselinović

In recent years, the Monte Carlo optimization approach has emerged as one of the promising conformation-independent methods in QSAR modeling. Monte Carlo optimization has proven to be a valuable tool in chemoinformatics, and this review presents its application in drug discovery and design. The basic principles and important features of these methods are discussed, as well as the advantages of conformation-independent optimal descriptors developed from the molecular graph and the Simplified Molecular Input Line Entry System (SMILES) notation compared to descriptors commonly used in QSAR modeling. This review summarizes the results obtained from Monte Carlo optimization-based QSAR modeling, with the further addition of molecular docking studies, applied to various pharmacologically important endpoints. Optimal descriptors based on SMILES notation are presented; they are defined as molecular fragments identified as the main contributors to the increase or decrease of biological activity, and are further used to design compounds with targeted activity based on computer calculation. This mini-review also summarizes research papers in which molecular docking was applied as an additional method to design molecules and further validate their activity. These papers show a very good correlation between the results obtained from Monte Carlo optimization modeling and those from molecular docking studies.

