Unsupervised Learning for Monaural Source Separation Using Maximization–Minimization Algorithm with Time–Frequency Deconvolution

Wai Lok Woo; Bin Gao; Ahmed Bouridane; Bingo Wing-Kuen Ling; Cheng Siong Chin

doi:10.3390/s18051371

Sensors ◽

10.3390/s18051371 ◽

2018 ◽

Vol 18 (5) ◽

pp. 1371 ◽

Cited By ~ 5

Author(s):

Wai Lok Woo ◽

Bin Gao ◽

Ahmed Bouridane ◽

Bingo Wing-Kuen Ling ◽

Cheng Siong Chin

Keyword(s):

Unsupervised Learning ◽

Single Channel ◽

Learning Algorithm ◽

Source Separation ◽

Nonnegative Matrix ◽

Least Square ◽

Separation Performance ◽

Time Frequency ◽

Special Cases ◽

Leibler Divergence

This paper presents an unsupervised learning algorithm for sparse nonnegative matrix factor time–frequency deconvolution with optimized fractional β-divergence. The β-divergence is a group of cost functions parametrized by a single parameter β. The Itakura–Saito divergence, Kullback–Leibler divergence and Least Square distance are special cases that correspond to β=0, 1, 2, respectively. This paper presents a generalized algorithm that uses a flexible range of β that includes fractional values. It describes a maximization–minimization (MM) algorithm leading to the development of a fast convergence multiplicative update algorithm with guaranteed convergence. The proposed model operates in the time–frequency domain and decomposes an information-bearing matrix into two-dimensional deconvolution of factor matrices that represent the spectral dictionary and temporal codes. The deconvolution process has been optimized to yield sparse temporal codes through maximizing the likelihood of the observations. The paper also presents a method to estimate the fractional β value. The method is demonstrated on separating audio mixtures recorded from a single channel. The paper shows that the extraction of the spectral dictionary and temporal codes is significantly more efficient by using the proposed algorithm and subsequently leads to better source separation performance. Experimental tests and comparisons with other factorization methods have been conducted to verify its efficacy.
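As a rough illustration of the cost family named in the abstract, the sketch below evaluates the β-divergence between two positive arrays using its standard textbook definition; it is not taken from the paper. The function name `beta_divergence` is hypothetical, and the code assumes strictly positive inputs. With this normalization, β=2 gives half the squared Euclidean distance.

```python
import numpy as np

def beta_divergence(x, y, beta):
    """Sum of element-wise beta-divergences d_beta(x | y) for positive arrays.

    Assumption: the standard definition is used, with the limits
    beta -> 0 (Itakura-Saito) and beta -> 1 (Kullback-Leibler)
    handled as explicit special cases.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    if beta == 0:
        # Itakura-Saito divergence
        r = x / y
        return float(np.sum(r - np.log(r) - 1.0))
    if beta == 1:
        # Generalized Kullback-Leibler divergence
        return float(np.sum(x * np.log(x / y) - x + y))
    # General case, valid for fractional beta as well as beta = 2
    num = x**beta + (beta - 1.0) * y**beta - beta * x * y**(beta - 1.0)
    return float(np.sum(num / (beta * (beta - 1.0))))

if __name__ == "__main__":
    x = np.array([1.0, 2.0])
    y = np.array([2.0, 4.0])
    for b in (0, 0.5, 1, 1.5, 2):
        print(b, beta_divergence(x, y, b))
```

Fractional values such as β=0.5 or β=1.5 go through the general branch, which is the range the abstract says the proposed algorithm is designed to cover.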

Hidden Markov models as priors for regularized nonnegative matrix factorization in single-channel source separation

10.21437/interspeech.2012-433 ◽

2012 ◽

Author(s):

Emad M. Grais ◽

Hakan Erdogan

Keyword(s):

Hidden Markov Models ◽

Matrix Factorization ◽

Nonnegative Matrix Factorization ◽

Markov Models ◽

Single Channel ◽

Hidden Markov ◽

Source Separation ◽

Nonnegative Matrix

Low Latency Convolutive Blind Source Separation

10.26686/wgtn.17136158 ◽

2021 ◽

Author(s):

Jiawen Chua

Keyword(s):

Frequency Domain ◽

Real Time ◽

Impulse Response ◽

Source Separation ◽

Frequency Resolution ◽

Separation Performance ◽

Window Length ◽

Time Frequency ◽

Time Systems ◽

Separation Parameters

In most real-time systems, particularly for applications involving system identification, latency is a critical issue. These applications include, but are not limited to, blind source separation (BSS), beamforming, speech dereverberation, acoustic echo cancellation and channel equalization. The system latency consists of an algorithmic delay and an estimation computational time. The latter can be avoided by using a multi-thread system, which runs the estimation process and the processing procedure simultaneously. The former, which consists of a delay of one window length, is usually unavoidable for frequency-domain approaches. In these approaches, a block of data is acquired by using a window, transformed and processed in the frequency domain, and recovered back to the time domain by using an overlap-add technique. In the frequency domain, the convolutive model, which is usually used to describe the process of a linear time-invariant (LTI) system, can be represented by a series of multiplicative models to facilitate estimation. To implement frequency-domain approaches in real-time applications, the short-time Fourier transform (STFT) is commonly used. The STFT window must be at least twice as long as the room impulse response, which is itself long, for the multiplicative model to be sufficiently accurate. The delay constraint caused by the associated blockwise processing window length makes most frequency-domain approaches inapplicable to real-time systems. This thesis aims to design a BSS system that can be used in a real-time scenario with minimal latency. Existing BSS approaches can be integrated into our system to perform source separation with low delay without affecting the separation performance. The second goal is to design a BSS system that can perform source separation in a non-stationary environment.
We first introduce a subspace approach to directly estimate the separation parameters in the low-frequency-resolution time-frequency (LFRTF) domain. In the LFRTF domain, a shorter window is used to reduce the algorithmic delay of the system during signal acquisition, e.g., the window length is shorter than the room impulse response. The subspace method facilitates the deconvolution of a convolutive mixture into a new instantaneous mixture and simplifies the estimation process. Second, we propose an alternative approach to address the algorithmic latency problem. The alternative method enables us to obtain the separation parameters in the LFRTF domain based on parameters estimated in the high-frequency-resolution time-frequency (HFRTF) domain, where the window length is longer than the room impulse response, without affecting the separation performance. The thesis also provides a solution to the BSS problem in a non-stationary environment. We utilize the "meta-information" obtained from previous BSS operations to facilitate future separation without performing the entire BSS process again. Repeating a BSS process can be computationally expensive. Most conventional BSS algorithms require sufficient signal samples to perform analysis, and this prolongs the estimation delay. By utilizing information from the entire spectrum, our method enables us to update the separation parameters with only a single snapshot of observation data. Hence, our method minimizes the estimation period, reduces redundancy and improves the efficacy of the system. The final contribution of the thesis is a non-iterative method for impulse response shortening. This method allows us to use a shorter representation to approximate the long impulse response. It further improves the computational efficiency of the algorithm and yet achieves satisfactory performance.
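The blockwise frequency-domain pipeline described in this abstract (window, transform, process, overlap-add) can be sketched as follows. This is a minimal illustration, not the thesis's system: the function name `stft_overlap_add`, the periodic Hann window, the 50% overlap and the `process` callback are all assumptions chosen so that an identity `process` reconstructs the input exactly. Each output sample depends on a full window of input, which is the one-window algorithmic delay the abstract refers to.

```python
import numpy as np

def stft_overlap_add(x, win_len=256, process=None):
    """Window -> FFT -> per-frame processing -> inverse FFT -> overlap-add.

    Assumptions: win_len is even, hop is win_len // 2, and a periodic
    Hann analysis window is used, so overlapping windows sum to one.
    """
    if process is None:
        process = lambda spec: spec          # identity: no spectral change
    hop = win_len // 2
    n = np.arange(win_len)
    window = 0.5 - 0.5 * np.cos(2.0 * np.pi * n / win_len)  # periodic Hann

    # Zero-pad so every input sample is covered by exactly two frames.
    n_frames = int(np.ceil(len(x) / hop)) + 1
    padded = np.zeros((n_frames + 1) * hop)
    padded[hop:hop + len(x)] = x

    out = np.zeros_like(padded)
    for k in range(n_frames):
        start = k * hop
        frame = padded[start:start + win_len] * window
        spec = process(np.fft.rfft(frame))   # frequency-domain processing
        out[start:start + win_len] += np.fft.irfft(spec, n=win_len)
    return out[hop:hop + len(x)]

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    x = rng.standard_normal(1000)
    y = stft_overlap_add(x)
    print(np.max(np.abs(x - y)))             # reconstruction error
```

Shortening `win_len` reduces the delay but coarsens the frequency resolution, which is the trade-off the LFRTF and HFRTF domains in the abstract are built around.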

An adaptive time-frequency resolution approach for Non-negative Matrix Factorization based single channel sound source separation

2011 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) ◽

10.1109/icassp.2011.5946388 ◽

2011 ◽

Cited By ~ 7

Author(s):

Serap Kirbiz ◽

Paris Smaragdis

Keyword(s):

Sound Source ◽

Matrix Factorization ◽

Single Channel ◽

Source Separation ◽

Frequency Resolution ◽

Time Frequency ◽

Sound Source Separation ◽

Adaptive Time ◽

Non Negative Matrix Factorization

Nonnegative Matrix Factorization of time frequency representation of vibration signal for local damage detection – comparison of algorithms

E3S Web of Conferences ◽

10.1051/e3sconf/20182900010 ◽

2018 ◽

Vol 29 ◽

pp. 00010

Author(s):

Jacek Wodecki

Keyword(s):

Damage Detection ◽

Matrix Factorization ◽

Nonnegative Matrix Factorization ◽

Single Channel ◽

Classical Method ◽

Nonnegative Matrix ◽

Vibration Signal ◽

Local Damage ◽

Time Frequency ◽

Frequency Representation

Local damage detection in rotating machine elements is a very important problem that is widely researched in the literature. One of the most common approaches is vibration signal analysis. Since time-domain processing is often insufficient, other representations are frequently favored. One of the most common is the time-frequency representation; hence the authors propose to separate internal processes occurring in the vibration signal by spectrogram matrix factorization. To achieve this, the Nonnegative Matrix Factorization (NMF) approach is used. In this paper, three NMF algorithms are tested using real and simulated data describing a single-channel vibration signal acquired on a damaged rolling bearing operating in the drive pulley of a belt conveyor driving station. Results are compared with filtration using Spectral Kurtosis, which is currently recognized as the classical method for impulsive information extraction, to verify the validity of the presented methodology.

An adaptive time-frequency resolution framework for single channel source separation based on non-negative tensor factorization

2013 IEEE International Conference on Acoustics, Speech and Signal Processing ◽

10.1109/icassp.2013.6637780 ◽

2013 ◽

Cited By ~ 1

Author(s):

S. Kirbiz ◽

B. Gunsel

Keyword(s):

Single Channel ◽

Source Separation ◽

Frequency Resolution ◽

Tensor Factorization ◽

Time Frequency ◽

Adaptive Time

A Generalized Divergence Measure for Nonnegative Matrix Factorization

Neural Computation ◽

10.1162/neco.2007.19.3.780 ◽

2007 ◽

Vol 19 (3) ◽

pp. 780-791 ◽

Cited By ~ 92

Author(s):

Raul Kompass

Keyword(s):

Matrix Factorization ◽

Nonnegative Matrix Factorization ◽

Nonnegative Matrix ◽

Auxiliary Function ◽

Divergence Measure ◽

Factorization Problem ◽

Special Cases ◽

Leibler Divergence ◽

Update Rules ◽

Quadratic Distance

This letter presents a general parametric divergence measure. The metric includes as special cases quadratic error and Kullback-Leibler divergence. A parametric generalization of the two different multiplicative update rules for nonnegative matrix factorization by Lee and Seung (2001) is shown to lead to locally optimal solutions of the nonnegative matrix factorization problem with this new cost function. Numeric simulations demonstrate that the new update rule may improve the quadratic distance convergence speed. A proof of convergence is given that, as in Lee and Seung, uses an auxiliary function known from the expectation-maximization theoretical framework.
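For orientation, the two multiplicative update rules of Lee and Seung that this letter generalizes can be sketched as below. This shows only the two special cases (quadratic error and Kullback-Leibler divergence), not the parametric rule proposed in the letter. The function name `nmf_multiplicative`, the random initialization and the small constant `eps` guarding against division by zero are assumptions made for the illustration.

```python
import numpy as np

def nmf_multiplicative(V, rank, n_iter=200, cost="euclidean", seed=0, eps=1e-12):
    """Factor a nonnegative matrix V (n x m) as W @ H with W, H >= 0.

    cost="euclidean" uses the quadratic-error updates; any other value
    uses the Kullback-Leibler updates. eps is an illustrative safeguard.
    """
    rng = np.random.default_rng(seed)
    n, m = V.shape
    W = rng.random((n, rank)) + eps
    H = rng.random((rank, m)) + eps
    for _ in range(n_iter):
        if cost == "euclidean":
            H *= (W.T @ V) / (W.T @ W @ H + eps)
            W *= (V @ H.T) / (W @ H @ H.T + eps)
        else:
            WH = W @ H + eps
            H *= (W.T @ (V / WH)) / (W.sum(axis=0)[:, None] + eps)
            WH = W @ H + eps
            W *= ((V / WH) @ H.T) / (H.sum(axis=1)[None, :] + eps)
    return W, H

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    V = rng.random((8, 2)) @ rng.random((2, 10))   # exactly rank-2 data
    for cost in ("euclidean", "kl"):
        W, H = nmf_multiplicative(V, 2, cost=cost)
        print(cost, np.linalg.norm(V - W @ H))
```

Because each update multiplies the current factors by a nonnegative ratio, W and H stay nonnegative without any explicit projection step.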

Unsupervised Single Channel Source Separation with Nonnegative Matrix Factorization

The 7th International Conference on Information Technology ◽

10.15849/icit.2015.0073 ◽

2015 ◽

Author(s):

A.M. Darsono ◽

Shakir Saat ◽

N.M.Z. Hashim ◽

A.A.M. Isa

Keyword(s):

Matrix Factorization ◽

Nonnegative Matrix Factorization ◽

Single Channel ◽

Source Separation ◽

Nonnegative Matrix

ICA-Based Single Channel Source Separation With Time-Frequency Decomposition

2020 IEEE 7th International Workshop on Metrology for AeroSpace (MetroAeroSpace) ◽

10.1109/metroaerospace48742.2020.9160264 ◽

2020 ◽

Author(s):

Dariusz Mika ◽

Grzegorz Budzik ◽

Jerzy Jozwik

Keyword(s):

Single Channel ◽

Source Separation ◽

Time Frequency ◽

Frequency Decomposition

Variational regularized two-dimensional nonnegative matrix factorization with the flexible β-divergence for single channel source separation

2nd IET International Conference on Intelligent Signal Processing 2015 (ISP) ◽

10.1049/cp.2015.1788 ◽

2015 ◽

Author(s):

Kaiwen Yu ◽

W.L. Woo ◽

S.S. Dlay

Keyword(s):

Matrix Factorization ◽

Nonnegative Matrix Factorization ◽

Single Channel ◽

Source Separation ◽

Nonnegative Matrix ◽

Two Dimensional

Pattern Expression Nonnegative Matrix Factorization: Algorithm and Applications to Blind Source Separation

Computational Intelligence and Neuroscience ◽

10.1155/2008/168769 ◽

2008 ◽

Vol 2008 ◽

pp. 1-10 ◽

Cited By ~ 12

Author(s):

Junying Zhang ◽

Le Wei ◽

Xuerong Feng ◽

Zhen Ma ◽

Yue Wang

Keyword(s):

Blind Source Separation ◽

Matrix Factorization ◽

Nonnegative Matrix Factorization ◽

Learning Algorithm ◽

Source Separation ◽

Nonnegative Matrix ◽

Heterogeneity Correction ◽

Two Parameters ◽

Common Situation ◽

Basis Vectors

Independent component analysis (ICA) is a widely applicable and effective approach to blind source separation (BSS), with the limitation that the sources must be statistically independent. However, a more common situation is blind source separation for the nonnegative linear model (NNLM), where the observations are nonnegative linear combinations of nonnegative sources and the sources may be statistically dependent. We propose a pattern expression nonnegative matrix factorization (PE-NMF) approach from the viewpoint of using basis vectors most effectively to express patterns. In PE-NMF, two regularization or penalty terms are added to the original loss function of standard nonnegative matrix factorization (NMF) so that patterns are expressed effectively with basis vectors. A learning algorithm is presented, and its convergence is proved theoretically. Three illustrative examples of blind source separation, including heterogeneity correction for gene microarray data, indicate that the sources can be successfully recovered with the proposed PE-NMF when the two parameters are suitably chosen from prior knowledge of the problem.
