Audio source separation with time-frequency velocities

Benchmarking flexible adaptive time-frequency transforms for underdetermined audio source separation

2009 IEEE International Conference on Acoustics, Speech and Signal Processing ◽

10.1109/icassp.2009.4959514 ◽

2009 ◽

Cited By ~ 3

Author(s):

Andrew Nesbit ◽

Emmanuel Vincent ◽

Mark D. Plumbley

Keyword(s):

Source Separation ◽

Time Frequency ◽

Audio Source Separation ◽

Adaptive Time

Download Full-text

Sensing ecosystem dynamics via audio source separation: A case study of marine soundscapes off northeastern Taiwan

PLoS Computational Biology ◽

10.1371/journal.pcbi.1008698 ◽

2021 ◽

Vol 17 (2) ◽

pp. e1008698

Author(s):

Tzu-Hao Lin ◽

Tomonari Akamatsu ◽

Yu Tsao

Keyword(s):

Conservation Management ◽

Marine Ecosystem ◽

Source Separation ◽

Ecosystem Dynamics ◽

Marine Animals ◽

Time Frequency ◽

Audio Source Separation ◽

Diversity Assessment ◽

Separation Model

Remote acquisition of information on ecosystem dynamics is essential for conservation management, especially for the deep ocean. Soundscape offers unique opportunities to study the behavior of soniferous marine animals and their interactions with various noise-generating activities at a fine temporal resolution. However, the retrieval of soundscape information remains challenging owing to limitations in audio analysis techniques that are effective in the face of highly variable interfering sources. This study investigated the application of a seafloor acoustic observatory as a long-term platform for observing marine ecosystem dynamics through audio source separation. A source separation model based on the assumption of source-specific periodicity was used to factorize time-frequency representations of long-duration underwater recordings. With minimal supervision, the model learned to discriminate source-specific spectral features and prove to be effective in the separation of sounds made by cetaceans, soniferous fish, and abiotic sources from the deep-water soundscapes off northeastern Taiwan. Results revealed phenological differences among the sound sources and identified diurnal and seasonal interactions between cetaceans and soniferous fish. The application of clustering to source separation results generated a database featuring the diversity of soundscapes and revealed a compositional shift in clusters of cetacean vocalizations and fish choruses during diurnal and seasonal cycles. The source separation model enables the transformation of single-channel audio into multiple channels encoding the dynamics of biophony, geophony, and anthropophony, which are essential for characterizing the community of soniferous animals, quality of acoustic habitat, and their interactions. Our results demonstrated the application of source separation could facilitate acoustic diversity assessment, which is a crucial task in soundscape-based ecosystem monitoring. Future implementation of soundscape information retrieval in long-term marine observation networks will lead to the use of soundscapes as a new tool for conservation management in an increasingly noisy ocean.

Download Full-text

Stereo audio source separation based on time–frequency masking and multilevel thresholding

Digital Signal Processing ◽

10.1016/j.dsp.2008.06.004 ◽

2008 ◽

Vol 18 (6) ◽

pp. 960-976 ◽

Cited By ~ 12

Author(s):

Maximo Cobos ◽

José J. López

Keyword(s):

Source Separation ◽

Multilevel Thresholding ◽

Time Frequency ◽

Audio Source Separation

Download Full-text

Audio source separation with multiple microphones on time-frequency representations

10.1117/12.2018632 ◽

2013 ◽

Author(s):

Hiroshi Sawada

Keyword(s):

Source Separation ◽

Time Frequency ◽

Audio Source Separation

Download Full-text

Blind Criterion and Oracle Bound for Instantaneous Audio Source Separation using Adaptive Time-Frequency Representations

2007 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics ◽

10.1109/aspaa.2007.4392974 ◽

2007 ◽

Cited By ~ 4

Author(s):

Emmanuel Vincent ◽

Remi Gribonval

Keyword(s):

Source Separation ◽

Time Frequency ◽

Audio Source Separation ◽

Adaptive Time

Download Full-text

Under-Determined Reverberant Audio Source Separation Using Local Observed Covariance and Auditory-Motivated Time-Frequency Representation

Latent Variable Analysis and Signal Separation - Lecture Notes in Computer Science ◽

10.1007/978-3-642-15995-4_10 ◽

2010 ◽

pp. 73-80 ◽

Cited By ~ 11

Author(s):

Ngoc Q. K. Duong ◽

Emmanuel Vincent ◽

Rémi Gribonval

Keyword(s):

Source Separation ◽

Time Frequency ◽

Audio Source Separation ◽

Frequency Representation

Download Full-text

Maximum likelihood approach for blind audio source separation using time-frequency Gaussian source models

IEEE Workshop on Applications of Signal Processing to Audio and Acoustics, 2005. ◽

10.1109/aspaa.2005.1540173 ◽

2006 ◽

Cited By ~ 25

Author(s):

C. Fevotte ◽

J. Cardoso

Keyword(s):

Maximum Likelihood ◽

Source Separation ◽

Time Frequency ◽

Maximum Likelihood Approach ◽

Audio Source Separation ◽

Gaussian Source ◽

Source Models ◽

Likelihood Approach

Download Full-text

Robust underdetermined blind audio source separation of sparse signals in the time-frequency domain

2011 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) ◽

10.1109/icassp.2011.5947158 ◽

2011 ◽

Cited By ~ 3

Author(s):

Si Mohamed Aziz Sbai ◽

Abdeldjalil Aissa-El-Bey ◽

Dominique Pastor

Keyword(s):

Frequency Domain ◽

Source Separation ◽

Sparse Signals ◽

Time Frequency ◽

Audio Source Separation

Download Full-text

Integrating Dilated Convolution into DenseLSTM for Audio Source Separation

Applied Sciences ◽

10.3390/app11020789 ◽

2021 ◽

Vol 11 (2) ◽

pp. 789

Author(s):

Woon-Haeng Heo ◽

Hyemi Kim ◽

Oh-Wook Kwon

Keyword(s):

Deep Learning ◽

Speech Signal ◽

Source Separation ◽

Series Data ◽

Separation Performance ◽

Time Frequency ◽

Dilated Convolution ◽

Audio Source Separation ◽

Music Signal ◽

Learning Architectures

Herein, we proposed a multi-scale multi-band dilated time-frequency densely connected convolutional network (DenseNet) with long short-term memory (LSTM) for audio source separation. Because the spectrogram of the acoustic signal can be thought of as images as well as time series data, it is suitable for convolutional recurrent neural network (CRNN) architecture. We improved the audio source separation performance by applying the dilated block with a dilated convolution to CRNN architecture. The dilated block has the role of effectively increasing the receptive field in the spectrogram. In addition, it was designed in consideration of the acoustic characteristics that the frequency axis and the time axis in the spectrogram are changed by independent influences such as speech rate and pitch. In speech enhancement experiments, we estimated the speech signal using various deep learning architectures from a signal in which the music, noise, and speech were mixed. We conducted the subjective evaluation on the estimated speech signal. In addition, speech quality, intelligibility, separation, and speech recognition performance were also measured. In music signal separation, we estimated the music signal using several deep learning architectures from the mixture of the music and speech signal. After that, the separation performance and music identification accuracy were measured using the estimated music signal. Overall, the proposed architecture shows the best performance compared to other deep learning architectures not only in speech experiments but also in music experiments.

Download Full-text

Multichannel audio source separation: Variational inference of time-frequency sources from time-domain observations

2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) ◽

10.1109/icassp.2017.7951791 ◽

2017 ◽

Cited By ~ 5

Author(s):

Simon Leglaive ◽

Roland Badeau ◽

Gael Richard

Keyword(s):

Time Domain ◽

Source Separation ◽

Variational Inference ◽

Time Frequency ◽

Multichannel Audio ◽

Audio Source Separation

Download Full-text