Model-based Bayesian direction of arrival analysis for sound sources using a spherical microphone array

2019 ◽  
Vol 146 (6) ◽  
pp. 4936-4946 ◽  
Author(s):  
Christopher R. Landschoot ◽  
Ning Xiang

2018 ◽
Vol 144 (3) ◽  
pp. 1882-1882
Author(s):  
Christopher R. Landschoot ◽  
Jonathan Mathews ◽  
Jonas Braasch ◽  
Ning Xiang

Acta Acustica ◽  
2021 ◽  
Vol 5 ◽  
pp. 18
Author(s):  
Gabriela Dantas Rocha ◽  
Julio Cesar B. Torres ◽  
Mariane Rembold Petraglia ◽  
Michael Vorländer

The generalized cross-correlation with phase transform (GCC-PHAT) algorithm has proved useful for blindly estimating the direction of arrival of compact sound sources from microphone array recordings. In applications with distributed partial sources, such as the tires of vehicles in urban environments, the GCC-PHAT needs to be improved; otherwise, the detected direction alternates between the directions of the main sources or settles on an intermediate value between them. This paper presents an extension of GCC-PHAT, based on post-processing of the output delay matrix and on image processing techniques, that separately identifies the directions of the sound produced by the front and rear tires of moving vehicles. The proposed approach can be extended to identify the tire noise directions of vehicles with multiple axles. The algorithm's performance is analyzed using pass-by measurements of two-axle vehicles acquired by a two-microphone array. The experiments were conducted with passenger vehicles of four distinct models running at different speeds. The experimental results show that the proposed method estimates the vehicle speed with an average error of 10.8 km/h and the vehicle wheelbase with an average error of 26 cm. A possible application is multiple-source characterization for parametric vehicle sound synthesis in auralization.
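The baseline technique this paper extends can be sketched in a few lines. The following is a minimal NumPy implementation of classic GCC-PHAT time-delay estimation between two microphone channels, not the authors' extended post-processing method; the function name and signature are illustrative.

```python
import numpy as np

def gcc_phat(sig, ref, fs=1, max_tau=None, interp=1):
    """Estimate the delay (in seconds) of `sig` relative to `ref` via GCC-PHAT."""
    n = sig.shape[0] + ref.shape[0]
    # cross-power spectrum of the two channels
    R = np.fft.rfft(sig, n=n) * np.conj(np.fft.rfft(ref, n=n))
    R /= np.abs(R) + 1e-15          # phase transform: keep phase, discard magnitude
    cc = np.fft.irfft(R, n=interp * n)
    max_shift = interp * n // 2
    if max_tau is not None:
        max_shift = min(int(interp * fs * max_tau), max_shift)
    # reorder so that lag 0 sits in the middle of the correlation vector
    cc = np.concatenate((cc[-max_shift:], cc[:max_shift + 1]))
    shift = np.argmax(np.abs(cc)) - max_shift
    return shift / float(interp * fs)
```

The phase transform whitens the cross-spectrum, which sharpens the correlation peak for broadband sources; the estimated delay between two microphones then maps to a direction of arrival given the array geometry.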


2020 ◽  
Vol 12 (0) ◽  
pp. 1-8
Author(s):  
Saulius Sakavičius

For the development and evaluation of sound source localization and separation methods, a concise audio dataset with complete geometrical information about the room, the positions of the sound sources, and the microphone array is needed. Computer simulations of such audio and geometrical data often rely on simplifications and are sufficiently accurate only for a specific set of conditions, so it is generally desirable to evaluate algorithms on real-world data. For three-dimensional sound source localization or direction-of-arrival estimation, a non-coplanar microphone array is needed. The simplest and most general type of non-coplanar array is a tetrahedral array, yet there is a lack of openly accessible real-world audio datasets obtained with such arrays. We present an audio dataset for the evaluation of sound source localization algorithms that use tetrahedral microphone arrays. The dataset is complete with the geometrical information of the room and the positions of the sound sources and the microphone array. Array audio data were captured for two tetrahedral microphone arrays with different inter-microphone distances and one or two active sound sources. Since the source signals were speech, the dataset is also suitable for speech recognition as well as direction-of-arrival estimation.


2019 ◽  
Vol 146 ◽  
pp. 295-309 ◽  
Author(s):  
Cui Qing Zhang ◽  
Zhi Ying Gao ◽  
Yong Yan Chen ◽  
Yuan Jun Dai ◽  
Jian Wen Wang ◽  
...  

2013 ◽  
Author(s):  
Alan W. Boyd ◽  
William M. Whitmer ◽  
W. Owen Brimijoin ◽  
Michael A. Akeroyd

Author(s):  
Qiang Yang ◽  
Yuanqing Zheng

Voice interaction is friendly and convenient for users. Smart devices such as the Amazon Echo let users issue voice commands and have become increasingly popular in daily life. In recent years, research has focused on using the microphone arrays built into smart devices to localize the user's position, which adds context to voice commands. In contrast, few works explore the user's head orientation, which also carries useful context. For example, when a user says, "turn on the light," the head orientation can indicate which light the user is referring to. Existing model-based approaches require a large number of microphone arrays forming an array network, while machine-learning-based approaches need laborious data collection and training. The high deployment and usage cost of these methods is unfriendly to users. In this paper, we propose HOE, a model-based system that enables Head Orientation Estimation for smart devices with only two microphone arrays, requiring a lower training overhead than previous approaches. HOE first estimates candidate head orientations by measuring the voice energy radiation pattern, then leverages the voice frequency radiation pattern to obtain the final result. Real-world experiments show that HOE achieves a median estimation error of 23 degrees. To the best of our knowledge, HOE is the first model-based attempt to estimate head orientation with only two microphone arrays and without an arduous data training overhead.
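The first stage described above, deriving orientation candidates from an energy ratio at two arrays, can be illustrated with a toy model. Everything below is an assumption for illustration, not the paper's actual model: we posit a cardioid-like speech radiation pattern and grid-search the head azimuths consistent with the observed A/B energy ratio. Note the result is generally a set of candidates, not a unique angle, which is why a second cue (the frequency radiation pattern in the paper) is needed.

```python
import numpy as np

def orientation_candidates(angle_a, angle_b, ratio_db, tol_db=0.25, n=720):
    """Head azimuths (radians) consistent with the observed energy ratio.

    angle_a, angle_b: directions from the speaker to arrays A and B.
    ratio_db: measured A/B voice energy ratio in dB.
    Assumes a cardioid radiation pattern (illustrative assumption).
    """
    thetas = np.linspace(-np.pi, np.pi, n, endpoint=False)
    # cardioid gain toward each array for a head facing `theta`
    g_a = 0.5 * (1 + np.cos(angle_a - thetas))
    g_b = 0.5 * (1 + np.cos(angle_b - thetas))
    pred_db = 10 * np.log10((g_a + 1e-12) / (g_b + 1e-12))
    return thetas[np.abs(pred_db - ratio_db) < tol_db]
```

Because a single energy ratio can be explained by more than one facing direction, the function returns all grid angles within a tolerance; a second measurement modality is then needed to pick the final estimate.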

