Improving speaker verification performance against long-term speaker variability

2016 ◽ Vol 79 ◽ pp. 14-29
Author(s):  
Linlin Wang ◽  
Jun Wang ◽  
Lantian Li ◽  
Thomas Fang Zheng ◽  
Frank K. Soong
2017 ◽ Vol 17 (4) ◽ pp. 114-133
Author(s):  
Atanas Ouzounov

Abstract: This paper proposes a new contour-based speech endpoint detector that combines the log-Group Delay Mean-Delta (log-GDMD) feature, an adaptive two-threshold scheme and an eight-state automaton. The adaptive scheme uses two pairs of thresholds, one for the starting point and one for the ending point. Each pair is calculated from the contour characteristics in the corresponding region of the utterance. Experimental results show that the proposed detector achieves better endpoint accuracy than the Long-Term Spectral Divergence (LTSD) detector. Additional fixed-text speaker verification tests with short phrases of telephone speech, based on the Dynamic Time Warping (DTW) and left-to-right Hidden Markov Model (HMM) frameworks, confirm that the improved endpoint accuracy translates into a higher verification rate.
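The two-threshold idea can be sketched as follows. This is a minimal illustration, not the paper's implementation: the log-GDMD contour computation and the eight-state automaton are out of scope, so a generic per-frame contour and simple forward/backward scans stand in for them, and the function name and the `k_low`/`k_high` constants are assumptions.

```python
import numpy as np

def detect_endpoints(contour, region=10, k_low=1.0, k_high=2.5):
    """Locate speech start/end in a per-frame contour (e.g. log-GDMD
    or log-energy) using a pair of adaptive thresholds per endpoint.

    Each threshold pair is derived from the contour statistics of the
    presumed non-speech region nearest that endpoint (the first/last
    `region` frames), mirroring the per-region threshold pairs in the
    abstract. Returns (start, end) frame indices, or None if no frame
    ever exceeds the high threshold.
    """
    c = np.asarray(contour, dtype=float)

    def thresholds(noise_frames):
        mu, sd = noise_frames.mean(), noise_frames.std()
        return mu + k_low * sd, mu + k_high * sd   # (low, high) pair

    lo_s, hi_s = thresholds(c[:region])            # pair for the start point
    lo_e, hi_e = thresholds(c[-region:])           # pair for the end point

    # Start: first frame above the high threshold, then back off while
    # the contour still exceeds the low threshold (captures weak onsets).
    above = np.flatnonzero(c > hi_s)
    if above.size == 0:
        return None                                # no speech detected
    start = above[0]
    while start > 0 and c[start - 1] > lo_s:
        start -= 1

    # End: mirror of the start logic, scanning from the tail.
    tail = np.flatnonzero(c > hi_e)
    if tail.size == 0:
        return None
    end = tail[-1]
    while end < len(c) - 1 and c[end + 1] > lo_e:
        end += 1

    return start, end
```

The low threshold only extends a region that the high threshold has already confirmed, which is what makes the scheme more robust to weak speech onsets and offsets than a single fixed threshold.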


2014 ◽  
Author(s):  
Finnian Kelly ◽  
Rahim Saeidi ◽  
Naomi Harte ◽  
David A. van Leeuwen

2009 ◽  
Author(s):  
A. D. Lawson ◽  
A. R. Stauffer ◽  
B. Y. Smolenski ◽  
B. B. Pokines ◽  
M. Leonard ◽  
...  

2006 ◽  
Author(s):  
Claudio Garreton ◽  
Nestor Becerra Yoma ◽  
Carlos Molina ◽  
Fernando Huenupan

Author(s):  
Tuan Pham ◽  
Michael Wagner

Most speaker verification systems are based on similarity or likelihood normalization techniques, as these help to cope with speaker variability. In conventional normalization, the a priori probabilities of the cohort speakers are assumed to be equal. From this standpoint, we apply the fuzzy integral and genetic algorithms to combine the likelihood values of the cohort speakers, relaxing the assumption of equal a priori probabilities. This approach replaces the conventional normalization term with the fuzzy integral, which acts as a non-linear fusion of the similarity measures of an utterance against the cohort speakers. Furthermore, genetic algorithms are applied to find the optimal fuzzy densities, which are crucial for the fuzzy fusion. We illustrate the performance of the proposed approach by testing the speaker verification system with both the conventional and the proposed algorithms on the commercial speech corpus TI46. The results, in terms of equal error rates, show that the speaker verification system using the fuzzy integral outperforms the conventional normalization method.
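The fusion step described above can be sketched with a fuzzy integral over the cohort scores. The abstract does not specify which fuzzy integral is used, so the Choquet form with a Sugeno λ-fuzzy measure is an assumption here, as are all function names; the genetic-algorithm search is out of scope, so the fuzzy densities are plain inputs rather than tuned parameters.

```python
import math

def sugeno_lambda(densities, iters=200):
    """Solve prod(1 + lam*g_i) = 1 + lam for the Sugeno λ-measure
    parameter, given at least two densities, each in (0, 1).
    λ = 0 (an additive measure) when the densities sum to one.
    """
    s = sum(densities)
    if abs(s - 1.0) < 1e-12:
        return 0.0
    f = lambda lam: math.prod(1 + lam * g for g in densities) - (1 + lam)
    if s < 1.0:                      # sub-additive densities -> λ > 0
        lo, hi = 1e-12, 1.0
        while f(hi) < 0:             # grow bracket until the sign flips
            hi *= 2.0
    else:                            # super-additive densities -> -1 < λ < 0
        lo, hi = -1.0 + 1e-12, -1e-12
    for _ in range(iters):           # plain bisection on the sign change
        mid = 0.5 * (lo + hi)
        if f(lo) * f(mid) <= 0:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)

def choquet(scores, densities):
    """Choquet integral of cohort scores w.r.t. the Sugeno λ-measure:
    sort scores descending and weight each by the measure increment of
    the growing coalition of cohort speakers."""
    lam = sugeno_lambda(densities)
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    total, g_prev = 0.0, 0.0
    for i in order:
        g_cur = densities[i] + g_prev + lam * densities[i] * g_prev
        total += scores[i] * (g_cur - g_prev)
        g_prev = g_cur
    return total

def normalized_score(claimant_loglik, cohort_logliks, densities):
    # The conventional normalization term (e.g. the mean cohort score)
    # is replaced by the non-linear Choquet fusion; in the paper the
    # densities would be found offline by a genetic algorithm.
    return claimant_loglik - choquet(cohort_logliks, densities)
```

When the densities sum to one, λ = 0 and the Choquet integral reduces to a weighted average of the cohort scores, i.e. the conventional normalization with unequal priors; unequal density sums are what introduce the non-linear interaction between cohort speakers.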

