Tree-based context clustering using speech recognition features for acoustic model training of speech synthesis

Author(s):  
Supadaech Chanjaradwichai ◽  
Atiwong Suchato ◽  
Proadpran Punyabukkana

Author(s):  
Dinkar Sitaram ◽  
Haripriya Srinivasaraghavan ◽  
Kapish Agarwal ◽  
Amritanshu Agrawal ◽  
Neha Joshi ◽  
...  

Author(s):  
Piotr Kozierski ◽  
Talar Sadalla ◽  
Szymon Drgas ◽  
Adam Dąbrowski ◽  
Joanna Ziętkiewicz ◽  
...  

Author(s):  
Askars Salimbajevs

Automatic Speech Recognition (ASR) requires large amounts of real user speech data to reach state-of-the-art performance. However, speech data conveys sensitive speaker attributes, such as identity, that can be inferred and exploited for malicious purposes. There is therefore interest in collecting speech data that has been anonymized by a voice conversion method. In this paper, we evaluate one such voice conversion method on Latvian speech data and investigate whether privacy-transformed data can be used to improve ASR acoustic models. Results show that voice conversion is effective against state-of-the-art speaker verification models on Latvian speech and that privacy-transformed data is effective for ASR training.
