Acoustic Word Embeddings for End-to-End Speech Synthesis

Feiyu Shen; Chenpeng Du; Kai Yu

doi:10.3390/app11199010


Applied Sciences ◽

10.3390/app11199010 ◽

2021 ◽

Vol 11 (19) ◽

pp. 9010

Author(s):

Feiyu Shen ◽

Chenpeng Du ◽

Kai Yu

Keyword(s):

Quality Improvement ◽

Speech Synthesis ◽

Word Embedding ◽

Word Embeddings ◽

Linguistic Information ◽

Subjective Evaluations ◽

System Input ◽

End To End ◽

Prosody Prediction

Most recent end-to-end speech synthesis systems use phonemes as acoustic input tokens and ignore which word each phoneme comes from. However, many words have their own specific prosody, which can significantly affect naturalness. Prior work has used pre-trained linguistic word embeddings as TTS system input. However, because linguistic information is not directly relevant to how words are pronounced, the TTS quality improvement of these systems is mild. In this paper, we propose a novel and effective way of jointly training acoustic phone and word embeddings for end-to-end TTS systems. Experiments on the LJSpeech dataset show that the acoustic word embeddings dramatically decrease both the training and validation loss in phone-level prosody prediction. Subjective evaluations of naturalness demonstrate that incorporating acoustic word embeddings significantly outperforms both the pure phone-based system and the TTS system with pre-trained linguistic word embeddings.
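The idea of pairing each phone with the word it belongs to can be illustrated with a small sketch. The abstract does not describe the model architecture, so everything here is an assumption made for illustration: the toy vocabularies, the embedding size, and the choice to add the word vector to each phone vector. In a real system both tables would be trainable parameters updated jointly by the TTS loss; here they are random lookups only.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy vocabularies and embedding size (not from the paper).
PHONES = {"HH": 0, "AH": 1, "L": 2, "OW": 3, "W": 4, "ER": 5, "D": 6}
WORDS = {"hello": 0, "world": 1}
DIM = 8

# Stand-ins for jointly trained embedding tables (random here, no training).
phone_table = rng.normal(size=(len(PHONES), DIM))
word_table = rng.normal(size=(len(WORDS), DIM))


def encoder_input(utterance):
    """Build a (num_phones, DIM) array from a list of (word, [phones]) pairs.

    Each phone embedding is summed with the embedding of its parent word,
    so identical phones in different words get different input vectors.
    Summation is an assumed combination rule, chosen for simplicity.
    """
    rows = []
    for word, phones in utterance:
        word_vec = word_table[WORDS[word]]
        for phone in phones:
            rows.append(phone_table[PHONES[phone]] + word_vec)
    return np.stack(rows)


utt = [("hello", ["HH", "AH", "L", "OW"]), ("world", ["W", "ER", "L", "D"])]
x = encoder_input(utt)
print(x.shape)
```

In this sketch the phone "L" appears in both words, and its two input vectors differ exactly by the difference of the two word embeddings, which is the word-identity information a purely phone-based input would discard.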


Subset Selection, Adaptation, Gemination and Prosody Prediction for Amharic Text-to-Speech Synthesis

10.21437/ssw.2019-37 ◽

2019 ◽

Author(s):

Elshadai Tesfaye Biru ◽

Yishak Tofik Mohammed ◽

David Tofu ◽

Erica Cooper ◽

Julia Hirschberg

Keyword(s):

Speech Synthesis ◽

Subset Selection ◽

Text To Speech ◽

Text To Speech Synthesis ◽

Prosody Prediction


Incorporating LDA With Word Embedding for Web Service Clustering

International Journal of Web Services Research ◽

10.4018/ijwsr.2018100102 ◽

2018 ◽

Vol 15 (4) ◽

pp. 29-44 ◽

Cited By ~ 4

Author(s):

Yi Zhao ◽

Chong Wang ◽

Jian Wang ◽

Keqing He

Keyword(s):

Web Service ◽

Service Discovery ◽

Word Embedding ◽

The Internet ◽

Word Embeddings ◽

Training Process ◽

Web Service Discovery ◽

Processing Data ◽

Clustering Approach ◽

Service Clustering

With the rapid growth of web services on the internet, web service discovery has become a hot topic in services computing. Faced with heterogeneous and unstructured service descriptions, many service clustering approaches have been proposed to promote web service discovery, and many other approaches have leveraged auxiliary features to enhance the classical LDA model and achieve better clustering performance. However, these extended LDA approaches still have limitations in handling data sparsity and noise words. This article proposes a novel web service clustering approach that incorporates LDA with word embedding, leveraging relevant words obtained through word embedding to improve the performance of web service clustering. Specifically, words semantically relevant to service keywords, obtained with Word2vec, were used to train the word embeddings and then incorporated into the LDA training process. Finally, experiments conducted on a real-world dataset published on ProgrammableWeb show that the authors' proposed approach can achieve better clustering performance than several classical approaches.


Developing new approaches for software design quality improvement based on subjective evaluations

Proceedings. 26th International Conference on Software Engineering ◽

10.1109/icse.2004.1317418 ◽

2004 ◽

Cited By ~ 1

Author(s):

M.V. Mantyla

Keyword(s):

Quality Improvement ◽

Software Design ◽

Design Quality ◽

New Approaches ◽

Subjective Evaluations


The lexical context in a style analysis: A word embeddings approach

Corpus Linguistics and Linguistic Theory ◽

10.1515/cllt-2018-0003 ◽

2018 ◽

Vol 0 (0) ◽

Cited By ~ 1

Author(s):

Miroslav Kubát ◽

Jan Hůla ◽

Xinying Chen ◽

Radek Čech ◽

Jiří Milička

Keyword(s):

Pilot Study ◽

Independent Method ◽

Word Embedding ◽

Style Analysis ◽

Word Embeddings ◽

Context Specificity ◽

Non Fiction ◽

Context Similarity ◽

Corpus Size ◽

Specificity Measure

This is a pilot study of the usability of the Context Specificity measure for stylometric purposes. Specifically, the Word2vec word embedding approach, based on measuring lexical context similarity between lemmas, is applied to the analysis of texts that belong to different styles. Three types of Czech texts are investigated: fiction, non-fiction, and journalism. In total, forty lemmas were observed (10 lemmas each for verbs, nouns, adjectives, and adverbs). The aim of the present study is to introduce the concept of Context Specificity and to test whether this measure is sensitive to different styles. The results show that the proposed method, Closest Context Specificity (CCS), is a corpus-size-independent method with promising potential for analyzing different styles.


Specializing Word Embeddings (for Parsing) by Information Bottleneck (Extended Abstract)

Proceedings of the Twenty-Ninth International Joint Conference on Artificial Intelligence ◽

10.24963/ijcai.2020/658 ◽

2020 ◽

Author(s):

Xiang Lisa Li ◽

Jason Eisner

Keyword(s):

Dimensionality Reduction ◽

Semantic Information ◽

State Of The Art ◽

Word Embedding ◽

Discrete Version ◽

Word Embeddings ◽

Continuous Version ◽

Continuous Vector ◽

Information Bottleneck ◽

Art Performance

Pre-trained word embeddings like ELMo and BERT contain rich syntactic and semantic information, resulting in state-of-the-art performance on various tasks. We propose a very fast variational information bottleneck (VIB) method to nonlinearly compress these embeddings, keeping only the information that helps a discriminative parser. We compress each word embedding to either a discrete tag or a continuous vector. In the discrete version, our automatically compressed tags form an alternative tag set: we show experimentally that our tags capture most of the information in traditional POS tag annotations, but our tag sequences can be parsed more accurately at the same level of tag granularity. In the continuous version, we show experimentally that moderately compressing the word embeddings by our method yields a more accurate parser in 8 of 9 languages, unlike simple dimensionality reduction.


Deep Learning End to End Speech Synthesis: A Review

2021 2nd International Conference on Secure Cyber Computing and Communications (ICSCCC) ◽

10.1109/icsccc51823.2021.9478125 ◽

2021 ◽

Author(s):

Owais Nazir ◽

Aruna Malik

Keyword(s):

Deep Learning ◽

Speech Synthesis ◽

End To End


End-to-End Korean Speech Synthesis System Using Reformer Network

The Journal of Korean Institute of Communications and Information Sciences ◽

10.7840/kics.2021.46.2.217 ◽

2021 ◽

Vol 46 (2) ◽

pp. 217-224

Author(s):

Hyeong Rae Ihm ◽

Sung Jun Cheon ◽

Byoung Jin Choi ◽

Min Chan Kim ◽

Nam Soo Kim

Keyword(s):

Speech Synthesis ◽

Synthesis System ◽

End To End


Convolution–deconvolution word embedding: An end-to-end multi-prototype fusion embedding method for natural language processing

Information Fusion ◽

10.1016/j.inffus.2019.06.009 ◽

2020 ◽

Vol 53 ◽

pp. 112-122 ◽

Cited By ~ 9

Author(s):

Kai Shuang ◽

Zhixuan Zhang ◽

Jonathan Loo ◽

Sen Su

Keyword(s):

Natural Language Processing ◽

Natural Language ◽

Language Processing ◽

Word Embedding ◽

Embedding Method ◽

End To End


Word Embeddings as Metric Recovery in Semantic Spaces

Transactions of the Association for Computational Linguistics ◽

10.1162/tacl_a_00098 ◽

2016 ◽

Vol 4 ◽

pp. 273-286 ◽

Cited By ~ 15

Author(s):

Tatsunori B. Hashimoto ◽

David Alvarez-Melis ◽

Tommi S. Jaakkola

Keyword(s):

Random Walks ◽

Manifold Learning ◽

State Of The Art ◽

Inductive Reasoning ◽

Semantic Space ◽

Word Embedding ◽

Word Embeddings ◽

Recovery Algorithm ◽

Series Completion ◽

Semantic Spaces

Continuous word representations have been remarkably useful across NLP tasks but remain poorly understood. We ground word embeddings in semantic spaces studied in the cognitive-psychometric literature, taking these spaces as the primary objects to recover. To this end, we relate log co-occurrences of words in large corpora to semantic similarity assessments and show that co-occurrences are indeed consistent with a Euclidean semantic space hypothesis. Framing word embedding as metric recovery of a semantic space unifies existing word embedding algorithms, ties them to manifold learning, and demonstrates that existing algorithms are consistent metric recovery methods given co-occurrence counts from random walks. Furthermore, we propose a simple, principled, direct metric recovery algorithm that performs on par with state-of-the-art word embedding and manifold learning methods. Finally, we complement the recent focus on analogies by constructing two new inductive reasoning datasets (series completion and classification) and demonstrate that word embeddings can be used to solve them as well.


A Method of Subtopic Classification of Search Engine Suggests by Integrating a Topic Model and Word Embeddings

International Journal of Software Innovation ◽

10.4018/ijsi.2018070105 ◽

2018 ◽

Vol 6 (3) ◽

pp. 67-78

Author(s):

Tian Nie ◽

Yi Ding ◽

Chen Zhao ◽

Youchao Lin ◽

Takehito Utsuro

Keyword(s):

Search Engine ◽

Information Needs ◽

Web Search ◽

Topic Model ◽

Japanese Version ◽

Word Embedding ◽

Coarse Grained ◽

Web Pages ◽

Word Embeddings

This article addresses the problem of how to get an overview of the knowledge related to a given query keyword. In particular, the authors focus on the concerns of those who search for web pages with a given query keyword. The Web search information needs for a given query keyword are collected through search engine suggests. Given a query keyword, the authors collect up to around 1,000 suggests, many of which are redundant. They classify redundant search engine suggests based on a topic model. However, one limitation of topic-model-based classification of search engine suggests is that the granularity of the topics, i.e., the clusters of search engine suggests, is too coarse. To overcome this coarse-grained classification, the article further applies the word embedding technique to the web pages used during the training of the topic model, in addition to the text data of the whole Japanese version of Wikipedia. Then, the authors examine the word-embedding-based similarity between search engine suggests and further classify search engine suggests within a single topic into finer-grained subtopics based on the similarity of word embeddings. Evaluation results show that the proposed approach performs well in the task of subtopic classification of search engine suggests.
