scholarly journals MAKEDONKA: Applied Deep Learning Model for Text-to-Speech Synthesis in Macedonian Language

2020 ◽  
Vol 10 (19) ◽  
pp. 6882
Author(s):  
Kostadin Mishev ◽  
Aleksandra Karovska Ristovska ◽  
Dimitar Trajanov ◽  
Tome Eftimov ◽  
Monika Simjanoska

This paper presents MAKEDONKA, the first open-source Macedonian language synthesizer that is based on the Deep Learning approach. The paper provides an overview of the numerous attempts to achieve a human-like reproducible speech, which has unfortunately shown to be unsuccessful due to the work invisibility and lack of integration examples with real software tools. The recent advances in Machine Learning, the Deep Learning-based methodologies, provide novel methods for feature engineering that allow for smooth transitions in the synthesized speech, making it sound natural and human-like. This paper presents a methodology for end-to-end speech synthesis that is based on a fully-convolutional sequence-to-sequence acoustic model with a position-augmented attention mechanism—Deep Voice 3. Our model directly synthesizes Macedonian speech from characters. We created a dataset that contains approximately 20 h of speech from a native Macedonian female speaker, and we use it to train the text-to-speech (TTS) model. The achieved MOS score of 3.93 makes our model appropriate for application in any kind of software that needs text-to-speech service in the Macedonian language. Our TTS platform is publicly available for use and ready for integration.

Author(s):  
Beiming Cao ◽  
Myungjong Kim ◽  
Jan van Santen ◽  
Ted Mau ◽  
Jun Wang

Author(s):  
Yogita Hande ◽  
Akkalashmi Muddana

Presently, the advances of the internet towards a wide-spread growth and the static nature of traditional networks has limited capacity to cope with organizational business needs. The new network architecture software defined networking (SDN) appeared to address these challenges and provides distinctive features. However, these programmable and centralized approaches of SDN face new security challenges which demand innovative security mechanisms like intrusion detection systems (IDS's). The IDS of SDN are designed currently with a machine learning approach; however, a deep learning approach is also being explored to achieve better efficiency and accuracy. In this article, an overview of the SDN with its security concern and IDS as a security solution is explained. A survey of existing security solutions designed to secure the SDN, and a comparative study of various IDS approaches based on a deep learning model and machine learning methods are discussed in the article. Finally, we describe future directions for SDN security.


Symmetry ◽  
2021 ◽  
Vol 13 (5) ◽  
pp. 819
Author(s):  
Alakbar Valizada ◽  
Sevil Jafarova ◽  
Emin Sultanov ◽  
Samir Rustamov

This study concentrates on the investigation, development, and evaluation of Text-to-Speech Synthesis systems based on Deep Learning models for the Azerbaijani Language. We have selected and compared state-of-the-art models-Tacotron and Deep Convolutional Text-to-Speech (DC TTS) systems to achieve the most optimal model. Both systems were trained on the 24 h speech dataset of the Azerbaijani language collected and processed from the news website. To analyze the quality and intelligibility of the speech signals produced by two systems, 34 listeners participated in an online survey containing subjective evaluation tests. The results of the study indicated that according to the Mean Opinion Score, Tacotron demonstrated better results for the In-Vocabulary words; however, DC TTS indicated a higher performance of the Out-Of-Vocabulary words synthesis.


IEEE Access ◽  
2018 ◽  
Vol 6 ◽  
pp. 60478-60488 ◽  
Author(s):  
Yi Zhao ◽  
Shinji Takaki ◽  
Hieu-Thi Luong ◽  
Junichi Yamagishi ◽  
Daisuke Saito ◽  
...  

Author(s):  
Katrin Donetski

The rapid infiltration of fake news is a flaw to the otherwise valuable internet, a virtually global network that allows for the simultaneous exchange of information. While a common, and normally effective, approach to such classification tasks is designing a deep learning-based model, the subjectivity behind the writing and production of misleading news invalidates this technique. Deep learning models are unexplainable in nature, making the contextualization of results impossible because it lacks explicit features used in traditional machine learning. This paper emphasizes the need for feature engineering to effectively address this problem: containing the spread of fake news at the source, not after it has become globally prevalent. Insights from extracted features were used to manipulate the text, which was then tested on deep learning models. The original unknown yet substantial impact that the original features had on deep learning models was successfully depicted in this study.


2020 ◽  
pp. 1-12
Author(s):  
Li Dongmei

English text-to-speech conversion is the key content of modern computer technology research. Its difficulty is that there are large errors in the conversion process of text-to-speech feature recognition, and it is difficult to apply the English text-to-speech conversion algorithm to the system. In order to improve the efficiency of the English text-to-speech conversion, based on the machine learning algorithm, after the original voice waveform is labeled with the pitch, this article modifies the rhythm through PSOLA, and uses the C4.5 algorithm to train a decision tree for judging pronunciation of polyphones. In order to evaluate the performance of pronunciation discrimination method based on part-of-speech rules and HMM-based prosody hierarchy prediction in speech synthesis systems, this study constructed a system model. In addition, the waveform stitching method and PSOLA are used to synthesize the sound. For words whose main stress cannot be discriminated by morphological structure, label learning can be done by machine learning methods. Finally, this study evaluates and analyzes the performance of the algorithm through control experiments. The results show that the algorithm proposed in this paper has good performance and has a certain practical effect.


Sign in / Sign up

Export Citation Format

Share Document