On the Effects of Automatic Transcription and Segmentation Errors in Hungarian Spoken Language Processing

2019 · Vol 63 (4) · pp. 254-262
Author(s): Máté Ákos Tündik, Valér Kaszás, György Szaszák

Emerging Artificial Intelligence (AI) technology has enabled machines to match or even surpass human capabilities in several fields; nevertheless, making a computer able to understand human language remains a challenge. When dealing with speech understanding, Automatic Speech Recognition (ASR) is used to generate transcripts, which are then processed with text-based tools targeting Spoken Language Understanding (SLU). Depending on ASR quality (which in turn depends on speech quality, topic complexity, environment, etc.), transcripts contain errors, which propagate further down the processing pipeline. Subjective tests show, on the other hand, that humans understand ASR-generated closed captions quite well despite word and punctuation errors. Using word-embedding-based semantic parsing, the present paper quantifies the semantic bias introduced by ASR error propagation. As a special use case, speech summarization is also evaluated with regard to ASR error propagation. We show that, despite the higher word error rates seen for the highly inflectional Hungarian language, the semantic space suffers less impact than the difference in Word Error Rate would suggest.
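The semantic-bias idea above can be illustrated with a minimal sketch: embed both the reference transcript and the ASR output (here as a simple average of word vectors) and compare them with cosine similarity. The words, vectors, and the one-word ASR confusion below are toy stand-ins for a trained embedding model, not the paper's actual data or method details.

```python
import numpy as np

# Toy word vectors standing in for a trained embedding model (hypothetical).
emb = {
    "the": np.array([0.1, 0.3, 0.0]),
    "cat": np.array([0.9, 0.1, 0.2]),
    "cap": np.array([0.8, 0.2, 0.3]),  # plausible ASR confusion for "cat"
    "sat": np.array([0.2, 0.7, 0.5]),
}

def doc_vector(tokens):
    """Average-of-word-vectors document embedding; OOV tokens are skipped."""
    vecs = [emb[t] for t in tokens if t in emb]
    return np.mean(vecs, axis=0)

def cosine(a, b):
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

ref = "the cat sat".split()  # reference (human) transcript
hyp = "the cap sat".split()  # ASR output with one word error

# Semantic bias: how far the ASR transcript drifts in embedding space.
# A word error that lands on a semantically close word moves the document
# vector only slightly, even though Word Error Rate counts it fully.
similarity = cosine(doc_vector(ref), doc_vector(hyp))
```

This illustrates the paper's core observation: surface-level Word Error Rate can overstate the semantic damage, because acoustically confusable words are often also semantically near each other in the embedding space.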

1998 · Vol 21 (2) · pp. 267-268
Author(s): Steven Greenberg

Understanding spoken language involves far more than decoding a linear sequence of phonetic elements. In view of the inherent variability of the acoustic signal in spontaneous speech, it is not entirely clear that the sort of representation derived from locus equations is sufficient to account for the robustness of spoken language understanding under real-world conditions. An alternative representation, based on the low-frequency modulation spectrum, provides a more plausible neural foundation for spoken language processing.


1991
Author(s): Lynette Hirschman, Stephanie Seneff, David Goodine, Michael Phillips

2020
Author(s): Saad Ghojaria, Rahul Kotian, Yash Sawant, Suresh Mestry

Author(s): Timnit Gebru

This chapter discusses the role of race and gender in artificial intelligence (AI). The rapid permeation of AI into society has not been accompanied by a thorough investigation of the sociopolitical issues that cause certain groups of people to be harmed rather than advantaged by it. For instance, recent studies have shown that commercial automated facial analysis systems have much higher error rates for dark-skinned women, while having minimal errors on light-skinned men. Moreover, a 2016 ProPublica investigation uncovered that machine learning–based tools that assess crime recidivism rates in the United States are biased against African Americans. Other studies show that natural language–processing tools trained on news articles exhibit societal biases. While many technical solutions have been proposed to alleviate bias in machine learning systems, a holistic and multifaceted approach must be taken. This includes standardization bodies determining what types of systems can be used in which scenarios, making sure that automated decision tools are created by people from diverse backgrounds, and understanding the historical and political factors that disadvantage certain groups who are subjected to these tools.


Author(s): Prashanth Gurunath Shivakumar, Naveen Kumar, Panayiotis Georgiou, Shrikanth Narayanan

Electronics · 2021 · Vol 10 (12) · pp. 1372
Author(s): Sanjanasri JP, Vijay Krishna Menon, Soman KP, Rajendran S, Agnieszka Wolk

Linguists have long focused on qualitative comparison of semantics across languages. Evaluating semantic interpretation for a disparate language pair such as English and Tamil is an even more formidable task than for closely related languages such as the Slavic family. The concept of word embedding in Natural Language Processing (NLP) has opened a felicitous opportunity to quantify linguistic semantics. Multi-lingual tasks can be performed by projecting the word embeddings of one language onto the semantic space of the other. This research presents a suite of data-efficient deep learning approaches to deduce the transfer function from the embedding space of English to that of Tamil, deploying three popular embedding algorithms: Word2Vec, GloVe and FastText. A novel evaluation paradigm was devised for the generated embeddings to assess their effectiveness, using the original embeddings as ground truths. Transferability of the proposed model to other target languages was assessed via pre-trained Word2Vec embeddings for Hindi and Chinese. We empirically show that, with a bilingual dictionary of a thousand words and a correspondingly small monolingual target (Tamil) corpus, useful embeddings can be generated by transfer learning from a well-trained source (English) embedding. Furthermore, we demonstrate the usability of the generated target embeddings in a few NLP use-case tasks, such as text summarization, part-of-speech (POS) tagging, and bilingual dictionary induction (BDI), bearing in mind that these are not the only possible applications.
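The cross-lingual transfer described above is commonly realized as a mapping learned from a bilingual dictionary of word pairs; a minimal least-squares sketch with synthetic vectors is given below. The dimensions, pair count, and generated data are illustrative assumptions (the paper uses deep learning models on real English/Tamil embeddings, not this linear toy):

```python
import numpy as np

rng = np.random.default_rng(0)
dim = 50        # embedding dimensionality (illustrative)
n_pairs = 1000  # bilingual dictionary of ~a thousand word pairs

# Synthetic stand-ins: X holds source-language (English) vectors for the
# dictionary words; Y holds the corresponding target-language (Tamil)
# vectors, simulated here as a noisy linear image of X.
X = rng.normal(size=(n_pairs, dim))
W_true = rng.normal(size=(dim, dim))
Y = X @ W_true + 0.01 * rng.normal(size=(n_pairs, dim))

# Learn the transfer function W minimizing ||XW - Y||_F via least squares,
# the classic linear-mapping baseline for cross-lingual embedding transfer.
W, *_ = np.linalg.lstsq(X, Y, rcond=None)

# Project a held-out source-language vector into the target space; its
# nearest neighbors among target vectors would yield dictionary induction.
x_new = rng.normal(size=(1, dim))
y_pred = x_new @ W
```

With the mapping in hand, bilingual dictionary induction (BDI) amounts to projecting a source word vector and retrieving the nearest target-language vectors by cosine similarity.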

