Design and Testing of Automatic Machine Translation System Based on Chinese-English Phrase Translation

Mobile Information Systems ◽

10.1155/2021/3539155 ◽

2021 ◽

Vol 2021 ◽

pp. 1-8

Author(s):

Jing Ning ◽

Haidong Ban

Keyword(s):

Machine Translation ◽

Language Processing ◽

Evaluation System ◽

Large Scale ◽

Automatic Machine ◽

Translation System ◽

Automatic Translation ◽

Translation Methods ◽

Translation Systems ◽

Design And Testing

With the development of linguistics and improvements in computer performance, machine translation quality keeps getting better, and the technology is now widely used. The Chinese-English phrase-based automatic translation method takes short sentences as the basic translation unit and makes full use of the order of short sentences. Compared with word-based statistical machine translation methods, its results are greatly improved. This article studies the design of phrase-based automatic machine translation systems: it introduces machine translation methods and Chinese-English phrase translation, explores the design and testing of automatic machine translation systems based on Chinese-English phrase translation, and explains the role of automatic machine translation in promoting the development of translation. By combining machine translation experiments with system design methods, the study aims to deepen people's understanding of language, knowledge, and intelligence, to help solve other language processing problems, and to promote the development of corpus linguistics. The experimental results show that when the Chinese-English phrase translation probability table is changed from 82% to 51%, the BLEU evaluation score for Chinese-English phrase combination improves. Automatic machine translation saves time and effort in translation work, and it shows its advantages through its short development cycle and easy processing of large-scale corpora.
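The abstract above reports results in terms of BLEU without describing the metric. As a rough illustration only, the following is a minimal sketch of single-reference, sentence-level BLEU (clipped n-gram precision combined with a brevity penalty). The function names `ngrams` and `sentence_bleu` are hypothetical, and the sketch assumes uniform weights and no smoothing; it is not the evaluation code used in the paper.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Return a Counter of all n-grams (as tuples) in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def sentence_bleu(candidate, reference, max_n=4):
    """Single-reference BLEU with uniform weights and no smoothing (illustrative)."""
    log_precision_sum = 0.0
    for n in range(1, max_n + 1):
        cand_counts = ngrams(candidate, n)
        ref_counts = ngrams(reference, n)
        # Clip each candidate n-gram count by its count in the reference.
        overlap = sum(min(count, ref_counts[gram]) for gram, count in cand_counts.items())
        total = sum(cand_counts.values())
        if overlap == 0 or total == 0:
            return 0.0  # unsmoothed BLEU is zero when any n-gram order has no match
        log_precision_sum += math.log(overlap / total)
    # The brevity penalty lowers the score of candidates shorter than the reference.
    if len(candidate) >= len(reference):
        brevity_penalty = 1.0
    else:
        brevity_penalty = math.exp(1 - len(reference) / len(candidate))
    return brevity_penalty * math.exp(log_precision_sum / max_n)
```

Called as `sentence_bleu("the cat sat on the mat".split(), "the cat is on the mat".split(), max_n=2)`, the sketch combines a unigram precision of 5/6 with a bigram precision of 3/5. Corpus-level BLEU, as usually reported, aggregates counts over all sentences before taking the geometric mean, which this sentence-level sketch does not do.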


A Survey of Orthographic Information in Machine Translation

SN Computer Science ◽

10.1007/s42979-021-00723-4 ◽

2021 ◽

Vol 2 (4) ◽

Author(s):

Bharathi Raja Chakravarthi ◽

Priya Rani ◽

Mihael Arcan ◽

John P. McCrae

Keyword(s):

Machine Translation ◽

Language Processing ◽

Orthographic Knowledge ◽

Translation System ◽

Neural Machine Translation ◽

Machine Translation System ◽

Translation Methods ◽

Traditional Approaches ◽

Translation Systems ◽

Different Levels

Machine translation is one of the applications of natural language processing which has been explored in different languages. Recently, researchers have started paying attention to machine translation for resource-poor languages and closely related languages. A widespread and underlying problem for these machine translation systems is the linguistic difference and variation in orthographic conventions, which causes many issues for traditional approaches. Two languages written in two different orthographies are not easily comparable, but orthographic information can also be used to improve the machine translation system. This article offers a survey of research regarding orthography’s influence on machine translation of under-resourced languages. It introduces under-resourced languages in terms of machine translation and how orthographic information can be utilised to improve machine translation. We describe previous work in this area, discussing what underlying assumptions were made, and showing how orthographic knowledge improves the performance of machine translation of under-resourced languages. We discuss different types of machine translation and demonstrate a recent trend that seeks to link orthographic information with well-established machine translation methods. Considerable attention is given to current efforts using cognate information at different levels of machine translation and the lessons that can be drawn from this. Additionally, multilingual neural machine translation of closely related languages is given a particular focus in this survey. This article ends with a discussion of the way forward in machine translation with orthographic information, focusing on multilingual settings and bilingual lexicon induction.


Machine Translation

The Oxford Handbook of Computational Linguistics 2nd edition ◽

10.1093/oxfordhb/9780199573691.013.26 ◽

2016 ◽

Author(s):

Lucia Specia ◽

Yorick Wilks

Keyword(s):

Machine Translation ◽

Language Processing ◽

State Of The Art ◽

Research Area ◽

Rule Based ◽

Active Research ◽

Translation Methods ◽

The Cost ◽

Translation Systems ◽

Active Research Area

Machine Translation (MT) is and always has been a core application in the field of natural-language processing. It is a very active research area and has been attracting significant commercial interest, most of which has been driven by the deployment of corpus-based, statistical approaches. These can be built in a much shorter time and at a fraction of the cost of traditional, rule-based approaches, and yet produce translations of comparable or superior quality. This chapter aims at introducing MT and its main approaches. It provides a historical overview of the field, an introduction to different translation methods, both rationalist (rule-based) and empirical, and a more in-depth description of state-of-the-art statistical methods. Finally, it covers popular metrics to evaluate the output of machine translation systems.


On the Statistical Machine Translation Studies

Applied Mechanics and Materials ◽

10.4028/www.scientific.net/amm.347-350.3262 ◽

2013 ◽

Vol 347-350 ◽

pp. 3262-3266

Author(s):

Ai Ling Wang

Keyword(s):

Natural Language Processing ◽

Natural Language ◽

Machine Translation ◽

Language Processing ◽

Large Scale ◽

Statistical Machine Translation ◽

Translation Model ◽

Statistical Natural Language Processing ◽

Translation Methods ◽

Important Branch

Machine translation (MT) is one of the core applications of natural language processing and an important branch of artificial intelligence research; statistical methods have already become the mainstream of machine translation. This paper presents a comparative analysis of the translation models of statistical natural language processing based on large-scale corpora. It discusses word-based, phrase-based and syntax-based machine translation methods in turn, summarizes the evaluation factors of machine translation, and analyzes evaluation methods of machine translation.


Methodology for the Evaluation of Machine Translation Quality

Translation Studies: Theory and Practice ◽

10.46991/tstp/2021.1.1.133 ◽

2021 ◽

Vol 1 (1) ◽

pp. 124-133

Author(s):

Ani Ananyan ◽

Roza Avagyan

Keyword(s):

Artificial Intelligence ◽

Machine Translation ◽

Evaluation Method ◽

Evaluation Process ◽

Translation System ◽

Translation Quality ◽

Automatic Translation ◽

Methods Of Evaluation ◽

Translation Systems ◽

Widespread Dissemination

Along with the development and widespread dissemination of translation by artificial intelligence, it is becoming increasingly important to continuously evaluate and improve its quality and to use it as a tool for the modern translator. In our research, we compared five sentences translated from Armenian into Russian and English by Google Translator, Yandex Translator and two models of the translation system of the Armenian company Avromic to find out how effective these translation systems are when working with Armenian. It was necessary to find out how effective it would be to use them as a translation tool and in the learning process by further editing the translation. As there is currently no comprehensive and successful human evaluation method for machine translation, we developed our own evaluation method and criteria by studying the world's best-known evaluation methods for automatic translation. We also used the post-editing distance criterion. Using one sentence as an example, the article presents in detail the evaluation process according to the selected and developed criteria. Finally, we present the results of the research and draw appropriate conclusions.
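The abstract above mentions a post-editing distance criterion but does not define it. One common way to approximate such a criterion is a word-level edit distance between the raw machine output and its post-edited version, normalized by the length of the post-edited text. The sketch below assumes that reading; `word_edit_distance` and `post_edit_rate` are hypothetical names, and the authors' actual criterion may differ (for example, it might operate on characters or also count word shifts).

```python
def word_edit_distance(hypothesis, post_edited):
    """Levenshtein distance over word tokens (insert, delete, substitute)."""
    # previous[j] holds the distance between the processed hypothesis prefix
    # and the first j words of the post-edited text.
    previous = list(range(len(post_edited) + 1))
    for i, hyp_word in enumerate(hypothesis, start=1):
        current = [i]
        for j, ref_word in enumerate(post_edited, start=1):
            substitution = previous[j - 1] + (hyp_word != ref_word)
            current.append(min(previous[j] + 1, current[j - 1] + 1, substitution))
        previous = current
    return previous[-1]


def post_edit_rate(hypothesis, post_edited):
    """Edits per word of the post-edited text; 0.0 means no editing was needed."""
    if not post_edited:
        return 0.0 if not hypothesis else float("inf")
    return word_edit_distance(hypothesis, post_edited) / len(post_edited)
```

For instance, `post_edit_rate("the cat sat on mat".split(), "the cat sat on the mat".split())` counts one inserted word against six words of post-edited text. A lower rate suggests that the machine output needed less human correction.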


A Review and evaluation of Machine Translation methods for Lumasaaba

Journal of Digital Science ◽

10.33847/2686-8296.2.1_1 ◽

2020 ◽

pp. 3-17

Author(s):

Peter Nabende

Keyword(s):

Natural Language Processing ◽

Natural Language ◽

Machine Translation ◽

Language Processing ◽

Research Area ◽

Data Driven ◽

East African ◽

Data Set ◽

African Languages ◽

Translation Methods

Natural Language Processing for under-resourced languages is now a mainstream research area. However, there are limited studies on Natural Language Processing applications for many indigenous East African languages. As a contribution to closing the current knowledge gap, this paper focuses on evaluating the application of well-established machine translation methods for one heavily under-resourced indigenous East African language called Lumasaaba. Specifically, we review the most common machine translation methods in the context of Lumasaaba, including both rule-based and data-driven methods. Then we apply a state-of-the-art data-driven machine translation method to learn models for automating translation between Lumasaaba and English using a very limited data set of parallel sentences. Automatic evaluation results show that a transformer-based Neural Machine Translation model architecture leads to consistently better BLEU scores than the recurrent neural network-based models. Moreover, the automatically generated translations can be comprehended to a reasonable extent and are usually associated with the source language input.


English-Dogri Translation System using MOSES

Circulation in Computer Science ◽

10.22632/ccs-2016-251-25 ◽

2016 ◽

Vol 1 (1) ◽

pp. 45-49

Author(s):

Avinash Singh ◽

Asmeet Kour ◽

Shubhnandan S. Jamwal

Keyword(s):

Natural Language Processing ◽

Machine Translation ◽

Language Processing ◽

Statistical Machine Translation ◽

Translation System ◽

Parallel Corpus ◽

English System ◽

Machine Translation System ◽

Translation Machine ◽

Language Pair

The objective of this paper is to analyze English-Dogri parallel corpus translation. Machine translation is the translation from one language into another language. Machine translation is the biggest application of Natural Language Processing (NLP). Moses is a statistical machine translation system that allows translation models to be trained for any language pair. We have developed a translation system using a statistical approach which helps in translating English to Dogri and vice versa. The parallel corpus consists of 98,973 sentences. The system achieves an accuracy of 80% in translating English to Dogri and 87% in translating Dogri to English.


Efficient Embedded Decoding of Neural Network Language Models in a Machine Translation System

International Journal of Neural Systems ◽

10.1142/s0129065718500077 ◽

2018 ◽

Vol 28 (09) ◽

pp. 1850007

Author(s):

Francisco Zamora-Martinez ◽

Maria Jose Castro-Bleda

Keyword(s):

Neural Network ◽

Machine Translation ◽

Language Processing ◽

Traditional Approach ◽

Computational Cost ◽

Integrated Approach ◽

Language Models ◽

Translation System ◽

Neural Net ◽

Network Language

Neural Network Language Models (NNLMs) are a successful approach to Natural Language Processing tasks, such as Machine Translation. We introduce in this work a Statistical Machine Translation (SMT) system which fully integrates NNLMs in the decoding stage, breaking with the traditional approach based on N-best list rescoring. The neural net models (both language models (LMs) and translation models) are fully coupled in the decoding stage, allowing them to influence the translation quality more strongly. Computational issues were solved by using a novel idea based on memorization and smoothing of the softmax constants to avoid their computation, which introduces a trade-off between LM quality and computational cost. These ideas were studied in a machine translation task with different combinations of neural networks used both as translation models and as target LMs, comparing phrase-based and N-gram-based systems, showing that the integrated approach seems more promising for N-gram-based systems, even with non-full-quality NNLMs.


Development of a method for assessing the quality of machine translation based on ensemble methods in machine learning

Science Intensive Technologies ◽

10.18127/j19998465-202102-06 ◽

2021 ◽

Author(s):

A.V. Kozina ◽

Yu.S. Belov

Keyword(s):

Machine Learning ◽

Random Forest ◽

Quality Assessment ◽

Machine Translation ◽

Translation System ◽

Human Judgment ◽

Translation Quality ◽

Machine Translation System ◽

Translation Systems

Automatically assessing the quality of machine translation is an important yet challenging task for machine translation research. Translation quality assessment is understood as predicting translation quality without reference to the source text. Translation quality depends on the specific machine translation system and often requires post-editing. Manual editing is a long and expensive process. As the need to determine translation quality quickly grows, its automation is required. In this paper, we propose a quality assessment method based on ensemble supervised machine learning methods. The WMT 2019 bilingual corpus for the English-Russian language pair was used as data. The text data volume is 17089 sentences; 85% of the data was used for training and 15% for testing the model. Linguistic features extracted from the text in the source and target languages were used for training the system, since it is these characteristics that can most accurately characterize the translation in terms of quality. The following tools were used for feature extraction: a free language modeling tool based on SRILM and the Stanford POS Tagger part-of-speech tagger. Before training the system, the text was preprocessed. The model was trained using three regression methods: Bagging, Extra Trees, and Random Forest. The algorithms were implemented in the Python programming language using the scikit-learn library. The parameters of the random forest method were optimized using a grid search. The performance of the model was assessed by the mean absolute error (MAE) and the root mean square error (RMSE), as well as by the Pearson coefficient, which determines the correlation with human judgment. Testing was carried out using three machine translation systems: the Google and Bing neural systems, and the Moses statistical machine translation systems based on phrases and on syntax. Based on the results of the work, the Extra Trees method performed best. In addition, for all categories of indicators under consideration, the best results are achieved using the Google machine translation system. The developed method showed good results close to human judgment. The system can be used for further research in the task of assessing the quality of translation.
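The three measures named in the abstract above (MAE, RMSE, and the Pearson coefficient) have standard definitions. The following sketch shows them in plain Python for predicted quality scores against human scores. The function names are hypothetical and the code is not taken from the paper, which reports using scikit-learn for its models.

```python
import math


def mean_absolute_error(predicted, actual):
    """Average absolute difference between predictions and human scores."""
    return sum(abs(p - a) for p, a in zip(predicted, actual)) / len(actual)


def root_mean_square_error(predicted, actual):
    """Square root of the average squared difference; penalizes large errors more."""
    return math.sqrt(sum((p - a) ** 2 for p, a in zip(predicted, actual)) / len(actual))


def pearson_correlation(xs, ys):
    """Linear correlation in [-1, 1]; assumes neither input is constant."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return covariance / (spread_x * spread_y)
```

MAE and RMSE measure how far predicted scores are from human scores, so lower is better. The Pearson coefficient measures how well the predictions track human judgment, so values closer to 1 are better.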


A Survey on Hybrid Machine Translation

E3S Web of Conferences ◽

10.1051/e3sconf/202018401061 ◽

2020 ◽

Vol 184 ◽

pp. 01061

Author(s):

Anusha Anugu ◽

Gajula Ramesh

Keyword(s):

Machine Translation ◽

Language Processing ◽

Literature Survey ◽

Neural Machine Translation ◽

Translation Tools ◽

Translation Techniques ◽

Hybrid Machine ◽

Hybrid Machine Translation ◽

Translation Systems ◽

Evaluation Techniques

Machine translation has developed gradually since the 1940s. It has gained more and more attention because it is effective and efficient, producing translations automatically without human effort. This paper summarizes the distinct models of machine translation, including Neural Machine Translation (NMT). Researchers have previously done a great deal of work on machine translation techniques and their evaluation techniques. We therefore present an analysis of the existing techniques for machine translation, including Neural Machine Translation, their differences, and the translation tools associated with them. Nowadays, combining two machine translation systems makes it possible to take full advantage of features from both systems, which has attracted interest in the domain of natural language processing. The paper therefore also includes a literature survey of Hybrid Machine Translation (HMT).


Translation of Medical Texts using Neural Networks

International Journal of Reliable and Quality E-Healthcare ◽

10.4018/ijrqeh.2016100104 ◽

2016 ◽

Vol 5 (4) ◽

pp. 51-66 ◽

Cited By ~ 5

Author(s):

Krzysztof Wolk ◽

Krzysztof P. Marasek

Keyword(s):

Machine Translation ◽

Statistical Machine Translation ◽

European Medicines Agency ◽

Translation System ◽

Training Methods ◽

Neural Machine Translation ◽

Machine Translation System ◽

Source Sentence ◽

Parallel Text ◽

Translation Systems

The quality of machine translation is rapidly evolving. Today one can find several machine translation systems on the web that provide reasonable translations, although the systems are not perfect. In some specific domains, the quality may decrease. A recently proposed approach to this domain is neural machine translation. It aims at building a jointly tuned single neural network that maximizes translation performance, a very different approach from traditional statistical machine translation. Recently proposed neural machine translation models often belong to the encoder-decoder family, in which a source sentence is encoded into a fixed-length vector that is, in turn, decoded to generate a translation. The present research examines the effects of different training methods on a Polish-English machine translation system used for medical data. The European Medicines Agency parallel text corpus was used as the basis for training neural network-based and statistical translation systems. The comparison and implementation of a medical translator are the main focus of our experiments.
