Incremental Learning of Transfer Rules for Customized Machine Translation

Author(s):  
Werner Winiwarter
2018 ◽  
Vol 6 (3) ◽  
pp. 79-92
Author(s):  
Sahar A. El-Rahman ◽  
Tarek A. El-Shishtawy ◽  
Raafat A. El-Kammar

This article presents a realistic technique for a machine-aided translation system. In this technique, the system dictionary is partitioned into a multi-module structure for fast retrieval of the Arabic features of English words. Each module is accessed through an interface that includes the necessary morphological rules, which direct the search toward the proper sub-dictionary. Fast retrieval is further aided by predicting the word category and accessing the corresponding sub-dictionary to retrieve the word's attributes. The system consists of three main parts: source language analysis, the transfer rules between the source language (English) and the target language (Arabic), and generation of the target language. The proposed system is able to translate some negative forms, demonstrations, and conjunctions, and to adjust nouns, verbs, and adjectives according to their attributes. It then adds the symptom of Arabic words to generate a correct sentence.
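
The partitioned lookup described above can be illustrated with a minimal sketch. The suffix rules, the module names, and the dictionary entries are all hypothetical and chosen only for illustration; the abstract does not specify the actual rules or data layout.

```python
from typing import Dict, Optional, Tuple

# Hypothetical sub-dictionaries, one module per word category.
SUB_DICTIONARIES: Dict[str, Dict[str, Dict[str, str]]] = {
    "noun": {"book": {"arabic": "كتاب", "gender": "masculine"}},
    "verb": {"walk": {"arabic": "مشى"}},
    "adjective": {"big": {"arabic": "كبير"}},
}

# Toy morphological rules (an assumption): a suffix predicts the category.
SUFFIX_RULES = [("ed", "verb"), ("s", "noun")]


def predict_category(word: str) -> Tuple[Optional[str], str]:
    """Return (predicted category, stem); category is None if no rule fires."""
    for suffix, category in SUFFIX_RULES:
        if word.endswith(suffix) and len(word) > len(suffix):
            return category, word[: -len(suffix)]
    return None, word


def lookup(word: str) -> Optional[Tuple[str, Dict[str, str]]]:
    """Search the predicted sub-dictionary first, then fall back to all modules."""
    word = word.lower()
    category, stem = predict_category(word)
    if category is not None:
        entry = SUB_DICTIONARIES[category].get(stem)
        if entry is not None:
            return category, entry
    # Fallback: no rule fired or the stem was missing, so scan every module.
    for name, module in SUB_DICTIONARIES.items():
        if word in module:
            return name, module[word]
    return None
```

The point of the design is that a successful category prediction limits the search to one small module, and the scan over all modules is only a fallback.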


1992 ◽  
Vol 2 (1) ◽  
pp. 1-32 ◽  
Author(s):  
Laurence Danlos

Abstract This article deals with constructions such as Jean a fait une promenade or Jean a soif which contain verbs called here ‘support verbs’. These structures are known to pose immense difficulties for the translator (whether human or automatic) and part of the purpose of this paper is to suggest representations which render their translation easier on the basis of work carried out by the author within the EC Eurotra Machine Translation project. First of all, it is argued on linguistic grounds that support verb constructions behave differently from constructions containing ‘ordinary’ verbs such as lire or ouvrir. In particular, it is claimed that the syntactic and semantic head of Jean a fait une promenade is the noun promenade and not the verb faire which is a mere carrier of tense and aspect. We then raise the question of the representation of support verb constructions for the purposes of machine translation and examine several alternative possibilities. The representations adopted below are shown to lead to simple transfer rules limited to the substitution of lexical items which do not entail complex structural changes between source and target sentences. The linguistic ideas presented here have been implemented in nine languages within the Eurotra project but most of the discussion is based on contrastive evidence between French and English.


2019 ◽  
Vol 8 (2S8) ◽  
pp. 1324-1330

The Bicolano-Tagalog Transfer-based Machine Translation System is a unidirectional machine translator from Bicolano to Tagalog. The transfer-based approach is divided into three phases: Pre-Processing Analysis, Morphological Transfer, and Sentence Generation. The system first analyzes the source language (Bicolano) input to create an internal representation. This phase includes the tokenizer, stemmer, POS tagger, and parser. Through transfer rules, the system then manipulates this internal representation to transfer the parsed source language syntactic structure into a target language syntactic structure. Finally, the system generates a Tagalog sentence from its own morphological and syntactic information. Each phase undergoes training and evaluation tests to assess the quality of the end results. Overall performance shows a 71.71% accuracy rate.


2016 ◽  
Vol 106 (1) ◽  
pp. 193-204
Author(s):  
Víctor M. Sánchez-Cartagena ◽  
Juan Antonio Pérez-Ortiz ◽  
Felipe Sánchez-Martínez

Abstract This paper presents ruLearn, an open-source toolkit for the automatic inference of rules for shallow-transfer machine translation from scarce parallel corpora and morphological dictionaries. ruLearn will make rule-based machine translation a very appealing alternative for under-resourced language pairs because it avoids the need for human experts to handcraft transfer rules and requires, in contrast to statistical machine translation, a small amount of parallel corpora (a few hundred parallel sentences proved to be sufficient). The inference algorithm implemented by ruLearn has been recently published by the same authors in Computer Speech & Language (volume 32). It is able to produce rules whose translation quality is similar to that obtained by using hand-crafted rules. ruLearn generates rules that are ready for their use in the Apertium platform, although they can be easily adapted to other platforms. When the rules produced by ruLearn are used together with a hybridisation strategy for integrating linguistic resources from shallow-transfer rule-based machine translation into phrase-based statistical machine translation (published by the same authors in Journal of Artificial Intelligence Research, volume 55), they help to mitigate data sparseness. This paper also shows how to use ruLearn and describes its implementation.


2010 ◽  
Vol 93 (1) ◽  
pp. 17-26 ◽  
Author(s):  
Yvette Graham

Sulis: An Open Source Transfer Decoder for Deep Syntactic Statistical Machine Translation

In this paper, we describe an open source transfer decoder for Deep Syntactic Transfer-Based Statistical Machine Translation. Transfer decoding involves the application of transfer rules to a SL structure. The N-best TL structures are found via a beam search of TL hypothesis structures which are ranked via a log-linear combination of feature scores, such as translation model and dependency-based language model.
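
As a rough illustration of beam search with log-linear ranking, the sketch below expands hypotheses one SL node at a time and keeps the best-scoring partial hypotheses. It is not the Sulis implementation: the function names, the feature names `tm` and `lm`, and the simplification that every feature is local to a single rule application are assumptions made for brevity.

```python
import heapq
from typing import Dict, List, Sequence, Tuple

Features = Dict[str, float]          # feature name -> log-domain score
Option = Tuple[str, Features]        # (TL item, features of the rule producing it)
Hypothesis = Tuple[float, Tuple[str, ...]]


def log_linear_score(features: Features, weights: Dict[str, float]) -> float:
    """Weighted sum of feature scores (the log-linear model score)."""
    return sum(weights[name] * value for name, value in features.items())


def beam_search(
    options_per_node: Sequence[List[Option]],
    weights: Dict[str, float],
    beam_size: int,
    n_best: int,
) -> List[Hypothesis]:
    """Return up to n_best TL hypotheses, best first."""
    beam: List[Hypothesis] = [(0.0, ())]
    for options in options_per_node:
        # Extend every surviving hypothesis with every option for this node.
        expanded = [
            (score + log_linear_score(feats, weights), items + (tl,))
            for score, items in beam
            for tl, feats in options
        ]
        # Prune: keep only the beam_size highest-scoring hypotheses.
        beam = heapq.nlargest(beam_size, expanded, key=lambda h: h[0])
    return beam[:n_best]
```

In a real decoder a dependency-based language model would score an item in the context of its neighbours, so the features could not be summed independently per node as they are here.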


2009 ◽  
Vol 34 ◽  
pp. 605-635 ◽  
Author(s):  
F. Sánchez-Martínez ◽  
M. L. Forcada

This paper describes a method for the automatic inference of structural transfer rules to be used in a shallow-transfer machine translation (MT) system from small parallel corpora. The structural transfer rules are based on alignment templates, like those used in statistical MT. Alignment templates are extracted from sentence-aligned parallel corpora and extended with a set of restrictions which are derived from the bilingual dictionary of the MT system and control their application as transfer rules. The experiments conducted using three different language pairs in the free/open-source MT platform Apertium show that translation quality is improved as compared to word-for-word translation (when no transfer rules are used), and that the resulting translation quality is close to that obtained using hand-coded transfer rules. The method we present is entirely unsupervised and benefits from information in the rest of the modules of the MT system in which the inferred rules are applied.


2016 ◽  
Vol 55 ◽  
pp. 17-61 ◽  
Author(s):  
Víctor M. Sánchez-Cartagena ◽  
Juan Antonio Pérez-Ortiz ◽  
Felipe Sánchez-Martínez

We describe a hybridisation strategy whose objective is to integrate linguistic resources from shallow-transfer rule-based machine translation (RBMT) into phrase-based statistical machine translation (PBSMT). It basically consists of enriching the phrase table of a PBSMT system with bilingual phrase pairs matching transfer rules and dictionary entries from a shallow-transfer RBMT system. This new strategy takes advantage of how the linguistic resources are used by the RBMT system to segment the source-language sentences to be translated, and overcomes the limitations of existing hybrid approaches that treat the RBMT systems as a black box. Experimental results confirm that our approach delivers translations of higher quality than existing ones, and that it is especially useful when the parallel corpus available for training the SMT system is small or when translating out-of-domain texts that are well covered by the RBMT dictionaries. A combination of this approach with a recently proposed unsupervised shallow-transfer rule inference algorithm results in a significantly greater translation quality than that of a baseline PBSMT; in this case, the only hand-crafted resources used are the dictionaries commonly used in RBMT. Moreover, the translation quality achieved by the hybrid system built with automatically inferred rules is similar to that obtained by those built with hand-crafted rules.
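
The phrase-table enrichment step can be pictured with a small sketch. The data layout, the `from_rbmt` indicator feature, and the default score given to RBMT-only pairs are assumptions for illustration; the abstract does not state how the added pairs are scored.

```python
from typing import Dict, Iterable, Tuple

PhrasePair = Tuple[str, str]                    # (source phrase, target phrase)
PhraseTable = Dict[PhrasePair, Dict[str, float]]


def enrich_phrase_table(
    corpus_table: PhraseTable,
    rbmt_pairs: Iterable[PhrasePair],
    default_score: float = 1.0,
) -> PhraseTable:
    """Return a new phrase table that also contains the RBMT-derived pairs."""
    # Copy corpus-extracted entries and mark them as not coming from RBMT.
    enriched: PhraseTable = {
        pair: {**feats, "from_rbmt": 0.0} for pair, feats in corpus_table.items()
    }
    for pair in rbmt_pairs:
        if pair in enriched:
            # Pair found in both sources: keep corpus scores, flip the flag.
            enriched[pair]["from_rbmt"] = 1.0
        else:
            # Pair known only to the RBMT system (hypothetical default score).
            enriched[pair] = {"translation_prob": default_score, "from_rbmt": 1.0}
    return enriched
```

An indicator feature of this kind would let the log-linear model learn how much to trust RBMT-derived pairs relative to corpus-extracted ones.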

