The Language Demographics of Amazon Mechanical Turk

Author(s):  
Ellie Pavlick ◽  
Matt Post ◽  
Ann Irvine ◽  
Dmitry Kachaev ◽  
Chris Callison-Burch

We present a large-scale study of the languages spoken by bilingual workers on Mechanical Turk (MTurk). We establish a methodology for determining the language skills of anonymous crowd workers that is more robust than simple surveying. We validate workers’ self-reported language skill claims by measuring their ability to correctly translate words, and by geolocating workers to see if they reside in countries where the languages are likely to be spoken. Rather than posting a one-off survey, we posted paid tasks consisting of 1,000 assignments to translate a total of 10,000 words in each of 100 languages. Our study ran for several months, and was highly visible on the MTurk crowdsourcing platform, increasing the chances that bilingual workers would complete it. Our study was useful both to create bilingual dictionaries and to act as a census of the bilingual speakers on MTurk. We use this data to recommend languages with the largest speaker populations as good candidates for other researchers who want to develop crowdsourced, multilingual technologies. To further demonstrate the value of creating data via crowdsourcing, we hire workers to create bilingual parallel corpora in six Indian languages, and use them to train statistical machine translation systems.
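The word-translation validation step described above can be sketched as follows; the gold dictionary, the worker's answers, and the 0.7 acceptance threshold are hypothetical illustrations, not the paper's actual data or cutoff.

```python
# Sketch: validate a worker's self-reported language skill by checking their
# word translations against a small gold-standard bilingual dictionary.
# All data here is hypothetical.
GOLD = {"perro": {"dog"}, "gato": {"cat"}, "casa": {"house", "home"}}

def accuracy(translations):
    """Fraction of a worker's translations that match the gold dictionary."""
    scored = [(src, tgt) for src, tgt in translations.items() if src in GOLD]
    if not scored:
        return 0.0
    correct = sum(tgt.lower() in GOLD[src] for src, tgt in scored)
    return correct / len(scored)

def is_validated(translations, threshold=0.7):
    """Accept a worker whose translation accuracy clears the threshold."""
    return accuracy(translations) >= threshold

worker = {"perro": "dog", "gato": "cat", "casa": "hose"}  # 2 of 3 correct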

2021 ◽  
Vol 111 (2) ◽  
pp. 687-719
Author(s):  
Erik Snowberg ◽  
Leeat Yariv

We leverage a large-scale incentivized survey eliciting behaviors from (almost) an entire undergraduate university student population, a representative sample of the US population, and Amazon Mechanical Turk (MTurk) to address concerns about the external validity of experiments with student participants. Behavior in the student population offers bounds on behaviors in other populations, and correlations between behaviors are similar across samples. Furthermore, non-student samples exhibit higher levels of noise. Adding historical lab participation data, we find a small set of attributes over which lab participants differ from non-lab participants. An additional set of lab experiments shows no evidence of observer effects. (JEL C83, D90, D91)


2019 ◽  
Author(s):  
Antonio Alonso Arechar ◽  
David Gertler Rand

We investigate whether experience playing the Dictator Game (DG) affects prosociality by aggregating data from 37 experiments run on Amazon Mechanical Turk over a six-year period. While prior evidence has shown a correlation between experience on Amazon Mechanical Turk and selfishness, it is unclear to what extent this is the result of selection versus learning. Examining a total of 27,266 decisions made by 17,791 unique individuals, our data shows evidence of significant negative effects of both selection and learning. First, people who participated in a greater total number of our experiments were more selfish, even in their first game – indicating that people who are more likely to select into our experiments are more selfish. Second, a given individual tends to transfer less money over successive experiments – indicating that experience with the DG leads to greater selfishness. These results provide clear evidence of learning even in this non-strategic social setting.
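The selection-versus-learning decomposition can be illustrated with a toy computation; the records below are fabricated for illustration and merely stand in for the study's 27,266 real decisions.

```python
# Illustrative (fabricated) records: (worker_id, game_number, amount_given).
# Selection: do workers who end up playing more games give less in their
# FIRST game? Learning: does a given worker give less over successive games?
from collections import defaultdict

records = [
    ("w1", 1, 5.0), ("w1", 2, 4.0), ("w1", 3, 3.0),
    ("w2", 1, 3.5), ("w2", 2, 3.0),
    ("w3", 1, 6.0),
]

by_worker = defaultdict(list)
for wid, game, amount in records:
    by_worker[wid].append((game, amount))

# Selection signal: first-game gift vs. total number of games played.
first_gifts = {wid: dict(games)[1] for wid, games in by_worker.items()}
total_games = {wid: len(games) for wid, games in by_worker.items()}

# Learning signal: mean within-worker change per successive game.
deltas = []
for games in by_worker.values():
    games = sorted(games)
    deltas += [b2 - b1 for (_, b1), (_, b2) in zip(games, games[1:])]
mean_learning = sum(deltas) / len(deltas) if deltas else 0.0
```

In this toy data the frequent player gives less even in game one (selection) and gifts fall across games (learning), mirroring the two effects the study separates.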


2013 ◽  
Vol 39 (4) ◽  
pp. 999-1023 ◽  
Author(s):  
Gennadi Lembersky ◽  
Noam Ordan ◽  
Shuly Wintner

Translation models used for statistical machine translation are compiled from parallel corpora that are manually translated. The common assumption is that parallel texts are symmetrical: The direction of translation is deemed irrelevant and is consequently ignored. Much research in Translation Studies indicates that the direction of translation matters, however, as translated language (translationese) has many unique properties. It has already been shown that phrase tables constructed from parallel corpora translated in the same direction as the translation task outperform those constructed from corpora translated in the opposite direction. We reconfirm that this is indeed the case, but emphasize the importance of also using texts translated in the “wrong” direction. We take advantage of information pertaining to the direction of translation in constructing phrase tables by adapting the translation model to the special properties of translationese. We explore two adaptation techniques: First, we create a mixture model by interpolating phrase tables trained on texts translated in the “right” and the “wrong” directions. The weights for the interpolation are determined by minimizing perplexity. Second, we define entropy-based measures that estimate the correspondence of target-language phrases to translationese, thereby eliminating the need to annotate the parallel corpus with information pertaining to the direction of translation. We show that incorporating these measures as features in the phrase tables of statistical machine translation systems results in consistent, statistically significant improvement in the quality of the translation.
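The first adaptation technique, a mixture model interpolating two phrase tables with weights chosen by minimizing perplexity, can be sketched as follows; the phrase probabilities and the held-out phrase pairs are hypothetical.

```python
# Sketch: mix "right"- and "wrong"-direction phrase tables as
#   p(e|f) = w * p_right(e|f) + (1 - w) * p_wrong(e|f),
# choosing w on a grid to minimise perplexity on a dev set of phrase pairs.
# All probabilities below are hypothetical.
import math

p_right = {("chien", "dog"): 0.8, ("chat", "cat"): 0.6}
p_wrong = {("chien", "dog"): 0.5, ("chat", "cat"): 0.9}

dev = [("chien", "dog"), ("chat", "cat")]  # held-out phrase pairs

def perplexity(w):
    log_sum = 0.0
    for f, e in dev:
        p = w * p_right.get((f, e), 1e-9) + (1 - w) * p_wrong.get((f, e), 1e-9)
        log_sum += math.log(p)
    return math.exp(-log_sum / len(dev))

# Grid search over interpolation weights in [0, 1].
best_w = min((i / 100 for i in range(101)), key=perplexity)
```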


Machine translation systems are still far from perfect; the concept of interactive machine translation (IMT) was introduced to improve their performance. This paper proposes an IMT system that uses statistical machine translation and a bilingual corpus, over which several evaluation metrics (word error rate, position-independent error rate, translation error rate, n-gram matching) are computed, to translate text from English to Indian languages. Experiments show that the proposed system improves both the speed and productivity of human translators.
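Two of the measures mentioned above, word error rate (WER) and position-independent error rate (PER), follow standard definitions and can be sketched directly:

```python
# WER: word-level Levenshtein distance divided by reference length.
# PER: like WER but ignoring word order (bag-of-words matching).
from collections import Counter

def wer(reference, hypothesis):
    r, h = reference.split(), hypothesis.split()
    d = [[0] * (len(h) + 1) for _ in range(len(r) + 1)]
    for i in range(len(r) + 1):
        d[i][0] = i
    for j in range(len(h) + 1):
        d[0][j] = j
    for i in range(1, len(r) + 1):
        for j in range(1, len(h) + 1):
            cost = 0 if r[i - 1] == h[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,      # deletion
                          d[i][j - 1] + 1,      # insertion
                          d[i - 1][j - 1] + cost)  # substitution
    return d[len(r)][len(h)] / len(r)

def per(reference, hypothesis):
    r, h = Counter(reference.split()), Counter(hypothesis.split())
    matches = sum((r & h).values())  # multiset intersection
    n = sum(r.values())
    return 1 - (matches - max(0, sum(h.values()) - n)) / n
```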


2013 ◽  
Vol 99 (1) ◽  
pp. 17-38
Author(s):  
Matthias Huck ◽  
Erik Scharwächter ◽  
Hermann Ney

Abstract Standard phrase-based statistical machine translation systems generate translations based on an inventory of continuous bilingual phrases. In this work, we extend a phrase-based decoder with the ability to make use of phrases that are discontinuous in the source part. Our dynamic programming beam search algorithm supports separate pruning of coverage hypotheses per cardinality and of lexical hypotheses per coverage, as well as coverage constraints that impose restrictions on the possible reorderings. In addition to investigating these aspects, which are related to the decoding procedure, we also concentrate our attention on the question of how to obtain source-side discontinuous phrases from parallel training data. Two approaches (hierarchical and discontinuous extraction) are presented and compared. On a large-scale Chinese-to-English translation task, we conduct a thorough empirical evaluation in order to study a number of system configurations with source-side discontinuous phrases, and to compare them to setups which employ continuous phrases only.


Author(s):  
Ignatius Ikechukwu Ayogu ◽  
Adebayo Olusola Adetunmbi ◽  
Bolanle Adefowoke Ojokoh

The global demand for translation and translation tools currently surpasses the capacity of available solutions. Besides, there is no one-size-fits-all, off-the-shelf solution for all languages. Thus, the need and urgency to increase the scale of research for the development of translation tools and devices continue to grow, especially for languages suffering under the pressure of globalisation. This paper discusses our experiments on translation systems between English and two Nigerian languages: Igbo and Yorùbá. The study is set up to build parallel corpora, and to train and evaluate English-to-Igbo, English-to-Yorùbá and Igbo-to-Yorùbá phrase-based statistical machine translation systems. The systems were trained on parallel corpora that were created for each language pair, using text from the religious domain, in the course of this research. BLEU scores of 30.04, 29.01 and 18.72 were recorded for the English-to-Igbo, English-to-Yorùbá and Igbo-to-Yorùbá MT systems, respectively. An error analysis of the systems’ outputs was conducted using a linguistically motivated MT error analysis approach; it showed that errors occurred mostly at the lexical, grammatical and semantic levels. While the study reveals the potential of our corpora, it also shows that the size of the corpora remains an issue requiring further attention. Thus, an important target in the immediate future is to increase the quantity and quality of the data.
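The BLEU scores reported above follow the standard definition: a brevity penalty times the geometric mean of modified n-gram precisions. A minimal single-reference, sentence-level sketch (using a small floor on zero precisions rather than proper smoothing):

```python
# Minimal sentence-level BLEU sketch: geometric mean of 1..4-gram modified
# precisions, scaled by the brevity penalty. Single reference only.
import math
from collections import Counter

def ngrams(tokens, n):
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

def bleu(reference, hypothesis, max_n=4):
    ref, hyp = reference.split(), hypothesis.split()
    log_prec = 0.0
    for n in range(1, max_n + 1):
        hyp_ng, ref_ng = ngrams(hyp, n), ngrams(ref, n)
        overlap = sum((hyp_ng & ref_ng).values())  # clipped n-gram matches
        total = max(sum(hyp_ng.values()), 1)
        log_prec += math.log(max(overlap, 1e-9) / total)
    # Brevity penalty: punish hypotheses shorter than the reference.
    bp = min(1.0, math.exp(1 - len(ref) / len(hyp))) if hyp else 0.0
    return bp * math.exp(log_prec / max_n)
```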


2015 ◽  
Vol 103 (1) ◽  
pp. 65-84 ◽  
Author(s):  
Abdullah Alrajeh ◽  
Mahesan Niranjan

Abstract In state-of-the-art phrase-based statistical machine translation systems, modelling phrase reordering is important for enhancing the naturalness of the translated outputs, particularly when the grammatical structures of the language pairs differ significantly. Posing phrase movements as a classification problem, we exploit recent developments in solving large-scale multiclass support vector machines. Using dual coordinate descent methods for learning, we provide a mechanism to shrink the amount of training data required for each iteration, producing significant computational savings while preserving the accuracy of the models. Our approach is a couple of times faster than the maximum-entropy approach and more memory-efficient (50% reduction). Experiments were carried out on an Arabic-English corpus with more than a quarter of a billion words. We achieve BLEU score improvements on top of a strong baseline system with sparse reordering features.
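The paper trains large-scale multiclass SVMs with dual coordinate descent; as a lightweight stand-in, the sketch below frames phrase reordering as three-way orientation classification with a simple multiclass perceptron. The feature indices and training examples are hypothetical.

```python
# Stand-in sketch: phrase reordering as 3-way classification over orientation
# classes, trained with a multiclass perceptron (NOT the paper's SVM solver).
ORIENTATIONS = ["monotone", "swap", "discontinuous"]

def train(examples, n_features, epochs=5):
    """examples: list of (sparse_feature_id_list, gold_orientation_index)."""
    w = [[0.0] * n_features for _ in ORIENTATIONS]
    for _ in range(epochs):
        for feats, gold in examples:
            scores = [sum(w[k][f] for f in feats) for k in range(len(ORIENTATIONS))]
            pred = max(range(len(scores)), key=scores.__getitem__)
            if pred != gold:  # perceptron update on mistakes only
                for f in feats:
                    w[gold][f] += 1.0
                    w[pred][f] -= 1.0
    return w

def predict(w, feats):
    scores = [sum(wk[f] for f in feats) for wk in w]
    return ORIENTATIONS[max(range(len(scores)), key=scores.__getitem__)]

# Hypothetical sparse examples: feature ids -> orientation class index.
examples = [([0, 1], 0), ([2, 3], 1), ([1, 4], 2)]
w = train(examples, n_features=5)
```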


2015 ◽  
Vol 23 (1) ◽  
pp. 3-30 ◽  
Author(s):  
YVETTE GRAHAM ◽  
TIMOTHY BALDWIN ◽  
ALISTAIR MOFFAT ◽  
JUSTIN ZOBEL

Abstract Crowd-sourced assessments of machine translation quality allow evaluations to be carried out cheaply and on a large scale. It is essential, however, that the crowd's work be filtered to avoid contamination of results through the inclusion of false assessments. One method is to filter via agreement with experts, but even amongst experts agreement levels may not be high. In this paper, we present a new methodology for crowd-sourcing human assessments of translation quality, which allows individual workers to develop their own individual assessment strategy. Agreement with experts is no longer required, and a worker is deemed reliable if they are consistent relative to their own previous work. Individual translations are assessed in isolation from all others in the form of direct estimates of translation quality. This allows more meaningful statistics to be computed for systems and enables significance to be determined on smaller sets of assessments. We demonstrate the methodology's feasibility in large-scale human evaluation through replication of the human evaluation component of the Workshop on Statistical Machine Translation (WMT) shared translation task for two language pairs, Spanish-to-English and English-to-Spanish. Results for measurement based solely on crowd-sourced assessments show system rankings in line with those of the original evaluation. Comparison of results produced by the relative preference approach and the direct estimate method described here demonstrates that the direct estimate method has a substantially increased ability to identify significant differences between translation systems.
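The self-consistency criterion can be sketched as follows: a worker repeats some assessments, and is deemed reliable if the repeat scores correlate with the originals rather than with expert judgments. The scores and the 0.8 cutoff below are hypothetical.

```python
# Sketch: filter crowd workers by self-consistency on repeated items.
# All scores and the threshold are hypothetical.
def pearson(xs, ys):
    """Pearson correlation between two equal-length score lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs) ** 0.5
    vy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (vx * vy) if vx and vy else 0.0

# A worker's direct quality estimates (0-100) on items seen twice.
first  = [80, 65, 90, 40, 70]
repeat = [78, 60, 88, 45, 72]

keep_worker = pearson(first, repeat) > 0.8  # consistent with own past work
```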


Complexity ◽  
2020 ◽  
Vol 2020 ◽  
pp. 1-11
Author(s):  
Rui Wang

Relying on large-scale parallel corpora, neural machine translation has achieved great success for certain language pairs. However, the acquisition of high-quality parallel corpora is one of the main difficulties in machine translation research. To address this problem, this paper proposes an unsupervised domain-adaptive neural machine translation method that can be trained using only two unrelated monolingual corpora and still obtain good translation results. The method first measures the matching degree of translation rules by adding relevant subject information to the rules and dynamically calculating the similarity between each translation rule and the document to be translated during decoding. Second, through the joint training of multiple tasks, the source language can learn useful semantic and structural information from the monolingual corpus of a third language, not parallel to the current two languages, during translation into the target language. Experimental results show that this method obtains better results than traditional statistical machine translation.
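The rule-to-document matching step can be illustrated with a cosine similarity over topic vectors; the topic distributions below are hypothetical stand-ins for whatever subject representation the system attaches to its rules.

```python
# Sketch: score a translation rule against the current document by the
# cosine similarity of their topic distributions. Vectors are hypothetical.
import math

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0

doc_topics  = [0.7, 0.2, 0.1]  # document being translated
rule_topics = [0.6, 0.3, 0.1]  # on-topic rule: scored high during decoding
other_rule  = [0.1, 0.1, 0.8]  # off-topic rule: scored low
```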


2012 ◽  
Vol 279 (1742) ◽  
pp. 3556-3564 ◽  
Author(s):  
Nichola J. Raihani ◽  
Redouan Bshary

People often consider how their behaviour will be viewed by others, and may cooperate to avoid gaining a bad reputation. Sensitivity to reputation may be elicited by subtle social cues of being watched: previous studies have shown that people behave more cooperatively when they see images of eyes rather than control images. Here, we tested whether eye images enhance cooperation in a dictator game, using the online labour market Amazon Mechanical Turk (AMT). In contrast to our predictions and the results of most previous studies, dictators gave away more money when they saw images of flowers rather than eye images. Donations in response to eye images were not significantly different to donations under control treatments. Dictator donations varied significantly across cultures but there was no systematic variation in responses to different image types across cultures. Unlike most previous studies, players interacting via AMT may feel truly anonymous when making decisions and, as such, may not respond to subtle social cues of being watched. Nevertheless, dictators gave away similar amounts as in previous studies, so anonymity did not erase helpfulness. We suggest that eye images might only promote cooperative behaviour in relatively public settings and that people may ignore these cues when they know their behaviour is truly anonymous.

