Entropy Rate Estimation for English via a Large Cognitive Experiment Using Mechanical Turk

Entropy ◽  
2019 ◽  
Vol 21 (12) ◽  
pp. 1201 ◽  
Author(s):  
Geng Ren ◽  
Shuntaro Takahashi ◽  
Kumiko Tanaka-Ishii

The entropy rate h of a natural language quantifies the complexity underlying the language. While recent studies have used computational approaches to estimate this rate, their results rely fundamentally on the performance of the language model used for prediction. On the other hand, in 1951, Shannon conducted a cognitive experiment to estimate the rate without the use of any such artifact. Shannon’s experiment, however, used only one subject, bringing into question the statistical validity of his value of h = 1.3 bits per character for the English language entropy rate. In this study, we conducted Shannon’s experiment on a much larger scale to reevaluate the entropy rate h via Amazon’s Mechanical Turk, a crowd-sourcing service. The online subjects recruited through Mechanical Turk were each asked to guess the succeeding character after being given the preceding characters until obtaining the correct answer. We collected 172,954 character predictions and analyzed these predictions with a bootstrap technique. The analysis suggests that a large number of character predictions per context length, perhaps as many as 10³, would be necessary to obtain a convergent estimate of the entropy rate, and if fewer predictions are used, the resulting h value may be underestimated. Our final entropy estimate was h ≈ 1.22 bits per character.
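The estimation procedure the abstract describes can be sketched in a few lines: record how many guesses each subject needed before hitting the correct character, take the entropy of that guess-count distribution (Shannon's classic upper bound on h), and bootstrap-resample the predictions to see how the estimate behaves. This is a minimal illustration under those assumptions, not the authors' code; the function names and the toy data are hypothetical.

```python
import math
import random
from collections import Counter

def entropy_upper_bound(guess_counts):
    """Shannon's upper bound on the entropy rate: treat the number of
    guesses a subject needed as a symbol and compute the entropy of its
    empirical distribution, in bits per character.

    guess_counts: list of ints; each entry is how many guesses were made
    before the correct next character was found (1 = right first try)."""
    n = len(guess_counts)
    freqs = Counter(guess_counts)
    return -sum((c / n) * math.log2(c / n) for c in freqs.values())

def bootstrap_median_estimate(guess_counts, n_resamples=1000, seed=0):
    """Bootstrap the estimator: resample the predictions with
    replacement and return the median of the resampled estimates,
    mirroring the abstract's use of a bootstrap over 172,954
    character predictions."""
    rng = random.Random(seed)
    n = len(guess_counts)
    estimates = sorted(
        entropy_upper_bound([rng.choice(guess_counts) for _ in range(n)])
        for _ in range(n_resamples)
    )
    return estimates[len(estimates) // 2]

# Toy data: four predictions, half guessed on the first try, half on the
# second, giving a uniform two-outcome distribution (entropy = 1 bit).
toy = [1, 1, 2, 2]
print(entropy_upper_bound(toy))            # 1.0 bit per character
print(bootstrap_median_estimate(toy, 50))  # bootstrap median, same order
```

The abstract's convergence point follows directly from this setup: with few predictions per context length, rare high-guess-count outcomes are undersampled, which shrinks the empirical entropy and biases h downward.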

PLoS ONE ◽  
2021 ◽  
Vol 16 (9) ◽  
pp. e0256632 ◽  
Author(s):  
Sumeyye Aslan ◽  
Greta Fastrich ◽  
Ed Donnellan ◽  
Daniel J. W. Jones ◽  
Kou Murayama

The purpose of this study was to critically examine how people perceive the definitions, differences and similarities of interest and curiosity, and to address the subjective boundaries between interest and curiosity. We used a qualitative research approach given the research questions and the goal of developing an in-depth understanding of the meanings people assign to interest and curiosity. We used data from a sample of 126 U.S. adults (48.5% male) recruited through Amazon’s Mechanical Turk (M_age = 40.7, SD_age = 11.7). Semi-structured questions were used and thematic analysis was applied. The results showed two themes relating to differences between curiosity and interest: active/stable feelings and certainty/uncertainty. Curiosity was defined as an active feeling (more specifically a first, fleeting feeling) and a child-like emotion that often involves a strong urge to think actively and differently, whereas interest was described as a stable and sustainable feeling, characterized by involved engagement and personal preferences (e.g., hobbies). In addition, participants related curiosity to uncertainty, e.g., trying new things and risk-taking behaviour. Certainty, on the other hand, was deemed an important component in the definition of interest, one that helps individuals acquire deep knowledge. Both curiosity and interest were reported to be innate and positive feelings that support motivation and knowledge-seeking during the learning process.


2020 ◽  
Vol 84 (1) ◽  
pp. 49-73 ◽  
Author(s):  
Kevin Munger ◽  
Mario Luca ◽  
Jonathan Nagler ◽  
Joshua Tucker

Abstract “Clickbait” headlines designed to entice people to click are frequently used by both legitimate and less-than-legitimate news sources. Contemporary clickbait headlines tend to use emotional partisan appeals, raising concerns about their impact on consumers of online news. This article reports the results of a pair of experiments with two different subject pools: one conducted using Facebook ads that explicitly target people with a high preference for clickbait, the other using a sample recruited from Amazon’s Mechanical Turk. We estimate subjects’ individual-level preference for clickbait, and randomly assign sets of subjects to read either clickbait or traditional headlines. Findings show that older people and non-Democrats have a higher “preference for clickbait,” but reading clickbait headlines does not drive affective polarization, information retention, or trust in media.


2016 ◽  
Vol 1 (16) ◽  
pp. 15-27 ◽  
Author(s):  
Henriette W. Langdon ◽  
Terry Irvine Saenz

The number of English Language Learners (ELL) is increasing in all regions of the United States. Although the majority (71%) speak Spanish as their first language, the other 29% may speak one of as many as 100 or more different languages. In spite of an increasing number of speech-language pathologists (SLPs) who can provide bilingual services, the likelihood of a match between a given student's primary language and an SLP's language is rather minimal. The second-best option is to work with a trained language interpreter in the student's language. Very frequently, however, the available interpreter is bilingual but not trained for the job.


2017 ◽  
Vol 30 (1) ◽  
pp. 111-122 ◽  
Author(s):  
Steve Buchheit ◽  
Marcus M. Doxey ◽  
Troy Pollard ◽  
Shane R. Stinson

ABSTRACT Multiple social science researchers claim that online data collection, mainly via Amazon's Mechanical Turk (MTurk), has revolutionized the behavioral sciences (Gureckis et al. 2016; Litman, Robinson, and Abberbock 2017). While MTurk-based research has grown exponentially in recent years (Chandler and Shapiro 2016), reasonable concerns have been raised about online research participants' ability to proxy for traditional research participants (Chandler, Mueller, and Paolacci 2014). This paper reviews recent MTurk research and provides further guidance for recruiting samples of MTurk participants from populations of interest to behavioral accounting researchers. First, we provide guidance on the logistics of using MTurk and discuss the potential benefits offered by TurkPrime, a third-party service provider. Second, we discuss ways to overcome challenges related to targeted participant recruiting in an online environment. Finally, we offer suggestions for disclosures that authors may provide about their efforts to attract participants and analyze responses.


2021 ◽  
pp. 003435522110142
Author(s):  
Deniz Aydemir-Döke ◽  
James T. Herbert

Microaggressions are daily insults to minority individuals such as people with disabilities (PWD) that communicate messages of exclusion, inferiority, and abnormality. In this study, we developed a new scale, the Ableist Microaggressions Impact Questionnaire (AMIQ), which assesses ableist microaggression experiences of PWD. Data from 245 PWD were collected using Amazon’s Mechanical Turk (MTurk) platform. An exploratory factor analysis of the 25-item AMIQ revealed a three-factor structure with internal consistency reliability ranging between .87 and .92. As a more economical and psychometrically sound instrument assessing microaggression impact as it pertains to disability, the AMIQ offers promise for rehabilitation counselor research and practice.


2021 ◽  
Vol 14 (1) ◽  
Author(s):  
Jon Agley ◽  
Yunyu Xiao ◽  
Esi E. Thompson ◽  
Lilian Golzarri-Arroyo

Abstract Objective This study describes the iterative process of selecting an infographic for use in a large, randomized trial related to trust in science, COVID-19 misinformation, and behavioral intentions for non-pharmaceutical preventive behaviors. Five separate concepts were developed based on underlying subcomponents of ‘trust in science and scientists’ and were turned into infographics by media experts and digital artists. Study participants (n = 100) were recruited from Amazon’s Mechanical Turk and randomized to five different arms. Each arm viewed a different infographic and provided both quantitative (narrative believability scale and trust in science and scientists inventory) and qualitative data to assist the research team in identifying the infographic most likely to be successful in a larger study. Results Data indicated that all infographics were perceived to be believable, with means ranging from 5.27 to 5.97 on a scale from one to seven. No iatrogenic outcomes were observed for within-group changes in trust in science. Given equivocal believability outcomes, and after examining confidence intervals for data on trust in science and then the qualitative responses, we selected infographic 3, which addressed issues of credibility and consensus by illustrating changing narratives on butter and margarine, as the best candidate for use in the full study.


English Today ◽  
2002 ◽  
Vol 18 (2) ◽  
pp. 33-38 ◽  
Author(s):  
Chia Boh Peng ◽  
Adam Brown

A consideration of whether EE could conceivably be an alternative to RP as a teaching model. Since David Rosewarne first coined the term in 1984, much has been written about Estuary English (EE). The definition usually given of Estuary English is that if we imagine a continuum with Received Pronunciation (RP) at one end and Cockney (an urban accent of London) at the other, then Estuary English lies in the middle. This definition is restated by Wells (1998-9) as ‘Standard English spoken with the accent of the southeast of England. This highlights two chief points: that it is standard (unlike Cockney) and that it is localized in the southeast (unlike RP)’. The book English Language for Beginners (Lowe & Graham 1998) contains on p. 156 a diagram giving the actress Joanna Lumley as an example of RP, the boxer Frank Bruno for Cockney, and the comedian and writer Ben Elton for EE. This is ironic, in that Ben Elton himself denies that he is a speaker of EE (John Wells, personal communication).

