frequency word
Recently Published Documents


Total documents: 55 (five years: 10)

H-index: 8 (five years: 0)

2021 ◽  
Author(s):  
Yen Dang

Understanding academic spoken English is challenging for second language (L2) learners at English-medium universities. A lack of vocabulary is a major reason for this difficulty. To help these learners overcome this challenge, it is important to examine the nature of vocabulary in academic spoken English. This thesis presents three linked studies which were conducted to address this need. Study 1 examined the lexical coverage in nine spoken and nine written corpora of four well-known general high-frequency word lists: West’s (1953) General Service List (GSL), Nation’s (2006) BNC2000, Nation’s (2012) BNC/COCA2000, and Brezina and Gablasova’s (2015) New-GSL. Study 2 further compared the BNC/COCA2000 and the New-GSL, which had the highest coverage in Study 1. It involved 25 English first language (L1) teachers, 26 Vietnamese L1 teachers, 27 various L1 teachers, and 275 Vietnamese English as a Foreign Language learners. The teachers completed 10 surveys in which they rated the usefulness of 973 non-overlapping items between the BNC/COCA2000 and the New-GSL for their learners on a five-point Likert scale. The learners took the Vocabulary Levels Test (Nation, 1983, 1990; Schmitt, Schmitt, & Clapham, 2001), and 15 Yes/No tests which measured their knowledge of the 973 words. Study 3 involved compiling two academic spoken corpora, one academic written corpus, and one non-academic spoken corpus. Each contained approximately 13 million running words. The academic spoken corpora contained four equally-sized sub-corpora. From the first academic spoken corpus, 1,741 word families were selected for the Academic Spoken Word List (ASWL). The coverage of the ASWL and the BNC/COCA2000 in the four corpora and the potential coverage of the ASWL for learners of different vocabulary levels were determined. Six main findings were drawn from these studies. First, in the first academic spoken corpus, the ASWL and its levels had slightly higher coverage in certain disciplinary sub-corpora than in the others. Yet, the list provided around 90% coverage of each sub-corpus. It helps learners to achieve 92%-96% coverage of academic speech depending on their levels. Second, the BNC/COCA2000 is the most suitable general high-frequency word list for L2 learners from the perspectives of corpus linguistics, teachers, and learners. It provided higher coverage than the GSL and the BNC2000, and had more words known by learners and perceived as being useful by teachers than the New-GSL. Third, general high-frequency words, especially the most frequent 1,000 words, provided much higher coverage in spoken corpora than written corpora in both academic and non-academic discourse. Fourth, despite the importance of general high-frequency words, a reasonable proportion of the learners had insufficient knowledge of these words, which highlights the importance of a word list which is adaptable to learners’ proficiency like the ASWL. Fifth, lexical coverage had significant but small correlations with teacher perception of word usefulness and learner vocabulary knowledge. Sixth, the Vietnamese L1 teachers had the highest correlation between the teacher ratings of word usefulness and the learner vocabulary knowledge. Next came the various L1 teachers, and then the English L1 teachers. This thesis also provides theoretical, pedagogical, and methodological implications of these findings so that L2 learners can gain better support in their vocabulary development and achieve better comprehension of academic spoken English.
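
Lexical coverage, as used in the abstract above, is the share of running words in a corpus that belong to a given word list. The following is a minimal sketch of that calculation, not the thesis's procedure: the function name `lexical_coverage`, the sample sentence, and the toy list are all hypothetical, and the sketch matches plain word forms, whereas the thesis counts word families.

```python
import re

def lexical_coverage(text, word_list):
    """Return the share of running words (tokens) in `text` found in `word_list`."""
    # Simplified tokenisation (an assumption): lowercase alphabetic strings.
    tokens = re.findall(r"[a-z']+", text.lower())
    if not tokens:
        return 0.0
    known = {w.lower() for w in word_list}
    covered = sum(1 for t in tokens if t in known)
    return covered / len(tokens)

# Hypothetical sample data, for illustration only.
sample_text = "The lecture covers the structure of the cell and the role of the membrane"
toy_list = ["the", "of", "and", "role", "structure"]

coverage = lexical_coverage(sample_text, toy_list)
print(f"{coverage:.1%}")
```

Here 10 of the 14 running words are on the toy list. A real study would also need lemmatisation or family grouping before counting.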


2021 ◽  
Author(s):  
Mark Koranda ◽  
Martin Zettersten ◽  
Maryellen MacDonald

While many implicit decisions are the result of a trade-off, trade-offs in word use, such as whether a producer meant to convey a message more aligned with kitten despite saying a more accessible word like cat, are difficult to measure. To test the trade-off between message alignment and accessibility, we designed an artificial lexicon where word meanings corresponded to angles on a compass. In a novel language communication game, participants trained on some words more than others (high- vs low-frequency), and then earned points by producing words, often requiring an implicit decision between a high- vs low-frequency word. A trade-off was observed across four experiments, such that high-frequency words were produced even when less aligned with messages. Since high-frequency words are more accessible, these results suggest that implicit decisions between words are impacted by accessibility. Of all the times that people have said cat, many times they likely meant kitten.


SAGE Open ◽  
2021 ◽  
Vol 11 (3) ◽  
pp. 215824402110361
Author(s):  
Shang-Yu Wu ◽  
Shanju Lin ◽  
Rei-Jane Huang ◽  
I-Fang Tsai

The purpose of this study was to provide high frequency word lists for Mandarin-speaking children between 3 and 6 years of age and to explore the differences between each part of speech (POS) category among different age groups. Participants were 209 typically developing native Mandarin speakers aged between 3 and 6 years, born in Taiwan, and recruited from Mandarin-language preschools in Taipei, New Taipei City, and Miaoli. Language samples were collected through conversations, free play, and story retelling. The researchers then transcribed the samples, segmented the utterances and words, and tagged each word with its POS. The frequencies of word occurrences were then analyzed and ranked to generate a high frequency word list. The mean frequency of each POS category was calculated to identify significant differences between age groups. The results yielded high frequency word lists with the corresponding POS tags. Significant differences were found in 10 of the 11 POS categories among age groups. The results of this study present preliminary information concerning high frequency words produced by Mandarin-speaking children aged between 3 and 6 years and the development of their use of each POS category.
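
The counting and ranking step described above can be sketched as follows. This is an illustration under stated assumptions rather than the authors' pipeline: the input is taken to be already segmented and POS-tagged, the tag labels and the sample tokens are invented, and the function names are hypothetical.

```python
from collections import Counter

def ranked_frequency_list(tagged_tokens, top_n):
    """Rank (word, POS) pairs by how often they occur, most frequent first."""
    return Counter(tagged_tokens).most_common(top_n)

def tokens_per_pos(tagged_tokens):
    """Count how many running words fall into each POS category."""
    return Counter(pos for _, pos in tagged_tokens)

# Hypothetical, already segmented and tagged sample (not data from the study).
tagged_sample = [
    ("我", "pronoun"), ("要", "verb"), ("球", "noun"),
    ("我", "pronoun"), ("看", "verb"), ("球", "noun"),
    ("我", "pronoun"),
]

top_two = ranked_frequency_list(tagged_sample, top_n=2)
pos_counts = tokens_per_pos(tagged_sample)
print(top_two)
print(dict(pos_counts))
```

Comparing age groups would then mean computing such per-category counts for each child and testing the group means, which this sketch does not attempt.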


2021 ◽  
Author(s):  
Sarah Ostapchuk

This study analyses how popular communication mediums over the past century have changed the form and content of poetry. A periodical and small magazine published in 1912 are assessed and compared, as well as an anthology and several poems from Instagram published in 2014. All poems are also briefly compared to get an understanding of change over time. Medium affordances are considered, especially with respect to multimodal capacities. By assessing vocabulary density, word frequency, word distinctiveness, and visual formatting, characteristics of poetry from specific mediums arise, leading to a conclusion that mediums have an effect on the evolution of poetry.
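
As a rough illustration of two of the measures named above, the sketch below computes word frequency and vocabulary density for a short text. It assumes that vocabulary density means the ratio of distinct words to total words (the type-token ratio); the abstract does not define the measure, and the sample line and function name are invented.

```python
import re
from collections import Counter

def vocabulary_density(text):
    """Return distinct words divided by total words (type-token ratio)."""
    tokens = re.findall(r"[a-z']+", text.lower())
    if not tokens:
        return 0.0
    return len(set(tokens)) / len(tokens)

# Invented sample line, for illustration only.
line = "the rain the rain falls on the quiet town"

density = vocabulary_density(line)
word_counts = Counter(re.findall(r"[a-z']+", line.lower()))
print(round(density, 3))
print(word_counts.most_common(1))
```

The ratio is sensitive to text length, so comparisons between a single Instagram poem and a whole anthology would need length-matched samples.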


2021 ◽  
pp. 136700692110008
Author(s):  
Allie Patterson

Aims and Objectives: Embodiment is a major paradigm of first language (L1) research but has not yet been widely adopted in second language (L2) research. The main objective of this research was to find evidence for the effects of sensorimotor embodiment on L2 listening functor comprehension rates. Research Hypothesis: Frequency, word length, and Minkowski3 sensorimotor norms are significantly predictive of functor comprehension probability in an L2 listening task. Methodology: 129 Japanese participants were administered a paused transcription test that contained twelve target phrases. Data and analysis: Transcription of functors was the dependent variable. The independent variables were frequency, word length, and Minkowski3 sensorimotor ratings. These variables were analyzed with logit mixed-effects regressions. Findings/conclusions: Greater frequency, longer word length, and higher Minkowski3 ratings were found to facilitate comprehension and significantly increase the probability that a functor was transcribed. Frequency rates derived from spontaneous L1 oration and conversations were found to be significant, whereas frequency derived from written texts was not significant despite being from a much larger corpus. Originality: No L2 study has used Minkowski3 sensorimotor ratings to predict L2 performance. Minkowski3 ratings quantify the relationship between language and the body. Few researchers have yet incorporated embodiment theories into models of L2 comprehension. Implications: Embodiment theories complement usage-based approaches and should be incorporated into existing L2 theories. Researchers should be aware of textual differences between corpora and choose corpora appropriate for their analyses.


2021 ◽  
Vol 5 (3) ◽  
Author(s):  
Hieu Manh Do

This study identifies the most frequent words in TED talks in the field of education and compares the language used by native speakers (NS) and non-native speakers (NNS). The researcher collected four TED talk transcripts (two from NS and two from NNS) and used AntConc as the main software to generate the frequency word lists. Data collection involved two steps: (1) collecting the four transcripts and (2) listing the top 10, 20, and 100 most frequent words in the NS and NNS corpora separately. The findings show that both groups of speakers use function words more often than content words. However, content words play a pivotal role in making a sentence fully meaningful.


Author(s):  
Bapuji Rao

The chapter addresses the clustering of text documents, based on n input words across m text documents, using graph mining techniques. The author proposes an algorithm that first selects the documents with the “.txt” extension from m documents with various extensions. The n words are then supplied as input for the selected “.txt” documents, and the algorithm clusters the documents according to these n input words. This is done by creating a document-word frequency matrix in memory. The frequency matrix is then converted into an un-oriented document-word incidence matrix by replacing all non-zero entries with 1s. Using this incidence matrix, the algorithm creates n clusters of text documents containing from 1 to n of the input words, respectively. Finally, these n clusters are based on individual words as well as on 1 to n words.
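
The steps in this abstract can be sketched as below. This is one possible reading, not the chapter's algorithm: it assumes that cluster k holds the documents containing exactly k of the n input words, it takes documents from an in-memory dictionary instead of a file system, and the function name and sample data are hypothetical.

```python
import re

def cluster_by_word_presence(documents, words):
    """Cluster ".txt" documents by how many of the input words they contain."""
    # Step 1: keep only documents whose name ends in ".txt".
    txt_docs = {name: text for name, text in documents.items()
                if name.endswith(".txt")}
    names = sorted(txt_docs)

    # Step 2: document-word frequency matrix (rows: documents, columns: words).
    freq = []
    for name in names:
        tokens = re.findall(r"\w+", txt_docs[name].lower())
        freq.append([tokens.count(w.lower()) for w in words])

    # Step 3: incidence matrix, replacing every non-zero count with 1.
    incidence = [[1 if count > 0 else 0 for count in row] for row in freq]

    # Step 4 (assumed reading): cluster k holds documents with exactly k words present.
    clusters = {k: [] for k in range(1, len(words) + 1)}
    for name, row in zip(names, incidence):
        present = sum(row)
        if present > 0:
            clusters[present].append(name)
    return freq, incidence, clusters

# Hypothetical sample documents, for illustration only.
docs = {
    "a.txt": "graph mining of text graph",
    "b.txt": "text clustering",
    "c.pdf": "graph text cluster",
    "d.txt": "nothing relevant here",
}
freq, incidence, clusters = cluster_by_word_presence(docs, ["graph", "text", "mining"])
print(freq)
print(incidence)
print(clusters)
```

Documents that contain none of the input words, such as `d.txt` here, are left out of every cluster in this sketch.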


Author(s):  
Ana Luiza Pires de Freitas ◽  
Ana Eliza Pereira Bocorny

Introduction: Abstracts are critical in medical contexts. They contain formulaic building blocks called Lexical Frames (LFs), which are high-frequency word sequences with variable slots that can be formed around collocation nodes. LFs are abundant in written academic discourse and, for this reason, have great importance for the production of abstracts. Extensive research has been conducted on formulaic language, especially on medical genres. Fewer studies, however, have focused on LFs from specialty-specific corpora (e.g., epidemiology) and their relationship with the rhetorical structure of abstracts. Objective: This study aims to fill this gap by describing the structure of epidemiology abstracts, presenting their rhetorical functions, and identifying the LFs that linguistically realize these functions to help researchers write more conventional abstracts. Methods: We put together three corpora of abstracts in the field, published in English in peer-reviewed journals, and combined genre analysis and Corpus Linguistics principles to identify the linguistic realizations of the rhetorical functions in the texts. First, the rhetorical structure was described; then, the LFs were identified and analyzed. Results: 92% of the texts follow a pre-established pattern, whose structure consists of five to nine sections. Eight saliently frequent nodes (study, result, method, conclusion, review, analysis, patients, and findings) around which the LFs are constructed were identified. Conclusion: Even though both the content and function words that make up the LFs show some variation, it is possible to notice that the LFs elicited typify the linguistic realizations of the corresponding sections' rhetorical functions and, thus, are suitable to the observation of a pattern. For that reason, the data obtained in this study were used to inform the creation of a support framework for the writing of specialty-specific medical abstracts.

