Chinese Unknown Word Recognition for PCFG-LA Parsing

The Scientific World Journal ◽

10.1155/2014/959328 ◽

2014 ◽

Vol 2014 ◽

pp. 1-7

Author(s):

Qiuping Huang ◽

Liangye He ◽

Derek F. Wong ◽

Lidia S. Chao

Keyword(s):

Word Recognition ◽

Semisupervised Learning ◽

Unlabeled Data ◽

Emission Probability ◽

Lexical Knowledge ◽

Unknown Word ◽

Empirical Results ◽

Novel Method ◽

Syntax Parsing ◽

Unknown Words

This paper investigates the recognition of unknown words in Chinese parsing and proposes two methods to handle the problem. The first modifies a character-based model: the emission probability of an unknown word is modeled using the first and last characters of the word, which aims to reduce the POS tag ambiguity of unknown words and thereby improve parsing performance. The second is a novel method based on graph-based semisupervised learning (SSL), whose goal is to discover additional lexical knowledge from a large amount of unlabeled data to help syntax parsing. It propagates lexical emission probabilities to unknown words by building similarity graphs over the words of the labeled and unlabeled data, and the derived distributions are incorporated into the parsing process. Empirical results on the Penn Chinese Treebank and the TCT Treebank show that the proposed methods are effective in handling unknown words and improving parsing.
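The graph-based propagation step described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it propagates a per-word POS tag distribution rather than the paper's emission probabilities, and the function name `propagate`, the toy graph, the edge weights, and the tags `NN` and `VV` are all assumptions made for illustration.

```python
from typing import Dict, List

Dist = Dict[str, float]

def propagate(graph: Dict[str, Dict[str, float]],
              seeds: Dict[str, Dist],
              tags: List[str],
              iterations: int = 10) -> Dict[str, Dist]:
    """Spread tag distributions from labeled (seed) words to unlabeled words.

    graph maps each word to its neighbours and similarity weights; every
    neighbour must itself be a key of graph. Seed words stay clamped.
    """
    uniform = {t: 1.0 / len(tags) for t in tags}
    dists = {w: dict(seeds.get(w, uniform)) for w in graph}
    for _ in range(iterations):
        updated = {}
        for word, neighbours in graph.items():
            total = sum(neighbours.values())
            if word in seeds or total == 0:
                updated[word] = dists[word]  # clamped or isolated node
                continue
            # Weighted average of the neighbours' current distributions.
            updated[word] = {
                t: sum(w * dists[n].get(t, 0.0) for n, w in neighbours.items()) / total
                for t in tags
            }
        dists = updated
    return dists

# Hypothetical toy graph: "pear" is the unknown word.
graph = {
    "apple": {"pear": 0.9},
    "run": {"pear": 0.1},
    "pear": {"apple": 0.9, "run": 0.1},
}
seeds = {"apple": {"NN": 1.0}, "run": {"VV": 1.0}}
result = propagate(graph, seeds, ["NN", "VV"])
```

In this toy setting the unknown word inherits most of its probability mass from its most similar labeled neighbour; how such a distribution would be fed into a PCFG-LA parser is not covered by the sketch.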


Inferencing Fake Words’ Meaning by Moroccan EFL Learners

The International Journal of Applied Language Studies and Culture ◽

10.34301/alsc.v3i1.25 ◽

2020 ◽

Vol 3 (1) ◽

pp. 5-10

Author(s):

Fatima Zahrae El Malaki

Keyword(s):

Lexical Knowledge ◽

Advanced Learners ◽

Efl Learners ◽

Unknown Word ◽

Significant Difference ◽

L2 Learners ◽

Unknown Words ◽

The Way

Do Moroccan EFL learners depend on the context to infer the meaning of unknown words occurring in sentences? This study investigates the way intermediate and advanced learners infer the meaning of fake words. To this end, the subjects took a test consisting of 60 multiple-choice items with three options each. For each item, subjects were asked, without using dictionaries, to select an appropriate meaning of the unknown word, an inappropriate meaning, or none of the choices. Chi-square tests were used to determine whether there is a) a statistically significant difference between the three categories and b) a statistically significant difference between intermediate and advanced learners’ inferencing results. The findings demonstrate that the context, along with the lexical knowledge of the L2 learners, plays the most important role in understanding vocabulary.
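As a worked illustration of the kind of test the abstract mentions, the following sketch computes the Pearson chi-square statistic for a contingency table of learner level by answer category. The counts are hypothetical and are not the study's data; the function name `chi_square_statistic` is likewise an assumption.

```python
from typing import List, Tuple

def chi_square_statistic(table: List[List[float]]) -> Tuple[float, int]:
    """Return the Pearson chi-square statistic and degrees of freedom."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under independence of rows and columns.
            expected = row_totals[i] * col_totals[j] / grand_total
            stat += (observed - expected) ** 2 / expected
    dof = (len(table) - 1) * (len(table[0]) - 1)
    return stat, dof

# Hypothetical counts: rows = intermediate, advanced learners;
# columns = appropriate, inappropriate, none of the choices.
counts = [[30, 20, 10],
          [20, 20, 20]]
stat, dof = chi_square_statistic(counts)
```

The statistic would then be compared against the chi-square distribution with `dof` degrees of freedom to decide significance; that lookup is left out of the sketch.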


Exploring the impact of word embeddings for disjoint semisupervised Spanish verb sense disambiguation

INTELIGENCIA ARTIFICIAL ◽

10.4114/intartif.vol21iss61pp67-81 ◽

2018 ◽

Vol 21 (61) ◽

pp. 67-81

Author(s):

Cristian Cardellino ◽

Laura Alonso Alemany

Keyword(s):

Semisupervised Learning ◽

Unlabeled Data ◽

Word Embeddings ◽

Specific Domain ◽

Sense Disambiguation ◽

Learning Technique ◽

The Impact

This work explores the use of word embeddings as features for Spanish verb sense disambiguation (VSD). This type of learning technique is named disjoint semisupervised learning: an unsupervised algorithm (i.e. the word embeddings) is trained on unlabeled data separately as a first step, and then its results are used by a supervised classifier. In this work we primarily focus on two aspects of VSD trained with unsupervised word representations. First, we show how the domain where the word embeddings are trained affects the performance of the supervised task. A specific domain can improve the results if this domain is shared with the domain of the supervised task, even if the word embeddings are trained with smaller corpora. Second, we show that the use of word embeddings can help the model generalize when compared to not using word embeddings. This means embeddings help by decreasing the model's tendency to overfit.
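The two-step structure of disjoint semisupervised learning can be sketched as follows. This is a toy stand-in, not the authors' system: co-occurrence count vectors replace trained word embeddings, a nearest-centroid rule replaces their supervised classifier, and all function names, sentences, and sense labels are hypothetical.

```python
import math
from collections import Counter, defaultdict

def train_embeddings(sentences, window=2):
    """Step 1 (unsupervised): co-occurrence vectors from unlabeled sentences."""
    vectors = defaultdict(Counter)
    for tokens in sentences:
        for i, tok in enumerate(tokens):
            lo, hi = max(0, i - window), min(len(tokens), i + window + 1)
            for j in range(lo, hi):
                if j != i:
                    vectors[tok][tokens[j]] += 1
    return vectors

def featurize(context, vectors):
    """Sum the vectors of the context words into one sparse feature vector."""
    feats = Counter()
    for tok in context:
        feats.update(vectors.get(tok, {}))
    return feats

def cosine(a, b):
    dot = sum(v * b.get(k, 0) for k, v in a.items())
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def train_classifier(labeled, vectors):
    """Step 2 (supervised): one centroid per sense, built on step 1 features."""
    centroids = defaultdict(Counter)
    for context, sense in labeled:
        centroids[sense].update(featurize(context, vectors))
    return centroids

def predict(context, centroids, vectors):
    feats = featurize(context, vectors)
    return max(centroids, key=lambda s: cosine(feats, centroids[s]))

# Hypothetical unlabeled corpus and labeled contexts.
unlabeled = [["bank", "river", "water"], ["bank", "money", "loan"],
             ["shore", "river", "water"], ["credit", "money", "loan"]]
labeled = [(["river"], "geo"), (["money"], "fin")]

vectors = train_embeddings(unlabeled)
centroids = train_classifier(labeled, vectors)
```

The point of the design is that `train_embeddings` never sees a label and `train_classifier` never sees the unlabeled corpus directly, so the embedding domain can be varied independently of the supervised task.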


Transfer Effect of Word Recognition Strategies

Journal of Reading Behavior ◽

10.1080/10862968009547360 ◽

1980 ◽

Vol 12 (2) ◽

pp. 97-103

Author(s):

John D. McNeil ◽

Lisbeth Donant

Keyword(s):

Word Recognition ◽

Fourth Grade ◽

Transfer Effect ◽

Control Group ◽

New Words ◽

Number Of Children ◽

Instructional Control ◽

Data Support ◽

Unknown Words

The present study was an investigation of the transfer effect from training in three word recognition strategies. Results are based upon a test of ability to decode unknown words classified as graphophonic, structural, and contextual. Ninety second, third, and fourth grade children and an equal number of children in a non-instructional control group provided the data. Significant results are discussed in light of the effectiveness of the training in helping pupils decode new words and the specific rather than the general values of each strategy. The data support the efficacy of children learning multiple word recognition strategies for decoding purposes. The study does not, however, treat the relation of word recognition strategies to comprehension of text.


Semisupervised Learning for Seismic Monitoring Applications

Seismological Research Letters ◽

10.1785/0220200195 ◽

2020 ◽

Vol 92 (1) ◽

pp. 388-395

Author(s):

Lisa Linville ◽

Dylan Anderson ◽

Joshua Michalenko ◽

Jennifer Galasso ◽

Timothy Draelos

Keyword(s):

Neural Networks ◽

Deep Neural Networks ◽

Seismic Event ◽

Model Performance ◽

Seismic Monitoring ◽

Semisupervised Learning ◽

Unlabeled Data ◽

Hybrid Technique ◽

Source Type ◽

Adversarial Training

The impressive performance that deep neural networks demonstrate on a range of seismic monitoring tasks depends largely on the availability of event catalogs that have been manually curated over many years or decades. However, the quality, duration, and availability of seismic event catalogs vary significantly across the range of monitoring operations, regions, and objectives. Semisupervised learning (SSL) enables learning from both labeled and unlabeled data and provides a framework to leverage the abundance of unreviewed seismic data for training deep neural networks on a variety of target tasks. We apply two SSL algorithms (mean-teacher and virtual adversarial training) as well as a novel hybrid technique (exponential average adversarial training) to seismic event classification to examine how unlabeled data with SSL can enhance model performance. In general, we find that SSL can perform as well as supervised learning with fewer labels. We also observe in some scenarios that almost half of the benefits of SSL are the result of the meaningful regularization enforced through SSL techniques and may not be attributable to unlabeled data directly. Lastly, the benefits from unlabeled data scale with the difficulty of the predictive task when we evaluate the use of unlabeled data to characterize sources in new geographic regions. In geographic areas where supervised model performance is low, SSL significantly increases the accuracy of source-type classification using unlabeled data.
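The mean-teacher algorithm named in this abstract rests on two ingredients: teacher weights kept as an exponential moving average of the student weights, and a consistency cost between student and teacher predictions on unlabeled inputs. The sketch below shows both on plain Python lists. It is a generic illustration and not the authors' seismic model; the function names, the decay value, and the toy numbers are assumptions.

```python
from typing import List

def ema_update(teacher: List[float], student: List[float],
               decay: float = 0.99) -> List[float]:
    """Move each teacher weight a small step toward the matching student weight."""
    return [decay * t + (1.0 - decay) * s for t, s in zip(teacher, student)]

def consistency_loss(student_probs: List[float],
                     teacher_probs: List[float]) -> float:
    """Mean squared difference between student and teacher predictions.

    No label is needed, so this term can be computed on unlabeled inputs.
    """
    n = len(student_probs)
    return sum((s - t) ** 2 for s, t in zip(student_probs, teacher_probs)) / n

# Toy values: a decay of 0.5 is used only to make the arithmetic easy to follow.
teacher = ema_update([1.0, 0.0], [0.0, 1.0], decay=0.5)
loss = consistency_loss([0.2, 0.8], [0.4, 0.6])
```

In a full training loop the consistency term would be added to the supervised loss on labeled examples, and `ema_update` would be applied after each student optimization step.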


Representations and representational specificity in speech perception and spoken word recognition

The Oxford Handbook of Psycholinguistics ◽

10.1093/oxfordhb/9780198568971.013.0001 ◽

2007 ◽

pp. 2-18 ◽

Cited By ~ 2

Author(s):

David B. Pisoni ◽

Susannah V. Levi

Keyword(s):

Speech Perception ◽

Word Recognition ◽

Spoken Word Recognition ◽

Spoken Word ◽

Lexical Knowledge ◽

New Approaches ◽

Abstract Representations ◽

New Directions ◽

Learning And Adaptation ◽

New Framework

This article examines how new approaches—coupled with previous insights—provide a new framework for questions that deal with the nature of phonological and lexical knowledge and representation, processing of stimulus variability, and perceptual learning and adaptation. First, it outlines the traditional view of speech perception and identifies some problems with assuming such a view, in which only abstract representations exist. The article then discusses some new approaches to speech perception that retain detailed information in the representations. It also considers a view which rejects abstraction altogether, but shows that such a view has difficulty dealing with a range of linguistic phenomena. After providing a brief discussion of some new directions in linguistics that encode both detailed information and abstraction, the article concludes by discussing the coupling of speech perception and spoken word recognition.


Verschillen Tussen Leerlingen ten Aanzien Van Woordverwerving en Leesstrategieen (Differences Between Pupils with Respect to Vocabulary Acquisition and Reading Strategies)

Toegepaste Taalwetenschap in Artikelen ◽

10.1075/ttwia.37.06par ◽

1990 ◽

Vol 37 ◽

pp. 51-58

Author(s):

Carolien Schouten-van Parreren

Keyword(s):

Reading Strategies ◽

Mother Tongue ◽

Vocabulary Learning ◽

Target Language ◽

Sentence Structure ◽

Unknown Word ◽

Word Forms ◽

Unknown Words ◽

The Individual ◽

Intensive Reading

Within the larger framework of a project on Mixed Ability Teaching, a qualitative experiment was carried out on the individual differences between pupils of very different ability ranges when learning French. This experiment was meant to gain insight into the nature of the differences concerning vocabulary learning and reading strategies. Sixty-nine pupils (12-15 years old) of very different ability ranges (but being educated together) were presented with a variety of vocabulary learning and reading tasks. They worked individually or in pairs and were requested to think aloud. The following tasks were used: 1) while reading a story, guessing the meaning of unknown words from the context, 2) after having read a story, memorizing the meaning of unknown words by means of vocabulary cards, 3) intensive reading of a relatively difficult illustrated story, 4) recalling the meaning of new words incidentally acquired (or not) while reading a story, 5) doing an exercise involving different reading strategies. The analysis of the protocol records focused on the causes of the differences between weak and strong pupils. The differences which were found could be related to two relevant general strategies: guessing the meaning of an unknown word from the context and analyzing the word form of an unknown word. The main results were the following: 1) the attention of weak pupils tends to be exclusively drawn by one source of information; weak pupils are not able to integrate information from different sources (advance knowledge, text, word forms, context, illustrations, cues), 2) weak pupils take no account whatsoever of the sentence structure, 3) weak pupils have difficulties in generalizing from a new word to an already known word (in the target language or in the mother tongue). The article concludes with some implications for foreign language teaching.


Chinese Unknown Word Recognition Based on Functional Applications of Type Theory

2008 Second International Symposium on Intelligent Information Technology Application ◽

10.1109/iita.2008.378 ◽

2008 ◽

Cited By ~ 1

Author(s):

Dongping Gao ◽

Zhendong Niu ◽

Lening Lv ◽

Peng Jiang ◽

Xiao Qin ◽

...

Keyword(s):

Word Recognition ◽

Type Theory ◽

Unknown Word ◽

Functional Applications


An Improved Unknown Word Recognition Model based on Multi-Knowledge Source Method

Sixth International Conference on Intelligent Systems Design and Applications ◽

10.1109/isda.2006.253719 ◽

2006 ◽

Cited By ~ 2

Author(s):

Wei Jiang ◽

Yi Guan ◽

Xiao-long Wang

Keyword(s):

Word Recognition ◽

Knowledge Source ◽

Unknown Word ◽

Recognition Model ◽

Model Based ◽

Source Method


Improving Korean verb–verb morphological disambiguation using lexical knowledge from unambiguous unlabeled data and selective web counts

Pattern Recognition Letters ◽

10.1016/j.patrec.2011.09.003 ◽

2012 ◽

Vol 33 (1) ◽

pp. 62-70

Author(s):

Seonho Kim ◽

Juntae Yoon ◽

Jungyun Seo ◽

Seog Park

Keyword(s):

Unlabeled Data ◽

Lexical Knowledge


Lexical knowledge in word recognition: Word length and word frequency in naming and lexical decision tasks

Journal of Memory and Language ◽

10.1016/0749-596x(85)90015-4 ◽

1985 ◽

Vol 24 (1) ◽

pp. 46-58 ◽

Cited By ~ 89

Author(s):

Patrick T. W. Hudson ◽

Marijke W. Bergman

Keyword(s):

Word Recognition ◽

Lexical Decision ◽

Word Frequency ◽

Word Length ◽

Lexical Knowledge ◽

Lexical Decision Tasks ◽

Decision Tasks
