Chinese Unknown Word Recognition for PCFG-LA Parsing

2014 ◽  
Vol 2014 ◽  
pp. 1-7
Author(s):  
Qiuping Huang ◽  
Liangye He ◽  
Derek F. Wong ◽  
Lidia S. Chao

This paper investigates the recognition of unknown words in Chinese parsing and proposes two methods to handle the problem. The first modifies a character-based model: the emission probability of an unknown word is modeled using the first and last characters of the word, with the aim of reducing the POS tag ambiguity of unknown words and thereby improving parsing performance. The second is a novel method based on graph-based semisupervised learning (SSL), whose goal is to discover additional lexical knowledge from a large amount of unlabeled data. It propagates lexical emission probabilities to unknown words by building similarity graphs over the words of the labeled and unlabeled data, and the derived distributions are incorporated into the parsing process. Empirical results on the Penn Chinese Treebank and the TCT Treebank show that the proposed methods are effective in handling unknown words and improving parsing.
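The propagation idea described in this abstract can be illustrated with a minimal sketch. The abstract does not specify how the similarity graph is built or which propagation objective is used, so the code below assumes a generic iterative scheme: known words keep their tag distributions fixed, and each unknown word repeatedly takes the similarity-weighted average of its neighbors' distributions. The function name `propagate`, the toy graph, the edge weights, and the tags are all hypothetical.

```python
def propagate(graph, seeds, n_iter=50):
    """Spread tag distributions from seed (known) words to the other nodes.

    graph: dict mapping node -> {neighbor: similarity weight}
    seeds: dict mapping known word -> {tag: probability}
    This is a generic averaging scheme, not the paper's exact algorithm.
    """
    tags = sorted({t for dist in seeds.values() for t in dist})
    uniform = {t: 1.0 / len(tags) for t in tags}
    # Unknown words start from a uniform distribution over the tags.
    dists = {node: dict(seeds.get(node, uniform)) for node in graph}
    for _ in range(n_iter):
        updated = {}
        for node, neighbors in graph.items():
            if node in seeds:
                # Known words are clamped to their labeled distribution.
                updated[node] = dict(seeds[node])
                continue
            total = sum(neighbors.values())
            if total == 0:
                updated[node] = dists[node]
                continue
            # Similarity-weighted average of the neighbors' distributions.
            updated[node] = {
                t: sum(w * dists[nb].get(t, 0.0) for nb, w in neighbors.items()) / total
                for t in tags
            }
        dists = updated
    return dists


# Toy graph (hypothetical): one unknown word linked to two known words.
graph = {
    "known_noun": {"unk": 0.9},
    "known_verb": {"unk": 0.1},
    "unk": {"known_noun": 0.9, "known_verb": 0.1},
}
seeds = {"known_noun": {"NN": 1.0}, "known_verb": {"VV": 1.0}}
result = propagate(graph, seeds)
print(result["unk"])
```

In this toy setup the unknown word ends up weighted mostly toward the tag of its more similar neighbor. In a parser, such a distribution would stand in for the missing lexical emission probabilities.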

Author(s):  
Fatima Zahrae El Malaki

Do Moroccan EFL learners depend on the context to infer the meaning of unknown words occurring in sentences? This study investigates the way intermediate and advanced learners infer the meaning of fake words. To this end, the subjects took a test consisting of 60 items, each with three answer choices. Without using dictionaries, subjects selected an appropriate meaning of the unknown word, an inappropriate meaning, or none of the choices. Chi-square tests were used to determine whether there is a) a statistically significant difference between the three categories and b) a statistically significant difference between intermediate and advanced learners' inferencing results. The findings demonstrate that the context, along with the lexical knowledge of the L2 learners, plays the most important role in understanding vocabulary.
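The second comparison above (proficiency level against response category) corresponds to a chi-square test of independence on a 2 x 3 contingency table. The sketch below computes the statistic by hand. The counts are invented for illustration and are not the study's data; the function name `chi_square_independence` is also hypothetical. The critical value 5.991 is the standard chi-square threshold for 2 degrees of freedom at the 0.05 level.

```python
def chi_square_independence(table):
    """Return (statistic, degrees of freedom) for a contingency table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under independence of rows and columns.
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    dof = (len(table) - 1) * (len(table[0]) - 1)
    return stat, dof


# Hypothetical counts: rows are intermediate and advanced learners,
# columns are appropriate, inappropriate, and "none of the choices".
counts = [
    [30, 20, 10],
    [45, 10, 5],
]
stat, dof = chi_square_independence(counts)

# Critical value for dof = 2 at alpha = 0.05.
CRITICAL_DF2_ALPHA05 = 5.991
significant = stat > CRITICAL_DF2_ALPHA05
print(stat, dof, significant)
```

With these invented counts the statistic exceeds the threshold, so the two groups would be judged to differ. A statistics library would normally also report the p-value.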


2018 ◽  
Vol 21 (61) ◽  
pp. 67
Author(s):  
Cristian Cardellino ◽  
Laura Alonso Alemany

This work explores the use of word embeddings as features for Spanish verb sense disambiguation (VSD). This type of learning technique is named disjoint semisupervised learning: an unsupervised algorithm (i.e. the word embeddings) is trained on unlabeled data separately as a first step, and then its results are used by a supervised classifier. In this work we primarily focus on two aspects of VSD trained with unsupervised word representations. First, we show how the domain where the word embeddings are trained affects the performance of the supervised task. A specific domain can improve the results if this domain is shared with the domain of the supervised task, even if the word embeddings are trained with smaller corpora. Second, we show that the use of word embeddings can help the model generalize when compared to not using word embeddings. This means embeddings help by decreasing the model's tendency to overfit.


1980 ◽  
Vol 12 (2) ◽  
pp. 97-103
Author(s):  
John D. Mcneil ◽  
Lisbeth Donant

The present study was an investigation of the transfer effect from training in three word recognition strategies. Results are based upon a test of ability to decode unknown words classified as graphophonic, structural, and contextual. Ninety second, third, and fourth grade children and an equal number of children in a non-instructional control group provided the data. Significant results are discussed in light of the effectiveness of the training in helping pupils decode new words and the specific rather than the general values of each strategy. The data support the efficacy of children learning multiple word recognition strategies for decoding purposes. The study does not, however, treat the relation of word recognition strategies to comprehension of text.


2020 ◽  
Vol 92 (1) ◽  
pp. 388-395
Author(s):  
Lisa Linville ◽  
Dylan Anderson ◽  
Joshua Michalenko ◽  
Jennifer Galasso ◽  
Timothy Draelos

The impressive performance that deep neural networks demonstrate on a range of seismic monitoring tasks depends largely on the availability of event catalogs that have been manually curated over many years or decades. However, the quality, duration, and availability of seismic event catalogs vary significantly across the range of monitoring operations, regions, and objectives. Semisupervised learning (SSL) enables learning from both labeled and unlabeled data and provides a framework to leverage the abundance of unreviewed seismic data for training deep neural networks on a variety of target tasks. We apply two SSL algorithms (mean-teacher and virtual adversarial training) as well as a novel hybrid technique (exponential average adversarial training) to seismic event classification to examine how unlabeled data with SSL can enhance model performance. In general, we find that SSL can perform as well as supervised learning with fewer labels. We also observe in some scenarios that almost half of the benefits of SSL are the result of the meaningful regularization enforced through SSL techniques and may not be attributable to unlabeled data directly. Lastly, the benefits from unlabeled data scale with the difficulty of the predictive task when we evaluate the use of unlabeled data to characterize sources in new geographic regions. In geographic areas where supervised model performance is low, SSL significantly increases the accuracy of source-type classification using unlabeled data.
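For readers unfamiliar with the mean-teacher algorithm named in this abstract, its two core ingredients can be sketched in a few lines. The teacher's weights are an exponential moving average (EMA) of the student's weights, and a consistency term penalizes disagreement between student and teacher predictions on the same input, which needs no labels. The abstract does not describe the network architecture, hyperparameters, or the hybrid technique, so the flat weight lists, the `alpha` value, and the function names below are illustrative assumptions.

```python
def ema_update(teacher_weights, student_weights, alpha=0.99):
    """Move each teacher weight toward the student weight (EMA step)."""
    return [
        alpha * t + (1.0 - alpha) * s
        for t, s in zip(teacher_weights, student_weights)
    ]


def consistency_loss(student_probs, teacher_probs):
    """Mean squared difference between student and teacher predictions."""
    return sum(
        (s - t) ** 2 for s, t in zip(student_probs, teacher_probs)
    ) / len(student_probs)


# Toy values (hypothetical): two weights and a two-class prediction.
teacher = ema_update([1.0, 0.0], [0.0, 1.0], alpha=0.9)
loss = consistency_loss([0.8, 0.2], [0.6, 0.4])
print(teacher, loss)
```

In a full training loop the consistency term would be added to the supervised loss on labeled examples, and the EMA step would run after each student update.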


Author(s):  
David B. Pisoni ◽  
Susannah V. Levi

This article examines how new approaches—coupled with previous insights—provide a new framework for questions that deal with the nature of phonological and lexical knowledge and representation, processing of stimulus variability, and perceptual learning and adaptation. First, it outlines the traditional view of speech perception and identifies some problems with assuming such a view, in which only abstract representations exist. The article then discusses some new approaches to speech perception that retain detailed information in the representations. It also considers a view which rejects abstraction altogether, but shows that such a view has difficulty dealing with a range of linguistic phenomena. After providing a brief discussion of some new directions in linguistics that encode both detailed information and abstraction, the article concludes by discussing the coupling of speech perception and spoken word recognition.


1990 ◽  
Vol 37 ◽  
pp. 51-58
Author(s):  
Carolien Schouten-van Parreren

Within the larger framework of a project on Mixed Ability Teaching, a qualitative experiment was carried out on the individual differences between pupils of very different ability ranges when learning French. The experiment was meant to gain insight into the nature of the differences in vocabulary learning and reading strategies. 69 pupils (12-15 years old) of very different ability ranges (but being educated together) were presented with a variety of vocabulary learning and reading tasks. They worked individually or in pairs and were requested to think aloud. The following tasks were used: 1) while reading a story, guessing the meaning of unknown words from the context, 2) after having read a story, memorizing the meaning of unknown words by means of vocabulary cards, 3) intensive reading of a relatively difficult illustrated story, 4) recalling the meaning of new words incidentally acquired (or not) while reading a story, 5) doing an exercise involving different reading strategies. The analysis of the protocol records focused on the causes of the differences between weak and strong pupils. The differences which were found could be related to two relevant general strategies: guessing the meaning of an unknown word from the context and analyzing the word form of an unknown word. The main results were the following: 1) the attention of weak pupils tends to be exclusively drawn by one source of information; weak pupils are not able to integrate information from different sources (advance knowledge, text, word forms, context, illustrations, cues), 2) weak pupils take no account whatsoever of the sentence structure, 3) weak pupils have difficulties in generalizing from a new word to an already known word (in the target language or in the mother tongue). The article concludes with some implications for foreign language teaching.

