Test Languages: Latest Research Papers

How well can intelligibility of closely related languages in Europe be predicted by linguistic and non-linguistic variables?

Linguistic Approaches to Bilingualism ◽ 10.1075/lab.17084.goo ◽ 2019 ◽ Vol 10 (3) ◽ pp. 351-379 ◽ Cited By ~ 1

Author(s): Charlotte Gooskens ◽ Vincent J. van Heuven

Keyword(s): Regression Analysis ◽ Stepwise Regression ◽ Important Variable ◽ Stepwise Regression Analysis ◽ Substantial Part ◽ Linguistic Variables ◽ Previous Exposure ◽ Test Languages ◽ Cloze Test

We measured mutual intelligibility of 16 closely related spoken languages in Europe. Intelligibility was determined for all 70 language combinations using the same uniform methodology (a cloze test). We analysed the results of 1833 listeners representing the mutual intelligibility between young, educated Europeans from the same 16 countries. Lexical, phonological, orthographic, morphological and syntactic distances were computed as linguistic variables. We also quantified non-linguistic variables (e.g. exposure, attitudes towards the test languages). Using stepwise regression analysis, we assessed the importance of linguistic and non-linguistic predictors for mutual intelligibility in the 70 language pairs. Exposure to the test language was the most important variable, overriding all other variables. Then, limiting the analysis to the prediction of inherent intelligibility, we analysed the results for a subset of listeners with little or no previous exposure to the test language. Linguistic distances, especially lexical distance, now explain a substantial part of the variance.
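As an illustration of the stepwise regression idea mentioned in this abstract, the following is a minimal sketch of forward selection. It is not the authors' analysis: the data are synthetic, the predictor names (`exposure`, `lexical_distance`, `attitude`) and coefficients are made up, and the entry criterion (a minimum gain in R², `min_gain`) is a simplifying assumption in place of the significance tests that statistical packages normally use.

```python
import random

def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x

def r_squared(columns, y):
    """R^2 of a least-squares fit of y on the given columns plus an intercept."""
    n = len(y)
    design = [[1.0] + [col[i] for col in columns] for i in range(n)]
    k = len(design[0])
    xtx = [[sum(row[p] * row[q] for row in design) for q in range(k)] for p in range(k)]
    xty = [sum(row[p] * y[i] for i, row in enumerate(design)) for p in range(k)]
    beta = solve(xtx, xty)
    mean_y = sum(y) / n
    ss_res = sum((y[i] - sum(b * v for b, v in zip(beta, design[i]))) ** 2 for i in range(n))
    ss_tot = sum((v - mean_y) ** 2 for v in y)
    return 1.0 - ss_res / ss_tot

def forward_stepwise(predictors, y, min_gain=0.01):
    """Add, one at a time, the predictor that raises R^2 most; stop when the gain is small."""
    selected, best_r2 = [], 0.0
    remaining = list(predictors)
    while remaining:
        scores = {name: r_squared([predictors[s] for s in selected + [name]], y)
                  for name in remaining}
        name = max(scores, key=scores.get)
        if scores[name] - best_r2 < min_gain:
            break
        selected.append(name)
        remaining.remove(name)
        best_r2 = scores[name]
    return selected, best_r2

# Synthetic data for 70 hypothetical language pairs (not the study's data).
rng = random.Random(0)
n_pairs = 70
exposure = [rng.random() for _ in range(n_pairs)]
lexical_distance = [rng.random() for _ in range(n_pairs)]
attitude = [rng.random() for _ in range(n_pairs)]
score = [60 * e - 20 * d + rng.gauss(0, 2) for e, d in zip(exposure, lexical_distance)]

predictors = {"exposure": exposure, "lexical_distance": lexical_distance, "attitude": attitude}
selected, r2 = forward_stepwise(predictors, score)
print(selected, round(r2, 3))
```

Because the synthetic scores depend most strongly on the made-up `exposure` variable, it enters the model first. This mirrors the kind of ordering the abstract reports, but only by construction.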

Surface Statistics of an Unknown Language Indicate How to Parse It

Transactions of the Association for Computational Linguistics ◽ 10.1162/tacl_a_00248 ◽ 2018 ◽ Vol 6 ◽ pp. 667-685 ◽ Cited By ~ 2

Author(s): Dingquan Wang ◽ Jason Eisner

Keyword(s): Target Language ◽ Dependency Parsing ◽ Grammar Induction ◽ Past Work ◽ Part Of Speech ◽ Percentage Points ◽ Feature Extractor ◽ Test Languages ◽ Surface Statistics ◽ Multiple Languages

We introduce a novel framework for delexicalized dependency parsing in a new language. We show that useful features of the target language can be extracted automatically from an unparsed corpus, which consists only of gold part-of-speech (POS) sequences. Providing these features to our neural parser enables it to parse sequences like those in the corpus. Strikingly, our system has no supervision in the target language. Rather, it is a multilingual system that is trained end-to-end on a variety of other languages, so it learns a feature extractor that works well. We show experimentally across multiple languages: (1) Features computed from the unparsed corpus improve parsing accuracy. (2) Including thousands of synthetic languages in the training yields further improvement. (3) Despite being computed from unparsed corpora, our learned task-specific features beat previous work’s interpretable typological features that require parsed corpora or expert categorization of the language. Our best method improved attachment scores on held-out test languages by an average of 5.6 percentage points over past work that does not inspect the unparsed data (McDonald et al., 2011), and by 20.7 points over past “grammar induction” work that does not use training languages (Naseem et al., 2010).
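To make the "attachment score" comparison concrete, here is a minimal sketch of an unlabeled attachment score: the percentage of tokens whose predicted head matches the gold head. The abstract does not say whether labeled or unlabeled scores are reported, so the unlabeled variant is an assumption. The head indices below are toy values, not data from the paper.

```python
def unlabeled_attachment_score(gold_heads, predicted_heads):
    """Percentage of tokens whose predicted head index equals the gold head index.

    Each argument is a list of sentences; each sentence is a list of head
    indices (0 = root), one per token.
    """
    if len(gold_heads) != len(predicted_heads):
        raise ValueError("different number of sentences")
    total = correct = 0
    for gold, pred in zip(gold_heads, predicted_heads):
        if len(gold) != len(pred):
            raise ValueError("sentence length mismatch")
        total += len(gold)
        correct += sum(g == p for g, p in zip(gold, pred))
    return 100.0 * correct / total if total else 0.0

# Toy example: two sentences, five tokens in total.
gold = [[2, 0, 2], [0, 1]]
baseline_pred = [[2, 0, 1], [2, 1]]   # 3 of 5 heads correct
improved_pred = [[2, 0, 2], [2, 1]]   # 4 of 5 heads correct

baseline = unlabeled_attachment_score(gold, baseline_pred)
improved = unlabeled_attachment_score(gold, improved_pred)
gain_in_points = improved - baseline  # a difference in percentage points
print(baseline, improved, gain_in_points)
```

A gain "in percentage points" is the plain difference between two such scores, as in `gain_in_points`. It is not a relative change.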

Orchestration of Domain Specific Test Languages with a Behavior Driven Development approach

2018 13th Annual Conference on System of Systems Engineering (SoSE) ◽ 10.1109/sysose.2018.8428788 ◽ 2018

Author(s): Robin Bussenot ◽ Herve Leblanc ◽ Christian Percebois

Keyword(s): Specific Test ◽ Domain Specific ◽ Development Approach ◽ Test Languages

Linguistic and extra-linguistic predictors of mutual intelligibility between Germanic languages

Nordic Journal of Linguistics ◽ 10.1017/s0332586517000099 ◽ 2017 ◽ Vol 40 (2) ◽ pp. 123-147 ◽ Cited By ~ 2

Author(s): Charlotte Gooskens ◽ Femke Swarte

Keyword(s): Regression Analysis ◽ Large Scale ◽ Spoken Language ◽ Relative Importance ◽ Linguistic Distance ◽ Language Area ◽ Test Languages ◽ Germanic Languages ◽ First Time

We report on a large-scale investigation of the mutual intelligibility between five Germanic languages: Danish, Dutch, English, German and Swedish. We tested twenty language combinations using the same uniform methodology, making the results commensurable for the first time. We first tested both written and spoken language by means of cloze tests. Next, we calculated linguistic distance at the levels of lexicon, orthography, phonology, morphology and syntax. We also quantified exposure and attitudes towards the test languages. Finally, we carried out a regression analysis to determine the relative importance of these linguistic and extra-linguistic predictors for the mutual intelligibility between Germanic languages. The extra-linguistic predictor exposure was the most significant factor in predicting intelligibility in the Germanic language area. The effect of attitude was very small. Lexical, orthographic and phonetic distances were the most important linguistic predictors of intelligibility.

Language performance of sequential bilinguals on an Irish and English sentence repetition task

Linguistic Approaches to Bilingualism ◽ 10.1075/lab.15026.ant ◽ 2017 ◽ Vol 7 (3-4) ◽ pp. 359-393 ◽ Cited By ~ 4

Author(s): Stanislava Antonijevic ◽ Ruth Durham ◽ Íde Ní Chonghaile

Keyword(s): School Age Children ◽ Educational Setting ◽ Typically Developing ◽ Language Performance ◽ English Sentence ◽ Sentence Repetition ◽ Language Assessments ◽ Test Languages ◽ Bilingual School ◽ Repetition Task

Currently there are no standardized language assessments for English-Irish bilingual school-age children that would test both languages in a comparable way. There are also no standardized language assessments of Irish for this age group. The current study aimed to design comparable language assessments in both languages targeting structures known to be challenging for children with language impairments. A sentence repetition (SRep) task equivalent to the English SRep task (Marinis, Chiat, Armon-Lotem, Piper, & Roy, 2011) was designed for Irish. Twenty-four typically developing, sequential bilingual children immersed in Irish in the educational setting performed better on the English SRep task than on the Irish SRep task. Different patterns were observed in language performance across sentence types, with performance on relative clauses being particularly poor in Irish. Similarly, differences were observed in error patterns, with the highest number of errors of omission in Irish and the highest number of substitution errors in English.

Multilingual Projection for Parsing Truly Low-Resource Languages

Transactions of the Association for Computational Linguistics ◽ 10.1162/tacl_a_00100 ◽ 2016 ◽ Vol 4 ◽ pp. 301-312 ◽ Cited By ~ 12

Author(s): Željko Agić ◽ Anders Johannsen ◽ Barbara Plank ◽ Héctor Martínez Alonso ◽ Natalie Schluter ◽ ...

Keyword(s): Empirical Evaluation ◽ Upper Bounds ◽ Low Resource ◽ Part Of Speech Tagging ◽ Part Of Speech ◽ Novel Approach ◽ Cross Lingual ◽ Test Languages ◽ Speech Tagging ◽ Parallel Texts

We propose a novel approach to cross-lingual part-of-speech tagging and dependency parsing for truly low-resource languages. Our annotation projection-based approach yields tagging and parsing models for over 100 languages. All that is needed are freely available parallel texts, and taggers and parsers for resource-rich languages. The empirical evaluation across 30 test languages shows that our method consistently provides top-level accuracies, close to established upper bounds, and outperforms several competitive baselines.

Test Languages for In-the-Loop Avionics Tests

Journal of Aerospace Information Systems ◽ 10.2514/1.i010151 ◽ 2015 ◽ Vol 12 (4) ◽ pp. 374-391 ◽ Cited By ~ 1

Author(s): Alexandru-Robert Guduvan ◽ Hélène Waeselynck ◽ Virginie Wiels ◽ Guy Durrieu ◽ Yann Fusero ◽ ...

Keyword(s): Test Languages

Abstraction and unified access to test equipments in spacecraft test languages

2012 IEEE International Conference on Oxide Materials for Electronic Engineering (OMEE) ◽ 10.1109/omee.2012.6343641 ◽ 2012

Author(s): Sun Bo ◽ Li Xianjun ◽ Ma Shilong

Keyword(s): Test Languages

Bridging test languages to reuse test information: A metamodel for test information

2009 IEEE International Conference on Information Reuse & Integration ◽ 10.1109/iri.2009.5211596 ◽ 2009

Author(s): Shuai Wang ◽ Yindong Ji ◽ Wei Dong ◽ Shiyuan Yang

Keyword(s): Test Information ◽ Test Languages

Entry Generation by Analogy – Encoding New Words for Morphological Lexicons

Northern European Journal of Language Technology ◽ 10.3384/nejlt.2000-1533.09111 ◽ 2009 ◽ Vol 1 ◽ pp. 1-25 ◽ Cited By ~ 5

Author(s): Krister Lindén

Keyword(s): Native Speaker ◽ Combined Model ◽ New Words ◽ Loan Words ◽ Software Applications ◽ Base Form ◽ Finite State ◽ Finite State Transducer ◽ Test Languages ◽ Lexical Data

Language software applications encounter new words, e.g., acronyms, technical terminology, loan words, names or compounds of such words. To add new words to a lexicon, we need to indicate their base form and inflectional paradigm. In this article, we evaluate a combination of corpus-based and lexicon-based methods for assigning the base form and inflectional paradigm to new words in Finnish, Swedish and English finite-state transducer lexicons. The methods have been implemented with the open-source Helsinki Finite-State Technology (Lindén et al., 2009). As an entry generator often produces numerous suggestions, it is important that the best suggestions be among the first few; otherwise it may become more efficient to create the entries by hand. By combining the probabilities calculated from corpus data and from lexical data, we get a more precise combined model. The combined method has 77-81% precision and 89-97% recall, i.e. the first correctly generated entry is on average found as the first or second candidate for the test languages. A further study demonstrated that a native speaker could revise suggestions from the entry generator at a speed of 300-400 entries per hour.
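The remark about where the first correct entry lands in the candidate list can be illustrated with a small ranking evaluation. This is only a sketch: the abstract does not define how precision and recall were computed, so the two measures below (mean rank of the first correct candidate, and the share of words whose correct entry appears within the top k) are assumptions. The entry labels are hypothetical placeholders.

```python
def first_correct_rank(candidates, gold):
    """1-based rank of the gold entry in a ranked candidate list, or None if absent."""
    for rank, candidate in enumerate(candidates, start=1):
        if candidate == gold:
            return rank
    return None

def evaluate(items, k=2):
    """Return (mean rank of first correct candidate, share of items correct within top k).

    `items` is a list of (ranked_candidates, gold_entry) pairs. The mean rank is
    taken only over items whose gold entry was generated at all.
    """
    ranks = [first_correct_rank(cands, gold) for cands, gold in items]
    found = [r for r in ranks if r is not None]
    mean_rank = sum(found) / len(found) if found else None
    within_k = sum(1 for r in found if r <= k) / len(items) if items else 0.0
    return mean_rank, within_k

# Hypothetical generator output for four new words.
items = [
    (["entry_a", "entry_b", "entry_c"], "entry_a"),  # correct at rank 1
    (["entry_d", "entry_e"], "entry_e"),             # correct at rank 2
    (["entry_f", "entry_g", "entry_h"], "entry_h"),  # correct at rank 3
    (["entry_i"], "entry_j"),                        # correct entry never generated
]

mean_rank, within_top2 = evaluate(items, k=2)
print(mean_rank, within_top2)
```

A low mean rank is what keeps manual revision efficient, since the reviewer rarely has to look past the first couple of suggestions.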
