CLICS2: An improved database of cross-linguistic colexifications assembling lexical data with the help of cross-linguistic data formats

Johann-Mattis List; Simon J. Greenhill; Cormac Anderson; Thomas Mayer; Tiago Tresoldi; Robert Forkel

doi:10.1515/lingty-2018-0010

Linguistic Typology ◽

10.1515/lingty-2018-0010 ◽

2018 ◽

Vol 22 (2) ◽

pp. 277-306 ◽

Cited By ~ 3

Author(s):

Johann-Mattis List ◽

Simon J. Greenhill ◽

Cormac Anderson ◽

Thomas Mayer ◽

Tiago Tresoldi ◽

...

Keyword(s):

Data Aggregation ◽

Reliable Data ◽

Computer Assisted ◽

Semantic Change ◽

Current Form ◽

Linguistic Data ◽

Data Formats ◽

Semantic Associations ◽

Novel Approaches ◽

Lexical Data

Abstract: The Database of Cross-Linguistic Colexifications (CLICS) has established a computer-assisted framework for the interactive representation of cross-linguistic colexification patterns. In its current form, it has proven to be a useful tool for various kinds of investigation into cross-linguistic semantic associations, ranging from studies on semantic change and patterns of conceptualization to linguistic paleontology. But CLICS has also been criticized for obvious shortcomings, ranging from the underlying dataset, which still contains many errors, to the limits of cross-linguistic colexification studies in general. Building on recent standardization efforts reflected in the Cross-Linguistic Data Formats initiative (CLDF) and novel approaches for fast, efficient, and reliable data aggregation, we have created a new database for cross-linguistic colexifications, which not only supersedes the original CLICS database in terms of coverage but also offers a much more principled procedure for the creation, curation, and aggregation of datasets. The paper presents the new database and discusses its major features.

Lexibank: A public repository of standardized wordlists with computed phonological and lexical features

10.21203/rs.3.rs-870835/v1 ◽

2021 ◽

Author(s):

Johann-Mattis List ◽

Robert Forkel ◽

Simon J. Greenhill ◽

Christoph Rzymski ◽

Johannes Englisch ◽

...

Keyword(s):

Digital Data ◽

Human Cognition ◽

Computer Assisted ◽

Public Repository ◽

Linguistic Data ◽

Language Varieties ◽

Substantial Growth ◽

The Past ◽

Data Formats ◽

Lexical Data

Abstract: The past decades have seen substantial growth in digital data on the world's languages. At the same time, the demand for cross-linguistic datasets has been increasing, as witnessed by numerous studies devoted to diverse questions on human prehistory, cultural evolution, and human cognition. Unfortunately, the majority of published datasets lack standardization, which makes their comparison difficult. Here, we present the first step to increase the comparability of cross-linguistic lexical data. We have designed workflows for the computer-assisted lifting of datasets to Cross-Linguistic Data Formats, a collection of standards that increase the FAIRness of linguistic data. We test the Lexibank workflow on a collection of 100 lexical datasets from which we derive an aggregated database of wordlists in unified phonetic transcriptions covering more than 2000 language varieties. We illustrate the benefits of our approach by showing how phonological and lexical features can be automatically inferred, complementing and expanding existing cross-linguistic datasets.

A Corpus of Writing, Pronunciation, Reading, and Listening by Learners of English as a Foreign Language

English Language Teaching ◽

10.5539/elt.v9n9p139 ◽

2016 ◽

Vol 9 (9) ◽

pp. 139 ◽

Cited By ~ 1

Author(s):

Katsunori Kotani ◽

Takehiko Yoshimi ◽

Hiroaki Nanjo ◽

Hitoshi Isahara

Keyword(s):

Foreign Language ◽

Case Studies ◽

Teaching Methods ◽

Language Teaching ◽

Reliability And Validity ◽

Computer Assisted ◽

Practical Application ◽

Linguistic Data ◽

Learner Corpus ◽

Learner Corpora

In order to develop effective teaching methods and computer-assisted language teaching systems for learners of English as a foreign language who need to study the basic linguistic competences for writing, pronunciation, reading, and listening, it is necessary to first investigate which vocabulary and grammar they have or have not yet learned. Identifying such vocabulary and grammar requires a learner corpus for analyzing the accuracy and fluency of learners’ linguistic competences. However, it is difficult to use previous learner corpora for this purpose because they have not compiled all the types of linguistic data that we need. Therefore, this study aimed to solve this problem by designing and developing a new learner corpus that compiles linguistic data regarding the accuracy and fluency of the four basic linguistic competences of writing, pronunciation, reading, and listening. The reliability and validity of the learner corpus were partially confirmed, and practical application of the learner corpus is reported here as case studies.

Reliable Data Aggregation Protocol for Wireless Sensor Networks

Risk Engineering - Security in Wireless Sensor Networks ◽

10.1007/978-3-319-21269-2_6 ◽

2015 ◽

pp. 77-84

Author(s):

George S. Oreku ◽

Tamara Pazynyuk

Keyword(s):

Wireless Sensor Networks ◽

Sensor Networks ◽

Data Aggregation ◽

Reliable Data ◽

Wireless Sensor

A Delay-Aware and Reliable Data Aggregation for Cyber-Physical Sensing

Sensors ◽

10.3390/s17020395 ◽

2017 ◽

Vol 17 (2) ◽

pp. 395 ◽

Cited By ~ 6

Author(s):

Jinhuan Zhang ◽

Jun Long ◽

Chengyuan Zhang ◽

Guihu Zhao

Keyword(s):

Data Aggregation ◽

Reliable Data ◽

Physical Sensing

RDA: Reliable Data Aggregation Protocol for WSNs

2006 International Conference on Wireless Communications, Networking and Mobile Computing ◽

10.1109/wicom.2006.252 ◽

2006 ◽

Cited By ~ 1

Author(s):

Hong Luo ◽

Qi Li ◽

Wei Guo

Keyword(s):

Data Aggregation ◽

Reliable Data

Reflex prediction

Diachronica ◽

10.1075/dia.20009.bod ◽

2021 ◽

Author(s):

Timotheus A. Bodt ◽

Johann-Mattis List

Keyword(s):

Sound Change ◽

Computer Assisted ◽

Linguistic Research ◽

Word Forms ◽

Lexical Data ◽

Educational Aspect ◽

Actual Word

Abstract: While analysing lexical data of Western Kho-Bwa languages of the Sino-Tibetan or Trans-Himalayan family with the help of a computer-assisted approach for historical language comparison, we observed gaps in the data where one or more varieties lacked forms for certain concepts. We employed a new workflow, combining manual and automated steps, to predict the most likely phonetic realisations of the missing forms in our data, by making systematic use of the information on sound correspondences in words that were potentially cognate with the missing forms. This procedure yielded a list of hypothetical reflexes of previously identified cognate sets, which we first preregistered as an experiment on the prediction of unattested word forms and then compared with actual word forms elicited during secondary fieldwork. In this study, we first describe the workflow which we used to predict hypothetical reflexes and the process of elicitation of actual word forms during fieldwork. We then present the results of our reflex prediction experiment. Based on this experiment, we identify four general benefits of reflex prediction in historical language comparison. These comprise (1) an increased transparency of linguistic research, (2) an increased efficiency of field and source work, (3) an educational aspect which offers teachers and learners a wide range of linguistic phenomena, including the regularity of sound change, and (4) the possibility of kindling speakers’ interest in their own linguistic heritage.

Secure and Reliable Data Aggregation for Wireless Sensor Networks

Ubiquitous Computing Systems - Lecture Notes in Computer Science ◽

10.1007/978-3-540-76772-5_8 ◽

2007 ◽

pp. 102-109 ◽

Cited By ~ 23

Author(s):

Suat Ozdemir

Keyword(s):

Wireless Sensor Networks ◽

Sensor Networks ◽

Data Aggregation ◽

Reliable Data ◽

Wireless Sensor

Reliable Data Aggregation for Real-Time Queries in Wireless Sensor Systems

Lecture Notes in Computer Science - Network and Parallel Computing ◽

10.1007/978-3-540-30141-7_89 ◽

2004 ◽

pp. 601-610

Author(s):

Kam-Yiu Lam ◽

Henry C. W. Pang ◽

Sang H. Son ◽

BiYu Liang

Keyword(s):

Real Time ◽

Data Aggregation ◽

Reliable Data ◽

Wireless Sensor ◽

Sensor Systems

Energy Efficient Reliable Data Aggregation Technique for Wireless Sensor Networks

2012 International Conference on Computing Sciences ◽

10.1109/iccs.2012.34 ◽

2012 ◽

Cited By ~ 7

Author(s):

Basavaraj S. Mathapati ◽

Siddarama R. Patil ◽

V.D. Mytri

Keyword(s):

Wireless Sensor Networks ◽

Sensor Networks ◽

Data Aggregation ◽

Energy Efficient ◽

Reliable Data ◽

Wireless Sensor ◽

Aggregation Technique

Managing Historical Linguistic Data for Computational Phylogenetics and Computer-Assisted Language Comparison

10.7551/mitpress/12200.003.0033 ◽

2022 ◽

Keyword(s):

Historical Linguistic ◽

Computer Assisted ◽

Linguistic Data ◽

Computational Phylogenetics
