Digital Text Collections, Linguistic Research Data, and Mashups: Notes on the Legal Situation

2008 ◽  
Vol 57 (1) ◽  
pp. 52-71 ◽  
Author(s):  
Timm Lehmberg ◽  
Dr. Georg Rehm ◽  
Dr. Andreas Witt ◽  
Felix Zimmermann
Author(s):  
Truus Kruyt

This paper discusses the advantages of encoded digital text over printed text, from a researcher's perspective. The traditional notion of text corpus as a well-considered collection of texts is related to the huge amounts of digital texts that are currently available on the web. After examples of useful digitalization initiatives and available digital resources, information is given about the users and uses of the text corpora stored at the Institute for Dutch Lexicology. Attention is paid to some obstacles in building or using text collections. The conclusion is that up till now the digital medium primarily facilitates research rather than evokes new linguistic research questions.


HUMANIKA ◽  
2015 ◽  
Vol 22 (2) ◽  
pp. 92
Author(s):  
Suharyo Suharyo ◽  
Surono Surono ◽  
Mujid F Amin

This article is based on the assumption that language does not exist in a social vacuum. Language is more than a set of words: it is not merely linguistic but also social. Current linguistic research should therefore take the social dimension into account through critical approaches such as van Dijk's critical discourse analysis (CDA) research model. Critical discourse analysis considers the text, context, social cognition, and social analysis/context. The research steps include exposing the macrostructure (thematic), the superstructure (schematic), and the microstructure, which consists of semantics, syntax, stylistics, and rhetoric. Accordingly, this study uses the read-and-record method, with research data collected from the Suara Merdeka and Kompas newspapers. The study concludes that language represents ideology and (symbolic) power, both individual and communal.


Author(s):  
رضوان اسخيطة

This research, "Data Protection Laws and Ways to Conformity", discusses data protection laws in Arab countries and takes the General Data Protection Regulation (GDPR) as its main example. Its main purpose is to show which points the GDPR has developed and to compare them with the legal situation in Arab countries. The UAE is used as the main example of the Arab countries because of its advanced level of development in the technical and legal fields. The research relies mainly on books, electronic contributions, and online guides as references and uses comparison as its main research method. The new points in the European laws and the extension of the GDPR's application to every country in specific cases are among the main discussion points. The research concludes with recommendations on the best ways to conform with the protection laws and on the features of this conformity.


Publications ◽  
2020 ◽  
Vol 8 (2) ◽  
pp. 33
Author(s):  
Hanna Hedeland

This article describes the development of the digital infrastructure at a research data centre for audio-visual linguistic research data, the Hamburg Centre for Language Corpora (HZSK) at the University of Hamburg in Germany, over the past ten years. The typical resource hosted in the HZSK Repository, the core component of the infrastructure, is a collection of recordings with time-aligned transcripts and additional contextual data, a spoken language corpus. Since the centre has a thematic focus on multilingualism and linguistic diversity and provides its service to researchers within linguistics and other disciplines, the development of the infrastructure was driven by diverse usage scenarios and user needs on the one hand, and by the common technical requirements for certified service centres of the CLARIN infrastructure on the other. Beyond the technical details, the article also aims to be a contribution to the discussion on responsibilities and services within emerging digital research data infrastructures and the fundamental issues in sustainability of research software engineering, concluding that in order to truly cater to user needs across the research data lifecycle, we still need to bridge the gap between discipline-specific research methods in the process of digitalisation and generic digital research data management approaches.


2016 ◽  
Vol 27 (2) ◽  
pp. 156
Author(s):  
Prihantoro Prihantoro

The research problems in this study are 1) what role lexicogrammar plays in determining the polarity of the F-word and 2) how to formalize it for corpus processing. The data is obtained from the Corpus of Contemporary American English (COCA). In this corpus, the F-word is shown to have the highest frequency compared with its distribution across other corpora. Corpus methodology is applied by sending queries to the COCA interface to retrieve F-words. Combinations of tokens surrounding F-words yield the phrase and clause units that accompany them, which are significant cues for determining F-word polarity. The polarity is shown to be not necessarily negative. I also designed a computational resource that allows F-words to be retrieved offline, so that users can apply it to any digital text collection.


2017 ◽  
Vol 4 (1) ◽  
pp. 38
Author(s):  
Awliya Rahmi

This research discusses joke strategies in the American situation comedy How I Met Your Mother (HIMYM). Its purpose is to identify (1) the joke strategies in HIMYM, (2) the pragmatic meaning of the jokes expressed by the characters in HIMYM, and (3) the pragmatic functions of the jokes expressed by the characters in HIMYM. The research is categorized as descriptive linguistic research. The observation method was applied in data collection, while the distributional and matching methods were applied in data analysis. The results of the data analysis are presented using informal and formal methods. The analysis found 14 joke strategies uttered by the characters in HIMYM, namely: ambiguity, grammar, syllabics, idiomatics, questionable English, antonymics, style, negativism, lexicography, spelling, punctuation, rhyming English, numerical English, and part of speech. The dominant strategy is ambiguity, because many English words have more than one meaning and are likely to lead the listener to multiple interpretations. The jokes uttered by the characters in HIMYM have assertive, expressive, and directive meanings. Moreover, the jokes also serve to show the power, solidarity, and psychological defense of the speaker.


2020 ◽  
Author(s):  
Liezl Ball ◽  
Theo Bothma

Introduction. With the increase in the availability of digital text collections for humanities researchers, tools to enable enhanced retrieval are required. If words with very specific properties could be retrieved from a text collection, more accurate linguistic and other analyses could be made. There is a range of properties and metadata that could be specified for retrieval, from morphological data up to bibliographic data. Furthermore, the bibliographic data should not only be at item level but also be extended to the text level. For example, in an anthology each section could be encoded with the author of that section. Such extended metadata will enable fine-grained retrieval. Method. In this study, current tools were evaluated to determine to what extent they allow users to retrieve words with specific properties from a text collection. Analysis. The analysis is limited to the following criteria: interface design, metadata, search options, filtering and search results. Results. Currently, it is not possible for a user to retrieve words with specific properties from a text collection. Conclusion. An extended set of metadata should be used to encode text to enable retrieval of words on a fine-grained level.
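
The anthology example in this abstract can be illustrated with a minimal sketch. It is not the authors' encoding scheme or any existing tool: the `Token`, `Section`, and `Item` classes, the `find_tokens` function, and the sample data are all hypothetical, chosen only to show how section-level author metadata combined with word-level properties could support fine-grained retrieval.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Token:
    form: str   # word as it appears in the text
    lemma: str  # dictionary form (a morphological property)
    pos: str    # part-of-speech tag


@dataclass(frozen=True)
class Section:
    author: str    # text-level bibliographic metadata (hypothetical field)
    tokens: tuple  # Token objects in reading order


@dataclass(frozen=True)
class Item:
    title: str       # item-level bibliographic metadata
    sections: tuple  # Section objects


def find_tokens(item, *, lemma=None, pos=None, author=None):
    """Return (author, form) pairs for tokens matching every given filter."""
    hits = []
    for section in item.sections:
        # Filter on section-level metadata first.
        if author is not None and section.author != author:
            continue
        for token in section.tokens:
            # Then filter on word-level properties.
            if lemma is not None and token.lemma != lemma:
                continue
            if pos is not None and token.pos != pos:
                continue
            hits.append((section.author, token.form))
    return hits


# Invented sample data: one anthology with two sections by different authors.
anthology = Item(
    title="Sample Anthology",
    sections=(
        Section("Author A", (Token("runs", "run", "VERB"),
                             Token("quickly", "quickly", "ADV"))),
        Section("Author B", (Token("runs", "run", "NOUN"),
                             Token("ran", "run", "VERB"))),
    ),
)

verb_hits = find_tokens(anthology, lemma="run", pos="VERB")
author_b_hits = find_tokens(anthology, lemma="run", author="Author B")
```

With item-level metadata alone, a query could only return the whole anthology. Attaching the author to each section is what lets the second query restrict results to one contributor.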

