Using web data to explore lexico-semantic relations

Fast Neural Network Engine for Natural Science Language Processing: A Drug-Search Case.

10.26434/chemrxiv.12800348 ◽

2020 ◽

Author(s):

Vadim V. Korolev ◽

Artem Mitrofanov ◽

Kirill Karpov ◽

Valery Tkachenko

Keyword(s):

Neural Network ◽

Natural Language Processing ◽

Natural Language ◽

Language Processing ◽

Natural Science ◽

Therapeutic Agent ◽

Semantic Relations ◽

Chemical Data ◽

Processing Methods ◽

Modern Natural

The main advantage of modern natural language processing methods is the ability to turn an amorphous, human-readable task into a strict mathematical form. This makes it possible to extract chemical data and insights from articles and to find new semantic relations. We propose a universal engine for processing chemical and biological texts. We successfully tested it on various use cases and applied it to the search for a therapeutic agent for COVID-19 by analyzing the PubMed archive.


THE METADIALOGUE CONCEPT IN ITS LINGUISTIC ASPECT

Bulletin of Udmurt University. Series History and Philology ◽

10.35634/2412-9534-2019-29-5-721-726 ◽

2019 ◽

Vol 29 (5) ◽

pp. 721-726

Author(s):

Yu.V. Kupriyanova ◽

I.M. Vasilyanova

Keyword(s):

Subject Matter ◽

Dynamic Structure ◽

Point Of View ◽

Semantic Relations ◽

Human Cognition ◽

Main Research ◽

Key Points ◽

Research Findings ◽

The Subject ◽

Linguistic Aspect

The article summarizes the key points in the development of the metadialogue phenomenon from a linguistic point of view. It covers several stages in the development of the concept and the difficulties of structuring it, and reviews the main findings of modern foreign and domestic researchers. Some characteristics of the subject are given from the standpoint of various pragmatic attitudes. On the basis of the dynamic structure of metadialogue development, certain principles of semantic relations connected with the dialectical nature of human cognition are presented. An excursion into the history and evolution of the concept is provided, and several formulations of the subject matter are given. In accordance with the goal of speech influence, internal problems of metadialogue development are highlighted, and the critical points in solving them are described. The rules of metadialogue flow are explained at the level of individual steps, whose success or failure directly affects the final result of communication. Prospects for further research on the concept across various types of discourse are indicated.


Research and Development on Semantic Web Data Management

Journal of Software ◽

10.3724/sp.j.1001.2009.03678 ◽

2009 ◽

Vol 20 (11) ◽

pp. 2950-2964 ◽

Cited By ~ 4

Author(s):

Xiao-Yong DU ◽

Yan WANG ◽

Bin LÜ

Keyword(s):

Semantic Web ◽

Research And Development ◽

Data Management ◽

Web Data ◽

Web Data Management


Comparing face-to-face to web data collection: unit response and costs in a national health survey (Preprint)

10.2196/preprints.26299 ◽

2020 ◽

Author(s):

Elise Braekman ◽

Stefaan Demarest ◽

Rana Charafeddine ◽

Sabine Drieskens ◽

Finaba Berete ◽

...

Keyword(s):

Data Collection ◽

Response Rate ◽

Response Rates ◽

Demographic Characteristics ◽

Unit Response ◽

Cost Advantage ◽

Web Data ◽

Demographic Groups ◽

Considerable Cost ◽

The Difference

BACKGROUND: Web data collection holds promise for population health surveys because of its cost-effectiveness, ease of implementation, and increasing internet penetration. Nonetheless, web modes may lead to lower and more selective unit response rates than traditional modes and hence may increase bias in the measured indicators. OBJECTIVE: This research assesses the unit response and costs of a web versus face-to-face (F2F) study. METHODS: Alongside the F2F Belgian Health Interview Survey of 2018 (BHIS2018; gross sample used: n=7,698), a web survey (BHISWEB; gross sample: n=6,183) is organized. Socio-demographic data on invited individuals is obtained from the national register and census linkages. Unit response rates are calculated taking into account the different sampling probabilities of both surveys. Logistic regression analyses examine the association between mode system (web vs. F2F) and socio-demographic characteristics on unit non-response. The costs per completed web questionnaire are compared with those for a completed F2F questionnaire. RESULTS: The unit response rate is lower in BHISWEB (18.0%) than in BHIS2018 (43.1%). A lower web response is found among all socio-demographic groups; however, the difference is larger among people older than 65, people with a low level of education, people with a non-Belgian nationality, people living alone, and those living in the Brussels Capital region. The socio-demographic characteristics associated with non-response differ between the two studies. Having another European (OR (95% CI): 1.60 (1.20-2.13)) or a non-European nationality (OR (95% CI): 2.57 (1.79-3.70)) (compared to having the Belgian nationality) and living in the Brussels Capital (OR (95% CI): 1.72 (1.41-2.10)) or Walloon (OR (95% CI): 1.47 (1.15-1.87)) region (compared to living in the Flemish region) are associated with higher non-response only in BHISWEB. In BHIS2018, younger people (OR (95% CI): 1.31 (1.11-1.54)) are more likely to be non-respondents than older people; this was not found in BHISWEB. In both studies, lower educated people have a higher chance of being non-respondents, but this effect is more pronounced in BHISWEB (OR low vs. high education level (95% CI): web 2.71 (2.21-3.39); F2F 1.70 (1.48-1.95)). The BHISWEB study has a considerable cost advantage; the total cost per completed questionnaire is almost three times lower (€41) compared to the F2F data collection (€111). CONCLUSIONS: The F2F unit response rate is generally higher, yet for certain groups the difference between web and F2F is more limited. A considerable cost advantage of web collection is found. It is therefore worthwhile to experiment with adaptive mixed-mode designs to optimize financial resources without increasing selection bias, e.g. inviting only the socio-demographic groups more eager to participate online for web surveys while continuing to focus on increasing the F2F response rates for other groups. CLINICALTRIAL: Studies approved by the Ethics Committee of the University Hospital of Ghent
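
As a rough check on the arithmetic behind these figures, the sketch below recomputes the cost ratio from the two reported per-questionnaire costs and shows how an unadjusted odds ratio with a 95% Wald confidence interval is obtained from a 2x2 table. The function name `odds_ratio_ci` and the example counts are hypothetical and are not taken from the study; the abstract's odds ratios come from logistic regression models, which this simplified calculation does not reproduce.

```python
import math

# Costs per completed questionnaire as reported in the abstract (euros).
cost_web = 41
cost_f2f = 111
cost_ratio = cost_f2f / cost_web  # how many times cheaper the web mode is


def odds_ratio_ci(a, b, c, d, z=1.96):
    """Unadjusted odds ratio and Wald confidence interval for a 2x2 table.

    a, b: non-respondents and respondents in the group of interest
    c, d: non-respondents and respondents in the reference group
    z:    normal quantile (1.96 gives an approximate 95% interval)
    """
    odds_ratio = (a * d) / (b * c)
    # Standard error of log(OR) under the usual large-sample approximation.
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Hypothetical counts, for illustration only (not data from the study).
or_, lo, hi = odds_ratio_ci(20, 80, 10, 90)
print(f"cost ratio F2F/web: {cost_ratio:.2f}")
print(f"illustrative OR: {or_:.2f} (95% CI {lo:.2f}-{hi:.2f})")
```

A ratio of about 2.7 is consistent with the abstract's wording that the web cost is "almost three times lower".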


Pairwise Multi-Class Document Classification for Semantic Relations between Wikipedia Articles

Proceedings of the ACM/IEEE Joint Conference on Digital Libraries in 2020 ◽

10.1145/3383583.3398525 ◽

2020 ◽

Author(s):

Malte Ostendorff ◽

Terry Ruas ◽

Moritz Schubotz ◽

Georg Rehm ◽

Bela Gipp

Keyword(s):

Document Classification ◽

Semantic Relations


Robust Web Data Extraction Based on Weighted Path-layer Similarity

Journal of Computer Information Systems ◽

10.1080/08874417.2020.1861571 ◽

2021 ◽

pp. 1-11

Author(s):

Peng Gao ◽

Hao Han

Keyword(s):

Data Extraction ◽

Web Data ◽

Web Data Extraction


Multimodal Deep Learning Framework for Sentiment Analysis from Text-Image Web Data

2020 IEEE/WIC/ACM International Joint Conference on Web Intelligence and Intelligent Agent Technology (WI-IAT) ◽

10.1109/wiiat50758.2020.00039 ◽

2020 ◽

Author(s):

Selvarajah Thuseethan ◽

Sivasubramaniam Janarthan ◽

Sutharshan Rajasegarar ◽

Priya Kumari ◽

John Yearwood

Keyword(s):

Deep Learning ◽

Sentiment Analysis ◽

Web Data ◽

Learning Framework


A Novel Architecture for Deep Web Crawler

International Journal of Information Technology and Web Engineering ◽

10.4018/jitwe.2011010103 ◽

2011 ◽

Vol 6 (1) ◽

pp. 25-48 ◽

Cited By ~ 7

Author(s):

Dilip Kumar Sharma ◽

A. K. Sharma

Keyword(s):

Cost Effective ◽

Deep Web ◽

Web Data ◽

Web Crawler ◽

Web Information ◽

General Search ◽

Web Crawlers

A traditional crawler picks up a URL, retrieves the corresponding page and extracts various links, adding them to the queue. A deep Web crawler, after adding links to the queue, checks for forms. If forms are present, it processes them and retrieves the required information. Various techniques have been proposed for crawling deep Web information, but much remains undiscovered. In this paper, the authors analyze and compare important deep Web information crawling techniques to find their relative limitations and advantages. To minimize the limitations of existing deep Web crawlers, a novel architecture is proposed based on QIIIEP specifications (Sharma & Sharma, 2009). The proposed architecture is cost effective and has features of privatized search and general search for deep Web data hidden behind HTML forms.
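
The crawl loop described in this abstract can be illustrated with a minimal sketch. It is not the authors' architecture and does not implement QIIIEP; the in-memory `PAGES` dictionary, the `LinkAndFormParser` class, and the `crawl` function are hypothetical names chosen so the example runs without network access. Form "processing" is reduced here to recording each form's action and input field names.

```python
from collections import deque
from html.parser import HTMLParser

# Hypothetical in-memory "web" so the sketch runs without network access.
PAGES = {
    "/": '<a href="/about">About</a><a href="/search">Search</a>',
    "/about": '<a href="/">Home</a>',
    "/search": '<form action="/results"><input name="q"><input name="year"></form>',
}


class LinkAndFormParser(HTMLParser):
    """Collects hyperlinks and form descriptions from one HTML page."""

    def __init__(self):
        super().__init__()
        self.links = []
        self.forms = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "a" and "href" in attrs:
            self.links.append(attrs["href"])
        elif tag == "form":
            self.forms.append({"action": attrs.get("action", ""), "fields": []})
        elif tag == "input" and self.forms and "name" in attrs:
            # Attach the input to the most recently opened form.
            self.forms[-1]["fields"].append(attrs["name"])


def crawl(start, pages):
    """Breadth-first crawl: queue links first, then check each page for forms."""
    queue = deque([start])
    seen = {start}
    visited = []
    found_forms = []
    while queue:
        url = queue.popleft()
        html = pages.get(url)
        if html is None:
            continue  # page not available in this toy web
        visited.append(url)
        parser = LinkAndFormParser()
        parser.feed(html)
        # Traditional crawler step: extract links and add them to the queue.
        for link in parser.links:
            if link not in seen:
                seen.add(link)
                queue.append(link)
        # Deep Web step: after queuing links, check the page for forms.
        for form in parser.forms:
            found_forms.append((url, form["action"], form["fields"]))
    return visited, found_forms


visited, forms = crawl("/", PAGES)
print(visited)
print(forms)
```

A real deep Web crawler would go on to fill in and submit the detected forms; that step depends on the query-generation strategy and is left out of this sketch.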


Semantic composition of AT-LOCATION relation with other relations

Natural Language Engineering ◽

10.1017/s1351324911000222 ◽

2011 ◽

Vol 18 (3) ◽

pp. 343-374

Author(s):

HAKKI C. CANKAYA ◽

EDUARDO BLANCO ◽

DAN MOLDOVAN

Keyword(s):

Experimental Study ◽

High Accuracy ◽

Semantic Relations ◽

Semantic Composition

This paper presents a method for the composition of AT-LOCATION with other semantic relations. The method is based on inference axioms that combine two semantic relations, yielding another relation that otherwise is not expressed. An experimental study conducted on PropBank, WordNet, and eXtended WordNet shows that inferences have high accuracy. The method is applicable to combining other semantic relations and it is beneficial to many semantically intense applications.
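
The idea of an inference axiom that combines two relations into a third can be sketched as follows. The abstract does not list the axioms, so the single rule used here (PART-WHOLE composed with AT-LOCATION yields AT-LOCATION), the relation name PART-WHOLE, the example facts, and the `compose` function are all illustrative assumptions rather than the paper's actual axiom set.

```python
# Hypothetical axiom table: (R1, R2) -> R3 means
# R1(x, y) and R2(y, z) together license R3(x, z).
AXIOMS = {
    ("PART-WHOLE", "AT-LOCATION"): "AT-LOCATION",
}

# Illustrative facts as (relation, first argument, second argument).
FACTS = {
    ("PART-WHOLE", "engine", "car"),
    ("AT-LOCATION", "car", "garage"),
}


def compose(facts, axioms):
    """Apply each axiom once to every pair of facts sharing a middle argument."""
    inferred = set()
    for r1, x, y1 in facts:
        for r2, y2, z in facts:
            if y1 == y2 and (r1, r2) in axioms:
                new_fact = (axioms[(r1, r2)], x, z)
                if new_fact not in facts:
                    inferred.add(new_fact)
    return inferred


inferred = compose(FACTS, AXIOMS)
print(inferred)
```

This performs a single pass only; chaining inferences would require repeating the pass until no new facts appear.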


Extraction and integration of web data by end-users

Proceedings of the 22nd ACM International Conference on Information & Knowledge Management - CIKM '13 ◽

10.1145/2505515.2505635 ◽

2013 ◽

Cited By ~ 2

Author(s):

Sudhir Agarwal ◽

Michael Genesereth

Keyword(s):

End Users ◽

Web Data
