WINFRA: A Web-Based Platform for Semantic Data Retrieval and Data Analytics

Mathematics ◽  
2020 ◽  
Vol 8 (11) ◽  
pp. 2090
Author(s):  
Addi Ait-Mlouk ◽  
Xuan-Son Vu ◽  
Lili Jiang

The huge amounts of heterogeneous data stored in different locations need to be federated and semantically interconnected for further use. This paper introduces WINFRA, a comprehensive open-access platform for semantic web data and advanced analytics based on natural language processing (NLP) and data mining techniques (e.g., association rules, clustering, classification based on associations). The system is designed to facilitate federated data analysis, knowledge discovery, information retrieval, and new techniques for semantic web and knowledge graph representation. The processing step integrates data from multiple sources by creating virtual databases. An RDF Generator then produces RDF files for the different data sources, together with SPARQL queries, to support semantic data search and knowledge graph representation. Furthermore, several application cases demonstrate how the platform facilitates advanced data analytics over semantic data and showcase our proposed approach toward semantic association rules.
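The RDF-generation step described above can be illustrated with a minimal sketch: mapping rows from a (virtual) relational source into RDF triples serialised as N-Triples. All names here (the base IRI, the predicate vocabulary, the sample row) are illustrative assumptions, not part of the actual WINFRA implementation.

```python
# Minimal sketch of an "RDF generator" step: each source row becomes
# a subject, each column a predicate, each value a literal.
BASE = "http://example.org/winfra/"

def row_to_ntriples(table, row_id, row):
    """Serialise one source row as N-Triples lines."""
    subject = f"<{BASE}{table}/{row_id}>"
    lines = []
    for column, value in row.items():
        predicate = f"<{BASE}vocab/{column}>"
        lines.append(f'{subject} {predicate} "{value}" .')
    return lines

row = {"title": "WINFRA", "year": "2020"}
triples = row_to_ntriples("article", 1, row)
for t in triples:
    print(t)
```

The resulting N-Triples file can then be loaded into any triple store and queried with SPARQL, which is the search mechanism the platform exposes.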

Author(s):  
Alexey Cheptsov ◽  
Stefan Wesner ◽  
Bastian Koller

Modern Semantic Web scenarios require reasoning algorithms to be flexible, modular, and highly configurable. The rigid, monolithic approach followed in the design of most existing reasoners is not sufficient when dealing with today's challenges of data analysis across multiple sources of heterogeneous data, or when the data amount grows to "Big Data" sizes. The "reasoning as a workflow" concept has attracted a lot of attention in the design of new-generation Semantic Web applications, offering many opportunities to improve both the flexibility and the scalability of the reasoning process. Treating each workflow component as a service allows a reasoning algorithm to target a much wider range of Semantic Web use cases by benefiting from a service-oriented, component-based implementation. We introduce a technique for developing service-oriented Semantic Reasoning applications based on the workflow concept. We also present the Large Knowledge Collider, a software platform for developing workflow-based Semantic Web applications that takes advantage of on-demand high-performance computing and cloud infrastructures.
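The "reasoning as a workflow" idea can be sketched in a few lines: each reasoning stage is an independent component (here, a plain function) and a workflow is their composition, so stages can be swapped, reordered, or deployed as remote services. The component names and the toy ontology below are illustrative, not the Large Knowledge Collider's actual API.

```python
def select(triples):
    """Selection stage: narrow the input to the relevant subset."""
    return [t for t in triples if t[1] == "subClassOf"]

def reason(triples):
    """Reasoning stage: one transitive-closure step over subClassOf."""
    derived = set(triples)
    for (a, _, b) in triples:
        for (c, _, d) in triples:
            if b == c:
                derived.add((a, "subClassOf", d))
    return sorted(derived)

def workflow(triples, stages):
    """Run the stages as a pipeline; each stage is replaceable."""
    for stage in stages:
        triples = stage(triples)
    return triples

data = [("Cat", "subClassOf", "Mammal"), ("Mammal", "subClassOf", "Animal")]
print(workflow(data, [select, reason]))
```

In a service-oriented deployment, each stage function would be a network service with the same input/output contract, which is what makes the approach scale across heterogeneous infrastructures.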


Author(s):  
Abhilash Srivastav ◽  
Alok Chauhan

Social network data analysis is an important problem due to the proliferation of social network applications, the amount of data these applications generate, and the potential for insight based on this big data. The objective of the present work is to propose an architecture for a semantic web application that facilitates meaningful social network data analytics as well as answering queries about the concerned ontology. The proposed technique links, on one hand, semantic-technology tools provided by social network applications with data analytics tools and, on the other hand, extends this link to ontology authoring tools for further inference. Results obtained from the data analytics tool, results of queries on the generated ontology, and benchmarking of the data analytics tool's performance are shown. It has been observed that a semantic web application utilizing the above-mentioned tools and technologies is more versatile and flexible, and that further improvements are possible by applying generic data mining algorithms to the above scenario.


Author(s):  
Adrian Pachzelt ◽  
Gerwin Kasperek ◽  
Andy Lücking ◽  
Giuseppe Abrami ◽  
Christine Driller

Nowadays, obtaining information by entering queries into a web search engine is routine behaviour. With its search portal, the Specialised Information Service Biodiversity Research (BIOfid) brings the exploration of legacy biodiversity literature and data extraction up to current standards (Driller et al. 2020). In this presentation, we introduce the BIOfid search portal and its functionalities in a short how-to guide. To this end, we adapted a knowledge graph representation of our thematic focus: Central European, primarily German-language, biodiversity literature of the 19th and 20th centuries. Users can now search our text-mined corpus, which to date contains more than 8,700 full-text articles from 68 journals and focuses particularly on birds, lepidopterans and vascular plants. The texts are automatically preprocessed by the Natural Language Processing provider TextImager (Hemati et al. 2016) and will be linked to various databases such as Wikidata, Wikipedia, the Global Biodiversity Information Facility (GBIF), Encyclopedia of Life (EoL), Geonames, the Integrated Authority File (GND) and WordNet. For data retrieval, users can filter search results and download the article metadata as well as text annotations and database links in JavaScript Object Notation (JSON) format. For example, literature that mentions taxa from certain decades, or co-occurrences of species, can be searched. Our search engine recognises scientific and vernacular taxon names based on the GBIF Backbone Taxonomy and offers search suggestions to support the user. The semantic network of the BIOfid search portal is also enriched with data from the EoL trait bank, so that trait data can be included in search queries. Thus, scientists can enhance their own data sets with the search results and feed them into the relevant biodiversity data repositories to sustainably expand the corresponding knowledge graphs with reliable data.
Since BIOfid applies standard ontology terms, all data mobilized from literature can be combined with data on natural history collection objects or data from current research projects in order to generate more comprehensive knowledge. Furthermore, taxonomy, ecology and trait ontologies that have been built or extended within this project will be made available through appropriate platforms such as The Open Biological and Biomedical Ontology (OBO) Foundry and the Terminology Service of The German Federation for Biological Data (GFBio).
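A small sketch of the kind of post-processing the JSON download enables: finding articles in which two taxa co-occur. The record layout below is a hypothetical example for illustration, not the portal's actual export schema.

```python
import json

# Toy stand-in for a downloaded BIOfid search-result export.
records = json.loads("""
[
  {"article": "A1", "year": 1903, "taxa": ["Parus major", "Quercus robur"]},
  {"article": "A2", "year": 1921, "taxa": ["Parus major"]}
]
""")

def cooccurring(records, taxon_a, taxon_b):
    """Return the articles in which both taxa are mentioned."""
    return [r["article"] for r in records
            if taxon_a in r["taxa"] and taxon_b in r["taxa"]]

print(cooccurring(records, "Parus major", "Quercus robur"))  # ['A1']
```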


Sensors ◽  
2021 ◽  
Vol 21 (12) ◽  
pp. 4035
Author(s):  
Teresa Zawadzka ◽  
Tomasz Wierciński ◽  
Grzegorz Meller ◽  
Mateusz Rock ◽  
Robert Zwierzycki ◽  
...  

Data reusability is an important feature of current research in nearly every field of science. Modern research in Affective Computing often relies on datasets containing experiment-originated data such as biosignals, video clips, or images. Moreover, conducting experiments with a vast number of participants to build datasets for Affective Computing research is time-consuming and expensive. Therefore, it is extremely important to provide solutions that allow one to (re)use data from a variety of sources, which usually demands data integration. This paper presents the Graph Representation Integrating Signals for Emotion Recognition and Analysis (GRISERA) framework, which provides a persistent model for storing integrated signals and methods for its creation. To the best of our knowledge, this is the first approach in the Affective Computing field that addresses the problem of integrating data from multiple experiments, storing it in a consistent way, and providing query patterns for data retrieval. The proposed framework is based on a standardized graph model, which is known to be highly suitable for signal processing purposes. The validation proved that data from the well-known AMIGOS dataset can be stored in the GRISERA framework and later retrieved for training deep learning models. Furthermore, a second case study proved that it is possible to integrate signals from multiple sources (AMIGOS, ASCERTAIN, and DEAP) into GRISERA and retrieve them for further statistical analysis.
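The core idea of a graph model for experiment signals can be sketched as a simple property graph: participants and signals are nodes with properties, and edges connect them, so a query pattern is just a filter over nodes and edges. The labels and properties below are illustrative assumptions, not GRISERA's actual schema.

```python
# Minimal in-memory property graph: nodes with labels/properties, edges as triples.
nodes = {}
edges = []

def add_node(node_id, label, **props):
    nodes[node_id] = {"label": label, **props}

def add_edge(src, rel, dst):
    edges.append((src, rel, dst))

add_node("p1", "Participant", dataset="AMIGOS")
add_node("s1", "Signal", modality="ECG")
add_edge("p1", "HAS_SIGNAL", "s1")

# Query pattern: all ECG signals belonging to AMIGOS participants.
hits = [dst for (src, rel, dst) in edges
        if rel == "HAS_SIGNAL"
        and nodes[src]["dataset"] == "AMIGOS"
        and nodes[dst]["modality"] == "ECG"]
print(hits)  # ['s1']
```

Because signals from different datasets all land in the same node/edge vocabulary, a single query pattern retrieves them uniformly, which is what makes the integration reusable.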


2020 ◽  
Vol 26 (3) ◽  
pp. 103-107
Author(s):  
Ilie Cristian Dorobăţ ◽  
Vlad Posea

The continuous expansion of the semantic web and of the linked open data cloud means that more semantic data are available for querying from endpoints all over the web. We propose extending a standard SPARQL interface with UI and Natural Language Processing features to allow easier and more intelligent querying. The paper describes some usage scenarios for easy querying and opens a discussion on the advantages of such an implementation.
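One way such NL assistance on top of a SPARQL interface can work is template-based: a recognised question pattern is rewritten into a SPARQL query string. The question template and the property IRIs below are hypothetical, for illustration only.

```python
import re

def question_to_sparql(question):
    """Rewrite a 'who wrote X' question into a SPARQL query string."""
    m = re.match(r"who wrote (.+?)\??$", question.strip(), re.I)
    if not m:
        return None  # pattern not recognised; fall back to the raw SPARQL UI
    title = m.group(1)
    return (
        "SELECT ?author WHERE { "
        f'?book <http://example.org/title> "{title}" ; '
        "<http://example.org/author> ?author . }"
    )

print(question_to_sparql("Who wrote Dune?"))
```

A real implementation would add entity linking and many more templates, but the division of labour is the same: the NLP layer produces SPARQL, and the standard endpoint executes it.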


Author(s):  
Yuan Sun ◽  
Andong Chen ◽  
Chaofan Chen ◽  
Tianci Xia ◽  
Xiaobing Zhao

Learning the representation of a knowledge graph is critical to the field of natural language processing. There is a large body of research on English knowledge graph representation; however, for low-resource languages such as Tibetan, representing sparse knowledge graphs is a key problem. In this article, to address the scarcity of Tibetan knowledge graphs, we extend the Tibetan knowledge graph using triples from high-resource-language knowledge graphs and Point of Information map information. To improve the representation learning of the Tibetan knowledge graph, we propose a joint model that merges structure and entity-description information, based on the Translating Embeddings (TransE) and Convolutional Neural Network models. In addition, to mitigate segmentation errors, we use character and word embeddings to learn more complex information in Tibetan. Finally, the experimental results show that our model represents the Tibetan knowledge graph better than the baseline.
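The Translating Embeddings (TransE) component underlying the joint model scores a triple (h, r, t) as plausible when the head embedding translated by the relation embedding lands close to the tail embedding. The toy 3-dimensional vectors below are illustrative, not trained Tibetan knowledge graph embeddings.

```python
def transe_score(h, r, t):
    """L1 distance ||h + r - t||; lower means the triple is more plausible."""
    return sum(abs(hi + ri - ti) for hi, ri, ti in zip(h, r, t))

h = [0.1, 0.2, 0.3]   # head entity embedding
r = [0.4, 0.1, 0.0]   # relation embedding
t = [0.5, 0.3, 0.3]   # tail entity embedding

good = transe_score(h, r, t)            # near 0: h + r lands on t
bad = transe_score(h, r, [9.0, 9.0, 9.0])  # far away: large distance
print(good < bad)  # True
```

Training pushes observed triples toward low scores and corrupted triples toward high scores; the paper's joint model additionally feeds entity-description text through a convolutional encoder and merges both signals.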


2018 ◽  
Vol 2018 ◽  
pp. 1-11 ◽  
Author(s):  
Ruihui Mu ◽  
Xiaoqin Zeng

To solve the problem that collaborative filtering only uses the user-item rating matrix and does not consider semantic information, we propose a novel collaborative filtering recommendation algorithm based on a knowledge graph. Using a knowledge graph representation learning method, this approach embeds the existing semantic data into a low-dimensional vector space and integrates the semantic information of items into collaborative filtering by calculating the semantic similarity between items. This overcomes the shortcoming that collaborative filtering does not consider the semantic information of items, and therefore improves collaborative filtering recommendation on the semantic level. Experimental results show that the proposed algorithm achieves higher precision, recall, and F-measure for collaborative filtering recommendation.
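The central mechanism can be sketched as follows: item similarity is computed on knowledge-graph embeddings (here, cosine similarity) and used to weight the user's known ratings when predicting an unseen item. The embeddings and ratings below are toy values, not from the paper.

```python
import math

def cosine(a, b):
    """Cosine similarity between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.hypot(*a) * math.hypot(*b))

# Toy KG embeddings: film2 is semantically close to film1.
item_embedding = {"film1": [1.0, 0.0], "film2": [0.9, 0.1]}
user_ratings = {"film1": 5.0}  # the user's known ratings

def predict(user_ratings, target):
    """Weighted average of known ratings, weighted by semantic similarity."""
    num = den = 0.0
    for item, rating in user_ratings.items():
        sim = cosine(item_embedding[item], item_embedding[target])
        num += sim * rating
        den += sim
    return num / den if den else 0.0

print(round(predict(user_ratings, "film2"), 2))  # 5.0
```

In the paper's full algorithm this semantic similarity is combined with the classical rating-based similarity; the sketch isolates only the semantic term.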


2017 ◽  
Vol 9 (1) ◽  
pp. 19-24 ◽  
Author(s):  
David Domarco ◽  
Ni Made Satvika Iswari

Technology development has affected many areas of life, especially the entertainment field. One of the fastest-growing entertainment industries is anime. Anime has evolved into a trend and a hobby, especially for the population of Asian regions. The number of anime fans grows every year, and they try to dig up as much information as possible about their favorite anime. Therefore, a chatbot application was developed in this study as an anime information retrieval medium using the regular expression pattern matching method. This application is intended to help anime fans search for information about the anime they like. By using this application, users gain a convenient and interactive means of anime data retrieval that cannot be found when searching for information via search engines. The chatbot application successfully met the standards of an information retrieval engine with very good results: 72% precision and 100% recall, giving a harmonic mean (F-measure) of 83.7%. As a hedonic application, the chatbot already influences Behavioral Intention to Use by 83% and Immersion by 82%. Index Terms: anime, chatbot, information retrieval, Natural Language Processing (NLP), Regular Expression Pattern Matching
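The regular expression pattern matching method the chatbot relies on can be sketched in a few lines: each intent is a pattern, and a capture group feeds the response. The patterns and the toy "anime database" below are illustrative assumptions, not the study's actual rule set.

```python
import re

# Toy stand-in for the chatbot's anime knowledge base.
ANIME_DB = {"naruto": "Naruto is a long-running shounen series."}

# Each intent: (compiled pattern, responder taking the match object).
PATTERNS = [
    (re.compile(r"tell me about (\w+)", re.I),
     lambda m: ANIME_DB.get(m.group(1).lower(), "Sorry, I don't know that one.")),
    (re.compile(r"\b(hi|hello)\b", re.I),
     lambda m: "Hello! Ask me about an anime."),
]

def reply(message):
    """Return the response of the first pattern that matches, else a fallback."""
    for pattern, responder in PATTERNS:
        match = pattern.search(message)
        if match:
            return responder(match)
    return "Could you rephrase that?"

print(reply("Tell me about Naruto"))
```

Precision and recall for such a system are then measured by checking, over a test set of user questions, how often the matched pattern retrieves the intended information.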


2021 ◽  
Vol 8 (1) ◽  
Author(s):  
Lisa Grossman Liu ◽  
Raymond H. Grossman ◽  
Elliot G. Mitchell ◽  
Chunhua Weng ◽  
Karthik Natarajan ◽  
...  

The recognition, disambiguation, and expansion of medical abbreviations and acronyms is of utmost importance to prevent medically dangerous misinterpretation in natural language processing. To support recognition, disambiguation, and expansion, we present the Medical Abbreviation and Acronym Meta-Inventory, a deep database of medical abbreviations. A systematic harmonization of eight source inventories across multiple healthcare specialties and settings identified 104,057 abbreviations with 170,426 corresponding senses. Automated cross-mapping of synonymous records using state-of-the-art machine learning reduced redundancy, which simplifies future application. Additional features include semi-automated quality control to remove errors. The Meta-Inventory demonstrated high completeness and high coverage of abbreviations and senses in new clinical text, a substantial improvement over the next-largest repository (6–14% increase in abbreviation coverage; 28–52% increase in sense coverage). To our knowledge, the Meta-Inventory is the most complete compilation of medical abbreviations and acronyms in American English to date. The multiple sources and high coverage support application in varied specialties and settings, enabling cross-institutional natural language processing, which previous inventories did not support. The Meta-Inventory is available at https://bit.ly/github-clinical-abbreviations.
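A toy sketch of what such a sense inventory enables downstream: recognising abbreviations in clinical text and listing their candidate senses for a disambiguation model to choose from. The two-entry inventory below is illustrative, not data from the Meta-Inventory.

```python
# Hypothetical miniature sense inventory (abbreviation -> candidate senses).
inventory = {
    "RA": ["rheumatoid arthritis", "right atrium"],
    "MS": ["multiple sclerosis", "mitral stenosis"],
}

def candidate_senses(text):
    """Find known abbreviations in the text and return their candidate senses."""
    tokens = text.replace(".", "").split()
    return {tok: inventory[tok] for tok in tokens if tok in inventory}

print(candidate_senses("Patient with RA and history of MS."))
```

Recognition (is this token an abbreviation?) comes for free from the lookup; disambiguation (which sense applies here?) then needs context, which is why a broad, cross-institutional inventory of candidate senses is the necessary first step.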

