Wikidata: A platform for data integration and dissemination for the life sciences and beyond

2015 ◽  
Author(s):  
Elvira Mitraka ◽  
Andra Waagmeester ◽  
Sebastian Burgstaller-Muehlbacher ◽  
Lynn M Schriml ◽  
Andrew I Su ◽  
...  

Wikidata is an open, Semantic Web-compatible database that anyone can edit. This 'data commons' provides structured data for Wikipedia articles and other applications. Every article on Wikipedia has a hyperlink to an editable item in this database. This unique connection to the world's largest community of volunteer knowledge editors could help make Wikidata a key hub within the greater Semantic Web. The life sciences, as ever, face crucial challenges in disseminating and integrating knowledge. Our group is addressing these issues by populating Wikidata with the seeds of a foundational semantic network linking genes, drugs and diseases. Using this content, we are enhancing Wikipedia articles both to increase their quality and to recruit human editors to expand and improve the underlying data. We encourage the community to join us as we collaboratively create what can become the most used and most central semantic data resource for the life sciences and beyond.
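The gene-drug-disease network described above is queryable through Wikidata's public SPARQL endpoint (https://query.wikidata.org/sparql). A minimal sketch of composing such a query follows; property P2176 ("drug or therapy used for treatment") is a real Wikidata property, while the example item Q12206 (diabetes mellitus) is an assumption worth verifying against the live database.

```python
# Sketch: building a SPARQL query that lists drugs linked to a disease item
# via Wikidata's P2176 ("drug or therapy used for treatment") property.

def treatments_query(disease_qid: str, limit: int = 10) -> str:
    """Return a SPARQL query string for drugs used to treat a disease item."""
    return f"""
    SELECT ?drug ?drugLabel WHERE {{
      wd:{disease_qid} wdt:P2176 ?drug .
      SERVICE wikibase:label {{ bd:serviceParam wikibase:language "en". }}
    }}
    LIMIT {limit}
    """

# Q12206 is (assumed to be) the Wikidata item for diabetes mellitus.
print(treatments_query("Q12206"))
```

The query string can be POSTed to the endpoint with any HTTP client; the label service clause resolves item IDs to English labels server-side.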

Web Services ◽  
2019 ◽  
pp. 1812-1835
Author(s):  
Saravjeet Singh ◽  
Jaiteg Singh

Management of data is a crucial task for any organization, and it becomes both multifaceted and vital as the data grows more complex. In today's era, most organizations generate semi-structured or unstructured data that requires special techniques to handle and manage. To meet the need to handle unstructured data, Semantic Web technology provides a way to arrive at an effective solution. This chapter explains Synthetic Semantic Data Management (SSDM), a technique based on the Semantic Web that helps manage the data of small and mid-sized enterprises (SMEs). SSDM provides procedures to handle, store, manage, and retrieve semi-structured data.
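The core idea behind a semantic approach to semi-structured data can be sketched as representing records from heterogeneous sources as subject-predicate-object triples, so they can be stored and queried uniformly. The sketch below is illustrative only; the class and record names are not the chapter's actual API.

```python
# Illustrative sketch: a tiny in-memory triple store. Semi-structured records
# from different sources are flattened into (subject, predicate, object)
# triples, then queried with wildcard patterns.

class TripleStore:
    def __init__(self):
        self.triples = set()

    def add(self, subject, predicate, obj):
        self.triples.add((subject, predicate, obj))

    def query(self, subject=None, predicate=None, obj=None):
        """Return triples matching the pattern; None acts as a wildcard."""
        return [
            t for t in self.triples
            if (subject is None or t[0] == subject)
            and (predicate is None or t[1] == predicate)
            and (obj is None or t[2] == obj)
        ]

store = TripleStore()
store.add("order:42", "hasCustomer", "Acme Ltd")
store.add("order:42", "hasTotal", "199.00")
store.add("order:43", "hasCustomer", "Acme Ltd")

# All facts about order:42, regardless of the source file format:
print(store.query(subject="order:42"))
```

Production systems would use an RDF library and a persistent store, but the uniform triple shape is what makes disparate semi-structured sources manageable under one query interface.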




Author(s):  
Andra Waagmeester ◽  
Lynn Schriml ◽  
Andrew Su

Wikidata (http://www.wikidata.org) is the linked database of the Wikimedia Foundation. Like its sister project Wikipedia, it is open to humans and machines. Initially intended primarily as a central repository of structured data for the approximately 200 language versions of Wikipedia, Wikidata currently also serves many other use cases. It is an open, Semantic Web-compatible database that anyone can edit. Here, we present the Gene Wiki initiative. This project started in 2008 by creating Wikipedia articles for all human genes (Huss et al. 2008). These articles were enriched with structured information on these genes in tables (called infoboxes). With the onset of Wikidata in 2012, the project diverted its attention from the infoboxes, and since then we have been enriching Wikidata with structured knowledge from public scientific resources on genes, proteins, diseases and compounds (Burgstaller-Muehlbacher et al. 2016). This structured information is added to Wikidata while active links to the primary source are maintained.

Adding a new resource to Wikidata is a community-driven process that starts with modelling the subjects of the resource under scrutiny. This involves seeking commonalities with similar concepts in Wikidata and, if none are found, creating new ones. This process mostly happens in a collaboratively edited document (e.g. GDocs), where different graphical networks are drawn to reflect the data being modelled and its embedding in Wikidata. Once consensus has been reached, the model typically exists as a human-readable document. To allow future validation of these models against existing data, the model is converted into a machine-readable Shape Expression (ShEx) (Anonymous 2019, Waagmeester et al. 2017). The Shape Expressions schema language can be consumed and produced by both humans and machines, and is useful in model development, legacy review, or as formal documentation. Once a semantic data model (as a Shape Expression) has been agreed upon, i.e. community consensus is reached, a bot is developed to convert the knowledge from the primary source into the Wikidata model.

While Wikidata is linked data (part of the Semantic Web), many life-science resources are not. On the contrary, many distinct file formats or API output formats are used to present life-science knowledge. To convert between these different formats, bots need to be developed that are able to parse the different resources and serialize them into Wikidata. We have developed a software library in the Python programming language, which we use to build these bots. Once created, these bots run regularly to keep Wikidata up to date with knowledge on genes, proteins, diseases and drugs.

Having scientific knowledge represented in Wikidata comes with benefits. First, having research data on Wikidata increases its sustainability: when research projects end, their findings now remain on an independently funded infrastructure. Having someone else maintain an infrastructure for a data commons also relieves the research community of having to do it themselves, leaving more time to focus on research. As a generic public data commons, Wikidata allows public scrutiny and rapid integration with other domains. Inconsistencies or disagreements between resources become more visible due to the unified data models and interfaces; the latter we leverage as a feature in our bots. For example, one of our core resources is the Disease Ontology (Schriml et al. 2018). This ontology of human diseases is continuously updated by its curation team, and twice per month its updates are synchronised with Wikidata. If inconsistencies or disagreements with other resources surface, they are logged and shared with the curation team of the Disease Ontology. Hence, we have created a bi-directional update cycle, improving both the Disease Ontology and Wikidata.
Although our bots focus on molecular biology, our approaches are generic enough that we are confident a similar approach can work in biodiversity informatics.
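The bot pattern described above (parse a primary-source record, serialize it into Wikidata-model statements with a reference back to the source) can be sketched as follows. Property IDs P31 ("instance of") and P699 ("Disease Ontology ID") are real Wikidata properties and Q12136 is the item for "disease", but the record layout and function names are illustrative, not the project's actual library code.

```python
# Hedged sketch of a Gene Wiki-style bot step: map one Disease Ontology
# record onto Wikidata-style (property, value, reference) statements,
# keeping an active link to the primary source on every statement.

def to_statements(record: dict) -> list:
    """Serialize a Disease Ontology-style record into Wikidata-model statements."""
    ref = record["source_url"]  # provenance attached to every statement
    statements = [
        ("P31", "Q12136", ref),          # instance of: disease
        ("P699", record["doid"], ref),   # Disease Ontology ID
        ("label", record["label"], ref),
    ]
    for synonym in record.get("synonyms", []):
        statements.append(("alias", synonym, ref))
    return statements

record = {
    "doid": "DOID:1612",
    "label": "breast cancer",
    "synonyms": ["mammary cancer"],
    "source_url": "http://purl.obolibrary.org/obo/DOID_1612",
}
for statement in to_statements(record):
    print(statement)
```

A real bot would additionally diff these statements against the live Wikidata item and write only the changes, which is what makes the twice-monthly synchronisation with the Disease Ontology practical.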




Author(s):  
Justin E. H. Smith

Though it did not yet exist as a discrete field of scientific inquiry, biology was at the heart of many of the most important debates in seventeenth-century philosophy. Nowhere is this more apparent than in the work of G. W. Leibniz. This book offers the first in-depth examination of Leibniz's deep and complex engagement with the empirical life sciences of his day, in areas as diverse as medicine, physiology, taxonomy, generation theory, and paleontology. The book shows how these wide-ranging pursuits were not only central to Leibniz's philosophical interests, but often provided the insights that led to some of his best-known philosophical doctrines. Presenting the clearest picture yet of the scope of Leibniz's theoretical interest in the life sciences, the book takes seriously the philosopher's own repeated claims that the world must be understood in fundamentally biological terms. Here it reveals a thinker who was immersed in the sciences of life, and looked to the living world for answers to vexing metaphysical problems. The book casts Leibniz's philosophy in an entirely new light, demonstrating how it radically departed from the prevailing models of mechanical philosophy and had an enduring influence on the history and development of the life sciences. Along the way, the book provides a fascinating glimpse into early modern debates about the nature and origins of organic life, and into how philosophers such as Leibniz engaged with the scientific dilemmas of their era.


2006 ◽  
Author(s):  
Michael Schroeder ◽  
Eric Neumann

Author(s):  
Uwe Weissflog

Abstract This paper provides an overview of methods and ideas to achieve data integration in CIM. It describes a dictionary approach allowing participating applications to define their common constructs gradually as an additional service across application systems. Because of the importance of product definition data, the role of PDES/STEP as part of this dictionary approach is also described. The technical concepts of the dictionary, such as schema mapping, semantic data model, user methods and the required additions within participating applications are explained. Problems related to data integrity, data redundancy, performance and binding of dissimilar software components are discussed as well as the deficiencies related to today’s data modelling capabilities. The added value an active dictionary can provide to a CIM environment consisting of established applications in heterogeneous environments, where migration into one standardized homogeneous set of CIM applications is not likely, is also explained.
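The dictionary's schema-mapping role can be sketched in a few lines: a shared dictionary maps each application's local field names onto common constructs, so established applications can exchange data without being migrated to one homogeneous system. The application and field names below are hypothetical, not from the paper.

```python
# Illustrative sketch of the dictionary approach: each participating
# application registers a mapping from its local schema to the shared
# common constructs; records are translated through the dictionary.

DICTIONARY = {
    "cad_system": {"part_no": "PartNumber", "descr": "Description"},
    "mrp_system": {"item_id": "PartNumber", "item_text": "Description"},
}

def to_common(app: str, record: dict) -> dict:
    """Translate an application-local record into the common schema."""
    mapping = DICTIONARY[app]
    return {mapping[k]: v for k, v in record.items() if k in mapping}

cad = to_common("cad_system", {"part_no": "A-100", "descr": "Bracket"})
mrp = to_common("mrp_system", {"item_id": "A-100", "item_text": "Bracket"})
print(cad == mrp)  # → True: both records agree in the common schema
```

Fields absent from the dictionary are dropped at the boundary, which mirrors the paper's point that common constructs can be defined gradually, application by application.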

