Large-Scale Entity Clustering Based on Structural Similarities within Knowledge Graphs

Fast Computation of Explanations for Inconsistency in Large-Scale Knowledge Graphs

Proceedings of The Web Conference 2020 ◽

10.1145/3366423.3380014 ◽

2020 ◽

Author(s):

Trung-Kien Tran ◽

Mohamed H. Gad-Elrab ◽

Daria Stepanova ◽

Evgeny Kharlamov ◽

Jannik Strötgen

Keyword(s):

Large Scale ◽

Fast Computation ◽

Knowledge Graphs

Download Full-text

Large-scale relation extraction from web documents and knowledge graphs with human-in-the-loop

Journal of Web Semantics ◽

10.1016/j.websem.2019.100546 ◽

2020 ◽

Vol 60 ◽

pp. 100546

Author(s):

Petar Ristoski ◽

Anna Lisa Gentile ◽

Alfredo Alba ◽

Daniel Gruhl ◽

Steven Welch

Keyword(s):

Large Scale ◽

Relation Extraction ◽

Web Documents ◽

Human In The Loop ◽

Knowledge Graphs

Download Full-text

Representation Learning of Large-Scale Knowledge Graphs via Entity Feature Combinations

Proceedings of the 2017 ACM on Conference on Information and Knowledge Management - CIKM '17 ◽

10.1145/3132847.3132961 ◽

2017 ◽

Cited By ~ 2

Author(s):

Zhen Tan ◽

Xiang Zhao ◽

Wei Wang

Keyword(s):

Large Scale ◽

Representation Learning ◽

Knowledge Graphs

Download Full-text

Commonsense Knowledge Aware Conversation Generation with Graph Attention

Proceedings of the Twenty-Seventh International Joint Conference on Artificial Intelligence ◽

10.24963/ijcai.2018/643 ◽

2018 ◽

Cited By ~ 35

Author(s):

Hao Zhou ◽

Tom Young ◽

Minlie Huang ◽

Haizhou Zhao ◽

Jingfang Xu ◽

...

Keyword(s):

Language Processing ◽

Large Scale ◽

Semantic Information ◽

Attention Mechanism ◽

Generation Model ◽

Dynamic Graph ◽

Commonsense Knowledge ◽

Word Generation ◽

Proposed Model ◽

Knowledge Graphs

Commonsense knowledge is vital to many natural language processing tasks. In this paper, we present a novel open-domain conversation generation model to demonstrate how large-scale commonsense knowledge can facilitate language understanding and generation. Given a user post, the model retrieves relevant knowledge graphs from a knowledge base and then encodes the graphs with a static graph attention mechanism, which augments the semantic information of the post and thus supports better understanding of the post. Then, during word generation, the model attentively reads the retrieved knowledge graphs and the knowledge triples within each graph to facilitate better generation through a dynamic graph attention mechanism. This is the first attempt that uses large-scale commonsense knowledge in conversation generation. Furthermore, unlike existing models that use knowledge triples (entities) separately and independently, our model treats each knowledge graph as a whole, which encodes more structured, connected semantic information in the graphs. Experiments show that the proposed model can generate more appropriate and informative responses than state-of-the-art baselines.

Download Full-text

Accurate Prediction of Kinase-Substrate Networks Using Knowledge Graphs

10.1101/865055 ◽

2019 ◽

Author(s):

Vít Nováček ◽

Gavin McGauran ◽

David Matallanas ◽

Adrián Vallejo Blanco ◽

Piero Conca ◽

...

Keyword(s):

Cell Fate ◽

Large Scale ◽

Conceptual Approach ◽

Kinase Substrate ◽

Biochemical Processes ◽

Cell Fate Decisions ◽

Cellular Processes ◽

Robust Model ◽

Vital Cell ◽

Knowledge Graphs

AbstractPhosphorylation of specific substrates by protein kinases is a key control mechanism for vital cell-fate decisions and other cellular processes. However, discovering specific kinase-substrate relationships is timeconsuming and often rather serendipitous. Computational predictions alleviate these challenges, but the current approaches suffer from limitations like restricted kinome coverage and inaccuracy. They also typically utilise only local features without reflecting broader interaction context. To address these limitations, we have developed an alternative predictive model. It uses statistical relational learning on top of phosphorylation networks interpreted as knowledge graphs, a simple yet robust model for representing networked knowledge. Compared to a representative selection of six existing systems, our model has the highest kinome coverage and produces biologically valid high-confidence predictions not possible with the other tools. Specifically, we have experimentally validated predictions of previously unknown phosphorylations by the LATS1, AKT1, PKA and MST2 kinases in human. Thus, our tool is useful for focusing phosphoproteomic experiments, and facilitates the discovery of new phosphorylation reactions. Our model can be accessed publicly via an easy-to-use web interface (LinkPhinder).Author SummaryLinkPhinder is a new approach to prediction of protein signalling networks based on kinase-substrate relationships that outperforms existing approaches. Phosphorylation networks govern virtually all fundamental biochemical processes in cells, and thus have moved into the centre of interest in biology, medicine and drug development. Fundamentally different from current approaches, LinkPhinder is inherently network-based and makes use of the most recent AI de-velopments. We represent existing phosphorylation data as knowledge graphs, a format for large-scale and robust knowledge representation. Training a link prediction model on such a structure leads to novel, biologically valid phosphorylation network predictions that cannot be made with competing tools. Thus our new conceptual approach can lead to establishing a new niche of AI applications in computational biology.

Download Full-text

A Survey of RDF Stores & SPARQL Engines for Querying Knowledge Graphs

10.20944/preprints202104.0199.v1 ◽

2021 ◽

Author(s):

Waqas Ali ◽

Muhammad Saleem ◽

Yao Bin ◽

Aidan Hogan ◽

A.-C. Ngonga Ngomo

Keyword(s):

Data Model ◽

Incomplete Data ◽

Large Scale ◽

Query Language ◽

Real World Data ◽

Diverse Range ◽

Survey Paper ◽

Rdf Graph ◽

Knowledge Graphs ◽

Rdf Graphs

Recent years have seen the growing adoption of non-relational data models for representing diverse, incomplete data. Among these, the RDF graph-based data model has seen ever-broadening adoption, particularly on the Web. This adoption has prompted the standardization of the SPARQL query language for RDF, as well as the development of a variety of local and distributed engines for processing queries over RDF graphs. These engines implement a diverse range of specialized techniques for storage, indexing, and query processing. A number of benchmarks, based on both synthetic and real-world data, have also emerged to allow for contrasting the performance of different query engines, often at large scale. This survey paper draws together these developments, providing a comprehensive review of the techniques, engines and benchmarks for querying RDF knowledge graphs.

Download Full-text

Knowledge Graphs

Biodiversity Information Science and Standards ◽

10.3897/biss.5.73796 ◽

2021 ◽

Vol 5 ◽

Author(s):

Roderic Page

Keyword(s):

Knowledge Management ◽

Large Scale ◽

Personal Knowledge ◽

Knowledge Graph ◽

Specific Knowledge ◽

Management Tools ◽

Global Projects ◽

Knowledge Graphs ◽

Constructing Knowledge ◽

Knowledge Management Tools

Knowledge graphs embody the idea of "everything connected to everything else." As attractive as this seems, there is a substantial gap between the dream of fully interconnected knowledge and the reality of data that is still mostly siloed, or weakly connected by shared strings such as taxonomic names. How do we move forward? Do we focus on building our own domain- or project-specific knowledge graphs, or do we engage with global projects such as Wikidata? Do we construct knowledge graphs, or focus on making our data "knowledge graph ready" by adopting structured markup in the hope that knowledge graphs will spontaneously self-assemble from that data? Do we focus on large-scale, database-driven projects (e.g., triple stores in the cloud), or do we rely on more localised and distributed approaches, such as annotations (e.g., hypothes.is), "content-hash" systems where a cryptographic hash of the data is also its identifier (Elliott et al. 2020), or the growing number of personal knowledge management tools (e.g., Roam, Obsidian, LogSeq)? This talk will share experiences (the good, bad, and the ugly) as I have tried to transition from naïve advocacy to constructing knowledge graphs (Page 2019), or participating in their construction (Page 2021).

Download Full-text

End-to-End Argumentation Knowledge Graph Construction

Proceedings of the AAAI Conference on Artificial Intelligence ◽

10.1609/aaai.v34i05.6231 ◽

2020 ◽

Vol 34 (05) ◽

pp. 7367-7374

Author(s):

Khalid Al-Khatib ◽

Yufang Hou ◽

Henning Wachsmuth ◽

Charles Jochim ◽

Francesca Bonin ◽

...

Keyword(s):

Large Scale ◽

Question Answering ◽

Knowledge Graph ◽

Exploratory Search ◽

Text Generation ◽

Fake News ◽

High Quality ◽

Web Based ◽

Knowledge Graphs ◽

End To End

This paper studies the end-to-end construction of an argumentation knowledge graph that is intended to support argument synthesis, argumentative question answering, or fake news detection, among others. The study is motivated by the proven effectiveness of knowledge graphs for interpretable and controllable text generation and exploratory search. Original in our work is that we propose a model of the knowledge encapsulated in arguments. Based on this model, we build a new corpus that comprises about 16k manual annotations of 4740 claims with instances of the model's elements, and we develop an end-to-end framework that automatically identifies all modeled types of instances. The results of experiments show the potential of the framework for building a web-based argumentation graph that is of high quality and large scale.

Download Full-text

A Method to Learn Embedding of a Probabilistic Medical Knowledge Graph: Algorithm Development (Preprint)

10.2196/preprints.17645 ◽

2019 ◽

Author(s):

Linfeng Li ◽

Peng Wang ◽

Yao Wang ◽

Shenghui Wang ◽

Jun Yan ◽

...

Keyword(s):

Medical Records ◽

Large Scale ◽

Semantic Representation ◽

Medical Knowledge ◽

Mapping Function ◽

Graph Algorithm ◽

Knowledge Graph ◽

Knowledge Graphs ◽

Representation Method ◽

Better Than

BACKGROUND Knowledge graph embedding is an effective semantic representation method for entities and relations in knowledge graphs. Several translation-based algorithms, including TransE, TransH, TransR, TransD, and TranSparse, have been proposed to learn effective embedding vectors from typical knowledge graphs in which the relations between head and tail entities are deterministic. However, in medical knowledge graphs, the relations between head and tail entities are inherently probabilistic. This difference introduces a challenge in embedding medical knowledge graphs. OBJECTIVE We aimed to address the challenge of how to learn the probability values of triplets into representation vectors by making enhancements to existing TransX (where X is E, H, R, D, or Sparse) algorithms, including the following: (1) constructing a mapping function between the score value and the probability, and (2) introducing probability-based loss of triplets into the original margin-based loss function. METHODS We performed the proposed PrTransX algorithm on a medical knowledge graph that we built from large-scale real-world electronic medical records data. We evaluated the embeddings using link prediction task. RESULTS Compared with the corresponding TransX algorithms, the proposed PrTransX performed better than the TransX model in all evaluation indicators, achieving a higher proportion of corrected entities ranked in the top 10 and normalized discounted cumulative gain of the top 10 predicted tail entities, and lower mean rank. CONCLUSIONS The proposed PrTransX successfully incorporated the uncertainty of the knowledge triplets into the embedding vectors.

Download Full-text

Guided Generation of Cause and Effect

Proceedings of the Twenty-Ninth International Joint Conference on Artificial Intelligence ◽

10.24963/ijcai.2020/502 ◽

2020 ◽

Author(s):

Zhongyang Li ◽

Xiao Ding ◽

Ting Liu ◽

J. Edward Hu ◽

Benjamin Van Durme

Keyword(s):

Causal Reasoning ◽

Large Scale ◽

State Of The Art ◽

Prior Work ◽

Text Generation ◽

Causal Knowledge ◽

High Quality ◽

Cause And Effect ◽

Knowledge Graphs ◽

Human Assessment

We present a conditional text generation framework that posits sentential expressions of possible causes and effects. This framework depends on two novel resources we develop in the course of this work: a very large-scale collection of English sentences expressing causal patterns (CausalBank); and a refinement over previous work on constructing large lexical causal knowledge graphs (Cause Effect Graph). Further, we extend prior work in lexically-constrained decoding to support disjunctive positive constraints. Human assessment conﬁrms that our approach gives high-quality and diverse outputs. Finally, we use CausalBank to perform continued training of an encoder supporting a recent state-of-the-art model for causal reasoning, leading to a 3-point improvement on the COPA challenge set, with no change in model architecture.

Download Full-text