Enriching Knowledge Base by Parse Tree Pattern and Semantic Filter

2020 ◽  
Vol 10 (18) ◽  
pp. 6209
Author(s):  
Hee-Geun Yoon ◽  
Seyoung Park ◽  
Seong-Bae Park

This paper proposes a simple knowledge base enrichment based on parse tree patterns with a semantic filter. Parse tree patterns are superior to lexical patterns used commonly in many previous studies in that they can manage long distance dependencies among words. In addition, the proposed semantic filter, which is a combination of WordNet-based similarity and word embedding similarity, removes parse tree patterns that are semantically irrelevant to the meaning of a target relation. According to our experiments using the DBpedia ontology and Wikipedia corpus, the average accuracy of the top 100 parse tree patterns for ten relations is 68%, which is 16% higher than that of lexical patterns, and the average accuracy of the newly extracted triples is 60.1%. These results prove that the proposed method produces more relevant patterns for the relations of seed knowledge, and thus more accurate triples are generated by the patterns.
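The abstract does not say how the WordNet-based similarity and the word embedding similarity are combined, so the following is only a minimal sketch under stated assumptions: a weighted average of the two scores (`alpha`) and a cut-off (`threshold`), both hypothetical. The WordNet score is passed in as a precomputed number, and the vectors are toy values rather than real embeddings.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def semantic_filter(patterns, relation_vec, alpha=0.5, threshold=0.5):
    """Keep patterns whose combined similarity to the target relation is high enough.

    patterns: iterable of (name, wordnet_similarity, embedding_vector).
    alpha and threshold are illustrative assumptions, not values from the paper.
    """
    kept = []
    for name, wordnet_sim, vec in patterns:
        score = alpha * wordnet_sim + (1 - alpha) * cosine(vec, relation_vec)
        if score >= threshold:
            kept.append((name, score))
    # Highest-scoring patterns first.
    return sorted(kept, key=lambda item: item[1], reverse=True)


# Toy data: one pattern close to the relation, one far from it.
toy_relation_vec = [1.0, 0.0]
toy_patterns = [
    ("pattern_close", 0.9, [1.0, 0.0]),
    ("pattern_far", 0.1, [0.0, 1.0]),
]
filtered = semantic_filter(toy_patterns, toy_relation_vec)
```

With these toy inputs only `pattern_close` survives the filter; in practice the two similarity scores would come from a WordNet measure and a trained embedding model.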

2011 ◽  
Vol 95 (1) ◽  
pp. 87-106 ◽  
Author(s):  
Bushra Jawaid ◽  
Daniel Zeman

Word-Order Issues in English-to-Urdu Statistical Machine Translation

We investigate phrase-based statistical machine translation between English and Urdu, two Indo-European languages that differ significantly in their word-order preferences. Reordering of words and phrases is thus a necessary part of the translation process. While local reordering is modeled nicely by phrase-based systems, long-distance reordering is known to be a hard problem. We perform experiments using the Moses SMT system and discuss the reordering models available in Moses. We then present our novel, Urdu-aware, yet generalizable approach based on reordering phrases in the syntactic parse tree of the source English sentence. Our technique significantly improves the quality of English-Urdu translation with Moses, both in terms of BLEU score and of subjective human judgments.


2019 ◽  
Vol 2019 ◽  
pp. 1-10 ◽  
Author(s):  
Jun Li ◽  
Guimin Huang ◽  
Jianheng Chen ◽  
Yabing Wang

Relation extraction is a critical underlying task of textual understanding. However, existing methods have defects in instance selection and lack background knowledge for entity recognition. In this paper, we propose a knowledge-based attention model, which can make full use of supervised information from a knowledge base, to select an entity. We also design a dual convolutional neural network (CNN) method, considering that the word embedding of each word is limited when only a single training tool is used. The proposed model combines a CNN with an attention mechanism. The model feeds the word embedding and the supervised information from the knowledge base into the CNN, performs convolution and pooling, and combines the knowledge base and CNN in the fully connected layer. Based on these processes, the model not only obtains better entity representations but also improves the performance of relation extraction with the help of rich background knowledge. The experimental results demonstrate that the proposed model achieves competitive performance.


2017 ◽  
Author(s):  
Fanqing Meng ◽  
Wenpeng Lu ◽  
Yuteng Zhang ◽  
Ping Jian ◽  
Shumin Shi ◽  
...  

2014 ◽  
Vol 556-562 ◽  
pp. 6281-6285
Author(s):  
Zhen Le Wu ◽  
Ying Li ◽  
Yong Bin Wang ◽  
Yan Jiao Zang

Ontology matching is the task of finding alignments between two different ontologies. It has become a key step in building knowledge bases and integrating heterogeneous data. In this paper, a novel ontology matching approach based on continual word embedding is proposed. We describe in detail how the skip-gram model is adapted to capture the semantics of words and learn the word embedding. After computing the name similarity of concepts, the similarity flooding algorithm is used to correct the initial similarity. Experiments on the Ontology Alignment Evaluation Initiative (OAEI) benchmark without instances show that the proposed method significantly improves the quality of mappings.


Author(s):  
Heyoung Yang ◽  
Eunsoo Sohn

A better understanding of the clinical characteristics of coronavirus disease 2019 (COVID-19) is urgently required to address this health crisis. Numerous researchers and pharmaceutical companies are working on developing vaccines and treatments; however, a clear solution has yet to be found. The current study proposes the use of artificial intelligence methods to comprehend biomedical knowledge and infer the characteristics of COVID-19. A biomedical knowledge base was established via FastText, a word embedding technique, using PubMed literature from the past decade. Subsequently, a new knowledge base was created using recently published COVID-19 articles. Using this newly constructed knowledge base from the word embedding model, a list of anti-infective drugs and proteins of either human or coronavirus origin were inferred to be related, because they are located close to COVID-19 on the knowledge base. This study attempted to form a method to quickly infer related information about COVID-19 using the existing knowledge base, before sufficient knowledge about COVID-19 is accumulated. With COVID-19 not completely overcome, machine learning-based research in the PubMed literature will provide a broad guideline for researchers and pharmaceutical companies working on treatments for COVID-19.
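The idea of inferring related terms because they are "located close to COVID-19" in the embedding space can be illustrated with a nearest-neighbour lookup by cosine similarity. This is a minimal sketch, assuming the embeddings are available as a plain dictionary; the vectors and the names `term_a`, `term_b`, `term_c` are placeholders, not results from the study, and no FastText model is loaded here.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def nearest_terms(query, vectors, k=3):
    """Return the k terms whose vectors are most similar to the query term's vector."""
    query_vec = vectors[query]
    scored = [
        (term, cosine(query_vec, vec))
        for term, vec in vectors.items()
        if term != query
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Placeholder vectors standing in for trained word embeddings.
toy_vectors = {
    "covid-19": [0.9, 0.1, 0.0],
    "term_a": [0.8, 0.2, 0.1],
    "term_b": [0.0, 1.0, 0.0],
    "term_c": [0.1, 0.0, 1.0],
}
neighbours = nearest_terms("covid-19", toy_vectors, k=2)
```

A real pipeline would replace `toy_vectors` with vectors exported from the trained embedding model and would likely restrict candidates to drug or protein vocabularies.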


2012 ◽  
Vol 9 (3) ◽  
pp. 1125-1153
Author(s):  
J. Travnícek ◽  
J. Janousek ◽  
B. Melichar

Trees are one of the fundamental data structures used in Computer Science. We present a new kind of acyclic pushdown automata, the tree pattern pushdown automaton and the nonlinear tree pattern pushdown automaton, constructed for an ordered tree. These automata accept all tree patterns and nonlinear tree patterns, respectively, which match the tree and represent a full index of the tree for such patterns. Given a tree with n nodes, the numbers of these distinct tree patterns and nonlinear tree patterns can be at most 2^(n-1) + n and at most (2+v)^(n-1) + 2, respectively, where v is the maximal number of nonlinear variables allowed in nonlinear tree patterns. The total sizes of nondeterministic versions of the two pushdown automata are O(n) and O(n^2), respectively. We discuss the time complexities and show timings of our implementations using the bit-parallelism technique. The timings show that for a given tree the running time is linear in the size of the input pattern.
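To make concrete what it means for a tree pattern to match an ordered tree, here is a naive recursive matcher. It is not the pushdown-automaton index described in the abstract, only a baseline sketch under assumptions: trees are `(label, children)` tuples, and a leaf labelled `"S"` in the pattern is taken as a wildcard that matches any subtree (the wildcard symbol and the representation are illustrative choices).

```python
WILDCARD = "S"  # assumed placeholder symbol: matches any whole subtree


def matches_at(pattern, node):
    """True if the pattern matches the subtree rooted at node."""
    p_label, p_children = pattern
    if p_label == WILDCARD and not p_children:
        return True
    n_label, n_children = node
    if p_label != n_label or len(p_children) != len(n_children):
        return False
    # Ordered tree: children are compared position by position.
    return all(matches_at(p, c) for p, c in zip(p_children, n_children))


def count_matches(pattern, tree):
    """Count the nodes of tree at which the pattern matches."""
    here = 1 if matches_at(pattern, tree) else 0
    return here + sum(count_matches(pattern, child) for child in tree[1])


# Toy tree a(b, a(b, c)) and pattern a(b, S).
toy_tree = ("a", (("b", ()), ("a", (("b", ()), ("c", ())))))
toy_pattern = ("a", (("b", ()), (WILDCARD, ())))
occurrences = count_matches(toy_pattern, toy_tree)
```

This matcher revisits the subject tree for every query, whereas an index such as the one in the abstract is built once from the tree and then queried with patterns.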


Author(s):  
Sahisnu Mazumder ◽  
Bing Liu

Knowledge base (KB) completion aims to infer missing facts from existing ones in a KB. Among various approaches, path ranking (PR) algorithms have received increasing attention in recent years. PR algorithms enumerate paths between entity pairs in a KB and use those paths as features to train a model for missing fact prediction. Because PR algorithms perform well and yield highly interpretable models, several such methods have been proposed. However, most existing methods suffer from scalability (high RAM consumption) and feature explosion (training on an exponentially large number of features) problems. This paper proposes a Context-aware Path Ranking (C-PR) algorithm to solve these problems by introducing a selective path exploration strategy. C-PR learns global semantics of entities in the KB using word embedding and leverages the knowledge of entity semantics to enumerate contextually relevant paths using bidirectional random walk. Experimental results on three large KBs show that the path features (fewer in number) discovered by C-PR not only improve predictive performance but also are more interpretable than existing baselines.
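The basic path ranking step, enumerating relation paths between an entity pair so they can serve as features, can be sketched as an exhaustive depth-limited search. This is the plain enumeration that the abstract says scales poorly, not the selective C-PR strategy; the triples, the `^-1` suffix for inverse edges, and the `max_len` parameter are illustrative assumptions.

```python
from collections import defaultdict


def build_graph(triples):
    """Index (head, relation, tail) triples as an adjacency list with inverse edges."""
    graph = defaultdict(list)
    for head, relation, tail in triples:
        graph[head].append((relation, tail))
        graph[tail].append((relation + "^-1", head))
    return graph


def relation_paths(graph, source, target, max_len=3):
    """Return the set of relation sequences linking source to target (simple paths only)."""
    paths = set()

    def dfs(node, path, visited):
        if node == target and path:
            paths.add(tuple(path))
            return
        if len(path) == max_len:
            return
        for relation, neighbour in graph.get(node, ()):
            if neighbour not in visited:
                dfs(neighbour, path + [relation], visited | {neighbour})

    dfs(source, [], {source})
    return paths


# Toy knowledge base (made-up facts for illustration only).
toy_triples = [
    ("alice", "born_in", "paris"),
    ("paris", "capital_of", "france"),
    ("alice", "citizen_of", "france"),
]
toy_graph = build_graph(toy_triples)
features = relation_paths(toy_graph, "alice", "france", max_len=2)
```

Each returned tuple would become one binary or weighted feature for a classifier that predicts whether a given relation holds between the pair; the number of such tuples grows quickly with `max_len`, which is the feature explosion the abstract refers to.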


10.37236/2099 ◽  
2012 ◽  
Vol 19 (3) ◽  
Author(s):  
Michael Dairyko ◽  
Lara Pudwell ◽  
Samantha Tyner ◽  
Casey Wynn

In this paper we consider the enumeration of binary trees avoiding non-contiguous binary tree patterns. We begin by computing closed formulas for the number of trees avoiding a single binary tree pattern with 4 or fewer leaves and compare these results to analogous work for contiguous tree patterns. Next, we give an explicit generating function that counts binary trees avoiding a single non-contiguous tree pattern according to number of leaves and show that there is exactly one Wilf class of k-leaf tree patterns for any positive integer k. In addition, we give a bijection between certain sets of pattern-avoiding trees and sets of pattern-avoiding permutations. Finally, we enumerate binary trees that simultaneously avoid more than one tree pattern.


Author(s):  
James Cronshaw

Long distance transport in plants takes place in phloem tissue which has characteristic cells, the sieve elements. At maturity these cells have sieve areas in their end walls with specialized perforations. They are associated with companion cells, parenchyma cells, and in some species, with transfer cells. The protoplast of the functioning sieve element contains a high concentration of sugar, and consequently a high hydrostatic pressure, which makes it extremely difficult to fix mature sieve elements for electron microscopical observation without the formation of surge artifacts. Despite many structural studies which have attempted to prevent surge artifacts, several features of mature sieve elements, such as the distribution of P-protein and the nature of the contents of the sieve area pores, remain controversial.

