Dependency Parsing-based Entity Relation Extraction over Chinese Complex Text

Author(s):  
Shanshan Qi ◽  
Limin Zheng ◽  
Feiyu Shang

Open Relation Extraction (ORE) plays a significant role in the field of Information Extraction. It breaks the limitation that traditional relation extraction must pre-define relational types in the annotated corpus and specific domains restrictions, to realize the goal of extracting entities and the relation between entities in the open domain. However, with the increase of sentence complexity, the precision and recall of Entity Relation Extraction will be significantly reduced. To solve this problem, we present an unsupervised Clause_CORE method based on Chinese grammar and dependency parsing features. Clause_CORE is used for complex sentences processing, including decomposing complex sentence and dynamically complementing sentence components, which can reduce sentences complexity and maintain the integrity of sentences at the same time. Then, we perform dependency parsing for complete sentences and implement open entity relation extraction based on the model constructed by Chinese grammar rules. The experimental results show that the performance of Clause_CORE method is better than that of other advanced Chinese ORE systems on Wikipedia and Sina news datasets, which proves the correctness and effectiveness of the method. The results on mixed datasets of news data and encyclopedia data prove the generalization and portability of the method.

2021 ◽  
Author(s):  
Yue Zhang ◽  
Zhang Bo ◽  
Rui Wang ◽  
Junjie Cao ◽  
Chen Li ◽  
...  

2021 ◽  
Author(s):  
Duc Thuan Vo

Information Extraction (IE) is one of the challenging tasks in natural language processing. The goal of relation extraction is to discover the relevant segments of information in large numbers of textual documents such that they can be used for structuring data. IE aims at discovering various semantic relations in natural language text and has a wide range of applications such as question answering, information retrieval, knowledge presentation, among others. This thesis proposes approaches for relation extraction with clause-based Open Information Extraction that use linguistic knowledge to capture a variety of information including semantic concepts, words, POS tags, shallow and full syntax, dependency parsing in rich syntactic and semantic structures.<div>Within the plethora of Open Information Extraction that focus on the use of syntactic and dependency parsing for the purposes of detecting relations, incoherent and uninformative relation extractions can still be found. The extracted relations can be erroneous at times and fail to have a meaningful interpretation. As such, we first propose refinements to the grammatical structure of syntactic and dependency parsing with clause structures and clause types in an effort to generate propositions that can be deemed as meaningful extractable relations. Second, considering that choosing the most efficient seeds are pivotal to the success of the bootstrapping process when extracting relations, we propose an extended clause-based pattern extraction method with selftraining for unsupervised relation extraction. The proposed self-training algorithm relies on the clause-based approach to extract a small set of seed instances in order to identify and derive new patterns. Third, we employ matrix factorization and collaborative filtering for relation extraction. To avoid the need for manually predefined schemas, we employ the notion of universal schemas that is formed as a collection of patterns derived from Open Information Extraction tools as well as from relation schemas of pre-existing datasets. While previous systems have trained relations only for entities, we exploit advanced features from relation characteristics such as clause types and semantic topics for predicting new relation instances. Finally, we present an event network representation for temporal and causal event relation extraction that benefits from existing Open IE systems to generate a set of triple relations that are then used to build an event network. The event network is bootstrapped by labeling the temporal and causal disposition of events that are directly linked to each other. The event network can be systematically traversed to identify temporal and causal relations between indirectly connected events. <br></div>


2019 ◽  
Vol 69 ◽  
pp. 00081 ◽  
Author(s):  
Olga Nikolenko ◽  
Olga Zakharchu ◽  
Larisa Babakova ◽  
Boris Morenko

What makes this study topical: the urgency of the problem under consideration is due to the existing need for structural and semantic analysis of complex sentences (CS) with homogeneously collateral subordination of clauses, in different functional styles of speech and language. Our study is directed towards revealing the ability of syntaxemes with homogeneously collateral subordination to render hidden meanings of the author‘s ‘I’ and to thereby affect the reader/hearer. The cornerstone research method in this study is direct observation of language phenomena with generous borrowings from transformational analysis; it allows us to assert that the multi-component sentences under scrutiny here possess powerful expressive potential and can better than any other render additional information, thereby giving a strongly suggestive focus to an utterance or statement. In this paper, for the first time in the history of linguistics, we reveal how CS with homogeneously collateral subordination of sub clauses work in all functional styles. We also define cognitive boundaries within which takes place the choice between such multi-component structures in the process of language activity, with concern for how the ‘I’ of the author affects the addressee.


2021 ◽  
Author(s):  
Duc Thuan Vo

Information Extraction (IE) is one of the challenging tasks in natural language processing. The goal of relation extraction is to discover the relevant segments of information in large numbers of textual documents such that they can be used for structuring data. IE aims at discovering various semantic relations in natural language text and has a wide range of applications such as question answering, information retrieval, knowledge presentation, among others. This thesis proposes approaches for relation extraction with clause-based Open Information Extraction that use linguistic knowledge to capture a variety of information including semantic concepts, words, POS tags, shallow and full syntax, dependency parsing in rich syntactic and semantic structures.<div>Within the plethora of Open Information Extraction that focus on the use of syntactic and dependency parsing for the purposes of detecting relations, incoherent and uninformative relation extractions can still be found. The extracted relations can be erroneous at times and fail to have a meaningful interpretation. As such, we first propose refinements to the grammatical structure of syntactic and dependency parsing with clause structures and clause types in an effort to generate propositions that can be deemed as meaningful extractable relations. Second, considering that choosing the most efficient seeds are pivotal to the success of the bootstrapping process when extracting relations, we propose an extended clause-based pattern extraction method with selftraining for unsupervised relation extraction. The proposed self-training algorithm relies on the clause-based approach to extract a small set of seed instances in order to identify and derive new patterns. Third, we employ matrix factorization and collaborative filtering for relation extraction. To avoid the need for manually predefined schemas, we employ the notion of universal schemas that is formed as a collection of patterns derived from Open Information Extraction tools as well as from relation schemas of pre-existing datasets. While previous systems have trained relations only for entities, we exploit advanced features from relation characteristics such as clause types and semantic topics for predicting new relation instances. Finally, we present an event network representation for temporal and causal event relation extraction that benefits from existing Open IE systems to generate a set of triple relations that are then used to build an event network. The event network is bootstrapped by labeling the temporal and causal disposition of events that are directly linked to each other. The event network can be systematically traversed to identify temporal and causal relations between indirectly connected events. <br></div>


Author(s):  
Margreet Vogelzang ◽  
Christiane M. Thiel ◽  
Stephanie Rosemann ◽  
Jochem W. Rieger ◽  
Esther Ruigendijk

Purpose Adults with mild-to-moderate age-related hearing loss typically exhibit issues with speech understanding, but their processing of syntactically complex sentences is not well understood. We test the hypothesis that listeners with hearing loss' difficulties with comprehension and processing of syntactically complex sentences are due to the processing of degraded input interfering with the successful processing of complex sentences. Method We performed a neuroimaging study with a sentence comprehension task, varying sentence complexity (through subject–object order and verb–arguments order) and cognitive demands (presence or absence of a secondary task) within subjects. Groups of older subjects with hearing loss ( n = 20) and age-matched normal-hearing controls ( n = 20) were tested. Results The comprehension data show effects of syntactic complexity and hearing ability, with normal-hearing controls outperforming listeners with hearing loss, seemingly more so on syntactically complex sentences. The secondary task did not influence off-line comprehension. The imaging data show effects of group, sentence complexity, and task, with listeners with hearing loss showing decreased activation in typical speech processing areas, such as the inferior frontal gyrus and superior temporal gyrus. No interactions between group, sentence complexity, and task were found in the neuroimaging data. Conclusions The results suggest that listeners with hearing loss process speech differently from their normal-hearing peers, possibly due to the increased demands of processing degraded auditory input. Increased cognitive demands by means of a secondary visual shape processing task influence neural sentence processing, but no evidence was found that it does so in a different way for listeners with hearing loss and normal-hearing listeners.


Sign in / Sign up

Export Citation Format

Share Document