SVM Based Learning System for Information Extraction

Author(s):  
Yaoyong Li ◽  
Kalina Bontcheva ◽  
Hamish Cunningham
2004 ◽  
Vol 13 (04) ◽  
pp. 813-828 ◽  
Author(s):  
JING XIAO ◽  
TAT-SENG CHUA ◽  
JIMIN LIU

The ability to extract desired pieces of information from natural language texts is an important task with a growing number of potential applications. This paper presents a novel pattern rule induction learning system, GRID, which emphasizes the use of the global feature distribution across all training instances in order to make better decisions during rule induction. GRID uses chunks as contextual units instead of tokens, and incorporates features at the lexical, syntactic and semantic levels simultaneously. The features chosen in GRID are general, and they were applied successfully to both semi-structured text and free text. Our experimental results on several publicly available webpage corpora and the MUC-4 test set indicate that our approach is effective.


2002 ◽  
Vol 8 (2-3) ◽  
pp. 167-191 ◽  
Author(s):  
J. TURMO ◽  
H. RODRIGUEZ

The growing availability of textual sources has led to an increase in the use of automatic approaches to knowledge acquisition from textual data, as in Information Extraction (IE). Most IE systems use knowledge explicitly represented as sets of IE rules, which are usually acquired manually. Recently, however, the acquisition of this knowledge has been approached by applying a wide variety of Machine Learning (ML) techniques. Within this framework, new problems arise concerning how to select and annotate positive examples, and sometimes negative ones, in supervised approaches, and how to organize unsupervised or semi-supervised approaches. This paper presents a new IE-rule learning system that deals with these training set problems and describes a set of experiments that test this capability of the new learning approach.


2020 ◽  
Vol 34 (05) ◽  
pp. 9225-9232
Author(s):  
Wenya Wang ◽  
Sinno Jialin Pan

Information extraction (IE) aims to produce structured information from an input text, as in Named Entity Recognition and Relation Extraction. Various approaches to IE have been proposed, based on feature engineering or deep learning. However, most of them fail to associate the complex relationships inherent in the task itself, which has proven to be especially crucial. For example, the relation between two entities is highly dependent on their entity types. These dependencies can be regarded as complex constraints that can be efficiently expressed as logical rules. To combine such logic reasoning capabilities with the learning capabilities of deep neural networks, we propose to integrate logical knowledge in the form of first-order logic into a deep learning system, which can be trained jointly in an end-to-end manner. The integrated framework is able to enhance neural outputs with knowledge regularization via logic rules, and at the same time update the weights of logic rules to comply with the characteristics of the training data. We demonstrate the effectiveness and generalization of the proposed model on multiple IE tasks.


2016 ◽  
Vol 23 (3) ◽  
pp. 385-418 ◽  
Author(s):  
JAN KOCOŃ ◽  
MICHAŁ MARCIŃCZUK

A key challenge of Information Extraction in Natural Language Processing is the ability to recognise and classify temporal expressions (timexes). Such expressions are a crucial source of information about when something happens, how often something occurs or how long something lasts. Timexes extracted automatically from text play a major role in many Information Extraction systems, such as question answering or event recognition. We prepared a broad specification of Polish timexes, PLIMEX. It is based on the state-of-the-art annotation guidelines for English, mainly TIMEX2 and TIMEX3 (a part of TimeML, the Markup Language for Temporal and Event Expressions). We have expanded our specification with a description of the local meaning of timexes, based on the LTIMEX annotation guidelines for English. Temporal description supports further event identification and extends the event description model, focussing on anchoring events in time, event ordering and reasoning about the persistence of events. We prepared the specification, which is designed to address these issues, and we annotated all documents in the Polish Corpus of Wroclaw University of Technology (KPWr) using our annotation guidelines. We also adapted our Liner2 machine learning system to recognise Polish timexes, and we propose a two-phase method to select a subset of features for the Conditional Random Fields sequence labelling method. This article presents the whole process of corpus annotation, evaluation of inter-annotator agreement, extension of the Liner2 system with new features, and evaluation of the recognition models before and after feature selection, with an analysis of the statistical significance of the differences. Liner2 with the presented models is available as open source software under the GNU General Public License.


1981 ◽  
Vol 20 (03) ◽  
pp. 169-173
Author(s):  
J. Wagner ◽  
G. Pfurtscheller

The shape, latency and amplitude of changes in electrical brain activity related to a stimulus (Evoked Potential) depend both on the stimulus parameters and on the background EEG at the time of stimulation. An adaptive stimulation system capable of learning is introduced, whereby the subject is stimulated (e.g. with light) whenever the EEG power is subthreshold and minimal. Additionally, the system is conceived in such a way that a certain number of stimuli can be given within a particular time interval. In relation to this time criterion, the threshold specific to each subject is calculated at the beginning of the experiment (preprocessing) and adapted to the EEG power during the processing mode, to account for long-term fluctuations and trends in the EEG. The process of adaptation is directed by a table which contains the necessary correction numbers for the threshold. The experience gathered by the stimulation system is reflected in an automatic correction of this table. Because the corrected and improved table is stored after each experiment and is used as the starting table for the next experiment, the system "learns". The system introduced here can be used both for evoked response studies and for alpha-feedback experiments.


Author(s):  
T. A. Chernetskaya ◽  
N. A. Lebedeva

The article presents the experience of organizing distance learning on a mass scale in institutions of secondary general and vocational education from March to May 2020, in connection with the difficult epidemiological situation in Russia. It considers the possibilities of the 1C:Education system for organizing the educational process in a distance format and the peculiarities of organizing distance interaction in schools and colleges, summarizes the results of using the system, and gives examples of its successful use in specific educational organizations. Based on a questionnaire survey of users, a number of capabilities of the 1C:Education system have been identified that are essential for the full-fledged transfer of the educational process from full-time to distance learning. The nature and frequency of the use of electronic educational resources in various general education subjects in schools and colleges are analyzed, and the importance of having in a distance learning system not only a digital library of ready-made educational materials but also tools for creating original content is assessed. On the basis of an anonymized analysis of user actions in the system, a number of problems were identified that teachers and students faced during the emergency transition to distance learning.

