An Overview of Biomolecular Event Extraction from Scientific Documents

Computational and Mathematical Methods in Medicine ◽

10.1155/2015/571381 ◽

2015 ◽

Vol 2015 ◽

pp. 1-19 ◽

Cited By ~ 2

Author(s):

Jorge A. Vanegas ◽

Sérgio Matos ◽

Fabio González ◽

José L. Oliveira

Keyword(s):

Natural Language ◽

Language Processing ◽

State Of The Art ◽

Biomedical Literature ◽

Event Extraction ◽

Automatic Extraction ◽

Biological Processes ◽

Scientific Texts ◽

Research Areas ◽

Current State

This paper presents a review of state-of-the-art approaches to automatic extraction of biomolecular events from scientific texts. Events involving biomolecules such as genes, transcription factors, or enzymes have a central role in biological processes and functions and provide valuable information for describing physiological and pathogenesis mechanisms. Event extraction from biomedical literature has a broad range of applications, including support for information retrieval, knowledge summarization, and information extraction and discovery. However, automatic event extraction is a challenging task due to the ambiguity and diversity of natural language and higher-level linguistic phenomena, such as speculations and negations, which occur in biological texts and can lead to misunderstanding or incorrect interpretation. Many strategies have been proposed in the last decade, originating from different research areas such as natural language processing, machine learning, and statistics. This review summarizes the most representative approaches in biomolecular event extraction and presents an analysis of the current state of the art and of commonly used methods, features, and tools. Finally, current research trends and future perspectives are also discussed.


Computational Analysis of Storylines

10.1017/9781108854221 ◽

2021 ◽

Keyword(s):

Computational Linguistics ◽

Language Processing ◽

Computational Analysis ◽

State Of The Art ◽

Relevant Information ◽

Event Extraction ◽

Multidisciplinary Research ◽

Narrative Structures ◽

Current State ◽

Event Representations

Event structures are central in Linguistics and Artificial Intelligence research: people can easily refer to changes in the world, identify their participants, distinguish relevant information, and have expectations of what can happen next. Part of this process is based on mechanisms similar to narratives, which are at the heart of information sharing. But it remains difficult to automatically detect events or automatically construct stories from such event representations. This book explores how to handle today's massive news streams and provides multidimensional, multimodal, and distributed approaches, like automated deep learning, to capture events and narrative structures involved in a 'story'. This overview of the current state of the art in event extraction, temporal and causal relations, and storyline extraction aims to establish a new multidisciplinary research community with a common terminology and research agenda. Graduate students and researchers in natural language processing, computational linguistics, and media studies will benefit from this book.


Natural language interfaces to databases

The Knowledge Engineering Review ◽

10.1017/s0269888900005476 ◽

1990 ◽

Vol 5 (4) ◽

pp. 225-249 ◽

Cited By ~ 52

Author(s):

Ann Copestake ◽

Karen Sparck Jones

Keyword(s):

Natural Language Processing ◽

Natural Language ◽

Language Processing ◽

State Of The Art ◽

Central Process ◽

Current State ◽

Natural Language Question ◽

The One ◽

Language Question ◽

And Task

This paper reviews the current state of the art in natural language access to databases. This has been a long-standing area of work in natural language processing. But though some commercial systems are now available, providing front ends has proved much harder than was expected, and the necessary limitations on front ends have to be recognized. The paper discusses the issues, both general to language and task-specific, involved in front end design, and the way these have been addressed, concentrating on the work of the last decade. The focus is on the central process of translating a natural language question into a database query, but other supporting functions are also covered. The points are illustrated by the use of a single example application. The paper concludes with an evaluation of the current state, indicating that future progress will depend on the one hand on general advances in natural language processing, and on the other on expanding the capabilities of traditional databases.


A State-of-the-Art Review of Nigerian Languages Natural Language Processing Research

Advances in IT Standards and Standardization Research - Developing Countries and Technology Inclusion in the 21st Century Information Society ◽

10.4018/978-1-7998-3468-7.ch008 ◽

2021 ◽

pp. 147-167

Author(s):

Toluwase Victor Asubiaro ◽

Ebelechukwu Gloria Igwe

Keyword(s):

Natural Language Processing ◽

Natural Language ◽

Language Processing ◽

Optical Character Recognition ◽

State Of The Art ◽

Resource Development ◽

African Languages ◽

Low Resource ◽

Research Areas ◽

Cross Lingual

African languages, including those that are native to Nigeria, are low-resource languages because they lack basic computing resources such as language-dependent hardware keyboards. Speakers of these low-resource languages are therefore unfairly deprived of information access on the internet. There is no information about the level of progress that has been made on the computation of Nigerian languages. Hence, this chapter presents a state-of-the-art review of natural language processing for Nigerian languages. The review reveals that only four Nigerian languages (Hausa, Ibibio, Igbo, and Yoruba) have been significantly studied in published NLP papers. Creating alternatives to hardware keyboards is one of the most popular research areas, and means such as automatic diacritics restoration, virtual keyboards, and optical character recognition have been explored. There was also an inclination towards speech and computational morphological analysis. Resource development and knowledge representation modeling of the languages using rapid resource development and cross-lingual methods are recommended.


Event Extraction from Biomedical Literature

10.1101/034397 ◽

2015 ◽

Cited By ~ 1

Author(s):

Abdur Rahman M.A. Basher ◽

Alexander S. Purdy ◽

Inanc Birol

Keyword(s):

Natural Language Processing ◽

Language Processing ◽

Extraction Methods ◽

Explanatory Models ◽

Biomedical Literature ◽

Event Extraction ◽

Automatic Extraction ◽

Biomedical Domain ◽

Shared Task ◽

Areas Of Interest

The breadth and scope of the biomedical literature hinder a timely and thorough comprehension of its content. PubMed, the leading repository for biomedical literature, currently holds over 26 million records, and is growing at a rate of over 1.2 million records per year, with about 300 records added daily that mention 'cancer' in the title or abstract. Natural language processing (NLP) can assist in accessing and interpreting this massive volume of literature, including its quality. NLP approaches to the automatic extraction of biomedical entities and relationships may assist the development of explanatory models that can comprehensively scan and summarize biomedical articles for end users. Users can also formulate structured queries against these entities, and their interactions, to mine the latest developments in related areas of interest. In this article, we explore the latest advances in automated event extraction methods in the biomedical domain, focusing primarily on tools that participated in the Biomedical NLP (BioNLP) Shared Task (ST) competitions. We review the leading BioNLP methods and summarize their results and their innovative contributions in this field.


Meemi: A simple method for post-processing and integrating cross-lingual word embeddings

Natural Language Engineering ◽

10.1017/s1351324921000280 ◽

2021 ◽

pp. 1-23

Author(s):

Yerai Doval ◽

Jose Camacho-Collados ◽

Luis Espinosa-Anke ◽

Steven Schockaert

Keyword(s):

Natural Language ◽

Language Processing ◽

State Of The Art ◽

Orthogonal Transformation ◽

Word Embeddings ◽

Initial Alignment ◽

Simple Method ◽

Word Similarity ◽

Current State ◽

Cross Lingual

Word embeddings have become a standard resource in the toolset of any Natural Language Processing practitioner. While monolingual word embeddings encode information about words in the context of a particular language, cross-lingual embeddings define a multilingual space where word embeddings from two or more languages are integrated together. Current state-of-the-art approaches learn these embeddings by aligning two disjoint monolingual vector spaces through an orthogonal transformation which preserves the structure of the monolingual counterparts. In this work, we propose to apply an additional transformation after this initial alignment step, which aims to bring the vector representations of a given word and its translations closer to their average. Since this additional transformation is non-orthogonal, it also affects the structure of the monolingual spaces. We show that our approach improves both the integration of the monolingual spaces and the quality of the monolingual spaces themselves. Furthermore, because our transformation can be applied to an arbitrary number of languages, we are able to effectively obtain a truly multilingual space. The resulting (monolingual and multilingual) spaces show consistent gains over the current state of the art in standard intrinsic tasks, namely dictionary induction and word similarity, as well as in extrinsic tasks such as cross-lingual hypernym discovery and cross-lingual natural language inference.


In vivo interactome profiling by enzyme‐catalyzed proximity labeling

Cell & Bioscience ◽

10.1186/s13578-021-00542-3 ◽

2021 ◽

Vol 11 (1) ◽

Author(s):

Yangfan Xu ◽

Xianqun Fan ◽

Yang Hu

Keyword(s):

State Of The Art ◽

Catalytic Efficiency ◽

Biological Processes ◽

Protein Protein Interaction ◽

Current State ◽

Protein Interactome ◽

Potential Applications ◽

Protein Protein Interaction Networks ◽

Temporal And Spatial

Enzyme-catalyzed proximity labeling (PL) combined with mass spectrometry (MS) has emerged as a revolutionary approach to reveal the protein-protein interaction networks, dissect complex biological processes, and characterize the subcellular proteome in a more physiological setting than before. The enzymatic tags are being upgraded to improve temporal and spatial resolution and obtain faster catalytic dynamics and higher catalytic efficiency. In vivo application of PL integrated with other state-of-the-art techniques has recently been adapted in live animals and plants, allowing questions to be addressed that were previously inaccessible. It is timely to summarize the current state of PL-dependent interactome studies and their potential applications. We will focus on in vivo uses of newer versions of PL and highlight critical considerations for successful in vivo PL experiments that will provide novel insights into the protein interactome in the context of human diseases.


Report on the 4th Joint Workshop on Bibliometric-Enhanced Information Retrieval and Natural Language Processing for Digital Libraries at SIGIR 2019

ACM SIGIR Forum ◽

10.1145/3458553.3458554 ◽

2019 ◽

Vol 53 (2) ◽

pp. 3-10

Author(s):

Muthu Kumar Chandrasekaran ◽

Philipp Mayr

Keyword(s):

Information Retrieval ◽

Natural Language Processing ◽

Natural Language ◽

Research And Development ◽

Language Processing ◽

Digital Libraries ◽

State Of The Art ◽

Shared Task ◽

Processing Information ◽

Joint Workshop

The 4th joint BIRNDL workshop was held at the 42nd ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR 2019) in Paris, France. BIRNDL 2019 intended to stimulate IR researchers and digital library professionals to elaborate on new approaches in natural language processing, information retrieval, scientometrics, and recommendation techniques that can advance the state of the art in scholarly document understanding, analysis, and retrieval at scale. The workshop incorporated different paper sessions and the 5th edition of the CL-SciSumm Shared Task.


Automatic Extraction and Classification of Patients’ Smoking Status from Free Text Using Natural Language Processing

Value in Health ◽

10.1016/j.jval.2016.09.158 ◽

2016 ◽

Vol 19 (7) ◽

pp. A373

Author(s):

A Caccamisi ◽

L Jörgensen ◽

H Dalianis ◽

M Rosenlund

Keyword(s):

Natural Language Processing ◽

Natural Language ◽

Language Processing ◽

Smoking Status ◽

Free Text ◽

Automatic Extraction


Automatic Extraction of Major Osteoporotic Fractures from Radiology Reports using Natural Language Processing

2018 IEEE International Conference on Healthcare Informatics Workshop (ICHI-W) ◽

10.1109/ichi-w.2018.00021 ◽

2018 ◽

Author(s):

Yanshan Wang ◽

Saeed Mehrabi ◽

Sunghwan Sohn ◽

Elizabeth Atkinson ◽

Shreyasee Amin ◽

...

Keyword(s):

Natural Language Processing ◽

Natural Language ◽

Language Processing ◽

Osteoporotic Fractures ◽

Automatic Extraction ◽

Radiology Reports ◽

Major Osteoporotic Fractures


Textual entailment graphs

Natural Language Engineering ◽

10.1017/s1351324915000108 ◽

2015 ◽

Vol 21 (5) ◽

pp. 699-724 ◽

Cited By ~ 6

Author(s):

LILI KOTLERMAN ◽

IDO DAGAN ◽

BERNARDO MAGNINI ◽

LUISA BENTIVOGLI

Keyword(s):

Natural Language Processing ◽

Natural Language ◽

Language Processing ◽

Gold Standard ◽

State Of The Art ◽

Text Analytics ◽

Joint Work ◽

Gold Standard Dataset ◽

Textual Entailment ◽

Interesting Task

In this work, we present a novel type of graph for natural language processing (NLP), namely textual entailment graphs (TEGs). We describe the complete methodology we developed for the construction of such graphs and provide some baselines for this task by evaluating relevant state-of-the-art technology. We situate our research in the context of text exploration, since it was motivated by joint work with industrial partners in the text analytics area. Accordingly, we present our motivating scenario and the first gold-standard dataset of TEGs. However, while our own motivation and the dataset focus on the text exploration setting, we suggest that TEGs can have other uses and that the automatic creation of such graphs is an interesting task for the community.
