PLAN2L: a web tool for integrated text mining and literature-based bioentity relation extraction

The identification of disease genes from candidate regions is one of the most important tasks in bioinformatics research. Most approaches based on function annotations cannot be used to identify genes for diseases without any known pathogenic genes or related function annotations. The authors have built a new web tool, DGHunter, to predict genes associated with diseases that lack detailed function annotations. Its performance was tested with a set of 1506 genes involved in 1147 disease phenotypes derived from the morbid map table in the OMIM database. The results show that, on average, the target gene was in the top 13.60% of the ranked lists of candidates, and that the target gene ranked in the top 5% in 40.70% of cases. DGHunter can identify disease genes effectively for those diseases lacking sufficient function annotations.
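The two figures reported above (mean rank position and top-5% hit rate) are rank-based summaries. The following is a minimal sketch of how such summaries could be computed from ranked candidate lists. It is not DGHunter's code: the function names and the input layout (a list of ranked-list and target-gene pairs) are assumptions made for illustration.

```python
def rank_percentile(ranked_candidates, target):
    """Position of the target as a percentage of the list length (lower is better)."""
    position = ranked_candidates.index(target) + 1  # 1-based rank
    return 100.0 * position / len(ranked_candidates)


def summarize(cases, cutoff=5.0):
    """Return (mean rank percentile, % of cases with the target within `cutoff` percent).

    `cases` is a list of (ranked_candidates, target) pairs; this layout is hypothetical.
    """
    percentiles = [rank_percentile(ranked, target) for ranked, target in cases]
    mean_percentile = sum(percentiles) / len(percentiles)
    hits = sum(p <= cutoff for p in percentiles)
    return mean_percentile, 100.0 * hits / len(percentiles)
```

Under this reading, a mean of 13.60 would mean the target gene sits, on average, 13.60% of the way down its candidate list.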

Text mining and pattern clustering for relation extraction of breast cancer and related genes

2017 18th IEEE/ACIS International Conference on Software Engineering, Artificial Intelligence, Networking and Parallel/Distributed Computing (SNPD) ◽

10.1109/snpd.2017.8022701 ◽

2017 ◽

Cited By ~ 1

Author(s):

Koya Kawashima ◽

Wenjun Bai ◽

Changqin Quan

Keyword(s):

Breast Cancer ◽

Text Mining ◽

Relation Extraction ◽

Pattern Clustering

Building Text-mining Framework for Gene-Phenotype Relation Extraction using Deep Leaning

Proceedings of the ACM Ninth International Workshop on Data and Text Mining in Biomedical Informatics - DTMBIO '15 ◽

10.1145/2811163.2811165 ◽

2015 ◽

Author(s):

Dongjin Jang ◽

Jaehyun Lee ◽

Kwangmin Kim ◽

Doheon Lee

Keyword(s):

Text Mining ◽

Relation Extraction

Text Mining for Building Biomedical Networks Using Cancer as a Case Study

Biomolecules ◽

10.3390/biom11101430 ◽

2021 ◽

Vol 11 (10) ◽

pp. 1430

Author(s):

Sofia I. R. Conceição ◽

Francisco M. Couto

Keyword(s):

Text Mining ◽

Biological Networks ◽

Scientific Literature ◽

Real Life ◽

Relation Extraction ◽

Cancer Disease ◽

Network Building ◽

Automatic Information ◽

Notable Increase

In the assembly of biological networks it is important to provide reliable interactions in order to obtain the most accurate possible representation of real-life systems. Commonly, the data used to build a network comes from diverse high-throughput assays; however, most of the interaction data is available through scientific literature. This has become a challenge with the notable increase in scientific literature being published, as it is hard for human curators to track all recent discoveries without efficient tools that help them identify these interactions automatically. This challenge can be overcome by using text mining approaches, which are capable of extracting knowledge from scientific documents. One of the most important tasks in text mining for biological network building is relation extraction, which identifies relations between the entities of interest. Many interaction databases already use text mining systems, and the development of these tools will lead to more reliable networks, as well as the possibility of personalizing the networks by selecting the desired relations. This review focuses on different approaches to automatic information extraction from biomedical text that can be used to enhance existing networks or create new ones, such as state-of-the-art deep learning approaches, with cancer as a case study.

KinderMiner Web: a simple web tool for ranking pairwise associations in biomedical applications

F1000Research ◽

10.12688/f1000research.25523.1 ◽

2020 ◽

Vol 9 ◽

pp. 832

Author(s):

Finn Kuusisto ◽

Daniel Ng ◽

John Steill ◽

Ian Ross ◽

Miron Livny ◽

...

Keyword(s):

Text Mining ◽

Web Application ◽

Biomedical Applications ◽

Analysis Tool ◽

Trial And Error ◽

Web Tool ◽

Interactive Analysis ◽

Term List ◽

Public Resource ◽

The Web

Many important scientific discoveries require lengthy experimental processes of trial and error and could benefit from intelligent prioritization based on deep domain understanding. While exponential growth in the scientific literature makes it difficult to keep current in even a single domain, that same rapid growth in literature also presents an opportunity for automated extraction of knowledge via text mining. We have developed a web application implementation of the KinderMiner algorithm for proposing ranked associations between a list of target terms and a key phrase. Any key phrase and target term list can be used for biomedical inquiry. We built the web application around a text index derived from PubMed. It is the first publicly available implementation of the algorithm, is fast and easy to use, and includes an interactive analysis tool. The KinderMiner web application is a public resource offering scientists a cohesive summary of what is currently known about a particular topic within the literature, and helping them to prioritize experiments around that topic. It performs comparably to or better than similar state-of-the-art text mining tools, is more flexible, and can be applied to any biomedical topic of interest. It is also continually improving with quarterly updates to the underlying text index and through response to suggestions from the community. The web application is available at https://www.kinderminer.org.
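The abstract does not say how associations are scored, so the following is only a sketch of one plausible way to rank target terms against a key phrase. It assumes document-level co-occurrence counts scored with a one-sided Fisher's exact test. The function names, the substring matching, and the in-memory list of documents are all hypothetical and stand in for the PubMed-derived text index.

```python
from math import comb


def fisher_right_tail(both, term_only, key_only, neither):
    """One-sided Fisher's exact test: P(co-occurrence count >= both) under independence."""
    n_term = both + term_only
    n_key = both + key_only
    total = both + term_only + key_only + neither
    upper = min(n_term, n_key)
    tail = sum(
        comb(n_term, k) * comb(total - n_term, n_key - k)
        for k in range(both, upper + 1)
    )
    return tail / comb(total, n_key)


def rank_terms(documents, target_terms, key_phrase):
    """Rank target terms by how strongly they co-occur with the key phrase.

    Assumption: naive case-insensitive substring matching over a small document list.
    Returns (term, co-occurrence count, p-value) tuples, smallest p-value first.
    """
    docs = [d.lower() for d in documents]
    key = key_phrase.lower()
    scored = []
    for term in target_terms:
        t = term.lower()
        both = sum(t in d and key in d for d in docs)
        term_only = sum(t in d and key not in d for d in docs)
        key_only = sum(t not in d and key in d for d in docs)
        neither = len(docs) - both - term_only - key_only
        scored.append((term, both, fisher_right_tail(both, term_only, key_only, neither)))
    return sorted(scored, key=lambda row: row[2])
```

A call such as `rank_terms(abstracts, ["BRCA1", "insulin"], "breast cancer")` would return the terms ordered by p-value, which is the kind of ranked list the abstract describes.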

A Knowledge-Driven Approach to Extract Disease-Related Biomarkers from the Literature

BioMed Research International ◽

10.1155/2014/253128 ◽

2014 ◽

Vol 2014 ◽

pp. 1-11 ◽

Cited By ~ 31

Author(s):

À. Bravo ◽

M. Cases ◽

N. Queralt-Rosinach ◽

F. Sanz ◽

L. I. Furlong

Keyword(s):

Text Mining ◽

Named Entity Recognition ◽

Relation Extraction ◽

Recognition System ◽

Biomedical Literature ◽

Entity Recognition ◽

Scientific Publications ◽

Positive Ratio ◽

Related Information ◽

Mesh Terms

The biomedical literature represents a rich source of biomarker information. However, both the size of literature databases and their lack of standardization hamper the automatic exploitation of the information contained in these resources. Text mining approaches have proven to be useful for the exploitation of information contained in the scientific publications. Here, we show that a knowledge-driven text mining approach can exploit a large literature database to extract a dataset of biomarkers related to diseases covering all therapeutic areas. Our methodology takes advantage of the annotation of MEDLINE publications pertaining to biomarkers with MeSH terms, narrowing the search to specific publications and, therefore, minimizing the false positive ratio. It is based on a dictionary-based named entity recognition system and a relation extraction module. The application of this methodology resulted in the identification of 131,012 disease-biomarker associations between 2,803 genes and 2,751 diseases, and represents a valuable knowledge base for those interested in disease-related biomarkers. Additionally, we present a bibliometric analysis of the journals reporting biomarker related information during the last 40 years.
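The abstract names two components, a dictionary-based named entity recognition system and a relation extraction module, without giving their internals. The sketch below illustrates the general idea under two stated assumptions: entities are matched as whole-word dictionary entries, and a gene and a disease mentioned in the same sentence form a candidate association. The dictionaries, identifiers, and function names are hypothetical and are not taken from the paper.

```python
import re
from itertools import product


def find_entities(sentence, dictionary):
    """Return the concept IDs whose surface forms occur as whole words in the sentence."""
    found = set()
    for surface, concept_id in dictionary.items():
        pattern = r"\b" + re.escape(surface) + r"\b"
        if re.search(pattern, sentence, flags=re.IGNORECASE):
            found.add(concept_id)
    return found


def extract_pairs(text, gene_dict, disease_dict):
    """Pair every gene with every disease found in the same sentence (co-occurrence assumption)."""
    pairs = set()
    # Naive sentence splitter: break after ., ! or ? followed by whitespace.
    for sentence in re.split(r"(?<=[.!?])\s+", text):
        genes = find_entities(sentence, gene_dict)
        diseases = find_entities(sentence, disease_dict)
        pairs.update(product(genes, diseases))
    return pairs
```

Sentence-level co-occurrence is the simplest possible relation criterion; restricting the input to biomarker-annotated publications, as the abstract describes, is what would keep the false positive ratio down in such a setup.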

Identifying interactions between chemical entities in biomedical text

Journal of Integrative Bioinformatics ◽

10.1515/jib-2014-247 ◽

2014 ◽

Vol 11 (3) ◽

pp. 1-16 ◽

Cited By ~ 6

Author(s):

Andre Lamurias ◽

João D. Ferreira ◽

Francisco M. Couto

Keyword(s):

Named Entity Recognition ◽

Relation Extraction ◽

Ensemble Classifier ◽

Entity Recognition ◽

Support Vector ◽

Biomedical Text ◽

Web Tool ◽

Named Entity ◽

Vector Machines ◽

Chemical Named Entity Recognition

Summary: Interactions between chemical compounds described in biomedical text can be of great importance to drug discovery and design, as well as pharmacovigilance. We developed a novel system, “Identifying Interactions between Chemical Entities” (IICE), to identify chemical interactions described in text. Kernel-based Support Vector Machines first identify the interactions and then an ensemble classifier validates and classifies the type of each interaction. This relation extraction module was evaluated with the corpus released for the DDI Extraction task of SemEval 2013, obtaining results comparable to state-of-the-art methods for this type of task. We integrated this module with our chemical named entity recognition module and made the whole system available as a web tool at www.lasige.di.fc.ul.pt/webtools/iice.
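The two-stage design described above (detect an interaction first, then validate and type it with an ensemble) can be sketched as a small pipeline. This is an illustration of the control flow only, not the IICE implementation: the detector and the ensemble members are passed in as plain callables standing in for the trained SVMs and classifiers, and the dictionary keys, the "none" label, and the majority-vote rule are assumptions.

```python
from collections import Counter


def majority_vote(labels):
    """Return the most frequent label; ties go to the label seen first."""
    return Counter(labels).most_common(1)[0][0]


def extract_interactions(candidate_pairs, detector, type_classifiers):
    """Two-stage extraction over candidate entity pairs.

    Stage 1: `detector(pair)` returns True if the pair looks like an interaction.
    Stage 2: each classifier votes on a type; a majority of "none" rejects the pair.
    """
    results = []
    for pair in candidate_pairs:
        if not detector(pair):
            continue
        label = majority_vote([classify(pair) for classify in type_classifiers])
        if label != "none":
            results.append((pair["a"], pair["b"], label))
    return results
```

Letting the second stage return "none" is what allows the ensemble to validate as well as classify: a pair accepted by the detector can still be discarded.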
