Text Mining Biomedical Literature for Discovering Gene-to-Gene Relationships: A Comparative Study of Algorithms

Ying Liu; S.B. Navathe; J. Civera; V. Dasigi; A. Ram; B.J. Ciliax; R. Dingledine

doi:10.1109/tcbb.2005.14

IEEE/ACM Transactions on Computational Biology and Bioinformatics ◽

10.1109/tcbb.2005.14 ◽

2005 ◽

Vol 2 (1) ◽

pp. 62-76 ◽

Cited By ~ 23

Author(s):

Ying Liu ◽

S.B. Navathe ◽

J. Civera ◽

V. Dasigi ◽

A. Ram ◽

...

Keyword(s):

Text Mining ◽

Comparative Study ◽

Biomedical Literature

A comparative study of the current technologies and approaches of relation extraction in biomedical literature using text mining

2017 4th IEEE International Conference on Engineering Technologies and Applied Sciences (ICETAS) ◽

10.1109/icetas.2017.8277841 ◽

2017 ◽

Cited By ~ 3

Author(s):

Faisal Alshuwaier ◽

Ali Areshey ◽

Josiah Poon

Keyword(s):

Text Mining ◽

Comparative Study ◽

Relation Extraction ◽

Biomedical Literature

A Comparative Study of Root-Based and Stem-Based Approaches for Measuring the Similarity Between Arabic Words for Arabic Text Mining Applications

Advanced Computing: An International Journal ◽

10.5121/acij.2012.3607 ◽

2012 ◽

Vol 3 (6) ◽

pp. 55-67 ◽

Cited By ~ 13

Author(s):

Hanane FROUD

Keyword(s):

Text Mining ◽

Comparative Study ◽

Arabic Text

Disaster Reporting and Alert System Using Tweets in a Social Media

International Journal of Scientific Research in Computer Science Engineering and Information Technology ◽

10.32628/cseit195249 ◽

2019 ◽

pp. 176-181

Author(s):

P. Tamije Selvy ◽

V. Suriya Prakash ◽

S. Shriram ◽

N. Vimalesh

Keyword(s):

Social Media ◽

Text Mining ◽

Comparative Study ◽

Alert System ◽

Accuracy Rate ◽

The Social ◽

Short Period ◽

Disaster Reporting ◽

Centralized Database

The number of social media users has increased rapidly, and a large amount of both valuable and non-valuable information is shared on social media, where it can reach many people in a short period of time. The valuable information shared there can therefore be used for many types of analysis. In this paper, tweets posted about a disaster are collected and an alert system is built from them. The alert system alerts users after checking the received data against a centralized database. The paper also presents a comparative study of the algorithms used to extract data from social media, reporting the accuracy rate of the different algorithms that can be used for text mining.

A Novel Text-Mining Approach for Retrieving Pharmacogenomics Associations From the Literature

Frontiers in Pharmacology ◽

10.3389/fphar.2020.602030 ◽

2020 ◽

Vol 11 ◽

Author(s):

Maria-Theodora Pandi ◽

Peter J. van der Spek ◽

Maria Koromina ◽

George P. Patrinos

Keyword(s):

Text Mining ◽

Generalized Linear Models ◽

Linear Models ◽

Biomedical Literature ◽

Linear Kernel ◽

R Programming Language ◽

Research Areas ◽

Text Classifiers ◽

R Programming ◽

Further Development

Text mining in biomedical literature is an emerging field which has already been shown to have a variety of implementations in many research areas, including genetics, personalized medicine, and pharmacogenomics. In this study, we describe a novel text-mining approach for the extraction of pharmacogenomics associations. The code used toward this end was implemented in the R programming language, either through custom scripts, where needed, or through functions from existing libraries. Articles (abstracts or full texts) that correspond to a specified query were extracted from PubMed, while concept annotations were derived from PubTator Central. Terms that denote a Mutation or a Gene, as well as Chemical terms corresponding to drug compounds, were normalized, and the sentences containing these terms were filtered and preprocessed to create appropriate training sets. Finally, after training and adequate hyperparameter tuning, four text classifiers were created and evaluated (FastText, linear-kernel SVMs, XGBoost, and Lasso and Elastic-Net Regularized Generalized Linear Models) with regard to their performance in identifying pharmacogenomics associations. Although further improvements are essential toward proper implementation of this text-mining approach in clinical practice, our study stands as a comprehensive, simplified, and up-to-date approach for the identification and assessment of research articles enriched in clinically relevant pharmacogenomics relationships. Furthermore, this work highlights a series of challenges concerning the effective application of text mining in biomedical literature, whose resolution could substantially contribute to the further development of this field.
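
The sentence-filtering step described in this abstract can be illustrated with a minimal Python sketch. It is not the authors' code (their implementation is in R and relies on PubTator Central annotations). The term lists, the naive sentence splitter, and the function names below are all hypothetical stand-ins, chosen only to show the idea of keeping sentences that mention both a gene or mutation term and a drug term.

```python
import re

# Hypothetical, hand-written term lists standing in for normalized
# Gene/Mutation and Chemical annotations (assumption, not from the paper).
GENE_OR_MUTATION_TERMS = {"cyp2d6", "cyp2c19", "vkorc1"}
DRUG_TERMS = {"warfarin", "clopidogrel", "codeine"}


def split_sentences(text):
    """Naive splitter: break after '.', '!' or '?' followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def tokens(sentence):
    """Lowercased alphanumeric tokens of a sentence, as a set."""
    return set(re.findall(r"[a-z0-9]+", sentence.lower()))


def candidate_sentences(text):
    """Keep sentences that contain at least one gene/mutation term
    and at least one drug term."""
    kept = []
    for sentence in split_sentences(text):
        found = tokens(sentence)
        if found & GENE_OR_MUTATION_TERMS and found & DRUG_TERMS:
            kept.append(sentence)
    return kept


example = (
    "CYP2C19 variants alter the response to clopidogrel. "
    "Warfarin is an anticoagulant. "
    "The study enrolled 200 patients."
)
print(candidate_sentences(example))
```

Only the first example sentence mentions both kinds of term, so it is the only one retained. Sentences selected this way would then be labeled and passed to a classifier, a step this sketch does not cover.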

BioReader: a text mining tool for performing classification of biomedical literature

BMC Bioinformatics ◽

10.1186/s12859-019-2607-x ◽

2019 ◽

Vol 19 (S13) ◽

Cited By ~ 9

Author(s):

Christian Simon ◽

Kristian Davidsen ◽

Christina Hansen ◽

Emily Seymour ◽

Mike Bogetofte Barnkob ◽

...

Keyword(s):

Text Mining ◽

Biomedical Literature ◽

Mining Tool ◽

Text Mining Tool

Textpresso Central: a customizable platform for searching, text mining, viewing, and curating biomedical literature

BMC Bioinformatics ◽

10.1186/s12859-018-2103-8 ◽

2018 ◽

Vol 19 (1) ◽

Cited By ~ 27

Author(s):

H.-M. Müller ◽

K. M. Van Auken ◽

Y. Li ◽

P. W. Sternberg

Keyword(s):

Text Mining ◽

Biomedical Literature

Identifying Potential Early Biomarkers Of Acute Myocardial Infarction In The Biomedical Literature: A Comparison Of Text Mining And Manual Sifting Techniques

Value in Health ◽

10.1016/j.jval.2016.09.120 ◽

2016 ◽

Vol 19 (7) ◽

pp. A367 ◽

Cited By ~ 1

Author(s):

S Paisley ◽

J Seva ◽

M Stevenson ◽

R Archer ◽

L Preston ◽

...

Keyword(s):

Myocardial Infarction ◽

Acute Myocardial Infarction ◽

Text Mining ◽

Biomedical Literature ◽

Early Biomarkers

TMT-HCC: A tool for text mining the biomedical literature for hepatocellular carcinoma (HCC) biomarkers identification

Computer Methods and Programs in Biomedicine ◽

10.1016/j.cmpb.2013.07.014 ◽

2013 ◽

Vol 112 (3) ◽

pp. 640-648 ◽

Cited By ~ 8

Author(s):

Rania A. Abul Seoud ◽

Mai S. Mabrouk

Keyword(s):

Hepatocellular Carcinoma ◽

Text Mining ◽

Biomedical Literature

BCISearch: A Searching Platform of Breast Cancer Text Mining for Biomedical Literature

2016 12th International Conference on Semantics, Knowledge and Grids (SKG) ◽

10.1109/skg.2016.034 ◽

2016 ◽

Cited By ~ 1

Author(s):

Lejun Gong ◽

Ronggen Yang ◽

Haoyu Yang ◽

Kaiyu Jiang ◽

Zhenjiang Dong ◽

...

Keyword(s):

Breast Cancer ◽

Text Mining ◽

Biomedical Literature

MACE2K: A Text-Mining Tool to Extract Literature-based Evidence for Variant Interpretation using Machine Learning

10.1101/2020.12.03.409094 ◽

2020 ◽

Author(s):

Samir Gupta ◽

Shruti Rao ◽

Trisha Miglani ◽

Yasaswini Iyer ◽

Junxia Lin ◽

...

Keyword(s):

Machine Learning ◽

Deep Learning ◽

Text Mining ◽

Genomic Medicine ◽

Relevant Information ◽

Biomedical Literature ◽

Variant Interpretation ◽

Learning Models ◽

Mining Tool ◽

Text Mining Tool

Interpretation of a given variant's pathogenicity is one of the most profound challenges to realizing the promise of genomic medicine. A large amount of information about associations between variants and diseases, used by curators and researchers for interpreting variant pathogenicity, is buried in biomedical literature. The development of text-mining tools that can extract relevant information from the literature will speed up and assist the variant interpretation curation process. In this work, we present a text-mining tool, MACE2K, that extracts evidence sentences containing associations between variants and diseases from full-length PMC Open Access articles. We use different machine learning models (classical and deep learning) to identify evidence sentences with variant-disease associations. Evaluation shows promising results, with a best F1-score of 82.9% and AUC-ROC of 73.9%. Classical ML models had better recall (96.6% for Random Forest) than deep learning models. The deep learning model, a Convolutional Neural Network, had the best precision (75.6%), which is essential for any curation task.
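
For readers comparing the precision, recall, and F1 figures quoted in this abstract, the standard definitions can be written out as a short Python sketch. The function name and the confusion counts below are illustrative assumptions and are not taken from the MACE2K evaluation.

```python
def precision_recall_f1(tp, fp, fn):
    """Return (precision, recall, F1) from true-positive, false-positive
    and false-negative counts; each ratio falls back to 0.0 when its
    denominator is zero."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Illustrative counts only (not results from the paper): 8 evidence
# sentences correctly flagged, 2 flagged wrongly, 2 missed.
p, r, f1 = precision_recall_f1(tp=8, fp=2, fn=2)
print(f"precision={p:.3f} recall={r:.3f} f1={f1:.3f}")
```

A model can have high recall and lower precision (many true evidence sentences found, along with more false alarms) or the reverse, which is the trade-off the abstract reports between the Random Forest and the Convolutional Neural Network.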
