Characterization of Cancer Types by Applying Machine Learning Methods on Blood RNA-Sequencing Data

Exploration of predictive and prognostic alternative splicing signatures in lung adenocarcinoma using machine learning methods

Journal of Translational Medicine ◽

10.1186/s12967-020-02635-y ◽

2020 ◽

Vol 18 (1) ◽

Author(s):

Qidong Cai ◽

Boxue He ◽

Pengfei Zhang ◽

Zhenyu Zhao ◽

Xiong Peng ◽

...

Keyword(s):

Machine Learning ◽

Alternative Splicing ◽

Lung Adenocarcinoma ◽

Prognostic Model ◽

Cox Regression ◽

Machine Learning Algorithms ◽

The Cancer Genome Atlas ◽

Sequencing Data ◽

Learning Methods ◽

Machine Learning Methods

Abstract

Background: Alternative splicing (AS) plays critical roles in generating protein diversity and complexity, and dysregulation of AS underlies the initiation and progression of tumors. Machine learning approaches have emerged as efficient tools for identifying promising biomarkers. Exploring pivotal AS events (ASEs) with machine learning algorithms may therefore deepen understanding of lung adenocarcinoma (LUAD) and improve its prognostic assessment.

Methods: RNA sequencing data and AS data were extracted from The Cancer Genome Atlas (TCGA) database and the TCGA SpliceSeq database. Using several machine learning methods, we identified 24 pairs of LUAD-related ASEs implicated in splicing switches and built a random forest-based classifier, consisting of 12 ASEs, for identifying lymph node metastasis (LNM). Furthermore, we identified key prognosis-related ASEs and established a 16-ASE-based prognostic model to predict overall survival of LUAD patients using a Cox regression model, random survival forest analysis, and a forward selection model. Bioinformatics analyses were also applied to identify underlying mechanisms and associated upstream splicing factors (SFs).

Results: Each pair of ASEs was spliced from the same parent gene and exhibited perfect inverse intrapair correlation (correlation coefficient = −1). The 12-ASE-based classifier showed a robust ability to evaluate the LNM status of LUAD patients, with an area under the receiver operating characteristic (ROC) curve (AUC) of more than 0.7 in fivefold cross-validation. The prognostic model performed well at 1, 3, 5, and 10 years in both the training cohort and the internal test cohort. Univariate and multivariate Cox regression indicated that the prognostic model could be used as an independent prognostic factor for patients with LUAD. Further analysis revealed correlations between the prognostic model and American Joint Committee on Cancer stage, T stage, N stage, and living status. The splicing network constructed from survival-related SFs and ASEs depicts the regulatory relationships between them.

Conclusion: In summary, our study provides insight into LUAD research and management based on these AS biomarkers.
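The AUC reported for the LNM classifier can be read as the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case. The following is a minimal Python sketch of that definition, not the authors' code; the function name `roc_auc` and the toy labels and scores are hypothetical and chosen only for illustration.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison.

    labels: iterable of 0/1 class labels (1 = positive, e.g. LNM present).
    scores: iterable of classifier scores, higher meaning "more positive".
    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy data (hypothetical): three of the four positive/negative pairs
# are ranked correctly.
toy_labels = [1, 1, 0, 0]
toy_scores = [0.9, 0.4, 0.6, 0.1]
print(roc_auc(toy_labels, toy_scores))  # 0.75
```

In a fivefold cross-validation, a value like this would be computed once per held-out fold and the five values then summarized.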


Quantitative characterization of bovine serum albumin thin-films using terahertz spectroscopy and machine learning methods

Biomedical Optics Express ◽

10.1364/boe.9.002917 ◽

2018 ◽

Vol 9 (7) ◽

pp. 2917 ◽

Cited By ~ 7

Author(s):

Yiwen Sun ◽

Pengju Du ◽

Xingxing Lu ◽

Pengfei Xie ◽

Zhengfang Qian ◽

...

Keyword(s):

Machine Learning ◽

Thin Films ◽

Bovine Serum Albumin ◽

Serum Albumin ◽

Bovine Serum ◽

Terahertz Spectroscopy ◽

Quantitative Characterization ◽

Learning Methods ◽

Machine Learning Methods


Applying machine learning methods for characterization of hexagonal prisms from their 2D scattering patterns – an investigation using modelled scattering data

Journal of Quantitative Spectroscopy and Radiative Transfer ◽

10.1016/j.jqsrt.2017.07.001 ◽

2017 ◽

Vol 201 ◽

pp. 115-127 ◽

Cited By ~ 2

Author(s):

Emmanuel Oluwatobi Salawu ◽

Evelyn Hesse ◽

Chris Stopford ◽

Neil Davey ◽

Yi Sun

Keyword(s):

Machine Learning ◽

Scattering Data ◽

Learning Methods ◽

Machine Learning Methods


Prediction of acetylcholinesterase inhibitors and characterization of correlative molecular descriptors by machine learning methods

European Journal of Medicinal Chemistry ◽

10.1016/j.ejmech.2009.12.038 ◽

2010 ◽

Vol 45 (3) ◽

pp. 1167-1172 ◽

Cited By ~ 19

Author(s):

Wei Lv ◽

Ying Xue

Keyword(s):

Machine Learning ◽

Molecular Descriptors ◽

Acetylcholinesterase Inhibitors ◽

Learning Methods ◽

Machine Learning Methods


Reproductive phasiRNAs in grasses are compositionally distinct from other classes of small RNAs

10.1101/242727 ◽

2018 ◽

Author(s):

Parth Patel ◽

Sandra Mathioni ◽

Atul Kakrana ◽

Hagit Shatkay ◽

Blake C. Meyers

Keyword(s):

Machine Learning ◽

Small RNAs ◽

Structural Features ◽

Classification Performance ◽

List Type ◽

Small Interfering RNAs ◽

Sequencing Data ◽

Learning Methods ◽

Specific Sequence ◽

Machine Learning Methods

Summary: Little is known about the characteristics and function of reproductive phased, secondary, small interfering RNAs (phasiRNAs) in the Poaceae, despite the availability of significant genomic resources, experimental data, and a growing number of computational tools. We utilized machine-learning methods to identify sequence-based and structural features that distinguish phasiRNAs in rice and maize from other small RNAs (sRNAs). We developed Random Forest classifiers that can distinguish reproductive phasiRNAs from other sRNAs in complex sets of sequencing data, utilizing sequence-based features (k-mers) and features describing position-specific sequence biases. The classification performance attained is >80% in accuracy, sensitivity, specificity, and positive predictive value. Feature selection identified important features at both ends of phasiRNAs. We demonstrated that phasiRNAs have strand specificity and position-specific nucleotide biases potentially influencing AGO sorting; we also predicted targets to infer functions of phasiRNAs, and computationally assessed their sequence characteristics relative to other sRNAs. Our results demonstrate that machine-learning methods effectively identify phasiRNAs despite the lack of characteristic features typically present in precursor loci of other small RNAs, such as sequence conservation or structural motifs. The 5’-end features we identified provide insights into AGO-phasiRNA interactions; we describe a hypothetical model of competition for AGO loading between phasiRNAs of different nucleotide compositions.
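To make the two feature families mentioned in this summary concrete, the sketch below shows one plausible way to turn a small RNA sequence into k-mer counts and position-specific one-hot features. It is an illustration under stated assumptions, not the authors' pipeline: the RNA alphabet `ACGU`, the function names, and the example sequence are all hypothetical.

```python
from collections import Counter
from itertools import product

# Assumption: sequences are written in the RNA alphabet; reads stored as DNA
# would use "ACGT" instead.
ALPHABET = "ACGU"


def kmer_counts(seq, k):
    """Count overlapping k-mers in a sequence."""
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def kmer_vector(seq, k):
    """Fixed-length vector of k-mer counts, ordered over all 4**k k-mers."""
    counts = kmer_counts(seq, k)
    return [counts["".join(p)] for p in product(ALPHABET, repeat=k)]


def positional_onehot(seq, n_positions):
    """One-hot encode the first n_positions nucleotides from the 5' end.

    Positions beyond the end of the sequence are encoded as all zeros.
    """
    vec = []
    for i in range(n_positions):
        base = seq[i] if i < len(seq) else None
        vec.extend(1 if base == b else 0 for b in ALPHABET)
    return vec


# Hypothetical example sequence: combine both feature families into one vector.
example = "UGACGAC"
features = kmer_vector(example, 2) + positional_onehot(example, 3)
print(len(features))  # 16 dinucleotide counts + 3 * 4 positional bits = 28
```

A vector of this kind could then be passed to any standard classifier, such as a random forest.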


Hyperparameters optimisation for time varying signals

Scientific Bulletin of Naval Academy ◽

10.21279/1454-864x-19-i1-027 ◽

2019 ◽

Vol XXII (1) ◽

pp. 196-199

Author(s):

Rogobete M.

Keyword(s):

Machine Learning ◽

General Class ◽

Performance Model ◽

Specific Class ◽

Time Varying ◽

Learning Methods ◽

Amount Of Information ◽

Machine Learning Methods ◽

Validation Set

For most machine learning methods applied to cyclo-stationary or even stochastic signals, performance depends critically on hyperparameters. Moreover, tuning more hyperparameters based on feedback from the model's performance leaks an increasingly significant amount of information about the validation set into the model. We therefore propose two classes of hyperparameters: a general class that characterizes the general signal curve, and a specific class that defines special parameters connected to the phenomenon type (e.g. sensor type).
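The leakage concern described here is commonly handled by keeping a test segment that is never consulted during tuning. The sketch below illustrates that general practice for a time-ordered signal; it is not the method proposed in the paper. The moving-average forecaster, its `window` hyperparameter, the split fractions, and the toy signal are all hypothetical choices made for the example.

```python
def chronological_split(signal, train_frac=0.6, val_frac=0.2):
    """Split a time-ordered signal into train/validation/test without shuffling."""
    n = len(signal)
    train_end = int(n * train_frac)
    val_end = train_end + int(n * val_frac)
    return signal[:train_end], signal[train_end:val_end], signal[val_end:]


def forecast_mse(history, segment, window):
    """Mean squared error of a one-step moving-average forecast over `segment`.

    Each value is predicted as the mean of the previous `window` observations.
    """
    if window > len(history):
        raise ValueError("window is longer than the available history")
    data = list(history)
    err = 0.0
    for x in segment:
        pred = sum(data[-window:]) / window
        err += (x - pred) ** 2
        data.append(x)  # the true value becomes part of the history
    return err / len(segment)


def tune_window(train, val, candidates):
    """Pick the window with the lowest validation error (validation set only)."""
    return min(candidates, key=lambda w: forecast_mse(train, val, w))


# Toy alternating signal (hypothetical data).
signal = [0, 1] * 5
train, val, test = chronological_split(signal)
best = tune_window(train, val, candidates=(1, 2))
# The test segment is evaluated once, after tuning is finished, so its error
# is not influenced by the hyperparameter search.
test_error = forecast_mse(train + val, test, best)
print(best, test_error)  # 2 0.25
```

The more hyperparameters are tuned against the validation segment, the more the validation error understates the true error, which is why the final figure is taken from the untouched test segment.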


Machine Learning of Coq Proof Guidance: First Experiments

10.29007/lmmg ◽

2018 ◽

Author(s):

Cezary Kaliszyk ◽

Lionel Mamane ◽

Josef Urban

Keyword(s):

Machine Learning ◽

Evaluation Method ◽

Learning Methods ◽

Machine Learning Methods ◽

Comparable Performance

We report the results of the first experiments with learning proof dependencies from formalizations done with the Coq system. We explain the process of obtaining the dependencies from the Coq proofs, the characterization of formulas that is used for the learning, and the evaluation method. Various machine learning methods are compared on a dataset of 5021 top-level Coq proofs coming from the CoRN repository. The best resulting method covers on average 75% of the needed proof dependencies among the first 100 predictions, which is comparable to the performance of such initial experiments on other large-theory corpora.
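The reported figure (average coverage of needed dependencies among the first 100 predictions) can be illustrated with a short sketch. This is one plausible reading of the metric, not the authors' evaluation code; the function names and the toy lemma names are hypothetical.

```python
def coverage_at_k(ranked_predictions, needed, k=100):
    """Fraction of the needed dependencies found among the top-k predictions."""
    needed = set(needed)
    if not needed:
        raise ValueError("a proof with no dependencies has undefined coverage")
    top_k = set(ranked_predictions[:k])
    return len(top_k & needed) / len(needed)


def mean_coverage(examples, k=100):
    """Average coverage over (ranked_predictions, needed) pairs, one per proof."""
    return sum(coverage_at_k(r, n, k) for r, n in examples) / len(examples)


# Toy example with hypothetical lemma names and k=2 instead of 100.
proofs = [
    (["lemma_a", "lemma_b", "lemma_c"], {"lemma_a"}),             # coverage 1.0
    (["lemma_a", "lemma_b", "lemma_c"], {"lemma_b", "lemma_c"}),  # coverage 0.5
]
print(mean_coverage(proofs, k=2))  # 0.75
```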


Recent Development of Computational Predicting Bioluminescent Proteins

Current Pharmaceutical Design ◽

10.2174/1381612825666191107100758 ◽

2020 ◽

Vol 25 (40) ◽

pp. 4264-4273 ◽

Cited By ~ 2

Author(s):

Dan Zhang ◽

Zheng-Xing Guan ◽

Zi-Mei Zhang ◽

Shi-Hao Li ◽

Fu-Ying Dao ◽

...

Keyword(s):

Machine Learning ◽

Light Emission ◽

Biotechnological Application ◽

Learning Methods ◽

Machine Learning Methods ◽

Technological Advances ◽

Living Organisms ◽

Role Of Light

Bioluminescent proteins (BLPs) are widely distributed in many living organisms, where they play a key role in light emission in bioluminescence. Bioluminescence serves various functions, such as finding food and protecting organisms from predators. With its routine biotechnological application, bioluminescence is recognized as essential for many medical, commercial, and other general technological advances. Therefore, the prediction and characterization of BLPs are significant: they can help to explore more of the secrets of bioluminescence and promote the development of its applications. Since experimental methods for BLP identification are costly and time-consuming, bioinformatics tools have played an important role in the fast and accurate prediction of BLPs by combining their sequence information with machine learning methods. In this review, we summarize and compare the application of machine learning methods to the prediction of BLPs from different aspects. We hope that this review will provide insights and inspiration for research on BLPs.


Machine Learning Methods as a Tool for Predicting Risk of Illness Applying Next‐Generation Sequencing Data

Risk Analysis ◽

10.1111/risa.13239 ◽

2018 ◽

Cited By ~ 4

Author(s):

Patrick Murigu Kamau Njage ◽

Clementine Henri ◽

Pimlapas Leekitcharoenphon ◽

Michel‐Yves Mistou ◽

Rene S. Hendriksen ◽

...

Keyword(s):

Machine Learning ◽

Next Generation Sequencing ◽

Next Generation Sequencing Data ◽

Next Generation ◽

Sequencing Data ◽

Learning Methods ◽

Machine Learning Methods ◽

Generation Sequencing


Machine Learning methods for a complete TGA analysis. Characterization of La1-xCaxNiO3±δ as catalyst precursors for dry methane reforming

10.26226/morressier.5f6c5f439b74b699bf390c72 ◽

2020 ◽

Author(s):

Jaime Gallego ◽

Andres Marulanda-Bran ◽

Jose C-Salazar

Keyword(s):

Machine Learning ◽

Methane Reforming ◽

Learning Methods ◽

Machine Learning Methods ◽

TGA Analysis
