DHSpred: support-vector-machine-based human DNase I hypersensitive sites prediction using the optimal features selected by random forest

Balachandran Manavalan; Tae Hwan Shin; Gwang Lee

doi:10.18632/oncotarget.23099

Oncotarget ◽

10.18632/oncotarget.23099 ◽

2017 ◽

Vol 9 (2) ◽

pp. 1944-1956 ◽

Cited By ~ 58

Author(s):

Balachandran Manavalan ◽

Tae Hwan Shin ◽

Gwang Lee

Keyword(s):

Support Vector Machine ◽

Random Forest ◽

Dnase I ◽

Support Vector ◽

Dnase I Hypersensitive Sites ◽

Hypersensitive Sites

DHSpred: support-vector-machine-based human DNase I hypersensitive sites prediction using the optimal features selected by random forest

10.1101/224527 ◽

2017 ◽

Cited By ~ 1

Author(s):

Balachandran Manavalan ◽

Tae Hwan Shin ◽

Gwang Lee

Keyword(s):

Support Vector Machine ◽

Random Forest ◽

Dna Sequences ◽

Feature Selection Method ◽

Regulatory Elements ◽

Dnase I ◽

Support Vector ◽

Large Set ◽

Dnase I Hypersensitive Sites ◽

Hypersensitive Sites

Abstract: DNase I hypersensitive sites (DHSs) are genomic regions that provide important information regarding the presence of transcriptional regulatory elements and the state of chromatin. Therefore, identifying DHSs in uncharacterized DNA sequences is crucial for understanding their biological functions and mechanisms. Although many experimental methods have been proposed to identify DHSs, they have proven to be expensive for genome-wide application. Therefore, it is necessary to develop computational methods for DHS prediction. In this study, we proposed a support vector machine (SVM)-based method for predicting DHSs, called DHSpred (DNase I Hypersensitive Site predictor in human DNA sequences), which was trained with 174 optimal features. The optimal combination of features was identified from a large set that included nucleotide composition and di- and trinucleotide physicochemical properties, using a random forest algorithm. DHSpred achieved a Matthews correlation coefficient and accuracy of 0.660 and 0.871, respectively, which were 3% higher than those of control SVM predictors trained with non-optimized features, indicating the efficiency of the feature selection method. Furthermore, the performance of DHSpred was superior to that of state-of-the-art predictors. An online prediction server has been developed to assist the scientific community, and is freely available at: http://www.thegleelab.org/DHSpred.html.
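The two figures of merit quoted in the abstract, accuracy and the Matthews correlation coefficient (MCC), are both functions of a binary confusion matrix. The sketch below shows the standard definitions only. The function names and the counts are hypothetical, chosen for illustration, and are not taken from the paper.

```python
import math


def accuracy(tp, tn, fp, fn):
    """Fraction of all predictions that are correct."""
    total = tp + tn + fp + fn
    return (tp + tn) / total if total else 0.0


def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient for a binary confusion matrix.

    Returns 0.0 when any marginal sum is zero, a common convention.
    """
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0


# Hypothetical counts (NOT from the paper): 50 DHS and 50 non-DHS sequences.
tp, tn, fp, fn = 40, 45, 5, 10
print(f"accuracy = {accuracy(tp, tn, fp, fn):.3f}")
print(f"MCC      = {mcc(tp, tn, fp, fn):.3f}")
```

Unlike accuracy, MCC uses all four cells of the confusion matrix, so it stays informative when the positive and negative classes differ in size.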

pDHS-SVM: A prediction method for plant DNase I hypersensitive sites based on support vector machine

Journal of Theoretical Biology ◽

10.1016/j.jtbi.2017.05.030 ◽

2017 ◽

Vol 426 ◽

pp. 126-133 ◽

Cited By ~ 11

Author(s):

Shanxin Zhang ◽

Zhiping Zhou ◽

Xinmeng Chen ◽

Yong Hu ◽

Lindong Yang

Keyword(s):

Support Vector Machine ◽

Prediction Method ◽

Dnase I ◽

Support Vector ◽

Dnase I Hypersensitive Sites ◽

Hypersensitive Sites

Prediction of DNase I Hypersensitive Sites by Using Pseudo Nucleotide Compositions

The Scientific World Journal ◽

10.1155/2014/740506 ◽

2014 ◽

Vol 2014 ◽

pp. 1-4 ◽

Cited By ~ 16

Author(s):

Pengmian Feng ◽

Ning Jiang ◽

Nan Liu

Keyword(s):

Cost Effective ◽

Dnase I ◽

Support Vector ◽

Jackknife Test ◽

Dnase I Hypersensitive Sites ◽

Proposed Model ◽

Hypersensitive Sites ◽

Dna Elements ◽

Genomic Regions ◽

Regulatory Dna

DNase I hypersensitive sites (DHS) are associated with a wide variety of regulatory DNA elements. Knowledge about the locations of DHS is helpful for deciphering the function of noncoding genomic regions. With the rapid accumulation of genome sequences in the postgenomic age, it is highly desirable to develop cost-effective computational methods to identify DHS. In the present work, a support vector machine based model was proposed to identify DHS by using the pseudo dinucleotide composition. In the jackknife test, the proposed model obtained an accuracy of 83%, which is competitive with that of the existing method. This result suggests that the proposed model may become a useful tool for DHS identification.
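A jackknife (leave-one-out) test of a sequence-composition model can be sketched as follows. This is a simplified illustration, not the authors' method: it uses the plain 16-dimensional dinucleotide composition rather than the pseudo dinucleotide composition (which adds physicochemical correlation terms not given here), and a nearest-centroid classifier stands in for the support vector machine. The sequences, labels, and function names are hypothetical.

```python
from itertools import product

# All 16 dinucleotides in a fixed order: AA, AC, AG, ..., TT.
DINUCS = ["".join(p) for p in product("ACGT", repeat=2)]


def dinucleotide_composition(seq):
    """Normalized frequency of each overlapping dinucleotide in seq."""
    seq = seq.upper()
    counts = {d: 0 for d in DINUCS}
    n = 0
    for i in range(len(seq) - 1):
        d = seq[i:i + 2]
        if d in counts:  # skip windows containing N or other symbols
            counts[d] += 1
            n += 1
    return [counts[d] / n if n else 0.0 for d in DINUCS]


def centroid(vectors):
    """Component-wise mean of a list of equal-length vectors."""
    return [sum(col) / len(vectors) for col in zip(*vectors)]


def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def jackknife_accuracy(seqs, labels):
    """Hold out each sample once, fit on the rest, and score the held-out one.

    Assumption: a nearest-centroid classifier replaces the SVM.
    """
    feats = [dinucleotide_composition(s) for s in seqs]
    correct = 0
    for i in range(len(feats)):
        train = [(f, y) for j, (f, y) in enumerate(zip(feats, labels)) if j != i]
        cents = {y: centroid([f for f, lab in train if lab == y])
                 for y in {lab for _, lab in train}}
        pred = min(cents, key=lambda y: sq_dist(feats[i], cents[y]))
        correct += pred == labels[i]
    return correct / len(feats)


# Toy, made-up data: GC-rich sequences labeled 1, AT-rich labeled 0.
toy_seqs = ["GCGCGCGC", "CGCGCGCG", "GCGCGGCC",
            "ATATATAT", "TATATATA", "ATATTAAT"]
toy_labels = [1, 1, 1, 0, 0, 0]
print(jackknife_accuracy(toy_seqs, toy_labels))
```

Each sample is predicted by a model that never saw it, so the jackknife gives a single, deterministic estimate for a given dataset, at the cost of refitting once per sample.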

Investigating the use of random forest, gradient boosting machine, support vector machine and their ensemble applied to fault detection

10.26678/abcm.cobem2017.cob17-1600 ◽

2017 ◽

Author(s):

Luis Felipe Nogoseke ◽

Gabriel Herman Bernardim Andrade ◽

Marco Boaretto ◽

Leandro Coelho

Keyword(s):

Support Vector Machine ◽

Random Forest ◽

Fault Detection ◽

Gradient Boosting ◽

Support Vector ◽

Gradient Boosting Machine

Progress on identification and analysis of DNase I hypersensitive sites in plant genomes

Hereditas (Beijing) ◽

10.3724/sp.j.1005.2013.00867 ◽

2013 ◽

Vol 35 (7) ◽

pp. 867-874

Author(s):

Tao ZHANG ◽

Zu-Jun YANG

Keyword(s):

Dnase I ◽

Plant Genomes ◽

Dnase I Hypersensitive Sites ◽

Hypersensitive Sites

The transferability of random forest and support vector machine for estimating daily global solar radiation using sunshine duration over different climate zones

Theoretical and Applied Climatology ◽

10.1007/s00704-021-03726-6 ◽

2021 ◽

Author(s):

Wei Wu ◽

Mao-Fen Li ◽

Xia Xu ◽

Xiao-Ping Tang ◽

Chao Yang ◽

...

Keyword(s):

Support Vector Machine ◽

Random Forest ◽

Solar Radiation ◽

Sunshine Duration ◽

Global Solar Radiation ◽

Support Vector ◽

Climate Zones

Implementing a network intrusion detection system using semi-supervised support vector machine and random forest

Proceedings of the 2021 ACM Southeast Conference ◽

10.1145/3409334.3452073 ◽

2021 ◽

Author(s):

Sandeep Shah ◽

Pramita Sree Muhuri ◽

Xiaohong Yuan ◽

Kaushik Roy ◽

Prosenjit Chatterjee

Keyword(s):

Support Vector Machine ◽

Random Forest ◽

Intrusion Detection ◽

Intrusion Detection System ◽

Detection System ◽

Network Intrusion Detection ◽

Support Vector ◽

Network Intrusion ◽

Network Intrusion Detection System

A Two Layer Machine Learning System for Intrusion Detection Based on Random Forest and Support Vector Machine

2020 IEEE International Women in Engineering (WIE) Conference on Electrical and Computer Engineering (WIECON-ECE) ◽

10.1109/wiecon-ece52138.2020.9397945 ◽

2020 ◽

Author(s):

Sabrina Afroz ◽

S.M Ariful Islam ◽

Samin Nawer Rafa ◽

Maheen Islam

Keyword(s):

Machine Learning ◽

Support Vector Machine ◽

Random Forest ◽

Intrusion Detection ◽

Learning System ◽

Support Vector

DNase I hypersensitive sites of globin genes of uninduced Friend erythroleukemia cells and changes during induction with dimethyl sulfoxide.

Journal of Biological Chemistry ◽

10.1016/s0021-9258(17)44502-9 ◽

1983 ◽

Vol 258 (17) ◽

pp. 10622-10628

Author(s):

J M Balcarek ◽

F A McMorris

Keyword(s):

Dimethyl Sulfoxide ◽

Dnase I ◽

Globin Genes ◽

Dnase I Hypersensitive Sites ◽

Erythroleukemia Cells ◽

Hypersensitive Sites ◽

Friend Erythroleukemia Cells

The prediction of human DNase I hypersensitive sites based on DNA sequence information

Chemometrics and Intelligent Laboratory Systems ◽

10.1016/j.chemolab.2020.104223 ◽

2021 ◽

Vol 209 ◽

pp. 104223

Author(s):

Wei Su ◽

Fang Wang ◽

Jiu-Xin Tan ◽

Fu-Ying Dao ◽

Hui Yang ◽

...

Keyword(s):

Dna Sequence ◽

Dnase I ◽

Sequence Information ◽

Dnase I Hypersensitive Sites ◽

Hypersensitive Sites
