K-mer based classifiers extract functionally relevant features to support accurate Peroxiredoxin subgroup distinction

Mapping Intimacies ◽

10.1101/387787 ◽

2018 ◽

Author(s):

Jiajie Xiao ◽

William H. Turkett

Keyword(s):

Active Site ◽

Antioxidant Defense ◽

State Of The Art ◽

Support Vector ◽

Site Analysis ◽

Representative Sequence ◽

Current State ◽

Vector Machines ◽

Sequence Representation ◽

Functional Relevance

Abstract. Background: The Peroxiredoxins (Prx) are a family of proteins that play a major role in antioxidant defense and peroxide-regulated signaling. Six distinct Prx subgroups have been defined based on analysis of structure and sequence regions in proximity to the Prx active site. Analysis of other sequence regions of these annotated proteins may improve the ability to distinguish subgroups and uncover additional representative sequence regions beyond the active site. Results: The space of Prx subgroup classifiers is surveyed to highlight similarities and differences in the available approaches. Exploiting the recent growth in annotated Prx proteins, a whole sequence-based classifier is presented that employs support vector machines and a k-mer (k=3) sequence representation. Distinguishing k-mers are extracted and located relative to published active site regions. Conclusions: This work demonstrates that the 3-mer based classifier can attain high accuracy in subgroup annotation, at rates similar to the current state-of-the-art. Analysis of the classifier’s automatically derived models shows that the classification decision is based on a combination of conserved features, including a significant number of residue regions that have not been previously suggested as informative by other classifiers but for which there is evidence of functional relevance.
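The k-mer (k=3) sequence representation mentioned in the abstract can be illustrated with a minimal sketch. The function names, the frequency normalization, and the toy sequence below are assumptions made for illustration; the abstract does not specify how the authors build or scale their feature vectors.

```python
from collections import Counter


def kmer_counts(sequence: str, k: int = 3) -> Counter:
    """Count overlapping k-mers in a sequence (hypothetical helper)."""
    seq = sequence.upper()
    # A sequence of length n has n - k + 1 overlapping k-mers.
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def kmer_vector(sequence: str, vocabulary: list, k: int = 3) -> list:
    """Map a sequence to relative k-mer frequencies over a fixed vocabulary.

    Normalizing by the total k-mer count is an assumption, not a detail
    taken from the abstract.
    """
    counts = kmer_counts(sequence, k)
    total = sum(counts.values()) or 1  # avoid division by zero for short input
    return [counts[kmer] / total for kmer in vocabulary]


# Toy, made-up sequence; not a real Prx protein.
toy_sequence = "ABCABC"
toy_vocabulary = ["ABC", "BCA", "CAB", "XYZ"]
toy_features = kmer_vector(toy_sequence, toy_vocabulary)
```

Vectors of this kind, one per protein, could then be passed to any support vector machine implementation as input features.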


Contact Lens Classification by Using Segmented Lens Boundary Features

Indonesian Journal of Electrical Engineering and Computer Science ◽

10.11591/ijeecs.v11.i3.pp1129-1135 ◽

2018 ◽

Vol 11 (3) ◽

pp. 1129

Author(s):

Nur Ariffin Mohd Zin ◽

Hishammuddin Asmuni ◽

Haza Nuzly Abdul Hamed ◽

Razib M. Othman ◽

Shahreen Kasim ◽

...

Keyword(s):

Support Vector Machines ◽

Contact Lens ◽

State Of The Art ◽

Classification Method ◽

Support Vector ◽

Local Descriptors ◽

Iris Image ◽

Vector Machines ◽

False Reject Rate ◽

Better Than

Recent studies have shown that wearing a soft lens may degrade performance by increasing the false reject rate. However, detecting the presence of a soft lens is a non-trivial task because its texture is almost indiscernible. In this work, we propose a classification method to identify the presence of a soft lens in an iris image. The method starts by segmenting the lens boundary on top of the sclera region. Features are then extracted from the segmented boundary using local descriptors and are used for training and classification with Support Vector Machines. The method was tested on the Notre Dame Cosmetic Contact Lens 2013 database. Experiments showed that the proposed method performed better than state-of-the-art methods.


State of the Art Survey of Deep Learning and Machine Learning Models for Smart Cities and Urban Sustainability

10.31219/osf.io/gmuzk ◽

2020 ◽

Author(s):

Saeed Nosratabadi ◽

Amir Mosavi ◽

Ramin Keivani ◽

Sina Faizollahzadeh Ardabili ◽

Farshid Aram

Keyword(s):

Machine Learning ◽

Deep Learning ◽

State Of The Art ◽

Smart Cities ◽

Model Development ◽

Urban Sustainability ◽

Urban Transport ◽

Support Vector ◽

Neuro Fuzzy ◽

Vector Machines

Deep learning (DL) and machine learning (ML) methods have recently contributed to the advancement of models for prediction, planning, and uncertainty analysis in smart cities and urban development. This paper presents the state of the art of DL and ML methods used in this realm. Through a novel taxonomy, the advances in model development and new application domains in urban sustainability and smart cities are presented. Findings reveal that five groups of DL and ML methods have been most applied to address the different aspects of smart cities. These are artificial neural networks; support vector machines; decision trees; ensembles, Bayesians, hybrids, and neuro-fuzzy; and deep learning. The findings also show that energy, health, and urban transport are the main smart-city domains whose problems DL and ML methods have helped to address.


Linear Support Vector Machines for Prediction of Student Performance in School-Based Education

Mathematical Problems in Engineering ◽

10.1155/2020/4761468 ◽

2020 ◽

Vol 2020 ◽

pp. 1-7

Author(s):

Nalindren Naicker ◽

Timothy Adeliyi ◽

Jeanette Wing

Keyword(s):

Machine Learning ◽

Support Vector Machines ◽

Student Performance ◽

State Of The Art ◽

Learning Algorithms ◽

The State ◽

Machine Learning Algorithms ◽

Superior Performance ◽

Support Vector ◽

Vector Machines

Educational Data Mining (EDM) is a rich research field in computer science. Tools and techniques in EDM are useful for predicting student performance, which gives practitioners insights for developing appropriate intervention strategies to improve pass rates and increase retention. The performance of state-of-the-art machine learning classifiers depends heavily on the task at hand. Support vector machines have been used extensively in classification problems; however, the extant literature shows a gap in the application of linear support vector machines as a predictor of student performance. The aim of this study was to compare the performance of linear support vector machines with that of state-of-the-art classical machine learning algorithms in order to determine the algorithm that would improve prediction of student performance. In this quantitative study, an experimental research design was used. Experiments were set up using feature selection on a publicly available dataset of 1000 alpha-numeric student records. Linear support vector machines, benchmarked against ten categorical machine learning algorithms, showed superior performance in predicting student performance. The results showed that features like race, gender, and lunch influence performance in mathematics, whilst access to lunch was the primary factor influencing reading and writing performance.


Unsupervised Action Proposals Using Support Vector Classifiers for Online Video Processing

Sensors ◽

10.3390/s20102953 ◽

2020 ◽

Vol 20 (10) ◽

pp. 2953

Author(s):

Marcos Baptista Ríos ◽

Roberto Javier López-Sastre ◽

Francisco Javier Acevedo-Rodríguez ◽

Pilar Martín-Martín ◽

Saturnino Maldonado-Bascón

Keyword(s):

Video Processing ◽

State Of The Art ◽

Learning To Rank ◽

Support Vector ◽

New Approach ◽

Support Vector Classifier ◽

Current State ◽

Video Frames ◽

Unsupervised Approach ◽

Video Sensor

In this work, we introduce an intelligent video sensor for the problem of Action Proposals (AP). AP consists of localizing temporal segments in untrimmed videos that are likely to contain actions. Solving this problem can accelerate several video action understanding tasks, such as detection, retrieval, or indexing. All previous AP approaches are supervised and offline, i.e., they need both the temporal annotations of the datasets during training and access to the whole video to effectively cast the proposals. We propose here a new approach which, unlike the rest of the state-of-the-art models, is unsupervised. This implies that it sees no labeled data during learning and uses no feature pre-trained on the dataset in question. Moreover, our approach also operates in an online manner, which can be beneficial for many real-world applications where the video has to be processed as soon as it arrives at the sensor, e.g., robotics or video monitoring. The core of our method is a Support Vector Classifier (SVC) module which produces candidate segments for AP by distinguishing between sets of contiguous video frames. We further propose a mechanism to refine and filter those candidate segments. This filter optimizes a learning-to-rank formulation over the dynamics of the segments. An extensive experimental evaluation is conducted on the Thumos’14 and ActivityNet datasets, and, to the best of our knowledge, this work is the first unsupervised approach evaluated on these main AP benchmarks. Finally, we also provide a thorough comparison to the current state-of-the-art supervised AP approaches. We achieve 41% and 59% of the performance of the best supervised model on ActivityNet and Thumos’14, respectively, confirming our unsupervised solution as a viable option to tackle the AP problem. The code to reproduce all our results will be publicly released upon acceptance of the paper.


Prediction of Active Site Cleft Using Support Vector Machines

Journal of Chemical Information and Modeling ◽

10.1021/ci1002922 ◽

2010 ◽

Vol 50 (12) ◽

pp. 2266-2273 ◽

Cited By ~ 11

Author(s):

Shrihari Sonavane ◽

Pinak Chakrabarti

Keyword(s):

Support Vector Machines ◽

Active Site ◽

Support Vector ◽

Vector Machines ◽

Active Site Cleft


In silico approaches for the prediction and analysis of antiviral peptides: a review

Current Pharmaceutical Design ◽

10.2174/1381612826666201102105827 ◽

2020 ◽

Vol 26 ◽

Cited By ~ 1

Author(s):

Phasit Charoenkwan ◽

Nuttapat Anuwongcharoen ◽

Chanin Nantasenamat ◽

Md. Mehedi Hasan ◽

Watshara Shoombuatong

Keyword(s):

State Of The Art ◽

Therapeutic Agents ◽

Support Vector ◽

Sequence Information ◽

Accurate Identification ◽

Robust Learning ◽

Current State ◽

Pharmacokinetic Properties ◽

Feature Encoding ◽

Antiviral Peptides

In light of the growing resistance toward current antiviral drugs, the discovery of novel and effective antiviral therapeutic agents remains a pressing scientific effort. Antiviral peptides (AVPs) represent promising therapeutic agents due to their extraordinary advantages in terms of potency, efficacy and pharmacokinetic properties. The growing volume of newly discovered peptide sequences in the post-genomic era requires computational approaches for timely and accurate identification of AVPs. Machine learning (ML) methods such as random forest and support vector machine represent robust learning algorithms that are instrumental in successful peptide-based drug discovery. Therefore, this review summarizes the current state-of-the-art on the application of ML methods for identifying AVPs directly from the sequence information. We compare the efficiency of these methods in terms of the underlying characteristics of the dataset used along with feature encoding methods, ML algorithms, cross-validation methods and prediction performance. Finally, guidelines for development of robust AVP models are also discussed. It is anticipated that this review will serve as a useful guide for the design and development of robust AVP and related therapeutic peptide predictors in the future.


Embedded Feature Selection for Support Vector Machines: State-of-the-Art and Future Challenges

Progress in Pattern Recognition, Image Analysis, Computer Vision, and Applications - Lecture Notes in Computer Science ◽

10.1007/978-3-642-25085-9_36 ◽

2011 ◽

pp. 304-311 ◽

Cited By ~ 3

Author(s):

Sebastián Maldonado ◽

Richard Weber

Keyword(s):

Feature Selection ◽

Support Vector Machines ◽

State Of The Art ◽

Support Vector ◽

Vector Machines ◽

Future Challenges ◽

Selection For


Use of a Novel Grammatical Inference Approach in Classification of Amyloidogenic Hexapeptides

Computational and Mathematical Methods in Medicine ◽

10.1155/2016/1782732 ◽

2016 ◽

Vol 2016 ◽

pp. 1-8 ◽

Cited By ~ 6

Author(s):

Wojciech Wieczorek ◽

Olgierd Unold

Keyword(s):

Support Vector Machine ◽

Correlation Coefficient ◽

State Of The Art ◽

Superior Performance ◽

Support Vector ◽

Grammatical Inference ◽

Regular Expressions ◽

Inference Algorithm ◽

Current State

The present paper is a novel contribution to the field of bioinformatics by using grammatical inference in the analysis of data. We developed an algorithm for generating star-free regular expressions which turned out to be good recommendation tools, as they are characterized by a relatively high correlation coefficient between the observed and predicted binary classifications. The experiments have been performed for three datasets of amyloidogenic hexapeptides, and our results are compared with those obtained using the graph approaches, the current state-of-the-art methods in heuristic automata induction, and the support vector machine. The results showed the superior performance of the new grammatical inference algorithm on fixed-length amyloid datasets.
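The abstract does not name the correlation coefficient it uses. Assuming it refers to the Matthews correlation coefficient, a standard measure for comparing observed and predicted binary labels, a minimal sketch could look like the following. The function name and the convention of returning 0.0 for a zero denominator are illustrative choices, not details from the paper.

```python
import math


def matthews_corrcoef(observed, predicted) -> float:
    """Matthews correlation coefficient for two equal-length binary label lists."""
    pairs = list(zip(observed, predicted))
    tp = sum(1 for o, p in pairs if o and p)          # true positives
    tn = sum(1 for o, p in pairs if not o and not p)  # true negatives
    fp = sum(1 for o, p in pairs if not o and p)      # false positives
    fn = sum(1 for o, p in pairs if o and not p)      # false negatives
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # Convention (an assumption here): report 0.0 when the denominator vanishes.
    if denominator == 0:
        return 0.0
    return (tp * tn - fp * fn) / denominator
```

The value ranges from -1 (total disagreement) through 0 (no better than chance) to 1 (perfect agreement).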


Modelling Proteolytic Enzymes With Support Vector Machines

Journal of Integrative Bioinformatics ◽

10.1515/jib-2011-170 ◽

2011 ◽

Vol 8 (3) ◽

pp. 1-15

Author(s):

Lionel Morgado ◽

Carlos Pereira ◽

Paula Veríssimo ◽

António Dourado

Keyword(s):

State Of The Art ◽

Proteolytic Enzymes ◽

Support Vector ◽

Computational Techniques ◽

Strong Activity ◽

Discriminative Models ◽

Vector Machines ◽

Merops Database ◽

Enzyme Detection ◽

Silver Bullet

Summary: The strong activity in proteomics during the last decade created huge amounts of data about which knowledge is limited. Retrieving information from these proteins is the next step. For that, computational techniques are indispensable. Although there is not yet a silver bullet approach to solve the problem of enzyme detection and classification, machine learning formulations such as the state-of-the-art Support Vector Machine (SVM) appear among the most reliable options. An SVM-based framework for peptidase analysis that recognizes the hierarchies demarcated in the MEROPS database is presented. Feature selection with SVM-RFE is used to improve the discriminative models and build classifiers that are computationally more efficient than alignment-based techniques.


Detecting Protein-Protein Interactions with a Novel Matrix-Based Protein Sequence Representation and Support Vector Machines

BioMed Research International ◽

10.1155/2015/867516 ◽

2015 ◽

Vol 2015 ◽

pp. 1-9 ◽

Cited By ~ 26

Author(s):

Zhu-Hong You ◽

Jianqiang Li ◽

Xin Gao ◽

Zhou He ◽

Lin Zhu ◽

...

Keyword(s):

Protein Interactions ◽

Protein Sequence ◽

Molecular Mechanisms ◽

Computational Approach ◽

Support Vector ◽

Biological Processes ◽

Protein Protein Interactions ◽

Technological Advances ◽

Vector Machines ◽

Sequence Representation

Proteins and their interactions lie at the heart of most underlying biological processes. Consequently, correct detection of protein-protein interactions (PPIs) is of fundamental importance to understanding the molecular mechanisms in biological systems. Although technological advances in high-throughput experiments make it possible to detect a large number of PPIs, the data generated through these methods are unreliable and may not include all possible PPIs. To address this problem, this study develops a novel computational approach to effectively detect protein interactions. The approach is based on a novel matrix-based representation of the protein sequence combined with the support vector machine (SVM) algorithm, and it fully considers the sequence order and dipeptide information of the protein primary sequence. When applied to yeast PPI datasets, the proposed method reaches 90.06% prediction accuracy with 94.37% specificity at a sensitivity of 85.74%, indicating that this predictor is a useful tool to predict PPIs. The results also demonstrate that our approach can be a helpful supplement for the interactions that have been detected experimentally.
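The three reported figures can be related by a short calculation. Accuracy is the class-weighted average of sensitivity and specificity, so under the assumption of equal numbers of interacting and non-interacting pairs (the abstract does not state the class balance), the reported sensitivity and specificity imply an accuracy of about 90.06%. The function name and the default `positive_fraction` below are illustrative.

```python
def accuracy_from_rates(sensitivity: float, specificity: float,
                        positive_fraction: float = 0.5) -> float:
    """Overall accuracy implied by sensitivity and specificity.

    positive_fraction is the share of positive examples in the test set;
    0.5 (balanced classes) is an assumption, not a figure from the abstract.
    """
    return (positive_fraction * sensitivity
            + (1.0 - positive_fraction) * specificity)


# Percentages taken from the abstract: sensitivity 85.74, specificity 94.37.
implied_accuracy = accuracy_from_rates(85.74, 94.37)
```

With a different class balance the implied accuracy would shift toward whichever rate belongs to the larger class.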
