A Framework for Kernel-Based Multi-Category Classification

2007 ◽  
Vol 30 ◽  
pp. 525-564 ◽  
Author(s):  
S. I. Hill ◽  
A. Doucet

A geometric framework for understanding multi-category classification is introduced, through which many existing 'all-together' algorithms can be understood. The structure enables parsimonious optimisation, through a direct extension of the binary methodology. The focus is on Support Vector Classification, with parallels drawn to related methods. The ability of the framework to compare algorithms is illustrated by a brief discussion of Fisher consistency. Its utility in improving understanding of multi-category analysis is demonstrated through a derivation of improved generalisation bounds. It is also described how this architecture provides insights regarding how to further improve on the speed of existing multi-category classification algorithms. An initial example of how this might be achieved is developed in the formulation of a straightforward multi-category Sequential Minimal Optimisation algorithm. Proof-of-concept experimental results have shown that this, combined with the mapping of pairwise results, is comparable with benchmark optimisation speeds.

2021 ◽  
Author(s):  
Jorge Cabrera Alvargonzalez ◽  
Ana Larranaga Janeiro ◽  
Sonia Perez ◽  
Javier Martinez Torres ◽  
Lucia Martinez Lamas ◽  
...  

Severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) has been and remains one of the major challenges humanity has faced thus far. Over the past few months, large amounts of information have been collected that are only now beginning to be assimilated. In the present work, the existence of residual information in the massive numbers of rRT-PCRs that tested positive out of the almost half a million tests that were performed during the pandemic is investigated. This residual information is believed to be highly related to a pattern in the number of cycles that are necessary to detect positive samples as such. Thus, a database of more than 20,000 positive samples was collected, and two supervised classification algorithms (a support vector machine and a neural network) were trained to temporally locate each sample based solely and exclusively on the number of cycles determined in the rRT-PCR of each individual. Finally, the results obtained from the classification show how the appearance of each wave is coincident with the surge of each of the variants present in the region of Galicia (Spain) during the development of the SARS-CoV-2 pandemic and clearly identified with the classification algorithm.


2018 ◽  
Vol 11 (1) ◽  
pp. 2 ◽  
Author(s):  
Tao Zhang ◽  
Hong Tang

Detailed information about built-up areas is valuable for mapping complex urban environments. Although a large number of classification algorithms for such areas have been developed, they are rarely tested from the perspective of feature engineering and feature learning. Therefore, we launched an investigation to provide a full test of the Operational Land Imager (OLI) imagery for 15-m resolution built-up area classification in 2015, in Beijing, China. Training a classifier requires many sample points, and we proposed a method based on the European Space Agency’s (ESA) 38-m global built-up area data of 2014, OpenStreetMap, and MOD13Q1-NDVI to achieve the rapid and automatic generation of a large number of sample points. Our aim was to examine the influence of a single pixel and image patch under traditional feature engineering and modern feature learning strategies. In feature engineering, we consider spectra, shape, and texture as the input features, and support vector machine (SVM), random forest (RF), and AdaBoost as the classification algorithms. In feature learning, the convolutional neural network (CNN) is used as the classification algorithm. In total, 26 built-up land cover maps were produced. The experimental results show the following: (1) The approaches based on feature learning are generally better than those based on feature engineering in terms of classification accuracy, and the performance of ensemble classifiers (e.g., RF) is comparable to that of CNN. Two-dimensional CNN and the 7-neighborhood RF have the highest classification accuracies at nearly 91%; (2) Overall, the classification effect and accuracy based on image patches are better than those based on single pixels. The features that can highlight the information of the target category (e.g., PanTex (texture-derived built-up presence index) and enhanced morphological building index (EMBI)) can help improve classification accuracy.
The code and experimental results are available at https://github.com/zhangtao151820/CompareMethod.


2006 ◽  
Vol 35 (3) ◽  
Author(s):  
Bernd Jürgen Falkowski

The importance of classification algorithms in the context of risk assessment is briefly explained. As an alternative to the popular support vector machines, fault-tolerant perceptron learning is suggested. In order to achieve better generalization properties, the additional use of an iterative large margin perceptron algorithm is investigated. In particular, it is shown that care has to be taken when initializing the algorithm. Some preliminary experimental results are briefly discussed.


2020 ◽  
Vol 27 (4) ◽  
pp. 329-336 ◽  
Author(s):  
Lei Xu ◽  
Guangmin Liang ◽  
Baowen Chen ◽  
Xu Tan ◽  
Huaikun Xiang ◽  
...  

Background: Cell lytic enzymes are highly evolved proteins that can destroy the cell structure and kill bacteria. Unlike antibiotics, cell lytic enzymes do not cause a serious problem of drug resistance in pathogenic bacteria. Thus, the study of cell wall lytic enzymes aims at finding an efficient way of curing bacterial infections. With antibiotics, the problem of drug resistance is becoming more serious. Therefore, cell lytic enzymes are a good choice for curing bacterial infections. Cell lytic enzymes include endolysins and autolysins, and the difference between them is the purpose of breaking the cell wall. Identifying the type of a cell lytic enzyme is meaningful for the study of cell wall enzymes. Objective: In this article, our motivation is to predict the type of cell lytic enzyme. Cell lytic enzymes are helpful for killing bacteria, so it is meaningful to study their type. However, it is time-consuming to determine the type of a cell lytic enzyme by experimental methods. Thus, an efficient computational method for predicting the type of cell lytic enzyme is proposed in our work. Method: We propose a computational method for the prediction of endolysins and autolysins. First, a data set containing 27 endolysins and 41 autolysins is built. Then each protein is represented by its tripeptide composition. The features with larger confidence degree are selected. Finally, the classifier is trained on the labeled vectors using a support vector machine. The learned classifier is used to predict the type of cell lytic enzyme. Results: Following the proposed method, the experimental results show that the overall accuracy can attain 97.06% when 44 features are selected. Compared with Ding's method, our method improves the overall accuracy by nearly 4.5% in relative terms ((97.06 - 92.9)/92.9). The performance of our proposed method is stable when the number of selected features is between 40 and 70.
The overall accuracy of the tripeptides optimal feature set is 94.12%, and the overall accuracy of Chou's amphiphilic PseAAC method is 76.2%. The experimental results also demonstrate that the overall accuracy is improved by nearly 18% when using the tripeptides optimal feature set. Conclusion: The paper proposes an efficient method for identifying endolysins and autolysins. In this paper, a support vector machine is used to predict the type of cell lytic enzyme. The experimental results show that the overall accuracy of the proposed method is 94.12%, which is better than some existing methods. In conclusion, the selected 44 features can improve the overall accuracy for identification of the type of cell lytic enzyme. The support vector machine performs better than other classifiers when using the selected feature set on the benchmark data set.
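
The abstract above quotes two kinds of improvement: a relative gain over Ding's method and what appears to be a gain in percentage points over the PseAAC method. As a minimal sketch of that arithmetic (the helper names are hypothetical and not taken from the paper), the two figures can be checked as follows:

```python
def relative_improvement(new: float, baseline: float) -> float:
    """Gain of `new` over `baseline`, expressed in percent of the baseline."""
    return (new - baseline) / baseline * 100.0


def point_improvement(new: float, baseline: float) -> float:
    """Gain of `new` over `baseline` in percentage points."""
    return new - baseline


# Accuracies (in percent) as quoted in the abstract.
rel_gain = relative_improvement(97.06, 92.9)   # versus Ding's method
point_gain = point_improvement(94.12, 76.2)    # versus the PseAAC method

print(f"relative gain: {rel_gain:.2f}%")
print(f"point gain: {point_gain:.2f} percentage points")
```

Under this reading, the "nearly 4.5%" figure is a relative gain, while the "nearly 18%" figure is a difference in percentage points.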


Molecules ◽  
2021 ◽  
Vol 26 (13) ◽  
pp. 3983
Author(s):  
Ozren Gamulin ◽  
Marko Škrabić ◽  
Kristina Serec ◽  
Matej Par ◽  
Marija Baković ◽  
...  

Gender determination of human remains can be very challenging, especially when the remains are incomplete. Herein, we report a proof-of-concept experiment investigating the possibility of gender recognition using Raman spectroscopy of teeth. Raman spectra were recorded from male and female molars and premolars at two distinct sites, the tooth apex and the anatomical neck. Recorded spectra were sorted into suitable datasets and initially analyzed with principal component analysis, which showed a distinction between spectra of male and female teeth. Then, reduced datasets with scores of the first 20 principal components were formed, and two classification algorithms, support vector machine and artificial neural networks, were applied to form classification models for gender recognition. The obtained results showed that gender recognition with Raman spectra of teeth is possible but strongly depends on both the tooth type and the spectrum recording site. The differences in classification accuracy between tooth types and recording sites are discussed in terms of the molecular structure difference caused by the influence of masticatory loading or gender-dependent life events.


Diagnostics ◽  
2021 ◽  
Vol 11 (7) ◽  
pp. 1136
Author(s):  
Duc Long Duong ◽  
Quoc Duy Nam Nguyen ◽  
Minh Son Tong ◽  
Manh Tuan Vu ◽  
Joseph Dy Lim ◽  
...  

Dental caries has been considered the heaviest worldwide oral health burden, affecting a significant proportion of the population. To prevent dental caries, an appropriate and accurate early detection method is needed. This proof-of-concept study aims to develop a two-stage computational system that can detect early occlusal caries from smartphone color images of unrestored extracted teeth according to modified International Caries Detection and Assessment System (ICDAS) criteria (3 classes: Code 0; Code 1-2; Code 3-6). In the first stage, carious lesion areas were identified and extracted from sound tooth regions. Then, five characteristic features of these areas were deliberately selected and calculated as input to the classification stage, where five classifiers (Support Vector Machine, Random Forests, K-Nearest Neighbors, Gradient Boosted Tree, Logistic Regression) were evaluated to determine the best one among them. On a set of 587 smartphone images of extracted teeth, our system achieved accuracy, sensitivity, and specificity of 87.39%, 89.88%, and 68.86% in the detection stage when compared to modified visual and image-based ICDAS criteria. For the classification stage, the Support Vector Machine model was recorded as the best model, with accuracy, sensitivity, and specificity of 88.76%, 92.31%, and 85.21%. As the first step in developing the technology, our present findings confirm the feasibility of using smartphone color images to employ Artificial Intelligence algorithms in caries detection. To improve the performance of the proposed system, there is a need for further development in both in vitro and in vivo modeling. In addition, an applicable system for accurately taking intra-oral images that capture entire dental arches, including the occlusal surfaces of premolars and molars, also needs to be developed.


Plants ◽  
2021 ◽  
Vol 10 (1) ◽  
pp. 95
Author(s):  
Heba Kurdi ◽  
Amal Al-Aldawsari ◽  
Isra Al-Turaiki ◽  
Abdulrahman S. Aldawood

In the past 30 years, the red palm weevil (RPW), Rhynchophorus ferrugineus (Olivier), a pest that is highly destructive to all types of palms, has rapidly spread worldwide. However, detecting infestation with the RPW is highly challenging because symptoms are not visible until the death of the palm tree is inevitable. In addition, the use of automated RPW identification tools to predict infestation is complicated by a lack of RPW datasets. In this study, we assessed the capability of 10 state-of-the-art data mining classification algorithms, Naive Bayes (NB), KSTAR, AdaBoost, bagging, PART, J48 Decision tree, multilayer perceptron (MLP), support vector machine (SVM), random forest, and logistic regression, to use plant-size and temperature measurements collected from individual trees to predict RPW infestation in its early stages, before significant damage is caused to the tree. The performance of the classification algorithms was evaluated in terms of accuracy, precision, recall, and F-measure using a real RPW dataset. The experimental results showed that infestations with RPW can be predicted with an accuracy of up to 93%, precision above 87%, recall equal to 100%, and F-measure greater than 93% using data mining. Additionally, we found that temperature and circumference are the most important features for predicting RPW infestation. However, we strongly call for collecting and aggregating more RPW datasets to run more experiments to validate these results and provide more conclusive findings.
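
The reported precision, recall, and F-measure are linked by a standard formula, which makes the figures easy to sanity-check. The sketch below assumes the abstract's F-measure is the balanced F1 score (the harmonic mean of precision and recall); the function name is hypothetical.

```python
def f_measure(precision: float, recall: float, beta: float = 1.0) -> float:
    """F-beta score; beta = 1 gives the harmonic mean of precision and recall."""
    if precision == 0.0 and recall == 0.0:
        return 0.0  # avoid division by zero when both inputs are zero
    b2 = beta * beta
    return (1.0 + b2) * precision * recall / (b2 * precision + recall)


# Lower bound on precision and the recall quoted in the abstract.
f1 = f_measure(precision=0.87, recall=1.00)
print(f"F1 = {f1:.4f}")
```

With precision at its quoted lower bound of 87% and recall at 100%, F1 comes out just above 0.93, which is consistent with the "greater than 93%" figure.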


Author(s):  
Shikhar P. Acharya ◽  
Ivan G. Guardiola

Radio Frequency (RF) devices produce some amount of Unintended Electromagnetic Emissions (UEEs). UEEs are generally unique to a device and can be used as a signature for the purpose of detection and identification. The problem with UEEs is that they are very low in power and are often buried deep inside the noise band. The research herein applies the Support Vector Machine (SVM) to the detection and identification of RF devices using their UEEs. Experimental results show that an SVM can detect RF devices within the noise band and can also identify RF devices using their UEEs.


2013 ◽  
Vol 311 ◽  
pp. 158-163 ◽  
Author(s):  
Li Qin Huang ◽  
Li Qun Lin ◽  
Yan Huang Liu

The MapReduce framework of cloud computing offers an effective way to achieve massive text categorization. In this paper, a distributed parallel text training algorithm based on multi-class Support Vector Machines (SVM) is designed for the cloud computing environment. Map tasks distribute the various types of samples, and Reduce tasks carry out the specific SVM training. Experimental results show that the execution time of text training decreases as the number of Reduce tasks increases. A parallel text classifier based on cloud computing is also designed and implemented, which classifies texts of unknown type. Experimental results show that the speed of text classification increases with the number of Map tasks.
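
The division of labour described above (Map tasks route samples, Reduce tasks train) can be illustrated with a small in-process sketch. Several things here are assumptions rather than details from the paper: the multi-class problem is taken to be decomposed one-vs-rest, all function names are hypothetical, and the Reduce step uses a centroid-difference linear rule purely as a runnable placeholder where a real system would train an SVM.

```python
from collections import defaultdict


def map_task(sample, classes):
    """Map: route one labelled sample to every per-class binary problem."""
    label, x = sample
    for c in classes:
        yield c, (x, 1 if label == c else -1)


def shuffle(pairs):
    """Group mapper output by key, as the framework does between phases."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def mean(vectors):
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def reduce_task(examples):
    """Reduce: fit one binary model for one class.

    PLACEHOLDER, not an SVM: a linear rule through the midpoint of the
    positive and negative centroids, used only to keep the sketch runnable.
    """
    pos = mean([x for x, y in examples if y == 1])
    neg = mean([x for x, y in examples if y == -1])
    w = [p - q for p, q in zip(pos, neg)]
    mid = [(p + q) / 2 for p, q in zip(pos, neg)]
    b = -sum(wi * mi for wi, mi in zip(w, mid))
    return w, b


def train(samples):
    classes = sorted({label for label, _ in samples})
    pairs = (kv for s in samples for kv in map_task(s, classes))
    # Each key's reduce_task is independent, so it could run on its own worker.
    return {c: reduce_task(ex) for c, ex in shuffle(pairs).items()}


def predict(models, x):
    def score(c):
        w, b = models[c]
        return sum(wi * xi for wi, xi in zip(w, x)) + b
    return max(models, key=score)


# Toy two-dimensional "documents" standing in for text feature vectors.
samples = [
    ("a", (0.0, 0.0)), ("a", (1.0, 0.0)),
    ("b", (10.0, 0.0)), ("b", (10.0, 1.0)),
    ("c", (0.0, 10.0)), ("c", (1.0, 10.0)),
]
models = train(samples)
print(predict(models, (9.0, 1.0)))
```

Because each key is reduced independently, adding Reduce workers shortens training in this design, which matches the trend the abstract reports; the actual decomposition used by the authors may differ.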


2018 ◽  
Vol 127 ◽  
pp. S155-S156
Author(s):  
I. Torres Xirau ◽  
I. Olaciregui-Ruiz ◽  
B.J. Mijnheer ◽  
B. Vivas-Maiques ◽  
U.A. van der Heide ◽  
...  
