EBOC: Ensemble-Based Ordinal Classification in Transportation

Journal of Advanced Transportation ◽

10.1155/2019/7482138 ◽

2019 ◽

Vol 2019 ◽

pp. 1-17

Author(s):

Pelin Yıldırım ◽

Ulaş K. Birant ◽

Derya Birant

Keyword(s):

Historical Data ◽

Classification Problem ◽

Classification Performance ◽

Classification Algorithms ◽

Ordinal Classification ◽

Adaboost Algorithm ◽

Education And Health ◽

Transportation Sector ◽

C4.5 Decision Tree ◽

Target Attribute

Learning the latent patterns of historical data efficiently in order to model the behaviour of a system is a major need for making the right decisions. For this purpose, machine learning solutions have already begun to make promising marks in transportation, as well as in many other areas such as marketing, finance, education, and health. However, many classification algorithms in the literature assume that the target attribute values in the datasets are unordered, so they lose the inherent order between the class values. To overcome this problem, this study proposes a novel ensemble-based ordinal classification (EBOC) approach, which suggests bagging and boosting (AdaBoost algorithm) methods as a solution for the ordinal classification problem in the transportation sector. This article also compares the proposed EBOC approach with the ordinal class classifier and traditional tree-based classification algorithms (i.e., C4.5 decision tree, RandomTree, and REPTree) in terms of accuracy. The results indicate that the proposed EBOC approach achieves better classification performance than the conventional solutions.
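
An ordinal class classifier is commonly built by decomposing a K-class ordinal problem into K-1 binary questions of the form "is the class greater than k?". The sketch below shows how such binary estimates might be recombined into class probabilities. It assumes the Frank and Hall style decomposition, which the abstract does not confirm as the authors' method, and the function names `ordinal_class_probs` and `predict_ordinal` are hypothetical. In an ensemble setting, each binary estimate would presumably come from a bagged or boosted model rather than a single tree.

```python
def ordinal_class_probs(p_greater):
    """Combine K-1 estimates of P(y > class_k) into K class probabilities.

    p_greater[k] is a binary model's estimate that the true class lies
    above the k-th ordered class (hypothetical input; any binary
    classifier, bagged or boosted, could supply it).
    """
    if not p_greater:
        raise ValueError("need at least one binary estimate")
    probs = [1.0 - p_greater[0]]              # lowest class
    for prev, cur in zip(p_greater, p_greater[1:]):
        # Independently trained models can disagree, so clip negatives to zero.
        probs.append(max(0.0, prev - cur))    # middle classes
    probs.append(p_greater[-1])               # highest class
    return probs


def predict_ordinal(p_greater):
    """Return the index of the most probable ordered class."""
    probs = ordinal_class_probs(p_greater)
    return max(range(len(probs)), key=probs.__getitem__)


# Three ordered classes (e.g. low < medium < high), two binary estimates.
example = ordinal_class_probs([0.9, 0.4])
```

Because the binary models are trained independently, the recombined values are not guaranteed to sum to one; a caller that needs a proper distribution could renormalize them.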

Signal Classification Algorithms over Time Selective Channels

Electronics ◽

10.3390/electronics10141714 ◽

2021 ◽

Vol 10 (14) ◽

pp. 1714

Author(s):

Mohamed Marey ◽

Hala Mostafa

Keyword(s):

Block Code ◽

Classification Problem ◽

Classification Performance ◽

Signal Classification ◽

Final Decision ◽

Time Block ◽

Channel Response ◽

Over Time ◽

Parallel Fashion

In this work, we propose a general framework to design a signal classification algorithm over time selective channels for wireless communications applications. We derive an upper bound on the maximum number of observation samples over which the channel response is essentially invariant. The proposed framework relies on dividing the received signal into blocks, each with a length less than the mentioned bound. These blocks are then fed into a number of classifiers in a parallel fashion. A final decision is made through a well-designed combiner and detector. As a case study, we employ the proposed framework on a space-time block-code classification problem by developing two combiners and detectors. Monte Carlo simulations show that the proposed framework is capable of achieving excellent classification performance over time selective channels compared to the conventional algorithms.
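
The block-then-combine structure described above can be illustrated with a short sketch. It makes several assumptions: majority voting stands in for the paper's combiner and detector, which the abstract does not specify; the blocks are classified sequentially rather than in parallel; and the names `split_into_blocks`, `classify_by_blocks`, and `sign_of_mean` are hypothetical.

```python
from collections import Counter


def split_into_blocks(samples, block_len):
    """Cut the received samples into consecutive blocks of block_len.

    A trailing remainder shorter than block_len is dropped so that every
    block spans the same number of samples.
    """
    if block_len <= 0:
        raise ValueError("block_len must be positive")
    n_blocks = len(samples) // block_len
    return [samples[i * block_len:(i + 1) * block_len] for i in range(n_blocks)]


def classify_by_blocks(samples, block_len, classifier):
    """Classify each block independently, then combine by majority vote.

    `classifier` is any callable mapping a block to a label. Majority
    voting is an assumed stand-in for the paper's combiner and detector.
    """
    blocks = split_into_blocks(samples, block_len)
    if not blocks:
        raise ValueError("signal is shorter than one block")
    votes = Counter(classifier(block) for block in blocks)
    return votes.most_common(1)[0][0]


def sign_of_mean(block):
    """Toy per-block classifier: label a block by the sign of its mean."""
    return "positive" if sum(block) / len(block) >= 0 else "negative"


decision = classify_by_blocks([1, 2, -1, 3, -5, -6, 2, 2, 9], 3, sign_of_mean)
```

In practice `block_len` would be chosen below the derived bound, so that the channel can be treated as roughly constant within each block.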

Impact of Dataset Size on Classification Performance: An Empirical Evaluation in the Medical Domain

Applied Sciences ◽

10.3390/app11020796 ◽

2021 ◽

Vol 11 (2) ◽

pp. 796

Author(s):

Alhanoof Althnian ◽

Duaa AlSaeed ◽

Heyam Al-Baity ◽

Amani Samha ◽

Alanoud Bin Dris ◽

...

Keyword(s):

Empirical Evaluation ◽

Classification Performance ◽

Support Vector ◽

Robust Model ◽

Original Distribution ◽

C4.5 Decision Tree ◽

Dataset Size ◽

Overall Performance ◽

Medical Domain ◽

The Impact

Dataset size is considered a major concern in the medical domain, where lack of data is a common occurrence. This study aims to investigate the impact of dataset size on the overall performance of supervised classification models. We examined the performance of six widely-used models in the medical field, including support vector machine (SVM), neural networks (NN), C4.5 decision tree (DT), random forest (RF), adaboost (AB), and naïve Bayes (NB), on eighteen small medical UCI datasets. We further implemented three dataset size reduction scenarios on two large datasets and analyzed the performance of the models when trained on each resulting dataset with respect to accuracy, precision, recall, f-score, specificity, and area under the ROC curve (AUC). Our results indicated that the overall performance of classifiers depends on how well a dataset represents the original distribution rather than on its size. Moreover, we found that the most robust models for limited medical data are AB and NB, followed by SVM, and then RF and NN, while the least robust model is DT. Furthermore, an interesting observation is that a machine learning model being robust to limited data does not necessarily imply that it provides the best performance compared to other models.
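
Most of the evaluation measures listed above follow directly from the four counts of a binary confusion matrix. The sketch below uses the standard definitions; the function name `binary_metrics` and the example counts are illustrative and are not taken from the study. AUC is omitted because it requires ranked scores rather than counts.

```python
def binary_metrics(tp, fp, tn, fn):
    """Compute common metrics from binary confusion-matrix counts."""
    def ratio(num, den):
        # Convention used here: report 0.0 when the denominator is empty.
        return num / den if den else 0.0

    precision = ratio(tp, tp + fp)
    recall = ratio(tp, tp + fn)          # also called sensitivity
    return {
        "accuracy": ratio(tp + tn, tp + fp + tn + fn),
        "precision": precision,
        "recall": recall,
        "f_score": ratio(2 * precision * recall, precision + recall),
        "specificity": ratio(tn, tn + fp),
    }


# Made-up counts for an imbalanced test set of 100 cases.
metrics = binary_metrics(tp=8, fp=2, tn=85, fn=5)
```

On imbalanced data such as this, accuracy can look strong while recall stays modest, which is one reason to report several measures side by side.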

Data Augmentation and Spectral Structure Features for Limited Samples Hyperspectral Classification

Remote Sensing ◽

10.3390/rs13040547 ◽

2021 ◽

Vol 13 (4) ◽

pp. 547

Author(s):

Wenning Wang ◽

Xuebin Liu ◽

Xuanqin Mou

Keyword(s):

Classification Accuracy ◽

Data Augmentation ◽

Classification Problem ◽

Classification Performance ◽

Spectral Structure ◽

Limited Sample ◽

Sample Classification ◽

Training Samples ◽

Traditional Classification ◽

Hyperspectral Classification

For both traditional classification and currently popular deep learning methods, the limited sample classification problem is very challenging, and the lack of samples is an important factor affecting classification performance. Our work includes two aspects. First, unsupervised data augmentation for all hyperspectral samples not only greatly improves the classification accuracy through the newly added training samples, but also further improves the accuracy of the classifier by optimizing the augmented test samples. Second, an effective spectral structure extraction method is designed, and the resulting spectral structure features achieve better classification accuracy than the true spectral features.

Advertisement Click-Through Rate Prediction Based on the Weighted-ELM and Adaboost Algorithm

Scientific Programming ◽

10.1155/2017/2938369 ◽

2017 ◽

Vol 2017 ◽

pp. 1-8 ◽

Cited By ~ 1

Author(s):

Sen Zhang ◽

Qiang Fu ◽

Wendong Xiao

Keyword(s):

Real Time ◽

Prediction Accuracy ◽

Historical Data ◽

User Profile ◽

Imbalanced Learning ◽

Sample Distribution ◽

The Real ◽

Adaboost Algorithm ◽

Click Through Rate ◽

Prediction Approach

Accurate click-through rate (CTR) prediction can not only improve the advertisement company’s reputation and revenue, but also help advertisers optimize advertising performance. There are two main unsolved problems in CTR prediction: low prediction accuracy due to the imbalanced distribution of the advertising data, and the lack of a real-time advertisement bidding implementation. In this paper, we develop a novel online CTR prediction approach that incorporates real-time bidding (RTB) advertising through the following strategies. A user profile system is constructed from the historical data of the RTB advertising to describe the user features, the historical CTR features, the ID features, and the other numerical features. A novel CTR prediction approach is presented to address the imbalanced learning sample distribution by integrating the Weighted-ELM (WELM) and the Adaboost algorithm. Compared to the commonly used algorithms, the proposed approach can improve the CTR significantly.
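
Boosting addresses hard or under-represented samples by reweighting them after each round. The sketch below shows one reweighting step of standard discrete AdaBoost. It is not the paper's WELM integration, which the abstract does not detail, and the function name `adaboost_round` is hypothetical.

```python
import math


def adaboost_round(weights, correct):
    """One discrete AdaBoost reweighting step.

    weights: current sample weights, assumed to sum to 1.
    correct: booleans, True where the weak learner was right.
    Returns (alpha, new_weights), where alpha is the learner's vote weight.
    """
    err = sum(w for w, ok in zip(weights, correct) if not ok)
    if not 0.0 < err < 0.5:
        raise ValueError("weighted error must lie strictly between 0 and 0.5")
    alpha = 0.5 * math.log((1.0 - err) / err)
    # Shrink the weights of correct samples and grow those of mistakes.
    raw = [w * math.exp(-alpha if ok else alpha) for w, ok in zip(weights, correct)]
    total = sum(raw)
    return alpha, [w / total for w in raw]


# Four equally weighted samples; the weak learner misses the last one.
alpha, new_weights = adaboost_round([0.25] * 4, [True, True, True, False])
```

After the update, the misclassified samples together carry half of the total weight, so the next weak learner is pushed toward the cases the previous one missed.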

Efficient pan-cancer whole-slide image classification and outlier detection using convolutional neural networks

10.1101/633123 ◽

2019 ◽

Cited By ~ 1

Author(s):

Seda Bilaloglu ◽

Joyce Wu ◽

Eduardo Fierro ◽

Raul Delgado Sanchez ◽

Paolo Santiago Ocampo ◽

...

Keyword(s):

Visual Analysis ◽

Classification Problem ◽

Classification Performance ◽

Neoplastic Tissue ◽

Multiple Tumor ◽

Slide Image ◽

Prediction Systems ◽

Multi Class Classification ◽

The Many ◽

Whole Slide Images

Visual analysis of solid tissue mounted on glass slides is currently the primary method used by pathologists for determining the stage, type and subtypes of cancer. Although whole slide images are usually large (tens to hundreds of thousands of pixels wide), an exhaustive though time-consuming assessment is necessary to reduce the risk of misdiagnosis. In an effort to address the many diagnostic challenges faced by trained experts, recent research has focused on developing automatic prediction systems for this multi-class classification problem. Typically, complex convolutional neural network (CNN) architectures, such as Google’s Inception, are used to tackle this problem. Here, we introduce a greatly simplified CNN architecture, PathCNN, which allows for more efficient use of computational resources and better classification performance. Using this improved architecture, we trained simultaneously on whole-slide images from multiple tumor sites and corresponding non-neoplastic tissue. Dimensionality reduction analysis of the weights of the last layer of the network captures groups of images that faithfully represent the different types of cancer, while also highlighting differences in staining and capturing outliers, artifacts and misclassification errors. Our code is available online at: https://github.com/sedab/PathCNN.

NEW DISCRETE CROW SEARCH ALGORITHM FOR CLASS ASSOCIATION RULE MINING

International Journal of Swarm Intelligence Research ◽

10.4018/ijsir.2022010109 ◽

2022 ◽

Vol 13 (1) ◽

pp. 0-0

Keyword(s):

Association Rule ◽

Association Rule Mining ◽

Search Algorithm ◽

Classification Problem ◽

Discrete Version ◽

Classification Algorithms ◽

Associative Classification ◽

Rule Mining ◽

Rule Based ◽

Class Association Rule

Associative Classification (AC) or Class Association Rule (CAR) mining is a very efficient method for the classification problem. It can build comprehensible classification models in the form of a list of simple IF-THEN classification rules from the available data. In this paper, we present a new and improved discrete version of the Crow Search Algorithm (CSA), called NDCSA-CAR, to mine Class Association Rules. The goal of this article is to improve data classification accuracy and the simplicity of classifiers. The authors applied the proposed NDCSA-CAR algorithm to eleven benchmark datasets and compared its results with traditional algorithms and recent well-known rule-based classification algorithms. The experimental results show that the proposed algorithm outperformed other rule-based approaches in all evaluated criteria.

Detection and Tracking Cows by Computer Vision and Image Classification Methods

International Journal of Security and Privacy in Pervasive Computing ◽

10.4018/ijsppc.2021010101 ◽

2021 ◽

Vol 13 (1) ◽

pp. 1-45

Author(s):

Terry Gao

Keyword(s):

Feature Fusion ◽

Binary Classification ◽

Classification Problem ◽

Detection Algorithm ◽

Video Sequences ◽

Image Block ◽

Body Contour ◽

Adaboost Algorithm ◽

Target Model ◽

Detection And Tracking

In this paper, cow recognition and tracking in video sequences are studied. In the recognition phase, this paper discusses and analyzes different classification algorithms and feature extraction algorithms, and cow detection is transformed into a binary classification problem. The detection method extracts the cow's features using multiple feature fusion. These features include edge features, which reflect the cow's body contour, grey value, and spatial position relationship. In addition, the algorithm detects the cow's body through a classifier trained by the Gentle Adaboost algorithm. Experiments show that the method has good detection performance when the target is deformed or the contrast between target and background is low. Compared with the general target detection algorithm, this method reduces the miss rate and improves the detection precision. The detection rate can reach 97.3%. In the tracking phase, the popular compressive tracking (CT) algorithm is proposed. The learning rate is changed by adaptively calculating the pap distance of the image block. Moreover, the update of the target model is stopped to avoid introducing error and noise when the classification response values are negative. The experimental results show that the improved tracking algorithm can effectively avoid mistaken target model updates when there are large covers or the attitude changes frequently. For the detection and tracking of the cow's body, a detection and tracking framework for cow images is built, and the detector is combined with the tracking framework. Tests on some video sequences in complex environments indicate that the detection algorithm based on improved compressed perception shows a good tracking effect against changing and complicated backgrounds.

Semi-supervised learning using autodidactic interpolation on sparse representation-based multiple one-dimensional embedding

International Journal of Wavelets Multiresolution and Information Processing ◽

10.1142/s0219691319500139 ◽

2019 ◽

Vol 17 (03) ◽

pp. 1950013 ◽

Cited By ~ 3

Author(s):

Hao Deng ◽

Chao Ma ◽

Lijun Shen ◽

Chuanwu Yang

Keyword(s):

Sparse Representation ◽

Euclidean Distance ◽

Main Idea ◽

Classification Problem ◽

Classification Performance ◽

One Dimensional ◽

Adaptive Interpolation ◽

Shortest Distance ◽

The Common ◽

Sample Set

In this paper, we present a novel semi-supervised classification method based on sparse representation (SR) and multiple one-dimensional embedding-based adaptive interpolation (M1DEI). The main idea of M1DEI is to embed the data into multiple one-dimensional (1D) manifolds such that connected samples have the shortest distance. In this way, the problem of high-dimensional data classification is transformed into a 1D classification problem. By alternating interpolation and averaging on the multiple 1D manifolds, the labeled sample set of the data can be enlarged gradually. A proper metric facilitates more accurate embedding and further helps improve the classification performance. We develop an SR-based metric, which measures the affinity between samples more accurately than the common Euclidean distance. The experimental results on several databases show the effectiveness of the improvement.

Classification of ASKAP VAST Radio Light Curves

Proceedings of the International Astronomical Union ◽

10.1017/s1743921312001196 ◽

2011 ◽

Vol 7 (S285) ◽

pp. 397-399 ◽

Cited By ~ 3

Author(s):

Umaa Rebbapragada ◽

Kitty Lo ◽

Kiri L. Wagstaff ◽

Colorado Reed ◽

Tara Murphy ◽

...

Keyword(s):

Best Practices ◽

Light Curve ◽

Field Survey ◽

Classification Performance ◽

Light Curves ◽

Classification Algorithms ◽

Wide Field ◽

Radio Transients ◽

Instrument Sensitivity

The VAST survey is a wide-field survey that observes with unprecedented instrument sensitivity (0.5 mJy or lower) and repeat cadence (a goal of 5 seconds) that will enable novel scientific discoveries related to known and unknown classes of radio transients and variables. Given the unprecedented observing characteristics of VAST, it is important to estimate source classification performance, and determine best practices prior to the launch of ASKAP's BETA in 2012. The goal of this study is to identify light-curve characterization and classification algorithms that are best suited for archival VAST light-curve classification. We perform our experiments on light-curve simulations of eight source types and achieve best-case performance of approximately 90% accuracy. We note that classification performance is most influenced by light-curve characterization rather than classifier algorithm.

CREDIT SCORING USING MULTI-KERNEL SUPPORT VECTOR MACHINE AND CHAOS PARTICLE SWARM OPTIMIZATION

International Journal of Computational Intelligence and Applications ◽

10.1142/s1469026812500198 ◽

2012 ◽

Vol 11 (03) ◽

pp. 1250019 ◽

Cited By ~ 3

Author(s):

YUN LING ◽

QIUYAN CAO ◽

HUA ZHANG

Keyword(s):

Particle Swarm Optimization ◽

Kernel Function ◽

Credit Scoring ◽

Particle Swarm ◽

Pso Algorithm ◽

Classification Problem ◽

Classification Performance ◽

Kernel Functions ◽

Support Vector ◽

Swarm Optimization

Consumer credit scoring is considered a crucial issue in the credit industry. SVM has been successfully utilized for classification in many areas, including credit scoring. The kernel function is vital for enhancing prediction performance when applying SVM to a classification problem. Currently, most kernel functions used in SVM are single kernel functions, such as the widely used radial basis function (RBF). On the basis of the existing kernel functions, this paper proposes a multi-kernel function that improves the learning and generalization ability of SVM by integrating several single kernel functions. Chaos particle swarm optimization (CPSO), which is a kind of improved PSO algorithm, is utilized to optimize parameters and to select features simultaneously. Two UCI credit data sets are used as the experimental data to evaluate the classification performance of the proposed method.
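
One common way to integrate several single kernels is a weighted sum, since a non-negative combination of valid kernels is itself a valid kernel. The sketch below combines an RBF kernel with a polynomial kernel. The choice of these two kernels, the mixing weight, and all parameter values are assumptions for illustration; the abstract does not state which kernels the paper combines or how. In the paper's setting, such parameters would be the kind of quantities tuned by CPSO.

```python
import math


def rbf_kernel(x, y, gamma=0.5):
    """Radial basis function kernel: exp(-gamma * ||x - y||^2)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * sq_dist)


def poly_kernel(x, y, degree=2, coef0=1.0):
    """Polynomial kernel: (x . y + coef0) ** degree."""
    dot = sum(a * b for a, b in zip(x, y))
    return (dot + coef0) ** degree


def multi_kernel(x, y, weight=0.7):
    """Convex combination of the RBF and polynomial kernels.

    `weight` is a hypothetical mixing parameter in [0, 1]; 1.0 recovers
    the pure RBF kernel and 0.0 the pure polynomial kernel.
    """
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must lie in [0, 1]")
    return weight * rbf_kernel(x, y) + (1.0 - weight) * poly_kernel(x, y)


# Kernel value of a point with itself: 0.7 * 1 + 0.3 * (1 + 1) ** 2.
k_same = multi_kernel([1.0, 0.0], [1.0, 0.0])
```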
