Inner-Collection Distributional Weight Data Classification Approach

Author(s):  
Li Junlin ◽  
Fu Hongguang
Author(s):  
Yunmei Lu ◽  
Yanhong Gao ◽  
Zhongbo Cao ◽  
Juan Cui ◽  
Zhennan Dong ◽  
...  

Author(s):  
Azuraliza Abu Bakar ◽  
Zulaiha Ali Othman ◽  
Abdul Razak Hamdan ◽  
Rozianiwati Yusof ◽  
Ruhaizan Ismail

Author(s):  
Fadime Üney Yüksektepe

Data classification is a supervised learning strategy that organizes and categorizes data into distinct classes. Classification methods generally use a training set in which all objects are already associated with known class labels. A data classification algorithm works on this set, using the input attributes to build a model that classifies new objects. In other words, the algorithm predicts output attribute values, and the output attribute of the developed model is categorical (Roiger & Geatz, 2003). Data classification is an important problem with applications in a diverse set of areas, including finance, health care, sports, engineering, science, and bioinformatics (Chen, Han, & Yu, 1996; Edelstein, 2003; Jagota, 2000). Most data classification methods are developed for classifying data into two groups. Because multi-group data classification problems are very common but not widely studied, we focus on developing a new multi-group data classification approach based on mixed-integer linear programming.
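The general workflow described above (a labelled training set, a fitted model, and a categorical prediction for new objects) can be illustrated with a minimal sketch. It uses a nearest-centroid rule on three classes, which is a stand-in chosen for brevity and is not the mixed-integer linear programming approach of the abstract. The function names `fit_centroids` and `predict`, and the toy data, are hypothetical.

```python
from collections import defaultdict
from math import dist


def fit_centroids(X, y):
    """Build a model from labelled rows: one mean vector per class label."""
    sums = {}
    counts = defaultdict(int)
    for row, label in zip(X, y):
        if label not in sums:
            sums[label] = [0.0] * len(row)
        for j, value in enumerate(row):
            sums[label][j] += value
        counts[label] += 1
    return {lab: [s / counts[lab] for s in vec] for lab, vec in sums.items()}


def predict(centroids, row):
    """Predict the categorical output: the label of the closest centroid."""
    return min(centroids, key=lambda lab: dist(centroids[lab], row))


# Hypothetical training set: input attributes with known class labels.
X_train = [[1.0, 1.0], [1.2, 0.8], [5.0, 5.0], [5.2, 4.8], [9.0, 1.0], [8.8, 1.2]]
y_train = ["A", "A", "B", "B", "C", "C"]

model = fit_centroids(X_train, y_train)
new_object = [5.1, 5.0]
print(predict(model, new_object))
```

The same two-step shape (fit on labelled data, then predict a class label) applies regardless of which multi-group method replaces the nearest-centroid rule.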


2015 ◽  
Vol 2015 ◽  
pp. 1-12 ◽  
Author(s):  
Baofeng Shi ◽  
Jing Wang ◽  
Junyan Qi ◽  
Yanqiu Cheng

We introduce an imbalanced data classification approach based on logistic regression significant discriminant and Fisher discriminant. First, a key indicator extraction model based on logistic regression significant discriminant and correlation analysis is derived to extract features for customer classification. Second, a customer scoring model is established on the basis of linear weighting using the Fisher discriminant. Third, a customer rating model is constructed in which the number of customers across the ratings follows a normal distribution. The performance of the proposed model and the classical SVM classification method is evaluated in terms of their ability to correctly classify consumers as default or nondefault customers. Empirical results using the data of 2157 customers in financial engineering suggest that the proposed approach performs better than the SVM model in dealing with imbalanced data classification. Moreover, our approach helps banks and bond investors locate qualified customers.
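The scoring step can be sketched as follows, under stated assumptions. The code computes the standard two-class Fisher discriminant direction, w = Sw^-1 (m1 - m0), and uses the linear combination w · x as a customer score. It is restricted to two features so the 2x2 matrix can be inverted by hand, the data are invented, and the feature extraction and rating steps of the abstract are not reproduced. All function names are hypothetical, and this is not presented as the authors' implementation.

```python
def mean(rows):
    """Column-wise mean of a list of equal-length rows."""
    n = len(rows)
    return [sum(r[j] for r in rows) / n for j in range(len(rows[0]))]


def scatter(rows, m):
    """2x2 within-class scatter matrix around the class mean m."""
    s = [[0.0, 0.0], [0.0, 0.0]]
    for r in rows:
        d = [r[0] - m[0], r[1] - m[1]]
        for i in range(2):
            for j in range(2):
                s[i][j] += d[i] * d[j]
    return s


def fisher_weights(class0, class1):
    """Fisher direction w = Sw^-1 (m1 - m0) for two features."""
    m0, m1 = mean(class0), mean(class1)
    s0, s1 = scatter(class0, m0), scatter(class1, m1)
    sw = [[s0[i][j] + s1[i][j] for j in range(2)] for i in range(2)]
    det = sw[0][0] * sw[1][1] - sw[0][1] * sw[1][0]
    if det == 0:
        raise ValueError("within-class scatter matrix is singular")
    inv = [[sw[1][1] / det, -sw[0][1] / det],
           [-sw[1][0] / det, sw[0][0] / det]]
    diff = [m1[0] - m0[0], m1[1] - m0[1]]
    return [inv[0][0] * diff[0] + inv[0][1] * diff[1],
            inv[1][0] * diff[0] + inv[1][1] * diff[1]]


def score(w, row):
    """Linear weighted score of one customer."""
    return w[0] * row[0] + w[1] * row[1]


# Invented data: two indicator values per customer.
default = [[1, 2], [2, 1], [2, 3], [3, 2]]
nondefault = [[6, 5], [7, 8], [8, 6], [9, 7]]

w = fisher_weights(default, nondefault)
for customer in default + nondefault:
    print(customer, round(score(w, customer), 3))
```

On this toy data every nondefault customer scores higher than every default customer, so a single threshold on the score separates the two groups. Real imbalanced data would not usually separate this cleanly.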

