An Attribute Reduction Method using Neighborhood Entropy Measures in Neighborhood Rough Sets

Entropy ◽  
2019 ◽  
Vol 21 (2) ◽  
pp. 155 ◽  
Author(s):  
Lin Sun ◽  
Xiaoyu Zhang ◽  
Jiucheng Xu ◽  
Shiguang Zhang

Attribute reduction is an important preprocessing step for data mining and has become a hot research topic in rough set theory. Neighborhood rough set theory can overcome the shortcoming that classical rough set theory may lose some useful information in the process of discretization for continuous-valued data sets. In this paper, to improve the classification performance on complex data, a novel attribute reduction method in neighborhood rough sets is proposed that uses neighborhood entropy measures and combines the algebra view with the information view; it can deal with continuous data whilst maintaining the classification information of the original attributes. First, to efficiently analyze the uncertainty of knowledge in neighborhood rough sets, a new average neighborhood entropy is presented by combining neighborhood approximate precision with neighborhood entropy, based on the strong complementarity between the algebra definition of attribute significance and the definition in the information view. Then, a concept of decision neighborhood entropy is investigated for handling the uncertainty and noisiness of neighborhood decision systems, which integrates the credibility degree with the coverage degree of neighborhood decision systems to fully reflect the decision ability of attributes. Moreover, some of their properties are derived and the relationships among these measures are established, which helps to understand the essence of knowledge content and the uncertainty of neighborhood decision systems. Finally, a heuristic attribute reduction algorithm is proposed to improve the classification performance on complex data sets. Experimental results on an illustrative instance and several public data sets demonstrate that the proposed method is very effective for selecting the most relevant attributes with great classification performance.
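To make the algebra view mentioned in this abstract concrete, the following is a minimal sketch of the textbook neighborhood rough set building blocks: the delta-neighborhood of a sample, the lower and upper approximations of a decision class, and the resulting approximation precision. It does not reproduce the paper's average neighborhood entropy or decision neighborhood entropy, whose formulas are not given here. The function names, the Euclidean distance, and the toy data are assumptions chosen for illustration.

```python
from math import dist

def neighborhood(data, i, attrs, delta):
    """Indices of samples within Euclidean distance delta of sample i,
    measured only on the attribute indices in attrs (illustrative helper)."""
    xi = [data[i][a] for a in attrs]
    return {j for j, row in enumerate(data)
            if dist(xi, [row[a] for a in attrs]) <= delta}

def approximations(data, labels, attrs, delta, target):
    """Neighborhood lower and upper approximations of the class `target`."""
    members = {i for i, y in enumerate(labels) if y == target}
    lower, upper = set(), set()
    for i in range(len(data)):
        nb = neighborhood(data, i, attrs, delta)
        if nb <= members:      # whole neighborhood lies inside the class
            lower.add(i)
        if nb & members:       # neighborhood touches the class
            upper.add(i)
    return lower, upper

def approximation_precision(data, labels, attrs, delta, target):
    """|lower| / |upper|; equals 1.0 when the class is exactly definable."""
    lower, upper = approximations(data, labels, attrs, delta, target)
    return len(lower) / len(upper) if upper else 1.0

# Toy data: one continuous attribute, two decision classes (assumed values).
data = [[0.1], [0.2], [0.5], [0.8], [0.9]]
labels = [0, 0, 0, 1, 1]
print(approximations(data, labels, [0], 0.35, target=0))
print(approximation_precision(data, labels, [0], 0.35, target=0))
```

On this toy data the sample at 0.5 has neighbors from both classes, so it falls in the boundary region rather than the lower approximation. No discretization step is needed, which is the advantage over classical rough sets that the abstract describes.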

Entropy ◽  
2019 ◽  
Vol 21 (2) ◽  
pp. 138 ◽  
Author(s):  
Lin Sun ◽  
Lanying Wang ◽  
Jiucheng Xu ◽  
Shiguang Zhang

For continuous numerical data sets, neighborhood rough sets-based attribute reduction is an important step for improving classification performance. However, most of the traditional reduction algorithms can only handle finite sets, and yield low accuracy and high cardinality. In this paper, a novel attribute reduction method using Lebesgue and entropy measures in neighborhood rough sets is proposed, which can deal with continuous numerical data whilst maintaining the original classification information. First, the Fisher score method is employed to eliminate irrelevant attributes and thereby significantly reduce computational complexity for high-dimensional data sets. Then, the Lebesgue measure is introduced into neighborhood rough sets to investigate uncertainty measures. In order to analyze the uncertainty and noise of neighborhood decision systems well, some neighborhood entropy-based uncertainty measures are presented based on Lebesgue and entropy measures, and by combining the algebra view with the information view in neighborhood rough sets, a neighborhood roughness joint entropy is developed in neighborhood decision systems. Moreover, some of their properties are derived and the relationships are established, which helps to understand the essence of knowledge and the uncertainty of neighborhood decision systems. Finally, a heuristic attribute reduction algorithm is designed to improve the classification performance on large-scale complex data. Experimental results on an illustrative instance and several public data sets show that the proposed method is very effective for selecting the most relevant attributes with high classification accuracy.
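The Fisher score prefiltering step can be sketched as follows. This uses the common textbook form of the score (between-class scatter of the attribute means divided by the within-class variance, both weighted by class size), which may differ in detail from the variant used in the paper. The function names and the toy data are assumptions for illustration.

```python
from collections import defaultdict

def fisher_scores(data, labels):
    """Fisher score of each attribute: larger means better class separation."""
    n, m = len(data), len(data[0])
    groups = defaultdict(list)
    for row, y in zip(data, labels):
        groups[y].append(row)
    scores = []
    for j in range(m):
        overall_mean = sum(row[j] for row in data) / n
        between = within = 0.0
        for rows in groups.values():
            col = [r[j] for r in rows]
            mean = sum(col) / len(col)
            var = sum((v - mean) ** 2 for v in col) / len(col)
            between += len(col) * (mean - overall_mean) ** 2
            within += len(col) * var
        if within > 0:
            scores.append(between / within)
        else:
            # Zero within-class variance: perfectly separating or constant.
            scores.append(float("inf") if between > 0 else 0.0)
    return scores

def top_k_attributes(data, labels, k):
    """Indices of the k attributes with the highest Fisher score."""
    scores = fisher_scores(data, labels)
    return sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)[:k]

# Toy data (assumed): attribute 0 separates the classes, attribute 1 does not.
data = [[1.0, 5.0], [2.0, 1.0], [9.0, 5.0], [10.0, 1.0]]
labels = [0, 0, 1, 1]
print(fisher_scores(data, labels))        # [64.0, 0.0]
print(top_k_attributes(data, labels, 1))  # [0]
```

Only the attributes kept by such a filter would then be passed to the more expensive neighborhood-based reduction, which is how the abstract motivates the step for high-dimensional data.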


Author(s):  
Qing-Hua Zhang ◽  
Long-Yang Yao ◽  
Guan-Sheng Zhang ◽  
Yu-Ke Xin

In this paper, a new incremental knowledge acquisition method is proposed based on rough set theory, decision trees and granular computing. In order to process dynamic data effectively, the paper first analyzes how to describe the data with rough set theory and how to compute equivalence classes and the positive region with a hash algorithm. Then, attribute reduction, value reduction and the extraction of the rule set by the hash algorithm are completed efficiently. Finally, for each newly added data item, the incremental knowledge acquisition method is used to update the original rules. Both algorithm analysis and experiments show that, for processing dynamic information systems, the time complexity of the proposed algorithm is lower than that of the traditional algorithms and of the incremental knowledge acquisition algorithms based on granular computing, owing to the efficiency of the hash algorithm, and that the algorithm is more effective when dealing with huge data sets.
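The idea of computing equivalence classes and the positive region by hashing can be sketched with a dictionary keyed on attribute-value tuples, which groups the objects in a single pass instead of comparing them pairwise. This is a generic illustration rather than the authors' algorithm; the function names and the toy decision table are assumptions.

```python
from collections import defaultdict

def equivalence_classes(rows, attrs):
    """Group object indices by their values on attrs, using a hash table."""
    buckets = defaultdict(list)
    for i, row in enumerate(rows):
        buckets[tuple(row[a] for a in attrs)].append(i)
    return list(buckets.values())

def positive_region(rows, decisions, attrs):
    """Objects whose equivalence class carries a single decision value."""
    pos = set()
    for block in equivalence_classes(rows, attrs):
        if len({decisions[i] for i in block}) == 1:
            pos.update(block)
    return pos

# Toy decision table (assumed): two condition attributes, one decision.
rows = [("sunny", "hot"), ("sunny", "hot"), ("rain", "mild"),
        ("rain", "cool"), ("sunny", "mild")]
decisions = ["no", "yes", "yes", "no", "yes"]
print(positive_region(rows, decisions, [0, 1]))  # {2, 3, 4}
print(positive_region(rows, decisions, [0]))     # set()
```

Objects 0 and 1 agree on both condition attributes but disagree on the decision, so they stay outside the positive region. Each call costs time roughly linear in the number of objects, which is the kind of saving the abstract attributes to hashing.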


Author(s):  
ZHIMING ZHANG ◽  
JINGFENG TIAN

Intuitionistic fuzzy (IF) rough sets are a generalization of traditional rough sets obtained by combining IF set theory and rough set theory. The existing research on IF rough sets mainly concentrates on the establishment of lower and upper approximation operators using constructive and axiomatic approaches. Less effort has been put into the attribute reduction of databases based on IF rough sets. This paper systematically studies attribute reduction based on IF rough sets. Firstly, attribute reduction with traditional rough sets and some concepts of IF rough sets are reviewed. Then, we introduce some concepts and theorems of attribute reduction with IF rough sets, and thoroughly investigate the structure of attribute reduction. Employing the discernibility matrix approach, an algorithm to find all attribute reductions is also presented. Finally, an example is given to illustrate our idea and method. Altogether, these findings lay a solid theoretical foundation for attribute reduction based on IF rough sets.
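The discernibility matrix approach can be illustrated in the traditional (crisp) rough set setting that the paper reviews before moving to IF rough sets. The sketch below is not the IF version: it records, for each pair of objects with different decisions, the set of attributes that tell them apart, and then enumerates every minimal attribute subset that intersects all of those sets. The function names, the brute-force enumeration, and the toy table are assumptions for illustration.

```python
from itertools import combinations

def discernibility_entries(rows, decisions):
    """Non-empty attribute sets that distinguish pairs with different decisions."""
    entries = set()
    for (r1, d1), (r2, d2) in combinations(zip(rows, decisions), 2):
        if d1 != d2:
            diff = frozenset(a for a in range(len(r1)) if r1[a] != r2[a])
            if diff:
                entries.add(diff)
    return entries

def all_reducts(rows, decisions):
    """All minimal attribute subsets hitting every discernibility entry.

    Candidates are visited by increasing size, so any candidate that
    contains an already found reduct is not minimal and is skipped.
    """
    entries = discernibility_entries(rows, decisions)
    n_attrs = len(rows[0])
    reducts = []
    for size in range(n_attrs + 1):
        for cand in combinations(range(n_attrs), size):
            chosen = set(cand)
            hits_all = all(chosen & e for e in entries)
            if hits_all and not any(set(r) <= chosen for r in reducts):
                reducts.append(cand)
    return reducts

# Toy decision table (assumed): attributes 0 and 2 each determine the decision.
rows = [(1, 0, 0), (1, 1, 0), (0, 1, 1), (0, 0, 1)]
decisions = [0, 0, 1, 1]
print(all_reducts(rows, decisions))  # [(0,), (2,)]
```

The enumeration is exponential in the number of attributes, so it only suits small tables; finding all reducts is known to be computationally hard in general.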


Author(s):  
Sharmila Banu K. ◽  
B. K. Tripathy

Rough Set Theory partitions a universe using single layered granulation. The equivalence classes induced by rough sets are based on discretised values. Considering the fact that the spatial data are continuous at large, discretising them may cause loss of data. Neighborhood approximations can lead to closely related coverings using continuous values. Besides, the spatial attributes also need to be given due consideration and should be handled unlike non-spatial attributes in the process of dimensionality reduction. This chapter analyses the use of Neighborhood rough sets for continuous data and handling spatially correlated attributes using rough sets.


Complexity ◽  
2018 ◽  
Vol 2018 ◽  
pp. 1-11 ◽  
Author(s):  
Tengfei Zhang ◽  
Fumin Ma ◽  
Jie Cao ◽  
Chen Peng ◽  
Dong Yue

Parallel attribute reduction is one of the most important topics in current research on rough set theory. Although some parallel algorithms are well documented, most of them still face challenges in dealing effectively with complex heterogeneous data that include both categorical and numerical attributes. To address this problem, a novel attribute reduction algorithm based on neighborhood multigranulation rough sets was developed to process massive heterogeneous data in parallel. The MapReduce-based parallelization method for attribute reduction was proposed in the framework of neighborhood multigranulation rough sets. To improve the reduction efficiency, hashing Map/Reduce functions were designed to speed up the positive region calculation. Thereafter, a quick parallel attribute reduction algorithm using MapReduce was developed. The effectiveness and superiority of this parallel algorithm were demonstrated by theoretical analysis and comparison experiments.


Complexity ◽  
2018 ◽  
Vol 2018 ◽  
pp. 1-8 ◽  
Author(s):  
Kai Zeng ◽  
Siyuan Jing

Rough set theory has been successfully applied to many fields, such as data mining, pattern recognition, and machine learning. Kernel rough sets and neighborhood rough sets are two important models that differ in terms of granulation. The kernel rough sets model, which has fuzziness, is susceptible to noise in the decision system. The neighborhood rough sets model can handle noisy data well but cannot describe the fuzziness of the samples. In this study, we define a novel model called kernel neighborhood rough sets, which integrates the advantages of the neighborhood and kernel models. Moreover, the model is used in the problem of feature selection. The proposed method is tested on the UCI datasets. The results show that our model outperforms classic models.


Author(s):  
Jingjing Song ◽  
Huili Dou ◽  
Xiansheng Rao ◽  
Xiaojing Luo ◽  
Xuan Yan

As a feature selection technique in rough set theory, attribute reduction has been extensively explored from various viewpoints, especially that of granularity, and multi-granularity attribute reduction has attracted much attention. Nevertheless, multiple granularities must be considered simultaneously to evaluate the significance of candidate attributes while computing a reduct, which may result in a long search time. To alleviate this problem, an acceleration strategy for neighborhood-based multi-granularity attribute reduction is proposed in this paper, which aims to improve the computational efficiency of searching for a reduct. The proposed approach is realized through the positive approximation mechanism: candidate attributes are evaluated over a gradually reduced sample space rather than over all samples. The experimental results over 12 UCI data sets demonstrate that the acceleration strategy computes a multi-granularity reduct in less elapsed time than the naive approach, without generating different reducts.
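The positive approximation mechanism can be sketched for a single granularity (one fixed radius), which is a simplification of the multi-granularity setting in the paper. In this forward greedy search, samples whose neighborhood is already label-pure are removed after each round, so later candidates are scored on fewer samples. The function names, the Chebyshev (max-norm) distance, the tie-breaking by attribute index, and the toy data are all assumptions for illustration.

```python
def pure_samples(data, labels, attrs, delta, universe):
    """Samples in `universe` whose delta-neighborhood (max-norm on attrs,
    restricted to `universe`) contains a single decision label."""
    pure = set()
    for i in universe:
        consistent = all(
            labels[j] == labels[i]
            for j in universe
            if max(abs(data[i][a] - data[j][a]) for a in attrs) <= delta
        )
        if consistent:
            pure.add(i)
    return pure

def accelerated_reduct(data, labels, delta):
    """Forward greedy selection over a gradually shrinking sample space."""
    remaining_attrs = set(range(len(data[0])))
    universe = set(range(len(data)))
    reduct = []
    while universe and remaining_attrs:
        best_attr, best_pure = None, set()
        for a in sorted(remaining_attrs):
            pure = pure_samples(data, labels, reduct + [a], delta, universe)
            if len(pure) > len(best_pure):
                best_attr, best_pure = a, pure
        if best_attr is None:      # no candidate resolves any more samples
            break
        reduct.append(best_attr)
        remaining_attrs.remove(best_attr)
        universe -= best_pure      # shrink: drop samples already resolved
    return reduct

# Toy data (assumed): attribute 0 resolves four samples, attribute 1 resolves
# the remaining two, and attribute 2 is constant.
data = [(0.0, 0.0, 0.5), (0.1, 0.9, 0.5), (0.5, 0.0, 0.5),
        (0.5, 1.0, 0.5), (0.9, 0.1, 0.5), (1.0, 1.0, 0.5)]
labels = [0, 0, 0, 1, 1, 1]
print(accelerated_reduct(data, labels, delta=0.2))  # [0, 1]
```

In the second round only two samples remain, so each candidate is checked against two samples instead of six. Restricting neighborhoods to the remaining samples is safe here because adding attributes can only shrink a neighborhood, so a removed sample is always of the same class as any remaining sample it neighbors.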


2012 ◽  
Vol 2012 ◽  
pp. 1-24 ◽  
Author(s):  
Feng Hu ◽  
Guoyin Wang

The divide and conquer method is a typical granular computing method using multiple levels of abstraction and granulation. Although some results based on the divide and conquer method have been obtained in rough set theory, systematic methods for knowledge reduction based on the divide and conquer method are still absent. In this paper, knowledge reduction approaches based on the divide and conquer method, under an equivalence relation and under a tolerance relation, are presented. After that, a systematic approach, named the abstract process for knowledge reduction based on the divide and conquer method in rough set theory, is proposed. Based on the presented approach, two algorithms for knowledge reduction, an algorithm for attribute reduction and an algorithm for attribute value reduction, are presented. Experimental evaluations are carried out to test the methods on UCI data sets and KDDCUP99 data sets. The experimental results illustrate that the proposed approaches process large data sets efficiently and achieve a good recognition rate compared with KNN, SVM, C4.5, Naive Bayes, and CART.


Author(s):  
Sharmila Banu K ◽  
B. K. Tripathy

Rough set theory partitions a universe using single-layered granulation. The equivalence classes induced by rough sets are based on discretized values. Considering the fact that the spatial data are continuous at large, discretizing them may cause loss of data. Neighborhood approximations can lead to closely related coverings using continuous values. Besides, the spatial attributes also need to be given due consideration and should be handled unlike non-spatial attributes in the process of dimensionality reduction. This chapter analyzes the use of neighborhood rough sets for continuous data and handling spatially correlated attributes using rough sets.

