A Modified Rough Set Approach to Incomplete Information Systems

2007 ◽  
Vol 2007 ◽  
pp. 1-13 ◽  
Author(s):  
E. A. Rady ◽  
M. M. E. Abd El-Monsef ◽  
W. A. Abd El-Latif

The key point of the tolerance relation or similarity relation presented in the literature is to assign a “null” value to all missing attribute values. In other words, a “null” value may be equal to any value in the domain of the attribute. This can seriously distort data analysis and decision analysis, because the missing values are merely “missed”: they do exist and have an influence on the decision. In this paper, we introduce a modified similarity relation, denoted MSIM, that depends on the number of missing values relative to the number of defined attributes for each object. Under the definition of MSIM, many problems concerning generalized decisions are solved; this point may also find wide application in statistical scaling. In addition, a new definition of the discernibility matrix, the deduction of decision rules, and reducts in the presence of missing values are obtained.
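The contrast between a plain tolerance relation and a modification that accounts for how much of an object is missing can be sketched as follows (the threshold rule here is a hypothetical illustration of the idea, not the paper's exact MSIM definition):

```python
# Plain tolerance over objects with missing ("*") values, plus an
# illustrative stricter variant that also bounds the fraction of
# missing attributes per object.

def tolerant(x, y):
    """Classic tolerance: attributes agree wherever both are known."""
    return all(a == b or a == "*" or b == "*" for a, b in zip(x, y))

def msim_like(x, y, max_missing_ratio=0.5):
    """Hypothetical modification: an object with too many missing
    values is not considered similar to anything."""
    for obj in (x, y):
        if obj.count("*") / len(obj) > max_missing_ratio:
            return False
    return tolerant(x, y)

u1 = ["high", "*", "yes"]
u2 = ["high", "low", "yes"]
u3 = ["*", "*", "yes"]     # two of three attribute values missing

print(tolerant(u1, u2))    # True: "*" matches anything
print(tolerant(u3, u2))    # True under plain tolerance
print(msim_like(u3, u2))   # False: too many missing values
```

Under plain tolerance, u3 is similar to everything, which is exactly the over-permissiveness the modified relation is meant to curb.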

Author(s):  
Hemant Rana ◽  
Manohar Lal

Handling of missing attribute values is a big challenge for data analysis. For handling this type of problem, there are some well-known approaches, including Rough Set Theory (RST) and classification via clustering. In the work reported here, RSES (Rough Set Exploration System), a tool based on the RST approach, and WEKA (Waikato Environment for Knowledge Analysis), a data mining tool based on classification via clustering, are used for predicting learning styles from given data that may have missing values. The results of the experiments show that the problem of missing attribute values is better handled by the RST approach than by classification via clustering. Further, with respect to missing values, RSES yields better decision rules if the missing values are simply ignored than if some values are assigned in place of the missing attribute values.


Author(s):  
Héctor Oscar Nigro ◽  
Sandra Elizabeth González Císaro

Today’s technology allows storing vast quantities of information from sources of different natures. This information has missing values, nulls, internal variation, taxonomies, and rules. We need a new type of data analysis that allows us to represent the complexity of reality while maintaining the internal variation and structure (Diday, 2003). In the data analysis process, or data mining, it is necessary to know the nature of null values - whether a case arises from an absent value, a null value, or a default value - and it is also possible and valid to have some imprecision, due to differing semantics of a concept, diverse sources, linguistic imprecision, elements summarized in the database, human errors, etc. (Chavent, 1997). So we need conceptual support to handle these types of situations. As we will see below, Symbolic Data Analysis (SDA) is a new field based on a strong conceptual model called the Symbolic Object (SO). An SO is defined by its “intent,” which contains a way to find its “extent.” For instance, the description of the habitants of a region, together with the way of allocating an individual to that region, is called the “intent”; the set of individuals that satisfies this intent is called the “extent” (Diday, 2003). For this type of analysis, different experts are needed, each contributing their concepts.
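The intent/extent idea can be illustrated with a minimal sketch (all data and names here are invented for illustration, not taken from the SDA literature):

```python
# A Symbolic Object in miniature: the intent is a description plus a
# way of matching individuals; the extent is the set of individuals
# that satisfy it.

individuals = [
    {"name": "a", "age": 34, "region": "north"},
    {"name": "b", "age": 51, "region": "north"},
    {"name": "c", "age": 28, "region": "south"},
]

def intent(ind):
    """Habitants of the north region aged between 30 and 60."""
    return ind["region"] == "north" and 30 <= ind["age"] <= 60

extent = [ind["name"] for ind in individuals if intent(ind)]
print(extent)  # ['a', 'b']
```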


Complexity ◽  
2019 ◽  
Vol 2019 ◽  
pp. 1-17
Author(s):  
Zhaohao Wang ◽  
Xiaoping Zhang

How to effectively deal with missing values in incomplete information systems (IISs) according to the research target is still a key issue in investigating IISs. If the missing values in IISs are not handled properly, they will destroy the internal connections of the data and reduce the efficiency of data usage. In this paper, in order to establish effective methods for filling missing values, we propose a new information system, namely, the fuzzy set-valued information system (FSvIS). By means of similarity measures of fuzzy sets, we obtain several binary relations in FSvISs and investigate the relationships among them. This is a foundation for research on FSvISs in terms of the rough set approach. Then, we provide an algorithm to fill the missing values in IISs with fuzzy set values; in fact, this algorithm can transform an IIS into an FSvIS. Furthermore, we also construct an algorithm to fill the missing values in IISs with set values (or real values). The effectiveness of these algorithms is analyzed. The results show that the proposed algorithms achieve a higher correct rate than traditional algorithms and have good stability. Finally, we discuss the importance of these algorithms for investigating IISs from the viewpoint of rough set theory.
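A common similarity measure for fuzzy sets, S(A, B) = Σ min / Σ max, together with one plausible way of turning missing values into fuzzy set values, can be sketched as follows (the filling rule is an assumption for illustration, not the paper's algorithm):

```python
# Compare objects of a fuzzy set-valued column with a standard
# similarity measure for fuzzy sets given as membership dicts.

def similarity(a, b):
    """S(A, B) = sum of min memberships / sum of max memberships."""
    keys = set(a) | set(b)
    num = sum(min(a.get(k, 0.0), b.get(k, 0.0)) for k in keys)
    den = sum(max(a.get(k, 0.0), b.get(k, 0.0)) for k in keys)
    return num / den if den else 1.0

def to_fuzzy(value, domain):
    """Illustrative filling rule: a known value becomes a crisp
    singleton; a missing value ("*") becomes a uniform fuzzy set
    over the attribute's domain."""
    if value == "*":
        m = 1.0 / len(domain)
        return {v: m for v in domain}
    return {value: 1.0}

domain = ["low", "mid", "high"]
x = to_fuzzy("low", domain)
y = to_fuzzy("*", domain)
print(round(similarity(x, x), 3))  # 1.0
print(round(similarity(x, y), 3))  # 0.2
```

Once every cell is a fuzzy set, such similarities induce the binary relations the abstract mentions.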


2011 ◽  
Vol 187 ◽  
pp. 251-256
Author(s):  
Lei Wang ◽  
Tian Rui Li ◽  
Jun Ye

The essence of rough set theory (RST) is to deal with inconsistent problems by means of two definable subsets, called the lower and upper approximations, respectively. The Asymmetric Similarity relation based Rough Sets (ASRS) model is one extension of the classical rough set model to incomplete information systems. In this paper, we propose a new matrix view of the ASRS model and give the matrix representation of the lower and upper approximations of a concept under the ASRS model. From this matrix view, a new method is obtained for calculating the lower and upper approximations under the ASRS model. An example illustrates the process of calculating the approximations of a concept from the matrix point of view.
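The matrix viewpoint in general can be sketched as follows: row i of a boolean relation matrix encodes the similarity class of object i, and the approximations reduce to row-wise boolean tests (this illustrates the generic matrix method, not the specific asymmetric ASRS relation):

```python
# Lower/upper approximations of a concept from a boolean relation
# matrix R and a concept X given as a boolean vector.

def approximations(R, X):
    n = len(R)
    lower = [i for i in range(n)
             if all(not r or x for r, x in zip(R[i], X))]  # class(i) inside X
    upper = [i for i in range(n)
             if any(r and x for r, x in zip(R[i], X))]     # class(i) meets X
    return lower, upper

# Three objects; objects 1 and 2 share a class that straddles X.
R = [[1, 0, 0],
     [0, 1, 1],
     [0, 1, 1]]
X = [1, 1, 0]   # concept {0, 1}
print(approximations(R, X))  # ([0], [0, 1, 2])
```

Object 0's class lies entirely inside X, so only it is in the lower approximation, while all three classes meet X, giving the full upper approximation.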


Author(s):  
JUNHONG WANG ◽  
JIYE LIANG ◽  
YUHUA QIAN ◽  
CHUANGYIN DANG

Rough set theory is a relatively new mathematical tool for computer applications in circumstances characterized by vagueness and uncertainty. In this paper, we address the uncertainty of rough sets for incomplete information systems. An axiomatic definition of knowledge granulation for incomplete information systems is obtained, under which a measure of uncertainty of a rough set is proposed. This measure has some nice properties, such as equivalence, maximum, and minimum. Furthermore, we prove that the uncertainty measure is effective and suitable for measuring the roughness and accuracy of rough sets for incomplete information systems.
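One widely used form of knowledge granulation for incomplete systems, GK(A) = (1/|U|²) Σ|S_A(u_i)| with S_A(u_i) the tolerance class of u_i, can be sketched as follows (a representative measure of this kind, not necessarily the paper's exact axiomatized definition):

```python
# Knowledge granulation from tolerance classes: the coarser the
# classes, the larger the value; finer knowledge gives a smaller one.

def tolerant(x, y):
    """Tolerance relation with "*" standing for a missing value."""
    return all(a == b or a == "*" or b == "*" for a, b in zip(x, y))

def granulation(objects):
    n = len(objects)
    # sum of tolerance-class sizes over all objects, normalized by n^2
    total = sum(sum(tolerant(x, y) for y in objects) for x in objects)
    return total / (n * n)

U = [["a", "*"], ["a", "p"], ["b", "q"]]
print(granulation(U))  # 5/9: five tolerant pairs among nine
```

At the extremes, all-distinct objects give 1/n (minimum) and a single all-missing-style class gives 1 (maximum), the boundary properties such a measure is expected to satisfy.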


2019 ◽  
Vol 2019 ◽  
pp. 1-9
Author(s):  
Jiucheng Xu ◽  
Yun Wang ◽  
Keqiang Xu ◽  
Tianli Zhang

To select more effective feature genes, many existing algorithms focus on selecting and evaluating feature genes while ignoring the accurate mapping of the original information during data processing. To solve this problem, a new model is proposed in this paper: the rough uncertainty metric model. First, the fuzzy neighborhood granule of a sample is constructed by combining the fuzzy similarity relation with the neighborhood radius in the rough set, and the rough decision is defined using the fuzzy similarity relation and the decision equivalence class. Then, the fuzzy neighborhood granule and the rough decision are introduced into the conditional entropy, and the rough uncertainty metric model is proposed; meanwhile, the definition of the significance measure for feature genes and the proofs of some related theorems are given. To make the model tolerant of noise in the data, this paper introduces a variable precision model and discusses the selection of its parameters. Finally, based on the rough uncertainty metric model, we design a feature gene selection algorithm and compare it with some existing similar algorithms. The experimental results show that the proposed algorithm selects a smaller feature gene subset with higher classification accuracy, verifying that the proposed model is more effective.
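The idea of a fuzzy neighborhood granule, a fuzzy similarity cut off at a neighborhood radius, can be sketched in one dimension (both the similarity form and the radius are illustrative choices, not the paper's exact construction):

```python
# Fuzzy neighborhood granule of a sample x: membership of each other
# sample is its fuzzy similarity to x, zeroed outside radius delta.

def fuzzy_neighborhood(x, samples, delta=0.3):
    granule = {}
    for name, y in samples.items():
        d = abs(x - y)              # 1-D distance for simplicity
        sim = max(0.0, 1.0 - d)     # a simple fuzzy similarity
        granule[name] = sim if d <= delta else 0.0
    return granule

samples = {"u1": 0.50, "u2": 0.65, "u3": 0.95}
print(fuzzy_neighborhood(0.5, samples))
# u3 lies outside the radius, so its membership is 0.0
```

Granules of this kind are what get plugged into the conditional entropy to score candidate feature genes.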


Author(s):  
R. FÉLIX ◽  
T. USHIO

Rough set based methods have been applied successfully in many real-world applications such as data mining, knowledge discovery, machine learning, and control. Rough set theory is used to deal with imperfect data and to eliminate dispensable, superfluous, and redundant information so as to obtain a simplified set of decision rules. Thus, several approaches and methods have been proposed to find minimal coverings, from which the decision rules can be induced. Many of these approaches encourage an improvement in the utilization of computational resources. In this paper, a binary encoding for attribute sets and the discernibility matrix is proposed. Such a binary representation of sets and set operations in the implementation of algorithms provides a machine-oriented approach to the utilization of computational memory and allows parallel processing among groups of attributes. The discernibility matrix is reduced to its minimal size through the identification of main patterns in order to eliminate redundancies. Bit-wise operations replace set operations, so the search for minimal coverings is performed efficiently. The resulting improvement is shown in the analysis of medium-sized data sets using two generic methods to obtain minimal coverings.
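The binary-encoding idea can be sketched with integer bitmasks, where discernibility-matrix entries and set operations become bit-wise operations (the attribute names and tiny table are illustrative):

```python
# Each attribute set is an integer bitmask: bit i stands for ATTRS[i].
# Discernibility-matrix entries are then bitmasks, and intersection,
# union, and subset tests become &, |, and mask comparisons.

ATTRS = ["color", "size", "shape"]

def discern(x, y):
    """Bitmask of the attributes on which objects x and y differ."""
    mask = 0
    for i, (a, b) in enumerate(zip(x, y)):
        if a != b:
            mask |= 1 << i
    return mask

objects = [("red", "big", "round"),
           ("red", "small", "round"),
           ("blue", "small", "square")]

matrix = [[discern(x, y) for y in objects] for x in objects]
print(matrix[0][1])                   # 2 (binary 010: differ on "size")
print(matrix[0][2] & matrix[0][1])    # bit-wise AND replaces intersection
```

A machine word thus holds an entire attribute set, which is what makes the memory savings and word-parallel processing described above possible.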


2011 ◽  
Vol 50-51 ◽  
pp. 180-184
Author(s):  
Jin Peng Wang ◽  
Bao Xiang Liu ◽  
Zhen Dong Li ◽  
Li Chao Feng

Concept lattices and rough sets are powerful tools for data analysis and processing and have been successfully applied in many fields. However, the decision information is incomplete in many information systems. In this paper, the definition of the incomplete concept lattice is proposed, and some relations are established between incomplete concept lattices and rough sets. Importantly, the paper gives a new attribute reduction algorithm for incomplete concept lattices, aimed at the inefficiency of reduction strategies based on the discernibility matrix. Compared with discernibility-matrix-based attribute reduction for incomplete concept lattices, the proposed algorithm reduces the spatial and temporal complexity.

