Reducing Side Effects of Hiding Sensitive Itemsets in Privacy Preserving Data Mining

Intense work in the area of data mining technology and in its applications to several domains has resulted in the development of a large variety of techniques and tools able to automatically and intelligently transform large amounts of data into knowledge relevant to users. However, as with other kinds of useful technologies, the knowledge discovery process can be misused. It can be used, for example, by malicious subjects to reconstruct sensitive information for which they do not have explicit access authorization. This type of “attack” cannot easily be detected because the data used to guess the protected information is usually freely accessible. For this reason, many research efforts have recently been devoted to addressing the problem of privacy preservation in data mining. The mission of this chapter is therefore to introduce the reader to this new research field and to provide the proper instruments (in terms of concepts, techniques and examples) to allow a critical comprehension of the advantages, the limitations and the open issues of Privacy Preserving Data Mining techniques.

DCT Image Steganography Analysis for Privacy Preserving Data Mining

International Journal of Technology Diffusion ◽ 10.4018/ijtd.2016070101 ◽ 2016 ◽ Vol 7 (3) ◽ pp. 1-9 ◽ Cited By ~ 1

Author(s): Sahar A. El-Rahman Ismail ◽ Dalal Al Makhdhub ◽ Amal A. Al Qahtani ◽ Ghadah A. Al Shabanat ◽ Nouf M. Omair ◽ ...

Keyword(s): Data Mining ◽ Information Hiding ◽ Signal To Noise Ratio ◽ Low Frequency ◽ Privacy Preserving ◽ Sensitive Information ◽ Signal To Noise ◽ Privacy Preserving Data Mining ◽ Frequency Domains ◽ Privacy And Confidentiality

We live in an information era where sensitive information extracted from data mining systems is vulnerable to exploitation. Privacy preserving data mining aims to prevent the discovery of sensitive information. Information hiding systems provide excellent privacy and confidentiality, and steganography can secure confidential communications over public channels. Steganography techniques exploit cover media to hide the payload's existence within appropriate multimedia carriers. This paper studies steganography techniques in the spatial and frequency domains, then analyzes the performance of Discrete Cosine Transform (DCT) based steganography in the low and middle frequencies, comparing them using Peak Signal to Noise Ratio (PSNR) and Mean Square Error (MSE). The experimental results show that the middle frequency has the larger message capacity and the best performance.
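
As a rough illustration of the two metrics named in this abstract, the sketch below computes MSE and PSNR for a pair of tiny grayscale images. The pixel values, the function names `mse` and `psnr`, and the 8-bit peak value of 255 are assumptions made for the example; none of them is taken from the paper.

```python
import math


def mse(cover, stego):
    """Mean square error between two equally sized grayscale images (lists of rows)."""
    total = 0
    count = 0
    for row_c, row_s in zip(cover, stego):
        for c, s in zip(row_c, row_s):
            total += (c - s) ** 2
            count += 1
    return total / count


def psnr(cover, stego, max_value=255):
    """Peak signal-to-noise ratio in dB; max_value=255 assumes 8-bit pixels."""
    err = mse(cover, stego)
    if err == 0:
        return math.inf  # identical images: no distortion at all
    return 10 * math.log10(max_value ** 2 / err)


# Hypothetical 3x3 cover image and a stego version with three pixels changed by 1.
cover = [[52, 55, 61], [59, 79, 61], [76, 62, 59]]
stego = [[52, 56, 61], [59, 78, 61], [76, 62, 60]]

print(f"MSE  = {mse(cover, stego):.4f}")
print(f"PSNR = {psnr(cover, stego):.2f} dB")
```

A lower MSE and a higher PSNR both indicate that the stego image is closer to the cover image, which is how embedding in different frequency bands can be compared.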

Privacy Preserving Data Mining

Computer Security, Privacy and Politics ◽ 10.4018/978-1-59904-804-8.ch005 ◽ 2008 ◽ pp. 70-93 ◽ Cited By ~ 2

Author(s): Madhu V. Ahluwalia ◽ Aryya Gangopadhyay

Keyword(s): Data Mining ◽ Privacy Preserving ◽ Accurate Data ◽ Privacy Preserving Data Mining ◽ Confidential Data ◽ Vast Area ◽ Major Data

This chapter gives a synopsis of the techniques that exist in the area of privacy preserving data mining. Privacy preserving data mining is important because there is a need to develop accurate data mining models without using confidential data items in individual records. In providing a neat categorization of the current algorithms that preserve privacy for major data mining tasks, the authors hope that students, teachers and researchers can gain an understanding of this vast area and apply the knowledge gained to find new ways of simultaneously preserving privacy and conducting mining.

Privacy Preserving Data Mining, Concepts, Techniques, and Evaluation Methodologies

Successes and New Directions in Data Mining ◽ 10.4018/978-1-59904-645-7.ch012 ◽ 2008 ◽ pp. 277-301

Author(s): Igor Nai Fovino

Keyword(s): Data Mining ◽ Knowledge Discovery ◽ Privacy Preserving ◽ Research Field ◽ Sensitive Information ◽ Discovery Process ◽ Privacy Preserving Data Mining ◽ Mining Technology ◽ Evaluation Methodologies ◽ New Research

Intense work in the area of data mining technology and in its applications to several domains has resulted in the development of a large variety of techniques and tools able to automatically and intelligently transform large amounts of data into knowledge relevant to users. However, as with other kinds of useful technologies, the knowledge discovery process can be misused. It can be used, for example, by malicious subjects to reconstruct sensitive information for which they do not have explicit access authorization. This type of “attack” cannot easily be detected because the data used to guess the protected information is usually freely accessible. For this reason, many research efforts have recently been devoted to addressing the problem of privacy preservation in data mining. The mission of this chapter is therefore to introduce the reader to this new research field and to provide the proper instruments (in terms of concepts, techniques and examples) to allow a critical comprehension of the advantages, the limitations and the open issues of Privacy Preserving Data Mining techniques.

Bit Transformation Perturbative Masking Technique for Protecting Sensitive Information In Privacy Preserving Data Mining

International Journal of Database Management Systems ◽ 10.5121/ijdms.2010.2409 ◽ 2010 ◽ Vol 2 (4) ◽ pp. 107-114 ◽ Cited By ~ 1

Author(s): S Vijayarani ◽ A Tamilarasi

Keyword(s): Data Mining ◽ Privacy Preserving ◽ Sensitive Information ◽ Privacy Preserving Data Mining

Collusion-Free Privacy Preserving Data Mining

International Journal of Intelligent Information Technologies ◽ 10.4018/jiit.2010100103 ◽ 2010 ◽ Vol 6 (4) ◽ pp. 30-45 ◽ Cited By ~ 7

Author(s): M. Rajalakshmi ◽ T. Purusothaman ◽ S. Pratheeba

Keyword(s): Data Mining ◽ Association Rule ◽ Privacy Preserving ◽ Frequent Itemsets ◽ Data Sources ◽ Sensitive Information ◽ Distributed Data ◽ Distributed Environment ◽ Rule Mining ◽ Privacy Preserving Data Mining

Distributed association rule mining is an integral part of data mining that extracts useful information hidden in distributed data sources. As local frequent itemsets are globalized from data sources, sensitive information about individual data sources needs strong protection. Different privacy preserving data mining approaches for distributed environments have been proposed, but in the existing approaches, collusion among the participating sites reveals sensitive information about the other sites. In this paper, the authors propose a collusion-free algorithm for mining global frequent itemsets in a distributed environment with minimal communication among sites. This algorithm splits and sanitizes the itemsets and communicates with random sites in two different phases, making it difficult for colluders to retrieve sensitive information. Results show that the consequences of collusion are greatly reduced without affecting mining performance, and confirm optimal communication among sites.
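
To make the idea of splitting local information across sites more concrete, here is a minimal sketch of additive splitting of support counts, a generic secure-sum construction. It is not the authors' algorithm: the abstract does not give their splitting or sanitizing steps, so the functions `split_count`, `global_support` and `is_globally_frequent`, the modulus, and the example counts are all assumptions made for illustration.

```python
import random


def split_count(count, parts, rng, modulus=2**31):
    """Split a local support count into `parts` random shares that sum to it mod `modulus`."""
    shares = [rng.randrange(modulus) for _ in range(parts - 1)]
    shares.append((count - sum(shares)) % modulus)
    return shares


def global_support(local_counts, rng, modulus=2**31):
    """Sum local counts without any single site seeing another site's raw count."""
    n = len(local_counts)
    inbox = [[] for _ in range(n)]  # inbox[j] holds the shares sent to site j
    for count in local_counts:
        for j, share in enumerate(split_count(count, n, rng, modulus)):
            inbox[j].append(share)
    # Each site publishes only the sum of the shares it received.
    partial_sums = [sum(box) % modulus for box in inbox]
    return sum(partial_sums) % modulus


def is_globally_frequent(local_counts, local_sizes, min_sup, rng):
    """An itemset is globally frequent if its total count reaches min_sup of all transactions."""
    return global_support(local_counts, rng) >= min_sup * sum(local_sizes)


# Hypothetical example: three sites with local support counts for one itemset.
rng = random.Random(0)
print(global_support([12, 7, 30], rng))                              # 49
print(is_globally_frequent([12, 7, 30], [40, 20, 40], 0.4, rng))     # True
```

Each share on its own looks like a random number, so a site learns nothing about another site's count from one share. This basic scheme does not by itself defend against every collusion pattern, which is the problem the paper addresses.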

Collusion-Free Privacy Preserving Data Mining

Insights into Advancements in Intelligent Information Technologies ◽ 10.4018/978-1-4666-0158-1.ch015 ◽ 2012 ◽ pp. 269-284

Author(s): T. Purusothaman ◽ M. Rajalakshmi ◽ S. Pratheeba

Keyword(s): Data Mining ◽ Privacy Preserving ◽ Frequent Itemsets ◽ Data Sources ◽ Sensitive Information ◽ Distributed Data ◽ Distributed Environment ◽ Rule Mining ◽ Privacy Preserving Data Mining ◽ Distributed Association

Distributed association rule mining is an integral part of data mining that extracts useful information hidden in distributed data sources. As local frequent itemsets are globalized from data sources, sensitive information about individual data sources needs strong protection. Different privacy preserving data mining approaches for distributed environments have been proposed, but in the existing approaches, collusion among the participating sites reveals sensitive information about the other sites. In this paper, the authors propose a collusion-free algorithm for mining global frequent itemsets in a distributed environment with minimal communication among sites. This algorithm splits and sanitizes the itemsets and communicates with random sites in two different phases, making it difficult for colluders to retrieve sensitive information. Results show that the consequences of collusion are greatly reduced without affecting mining performance, and confirm optimal communication among sites.

A Grid-Based Swarm Intelligence Algorithm for Privacy-Preserving Data Mining

Applied Sciences ◽ 10.3390/app9040774 ◽ 2019 ◽ Vol 9 (4) ◽ pp. 774 ◽ Cited By ~ 6

Author(s): Tsu-Yang Wu ◽ Jerry Lin ◽ Yuyu Zhang ◽ Chun-Hao Chen

Keyword(s): Data Mining ◽ Side Effects ◽ Evolutionary Process ◽ Privacy Preserving ◽ Optimal Solutions ◽ Confidential Information ◽ NSGA-II ◽ Privacy Preserving Data Mining ◽ Single Objective ◽ Grid Based

Privacy-preserving data mining (PPDM) has become an interesting and emerging topic in recent years because it helps hide confidential information while still allowing useful knowledge to be discovered. Data sanitization is a common way to perturb a database so that sensitive or confidential information can be hidden. PPDM is not a trivial task and can be considered a Non-deterministic Polynomial-time (NP)-hard problem. Many algorithms have been studied to derive optimal solutions using the evolutionary process, although most are based on straightforward or single-objective methods used to discover the candidate transactions/items for sanitization. In this paper, we present a multi-objective algorithm using a grid-based method (called GMPSO) to find optimal solutions as candidates for sanitization. The designed GMPSO uses two strategies for updating gbest and pbest during the evolutionary process. Moreover, the pre-large concept is adapted herein to speed up the evolutionary process by reducing the multiple database scans needed in each evolutionary iteration. With the designed GMPSO, multiple Pareto solutions, rather than the single solution of single-objective algorithms, can be derived based on Pareto dominance. In addition, the side effects of the sanitization process can be significantly reduced. Experiments have shown that the designed GMPSO achieves better results on side effects than the previous single-objective algorithm and the NSGA-II-based approach, and the pre-large concept also helps reduce the computational cost compared to the NSGA-II-based algorithm.
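
The Pareto dominance relation mentioned in this abstract can be sketched in a few lines. The code below is a generic illustration under stated assumptions, not the GMPSO implementation: every objective is minimized, and each candidate is a hypothetical tuple of three side-effect counts whose meaning (for example hiding failures, missing itemsets, artificial itemsets) is assumed for the example.

```python
def dominates(a, b):
    """True if `a` is no worse than `b` on every objective and strictly better on at least one.

    All objectives are assumed to be minimized.
    """
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def pareto_front(solutions):
    """Keep only the solutions that no other solution dominates."""
    return [s for s in solutions if not any(dominates(o, s) for o in solutions)]


# Hypothetical candidates, each a tuple of three side-effect counts.
candidates = [(0, 5, 2), (1, 3, 1), (0, 6, 3), (2, 3, 1), (1, 4, 0)]
front = pareto_front(candidates)
print(front)  # [(0, 5, 2), (1, 3, 1), (1, 4, 0)]
```

Here `(0, 6, 3)` is dropped because `(0, 5, 2)` is at least as good everywhere and better on two objectives, and `(2, 3, 1)` is dropped because of `(1, 3, 1)`. The three survivors trade one side effect against another, which is why a multi-objective method returns a set of solutions rather than one.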

Association Rule Hiding in Privacy Preserving Data Mining

Research Anthology on Privatizing and Securing Data ◽ 10.4018/978-1-7998-8954-0.ch044 ◽ 2021 ◽ pp. 963-986

Author(s): S. Vijayarani Mohan ◽ Tamilarasi Angamuthu

Keyword(s): Data Mining ◽ Genetic Algorithm ◽ Association Rules ◽ Association Rule ◽ Privacy Preserving ◽ Sensitive Information ◽ Hidden Information ◽ Privacy Preserving Data Mining ◽ Marketing Information ◽ Mining Association Rule

This article describes how privacy preserving data mining has become one of the most important and interesting research directions in data mining. With the help of data mining techniques, people can extract hidden information and discover patterns and relationships between data items. In most situations, the extracted knowledge contains sensitive information about individuals and organizations. Moreover, this sensitive information can be misused for various purposes that violate the individual's privacy. Association rules frequently predetermine significant target marketing information about a business. Significant association rules provide knowledge to the data miner as they effectively summarize the data, while uncovering any hidden relations among items that hold in the data. Association rule hiding techniques are used to protect the knowledge that sensitive association rules would reveal during association rule mining. Association rule hiding refers to the process of modifying the original database in such a way that certain sensitive association rules disappear without seriously affecting the data and the non-sensitive rules. In this article, two new hiding techniques are proposed: a hiding technique based on a genetic algorithm (HGA) and a dummy items creation (DIC) technique. The genetic-algorithm-based technique hides sensitive association rules, while the dummy items creation technique hides the sensitive rules and also creates dummy items for the modified sensitive items. Experimental results show the performance of the proposed techniques.
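
The general idea of association rule hiding, modifying the database until a sensitive rule no longer meets the mining threshold, can be shown with a simple greedy sketch. This is not the HGA or DIC technique from the article: the function names, the choice to delete one "victim" item from supporting transactions, the toy database and the 0.6 confidence threshold are all assumptions made for illustration.

```python
def count(db, itemset):
    """Number of transactions that contain every item in `itemset`."""
    items = set(itemset)
    return sum(1 for t in db if items <= t)


def confidence(db, antecedent, consequent):
    """Confidence of the rule antecedent -> consequent."""
    ante = count(db, antecedent)
    both = count(db, set(antecedent) | set(consequent))
    return both / ante if ante else 0.0


def hide_rule(db, antecedent, consequent, min_conf, victim):
    """Return a sanitized copy of `db` in which the rule's confidence is below `min_conf`.

    Greedy strategy (an assumption): delete `victim`, an item of the consequent,
    from supporting transactions one at a time until the rule is hidden.
    """
    sanitized = [set(t) for t in db]  # work on a copy; the original stays intact
    rule_items = set(antecedent) | set(consequent)
    for t in sanitized:
        if confidence(sanitized, antecedent, consequent) < min_conf:
            break
        if rule_items <= t:
            t.discard(victim)
    return sanitized


# Hypothetical market-basket database; the sensitive rule is bread -> milk.
db = [{"bread", "milk"}, {"bread", "milk", "eggs"}, {"bread", "milk"},
      {"bread"}, {"milk", "eggs"}]
sanitized = hide_rule(db, {"bread"}, {"milk"}, min_conf=0.6, victim="milk")

print(confidence(db, {"bread"}, {"milk"}))         # 0.75
print(confidence(sanitized, {"bread"}, {"milk"}))  # 0.5
```

Only one transaction had to change to push the rule below the threshold. Every deleted item can also weaken non-sensitive rules that involve it, which is the kind of side effect that hiding techniques try to keep small.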

Privacy Preserving Data Mining

Advances in Data Mining and Database Management - Data Mining in Public and Private Sectors ◽ 10.4018/978-1-60566-906-9.ch007 ◽ 2010 ◽ pp. 125-141

Author(s): Aris Gkoulalas-Divanis ◽ Vassilios S. Verykios

Keyword(s): Data Mining ◽ Privacy Preserving ◽ Future Research ◽ Sensitive Information ◽ Privacy Preserving Data Mining ◽ Research Directions ◽ Mobility Data ◽ Future Research Directions ◽ The Government ◽ Existing Data

Since its inception in 2000, privacy preserving data mining has gained increasing popularity in the data mining research community. This line of research can be primarily attributed to the growing concern of individuals, organizations and the government regarding the violation of privacy in the mining of their data by existing data mining technology. As a result, a whole new body of research was introduced to allow for the mining of data while at the same time prohibiting the leakage of any private and sensitive information. In this chapter, the authors introduce readers to the field of privacy preserving data mining; they discuss the reasons that led to its inception, the most prominent research directions, and some important methodologies for each direction. Following that, the authors focus on very recently investigated methodologies for offering privacy during the mining of user mobility data. At the end of the chapter, they provide a roadmap along with potential future research directions, both for privacy-aware mobility data mining and for privacy preserving data mining at large.
