(p+, α, t)-Anonymity Technique Against Privacy Attacks

Sowmyarani C. N.; Veena Gadad;  Dayananda P.

doi:10.4018/ijisp.2021040104

International Journal of Information Security and Privacy ◽

10.4018/ijisp.2021040104 ◽

2021 ◽

Vol 15 (2) ◽

pp. 68-86

Author(s):

Sowmyarani C. N. ◽

Veena Gadad ◽

Dayananda P.

Keyword(s):

Private Information ◽

Data Analytics ◽

Privacy Preservation ◽

Original Form ◽

Sensitive Information ◽

Data Anonymization ◽

Attack Model ◽

Target Individual ◽

Anonymized Data ◽

Privacy Attack

Privacy preservation is a major concern in current technology, where enormous amounts of data are collected and published for analysis. These data may contain sensitive information about the individuals who own them. If the data are published in their original form, they may lead to privacy disclosure, which threatens privacy requirements. Hence, the data should be anonymized before publishing so that it becomes challenging for intruders to obtain sensitive information by means of any privacy attack model. Popular data anonymization techniques such as k-anonymity, l-diversity, p-sensitive k-anonymity, (l, m, d)-anonymity, and t-closeness are vulnerable to the different privacy attacks discussed in this paper. The proposed technique, called (p+, α, t)-anonymity, aims to anonymize the data in such a way that even an intruder with sufficient background knowledge of the target individual will not be able to infer anything or breach private information. The anonymized data also provide sufficient data utility by allowing various data analytics to be performed.
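
To make the vulnerability of the baseline techniques concrete, the following is a minimal sketch that checks k-anonymity and distinct l-diversity on a toy table. It does not implement (p+, α, t)-anonymity, whose definition is not given in the abstract. The column names, the table contents, and the function names are hypothetical and chosen only for illustration.

```python
from collections import defaultdict

def equivalence_classes(records, quasi_identifiers):
    """Group records that share the same quasi-identifier values."""
    groups = defaultdict(list)
    for rec in records:
        key = tuple(rec[q] for q in quasi_identifiers)
        groups[key].append(rec)
    return groups

def is_k_anonymous(records, quasi_identifiers, k):
    """True if every equivalence class holds at least k records."""
    groups = equivalence_classes(records, quasi_identifiers)
    return all(len(g) >= k for g in groups.values())

def is_l_diverse(records, quasi_identifiers, sensitive, l):
    """True if every class holds at least l distinct sensitive values."""
    groups = equivalence_classes(records, quasi_identifiers)
    return all(len({r[sensitive] for r in g}) >= l for g in groups.values())

# Hypothetical, already generalized toy table (not data from the paper).
table = [
    {"age": "30-39", "zip": "560*", "disease": "flu"},
    {"age": "30-39", "zip": "560*", "disease": "flu"},
    {"age": "40-49", "zip": "561*", "disease": "cancer"},
    {"age": "40-49", "zip": "561*", "disease": "asthma"},
]

print(is_k_anonymous(table, ["age", "zip"], 2))           # True
print(is_l_diverse(table, ["age", "zip"], "disease", 2))  # False
```

The table is 2-anonymous, yet the first class contains only one disease value, so an intruder who knows the target falls in that class learns the sensitive value. This is the kind of attribute disclosure that k-anonymity alone does not prevent.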

Data Privacy Preservation and Security Approaches for Sensitive Data in Big Data

10.3233/apc210221 ◽

2021 ◽

Author(s):

Rohit Ravindra Nikam ◽

Rekha Shahapurkar

Keyword(s):

Data Mining ◽

Data Analytics ◽

Data Privacy ◽

Privacy Preservation ◽

Large Data ◽

Research Area ◽

Data Sets ◽

Sensitive Information ◽

Sensitive Data ◽

Data Mining Techniques

Data mining is a technique in which the necessary data is extracted from large data sets. Privacy protection in data mining is about hiding sensitive information or identities to prevent security breaches without losing data usability. Sensitive data contains confidential information about individuals, businesses, and governments, which must not be shared or published without their agreement. Preserving privacy in data mining has become a critical research area. Various evaluation metrics, such as time efficiency, data utility, and degree of complexity or resistance to data mining techniques, are used to assess privacy-preserving data mining techniques. Social media and smartphones produce tons of data every minute. The voluminous data produced from these different sources can be processed and analyzed to support decision making, but data analytics is vulnerable to breaches of privacy. One data analytics framework is the recommendation system, commonly used by e-commerce sites such as Amazon and Flipkart to recommend items to customers based on their purchasing habits, which leads to customers being characterized. This paper presents various privacy preservation techniques used by existing researchers, such as data anonymization, data randomization, generalization, and data permutation. We also analyze the gap between various processes and privacy preservation methods and illustrate how to overcome such issues with new methods. Finally, our research summarizes the outcomes of the entire literature.

Utility-based SK-Clustering algorithm for Privacy Preservation of Anonymized Data in Healthcare

Recent Advances in Computer Science and Communications ◽

10.2174/2666255813666190920100913 ◽

2019 ◽

Vol 13 ◽

Author(s):

Shobana G ◽

S. Shankar

Keyword(s):

Relative Error ◽

Privacy Preservation ◽

Clustering Algorithm ◽

Clustering Algorithms ◽

Sensitive Information ◽

K Value ◽

Identity Disclosure ◽

Anonymized Data ◽

High Utility ◽

Privacy And Confidentiality

Background: The increasing need for various data publishers to release or share healthcare datasets poses a threat to the privacy and confidentiality of Electronic Medical Records. However, the main goal is to share useful information, thereby maximizing utility while ensuring that sensitive information is not disclosed. There always exists a utility-privacy tradeoff, which needs to be handled properly for researchers to learn the statistical properties of the datasets. Objective: The objective of the research article is to introduce a novel SK-Clustering algorithm that overcomes identity disclosure, attribute disclosure, and similarity attacks. The algorithm is evaluated using metrics such as the discernibility measure and relative error to show its performance compared with other clustering algorithms. Methodology: The SK-Clustering algorithm flexibly adjusts the level of protection for high utility. The size of the clusters is also minimized dynamically based on the level of protection required, and extra tuples are added accordingly. This drastically reduces information loss, thereby increasing utility. Result and Conclusion: For a k-value of 50, the discernibility measure of the SK algorithm is 65000, whereas the Mondrian algorithm exhibits a discernibility measure of 70000 and the Anatomy algorithm has a discernibility measure of 150000. Similarly, the relative error of our algorithm is less than 10% for a tuple count of 35000 when compared to other k-anonymity algorithms. The proposed algorithm performs better in terms of minimal discernibility measure as well as relative error, thereby providing higher data utility compared with traditionally available algorithms.
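
The abstract reports discernibility values without defining the metric. A minimal sketch follows, assuming the commonly used definition in which each record is penalized by the size of its equivalence class, so the total is the sum of squared class sizes (suppressed records are ignored here). Whether the article uses exactly this variant is an assumption, and the sample rows and function name are hypothetical.

```python
from collections import Counter

def discernibility(records, quasi_identifiers):
    """Sum of squared equivalence-class sizes; lower means less information loss."""
    sizes = Counter(tuple(r[q] for q in quasi_identifiers) for r in records)
    return sum(n * n for n in sizes.values())

# Hypothetical generalized rows: one class of size 2 and one of size 3.
rows = [
    {"age": "20-29"}, {"age": "20-29"},
    {"age": "30-39"}, {"age": "30-39"}, {"age": "30-39"},
]

print(discernibility(rows, ["age"]))  # 2*2 + 3*3 = 13
```

Under this definition, smaller clusters give a lower score, which is consistent with the abstract's claim that dynamically minimizing cluster size reduces information loss.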

Ameliorating the Privacy on Large Scale Aviation Dataset by Implementing MapReduce Multidimensional Hybrid k-Anonymization

International Journal of Web Portals ◽

10.4018/ijwp.2019070102 ◽

2019 ◽

Vol 11 (2) ◽

pp. 14-40

Author(s):

Stephen Dass A. ◽

Prabhu J.

Keyword(s):

Data Storage ◽

Data Analytics ◽

Large Scale ◽

Privacy Preservation ◽

Processing Time ◽

Big Data Analytics ◽

Generation Process ◽

Data Generation ◽

Privacy Issue ◽

Anonymized Data

In this fast-growing data universe, data generation and data storage are moving into the next-generation process by generating petabytes and gigabytes in an hour. This leads to data accumulation in which privacy and preservation are certainly misplaced. These data contain some sensitive and highly private data that must be hidden or removed using hashing or anonymization algorithms. In this article, the authors propose a hybrid k-anonymity algorithm to handle large-scale aircraft datasets, combining the concepts of Big Data analytics and privacy preservation and storing the dataset with the help of MapReduce. The published anonymized data are moved by MapReduce to the Hive database for data storage. The authors propose a multi-dimensional hybrid k-anonymity technique to solve the privacy issue and compare the proposed system with two other anonymization methods, BUG and TDS. Three experiments were performed: evaluating classifier error, calculating the disruption value and p% hybrid anonymity, and estimating processing time.

Survey on privacy preserving data mining techniques in health care databases

Acta Universitatis Sapientiae Informatica ◽

10.2478/ausi-2014-0017 ◽

2014 ◽

Vol 6 (1) ◽

pp. 33-55 ◽

Cited By ~ 1

Author(s):

Tamás Zoltán Gál ◽

Gábor Kovács ◽

Zsolt T. Kardkovács

Keyword(s):

Data Mining ◽

Health Care ◽

Case Studies ◽

Private Information ◽

Privacy Preservation ◽

State Of The Art ◽

Legal Environment ◽

Privacy Preserving Data Mining ◽

Data Anonymization ◽

Health Care Data

In health care databases, there are tireless and antagonistic interests between data mining research and privacy preservation: the more you try to hide sensitive private information, the less valuable the data is for analysis. In this paper, we give an outlook on data anonymization problems through case studies. We give a summary of state-of-the-art health care data anonymization issues, including the legal environment and expectations, the most common attack strategies on privacy, and the proposed metrics for evaluating the usefulness and privacy preservation of anonymization. Finally, we summarize the strengths and shortcomings of different approaches and techniques from the literature based on these evaluations.

Personalized trajectory privacy-preserving method based on sensitive attribute generalization and location perturbation

Intelligent Data Analysis ◽

10.3233/ida-205306 ◽

2021 ◽

Vol 25 (5) ◽

pp. 1247-1271

Author(s):

Chuanming Chen ◽

Wenshi Lin ◽

Shuanggui Zhang ◽

Zitong Ye ◽

Qingying Yu ◽

...

Keyword(s):

Private Information ◽

Privacy Protection ◽

Privacy Preservation ◽

Background Knowledge ◽

Sensitive Information ◽

Frequent Patterns ◽

Trajectory Data ◽

Preservation Method ◽

Sensitive Attribute ◽

Theoretical Analyses

Trajectory data may include the user’s occupation, medical records, and other similar information. However, attackers can use specific background knowledge to analyze published trajectory data and access a user’s private information. Different users have different requirements regarding the anonymity of sensitive information. To satisfy personalized privacy protection requirements and minimize data loss, we propose a novel trajectory privacy preservation method based on sensitive attribute generalization and trajectory perturbation. The proposed method can prevent an attacker who has a large amount of background knowledge and has exchanged information with other attackers from stealing private user information. First, a trajectory dataset is clustered and frequent patterns are mined according to the clustering results. Thereafter, the sensitive attributes found within the frequent patterns are generalized according to the user requirements. Finally, the trajectory locations are perturbed to achieve trajectory privacy protection. The results of theoretical analyses and experimental evaluations demonstrate the effectiveness of the proposed method in preserving personalized privacy in published trajectory data.

PURA-SCIS Protocol: A Novel Solution for Cloud-Based Information Sharing Protection for Sectoral Organizations

Symmetry ◽

10.3390/sym13122347 ◽

2021 ◽

Vol 13 (12) ◽

pp. 2347

Author(s):

Fandi Aditya Putra ◽

Kalamullah Ramli ◽

Nur Hayati ◽

Teddy Surya Gunawan

Keyword(s):

Information Sharing ◽

Private Information ◽

Privacy Protection ◽

Data Privacy ◽

Privacy Preservation ◽

Cloud Services ◽

Sensitive Information ◽

Public And Private ◽

Related Information ◽

Public Verifiability

Over recent years, the incidence of data breaches and cyberattacks has increased significantly. This has highlighted the need for sectoral organizations to share information about such events so that lessons can be learned to mitigate the prevalence and severity of cyber incidents against other organizations. Sectoral organizations embody a governance relationship between cross-sector public and private entities, called public-private partnerships (PPPs). However, organizations are hesitant to share such information due to a lack of trust and business-critical confidentiality issues. This problem occurs because of the absence of any protocols that guarantee privacy protection and protect sensitive information. To address this issue, this paper proposes a novel protocol, Putra-Ramli Secure Cyber-incident Information Sharing (PURA-SCIS), to secure cyber incident information sharing. PURA-SCIS has been designed to offer exceptional data and privacy protection and to run on the cloud services of sectoral organizations. The relationship between organizations in PURA-SCIS is symmetrical, where the entities must collectively maintain the security of classified cyber incident information. Furthermore, the organizations must be legitimate entities in the PURA-SCIS protocol. The Scyther tool was used for protocol verification in PURA-SCIS. The experimental results showed that the proposed PURA-SCIS protocol provided good security properties, including public verifiability for all entities, blockless verification, data privacy preservation, identity privacy preservation and traceability, and private information sharing. PURA-SCIS also provided a high degree of confidentiality to protect the security and integrity of cyber-incident-related information exchanged among sectoral organizations via cloud services.

Guarantees of Differential Privacy in Cloud of Things: A Multilevel Data Publication Scheme

International Journal of Engineering Research in Africa ◽

10.4028/www.scientific.net/jera.56.199 ◽

2021 ◽

Vol 56 ◽

pp. 199-212

Author(s):

Olga Kengni Ngangmo ◽

Ado Adamou Abba Ari ◽

Alidou Mohamadou ◽

Ousmane Thiare ◽

Dina Taiwe Kolyang

Keyword(s):

Private Information ◽

Medical Information ◽

Differential Privacy ◽

Smart Devices ◽

Data Anonymization ◽

Anonymized Data ◽

Multi Level ◽

Privacy Breaches ◽

Financial Transactions ◽

New Generation

Nowadays, cloud computing technology combined with new-generation networks and the internet of things facilitates the networking of numerous smart devices. Moreover, the advent of the smart web requires massive data backup from smart connected devices to the cloud. Unfortunately, the publication of some of these data, such as medical information and financial transactions, could lead to serious privacy breaches, which are becoming the most serious issue in the cloud of things. For instance, passive attacks can be launched in order to gain access to private information. For this reason, several data anonymization techniques have emerged in order to keep data as confidential as possible. However, these techniques make the data unusable most of the time. Meanwhile, differential privacy, which has been used in a number of cyber-physical systems, recently emerged as an efficient technique for ensuring the privacy of data stored in the cloud of things. In this exploratory paper, we study the differential privacy guarantees of a multi-level anonymization scheme for data graphs. The considered scheme disturbs the structure of the graph by adding false edges, groups the vertices into distinct sets, and permutes the vertices within these groups. In particular, we demonstrate that the data anonymized by this algorithm remain exploitable while the anonymity of users is guaranteed.

K-Anonymity technique for privacy protection: a proof of concept study

10.5753/sbseg.2019.13987 ◽

2019 ◽

Author(s):

Italo Santos ◽

Emanuel Coutinho ◽

Leonardo Moreira

Keyword(s):

System Architecture ◽

Privacy Protection ◽

Personal Information ◽

Personal Space ◽

Sensitive Information ◽

Proof Of Concept ◽

Data Set ◽

Data Anonymization ◽

Privacy Model ◽

Anonymized Data

Privacy is a concept directly related to people's interest in maintaining personal space without the interference of others. In this paper, we focus on studying the k-anonymity technique, since many generalization algorithms are based on this privacy model. We therefore develop a proof of concept that uses the k-anonymity technique to anonymize raw data and generate a new file with the anonymized data. We present the system architecture and detail an experiment using the Adult data set, which contains sensitive information and in which each record corresponds to the personal information of one person. Finally, we summarize our work and discuss future work.
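
As an illustration of how generalization can reach k-anonymity, the sketch below coarsens a single numeric quasi-identifier (age) into ever wider ranges until every range holds at least k records. This is not the authors' proof of concept: the doubling strategy, the function names, and the sample ages are assumptions made only for this example.

```python
from collections import Counter

def generalize_age(age, width):
    """Map an exact age to a range label of the given width, e.g. 23 -> '16-31'."""
    low = (age // width) * width
    return f"{low}-{low + width - 1}"

def generalize_until_k(ages, k, max_width=128):
    """Double the range width until every range contains at least k ages.

    Returns (width, labels); falls back to full suppression ('*') if no
    width up to max_width satisfies k-anonymity.
    """
    if not ages:
        return 1, []
    width = 1
    while width <= max_width:
        labels = [generalize_age(a, width) for a in ages]
        if min(Counter(labels).values()) >= k:
            return width, labels
        width *= 2
    return None, ["*"] * len(ages)

# Hypothetical ages, not taken from the Adult data set.
width, labels = generalize_until_k([23, 25, 31, 34, 47, 44], k=2)
print(width, labels)
```

Wider ranges satisfy larger k but carry less information, which is the utility cost that any generalization-based approach has to balance.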

Big Data Privacy Preservation Using Two Phase Top-Down Specialization Algorithm with Multidimensional Map Reduce Framework on Hadoop

International Journal of Distributed and Cloud Computing ◽

10.21863/ijdcc/2015.3.2.009 ◽

2015 ◽

Vol 3 (2) ◽

Author(s):

Shalin Eliabeth S. ◽

Sarju S.

Keyword(s):

Big Data ◽

Data Privacy ◽

Privacy Preservation ◽

Experimental Result ◽

Map Reduce ◽

Distributed Environment ◽

Top Down ◽

Two Phase ◽

Data Anonymization ◽

Big Data Privacy

Big data privacy preservation is one of the most troubling issues in current industry. Sometimes data privacy problems are never identified when input data is published in a cloud environment. Data privacy preservation in Hadoop deals with hiding and publishing the input dataset to the distributed environment. This paper investigates the problem of big data anonymization for privacy preservation from the perspectives of scalability, time, and other factors. At present, many cloud applications that use big data anonymization face the same kind of problems. To address these problems, a data anonymization algorithm called Two-Phase Top-Down Specialization (TPTDS) is introduced and implemented in Hadoop. For data anonymization, 45,222 records of adult information with 15 attribute values were taken as the input big data. With the help of multidimensional anonymization in the MapReduce framework, the proposed Two-Phase Top-Down Specialization anonymization algorithm is implemented in Hadoop, and it increases the efficiency of the big data processing system. Experiments were conducted with the Two-Phase Top-Down Specialization algorithm on Hadoop in both one-dimensional and multidimensional MapReduce frameworks, and multidimensional anonymization gave the better result on the input adult dataset. Data sets are generalized in a top-down manner, and the better result in the multidimensional MapReduce framework is shown by the better IGPL values generated by the algorithm. The anonymization was performed with a specialization operation on a taxonomy tree. The experiments show that the solution improves the IGPL values and the anonymity parameter and decreases the execution time of big data privacy preservation compared to the existing algorithm. This experimental result has promising applications in distributed environments.

Study A Public Key in RSA Algorithm

European Journal of Engineering Research and Science ◽

10.24018/ejers.2020.5.4.1843 ◽

2020 ◽

Vol 5 (4) ◽

pp. 395-398

Author(s):

Taleb Samad Obaid

Keyword(s):

Private Information ◽

Original Data ◽

Prime Numbers ◽

Public Key ◽

The Internet ◽

Sensitive Information ◽

Rsa Algorithm ◽

Encryption And Decryption ◽

Main Disadvantage ◽

Encryption Decryption

When sensitive information is transmitted over an unsafe communication network such as the internet, securing this information is a precarious task. There is always a chance that information sent through network terminals or the internet will be uncovered by professional or amateur eavesdroppers. To protect our information, we need a secure way to safeguard it in transit. Encryption/decryption, steganography, and vital cryptography may be adopted to protect the important information. In a cryptographic system, the information transferred between sender and receiver in the network must be scrambled using an encryption algorithm. The receiver should recover the original data using the decryption algorithm. Some encryption techniques use only one key for both the encryption and decryption algorithms. When the same key is used in both procedures, the algorithm is called symmetric. Other techniques use two different keys for encryption and decryption, which is known as asymmetric-key cryptography. In general, algorithms that use asymmetric keys are much more secure than those using one key. The RSA algorithm uses asymmetric keys: one of them, known as the public key, is used to encrypt the message, and the other, called the private key, is used to decrypt the encrypted message. The main disadvantage of the RSA algorithm is the extra time taken to perform the encryption process. In this study, MATLAB library functions are used to carry out the work. The software can handle very large prime numbers to generate the required keys, which enhances the security of the transmitted information, and we expect that it will be difficult for a hacker to interfere with the private information. The algorithms are implemented successfully on message files of different sizes.
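
The public/private key relationship described above can be illustrated with a toy, textbook RSA sketch. It is written in Python rather than the MATLAB used in the study, uses deliberately tiny primes, and omits padding, so it is for illustration only and is not secure. The function names and the chosen primes are assumptions, not taken from the article.

```python
def make_keys(p, q, e):
    """Build a textbook RSA key pair from two primes p and q.

    Assumes p and q are prime and e is coprime to (p-1)*(q-1).
    Requires Python 3.8+ for the modular inverse via pow(e, -1, phi).
    """
    n = p * q
    phi = (p - 1) * (q - 1)
    d = pow(e, -1, phi)  # private exponent: e * d == 1 (mod phi)
    return (e, n), (d, n)

def encrypt(message, public_key):
    """Encrypt an integer message (0 <= message < n) with the public key."""
    e, n = public_key
    return pow(message, e, n)

def decrypt(ciphertext, private_key):
    """Recover the integer message with the private key."""
    d, n = private_key
    return pow(ciphertext, d, n)

# Tiny illustrative primes; real keys use primes hundreds of digits long.
public, private = make_keys(61, 53, 17)
cipher = encrypt(65, public)
print(public, private)           # (17, 3233) (2753, 3233)
print(cipher)                    # 2790
print(decrypt(cipher, private))  # 65
```

Both encryption and decryption are modular exponentiations, and with realistic key sizes these operate on very large integers, which is the source of the extra computation time the abstract names as RSA's main disadvantage.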
