Gradual Release of Sensitive Data under Differential Privacy

2017 ◽  
Vol 7 (2) ◽  
Author(s):  
Fragkiskos Koufogiannis ◽  
Shuo Han ◽  
George J. Pappas

We introduce the problem of releasing private data under differential privacy when the privacy level is subject to change over time. Existing work assumes that the privacy level is fixed by the system designer before the private data is released. For certain applications, however, users may wish to relax the privacy level for subsequent releases of the same data, after either a re-evaluation of their privacy concerns or a need for better accuracy. Specifically, given a database containing private data, we assume that a response \(y_1\) preserving \(\epsilon_1\)-differential privacy has already been published. The privacy level is then relaxed to \(\epsilon_2\), with \(\epsilon_2 > \epsilon_1\), and we wish to publish a more accurate response \(y_2\) such that the joint response \((y_1, y_2)\) preserves \(\epsilon_2\)-differential privacy. How much accuracy is lost by gradually releasing two responses \(y_1\) and \(y_2\), compared to releasing a single response that is \(\epsilon_2\)-differentially private? Our results cover the more general case of multiple privacy-level relaxations and show that there exists a composite mechanism that achieves no loss in accuracy. We consider private data lying in \(\mathbb{R}^n\) with the adjacency relation induced by the \(\ell_1\)-norm, and we initially focus on mechanisms that approximate identity queries. We show that the same accuracy can be achieved under gradual release by a mechanism whose outputs are described by a lazy Markov stochastic process; this process has a closed-form expression and can be sampled efficiently. Moreover, our results extend beyond identity queries to a more general family of privacy-preserving mechanisms. To this end, we demonstrate the applicability of our tools in several scenarios, including Google's RAPPOR project, trading of private data, and controlled transmission of private data in a social network. Finally, we derive similar results for approximate differential privacy.
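The accuracy gap that motivates the paper can be seen in the naive baseline: splitting the budget between two independent Laplace releases. The sketch below is illustrative only; it implements standard sequential composition, not the paper's lazy-Markov mechanism, and all parameter values are made up for the example.

```python
import math
import random

def laplace_noise(scale, rng=random):
    """Sample Lap(0, scale) via the inverse CDF."""
    u = rng.random() - 0.5
    return -scale * math.copysign(1.0, u) * math.log(1.0 - 2.0 * abs(u))

def release(x, eps, sens=1.0, rng=random):
    """An eps-DP noisy answer to the identity query on x (L1 sensitivity sens)."""
    return x + laplace_noise(sens / eps, rng)

# Naive gradual release: spend eps1 on y1, then the remaining eps2 - eps1
# on an independent y2. Sequential composition makes (y1, y2) jointly
# eps2-DP, but y2 carries noise of scale 1/(eps2 - eps1) instead of the
# 1/eps2 a single eps2-DP release would need -- the loss the paper's
# correlated mechanism avoids.
eps1, eps2 = 0.5, 2.0
rng = random.Random(0)
x = 10.0
y1 = release(x, eps1, rng=rng)          # noise scale 2.0
y2 = release(x, eps2 - eps1, rng=rng)   # noise scale ~0.67, not 0.5
```

The paper's result is that a correlated mechanism can give \(y_2\) the full \(1/\epsilon_2\) accuracy while keeping the pair jointly \(\epsilon_2\)-differentially private.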

2021 ◽  
Author(s):  
Jude TCHAYE-KONDI ◽  
Yanlong Zhai ◽  
Liehuang Zhu

We address privacy and latency issues in the edge/cloud computing environment while training a centralized AI model. In our setting, the edge devices are the only data source for the model trained on the central server. Current solutions for preserving privacy and reducing network latency rely on a pre-trained feature extractor deployed on the devices to extract only the important features from the sensitive data. However, finding a pre-trained model or a public dataset from which to build such a feature extractor can be very challenging for certain tasks. Given the large amount of data generated by edge devices, the edge environment does not really lack data, but improper access to it may raise privacy concerns. In this paper, we present DeepGuess, a new privacy-preserving and latency-aware deep-learning framework. DeepGuess uses a new learning mechanism enabled by the autoencoder (AE) architecture, called inductive learning, which makes it possible to train a central neural network using the data produced by end devices while preserving their privacy. With inductive learning, sensitive data remains on the devices and is not explicitly involved in any backpropagation process. The AE's encoder is deployed on the devices to extract and transfer the important features to the server. To further enhance privacy, we propose a new locally differentially private algorithm that lets edge devices apply random noise to the features extracted from their sensitive data before transferring them to an untrusted server. The experimental evaluation of DeepGuess demonstrates its effectiveness and its ability to converge across a series of experiments.
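As a rough illustration of such a device-side perturbation step, the sketch below clips a feature vector and adds Laplace noise before it leaves the device. The clipping bound, sensitivity accounting, and noise calibration are assumptions made for the example, not DeepGuess's actual algorithm.

```python
import math
import random

def privatize_features(features, eps, clip=1.0, rng=random):
    """Clip each feature to [-clip, clip], so the L1 sensitivity of the
    whole vector is at most 2 * clip * len(features), then add Laplace
    noise calibrated to eps before the vector leaves the device."""
    clipped = [max(-clip, min(clip, f)) for f in features]
    scale = 2.0 * clip * len(clipped) / eps

    def lap():
        u = rng.random() - 0.5
        return -scale * math.copysign(1.0, u) * math.log(1.0 - 2.0 * abs(u))

    return [f + lap() for f in clipped]

# Device-side usage (encoder_output is whatever the AE encoder emits):
# noisy = privatize_features(encoder_output, eps=2.0)
```

Scaling the sensitivity with the vector length, as done here, makes the noise grow with dimensionality; practical schemes usually bound the vector's norm directly instead.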


Author(s):  
Divya Asok ◽  
Chitra P. ◽  
Bharathiraja Muthurajan

In past years, the use of the internet and the quantity of digital data generated by large organizations, firms, and governments have led researchers to focus on the security of private data. The collected data is usually tied to a definite need. For example, in the medical field, health record systems are used for the exchange of medical data. In addition to services based on users' current location, many potential services rely on users' location history or their spatio-temporal provenance. However, most of the collected data contains individually identifying information, which is sensitive. As machine learning applications reach every corner of society, privacy-preserving machine learning can contribute significantly to protecting the privacy of both individuals and institutions. This chapter gives a wide perspective on the current literature on privacy-preserving machine learning and deep learning techniques, along with the non-cryptographic differential privacy approach for ensuring the privacy of sensitive data.


2021 ◽  
Author(s):  
Syed Usama Khalid Bukhari ◽  
Anum Qureshi ◽  
Adeel Anjum ◽  
Munam Ali Shah

Privacy preservation of high-dimensional healthcare data is an emerging problem. Privacy breaches are becoming more common and affect thousands of people. Every individual has sensitive personal information that needs protection and security. Uploading and storing data directly to the cloud without precautions can lead to serious privacy breaches. Publishing large amounts of sensitive data while minimizing privacy concerns is a serious challenge, which forces crucial decisions about the privacy of outsourced high-dimensional healthcare data. Many privacy-preservation techniques have been proposed to secure high-dimensional data while retaining both its utility and its privacy, but every technique has its pros and cons. In this paper, a novel privacy-preservation model for high-dimensional data, NRPP, is proposed. The model uses a privacy-preserving generative technique for releasing sensitive data that is differentially private. The contribution of this paper is twofold. First, a state-of-the-art anonymization model for high-dimensional healthcare data is proposed using a generative technique. Second, the achieved privacy is evaluated using the concept of differential privacy. Experiments show that the proposed model performs better in terms of utility.



2019 ◽  
Author(s):  
Iago Chaves ◽  
Javam Machado

Privacy concerns are growing fast because of data protection regulations around the world. Many works have designed private algorithms that avoid leaking sensitive information through data publication. Differential privacy, based on formal definitions, is a strong guarantee of individual privacy and the cutting edge for designing private algorithms. This work proposes a differentially private group-by algorithm for data publication based on the exponential mechanism. Our method publishes data groups according to a specified attribute while maintaining the desired privacy level and trustworthy utility.
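The exponential mechanism underlying such a release can be sketched as follows; the count-based score function and sensitivity of 1 are illustrative assumptions for the example, not necessarily the paper's choices.

```python
import math
import random

def exponential_mechanism(candidates, score, eps, sensitivity, rng=random):
    """Select one candidate with probability proportional to
    exp(eps * score / (2 * sensitivity)) -- the standard exponential
    mechanism."""
    scores = [score(c) for c in candidates]
    top = max(scores)  # subtract the max score for numerical stability
    weights = [math.exp(eps * (s - top) / (2.0 * sensitivity)) for s in scores]
    total = sum(weights)
    r = rng.random() * total
    for c, w in zip(candidates, weights):
        r -= w
        if r <= 0.0:
            return c
    return candidates[-1]

# Hypothetical use: publish one group name under eps-DP, scoring each
# group by its record count (adding or removing one record changes any
# count by at most 1, so the sensitivity is 1).
groups = {"A": 120, "B": 80, "C": 5}
published = exponential_mechanism(list(groups), lambda g: groups[g],
                                  eps=1.0, sensitivity=1.0)
```

Larger `eps` concentrates the selection on the highest-scoring group; smaller `eps` flattens the distribution toward uniform.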



Computation ◽  
2021 ◽  
Vol 9 (1) ◽  
pp. 6
Author(s):  
Maria Eleni Skarkala ◽  
Manolis Maragoudakis ◽  
Stefanos Gritzalis ◽  
Lilian Mitrou

Distributed medical, financial, or social databases are analyzed daily for the discovery of patterns and useful information. Privacy concerns have emerged because some database segments contain sensitive data. Data mining techniques are used to parse, process, and manage enormous amounts of data while ensuring the preservation of private information. Cryptography, as shown by previous research, is the most accurate approach to acquiring knowledge while maintaining privacy. In this paper, we present an extension of a privacy-preserving data mining algorithm, thoroughly designed and developed for both horizontally and vertically partitioned databases containing either nominal or numeric attribute values. The proposed algorithm exploits a multi-candidate election scheme to construct a privacy-preserving tree-augmented naive Bayesian classifier, a more robust variant of the classical naive Bayes classifier. The security analysis shows that the use of the Paillier cryptosystem and its distinctive homomorphic primitive ensures privacy, and that the proposed algorithm provides strong defences against common attacks. Experiments on real-world databases demonstrate that private data is preserved while the mining process runs and that both database partition types are handled efficiently.
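The homomorphic primitive in question is Paillier's additive property: multiplying two ciphertexts yields an encryption of the sum of the plaintexts, so parties can aggregate values without seeing them. A toy implementation with insecurely small primes, for illustration only:

```python
import math
import random

def generate_toy_paillier(p=293, q=433):
    """Toy Paillier keypair from tiny primes -- insecure, demo only.
    Uses g = n + 1, which simplifies decryption."""
    n = p * q
    lam = math.lcm(p - 1, q - 1)
    mu = pow(lam, -1, n)  # valid because g = n + 1
    return (n, n * n), (lam, mu)

def encrypt(pub, m, rng=random):
    n, n2 = pub
    r = rng.randrange(1, n)
    while math.gcd(r, n) != 1:  # r must be coprime to n
        r = rng.randrange(1, n)
    return (pow(n + 1, m, n2) * pow(r, n, n2)) % n2

def decrypt(pub, priv, c):
    n, n2 = pub
    lam, mu = priv
    return ((pow(c, lam, n2) - 1) // n) * mu % n

def add_encrypted(pub, c1, c2):
    """E(m1) * E(m2) mod n^2 decrypts to m1 + m2."""
    return (c1 * c2) % pub[1]
```

A real deployment would use primes of 1024 bits or more; the algebra is identical.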



2021 ◽  
Vol 21 (2) ◽  
pp. 1-22
Author(s):  
Abhinav Kumar ◽  
Sanjay Kumar Singh ◽  
K Lakshmanan ◽  
Sonal Saxena ◽  
Sameer Shrivastava

The advancements in the Internet of Things (IoT) and cloud services have enabled the availability of smart e-healthcare services in a distant and distributed environment. However, this has also raised major privacy and efficiency concerns that need to be addressed. Privacy is a major challenge when sharing clinical data, which often contains sensitive patient-related information, across the cloud. Adequate protection of patients' privacy helps to increase public trust in medical research. Additionally, deep learning (DL)-based models are complex, and efficient data processing in such models is complicated in a cloud-based approach. To address these challenges, we propose an efficient and secure cancer diagnostic framework for histopathological image classification that utilizes both differential privacy and secure multi-party computation. For efficient computation, instead of performing the whole operation on the cloud, we decouple the layers into two modules: one for feature extraction using the VGGNet module on the user side, and the remaining layers for private prediction over the cloud. The efficacy of the framework is validated on two datasets composed of histopathological images of canine mammary tumors and human breast cancer. The application of differential privacy makes the proposed model secure and capable of protecting sensitive data from any adversary, without significantly compromising model accuracy. Extensive experiments show that the proposed model efficiently achieves the trade-off between privacy and model performance.
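The secure multi-party computation side of such a design typically rests on secret sharing. The sketch below shows plain additive sharing over a prime field, as a generic illustration of the primitive rather than the paper's actual protocol.

```python
import random

PRIME = 2_147_483_647  # field modulus for the shares (2**31 - 1)

def share(x, n_parties, rng=random):
    """Split x into n additive shares mod PRIME: any n-1 shares look
    uniformly random, while all n together reconstruct x."""
    parts = [rng.randrange(PRIME) for _ in range(n_parties - 1)]
    parts.append((x - sum(parts)) % PRIME)
    return parts

def reconstruct(parts):
    return sum(parts) % PRIME

def add_shared(a_parts, b_parts):
    """Each party adds its two shares locally; reconstructing the
    sums yields a + b without revealing a or b individually."""
    return [(a + b) % PRIME for a, b in zip(a_parts, b_parts)]
```

Addition is "free" under this scheme (each party works locally); multiplication of shared values requires extra interaction, which is where full MPC protocols come in.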

