Gradual Release of Sensitive Data under Differential Privacy

2017 ◽  
Vol 7 (2) ◽  
Author(s):  
Fragkiskos Koufogiannis ◽  
Shuo Han ◽  
George J. Pappas

We introduce the problem of releasing private data under differential privacy when the privacy level is subject to change over time. Existing work assumes that the privacy level is fixed by the system designer before the private data is released. For certain applications, however, users may wish to relax the privacy level for subsequent releases of the same data, after either a re-evaluation of their privacy concerns or a need for better accuracy. Specifically, given a database containing private data, we assume that a response \(y_1\) preserving \(\epsilon_1\)-differential privacy has already been published. The privacy level is then relaxed to \(\epsilon_2\), with \(\epsilon_2 > \epsilon_1\), and we wish to publish a more accurate response \(y_2\) such that the joint response \((y_1, y_2)\) preserves \(\epsilon_2\)-differential privacy. How much accuracy is lost by gradually releasing two responses \(y_1\) and \(y_2\), compared to releasing a single response that is \(\epsilon_2\)-differentially private? Our results cover the more general case of multiple privacy-level relaxations and show that there exists a composite mechanism that achieves no loss in accuracy. We consider private data lying in \(\mathbb{R}^n\) with the adjacency relation induced by the \(\ell_1\)-norm, and we initially focus on mechanisms that approximate identity queries. We show that the same accuracy can be achieved under gradual release by a mechanism whose outputs are described by a lazy Markov stochastic process; this process has a closed-form expression and can be sampled efficiently. Moreover, our results extend beyond identity queries to a more general family of privacy-preserving mechanisms. To this end, we demonstrate the applicability of our tools in several scenarios, including Google's RAPPOR project, trading of private data, and controlled transmission of private data in a social network. Finally, we derive similar results for approximate differential privacy.
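The accuracy gap that motivates the paper can be seen in the naive baseline: splitting the budget between two independent Laplace releases. The sketch below is illustrative only; it implements standard sequential composition, not the paper's lazy-Markov mechanism, and all parameter values are made up for the example.

```python
import math
import random

def laplace_noise(scale, rng=random):
    """Sample Lap(0, scale) via the inverse CDF."""
    u = rng.random() - 0.5
    return -scale * math.copysign(1.0, u) * math.log(1.0 - 2.0 * abs(u))

def release(x, eps, sens=1.0, rng=random):
    """An eps-DP noisy answer to the identity query on x (L1 sensitivity sens)."""
    return x + laplace_noise(sens / eps, rng)

# Naive gradual release: spend eps1 on y1, then the remaining eps2 - eps1
# on an independent y2. Sequential composition makes (y1, y2) jointly
# eps2-DP, but y2 carries noise of scale 1/(eps2 - eps1) instead of the
# 1/eps2 a single eps2-DP release would need -- the loss the paper's
# correlated mechanism avoids.
eps1, eps2 = 0.5, 2.0
rng = random.Random(0)
x = 10.0
y1 = release(x, eps1, rng=rng)          # noise scale 2.0
y2 = release(x, eps2 - eps1, rng=rng)   # noise scale ~0.67, not 0.5
```

The paper's result is that a correlated mechanism can give \(y_2\) the full \(1/\epsilon_2\) accuracy while keeping the pair jointly \(\epsilon_2\)-differentially private.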

2021 ◽  
Author(s):  
Jude TCHAYE-KONDI ◽  
Yanlong Zhai ◽  
Liehuang Zhu

We address privacy and latency issues in the edge/cloud computing environment while training a centralized AI model. In our setting, the edge devices are the only data source for the model trained on the central server. Current solutions for preserving privacy and reducing network latency rely on a pre-trained feature extractor deployed on the devices to extract only the important features from the sensitive data. However, finding a pre-trained model or a public dataset from which to build such a feature extractor can be very challenging for certain tasks. Given the large amount of data generated by edge devices, the edge environment does not really lack data, but improper access to it may raise privacy concerns. In this paper, we present DeepGuess, a new privacy-preserving and latency-aware deep-learning framework. DeepGuess uses a new learning mechanism enabled by the autoencoder (AE) architecture, called inductive learning, which makes it possible to train a central neural network using the data produced by end devices while preserving their privacy. With inductive learning, sensitive data remains on the devices and is not explicitly involved in any backpropagation process. The AE's encoder is deployed on the devices to extract and transfer the important features to the server. To further enhance privacy, we propose a new locally differentially private algorithm that lets edge devices apply random noise to the features extracted from their sensitive data before transferring them to an untrusted server. The experimental evaluation of DeepGuess demonstrates its effectiveness and its ability to converge across a series of experiments.
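As a rough illustration of such a device-side perturbation step, the sketch below clips a feature vector and adds Laplace noise before it leaves the device. The clipping bound, sensitivity accounting, and noise calibration are assumptions made for the example, not DeepGuess's actual algorithm.

```python
import math
import random

def privatize_features(features, eps, clip=1.0, rng=random):
    """Clip each feature to [-clip, clip], so the L1 sensitivity of the
    whole vector is at most 2 * clip * len(features), then add Laplace
    noise calibrated to eps before the vector leaves the device."""
    clipped = [max(-clip, min(clip, f)) for f in features]
    scale = 2.0 * clip * len(clipped) / eps

    def lap():
        u = rng.random() - 0.5
        return -scale * math.copysign(1.0, u) * math.log(1.0 - 2.0 * abs(u))

    return [f + lap() for f in clipped]

# Device-side usage (encoder_output is whatever the AE encoder emits):
# noisy = privatize_features(encoder_output, eps=2.0)
```

Scaling the sensitivity with the vector length, as done here, makes the noise grow with dimensionality; practical schemes usually bound the vector's norm directly instead.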


Author(s):  
Divya Asok ◽  
Chitra P. ◽  
Bharathiraja Muthurajan

In past years, the use of the internet and the quantity of digital data generated by large organizations, firms, and governments have led researchers to focus on the security of private data. The collected data is usually tied to a definite need. For example, in the medical field, health record systems are used for the exchange of medical data. In addition to services based on users' current location, many potential services rely on users' location history or their spatio-temporal provenance. However, most of the collected data contains individually identifying information, which is sensitive. As machine learning applications reach every corner of society, privacy-preserving machine learning can contribute significantly to protecting the privacy of both individuals and institutions. This chapter gives a wide perspective on the current literature on privacy-preserving machine learning and deep learning techniques, along with the non-cryptographic differential privacy approach for ensuring the privacy of sensitive data.


2021 ◽  
Author(s):  
Syed Usama Khalid Bukhari ◽  
Anum Qureshi ◽  
Adeel Anjum ◽  
Munam Ali Shah

Privacy preservation of high-dimensional healthcare data is an emerging problem. Privacy breaches are becoming more common and affect thousands of people. Every individual has sensitive personal information that needs protection and security. Uploading and storing data directly to the cloud without precautions can lead to serious privacy breaches. Publishing large amounts of sensitive data while minimizing privacy concerns is a serious challenge, which forces crucial decisions about the privacy of outsourced high-dimensional healthcare data. Many privacy-preservation techniques have been proposed to secure high-dimensional data while retaining both its utility and its privacy, but every technique has its pros and cons. In this paper, a novel privacy-preservation model for high-dimensional data, NRPP, is proposed. The model uses a privacy-preserving generative technique for releasing sensitive data that is differentially private. The contribution of this paper is twofold. First, a state-of-the-art anonymization model for high-dimensional healthcare data is proposed using a generative technique. Second, the achieved privacy is evaluated using the concept of differential privacy. Experiments show that the proposed model performs better in terms of utility.



2019 ◽  
Author(s):  
Iago Chaves ◽  
Javam Machado

Privacy concerns are growing fast because of data protection regulations around the world. Many works have designed private algorithms that avoid leaking sensitive information through data publication. Differential privacy, based on formal definitions, is a strong guarantee of individual privacy and the cutting edge for designing private algorithms. This work proposes a differentially private group-by algorithm for data publication based on the exponential mechanism. Our method publishes data groups according to a specified attribute while maintaining the desired privacy level and trustworthy utility.
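The exponential mechanism underlying such a release can be sketched as follows; the count-based score function and sensitivity of 1 are illustrative assumptions for the example, not necessarily the paper's choices.

```python
import math
import random

def exponential_mechanism(candidates, score, eps, sensitivity, rng=random):
    """Select one candidate with probability proportional to
    exp(eps * score / (2 * sensitivity)) -- the standard exponential
    mechanism."""
    scores = [score(c) for c in candidates]
    top = max(scores)  # subtract the max score for numerical stability
    weights = [math.exp(eps * (s - top) / (2.0 * sensitivity)) for s in scores]
    total = sum(weights)
    r = rng.random() * total
    for c, w in zip(candidates, weights):
        r -= w
        if r <= 0.0:
            return c
    return candidates[-1]

# Hypothetical use: publish one group name under eps-DP, scoring each
# group by its record count (adding or removing one record changes any
# count by at most 1, so the sensitivity is 1).
groups = {"A": 120, "B": 80, "C": 5}
published = exponential_mechanism(list(groups), lambda g: groups[g],
                                  eps=1.0, sensitivity=1.0)
```

Larger `eps` concentrates the selection on the highest-scoring group; smaller `eps` flattens the distribution toward uniform.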



Computation ◽  
2021 ◽  
Vol 9 (1) ◽  
pp. 6
Author(s):  
Maria Eleni Skarkala ◽  
Manolis Maragoudakis ◽  
Stefanos Gritzalis ◽  
Lilian Mitrou

Distributed medical, financial, or social databases are analyzed daily for the discovery of patterns and useful information. Privacy concerns have emerged because some database segments contain sensitive data. Data mining techniques are used to parse, process, and manage enormous amounts of data while ensuring the preservation of private information. Cryptography, as shown by previous research, is the most accurate approach to acquiring knowledge while maintaining privacy. In this paper, we present an extension of a privacy-preserving data mining algorithm, thoroughly designed and developed for both horizontally and vertically partitioned databases containing either nominal or numeric attribute values. The proposed algorithm exploits a multi-candidate election scheme to construct a privacy-preserving tree-augmented naive Bayesian classifier, a more robust variant of the classical naive Bayes classifier. The security analysis shows that the use of the Paillier cryptosystem and its distinctive homomorphic primitive ensures privacy, and that the proposed algorithm provides strong defences against common attacks. Experiments on real-world databases demonstrate that private data is preserved while the mining process runs and that both database partition types are handled efficiently.
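The homomorphic primitive in question is Paillier's additive property: multiplying two ciphertexts yields an encryption of the sum of the plaintexts, so parties can aggregate values without seeing them. A toy implementation with insecurely small primes, for illustration only:

```python
import math
import random

def generate_toy_paillier(p=293, q=433):
    """Toy Paillier keypair from tiny primes -- insecure, demo only.
    Uses g = n + 1, which simplifies decryption."""
    n = p * q
    lam = math.lcm(p - 1, q - 1)
    mu = pow(lam, -1, n)  # valid because g = n + 1
    return (n, n * n), (lam, mu)

def encrypt(pub, m, rng=random):
    n, n2 = pub
    r = rng.randrange(1, n)
    while math.gcd(r, n) != 1:  # r must be coprime to n
        r = rng.randrange(1, n)
    return (pow(n + 1, m, n2) * pow(r, n, n2)) % n2

def decrypt(pub, priv, c):
    n, n2 = pub
    lam, mu = priv
    return ((pow(c, lam, n2) - 1) // n) * mu % n

def add_encrypted(pub, c1, c2):
    """E(m1) * E(m2) mod n^2 decrypts to m1 + m2."""
    return (c1 * c2) % pub[1]
```

A real deployment would use primes of 1024 bits or more; the algebra is identical.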



2021 ◽  
Vol 21 (2) ◽  
pp. 1-22
Author(s):  
Abhinav Kumar ◽  
Sanjay Kumar Singh ◽  
K Lakshmanan ◽  
Sonal Saxena ◽  
Sameer Shrivastava

The advancements in the Internet of Things (IoT) and cloud services have enabled the availability of smart e-healthcare services in a distant and distributed environment. However, this has also raised major privacy and efficiency concerns that need to be addressed. Privacy is a major challenge when sharing clinical data, which often contains sensitive patient-related information, across the cloud. Adequate protection of patients' privacy helps to increase public trust in medical research. Additionally, deep learning (DL)-based models are complex, and efficient data processing in such models is complicated in a cloud-based approach. To address these challenges, we propose an efficient and secure cancer diagnostic framework for histopathological image classification that utilizes both differential privacy and secure multi-party computation. For efficient computation, instead of performing the whole operation on the cloud, we decouple the layers into two modules: one for feature extraction using the VGGNet module on the user side, and the remaining layers for private prediction over the cloud. The efficacy of the framework is validated on two datasets composed of histopathological images of canine mammary tumors and human breast cancer. The application of differential privacy makes the proposed model secure and capable of protecting sensitive data from any adversary, without significantly compromising model accuracy. Extensive experiments show that the proposed model efficiently achieves the trade-off between privacy and model performance.
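The secure multi-party computation side of such a design typically rests on secret sharing. The sketch below shows plain additive sharing over a prime field, as a generic illustration of the primitive rather than the paper's actual protocol.

```python
import random

PRIME = 2_147_483_647  # field modulus for the shares (2**31 - 1)

def share(x, n_parties, rng=random):
    """Split x into n additive shares mod PRIME: any n-1 shares look
    uniformly random, while all n together reconstruct x."""
    parts = [rng.randrange(PRIME) for _ in range(n_parties - 1)]
    parts.append((x - sum(parts)) % PRIME)
    return parts

def reconstruct(parts):
    return sum(parts) % PRIME

def add_shared(a_parts, b_parts):
    """Each party adds its two shares locally; reconstructing the
    sums yields a + b without revealing a or b individually."""
    return [(a + b) % PRIME for a, b in zip(a_parts, b_parts)]
```

Addition is "free" under this scheme (each party works locally); multiplication of shared values requires extra interaction, which is where full MPC protocols come in.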

