Differentially Private Group-by Data Releasing Algorithm

Mapping Intimacies ◽

10.5753/sbbd.2019.8835 ◽

2019 ◽

Author(s):

Iago Chaves ◽

Javam Machado

Keyword(s):

Differential Privacy ◽

Information Leakage ◽

Sensitive Information ◽

Privacy Concerns ◽

Data Publication ◽

Private Group ◽

Individual Privacy ◽

Formal Definitions ◽

The World ◽

Privacy Level

Privacy concerns are growing fast because of data protection regulations around the world. Many works have built private algorithms avoiding sensitive information leakage through data publication. Differential privacy, based on formal definitions, is a strong guarantee for individual privacy and the cutting edge for designing private algorithms. This work proposes a differentially private group-by algorithm for data publication under the exponential mechanism. Our method publishes data groups according to a specified attribute while maintaining the desired privacy level and trustworthy utility results.

Download Full-text

Preserving Differential Privacy for Similarity Measurement in Smart Environments

The Scientific World JOURNAL ◽

10.1155/2014/581426 ◽

2014 ◽

Vol 2014 ◽

pp. 1-9 ◽

Cited By ~ 2

Author(s):

Kok-Seng Wong ◽

Myung Ho Kim

Keyword(s):

Privacy Protection ◽

Differential Privacy ◽

Smart Environments ◽

Coefficient Function ◽

Sensitive Information ◽

Smart Environment ◽

Privacy Concerns ◽

Privacy Model ◽

The Subject ◽

Measurement Metric

Advances in both sensor technologies and network infrastructures have encouraged the development of smart environments to enhance people’s life and living styles. However, collecting and storing user’s data in the smart environments pose severe privacy concerns because these data may contain sensitive information about the subject. Hence, privacy protection is now an emerging issue that we need to consider especially when data sharing is essential for analysis purpose. In this paper, we consider the case where two agents in the smart environment want to measure the similarity of their collected or stored data. We use similarity coefficient functionFSCas the measurement metric for the comparison with differential privacy model. Unlike the existing solutions, our protocol can facilitate more than one request to computeFSCwithout modifying the protocol. Our solution ensures privacy protection for both the inputs and the computedFSCresults.

Download Full-text

Gradual Release of Sensitive Data under Differential Privacy

Journal of Privacy and Confidentiality ◽

10.29012/jpc.v7i2.649 ◽

2017 ◽

Vol 7 (2) ◽

Author(s):

Fragkiskos Koufogiannis ◽

Shuo Han ◽

George J. Pappas

Keyword(s):

Stochastic Process ◽

Differential Privacy ◽

Closed Form Expression ◽

Sensitive Data ◽

Privacy Concerns ◽

Private Data ◽

Single Response ◽

Multiple Scenarios ◽

Gradual Release ◽

Privacy Level

We introduce the problem of releasing private data under differential privacy when the privacy level is subject to change over time. Existing work assumes that privacy level is determined by the system designer as a fixed value before private data is released. For certain applications, however, users may wish to relax the privacy level for subsequent releases of the same data after either a re-evaluation of the privacy concerns or the need for better accuracy. Specifically, given a database containing private data, we assume that a response y1 that preserves \( \epsilon _1\)-differential privacy has already been published. Then, the privacy level is relaxed to \( \epsilon _2\), with \( \epsilon _2 > \epsilon _1\), and we wish to publish a more accurate response y2 while the joint response (y1,y2) preserves \( \epsilon _2\)-differential privacy. How much accuracy is lost in the scenario of gradually releasing two responses y1 and y2 compared to the scenario of releasing a single response that is \( \epsilon _2\)-differentially private? Our results consider the more general case with multiple privacy level relaxations and show that there exists a composite mechanism that achieves no loss in accuracy. We consider the case in which the private data lies within Rn with an adjacency relation induced by the \( \ell _1\)-norm, and we initially focus on mechanisms that approximate identity queries. We show that the same accuracy can be achieved in the case of gradual release through a mechanism whose outputs can be described by a lazy Markov stochastic process. This stochastic process has a closed form expression and can be efficiently sampled. Moreover, our results extend beyond identity queries to a more general family of privacy-preserving mechanisms. To this end, we demonstrate the applicability of our tool to multiple scenarios including Google’s project RAPPOR, trading of private data, and controlled transmission of private data in a social network. Finally, we derive similar results for the approximated differential privacy.

Download Full-text

Differential Privacy for Evolving Network Based on GHRG

Mathematical Problems in Engineering ◽

10.1155/2020/6783949 ◽

2020 ◽

Vol 2020 ◽

pp. 1-12

Author(s):

Jing Yang ◽

Yuye Wang ◽

Jianpei Zhang

Keyword(s):

Differential Privacy ◽

Dynamic Environment ◽

Sensitive Information ◽

Data Utility ◽

High Data ◽

Evolving Networks ◽

Calculating Method ◽

Individual Privacy ◽

Local Areas ◽

Privacy Analysis

Releasing evolving networks which contain sensitive information could compromise individual privacy. In this paper, we study the problem of releasing evolving networks under differential privacy. We explore the possibility of designing a differentially private evolving networks releasing algorithm. We found that the majority of traditional methods provide a snapshot of the networks under differential privacy over a brief period of time. As the network structure only changes in local part, the amount of required noise entirely is large and it leads to an inefficient utility. To this end, we propose GHRG-DP, a novel differentially private evolving networks releasing algorithm which reduces the noise scale and achieves high data utility. In the GHRG-DP algorithm, we learn the online connection probabilities between vertices in the evolving networks by generalized hierarchical random graph (GHRG) model. To fit the dynamic environment, a dendrogram structure adjusting method in local areas is proposed to reduce the noise scale in the whole period of time. Moreover, to avoid the unhelpful outcome of the connection probabilities, a Bayesian noisy probabilities calculating method is proposed. Through formal privacy analysis, we show that the GHRG-DP algorithm is ε -differentially private. Experiments on real evolving network datasets illustrate that GHRG-DP algorithm can privately release evolving networks with high accuracy.

Download Full-text

Privacy-preserving of patients with Differential Privacy: an experimental evaluation in COVID-19 dataset

Journal of Information and Data Management ◽

10.5753/jidm.2021.1947 ◽

2021 ◽

Vol 12 (5) ◽

Author(s):

Manuel E. B. Filho ◽

Eduardo R. Duarte Neto ◽

Javam C. Machado

Keyword(s):

Health Care Providers ◽

Private Information ◽

Data Privacy ◽

Differential Privacy ◽

Health Data ◽

Care Providers ◽

Privacy Concerns ◽

Effective Strategies ◽

The World ◽

New Challenges

The pandemic of the new coronavirus (COVID-19) has brought new challenges to health systems in almost every corner of the world, many of them overburdened. The data analysis has given support in the fight against the coronavirus. Through this analysis, government authorities, together with health care providers, adopted effective strategies. Yet, those strategies can not be careless of privacy concerns. The individuals’ privacy is a right of each citizen. Privacy techniques guarantee the analysis of health data without exposing individuals’ private information. However, a balance between data privacy and utility is essential for a good analysis of the data. This work will demonstrate that it is possible to guarantee the privacy of infected patients and maintain the utility of the data, allowing a sound analysis on them, from the visualization of the application of differentially private mechanisms on queries in the data of patients tested in the State of Ceará - Brazil.

Download Full-text

A Multi-view Approach to Preserve Privacy and Utility in Network Trace Anonymization

ACM Transactions on Privacy and Security ◽

10.1145/3439732 ◽

2021 ◽

Vol 24 (3) ◽

pp. 1-36

Author(s):

Meisam Mohammady ◽

Momen Oqaily ◽

Lingyu Wang ◽

Yuan Hong ◽

Habib Louafi ◽

...

Keyword(s):

Prior Knowledge ◽

Network Flows ◽

Computational Cost ◽

Information Leakage ◽

Significant Loss ◽

Third Party ◽

Sensitive Information ◽

Privacy Concerns ◽

Data Sanitization ◽

Data Owner

As network security monitoring grows more sophisticated, there is an increasing need for outsourcing such tasks to third-party analysts. However, organizations are usually reluctant to share their network traces due to privacy concerns over sensitive information, e.g., network and system configuration, which may potentially be exploited for attacks. In cases where data owners are convinced to share their network traces, the data are typically subjected to certain anonymization techniques, e.g., CryptoPAn, which replaces real IP addresses with prefix-preserving pseudonyms. However, most such techniques either are vulnerable to adversaries with prior knowledge about some network flows in the traces or require heavy data sanitization or perturbation, which may result in a significant loss of data utility. In this article, we aim to preserve both privacy and utility through shifting the trade-off from between privacy and utility to between privacy and computational cost. The key idea is for the analysts to generate and analyze multiple anonymized views of the original network traces: Those views are designed to be sufficiently indistinguishable even to adversaries armed with prior knowledge, which preserves the privacy, whereas one of the views will yield true analysis results privately retrieved by the data owner, which preserves the utility. We formally analyze the privacy of our solution and experimentally evaluate it using real network traces provided by a major ISP. The experimental results show that our approach can significantly reduce the level of information leakage (e.g., less than 1% of the information leaked by CryptoPAn) with comparable utility.

Download Full-text

Effective Privacy-Preserving Collection of Health Data from a User’s Wearable Device

Applied Sciences ◽

10.3390/app10186396 ◽

2020 ◽

Vol 10 (18) ◽

pp. 6396

Author(s):

Jong Wook Kim ◽

Su-Mee Moon ◽

Sang-ug Kang ◽

Beakcheol Jang

Keyword(s):

Differential Privacy ◽

Service Providers ◽

Healthcare Services ◽

Wearable Devices ◽

Privacy Preserving ◽

Health Data ◽

Experimental Results ◽

Sensitive Information ◽

Privacy Concerns ◽

Primary Means

The popularity of wearable devices equipped with a variety of sensors that can measure users’ health status and monitor their lifestyle has been increasing. In fact, healthcare service providers have been utilizing these devices as a primary means to collect considerable health data from users. Although the health data collected via wearable devices are useful for providing healthcare services, the indiscriminate collection of an individual’s health data raises serious privacy concerns. This is because the health data measured and monitored by wearable devices contain sensitive information related to the wearer’s personal health and lifestyle. Therefore, we propose a method to aggregate health data obtained from users’ wearable devices in a privacy-preserving manner. The proposed method leverages local differential privacy, which is a de facto standard for privacy-preserving data processing and aggregation, to collect sensitive health data. In particular, to mitigate the error incurred by the perturbation mechanism of location differential privacy, the proposed scheme first samples a small number of salient data that best represents the original health data, after which the scheme collects the sampled salient data instead of the entire set of health data. Our experimental results show that the proposed sampling-based collection scheme achieves significant improvement in the estimated accuracy when compared with straightforward solutions. Furthermore, the experimental results verify that an effective tradeoff between the level of privacy protection and the accuracy of aggregate statistics can be achieved with the proposed approach.

Download Full-text

Differential Privacy under Dependent Tuples - The Case of Genomic Privacy

Bioinformatics ◽

10.1093/bioinformatics/btz837 ◽

2019 ◽

Author(s):

Nour Almadhoun ◽

Erman Ayday ◽

Özgür Ulusoy

Keyword(s):

Differential Privacy ◽

Genomic Data ◽

Privacy Preserving ◽

Supplementary Information ◽

Sensitive Information ◽

Genomic Databases ◽

Privacy Concerns ◽

Rigorous Approach ◽

Genomic Studies ◽

Inference Attack

Abstract Motivation The rapid progress in genome sequencing has led to high availability of genomic data. However, due to growing privacy concerns about the participant’s sensitive information, accessing results and data of genomic studies is restricted to only trusted individuals. On the other hand, paving the way to biomedical discoveries requires granting open access to genomic databases. Privacy-preserving mechanisms can be a solution for granting wider access to such data while protecting their owners. In particular, there has been growing interest in applying the concept of differential privacy (DP) while sharing summary statistics about genomic data. DP provides a mathematically rigorous approach but it does not consider the dependence between tuples in a database, which may degrade the privacy guarantees offered by the DP. Results In this work, focusing on genomic databases, we show this drawback of DP and we propose techniques to mitigate it. First, using a real-world genomic dataset, we demonstrate the feasibility of an inference attack on differentially private query results by utilizing the correlations between the tuples in the dataset. The results show that the adversary can infer sensitive genomic data about a user from the differentially private query results by exploiting correlations between genomes of family members. Second, we propose a mechanism for privacy-preserving sharing of statistics from genomic datasets to attain privacy guarantees while taking into consideration the dependence between tuples. By evaluating our mechanism on different genomic datasets, we empirically demonstrate that our proposed mechanism can achieve up to 50% better privacy than traditional DP-based solutions. Availability https://github.com/nourmadhoun/Differential-privacy-genomic-inference-attack. Supplementary information Supplementary data are available at Bioinformatics online.

Download Full-text

Achieving correlated differential privacy of big data publication

Computers & Security ◽

10.1016/j.cose.2018.12.017 ◽

2019 ◽

Vol 82 ◽

pp. 184-195 ◽

Cited By ~ 6

Author(s):

Denglong Lv ◽

Shibing Zhu

Keyword(s):

Big Data ◽

Differential Privacy ◽

Data Publication

Download Full-text

DIFFERENTIAL PRIVACY FOR ENERGY DATA PUBLICATION

10.1049/icp.2021.1688 ◽

2021 ◽

Author(s):

G. Agoua ◽

P. Cauchois ◽

O. Chaouy ◽

I. Gazeau ◽

B. Grossin

Keyword(s):

Differential Privacy ◽

Energy Data ◽

Data Publication

Download Full-text

Manipulation Attacks in Local Differential Privacy

Journal of Privacy and Confidentiality ◽

10.29012/jpc.754 ◽

2021 ◽

Vol 11 (1) ◽

Author(s):

Albert Cheu ◽

Adam Smith ◽

Jonathan Ullman

Keyword(s):

Systematic Study ◽

Differential Privacy ◽

Domain Size ◽

User Data ◽

Large Systems ◽

Cryptographic Techniques ◽

Fundamental Limitation ◽

Theoretical Results ◽

Different Levels ◽

Privacy Level

Local differential privacy is a widely studied restriction on distributed algorithms that collect aggregates about sensitive user data, and is now deployed in several large systems. We initiate a systematic study of a fundamental limitation of locally differentially private protocols: they are highly vulnerable to adversarial manipulation. While any algorithm can be manipulated by adversaries who lie about their inputs, we show that any noninteractive locally differentially private protocol can be manipulated to a much greater extent---when the privacy level is high, or the domain size is large, a small fraction of users in the protocol can completely obscure the distribution of the honest users' input. We also construct protocols that are optimally robust to manipulation for a variety of common tasks in local differential privacy. Finally, we give simple experiments validating our theoretical results, and demonstrating that protocols that are optimal without manipulation can have dramatically different levels of robustness to manipulation. Our results suggest caution when deploying local differential privacy and reinforce the importance of efficient cryptographic techniques for the distributed emulation of centrally differentially private mechanisms.

Download Full-text