Study on the Technical Evaluation of De-Identification Procedures for Personal Data in the Automotive Sector

2021 ◽  
Author(s):  
Kai Rannenberg ◽  
Sebastian Pape ◽  
Frédéric Tronnier ◽  
Sascha Löbner

The aim of this study was to identify and evaluate different de-identification techniques that may be used in several mobility-related use cases. To do so, four use cases were defined in cooperation with a project partner focused on the legal aspects of this project, as well as with the VDA/FAT working group. Each use case raises different legal and technical issues with regard to the data and information that are gathered, used and transferred in the specific scenario. The use cases therefore differ in the type and frequency of the data gathered, as well as in the level of privacy and the speed of computation required. After the use cases were identified, a systematic literature review was performed to identify suitable de-identification techniques for providing data privacy. Additionally, external databases were considered, as data that is expected to be anonymous might be re-identified by combining existing data with such external data. For each use case, requirements and possible attack scenarios were created to illustrate where exactly privacy-related issues could occur and how such issues could impact data subjects, data processors or data controllers. Suitable de-identification techniques should be able to withstand these attack scenarios. Based on a series of additional criteria, de-identification techniques are then analyzed for each use case. Possible solutions are discussed individually in chapters 6.1 - 6.2. It is evident that no one-size-fits-all approach to protecting privacy in the mobility domain exists. While all techniques analyzed in detail in this report, e.g., homomorphic encryption, differential privacy, secure multiparty computation and federated learning, are able to successfully protect user privacy in certain instances, their overall effectiveness differs depending on the specifics of each use case.
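
As a concrete illustration of one of the techniques surveyed above, the following minimal Python sketch (not taken from the study) applies the Laplace mechanism of differential privacy to a simple mobility count query; the query, sensitivity, and epsilon values are illustrative assumptions.

```python
import numpy as np

def laplace_count(true_count, epsilon, sensitivity=1.0, rng=None):
    """Release a count with epsilon-differential privacy via the Laplace mechanism.

    Adding or removing one vehicle trace changes the count by at most `sensitivity`,
    so noise drawn from Laplace(0, sensitivity / epsilon) suffices.
    """
    rng = rng or np.random.default_rng()
    return true_count + rng.laplace(loc=0.0, scale=sensitivity / epsilon)

# Hypothetical query: how many vehicles passed a given road segment today?
rng = np.random.default_rng(42)
exact = 1284                      # value computed on the raw mobility data
for eps in (0.1, 1.0, 10.0):      # smaller epsilon = stronger privacy, more noise
    print(eps, round(laplace_count(exact, eps, rng=rng), 1))
```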

2019 ◽  
Vol 6 (1) ◽  
pp. 205395171984878
Author(s):  
Luke Munn ◽  
Tsvetelina Hristova ◽  
Liam Magee

Personal data is highly vulnerable to security exploits, spurring moves to lock it down through encryption, to cryptographically ‘cloud’ it. But personal data is also highly valuable to corporations and states, triggering moves to unlock its insights by relocating it in the cloud. We characterise this twinned condition as ‘clouded data’. Clouded data constructs a political and technological notion of privacy that operates through the intersection of corporate power, computational resources and the ability to obfuscate, gain insights from and valorise a dependency between public and private. First, we survey prominent clouded data approaches (blockchain, multiparty computation, differential privacy, and homomorphic encryption), suggesting their particular affordances produce distinctive versions of privacy. Next, we perform two notional code-based experiments using synthetic datasets. In the field of health, we submit a patient’s blood pressure to a notional cloud-based diagnostics service; in education, we construct a student survey that enables aggregate reporting without individual identification. We argue that these technical affordances legitimate new political claims to capture and commodify personal data. The final section broadens the discussion to consider the political force of clouded data and its reconstitution of traditional notions such as the public and the private.
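
The education experiment described above releases only aggregate survey results. A minimal sketch of that idea (our own illustration with a synthetic dataset, not the authors' code) suppresses small groups and perturbs the released counts; the group-size threshold and the noise scale are assumed values.

```python
import numpy as np
from collections import Counter

rng = np.random.default_rng(0)

# Synthetic student survey: each row records the respondent's programme.
programmes = rng.choice(["arts", "science", "law"], size=200)

def aggregate_report(programmes, min_group=5, epsilon=1.0):
    """Report per-programme response counts without identifying individuals:
    groups smaller than `min_group` are suppressed, and released counts are
    perturbed with Laplace noise (sensitivity 1 per student)."""
    report = {}
    for prog, n in Counter(programmes).items():
        if n < min_group:
            report[prog] = "suppressed"
        else:
            report[prog] = max(0, round(n + rng.laplace(0, 1.0 / epsilon)))
    return report

print(aggregate_report(programmes))
```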


2018 ◽  
Vol 2018 ◽  
pp. 1-12 ◽  
Author(s):  
Jinbo Xiong ◽  
Rong Ma ◽  
Lei Chen ◽  
Youliang Tian ◽  
Li Lin ◽  
...  

Mobile crowdsensing, as a novel service schema of the Internet of Things (IoT), provides an innovative way to implement ubiquitous social sensing. Establishing an effective mechanism that improves the participation of sensing users and the authenticity of sensing data, protects users' data privacy, and prevents malicious users from providing false data is among the urgent problems in mobile crowdsensing services in IoT. These issues pose a major challenge that hinders the further development of mobile crowdsensing. In order to tackle them, in this paper we propose a reliable hybrid incentive mechanism for enhancing crowdsensing participation by encouraging and stimulating sensing users with both reputation and service returns in mobile crowdsensing tasks. Moreover, we propose a privacy-preserving data aggregation scheme in which the mediator and/or sensing users may not be fully trusted. In this scheme, a differential privacy mechanism is applied by having sensing users add noise to their data; homomorphic encryption is then employed to protect the sensing data, and the resulting ciphertexts are uploaded to the mediator, who can aggregate the ciphertexts of the sensing data without decrypting them. Even if part of the sensing data is leaked, the differential privacy mechanism still protects the sensing users' privacy. Finally, we introduce a novel secure multiparty auction mechanism based on auction game theory and secure multiparty computation, which effectively resolves the prisoner's dilemma arising in the sensing-data transaction between the service provider and the mediator. Security analysis and performance evaluation demonstrate that the proposed scheme is secure and efficient.
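
The aggregation step described above (users perturb readings, encrypt them additively, and the mediator combines ciphertexts without decrypting) can be illustrated with a toy additively homomorphic scheme. The sketch below uses a deliberately tiny, insecure Paillier implementation of our own; it is not the paper's construction, and the noise range and key sizes are assumptions.

```python
import random
from math import gcd

# --- toy Paillier (insecure parameter sizes, illustration only) ---
def keygen(p=1_000_003, q=1_000_033):
    n = p * q
    lam = (p - 1) * (q - 1) // gcd(p - 1, q - 1)    # lcm(p-1, q-1)
    mu = pow(lam, -1, n)                            # valid because g = n + 1
    return (n, n + 1), (lam, mu)

def encrypt(pk, m):
    n, g = pk
    r = random.randrange(2, n)
    while gcd(r, n) != 1:
        r = random.randrange(2, n)
    return (pow(g, m % n, n * n) * pow(r, n, n * n)) % (n * n)

def decrypt(pk, sk, c):
    n, _ = pk
    lam, mu = sk
    m = ((pow(c, lam, n * n) - 1) // n * mu) % n
    return m if m <= n // 2 else m - n              # map back to signed values

# --- crowdsensing aggregation ---
pk, sk = keygen()
readings = [21, 23, 19, 22]                         # e.g. local temperature samples
# each user adds small integer noise (stand-in for a differentially private mechanism)
noisy = [r + random.randint(-2, 2) for r in readings]
ciphertexts = [encrypt(pk, m) for m in noisy]

# the mediator multiplies ciphertexts: the product decrypts to the SUM of plaintexts
n = pk[0]
agg = 1
for c in ciphertexts:
    agg = (agg * c) % (n * n)
print("aggregated noisy sum:", decrypt(pk, sk, agg))
```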


2021 ◽  
Author(s):  
Ali Hatamizadeh ◽  
Hongxu Yin ◽  
Pavlo Molchanov ◽  
Andriy Myronenko ◽  
Wenqi Li ◽  
...  

Federated learning (FL) allows the collaborative training of AI models without the need to share raw data. This capability makes it especially interesting for healthcare applications, where patient and data privacy are of utmost concern. However, recent works on the inversion of deep neural networks from model gradients have raised concerns about the security of FL in preventing the leakage of training data. In this work, we show that the attacks presented in the literature are impractical in real FL use cases and provide a new baseline attack that works for more realistic scenarios in which the clients' training involves updating the Batch Normalization (BN) statistics. Furthermore, we present new ways to measure and visualize potential data leakage in FL. Our work is a step towards establishing reproducible methods of measuring data leakage in FL and could help determine the optimal tradeoffs between privacy-preserving techniques, such as differential privacy, and model accuracy based on quantifiable metrics.
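
For context, the following PyTorch sketch reproduces the basic gradient-matching inversion idea that this line of work builds on (it is not the paper's new BN-aware baseline attack); the tiny model and single-example batch are our own simplifying assumptions.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
model = nn.Sequential(nn.Flatten(), nn.Linear(28 * 28, 10))   # tiny stand-in model
loss_fn = nn.CrossEntropyLoss()

# --- the "client": computes gradients on one private example and shares them ---
x_true = torch.rand(1, 1, 28, 28)
y_true = torch.tensor([3])
shared_grads = torch.autograd.grad(loss_fn(model(x_true), y_true), model.parameters())
shared_grads = [g.detach() for g in shared_grads]

# --- the attacker: optimizes a dummy input/label to match the shared gradients ---
x_dummy = torch.rand_like(x_true, requires_grad=True)
y_dummy = torch.randn(1, 10, requires_grad=True)               # soft (learnable) label
opt = torch.optim.LBFGS([x_dummy, y_dummy], lr=0.1)

def closure():
    opt.zero_grad()
    log_probs = torch.log_softmax(model(x_dummy), dim=-1)
    dummy_loss = -(torch.softmax(y_dummy, dim=-1) * log_probs).sum()
    dummy_grads = torch.autograd.grad(dummy_loss, model.parameters(), create_graph=True)
    diff = sum(((dg - sg) ** 2).sum() for dg, sg in zip(dummy_grads, shared_grads))
    diff.backward()
    return diff

for _ in range(30):
    opt.step(closure)
print("reconstruction error:", (x_dummy - x_true).abs().mean().item())
```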


2018 ◽  
Vol 0 (7/2018) ◽  
pp. 11-18
Author(s):  
Aleksandra Horubała ◽  
Daniel Waszkiewicz ◽  
Michał Andrzejczak ◽  
Piotr Sapiecha

Cloud services are gaining interest and are a very attractive option for public administration. However, there are serious concerns about the security and privacy of storing personal data in the cloud. In this work, mathematical tools for securing data and hiding computations are presented. Data privacy is obtained by using homomorphic encryption schemes. Computation hiding is achieved through cryptographic program obfuscation. Both primitives are presented and their application to public administration is discussed.
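
A minimal sketch of the encrypted-data idea in a public-administration setting, assuming the open-source python-paillier (phe) package; the scenario (computing a statistic over citizens' encrypted income figures) is purely illustrative, and the cryptographic obfuscation of the algorithm itself is not shown.

```python
from phe import paillier

# Administration (data owner) generates keys and encrypts citizens' income figures.
public_key, private_key = paillier.generate_paillier_keypair()
incomes = [28_400, 41_250, 33_700, 52_100]
encrypted = [public_key.encrypt(v) for v in incomes]

# Untrusted cloud service: computes on ciphertexts only (Paillier supports
# additions of ciphertexts and multiplication by public constants).
enc_total = encrypted[0]
for c in encrypted[1:]:
    enc_total = enc_total + c              # homomorphic addition
enc_mean = enc_total * (1 / len(incomes))  # scaling by a public constant

# Only the key holder can decrypt the aggregate results.
print("total:", private_key.decrypt(enc_total))
print("mean :", private_key.decrypt(enc_mean))
```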


2019 ◽  
Author(s):  
David Hawig ◽  
Chao Zhou ◽  
Sebastian Fuhrhop ◽  
Andre S Fialho ◽  
Navin Ramachandran

BACKGROUND Distributed ledger technology (DLT) holds great potential to improve health information exchange. However, the immutable and transparent character of this technology may conflict with data privacy regulations and data processing best practices. OBJECTIVE The aim of this paper is to develop a proof-of-concept system for immutable, interoperable, and General Data Protection Regulation (GDPR)–compliant exchange of blood glucose data. METHODS Given that there is no ideal design for a DLT-based patient-provider data exchange solution, we proposed two different variations for our proof-of-concept system. One design was based purely on the public IOTA distributed ledger (a directed acyclic graph-based DLT) and the second used the same public IOTA ledger in combination with a private InterPlanetary File System (IPFS) cluster. Both designs were assessed according to (1) data reversal risk, (2) data linkability risks, (3) processing time, (4) file size compatibility, and (5) overall system complexity. RESULTS The public IOTA design slightly increased the risk of personal data linkability, had an overall low processing time (requiring mean 6.1, SD 1.9 seconds to upload one blood glucose data sample into the DLT), and was relatively simple to implement. The combination of the public IOTA with a private IPFS cluster minimized both reversal and linkability risks, allowed for the exchange of large files (3 months of blood glucose data were uploaded into the DLT in mean 38.1, SD 13.4 seconds), but involved a relatively higher setup complexity. CONCLUSIONS For the specific use case of blood glucose explored in this study, both designs presented a suitable performance in enabling the interoperable exchange of data between patients and providers. Additionally, both systems were designed considering the latest guidelines on personal data processing, thereby maximizing the alignment with recent GDPR requirements. For future works, these results suggest that the conflict between DLT and data privacy regulations can be addressed if careful considerations are made regarding the use case and the design of the data exchange system.
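
To make the data flow of the second design concrete, here is a schematic Python sketch of the general pattern (encrypt the payload, store the ciphertext off-chain, anchor only a content hash on the ledger). It is our own stand-in: a dict plays the role of the private IPFS cluster, a list plays the role of the public IOTA Tangle, and the symmetric Fernet cipher is an assumption, not the authors' cryptography.

```python
import hashlib
import json
from cryptography.fernet import Fernet

off_chain_store = {}    # stand-in for the private IPFS cluster
ledger = []             # stand-in for the public IOTA Tangle (append-only)

def share_glucose_data(samples, patient_key):
    """Encrypt a batch of blood glucose samples, store the ciphertext off-chain,
    and anchor only its content hash on the ledger."""
    ciphertext = Fernet(patient_key).encrypt(json.dumps(samples).encode())
    content_id = hashlib.sha256(ciphertext).hexdigest()
    off_chain_store[content_id] = ciphertext
    ledger.append({"cid": content_id})          # no personal data on the ledger
    return content_id

def read_glucose_data(content_id, patient_key):
    ciphertext = off_chain_store[content_id]
    assert hashlib.sha256(ciphertext).hexdigest() == content_id  # integrity check
    return json.loads(Fernet(patient_key).decrypt(ciphertext))

key = Fernet.generate_key()                      # shared out-of-band with the provider
cid = share_glucose_data([{"t": "2019-05-01T08:00", "mmol_l": 5.4}], key)
print(read_glucose_data(cid, key))
```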


2020 ◽  
Vol 117 (15) ◽  
pp. 8344-8352 ◽  
Author(s):  
Aloni Cohen ◽  
Kobbi Nissim

There is a significant conceptual gap between legal and mathematical thinking around data privacy. The effect is uncertainty as to which technical offerings meet legal standards. This uncertainty is exacerbated by a litany of successful privacy attacks demonstrating that traditional statistical disclosure limitation techniques often fall short of the privacy envisioned by regulators. We define “predicate singling out,” a type of privacy attack intended to capture the concept of singling out appearing in the General Data Protection Regulation (GDPR). An adversary predicate singles out a dataset x using the output of a data-release mechanism M(x) if it finds a predicate p matching exactly one row in x with probability much better than a statistical baseline. A data-release mechanism that precludes such attacks is “secure against predicate singling out” (PSO secure). We argue that PSO security is a mathematical concept with legal consequences. Any data-release mechanism that purports to “render anonymous” personal data under the GDPR must prevent singling out and, hence, must be PSO secure. We analyze the properties of PSO security, showing that it fails to compose. Namely, a combination of more than logarithmically many exact counts, each individually PSO secure, facilitates predicate singling out. Finally, we ask whether differential privacy and k-anonymity are PSO secure. Leveraging a connection to statistical generalization, we show that differential privacy implies PSO security. However, and in contrast with current legal guidance, k-anonymity does not: There exists a simple predicate singling out attack under mild assumptions on the k-anonymizer and the data distribution.
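
As a small illustration of the definition (our own sketch, not the authors' full formalization), the check below tests whether a predicate isolates exactly one row of a dataset and states the statistical baseline for i.i.d. rows; the dataset and predicate are assumed examples.

```python
def isolates(predicate, rows):
    """A predicate 'singles out' the dataset if it matches exactly one row."""
    return sum(1 for r in rows if predicate(r)) == 1

def baseline_isolation_prob(n, w):
    """Probability that a fixed predicate with per-row match probability w
    isolates exactly one of n i.i.d. rows; maximized near w = 1/n at roughly 1/e."""
    return n * w * (1 - w) ** (n - 1)

rows = [
    {"age": 34, "zip": "10115", "condition": "A"},
    {"age": 34, "zip": "10117", "condition": "B"},
    {"age": 52, "zip": "10115", "condition": "A"},
]
p = lambda r: r["age"] == 34 and r["zip"] == "10115"
print("singles out:", isolates(p, rows))                 # True: exactly one match
print("baseline at w=1/n:", baseline_isolation_prob(len(rows), 1 / len(rows)))
```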


Author(s):  
Alexander Burnap ◽  
Panos Y. Papalambros

Design preference models are used widely in product planning and design development. Their prediction accuracy requires large amounts of personal user data including purchase and other personal choice records. With increased Internet and smart device use, sources of personal data are becoming more varied and their capture more ubiquitous. This situation leads to questioning whether there is a tradeoff between improving products and compromising individual user privacy. To advance this conversation, we analyze how privacy safeguards may affect design preference modeling. We conduct an experiment using real user data to study the performance of design preference models under different levels of privacy. Results indicate there is a tradeoff between accuracy and privacy. However, with enough data, models with privacy safeguards can still be sufficiently accurate to answer population-level design questions.
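
One way to make the accuracy-privacy tradeoff tangible (a generic sketch with synthetic data, not the authors' experiment or their privacy mechanism) is to train a simple choice model on labels protected by randomized response at several privacy levels and compare held-out accuracy.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
# Synthetic "design preference" data: features = product attributes, y = chosen or not.
X = rng.normal(size=(5000, 8))
y = (X @ rng.normal(size=8) + 0.5 * rng.normal(size=5000) > 0).astype(int)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)

def randomized_response(labels, epsilon):
    """Keep each binary choice label with prob e^eps / (1 + e^eps), flip otherwise
    (a local differential privacy mechanism applied before the data leave the user)."""
    p_keep = np.exp(epsilon) / (1.0 + np.exp(epsilon))
    flip = rng.random(len(labels)) > p_keep
    return np.where(flip, 1 - labels, labels)

for eps in (0.1, 0.5, 1.0, 3.0, np.inf):
    y_priv = y_tr if np.isinf(eps) else randomized_response(y_tr, eps)
    acc = LogisticRegression(max_iter=1000).fit(X_tr, y_priv).score(X_te, y_te)
    print(f"epsilon={eps}: held-out accuracy {acc:.3f}")
```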


2020 ◽  
Vol 17 (3) ◽  
pp. 819-834
Author(s):  
Wei Ou ◽  
Jianhuan Zeng ◽  
Zijun Guo ◽  
Wanqin Yan ◽  
Dingwan Liu ◽  
...  

With continuous improvements in computing power, great progress in algorithms and massive growth of data, artificial intelligence technologies have entered a third era of rapid development. However, with these improvements and the arrival of the era of big data, contradictions between data sharing and user data privacy have become increasingly prominent. Federated learning is a technology that can ensure user privacy while training a better model across different data providers. In this paper, we design a vertical federated learning system for Bayesian machine learning with homomorphic encryption. During training, raw data remain local and only encrypted model information is exchanged. The model trained by this system is comparable (up to 90%) to models trained on a single server holding the combined data, while taking privacy into consideration. This system can be widely used in risk control, medical, financial, education and other fields. It is of great significance for solving the data-islands problem and protecting users' privacy.
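
A highly simplified sketch of one round of label-protected gradient exchange in vertical federated learning, assuming the python-paillier (phe) package; this is a generic logistic-regression illustration, not the paper's Bayesian system, and the data split, learning rate, and party roles are assumptions.

```python
import numpy as np
from phe import paillier

rng = np.random.default_rng(0)
n = 40
# Vertically partitioned data: both parties hold the same 40 records;
# party A has 2 feature columns plus the labels, party B has 3 other columns.
x_A, x_B = rng.normal(size=(n, 2)), rng.normal(size=(n, 3))
y = (x_A[:, 0] + x_B[:, 1] > 0).astype(float)
w_A, w_B = np.zeros(2), np.zeros(3)

pub, priv = paillier.generate_paillier_keypair(n_length=1024)  # small key, demo only

# --- one training round ---
# Party B contributes its partial scores (in a fuller protocol these would also be
# masked); party A, who holds the labels, computes the residuals.
z = x_A @ w_A + x_B @ w_B
residual = 1.0 / (1.0 + np.exp(-z)) - y

# A encrypts the residuals so B can compute its gradient without seeing the labels.
enc_residual = [pub.encrypt(float(r)) for r in residual]
enc_grad_B = []
for j in range(x_B.shape[1]):
    acc = enc_residual[0] * float(x_B[0, j])
    for i in range(1, n):
        acc = acc + enc_residual[i] * float(x_B[i, j])   # ciphertext-plaintext ops
    enc_grad_B.append(acc)

# B returns the encrypted gradient (in practice it would add a random mask first);
# A, as key holder, decrypts, and both parties update their local weights.
grad_B = np.array([priv.decrypt(g) for g in enc_grad_B]) / n
grad_A = x_A.T @ residual / n
w_A -= 0.5 * grad_A
w_B -= 0.5 * grad_B
print("updated w_B:", w_B)
```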


Sensors ◽  
2021 ◽  
Vol 21 (23) ◽  
pp. 7806
Author(s):  
Jinmyeong Shin ◽  
Seok-Hwan Choi ◽  
Yoon-Ho Choi

As the amount of data collected and analyzed by machine learning technology increases, data that can identify individuals is also being collected in large quantities. In particular, as deep learning technology, which requires a large amount of analysis data, is adopted in various service fields, the possibility of exposing users' sensitive information increases, and the user privacy problem is growing more than ever. As a solution to this data privacy problem, homomorphic encryption technology, an encryption technology that supports arithmetic operations on encrypted data, has been applied to various fields including finance and health care in recent years. Is it then possible to use deep learning services while preserving users' data privacy by operating on homomorphically encrypted data? In this paper, we are the first to propose three attack methods that infringe users' data privacy by exploiting possible security vulnerabilities in the use of homomorphic encryption-based deep learning services. To specify and verify the feasibility of exploiting these vulnerabilities, we propose three attacks: (1) an adversarial attack exploiting the communication link between the client and the trusted party; (2) a reconstruction attack using paired input and output data; and (3) a membership inference attack by a malicious insider. In addition, we describe real-world exploit scenarios for financial and medical services. The experimental evaluation results show that the adversarial example and reconstruction attacks are a practical threat to homomorphic encryption-based deep learning models: the adversarial attack decreased the average classification accuracy from 0.927 to 0.043, and the reconstruction attack achieved an average reclassification accuracy of 0.888.
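
For reference, the first attack family mentioned above builds on standard adversarial examples. A generic FGSM sketch in PyTorch (our own illustration of that building block, not the paper's exact attack on the homomorphic encryption pipeline) is shown below.

```python
import torch
import torch.nn as nn

def fgsm_example(model, x, y, epsilon, loss_fn=nn.CrossEntropyLoss()):
    """Fast Gradient Sign Method: perturb x in the direction that increases the loss.
    In the scenario above, the perturbed input would be submitted to the encrypted
    inference service in place of the client's genuine input."""
    x_adv = x.clone().detach().requires_grad_(True)
    loss = loss_fn(model(x_adv), y)
    loss.backward()
    return (x_adv + epsilon * x_adv.grad.sign()).clamp(0.0, 1.0).detach()

# Usage with a stand-in model and a random "image".
model = nn.Sequential(nn.Flatten(), nn.Linear(28 * 28, 10)).eval()
x = torch.rand(1, 1, 28, 28)
y = torch.tensor([7])
x_adv = fgsm_example(model, x, y, epsilon=0.1)
print("max perturbation:", (x_adv - x).abs().max().item())
```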

