Study on the Technical Evaluation of De-Identification Procedures for Personal Data in the Automotive Sector

2021 ◽  
Author(s):  
Kai Rannenberg ◽  
Sebastian Pape ◽  
Frédéric Tronnier ◽  
Sascha Löbner

The aim of this study was to identify and evaluate different de-identification techniques that may be used in several mobility-related use cases. To do so, four use cases were defined in cooperation with a project partner focused on the legal aspects of this project, as well as with the VDA/FAT working group. Each use case raises different legal and technical issues with regard to the data and information that are gathered, used and transferred in the specific scenario. The use cases therefore differ in the type and frequency of the data gathered, as well as in the level of privacy and the speed of computation required. After the use cases were identified, a systematic literature review was performed to identify suitable de-identification techniques for providing data privacy. Additionally, external databases were considered, as data that is expected to be anonymous might be re-identified by combining existing data with such external data. For each use case, requirements and possible attack scenarios were created to illustrate where exactly privacy-related issues could occur and how such issues could impact data subjects, data processors or data controllers. Suitable de-identification techniques should be able to withstand these attack scenarios. Based on a series of additional criteria, de-identification techniques are then analyzed for each use case. Possible solutions are discussed individually in chapters 6.1 - 6.2. It is evident that no one-size-fits-all approach to protecting privacy in the mobility domain exists. While all techniques analyzed in detail in this report, e.g., homomorphic encryption, differential privacy, secure multiparty computation and federated learning, are able to successfully protect user privacy in certain instances, their overall effectiveness differs depending on the specifics of each use case.
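
As a concrete illustration of one of the techniques surveyed above, the following minimal Python sketch (not taken from the study) applies the Laplace mechanism of differential privacy to a simple mobility count query; the query, sensitivity, and epsilon values are illustrative assumptions.

```python
import numpy as np

def laplace_count(true_count, epsilon, sensitivity=1.0, rng=None):
    """Release a count with epsilon-differential privacy via the Laplace mechanism.

    Adding or removing one vehicle trace changes the count by at most `sensitivity`,
    so noise drawn from Laplace(0, sensitivity / epsilon) suffices.
    """
    rng = rng or np.random.default_rng()
    return true_count + rng.laplace(loc=0.0, scale=sensitivity / epsilon)

# Hypothetical query: how many vehicles passed a given road segment today?
rng = np.random.default_rng(42)
exact = 1284                      # value computed on the raw mobility data
for eps in (0.1, 1.0, 10.0):      # smaller epsilon = stronger privacy, more noise
    print(eps, round(laplace_count(exact, eps, rng=rng), 1))
```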

2019 ◽  
Vol 6 (1) ◽  
pp. 205395171984878
Author(s):  
Luke Munn ◽  
Tsvetelina Hristova ◽  
Liam Magee

Personal data is highly vulnerable to security exploits, spurring moves to lock it down through encryption, to cryptographically ‘cloud’ it. But personal data is also highly valuable to corporations and states, triggering moves to unlock its insights by relocating it in the cloud. We characterise this twinned condition as ‘clouded data’. Clouded data constructs a political and technological notion of privacy that operates through the intersection of corporate power, computational resources and the ability to obfuscate, gain insights from and valorise a dependency between public and private. First, we survey prominent clouded data approaches (blockchain, multiparty computation, differential privacy, and homomorphic encryption), suggesting their particular affordances produce distinctive versions of privacy. Next, we perform two notional code-based experiments using synthetic datasets. In the field of health, we submit a patient’s blood pressure to a notional cloud-based diagnostics service; in education, we construct a student survey that enables aggregate reporting without individual identification. We argue that these technical affordances legitimate new political claims to capture and commodify personal data. The final section broadens the discussion to consider the political force of clouded data and its reconstitution of traditional notions such as the public and the private.
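
The education experiment described above releases only aggregate survey results. A minimal sketch of that idea (our own illustration with a synthetic dataset, not the authors' code) suppresses small groups and perturbs the released counts; the group-size threshold and the noise scale are assumed values.

```python
import numpy as np
from collections import Counter

rng = np.random.default_rng(0)

# Synthetic student survey: each row records the respondent's programme.
programmes = rng.choice(["arts", "science", "law"], size=200)

def aggregate_report(programmes, min_group=5, epsilon=1.0):
    """Report per-programme response counts without identifying individuals:
    groups smaller than `min_group` are suppressed, and released counts are
    perturbed with Laplace noise (sensitivity 1 per student)."""
    report = {}
    for prog, n in Counter(programmes).items():
        if n < min_group:
            report[prog] = "suppressed"
        else:
            report[prog] = max(0, round(n + rng.laplace(0, 1.0 / epsilon)))
    return report

print(aggregate_report(programmes))
```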


2018 ◽  
Vol 2018 ◽  
pp. 1-12 ◽  
Author(s):  
Jinbo Xiong ◽  
Rong Ma ◽  
Lei Chen ◽  
Youliang Tian ◽  
Li Lin ◽  
...  

Mobile crowdsensing, as a novel service schema of the Internet of Things (IoT), provides an innovative way to implement ubiquitous social sensing. Establishing an effective mechanism that improves the participation of sensing users and the authenticity of sensing data, protects users' data privacy, and prevents malicious users from providing false data is among the urgent problems in mobile crowdsensing services in IoT. These issues pose a major challenge that hinders the further development of mobile crowdsensing. In order to tackle them, in this paper we propose a reliable hybrid incentive mechanism for enhancing crowdsensing participation by encouraging and stimulating sensing users with both reputation and service returns in mobile crowdsensing tasks. Moreover, we propose a privacy-preserving data aggregation scheme in which the mediator and/or sensing users may not be fully trusted. In this scheme, a differential privacy mechanism is applied by having sensing users add noise to their data; homomorphic encryption is then employed to protect the sensing data, and the resulting ciphertexts are uploaded to the mediator, who can aggregate the ciphertexts of the sensing data without decrypting them. Even if part of the sensing data is leaked, the differential privacy mechanism still protects the sensing users' privacy. Finally, we introduce a novel secure multiparty auction mechanism based on auction game theory and secure multiparty computation, which effectively resolves the prisoner's dilemma arising in the sensing-data transaction between the service provider and the mediator. Security analysis and performance evaluation demonstrate that the proposed scheme is secure and efficient.
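
The aggregation step described above (users perturb readings, encrypt them additively, and the mediator combines ciphertexts without decrypting) can be illustrated with a toy additively homomorphic scheme. The sketch below uses a deliberately tiny, insecure Paillier implementation of our own; it is not the paper's construction, and the noise range and key sizes are assumptions.

```python
import random
from math import gcd

# --- toy Paillier (insecure parameter sizes, illustration only) ---
def keygen(p=1_000_003, q=1_000_033):
    n = p * q
    lam = (p - 1) * (q - 1) // gcd(p - 1, q - 1)    # lcm(p-1, q-1)
    mu = pow(lam, -1, n)                            # valid because g = n + 1
    return (n, n + 1), (lam, mu)

def encrypt(pk, m):
    n, g = pk
    r = random.randrange(2, n)
    while gcd(r, n) != 1:
        r = random.randrange(2, n)
    return (pow(g, m % n, n * n) * pow(r, n, n * n)) % (n * n)

def decrypt(pk, sk, c):
    n, _ = pk
    lam, mu = sk
    m = ((pow(c, lam, n * n) - 1) // n * mu) % n
    return m if m <= n // 2 else m - n              # map back to signed values

# --- crowdsensing aggregation ---
pk, sk = keygen()
readings = [21, 23, 19, 22]                         # e.g. local temperature samples
# each user adds small integer noise (stand-in for a differentially private mechanism)
noisy = [r + random.randint(-2, 2) for r in readings]
ciphertexts = [encrypt(pk, m) for m in noisy]

# the mediator multiplies ciphertexts: the product decrypts to the SUM of plaintexts
n = pk[0]
agg = 1
for c in ciphertexts:
    agg = (agg * c) % (n * n)
print("aggregated noisy sum:", decrypt(pk, sk, agg))
```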


2021 ◽  
Author(s):  
Ali Hatamizadeh ◽  
Hongxu Yin ◽  
Pavlo Molchanov ◽  
Andriy Myronenko ◽  
Wenqi Li ◽  
...  

Federated learning (FL) allows the collaborative training of AI models without the need to share raw data. This capability makes it especially interesting for healthcare applications, where patient and data privacy are of utmost concern. However, recent works on the inversion of deep neural networks from model gradients have raised concerns about the security of FL in preventing the leakage of training data. In this work, we show that the attacks presented in the literature are impractical in real FL use cases and provide a new baseline attack that works for more realistic scenarios in which the clients' training involves updating the Batch Normalization (BN) statistics. Furthermore, we present new ways to measure and visualize potential data leakage in FL. Our work is a step towards establishing reproducible methods of measuring data leakage in FL and could help determine the optimal tradeoffs between privacy-preserving techniques, such as differential privacy, and model accuracy based on quantifiable metrics.
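
For context, the following PyTorch sketch reproduces the basic gradient-matching inversion idea that this line of work builds on (it is not the paper's new BN-aware baseline attack); the tiny model and single-example batch are our own simplifying assumptions.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
model = nn.Sequential(nn.Flatten(), nn.Linear(28 * 28, 10))   # tiny stand-in model
loss_fn = nn.CrossEntropyLoss()

# --- the "client": computes gradients on one private example and shares them ---
x_true = torch.rand(1, 1, 28, 28)
y_true = torch.tensor([3])
shared_grads = torch.autograd.grad(loss_fn(model(x_true), y_true), model.parameters())
shared_grads = [g.detach() for g in shared_grads]

# --- the attacker: optimizes a dummy input/label to match the shared gradients ---
x_dummy = torch.rand_like(x_true, requires_grad=True)
y_dummy = torch.randn(1, 10, requires_grad=True)               # soft (learnable) label
opt = torch.optim.LBFGS([x_dummy, y_dummy], lr=0.1)

def closure():
    opt.zero_grad()
    log_probs = torch.log_softmax(model(x_dummy), dim=-1)
    dummy_loss = -(torch.softmax(y_dummy, dim=-1) * log_probs).sum()
    dummy_grads = torch.autograd.grad(dummy_loss, model.parameters(), create_graph=True)
    diff = sum(((dg - sg) ** 2).sum() for dg, sg in zip(dummy_grads, shared_grads))
    diff.backward()
    return diff

for _ in range(30):
    opt.step(closure)
print("reconstruction error:", (x_dummy - x_true).abs().mean().item())
```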


2018 ◽  
Vol 0 (7/2018) ◽  
pp. 11-18
Author(s):  
Aleksandra Horubała ◽  
Daniel Waszkiewicz ◽  
Michał Andrzejczak ◽  
Piotr Sapiecha

Cloud services are gaining interest and are a very attractive option for public administration. However, there are serious concerns about the security and privacy of storing personal data in the cloud. In this work, mathematical tools for securing data and hiding computations are presented. Data privacy is obtained by using homomorphic encryption schemes. Computation hiding is achieved through cryptographic program obfuscation. Both primitives are presented and their application to public administration is discussed.
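
A minimal sketch of the encrypted-data idea in a public-administration setting, assuming the open-source python-paillier (phe) package; the scenario (computing a statistic over citizens' encrypted income figures) is purely illustrative, and the cryptographic obfuscation of the algorithm itself is not shown.

```python
from phe import paillier

# Administration (data owner) generates keys and encrypts citizens' income figures.
public_key, private_key = paillier.generate_paillier_keypair()
incomes = [28_400, 41_250, 33_700, 52_100]
encrypted = [public_key.encrypt(v) for v in incomes]

# Untrusted cloud service: computes on ciphertexts only (Paillier supports
# additions of ciphertexts and multiplication by public constants).
enc_total = encrypted[0]
for c in encrypted[1:]:
    enc_total = enc_total + c              # homomorphic addition
enc_mean = enc_total * (1 / len(incomes))  # scaling by a public constant

# Only the key holder can decrypt the aggregate results.
print("total:", private_key.decrypt(enc_total))
print("mean :", private_key.decrypt(enc_mean))
```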


2019 ◽  
Author(s):  
David Hawig ◽  
Chao Zhou ◽  
Sebastian Fuhrhop ◽  
Andre S Fialho ◽  
Navin Ramachandran

BACKGROUND Distributed ledger technology (DLT) holds great potential to improve health information exchange. However, the immutable and transparent character of this technology may conflict with data privacy regulations and data processing best practices. OBJECTIVE The aim of this paper is to develop a proof-of-concept system for immutable, interoperable, and General Data Protection Regulation (GDPR)–compliant exchange of blood glucose data. METHODS Given that there is no ideal design for a DLT-based patient-provider data exchange solution, we proposed two different variations for our proof-of-concept system. One design was based purely on the public IOTA distributed ledger (a directed acyclic graph-based DLT) and the second used the same public IOTA ledger in combination with a private InterPlanetary File System (IPFS) cluster. Both designs were assessed according to (1) data reversal risk, (2) data linkability risks, (3) processing time, (4) file size compatibility, and (5) overall system complexity. RESULTS The public IOTA design slightly increased the risk of personal data linkability, had an overall low processing time (requiring mean 6.1, SD 1.9 seconds to upload one blood glucose data sample into the DLT), and was relatively simple to implement. The combination of the public IOTA with a private IPFS cluster minimized both reversal and linkability risks, allowed for the exchange of large files (3 months of blood glucose data were uploaded into the DLT in mean 38.1, SD 13.4 seconds), but involved a relatively higher setup complexity. CONCLUSIONS For the specific use case of blood glucose explored in this study, both designs presented a suitable performance in enabling the interoperable exchange of data between patients and providers. Additionally, both systems were designed considering the latest guidelines on personal data processing, thereby maximizing the alignment with recent GDPR requirements. For future works, these results suggest that the conflict between DLT and data privacy regulations can be addressed if careful considerations are made regarding the use case and the design of the data exchange system.
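
To make the data flow of the second design concrete, here is a schematic Python sketch of the general pattern (encrypt the payload, store the ciphertext off-chain, anchor only a content hash on the ledger). It is our own stand-in: a dict plays the role of the private IPFS cluster, a list plays the role of the public IOTA Tangle, and the symmetric Fernet cipher is an assumption, not the authors' cryptography.

```python
import hashlib
import json
from cryptography.fernet import Fernet

off_chain_store = {}    # stand-in for the private IPFS cluster
ledger = []             # stand-in for the public IOTA Tangle (append-only)

def share_glucose_data(samples, patient_key):
    """Encrypt a batch of blood glucose samples, store the ciphertext off-chain,
    and anchor only its content hash on the ledger."""
    ciphertext = Fernet(patient_key).encrypt(json.dumps(samples).encode())
    content_id = hashlib.sha256(ciphertext).hexdigest()
    off_chain_store[content_id] = ciphertext
    ledger.append({"cid": content_id})          # no personal data on the ledger
    return content_id

def read_glucose_data(content_id, patient_key):
    ciphertext = off_chain_store[content_id]
    assert hashlib.sha256(ciphertext).hexdigest() == content_id  # integrity check
    return json.loads(Fernet(patient_key).decrypt(ciphertext))

key = Fernet.generate_key()                      # shared out-of-band with the provider
cid = share_glucose_data([{"t": "2019-05-01T08:00", "mmol_l": 5.4}], key)
print(read_glucose_data(cid, key))
```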


2020 ◽  
Vol 117 (15) ◽  
pp. 8344-8352 ◽  
Author(s):  
Aloni Cohen ◽  
Kobbi Nissim

There is a significant conceptual gap between legal and mathematical thinking around data privacy. The effect is uncertainty as to which technical offerings meet legal standards. This uncertainty is exacerbated by a litany of successful privacy attacks demonstrating that traditional statistical disclosure limitation techniques often fall short of the privacy envisioned by regulators. We define “predicate singling out,” a type of privacy attack intended to capture the concept of singling out appearing in the General Data Protection Regulation (GDPR). An adversary predicate singles out a dataset x using the output of a data-release mechanism M(x) if it finds a predicate p matching exactly one row in x with probability much better than a statistical baseline. A data-release mechanism that precludes such attacks is “secure against predicate singling out” (PSO secure). We argue that PSO security is a mathematical concept with legal consequences. Any data-release mechanism that purports to “render anonymous” personal data under the GDPR must prevent singling out and, hence, must be PSO secure. We analyze the properties of PSO security, showing that it fails to compose. Namely, a combination of more than logarithmically many exact counts, each individually PSO secure, facilitates predicate singling out. Finally, we ask whether differential privacy and k-anonymity are PSO secure. Leveraging a connection to statistical generalization, we show that differential privacy implies PSO security. However, and in contrast with current legal guidance, k-anonymity does not: There exists a simple predicate singling out attack under mild assumptions on the k-anonymizer and the data distribution.
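
As a small illustration of the definition (our own sketch, not the authors' full formalization), the check below tests whether a predicate isolates exactly one row of a dataset and states the statistical baseline for i.i.d. rows; the dataset and predicate are assumed examples.

```python
def isolates(predicate, rows):
    """A predicate 'singles out' the dataset if it matches exactly one row."""
    return sum(1 for r in rows if predicate(r)) == 1

def baseline_isolation_prob(n, w):
    """Probability that a fixed predicate with per-row match probability w
    isolates exactly one of n i.i.d. rows; maximized near w = 1/n at roughly 1/e."""
    return n * w * (1 - w) ** (n - 1)

rows = [
    {"age": 34, "zip": "10115", "condition": "A"},
    {"age": 34, "zip": "10117", "condition": "B"},
    {"age": 52, "zip": "10115", "condition": "A"},
]
p = lambda r: r["age"] == 34 and r["zip"] == "10115"
print("singles out:", isolates(p, rows))                 # True: exactly one match
print("baseline at w=1/n:", baseline_isolation_prob(len(rows), 1 / len(rows)))
```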


Author(s):  
Alexander Burnap ◽  
Panos Y. Papalambros

Design preference models are used widely in product planning and design development. Their prediction accuracy requires large amounts of personal user data including purchase and other personal choice records. With increased Internet and smart device use, sources of personal data are becoming more varied and their capture more ubiquitous. This situation leads to questioning whether there is a tradeoff between improving products and compromising individual user privacy. To advance this conversation, we analyze how privacy safeguards may affect design preference modeling. We conduct an experiment using real user data to study the performance of design preference models under different levels of privacy. Results indicate there is a tradeoff between accuracy and privacy. However, with enough data, models with privacy safeguards can still be sufficiently accurate to answer population-level design questions.
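
One way to make the accuracy-privacy tradeoff tangible (a generic sketch with synthetic data, not the authors' experiment or their privacy mechanism) is to train a simple choice model on labels protected by randomized response at several privacy levels and compare held-out accuracy.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
# Synthetic "design preference" data: features = product attributes, y = chosen or not.
X = rng.normal(size=(5000, 8))
y = (X @ rng.normal(size=8) + 0.5 * rng.normal(size=5000) > 0).astype(int)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)

def randomized_response(labels, epsilon):
    """Keep each binary choice label with prob e^eps / (1 + e^eps), flip otherwise
    (a local differential privacy mechanism applied before the data leave the user)."""
    p_keep = np.exp(epsilon) / (1.0 + np.exp(epsilon))
    flip = rng.random(len(labels)) > p_keep
    return np.where(flip, 1 - labels, labels)

for eps in (0.1, 0.5, 1.0, 3.0, np.inf):
    y_priv = y_tr if np.isinf(eps) else randomized_response(y_tr, eps)
    acc = LogisticRegression(max_iter=1000).fit(X_tr, y_priv).score(X_te, y_te)
    print(f"epsilon={eps}: held-out accuracy {acc:.3f}")
```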


2020 ◽  
Vol 17 (3) ◽  
pp. 819-834
Author(s):  
Wei Ou ◽  
Jianhuan Zeng ◽  
Zijun Guo ◽  
Wanqin Yan ◽  
Dingwan Liu ◽  
...  

With continuous improvements in computing power, great progress in algorithms and massive growth of data, artificial intelligence technologies have entered a third era of rapid development. However, with these improvements and the arrival of the era of big data, contradictions between data sharing and user data privacy have become increasingly prominent. Federated learning is a technology that can ensure user privacy while training a better model across different data providers. In this paper, we design a vertical federated learning system for Bayesian machine learning with homomorphic encryption. During training, raw data remain local and only encrypted model information is exchanged. The model trained by this system is comparable (up to 90%) to models trained on a single server holding the combined data, while taking privacy into consideration. This system can be widely used in risk control, medical, financial, education and other fields. It is of great significance for solving the data-islands problem and protecting users' privacy.
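
A highly simplified sketch of one round of label-protected gradient exchange in vertical federated learning, assuming the python-paillier (phe) package; this is a generic logistic-regression illustration, not the paper's Bayesian system, and the data split, learning rate, and party roles are assumptions.

```python
import numpy as np
from phe import paillier

rng = np.random.default_rng(0)
n = 40
# Vertically partitioned data: both parties hold the same 40 records;
# party A has 2 feature columns plus the labels, party B has 3 other columns.
x_A, x_B = rng.normal(size=(n, 2)), rng.normal(size=(n, 3))
y = (x_A[:, 0] + x_B[:, 1] > 0).astype(float)
w_A, w_B = np.zeros(2), np.zeros(3)

pub, priv = paillier.generate_paillier_keypair(n_length=1024)  # small key, demo only

# --- one training round ---
# Party B contributes its partial scores (in a fuller protocol these would also be
# masked); party A, who holds the labels, computes the residuals.
z = x_A @ w_A + x_B @ w_B
residual = 1.0 / (1.0 + np.exp(-z)) - y

# A encrypts the residuals so B can compute its gradient without seeing the labels.
enc_residual = [pub.encrypt(float(r)) for r in residual]
enc_grad_B = []
for j in range(x_B.shape[1]):
    acc = enc_residual[0] * float(x_B[0, j])
    for i in range(1, n):
        acc = acc + enc_residual[i] * float(x_B[i, j])   # ciphertext-plaintext ops
    enc_grad_B.append(acc)

# B returns the encrypted gradient (in practice it would add a random mask first);
# A, as key holder, decrypts, and both parties update their local weights.
grad_B = np.array([priv.decrypt(g) for g in enc_grad_B]) / n
grad_A = x_A.T @ residual / n
w_A -= 0.5 * grad_A
w_B -= 0.5 * grad_B
print("updated w_B:", w_B)
```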


Sensors ◽  
2021 ◽  
Vol 21 (23) ◽  
pp. 7806
Author(s):  
Jinmyeong Shin ◽  
Seok-Hwan Choi ◽  
Yoon-Ho Choi

As the amount of data collected and analyzed by machine learning technology increases, data that can identify individuals is also being collected in large quantities. In particular, as deep learning technology, which requires a large amount of analysis data, is adopted in various service fields, the possibility of exposing users' sensitive information increases, and the user privacy problem is growing more than ever. As a solution to this data privacy problem, homomorphic encryption technology, an encryption technology that supports arithmetic operations on encrypted data, has been applied to various fields including finance and health care in recent years. Is it then possible to use deep learning services while preserving users' data privacy by operating on homomorphically encrypted data? In this paper, we are the first to propose three attack methods that infringe users' data privacy by exploiting possible security vulnerabilities in the use of homomorphic encryption-based deep learning services. To specify and verify the feasibility of exploiting these vulnerabilities, we propose three attacks: (1) an adversarial attack exploiting the communication link between the client and the trusted party; (2) a reconstruction attack using paired input and output data; and (3) a membership inference attack by a malicious insider. In addition, we describe real-world exploit scenarios for financial and medical services. The experimental evaluation results show that the adversarial example and reconstruction attacks are a practical threat to homomorphic encryption-based deep learning models: the adversarial attack decreased the average classification accuracy from 0.927 to 0.043, and the reconstruction attack achieved an average reclassification accuracy of 0.888.
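
For reference, the first attack family mentioned above builds on standard adversarial examples. A generic FGSM sketch in PyTorch (our own illustration of that building block, not the paper's exact attack on the homomorphic encryption pipeline) is shown below.

```python
import torch
import torch.nn as nn

def fgsm_example(model, x, y, epsilon, loss_fn=nn.CrossEntropyLoss()):
    """Fast Gradient Sign Method: perturb x in the direction that increases the loss.
    In the scenario above, the perturbed input would be submitted to the encrypted
    inference service in place of the client's genuine input."""
    x_adv = x.clone().detach().requires_grad_(True)
    loss = loss_fn(model(x_adv), y)
    loss.backward()
    return (x_adv + epsilon * x_adv.grad.sign()).clamp(0.0, 1.0).detach()

# Usage with a stand-in model and a random "image".
model = nn.Sequential(nn.Flatten(), nn.Linear(28 * 28, 10)).eval()
x = torch.rand(1, 1, 28, 28)
y = torch.tensor([7])
x_adv = fgsm_example(model, x, y, epsilon=0.1)
print("max perturbation:", (x_adv - x).abs().max().item())
```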

