Beyond Differential Privacy: Synthetic Micro-Data Generation with Deep Generative Neural Networks

Author(s):  
Ofer Mendelevitch ◽  
Michael D. Lesh
2021 ◽  
Vol 25 (1) ◽  
pp. 138-161
Author(s):  
O. G. Bondar ◽  
E. O. Brezhneva ◽  
O. G. Dobroserdov ◽  
K. G. Andreev ◽  
N. V. Polyakov

Purpose of research: search for and analysis of existing models of gas-sensitive sensors; development of mathematical models of gas-sensitive sensors of various types (semiconductor, thermocatalytic, optical, electrochemical) for subsequent use in training artificial neural networks (ANNs); investigation of the main physicochemical patterns underlying the principles of sensor operation, considering the influence of environmental factors and cross-sensitivity on the sensor output signal; comparison of simulation results with the actual characteristics of sensors produced by industry. The concept of creating the mathematical models is described, and their parameterization, investigation, and assessment of adequacy are carried out.

Methods. Numerical methods, computer modeling, electrical circuit theory, the theory of chemisorption and heterogeneous catalysis, the Freundlich and Langmuir equations, the Bouguer-Lambert-Beer law, and the fundamentals of electrochemistry were used in creating the mathematical models. Root-mean-square (RMS) and relative errors were calculated to assess the adequacy of the models.

Results. The concept of creating mathematical models of sensors based on physicochemical patterns is described. This concept allows the generation of training data for the artificial neural networks used in multi-component gas analyzers for joint information processing to be automated. Models of semiconductor, thermocatalytic, optical, and electrochemical sensors were obtained and refined to account for the influence of additional factors on the sensor signal. Parameterization and assessment of the adequacy and extrapolation properties of the models were carried out against the graphical dependencies presented in the sensors' technical documentation. The relative and RMS errors between real data and the simulation results for the basic parameters of the gas-sensitive sensors were determined. The RMS error in reproducing the main characteristics of the sensors did not exceed 0.5%.

Conclusion. Multivariable mathematical models of gas-sensitive sensors were synthesized that account for the influence of the target gas and external factors (pressure, temperature, humidity, cross-sensitivity) on the output signal and allow training data to be generated for sensors of various types.
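As a hedged illustration of the kind of physicochemical model the abstract describes, the sketch below generates synthetic training data for an optical (NDIR-style) sensor from the Bouguer-Lambert-Beer law. The absorption coefficient, path length, temperature-drift model, and noise level are illustrative assumptions, not parameters from the paper.

```python
import numpy as np

# Hypothetical parameters for an NDIR-style optical sensor (assumed, illustrative only).
EPSILON = 0.35   # effective absorption coefficient, 1/(vol% * cm)
PATH_CM = 5.0    # optical path length, cm
I0 = 1.0         # normalized source intensity

def optical_sensor_response(concentration, temperature_c=25.0):
    """Bouguer-Lambert-Beer law: transmitted intensity decays
    exponentially with gas concentration; a small assumed temperature
    drift mimics the environmental factors the paper models."""
    drift = 1.0 + 0.002 * (temperature_c - 25.0)  # assumed drift model
    return I0 * np.exp(-EPSILON * concentration * PATH_CM) * drift

# Generate a synthetic training set: random gas concentrations and
# temperatures mapped to noisy sensor readings.
rng = np.random.default_rng(0)
conc = rng.uniform(0.0, 5.0, size=10_000)        # vol %
temp = rng.uniform(-10.0, 50.0, size=10_000)     # deg C
signal = optical_sensor_response(conc, temp)
signal += rng.normal(0.0, 0.002, size=signal.shape)  # measurement noise

X = np.column_stack([signal, temp])  # ANN inputs: reading + temperature
y = conc                             # ANN target: gas concentration
```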


Author(s):  
S Thivaharan ◽  
G Srivatsun

The amount of data generated by modern communication devices is enormous, reaching petabytes, and the rate of data generation is increasing at an unprecedented pace. Although modern technology supports storage at massive scale, industry is reluctant to retain data with the following characteristics: redundancy, unformatted records with outdated information, data that misleads prediction, and data with no impact on class prediction. Among these sources, social media plays a significant role in data generation; compared to other generators, the rate at which social media produces data is considerably higher. Industry and governments are both worried about the circulation of malicious or objectionable content, as such platforms are highly susceptible to exploitation by criminals. It is therefore time to develop a model that classifies social media content as fair or unfair, with high accuracy in predicting the class of the content. In this article, TensorFlow-based deep neural networks are deployed with a fixed epoch count of 15 in order to attain 25% higher accuracy than other existing models. Activation functions such as ReLU and sigmoid, available in the TensorFlow platform, help attain the improved prediction accuracy.
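A minimal sketch of the kind of setup the article describes, assuming a TensorFlow/Keras binary classifier over pre-vectorized post features. The layer sizes and input dimension are illustrative assumptions; the fixed epoch count of 15 and the ReLU/sigmoid activations come from the article.

```python
import tensorflow as tf

# Illustrative input dimension for pre-vectorized social media posts (assumed).
INPUT_DIM = 512

model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(INPUT_DIM,)),
    tf.keras.layers.Dense(128, activation="relu"),
    tf.keras.layers.Dense(64, activation="relu"),
    # Sigmoid output for the binary fair/unfair decision.
    tf.keras.layers.Dense(1, activation="sigmoid"),
])
model.compile(optimizer="adam",
              loss="binary_crossentropy",
              metrics=["accuracy"])

# Fixed epoch count of 15, as in the article; x_train / y_train are
# assumed feature vectors and 0/1 (fair/unfair) labels.
# model.fit(x_train, y_train, epochs=15, validation_split=0.1)
```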


2020 ◽  
Author(s):  
Ishika Singh ◽  
Haoyi Zhou ◽  
Kunlin Yang ◽  
Meng Ding ◽  
Bill Lin ◽  
...  

Neural architecture search, which aims to automatically search for architectures (e.g., convolution, max pooling) of neural networks that maximize validation performance, has achieved remarkable progress recently. In many application scenarios, several parties would like to collaboratively search for a shared neural architecture by leveraging data from all parties. However, due to privacy concerns, no party wants its data to be seen by the others. To address this problem, we propose federated neural architecture search (FNAS), in which different parties collectively search for a differentiable architecture by exchanging gradients of architecture variables without exposing their data to other parties. To further preserve privacy, we study differentially private FNAS (DP-FNAS), which adds random noise to the gradients of the architecture variables. We provide theoretical guarantees that DP-FNAS achieves differential privacy. Experiments show that DP-FNAS can search for highly performant neural architectures while protecting the privacy of individual parties. The code is available at https://github.com/UCSD-AI4H/DP-FNAS
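The core privacy step the abstract describes, perturbing the gradients of architecture variables before they are shared, can be sketched as below. The clipping norm and noise multiplier are illustrative assumptions, and the update follows the standard clip-then-add-Gaussian-noise recipe rather than the paper's exact procedure.

```python
import numpy as np

def privatize_gradient(grad, clip_norm=1.0, noise_multiplier=1.1, rng=None):
    """Clip an architecture-variable gradient to a maximum L2 norm,
    then add Gaussian noise scaled to that norm (standard DP-SGD
    recipe; the constants here are assumptions, not the paper's)."""
    rng = rng or np.random.default_rng()
    norm = np.linalg.norm(grad)
    clipped = grad * min(1.0, clip_norm / (norm + 1e-12))
    noise = rng.normal(0.0, noise_multiplier * clip_norm, size=grad.shape)
    return clipped + noise

# Each party would apply this to its local gradient of the
# architecture variables before exchanging it with the others.
local_grad = np.random.randn(64)          # stand-in architecture gradient
shared_grad = privatize_gradient(local_grad)
```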


2021 ◽  
Author(s):  
Ali Hatamizadeh ◽  
Hongxu Yin ◽  
Pavlo Molchanov ◽  
Andriy Myronenko ◽  
Wenqi Li ◽  
...  

Federated learning (FL) allows the collaborative training of AI models without the need to share raw data. This capability makes it especially interesting for healthcare applications, where patient and data privacy are of utmost concern. However, recent work on the inversion of deep neural networks from model gradients has raised concerns about the ability of FL to prevent the leakage of training data. In this work, we show that the attacks presented in the literature are impractical in real FL use cases, and we provide a new baseline attack that works in more realistic scenarios where the clients' training involves updating the Batch Normalization (BN) statistics. Furthermore, we present new ways to measure and visualize potential data leakage in FL. Our work is a step towards establishing reproducible methods of measuring data leakage in FL and could help determine the optimal tradeoffs between privacy-preserving techniques, such as differential privacy, and model accuracy based on quantifiable metrics.
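To make the BN point concrete, here is a minimal sketch, under assumed names and a toy architecture, of what a realistic FL client shares: not just learnable parameters but also the BatchNorm running statistics accumulated as a side effect of local training, which is the extra state the scenario above involves.

```python
import torch
import torch.nn as nn

# Toy client model with a BatchNorm layer (architecture is assumed).
model = nn.Sequential(nn.Linear(16, 32), nn.BatchNorm1d(32), nn.ReLU(),
                      nn.Linear(32, 2))

def client_update_payload(model):
    """What a realistic client sends to the server: learnable
    parameters plus BN running statistics, which are updated during
    local training and can carry information about the local data."""
    payload = {k: v.clone() for k, v in model.state_dict().items()}
    bn_stats = {k: v for k, v in payload.items()
                if "running_mean" in k or "running_var" in k}
    return payload, bn_stats

# One forward pass in training mode updates the BN running stats.
x = torch.randn(8, 16)
model.train()
_ = model(x)
payload, bn_stats = client_update_payload(model)
```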


2021 ◽  
Vol 118 (15) ◽  
pp. e2101344118
Author(s):  
Qiao Liu ◽  
Jiaze Xu ◽  
Rui Jiang ◽  
Wing Hung Wong

Density estimation is one of the fundamental problems in both statistics and machine learning. In this study, we propose Roundtrip, a computational framework for general-purpose density estimation based on deep generative neural networks. Roundtrip retains the generative power of deep generative models, such as generative adversarial networks (GANs), while also providing estimates of density values, thus supporting both data generation and density estimation. Unlike previous neural density estimators that impose stringent conditions on the transformation from the latent space to the data space, Roundtrip enables the use of much more general mappings, where the target density is modeled by learning a manifold induced from a base density (e.g., a Gaussian distribution). Roundtrip provides a statistical framework for GAN models in which an explicit evaluation of density values is feasible. In numerical experiments, Roundtrip exceeds state-of-the-art performance on a diverse range of density estimation tasks.
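A toy sketch of the roundtrip idea under strong simplifying assumptions: with a forward map G from a Gaussian base density and an approximate inverse map H back to the latent space (both linear stand-ins here, not trained networks), the density at a point can be estimated by importance sampling around H(x), treating x = G(z) + Gaussian noise. This is an illustration of the principle, not the paper's implementation.

```python
import numpy as np
from scipy.stats import norm

# Linear stand-ins for the trained networks (assumed, for illustration):
# G maps latent z to data x; H approximately inverts it.
G = lambda z: 2.0 * z + 1.0
H = lambda x: (x - 1.0) / 2.0
SIGMA_X = 0.1  # assumed observation noise in x = G(z) + eps

def roundtrip_density(x, n_samples=10_000, proposal_scale=0.5, rng=None):
    """Estimate p(x) = E_{z ~ N(0,1)}[ N(x; G(z), SIGMA_X^2) ] by
    importance sampling with a proposal centered at H(x)."""
    rng = rng or np.random.default_rng(0)
    mu = H(x)
    z = rng.normal(mu, proposal_scale, size=n_samples)
    base = norm.pdf(z)                          # base density pi(z)
    lik = norm.pdf(x, loc=G(z), scale=SIGMA_X)  # p(x | z)
    proposal = norm.pdf(z, loc=mu, scale=proposal_scale)
    return np.mean(base * lik / proposal)

# Under this linear G, x ~ N(1, 4 + SIGMA_X^2), so the estimate at
# x = 1 should be close to norm.pdf(1, loc=1, scale=np.sqrt(4.01)).
print(roundtrip_density(1.0))
```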


2021 ◽  
Vol 2021 (1) ◽  
pp. 64-84
Author(s):  
Ashish Dandekar ◽  
Debabrota Basu ◽  
Stéphane Bressan

The calibration of noise for a privacy-preserving mechanism depends on the sensitivity of the query and the prescribed privacy level. A data steward must make the non-trivial choice of a privacy level that balances the requirements of users and the monetary constraints of the business entity.

Firstly, we analyse the roles of the sources of randomness involved in the design of a privacy-preserving mechanism, namely the explicit randomness induced by the noise distribution and the implicit randomness induced by the data-generation distribution. This finer analysis enables us to provide stronger privacy guarantees with quantifiable risks. Thus, we propose privacy at risk, a probabilistic calibration of privacy-preserving mechanisms. We provide a composition theorem that leverages privacy at risk, and we instantiate the probabilistic calibration for the Laplace mechanism by providing analytical results.

Secondly, we propose a cost model that bridges the gap between the privacy level and the compensation budget estimated by a GDPR-compliant business entity. The convexity of the proposed cost model leads to a unique fine-tuning of the privacy level that minimises the compensation budget. We show its effectiveness by illustrating a realistic scenario that avoids overestimation of the compensation budget by using privacy at risk for the Laplace mechanism. We quantitatively show that composition using the cost-optimal privacy at risk provides a stronger privacy guarantee than the classical advanced composition. Although the illustration is specific to the chosen cost model, it naturally extends to any convex cost model. We also provide realistic illustrations of how a data steward uses privacy at risk to balance the trade-off between utility and privacy.
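As a concrete anchor for the calibration the abstract discusses, here is a minimal sketch of the classical Laplace mechanism, where the noise scale is the query sensitivity divided by the privacy level epsilon. The query and its sensitivity here are illustrative (a counting query with sensitivity 1); this is the classical mechanism, not the paper's probabilistic "privacy at risk" calibration itself.

```python
import numpy as np

def laplace_mechanism(true_answer, sensitivity, epsilon, rng=None):
    """Classical epsilon-DP Laplace mechanism: add Laplace noise with
    scale b = sensitivity / epsilon to the query answer."""
    rng = rng or np.random.default_rng()
    b = sensitivity / epsilon
    return true_answer + rng.laplace(0.0, b)

# Illustrative counting query (sensitivity 1): how many records
# exceed a threshold. A smaller epsilon means more noise.
data = np.array([3.2, 7.1, 5.5, 9.0, 4.4])
count = float(np.sum(data > 5.0))
for eps in (0.1, 1.0, 10.0):
    print(eps, laplace_mechanism(count, sensitivity=1.0, epsilon=eps))
```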


Author(s):  
George Leal Jamil ◽  
Alexis Rocha da Silva

Users' personal, highly sensitive data, such as photos and voice recordings, are kept indefinitely by the companies that collect them; users can neither delete these data nor restrict the purposes for which they are used. By learning how to do machine learning in a way that protects privacy, we can make a real difference in solving many social problems, such as curing disease. Deep neural networks are susceptible to various inference attacks because they remember information about their training data. In this chapter, the authors introduce differential privacy, which ensures that different kinds of statistical analysis do not compromise privacy, and federated learning, which trains a machine learning model on data to which we do not have access.
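A minimal sketch of the federated learning idea the chapter introduces: each party trains locally and only model parameters are averaged, so the raw data never leaves its owner. The linear model and the two-client setup are illustrative assumptions.

```python
import numpy as np

def local_sgd(weights, X, y, lr=0.1, steps=20):
    """Local linear-regression SGD on one client's private data."""
    w = weights.copy()
    for _ in range(steps):
        grad = 2.0 * X.T @ (X @ w - y) / len(y)
        w -= lr * grad
    return w

rng = np.random.default_rng(0)
w_true = np.array([1.5, -2.0])
clients = []
for _ in range(2):  # two parties, each holding private data
    X = rng.normal(size=(100, 2))
    y = X @ w_true + rng.normal(0.0, 0.1, size=100)
    clients.append((X, y))

# Federated averaging: the server only ever sees model weights.
w_global = np.zeros(2)
for _round in range(10):
    updates = [local_sgd(w_global, X, y) for X, y in clients]
    w_global = np.mean(updates, axis=0)
print(w_global)  # approaches w_true without sharing any raw data
```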

