Utility Optimization of Federated Learning with Differential Privacy

2021 ◽  
Vol 2021 ◽  
pp. 1-14
Author(s):  
Jianzhe Zhao ◽  
Keming Mao ◽  
Chenxi Huang ◽  
Yuyang Zeng

Secure and trusted cross-platform knowledge sharing is significant for modern intelligent data analysis. To address the trade-off between privacy and utility in complex federated learning, a novel differentially private federated learning framework is proposed. First, the impact of participants' data heterogeneity on global model accuracy is analyzed quantitatively using the 1-Wasserstein distance. Then, a multilevel, multiparticipant dynamic privacy-budget allocation method is designed to reduce the injected noise and thereby improve utility efficiently. Finally, these components are integrated into a novel adaptive differentially private federated learning algorithm (A-DPFL). Comprehensive experiments on redefined non-I.I.D. MNIST and CIFAR-10 datasets demonstrate the algorithm's superior model accuracy, convergence, and robustness.
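The two quantitative ingredients above can be sketched in a few lines: the 1-Wasserstein distance between a participant's label distribution and the global one (to measure heterogeneity), and a Laplace mechanism whose noise scale is set by the allocated privacy budget. This is an illustrative numpy sketch, not the A-DPFL algorithm itself; the function names and the skewed client distribution are invented for the example.

```python
import numpy as np

def wasserstein_1(p, q):
    """1-Wasserstein distance between two discrete label
    distributions on the same ordered support (CDF difference)."""
    return np.abs(np.cumsum(p) - np.cumsum(q)).sum()

def dp_noise(gradient, clip_norm, epsilon):
    """Clip a gradient in L1 norm and add Laplace noise calibrated
    to the per-round privacy budget epsilon (sensitivity = clip_norm)."""
    norm = np.linalg.norm(gradient, 1)
    g = gradient * min(1.0, clip_norm / max(norm, 1e-12))
    return g + np.random.laplace(0.0, clip_norm / epsilon, size=g.shape)

# Heterogeneity of a (hypothetical) participant vs. a balanced global
# label distribution over 10 classes:
global_dist = np.full(10, 0.1)
client_dist = np.array([0.4, 0.3] + [0.0375] * 8)  # skewed non-IID client
d = wasserstein_1(client_dist, global_dist)
```

A larger distance would flag a more heterogeneous client, which the paper's allocation method could then use when deciding how much budget (and hence how little noise) that client receives.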

2021 ◽  
Vol 2021 ◽  
pp. 1-20
Author(s):  
Chao Liu ◽  
Jing Yang ◽  
Weinan Zhao ◽  
Yining Zhang ◽  
Jingyou Li ◽  
...  

As an information carrier, face images contain abundant sensitive information. Because of their naturally weak privacy, publishing them directly may divulge private information. Anonymization and data-encryption technologies are limited by attackers' background knowledge and attack means, and so cannot fully meet the needs of face image privacy protection. Therefore, this paper proposes a face image publishing algorithm, SWP (sliding window publication), which satisfies differential privacy. First, SWP translates the image gray-level matrix into a one-dimensional ordered data stream using image segmentation, transforming the image privacy protection problem into a data stream privacy protection problem. Then, the data stream is modeled with a sliding window: by comparing the similarity of data in adjacent windows, the privacy budget is allocated dynamically and Laplace noise is added. Because the data in the sliding window come from the image, and to present the image features in the data more comprehensively and use the privacy budget more reasonably, the paper proposes a fused similarity measurement mechanism, EM (exact mechanism), and a dynamic privacy budget allocation mechanism, DA (dynamic allocation). To further improve the usability of face images and reduce the impact of noise, a sort-SWP algorithm based on SWP is also proposed. Analysis shows that ordered input can further improve the usability of SWP, but sorting the data directly destroys ε-differential privacy. The paper therefore proposes a sorting method, SAS, which satisfies ε-differential privacy: SAS first obtains an initial sort using the exponential mechanism and then optimizes it into an approximately correct sort using a simulated annealing algorithm.
Compared with the LAP and SWP algorithms, the average accuracy of the sort-SWP algorithm on ORL and Yale is increased by 56.63% and 21.55%, recall is increased by 6.85% and 3.32%, and F1-score is improved by 55.62% and 16.55%.
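The core SWP loop described above (window the stream, compare adjacent windows, spend budget only on dissimilar windows) can be sketched as follows. This is a minimal illustration, not the paper's EM/DA mechanisms; the similarity `threshold` and the even per-release budget split are simplifying assumptions.

```python
import numpy as np

def swp_publish(stream, window, total_epsilon, threshold=5.0):
    """Sliding-window Laplace publication sketch: when the current
    window closely resembles the previous one, reuse the previous
    noisy release (spending no budget); otherwise spend a share of
    the budget on a fresh Laplace-noised release."""
    eps_per_release = total_epsilon / max(1, len(stream) // window)
    out, prev_raw, prev_pub = [], None, None
    for i in range(0, len(stream) - window + 1, window):
        cur = stream[i:i + window].astype(float)
        if prev_raw is not None and np.abs(cur - prev_raw).mean() < threshold:
            out.append(prev_pub)            # similar window: reuse, save budget
        else:
            noisy = cur + np.random.laplace(0.0, 1.0 / eps_per_release, window)
            out.append(noisy)
            prev_raw, prev_pub = cur, noisy
    return np.concatenate(out)
```

Reusing a prior release for similar windows is what lets the dynamic allocation spend a larger share of ε where the image content actually changes.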


2021 ◽  
Vol 13 (15) ◽  
pp. 2935
Author(s):  
Chunhua Qian ◽  
Hequn Qiang ◽  
Feng Wang ◽  
Mingyang Li

Building a high-precision, stable, and universal automatic extraction model of rocky desertification information is the premise for exploring the spatiotemporal evolution of rocky desertification. Taking Guizhou province as the research area and based on MODIS and continuous forest inventory data in China, we used machine learning algorithms to build a rocky desertification model with bedrock exposure rate, temperature difference, humidity, and other characteristic factors, and sought to improve model accuracy in both the spatial and temporal dimensions. The results showed the following: (1) Using supervised classification, the logical model, RF model, and SVM model were constructed separately; their accuracies were 73.8%, 78.2%, and 80.6%, respectively, with kappa coefficients of 0.61, 0.672, and 0.707. SVM performed best. (2) Vegetation types and vegetation seasonal phases are closely related to rocky desertification; after incorporating them, the model accuracy and kappa coefficient improved to 91.1% and 0.861. (3) The spatial distribution of rocky desertification in Guizhou shows a clear pattern: heavy in the west, light in the east, heavy in the south, and light in the north. Rocky desertification increased continuously from 2001 to 2019. In conclusion, combining the vertical spatial structure of vegetation with differences in seasonal phase is an effective way to improve the modeling accuracy of rocky desertification, and the SVM model has the highest classification accuracy. These results provide data support for exploring the spatiotemporal evolution pattern of rocky desertification in Guizhou.
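The kappa coefficients quoted above measure agreement beyond chance between the classified map and the reference data. A small numpy sketch of Cohen's kappa, for illustration only (the study's models themselves were built with supervised classifiers, not this snippet):

```python
import numpy as np

def cohens_kappa(y_true, y_pred, n_classes):
    """Cohen's kappa: observed agreement corrected for the
    agreement expected by chance, derived from the confusion matrix."""
    cm = np.zeros((n_classes, n_classes))
    for t, p in zip(y_true, y_pred):
        cm[t, p] += 1
    n = cm.sum()
    po = np.trace(cm) / n                      # observed accuracy
    pe = (cm.sum(0) * cm.sum(1)).sum() / n**2  # chance agreement
    return (po - pe) / (1 - pe)
```

A kappa of 0.861 alongside 91.1% accuracy indicates the improvement is not an artifact of class imbalance.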


2021 ◽  
Vol 11 (13) ◽  
pp. 5895
Author(s):  
Kristina Serec ◽  
Sanja Dolanski Babić

The double-stranded B-form and A-form have long been considered the two most important native forms of DNA, each with its own distinct biological roles and hence the focus of many areas of study, from cellular functions to cancer diagnostics and drug treatment. Due to the heterogeneity and sensitivity of the secondary structure of DNA, there is a need for tools capable of rapid and reliable quantification of DNA conformation in diverse environments. In this work, the second paper in a series addressing conformational transitions in DNA thin films by FTIR spectroscopy, we exploit popular chemometric methods, namely principal component analysis (PCA), the support vector machine (SVM) learning algorithm, and principal component regression (PCR), to quantify and categorize DNA conformation in thin films in different hydrated states. By complementing the FTIR technique with multivariate statistical methods, we demonstrate the ability of our sample preparation and automated spectral analysis protocol to rapidly and efficiently determine conformation in DNA thin films based on vibrational signatures in the 1800–935 cm⁻¹ range. Furthermore, we assess the impact of small hydration-related changes in FTIR spectra on automated DNA conformation detection and how to avoid discrepancies through careful sampling.
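The first chemometric step, projecting each spectrum onto the leading principal components, can be sketched with a plain SVD. This is a generic PCA illustration, not the authors' spectral-analysis protocol; `spectra` is assumed to be a matrix with one preprocessed absorbance spectrum per row.

```python
import numpy as np

def pca_scores(spectra, n_components=2):
    """Mean-centre the spectra and project them onto the leading
    principal components (via SVD). The resulting low-dimensional
    scores are the feature space in which a downstream classifier
    (e.g. an SVM) can separate A-form from B-form signatures."""
    X = spectra - spectra.mean(axis=0)
    U, S, Vt = np.linalg.svd(X, full_matrices=False)
    return X @ Vt[:n_components].T
```

Because the singular values are sorted, the first score column captures the most spectral variance, which for hydrated DNA films tends to track the dominant conformational change.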


2021 ◽  
Author(s):  
Thomas Weripuo Gyeera

<div>The National Institute of Standards and Technology defines the fundamental characteristics of cloud computing as on-demand computing, offered via the network, using pooled resources, with rapid elastic scaling and metered charging. The rapid dynamic allocation and release of resources on demand to meet heterogeneous computing needs is particularly challenging for data centres, which process huge amounts of data characterised by high volume, velocity, variety and veracity (the 4Vs model). Data centres seek to regulate this by monitoring and adaptation, typically reacting to service failures after the fact. We present a real cloud test bed capable of proactively monitoring and gathering cloud resource information to make predictions and forecasts, in contrast to the state-of-the-art reactive monitoring of cloud data centres. We argue that the behavioural patterns and Key Performance Indicators (KPIs) characterizing virtualized servers, networks, and database applications can best be studied and analysed with predictive models. Specifically, we applied the Boosted Decision Tree machine learning algorithm to make future predictions of the KPIs of a cloud server and virtual infrastructure network, yielding an R-squared of 0.9991 at a 0.2 learning rate. This predictive framework is beneficial for making short- and long-term predictions for cloud resources.</div>
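Boosted decision trees of the kind used above fit each new tree to the residuals of the current prediction and add it with a learning-rate weight. A toy numpy sketch with depth-1 stumps on a single feature, using the paper's 0.2 learning rate and reporting R-squared; this is an illustration of the technique, not the test bed's implementation.

```python
import numpy as np

def fit_stump(x, r):
    """Best single-split regression stump on 1-D feature x for residual r."""
    best = (np.inf, None)
    for s in np.unique(x):
        left, right = r[x <= s], r[x > s]
        if len(left) == 0 or len(right) == 0:
            continue
        err = ((left - left.mean())**2).sum() + ((right - right.mean())**2).sum()
        if err < best[0]:
            best = (err, (s, left.mean(), right.mean()))
    return best[1]

def boost_predict(x, y, rounds=50, lr=0.2):
    """Gradient boosting with stumps: each round fits the residual
    y - pred and adds the stump scaled by the learning rate."""
    pred = np.full_like(y, y.mean(), dtype=float)
    for _ in range(rounds):
        s, lm, rm = fit_stump(x, y - pred)
        pred += lr * np.where(x <= s, lm, rm)
    ss_res = ((y - pred)**2).sum()
    ss_tot = ((y - y.mean())**2).sum()
    return pred, 1.0 - ss_res / ss_tot   # predictions and R-squared
```

The learning rate trades convergence speed against overfitting, which is why the reported R-squared is tied to a specific rate (0.2).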


2018 ◽  
Vol 146 (4) ◽  
pp. 1197-1218
Author(s):  
Michèle De La Chevrotière ◽  
John Harlim

This paper demonstrates the efficacy of data-driven localization mappings for assimilating satellite-like observations in a dynamical system of intermediate complexity. In particular, a sparse network of synthetic brightness temperature measurements is simulated using an idealized radiative transfer model and assimilated to the monsoon–Hadley multicloud model, a nonlinear stochastic model containing several thousands of model coordinates. A serial ensemble Kalman filter is implemented in which the empirical correlation statistics are improved using localization maps obtained from a supervised learning algorithm. The impact of the localization mappings is assessed in perfect-model observing system simulation experiments (OSSEs) as well as in the presence of model errors resulting from the misspecification of key convective closure parameters. In perfect-model OSSEs, the localization mappings that use adjacent correlations to improve the correlation estimated from small ensemble sizes produce robust accurate analysis estimates. In the presence of model error, the filter skills of the localization maps trained on perfect- and imperfect-model data are comparable.
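Localization in a serial ensemble Kalman filter is conventionally a Schur (elementwise) product of the sample correlation matrix with a distance-based taper; the contribution above is to learn that map from data instead. A sketch of the conventional baseline, with a Gaussian taper standing in for the usual Gaspari-Cohn function and grid index used as a proxy for distance:

```python
import numpy as np

def localized_correlation(ensemble, lengthscale):
    """Taper raw ensemble sample correlations with a distance-based
    Schur (elementwise) product, the standard remedy for spurious
    long-range correlations estimated from small ensembles."""
    corr = np.corrcoef(ensemble)   # (n_state, n_state) sample correlation
    n = corr.shape[0]
    dist = np.abs(np.subtract.outer(np.arange(n), np.arange(n)))
    taper = np.exp(-0.5 * (dist / lengthscale) ** 2)  # Gaussian stand-in taper
    return corr * taper
```

A learned localization map replaces the fixed `taper` with a mapping trained on data, which is what allows adjacent correlations to inform the estimate rather than simply damping it.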


2021 ◽  
Vol 2021 ◽  
pp. 1-13
Author(s):  
Qian Huang ◽  
Xue Wen Li

Big data is a massive and diverse form of unstructured data that needs proper analysis and management. It is another great technological revolution after the Internet, the Internet of Things, and cloud computing. This paper first studies the related concepts and basic theories as the origin of the research. Second, it analyzes in depth the problems and challenges faced by Chinese government management under the impact of big data. Third, it explores the opportunities that big data brings to government management in terms of management efficiency, administrative capacity, and public services, and argues that governments should seize these opportunities to make changes. Brain-like computing attempts to simulate the structure and information-processing mechanisms of biological neural networks. The paper then analyzes the development status of e-government at home and abroad, studies service-oriented architecture (SOA) and web services technology, examines e-government and SOA theory in depth, and discusses these in light of the development status of e-government in a particular region. Finally, a deep learning algorithm is used to construct a monitoring platform that monitors government behavior in real time and performs in-depth mining to analyze the government's intended behavior.


2016 ◽  
Author(s):  
Bethany Signal ◽  
Brian S Gloss ◽  
Marcel E Dinger ◽  
Timothy R Mercer

Background: The branchpoint element is required for the first lariat-forming reaction in splicing. However, because it is difficult to map experimentally at genome-wide scale, current catalogues are incomplete. Results: We have developed a machine learning algorithm trained with empirical human branchpoint annotations to identify branchpoint elements from primary genome sequence alone. Using this approach, we can accurately locate branchpoint elements in 85% of introns in current gene annotations. Consistent with branchpoints being basal genetic elements, we find our annotation is unbiased with respect to gene type and expression level. A major fraction of introns was found to encode multiple branchpoints, raising the prospect that mutational redundancy is encoded in key genes. We also confirmed all deleterious branchpoint mutations annotated in clinical variant databases and further identified thousands of clinical and common genetic variants with similar predicted effects. Conclusions: We propose that this broad annotation of branchpoints constitutes a valuable resource for further investigations into the genetic encoding of splicing patterns and for interpreting the impact of common and disease-causing human genetic variation on gene splicing.
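A sequence-based classifier of this kind typically starts by encoding the window around each candidate branchpoint position as a fixed-length feature vector. An illustrative one-hot encoder follows; the flank size, function name, and encoding choice are assumptions for the example, not the authors' actual feature set.

```python
import numpy as np

def one_hot_window(seq, centre, flank=5):
    """One-hot encode the sequence window around a candidate
    branchpoint position: each base becomes a 4-vector over ACGT,
    and the window is flattened into one feature vector."""
    alphabet = "ACGT"
    window = seq[centre - flank: centre + flank + 1]
    vec = np.zeros((len(window), 4))
    for i, base in enumerate(window):
        vec[i, alphabet.index(base)] = 1.0
    return vec.ravel()
```

Feature vectors like this, labelled by the empirical branchpoint annotations, are what a supervised model can be trained on to score every intronic position.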


Author(s):  
Pinar Demetci ◽  
Rebecca Santorella ◽  
Björn Sandstede ◽  
William Stafford Noble ◽  
Ritambhara Singh

Data integration of single-cell measurements is critical for understanding cell development and disease, but the lack of correspondence between different types of measurements makes such efforts challenging. Several unsupervised algorithms can align heterogeneous single-cell measurements in a shared space, enabling the creation of mappings between single cells in different data domains. However, these algorithms require hyperparameter tuning for high-quality alignments, which is difficult in an unsupervised setting without correspondence information for validation. We present Single-Cell alignment using Optimal Transport (SCOT), an unsupervised learning algorithm that uses Gromov-Wasserstein-based optimal transport to align single-cell multi-omics datasets. We compare the alignment performance of SCOT with state-of-the-art algorithms on four simulated and two real-world datasets. SCOT performs on par with state-of-the-art methods but is faster and requires tuning fewer hyperparameters. Furthermore, we provide an algorithm for SCOT that uses the Gromov-Wasserstein distance to guide parameter selection. Thus, unlike previous methods, SCOT aligns well without using any orthogonal correspondence information to pick its hyperparameters. Our source code and scripts for replicating the results are available at https://github.com/rsinghlab/SCOT.
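Once an optimal-transport coupling between the two domains has been computed (SCOT obtains it via Gromov-Wasserstein), each cell can be mapped into the other domain by barycentric projection, i.e. a coupling-weighted average of the target cells. A minimal sketch of that final projection step, with the coupling matrix assumed given:

```python
import numpy as np

def barycentric_projection(coupling, X_target):
    """Map each source cell into the target domain as the
    coupling-weighted average of target cells. `coupling` has shape
    (n_source, n_target); rows are normalized into weights."""
    w = coupling / coupling.sum(axis=1, keepdims=True)
    return w @ X_target
```

With an identity coupling each source cell maps exactly onto its matched target cell; a diffuse coupling instead blends nearby target cells, which is how soft correspondences are expressed.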


2021 ◽  
Vol 2021 ◽  
pp. 1-13
Author(s):  
Hanlin Liu ◽  
Linqiang Yang ◽  
Linchao Li

A variety of climate factors influence the precision of long-term Global Navigation Satellite System (GNSS) monitoring data. To precisely analyze the effect of different climate factors on long-term GNSS monitoring records, this study combines the extended seven-parameter Helmert transformation and a machine learning algorithm, Extreme Gradient Boosting (XGBoost), into a hybrid model. We established a local-scale reference frame, the stable Puerto Rico and Virgin Islands reference frame of 2019 (PRVI19), using ten continuously operating long-term GNSS sites located in the rigid portion of the Puerto Rico and Virgin Islands (PRVI) microplate. The stability of PRVI19 is approximately 0.4 mm/year and 0.5 mm/year in the horizontal and vertical directions, respectively. The stable reference frame PRVI19 avoids the risk of bias due to long-term plate motions when studying localized ground deformation. Furthermore, we applied the XGBoost algorithm to the postprocessed long-term GNSS records and daily climate data to train the model, and quantitatively evaluated the importance of various daily climate factors on the GNSS time series. The results show that wind is the most influential factor, with a unit-less importance index of 0.013. Notably, we used the model with climate and GNSS records to predict GNSS-derived displacements. The predicted displacements have a slightly lower root mean square error than results fitted with the spline method (prediction: 0.22 versus fitted: 0.31), indicating that the proposed model, by incorporating climate records, yields suitable predictions for long-term GNSS monitoring.
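The classic seven-parameter Helmert transformation referenced above maps coordinates between reference frames with three translations, three rotations, and a scale factor. A sketch of the standard small-angle form under one common sign convention; the paper uses an extended variant, so the parameter handling here is purely illustrative.

```python
import numpy as np

def helmert_transform(xyz, tx, ty, tz, rx, ry, rz, scale):
    """Seven-parameter Helmert transformation (small-angle
    approximation): translate, rotate, and scale Cartesian
    coordinates into another reference frame. Rotations in radians."""
    R = np.array([[1.0, -rz,  ry],
                  [ rz, 1.0, -rx],
                  [-ry,  rx, 1.0]])
    return np.array([tx, ty, tz]) + (1.0 + scale) * xyz @ R.T
```

With all seven parameters zero the transformation is the identity, which is a convenient sanity check when estimating frame-alignment parameters.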

