Understanding Privacy-Utility Tradeoffs in Differentially Private Online Active Learning

Daniel M Bittner; Alejandro E Brito; Mohsen Ghassemi; Shantanu Rane; Anand D Sarwate; Rebecca N Wright

doi:10.29012/jpc.720

Journal of Privacy and Confidentiality ◽

10.29012/jpc.720 ◽

2020 ◽

Vol 10 (2) ◽

Author(s):

Daniel M Bittner ◽

Alejandro E Brito ◽

Mohsen Ghassemi ◽

Shantanu Rane ◽

Anand D Sarwate ◽

...

Keyword(s):

Active Learning ◽

Web Application ◽

Differential Privacy ◽

Learning Algorithm ◽

Privacy Preserving ◽

General Purpose ◽

Stochastic Gradient Descent ◽

Support Vector ◽

Expert User ◽

Vector Machines

We consider privacy-preserving learning in the context of online learning. In settings where data instances arrive sequentially in streaming fashion, incremental training algorithms such as stochastic gradient descent (SGD) can be used to learn and update prediction models. When labels are costly to acquire, active learning methods can be used to select samples to be labeled from a stream of unlabeled data. These labeled data samples are then used to update the machine learning models. Privacy-preserving online learning can be used to update predictors on data streams containing sensitive information. The differential privacy framework quantifies the privacy risk in such settings. This work proposes a differentially private online active learning algorithm using stochastic gradient descent (SGD) to retrain the classifiers. We propose two methods for selecting informative samples. We incorporated this into a general-purpose web application that allows a non-expert user to evaluate the privacy-aware classifier and visualize key privacy-utility tradeoffs. Our application supports linear support vector machines and logistic regression and enables an analyst to configure and visualize the effect of using differentially private online active learning versus a non-private counterpart. The application is useful for comparing the privacy/utility tradeoff of different algorithms, which can help decision makers choose which algorithms and parameters to use. Additionally, we use the application to evaluate our SGD-based solution and to show that it generates predictions with a better privacy-utility tradeoff than earlier methods.
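
The loop this abstract describes (stream in unlabeled points, query labels only for informative ones, update with SGD under a privacy mechanism) can be illustrated with a minimal sketch. This is not the authors' algorithm. The uncertainty threshold for selection, the gradient clipping, the Gaussian noise, and every function and parameter name below are assumptions made for illustration, and the sketch does no privacy accounting, so it carries no formal differential privacy guarantee.

```python
import math
import random


def sigmoid(z):
    # Numerically safe logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def clip(vec, max_norm):
    # Scale vec so that its L2 norm is at most max_norm.
    norm = math.sqrt(sum(v * v for v in vec))
    if norm <= max_norm or norm == 0.0:
        return list(vec)
    return [v * max_norm / norm for v in vec]


def noisy_sgd_step(w, x, y, lr, clip_norm, noise_std, rng):
    # Gradient of the logistic loss log(1 + exp(-y * w.x)), y in {-1, +1}.
    margin = y * sum(wi * xi for wi, xi in zip(w, x))
    coeff = -y * sigmoid(-margin)
    grad = clip([coeff * xi for xi in x], clip_norm)
    # Assumption: Gaussian perturbation of the clipped gradient.
    noisy = [g + rng.gauss(0.0, noise_std) for g in grad]
    return [wi - lr * gi for wi, gi in zip(w, noisy)]


def run_stream(stream, oracle, dim, threshold=0.5, lr=0.1,
               clip_norm=1.0, noise_std=0.1, seed=0):
    # Hypothetical driver: oracle(x) stands in for the costly labeler.
    rng = random.Random(seed)
    w = [0.0] * dim
    queried = 0
    for x in stream:
        score = sum(wi * xi for wi, xi in zip(w, x))
        # Query the label only when the current model is uncertain about x.
        if abs(score) < threshold:
            y = oracle(x)
            queried += 1
            w = noisy_sgd_step(w, x, y, lr, clip_norm, noise_std, rng)
    return w, queried
```

Raising `noise_std` or lowering `threshold` in this sketch trades accuracy or label cost against the strength of the perturbation, which is the kind of tradeoff the described web application lets an analyst visualize.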

Comparison of SVM, RF and SGD Methods for Determination of Programmer's Performance Classification Model in Social Media Activities

Jurnal RESTI (Rekayasa Sistem dan Teknologi Informasi) ◽

10.29207/resti.v4i2.1770 ◽

2020 ◽

Vol 4 (2) ◽

pp. 329-335

Author(s):

Rusydi Umar ◽

Imam Riadi ◽

Purwono

Keyword(s):

Social Media ◽

Gradient Descent ◽

Classification Model ◽

Stochastic Gradient Descent ◽

Support Vector ◽

Svm Algorithm ◽

Vector Machines ◽

Performance Patterns ◽

A Company

Most startup failures in Indonesia are attributed to teams that are not solid and competent. Programmers are integral to a startup team. Social media can serve as a strategic tool for recruiting the best programmer candidates for a company, in the form of a system that automatically classifies the social media posts of prospective programmers. The classification results are expected to predict each candidate's performance pattern as either good or bad. An effective tool requires the classification method with the best accuracy, so several methods must be compared. This study compares the Support Vector Machines (SVM) algorithm, Random Forest (RF), and Stochastic Gradient Descent (SGD). With 10-fold cross-validation, accuracy reaches 81.3% for SVM, 74.4% for RF, and 80.1% for SGD, so SVM is chosen as the model for classifying programmer performance from social media activity.
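
The k = 10 cross-validation protocol behind these accuracy figures can be sketched as follows. This is a generic illustration, not the study's code: the helper names, the shuffling seed, and the placeholder majority-class classifier are all assumptions, and any of the compared models could be plugged in through `fit` and `predict`.

```python
import random


def k_fold_indices(n, k, seed=0):
    # Shuffle indices once, then deal them into k nearly equal folds.
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]


def cross_val_accuracy(fit, predict, X, y, k=10, seed=0):
    # Requires len(X) >= k so that every fold is non-empty.
    folds = k_fold_indices(len(X), k, seed)
    scores = []
    for held_out in folds:
        held = set(held_out)
        train = [i for i in range(len(X)) if i not in held]
        model = fit([X[i] for i in train], [y[i] for i in train])
        correct = sum(predict(model, X[i]) == y[i] for i in held_out)
        scores.append(correct / len(held_out))
    # Report the mean accuracy over the k held-out folds.
    return sum(scores) / k


def fit_majority(X, y):
    # Placeholder classifier: remember the most frequent training label.
    return max(set(y), key=y.count)


def predict_majority(model, x):
    return model
```

Each sample is held out exactly once, so the averaged score estimates accuracy on unseen data rather than on the training set.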

Inconsistency-based active learning for support vector machines

Pattern Recognition ◽

10.1016/j.patcog.2012.03.022 ◽

2012 ◽

Vol 45 (10) ◽

pp. 3751-3767 ◽

Cited By ~ 23

Author(s):

Ran Wang ◽

Sam Kwong ◽

Degang Chen

Keyword(s):

Support Vector Machines ◽

Active Learning ◽

Support Vector ◽

Vector Machines

Adapting SVM for data sparseness and imbalance: a case study in information extraction

Natural Language Engineering ◽

10.1017/s1351324908004968 ◽

2009 ◽

Vol 15 (2) ◽

pp. 241-271 ◽

Cited By ~ 31

Author(s):

Yaoyong Li ◽

Kalina Bontcheva ◽

Hamish Cunningham

Keyword(s):

Active Learning ◽

Language Learning ◽

Information Extraction ◽

Language Processing ◽

Learning Algorithm ◽

Machine Learning Algorithms ◽

Training Data ◽

Support Vector ◽

Passive Learning ◽

Wide Range

Support Vector Machines (SVM) have been used successfully in many Natural Language Processing (NLP) tasks. The novel contribution of this paper is in investigating two techniques for making SVM more suitable for language learning tasks. Firstly, we propose an SVM with uneven margins (SVMUM) model to deal with the problem of imbalanced training data. Secondly, SVM active learning is employed in order to alleviate the difficulty in obtaining labelled training data. The algorithms are presented and evaluated on several Information Extraction (IE) tasks, where they achieved better performance than the standard SVM and the SVM with passive learning, respectively. Moreover, by combining SVMUM with the active learning algorithm, we achieve the best reported results on the seminars and jobs corpora, which are benchmark data sets used for evaluation and comparison of machine learning algorithms for IE. In addition, we also evaluate the token based classification framework for IE with three different entity tagging schemes. In comparison to previous methods dealing with the same problems, our methods are both effective and efficient, which are valuable features for real-world applications. Due to the similarity in the formulation of the learning problem for IE and for other NLP tasks, the two techniques are likely to be beneficial in a wide range of applications.
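
One way to read "uneven margins" is as a hinge loss that demands a full margin of 1 from one class and a smaller margin from the other. The abstract does not give the formulation, so the sketch below is an assumption: the function name, the choice to shrink the negative-class margin, and the default value of `tau` are all hypothetical.

```python
def uneven_margin_hinge(score, label, tau=0.4):
    # score is the raw decision value w.x + b; label is +1 or -1.
    if label == 1:
        # Positive examples must clear the full margin of 1.
        return max(0.0, 1.0 - score)
    # Negative examples only need to clear the smaller margin tau.
    return max(0.0, tau + score)
```

Setting `tau=1.0` recovers the standard symmetric hinge loss, so the parameter directly controls how much the margin requirement differs between the two classes.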

Differentially Private Image Classification Using Support Vector Machine and Differential Privacy

Machine Learning and Knowledge Extraction ◽

10.3390/make1010029 ◽

2019 ◽

Vol 1 (1) ◽

pp. 483-491 ◽

Cited By ~ 6

Author(s):

Makhamisa Senekane

Keyword(s):

Support Vector Machine ◽

Data Analysis ◽

Image Classification ◽

Differential Privacy ◽

Privacy Preserving ◽

Global Optimum ◽

Support Vector ◽

Sensitive Data ◽

Radiological Images ◽

Golden Standard

The ubiquity of data, including multi-media data such as images, enables easy mining and analysis of such data. However, such an analysis might involve the use of sensitive data such as medical records (including radiological images) and financial records. Privacy-preserving machine learning is an approach aimed at analysing such data without compromising privacy. There are various privacy-preserving data analysis approaches such as k-anonymity, l-diversity, t-closeness and Differential Privacy (DP). Currently, DP is a gold standard of privacy-preserving data analysis due to its robustness against background knowledge attacks. In this paper, we report a scheme for privacy-preserving image classification using Support Vector Machine (SVM) and DP. SVM is chosen as a classification algorithm because unlike variants of artificial neural networks, it converges to a global optimum. The SVM kernels used are linear and Radial Basis Function (RBF), while ϵ-differential privacy was the DP framework used. The proposed scheme achieved an accuracy of up to 98%. The results obtained underline the utility of using SVM and DP for privacy-preserving image classification.
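
For reference, the standard definition of ϵ-differential privacy named in this abstract can be written as below. The symbols are the usual textbook ones and are not taken from the paper: \mathcal{M} is a randomized mechanism, D and D' are neighboring datasets that differ in a single record, and S is any measurable set of outputs.

```latex
\Pr[\mathcal{M}(D) \in S] \le e^{\epsilon} \, \Pr[\mathcal{M}(D') \in S]
\quad \text{for all neighboring } D, D' \text{ and all measurable } S .
```

A smaller ϵ forces the two output distributions closer together, which strengthens privacy and typically costs accuracy.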

Efficient methodology for seismic fragility curves estimation by active learning on Support Vector Machines

Structural Safety ◽

10.1016/j.strusafe.2020.101972 ◽

2020 ◽

Vol 86 ◽

pp. 101972

Author(s):

Rémi Sainct ◽

Cyril Feau ◽

Jean-Marc Martinez ◽

Josselin Garnier

Keyword(s):

Support Vector Machines ◽

Active Learning ◽

Fragility Curves ◽

Seismic Fragility ◽

Support Vector ◽

Vector Machines

New Incremental Learning Algorithm With Support Vector Machines

IEEE Transactions on Systems Man and Cybernetics Systems ◽

10.1109/tsmc.2018.2791511 ◽

2019 ◽

Vol 49 (11) ◽

pp. 2230-2241 ◽

Cited By ~ 18

Author(s):

Jie Xu ◽

Chen Xu ◽

Bin Zou ◽

Yuan Yan Tang ◽

Jiangtao Peng ◽

...

Keyword(s):

Support Vector Machines ◽

Incremental Learning ◽

Learning Algorithm ◽

Support Vector ◽

Vector Machines

Active learning with support vector machines

Wiley Interdisciplinary Reviews Data Mining and Knowledge Discovery ◽

10.1002/widm.1132 ◽

2014 ◽

Vol 4 (4) ◽

pp. 313-326 ◽

Cited By ~ 34

Author(s):

Jan Kremer ◽

Kim Steenstrup Pedersen ◽

Christian Igel

Keyword(s):

Support Vector Machines ◽

Active Learning ◽

Support Vector ◽

Vector Machines

Least Squares Support Vector Machine for Fault Diagnosis Optimization

Applied Mechanics and Materials ◽

10.4028/www.scientific.net/amm.347-350.505 ◽

2013 ◽

Vol 347-350 ◽

pp. 505-508

Author(s):

Si Yang Liang ◽

Jian Hong Lv

Keyword(s):

Fault Diagnosis ◽

Least Squares ◽

Learning Algorithm ◽

Test Sample ◽

Digital Circuit ◽

Training Sample ◽

Support Vector ◽

Vector Machines ◽

Sample Data ◽

Diagnosis Method

To improve the diagnostic accuracy for digital circuits, a fault diagnosis method based on support vector machines (SVM) is proposed. The input is the fault characteristics of the digital circuit, and the output is the fault style. A mapping between fault characteristics and fault style is established. The learning algorithm uses least squares; the training sample data are generated by simulation, and the test sample data are generated by simulations not used in training. The method classifies faulted digital circuits, and the results show that it is fast and highly accurate.

Active Learning with Support Vector Machines in the Drug Discovery Process

Journal of Chemical Information and Computer Sciences ◽

10.1021/ci025620t ◽

2003 ◽

Vol 43 (2) ◽

pp. 667-673 ◽

Cited By ~ 181

Author(s):

Manfred K. Warmuth ◽

Jun Liao ◽

Gunnar Rätsch ◽

Michael Mathieson ◽

Santosh Putta ◽

...

Keyword(s):

Drug Discovery ◽

Support Vector Machines ◽

Active Learning ◽

Support Vector ◽

Discovery Process ◽

Drug Discovery Process ◽

Vector Machines

Combining active learning and transductive support vector machines for sea ice detection

Journal of Applied Remote Sensing ◽

10.1117/1.jrs.12.026016 ◽

2018 ◽

Vol 12 (02) ◽

pp. 1 ◽

Cited By ~ 2

Author(s):

Yanling Han ◽

Peng Li ◽

Yun Zhang ◽

Zhonghua Hong ◽

Kaichen Liu ◽

...

Keyword(s):

Support Vector Machines ◽

Active Learning ◽

Sea Ice ◽

Support Vector ◽

Vector Machines ◽

Ice Detection
