scholarly journals Differentially Private Ordinary Least Squares

2019 ◽  
Vol 9 (1) ◽  
Author(s):  
Or Sheffet

Linear regression is one of the most prevalent techniques in machine learning; however, it is also common to use linear regression for its explanatory capabilities rather than label prediction. Ordinary Least Squares (OLS) is often used in statistics to establish a correlation between an attribute (e.g. gender) and a label (e.g. income) in the presence of other (potentially correlated) features. OLS assumes a particular model that randomly generates the data, and derives t-values - representing the likelihood of each real value to be the true correlation. Using t-values, OLS can release a confidence interval, which is an interval on the reals that is likely to contain the true correlation; and when this interval does not intersect the origin, we can reject the null hypothesis as it is likely that the true correlation is non-zero.Our work aims at achieving similar guarantees on data under differentially private estimators. First, we show that for well-spread data, the Gaussian Johnson-Lindenstrauss Transform (JLT) gives a very good approximation of t-values; secondly, when JLT approximates Ridge regression (linear regression with l2-regularization) we derive, under certain conditions, confidence intervals using the projected data; lastly, we derive, under different conditions, confidence intervals for the "Analyze Gauss" algorithm of Dwork et al (STOC 2014).

2009 ◽  
Vol 2009 ◽  
pp. 1-8 ◽  
Author(s):  
Janet Myhre ◽  
Daniel R. Jeske ◽  
Michael Rennie ◽  
Yingtao Bi

A heteroscedastic linear regression model is developed from plausible assumptions that describe the time evolution of performance metrics for equipment. The inherited motivation for the related weighted least squares analysis of the model is an essential and attractive selling point to engineers with interest in equipment surveillance methodologies. A simple test for the significance of the heteroscedasticity suggested by a data set is derived and a simulation study is used to evaluate the power of the test and compare it with several other applicable tests that were designed under different contexts. Tolerance intervals within the context of the model are derived, thus generalizing well-known tolerance intervals for ordinary least squares regression. Use of the model and its associated analyses is illustrated with an aerospace application where hundreds of electronic components are continuously monitored by an automated system that flags components that are suspected of unusual degradation patterns.


2014 ◽  
Vol 71 (1) ◽  
Author(s):  
Bello Abdulkadir Rasheed ◽  
Robiah Adnan ◽  
Seyed Ehsan Saffari ◽  
Kafi Dano Pati

In a linear regression model, the ordinary least squares (OLS) method is considered the best method to estimate the regression parameters if the assumptions are met. However, if the data does not satisfy the underlying assumptions, the results will be misleading. The violation for the assumption of constant variance in the least squares regression is caused by the presence of outliers and heteroscedasticity in the data. This assumption of constant variance (homoscedasticity) is very important in linear regression in which the least squares estimators enjoy the property of minimum variance. Therefor e robust regression method is required to handle the problem of outlier in the data. However, this research will use the weighted least square techniques to estimate the parameter of regression coefficients when the assumption of error variance is violated in the data. Estimation of WLS is the same as carrying out the OLS in a transformed variables procedure. The WLS can easily be affected by outliers. To remedy this, We have suggested a strong technique for the estimation of regression parameters in the existence of heteroscedasticity and outliers. Here we apply the robust regression of M-estimation using iterative reweighted least squares (IRWLS) of Huber and Tukey Bisquare function and resistance regression estimator of least trimmed squares to estimating the model parameters of state-wide crime of united states in 1993. The outcomes from the study indicate the estimators obtained from the M-estimation techniques and the least trimmed method are more effective compared with those obtained from the OLS.


Author(s):  
Scott Alfeld ◽  
Ara Vartanian ◽  
Lucas Newman-Johnson ◽  
Benjamin I.P. Rubinstein

While machine learning systems are known to be vulnerable to data-manipulation attacks at both training and deployment time, little is known about how to adapt attacks when the defender transforms data prior to model estimation. We consider the setting where the defender Bob first transforms the data then learns a model from the result; Alice, the attacker, perturbs Bob’s input data prior to him transforming it. We develop a general-purpose “plug and play” framework for gradient-based attacks based on matrix differentials, focusing on ordinary least-squares linear regression. This allows learning algorithms and data transformations to be paired and composed arbitrarily: attacks can be adapted through the use of the chain rule—analogous to backpropagation on neural network parameters—to compositional learning maps. Bestresponse attacks can be computed through matrix multiplications from a library of attack matrices for transformations and learners. Our treatment of linear regression extends state-ofthe-art attacks at training time, by permitting the attacker to affect both features and targets optimally and simultaneously. We explore several transformations broadly used across machine learning with a driving motivation for our work being autogressive modeling. There, Bob transforms a univariate time series into a matrix of observations and vector of target values which can then be fed into standard learners. Under this learning reduction, a perturbation from Alice to a single value of the time series affects features of several data points along with target values.


2019 ◽  
Vol 13 (1) ◽  
pp. 37-58
Author(s):  
Ilma Yuni Rosita ◽  
Lilis Imamah Ichdayati ◽  
Rizki Adi Puspita Sari

This study aims to analyze the factors that affect the volume of Indonesian cocoa exports to Malaysia. Multiple linear regression and ordinary least squares (OLS) were employed to analyze time series of data from 2005 until 2013. Based on the analysis, it is obtained that factors that significantly effect the volume of Indonesian cocoa exports to Malaysia with a significance level (α) five percent are the real prices of Indonesian cocoa exports to Malaysia and the real prices of cocoa beans the international market.


2004 ◽  
Vol 50 (11) ◽  
pp. 81-88 ◽  
Author(s):  
J.-L. Bertrand-Krajewski

In order to replace traditional sampling and analysis techniques, turbidimeters can be used to estimate TSS concentration in sewers, by means of sensor and site specific empirical equations established by linear regression of on-site turbidity T values with TSS concentrations C measured in corresponding samples. As the ordinary least-squares method is not able to account for measurement uncertainties in both T and C variables, an appropriate regression method is used to solve this difficulty and to evaluate correctly the uncertainty in TSS concentrations estimated from measured turbidity. The regression method is described, including detailed calculations of variances and covariance in the regression parameters. An example of application is given for a calibrated turbidimeter used in a combined sewer system, with data collected during three dry weather days. In order to show how the established regression could be used, an independent 24 hours long dry weather turbidity data series recorded at 2 min time interval is used, transformed into estimated TSS concentrations, and compared to TSS concentrations measured in samples. The comparison appears as satisfactory and suggests that turbidity measurements could replace traditional samples. Further developments, including wet weather periods and other types of sensors, are suggested.


Sign in / Sign up

Export Citation Format

Share Document