A comparative analysis on linear regression and support vector regression

Author(s):  
Kavitha S ◽  
Varuna S ◽  
Ramya R


2018 ◽  
Vol 11 (6) ◽  
pp. 3717-3735 ◽  
Author(s):  
Alessandro Bigi ◽  
Michael Mueller ◽  
Stuart K. Grange ◽  
Grazia Ghermandi ◽  
Christoph Hueglin

Abstract. Low-cost sensors for measuring atmospheric pollutants are experiencing an increase in popularity worldwide among practitioners, academia and environmental agencies, and a large amount of data from these devices is being delivered to the public. Nevertheless, their behaviour, performance and reliability are not yet fully investigated and understood. In the present study we investigate the medium-term performance of a set of NO and NO2 electrochemical sensors in Switzerland using three different regression algorithms within a field calibration approach. In order to mimic a realistic application of these devices, the sensors were initially co-located at a rural regulatory monitoring site for a 4-month calibration period, and subsequently deployed for 4 months at two distant regulatory urban sites in traffic and urban background conditions, where the performance of the calibration algorithms was explored. The applied algorithms were Multivariate Linear Regression, Support Vector Regression and Random Forest; these were tested, along with the sensors, in terms of generalisability, selectivity, drift, uncertainty, bias, noise and suitability for spatial mapping of intra-urban pollution gradients with hourly resolution. Results from the deployment at the urban sites show a better performance of the non-linear algorithms (Support Vector Regression and Random Forest), which achieved RMSE < 5 ppb, R² between 0.74 and 0.95 and MAE between 2 and 4 ppb. The combined use of both NO and NO2 sensor output in the estimate of each pollutant showed some contribution by the NO sensor to the NO2 estimate and vice versa. All algorithms exhibited a drift by the end of the deployment, ranging from 5–10 ppb for Random Forest to 15 ppb for Multivariate Linear Regression. The lowest concentration correctly estimated, with a 25 % relative expanded uncertainty, was ca. 15–20 ppb and was provided by the non-linear algorithms. 
To assess the suitability of the tested sensors for a targeted application, the probability of resolving hourly concentration differences in cities was investigated. It was found that NO concentration differences of 5–10 ppb (8–10 ppb for NO2) can reliably be detected (90 % confidence), depending on the air pollution level. The findings of this study, although derived from a specific sensor type and sensor model, are based on a flexible methodology and have extensive potential for exploring the performance of other low-cost sensors that differ in their target pollutant and sensing technology.
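The three scores quoted in the abstract (RMSE, MAE and R²) can be illustrated with a minimal Python sketch. The concentration values below are hypothetical and are not taken from the study, and R² is computed here as the coefficient of determination, which is an assumption: the abstract does not say whether the authors used that definition or the squared correlation coefficient.

```python
import math

def rmse(y_true, y_pred):
    """Root mean squared error between reference and estimated values."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))

def mae(y_true, y_pred):
    """Mean absolute error between reference and estimated values."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

def r2(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot (assumed definition)."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot

# Hypothetical hourly NO2 values in ppb: regulatory reference vs. calibrated sensor.
reference_ppb = [10.0, 20.0, 30.0, 40.0]
calibrated_ppb = [12.0, 18.0, 33.0, 39.0]

print("RMSE:", rmse(reference_ppb, calibrated_ppb))
print("MAE: ", mae(reference_ppb, calibrated_ppb))
print("R2:  ", r2(reference_ppb, calibrated_ppb))
```

Because RMSE squares each residual, it penalises occasional large errors more heavily than MAE does, which is one reason to report both.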


Author(s):  
Jiaqi Lyu ◽  
Souran Manoochehri

Abstract. With the development of Fused Deposition Modeling (FDM) technology, the quality of fabricated parts is getting more attention. The present study presents a predictive model for dimensional accuracy in the FDM process. Three process parameters, namely extruder temperature, layer thickness, and infill density, are considered in the model. To achieve better prediction accuracy, three models are studied, namely multivariate linear regression, Artificial Neural Network (ANN), and Support Vector Regression (SVR). The models are used to characterize the complex relationship between the input variables and the dimensions of fabricated parts. Based on the experimental data set, it is found that the ANN model performs better than the multivariate linear regression and SVR models. The ANN model is able to study more quality characteristics of fabricated parts with more process parameters of FDM.


2015 ◽  
Vol 2015 ◽  
pp. 1-9 ◽  
Author(s):  
Yanshuang Zhou ◽  
Na Li ◽  
Hong Li ◽  
Yongqiang Zhang

As cloud data centers consume more and more energy, both researchers and engineers aim to minimize energy consumption while keeping services available. A good energy model can reflect the relationships between running tasks and the energy consumed by hardware and can be further used to schedule tasks to save energy. In this paper, we analyzed linear and nonlinear regression energy models based on performance counters and system utilization and proposed a support vector regression energy model. For performance counters, we gave a general linear regression framework and compared three linear regression models. For system utilization, we compared our support vector regression model with linear regression and three nonlinear regression models. The experiments show that a linear regression model is good enough to model performance counters, that nonlinear regression is better than linear regression for modeling system utilization, and that the support vector regression model is better than the polynomial and exponential regression models.
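As a minimal sketch of the simplest model family compared above, a one-variable linear energy model can be fitted by ordinary least squares. The function name `fit_line`, the choice of CPU utilization as the single predictor, and the sample readings are all hypothetical; the paper's own models and measurements are not reproduced here.

```python
def fit_line(x, y):
    """Ordinary least squares for y = intercept + slope * x (closed form)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope

# Hypothetical readings: CPU utilization (0..1) and measured power draw in watts.
cpu_utilization = [0.0, 0.5, 1.0]
power_watts = [100.0, 150.0, 200.0]

idle_power, watts_per_unit_util = fit_line(cpu_utilization, power_watts)

# Estimated power draw at 80 % utilization under the fitted linear model.
predicted_watts = idle_power + watts_per_unit_util * 0.8
print(idle_power, watts_per_unit_util, predicted_watts)
```

When real power draw curves away from a straight line as utilization grows, the residuals of such a fit become systematic, which is the situation in which the abstract reports that nonlinear models do better.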


2011 ◽  
Vol 460-461 ◽  
pp. 786-791
Author(s):  
Huan Da Lu ◽  
Kang Sheng Liu

A novel hybrid method based on SVM and linear regression for short-term load forecasting is presented. It is well known that temperature information is very important for load forecasting, but the local structure of temperature-sensitive information has not been exploited in the literature. The proposed model adopts an integrated architecture to handle the local temperature-sensitive information. First, the input load data set is clustered into several subsets of temperature-similar days by the k-means algorithm in an unsupervised manner. Then the temperature correction is computed in each subset, and the time points (5-minute intervals, 288 per day) are split into three time zones: the most temperature-sensitive time zone, the less temperature-sensitive time zone, and the partially temperature-sensitive time zone. The most temperature-sensitive time zone is forecasted by linear regression. The less temperature-sensitive time zone is encoded using only past load data, and generic support vector regression is then used to forecast the next-day load at each time point. The partially temperature-sensitive time zone is encoded by combining past load and temperature information and is then forecasted with support vector regression in the same way as the less temperature-sensitive time zone. Finally, the whole forecasted load curve is smoothed using linear programming. The empirical results indicate that the hybrid method achieves better forecasting performance than the original generic support vector regression.
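The first step described above, grouping days by temperature similarity with k-means, can be sketched as follows. This is a simplified illustration and not the authors' implementation: it assumes each day is summarised by a single mean temperature, whereas the paper may cluster full temperature profiles. The function name `kmeans_1d`, the initial centers and the data are hypothetical.

```python
def kmeans_1d(values, centers, iterations=20):
    """Lloyd's algorithm on scalar values; returns (centers, labels)."""
    centers = list(centers)
    labels = [0] * len(values)
    for _ in range(iterations):
        # Assignment step: each value goes to its nearest center.
        labels = [
            min(range(len(centers)), key=lambda k: abs(v - centers[k]))
            for v in values
        ]
        # Update step: each center moves to the mean of its members.
        new_centers = []
        for k in range(len(centers)):
            members = [v for v, lab in zip(values, labels) if lab == k]
            # An empty cluster keeps its previous center.
            new_centers.append(sum(members) / len(members) if members else centers[k])
        if new_centers == centers:
            break  # converged: centers no longer move
        centers = new_centers
    return centers, labels

# Hypothetical daily mean temperatures in degrees Celsius.
daily_mean_temp_c = [2.0, 3.0, 4.0, 18.0, 19.0, 20.0, 31.0, 33.0]
centers, labels = kmeans_1d(daily_mean_temp_c, centers=[0.0, 15.0, 30.0])
print(centers, labels)
```

Each resulting label identifies one subset of temperature-similar days, and the later steps (temperature correction, zone split and per-zone regression) would be run separately within each subset.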

