Study of Cubic Splines and Fourier Series as Interpolation Techniques for Filling in Short Periods of Missing Building Energy Use and Weather Data

Solar Energy ◽  
2002 ◽  
Author(s):  
Juan-Carlos Baltazar ◽  
David E. Claridge

A study of cubic splines and Fourier series as interpolation techniques for filling in missing data in energy and meteorological time series is presented. The procedure followed created artificially missing points (pseudo-gaps) in measured data sets and was based on the local behavior of the data around those pseudo-gaps. Five variants of the cubic spline technique and 12 variants of the Fourier series were tested and compared with linear interpolation for filling in gaps of 1 to 6 hours in 20 samples of energy use and weather data, each at least one year in length. The analysis showed that linear interpolation is superior to the spline and Fourier series techniques for filling 1–6 hour gaps in time series of dry bulb and dew point temperature data. For filling 1–6 hour gaps in building cooling and heating energy use, the Fourier series approach with 24 data points before and after each gap and six constants was found to be the most suitable. Where there are insufficient data points to apply this approach, simple linear interpolation is recommended.
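The recommended fallback, straight-line interpolation across a short interior gap, is simple to implement. A minimal sketch in Python/pandas, assuming an hourly series in which missing points are marked with NaN (the data and values below are hypothetical):

```python
import numpy as np
import pandas as pd

# Hourly series with a 3-hour gap marked by NaN (hypothetical data).
idx = pd.date_range("2002-01-01", periods=12, freq="h")
y = pd.Series([10.0, 10.5, 11.2, np.nan, np.nan, np.nan,
               13.0, 13.4, 13.1, 12.8, 12.5, 12.0], index=idx)

# Fill interior gaps of up to 6 consecutive points by straight-line interpolation.
filled = y.interpolate(method="linear", limit=6, limit_area="inside")
print(filled)
```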

2005 ◽  
Vol 128 (2) ◽  
pp. 226-230 ◽  
Author(s):  
Juan-Carlos Baltazar ◽  
David E. Claridge

A study of cubic splines and Fourier series as interpolation techniques for filling in missing hourly data in energy and meteorological time series data sets is presented. The procedure developed in this paper is based on the local patterns of the data around the gaps. Artificial gaps, or “pseudogaps,” created by deleting consecutive data points from the measured data sets, were filled using four variants of the cubic spline technique and 12 variants of the Fourier series technique. The accuracy of these techniques was compared to the accuracy of results obtained using linear interpolation to fill the same pseudogaps. The pseudogaps filled were 1–6 data points in length created in 18 year-long sets of hourly energy use and weather data. More than 1000 pseudogaps of each gap length were created in each of the 18 data sets and filled using each of the 17 techniques evaluated. Use of mean bias error as the selection criterion found that linear interpolation is superior to the cubic spline and Fourier series methodologies for filling gaps of dry bulb and dew point temperature time series data. For hourly building cooling and heating use data, the Fourier series approach with 24 data points before and after each gap and six terms was found to be the most suitable; where there are insufficient data points to apply this approach, simple linear interpolation is recommended.
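A sketch of the best-performing variant follows, assuming the gap is interior and at least 24 valid points exist on each side. The least-squares fit, the assumed fundamental period, and the harmonic count are my reading of the abstract ("six constants" could mean six coefficients or six harmonic terms), not the authors' exact formulation. With hourly data, `n_side=24` uses one day on each side of the gap, matching the 24-points-before-and-after rule reported above:

```python
import numpy as np

def fourier_fill(y, gap_idx, n_side=24, n_harm=3, period=None):
    """Fill a short interior gap by least-squares fitting a truncated Fourier
    series to n_side points on each side of the gap (a sketch, not the paper's
    exact variant). n_harm harmonics give 1 + 2*n_harm coefficients."""
    gap_idx = np.asarray(gap_idx)
    t_fit = np.r_[np.arange(gap_idx[0] - n_side, gap_idx[0]),
                  np.arange(gap_idx[-1] + 1, gap_idx[-1] + 1 + n_side)]
    if period is None:
        period = t_fit.max() - t_fit.min() + 1  # assumed fundamental period

    def design(t):
        cols = [np.ones(len(t))]
        for k in range(1, n_harm + 1):
            w = 2 * np.pi * k * t / period
            cols += [np.cos(w), np.sin(w)]
        return np.column_stack(cols)

    coef, *_ = np.linalg.lstsq(design(t_fit), y[t_fit], rcond=None)
    out = y.copy()
    out[gap_idx] = design(gap_idx) @ coef  # evaluate the fit inside the gap
    return out
```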


2015 ◽  
Vol 26 (3) ◽  
pp. 407-422 ◽  
Author(s):  
Thomas Weyman-Jones ◽  
Júlia Mendonça Boucinha ◽  
Catarina Feteira Inácio

Purpose – There is great interest within the European Union in measuring the efficiency of energy use in households, an area in which EDP has carried out research in both data collection and methodology. This paper reports on a survey of electric energy use in Portuguese households and reviews and extends the analysis of how efficiently households use electrical energy. The purpose of this paper is to evaluate household electrical energy efficiency in different regions using econometric analysis of the survey data. In addition, the same methodology was applied to a time-series data set to evaluate recent developments in energy efficiency.
Design/methodology/approach – The paper describes the application to Portuguese households of a new approach to evaluating energy efficiency, developed by Filippini and Hunt (2011, 2012), in which an econometric energy demand model is estimated to control for the exogenous variables determining energy demand. Econometric efficiency analysis is then applied to estimate the variation in energy efficiency over time and space.
Findings – The results allowed the identification of priority regions and consumer bands for reducing inefficiency in electricity consumption. The time-series data set shows that the expected electricity savings from the efficiency measures recently introduced by official authorities were fully realized.
Research limitations/implications – This approach gives some guidance on how to introduce electricity-saving measures in a more cost-effective way.
Originality/value – This paper outlines a new procedure for developing useful tools for modelling energy efficiency.
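As a concrete, heavily simplified illustration of this style of analysis, the sketch below estimates a log-linear household electricity demand model by OLS on synthetic survey-like data and scores inefficiency with a corrected-OLS shift, a crude stand-in for the stochastic frontier estimation of Filippini and Hunt; all variable names and data are hypothetical:

```python
import numpy as np
import statsmodels.api as sm

# Hypothetical household survey: log electricity use and its drivers.
rng = np.random.default_rng(0)
n = 500
log_income = rng.normal(10, 0.5, n)
log_size = rng.normal(4.5, 0.3, n)    # log dwelling size
hdd = rng.normal(1200, 200, n)        # heating degree days
ineff = rng.exponential(0.15, n)      # one-sided inefficiency term
log_kwh = (0.4 * log_income + 0.3 * log_size + 0.0003 * hdd
           + ineff + rng.normal(0, 0.1, n))

X = sm.add_constant(np.column_stack([log_income, log_size, hdd]))
res = sm.OLS(log_kwh, X).fit()

# Corrected OLS: shift residuals so the most frugal household defines the
# frontier; the gap to the frontier is a crude per-household inefficiency score.
inefficiency = res.resid - res.resid.min()
print(res.params, inefficiency.mean())
```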


2009 ◽  
Vol 66 (3) ◽  
pp. 367-381 ◽  
Author(s):  
Yong-Woo Lee ◽  
Bernard A. Megrey ◽  
S. Allen Macklin

Multiple linear regressions (MLRs), generalized additive models (GAMs), and artificial neural networks (ANNs) were compared as methods to forecast recruitment of Gulf of Alaska walleye pollock (Theragra chalcogramma). Each model, based on a conceptual model, was applied to a 41-year time series of recruitment, spawner biomass, and environmental covariates. An in-sample data set consisting of 35 of the 41 data points was used to fit an environment-dependent recruitment model, with influential covariates identified through statistical variable selection to build the best explanatory model. The remaining six data points were retained as an out-of-sample set, and each model's ability to forecast recruitment was tested against it. For a more robust evaluation of forecast accuracy, the models were also tested with Monte Carlo resampling trials. The ANNs outperformed the other techniques during model fitting but were not statistically different from the MLRs or GAMs in forecasting. The results indicate that more complex models tend to be more susceptible to overparameterization. The procedures described in this study show promise for building and testing recruitment forecasting models for other fish species.
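The in-sample/out-of-sample protocol is easy to reproduce. A minimal sketch with scikit-learn, using synthetic stand-ins for the recruitment series and covariates and omitting the GAM; the 35/6 split mirrors the paper, everything else is an assumption:

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(1)
# Hypothetical 41-year series: spawner biomass plus two environmental covariates.
X = rng.normal(size=(41, 3))
y = 2.0 + 0.8 * X[:, 0] - 0.5 * X[:, 1] + rng.normal(0, 0.3, 41)  # log recruitment

X_in, y_in = X[:35], y[:35]     # in-sample fit, as in the paper's split
X_out, y_out = X[35:], y[35:]   # six retained points for validation

mlr = LinearRegression().fit(X_in, y_in)
ann = MLPRegressor(hidden_layer_sizes=(8,), max_iter=5000,
                   random_state=0).fit(X_in, y_in)

# Compare out-of-sample forecast error for the two models.
for name, model in [("MLR", mlr), ("ANN", ann)]:
    print(name, mean_squared_error(y_out, model.predict(X_out)))
```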


Author(s):  
Kelachi P. Enwere ◽  
Uchenna P. Ogoke

Aims: The study seeks to determine the relationships that exist among continuous probability distributions and to use interpolation techniques to estimate unavailable but desired values of a given probability distribution.
Study Design: Statistical probability tables for the Normal, Student's t, Chi-squared, F, and Gamma distributions were used to compare interpolated values with tabulated values. Charts and tables were used to represent the relationships among the five probability distributions.
Methodology: Linear interpolation was employed to estimate unavailable but desired values from the statistical tables. The data were analyzed by interpolating values at the 95% α-level for each of the five continuous probability distributions.
Results: The interpolated values are close to the exact values, and the difference between them is not pronounced. The tables and charts established show that relationships do exist among the Normal, Student's t, Chi-squared, F, and Gamma distributions.
Conclusion: Interpolation techniques can be applied to obtain unavailable but desired information in a data set, so uncertainty in a data set can be identified, analyzed, and interpreted to produce the desired results. Moreover, understanding how these probability distributions are related to each other can inform how best they can be used interchangeably by statisticians and other researchers who apply statistical methods in practical applications.
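For example, a t-table typically lists critical values for 40 and 60 degrees of freedom but not for 45; linear interpolation between the tabulated entries recovers a close approximation. A short check against the exact value (using SciPy; the table entries are the standard two-sided 95% values):

```python
from scipy import stats

# Linear interpolation in a t-table: critical value for df = 45 at the
# two-sided 95% level, from the tabulated df = 40 and df = 60 entries.
t40, t60 = 2.021, 2.000
t45 = t40 + (45 - 40) / (60 - 40) * (t60 - t40)

exact = stats.t.ppf(0.975, df=45)
print(t45, exact)  # interpolated value vs exact value
```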


1999 ◽  
Vol 09 (08) ◽  
pp. 1485-1500 ◽  
Author(s):  
J. McNAMES ◽  
J. A. K. SUYKENS ◽  
J. VANDEWALLE

In this paper we describe the winning entry of the time-series prediction competition that was part of the International Workshop on Advanced Black-Box Techniques for Nonlinear Modeling, held at K. U. Leuven, Belgium on July 8–10, 1998. We also describe the source of the data set, a nonlinear transform of a 5-scroll generalized Chua's circuit. Participants were given 2000 data points and were asked to predict the next 200 points in the series. The winning entry exploited a symmetry discovered during exploratory data analysis and a method of local modeling designed specifically for the prediction of chaotic time series. The method includes an exponentially weighted metric, a nearest-trajectory algorithm, integrated local averaging, and a novel multistep-ahead cross-validation estimate of model error used for parameter optimization.
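A bare-bones version of local-model prediction for a chaotic series, delay embedding plus nearest-neighbour averaging iterated one step at a time, is sketched below; the embedding dimension and neighbour count are illustrative, and the winners' exponentially weighted metric and cross-validation are omitted. Applied to the competition task, this would be `local_predict(train, 200)` on the 2000 given points:

```python
import numpy as np

def local_predict(series, horizon, dim=8, k=4):
    """Iterated one-step forecast with a local model: delay-embed the series,
    find the k nearest past states to the current state, and average their
    one-step futures (a plain nearest-neighbour sketch, not the full method)."""
    s = np.asarray(series, dtype=float)
    preds = []
    for _ in range(horizon):
        # All embedded states whose successor s[i + dim] is known.
        states = np.array([s[i:i + dim] for i in range(len(s) - dim)])
        current = s[-dim:]
        d = np.linalg.norm(states - current, axis=1)
        nn = np.argsort(d)[:k]
        preds.append(s[nn + dim].mean())   # average the neighbours' futures
        s = np.append(s, preds[-1])        # feed the forecast back in
    return np.array(preds)
```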


2019 ◽  
Vol 11 (20) ◽  
pp. 2342 ◽  
Author(s):  
Yang ◽  
Luo ◽  
Huang ◽  
Wu ◽  
Sun

The time series (TS) of the normalized difference vegetation index (NDVI) has been widely used to trace the temporal and spatial variability of terrestrial vegetation. However, many factors, such as atmospheric noise and radiometric correction residuals, conceal the actual variation in the land surface and thus hamper information extraction from the TS. To minimize the negative effects of these noise factors, we propose a new method to produce a synthetic gap-free NDVI TS from the original contaminated observations. First, key temporal points are identified from the NDVI time profiles using a generally used rule-based strategy, segmenting the TS into several adjacent segments. Then, the observed data points in each segment are fitted with a weighted double-logistic function; the proposed dynamic weight-reassignment process effectively emphasizes cloud-free points and deemphasizes cloud-contaminated points. Finally, the proposed method is evaluated on more than 3,000 test points from three selected Sentinel-2 tiles and compared with the widely used Savitzky-Golay (S-G) and harmonic analysis of time series (HANTS) methods in both qualitative and quantitative terms. The results indicate that the proposed method is better able to retain cloud-free data points and identify outliers than the others, and can generate a gap-free NDVI time profile from a medium-resolution satellite sensor.
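The core fitting step can be sketched with SciPy: a weighted least-squares fit of a double-logistic function to one season's NDVI samples, with cloud-contaminated points down-weighted. The function form, weights, and data below are illustrative assumptions, not the paper's exact weight-reassignment scheme:

```python
import numpy as np
from scipy.optimize import curve_fit

def double_logistic(t, base, amp, t_up, r_up, t_down, r_down):
    """Classic double-logistic NDVI profile: green-up and senescence ramps."""
    return base + amp * (1 / (1 + np.exp(-r_up * (t - t_up)))
                         - 1 / (1 + np.exp(-r_down * (t - t_down))))

# Hypothetical one-season segment: day of year, NDVI, and per-point weights
# (cloud-free points up-weighted, contaminated points down-weighted).
t = np.arange(1, 366, 10.0)
truth = double_logistic(t, 0.2, 0.6, 120, 0.08, 280, 0.07)
ndvi = truth + np.random.default_rng(2).normal(0, 0.02, t.size)
w = np.ones_like(t)
w[::5] = 0.2  # pretend every 5th point is cloud-contaminated

popt, _ = curve_fit(double_logistic, t, ndvi,
                    p0=[0.2, 0.5, 100, 0.1, 270, 0.1],
                    sigma=1 / w, maxfev=10000)
gap_free = double_logistic(t, *popt)  # reconstructed, gap-free profile
```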


Author(s):  
Diaz Juan Navia ◽  
Bolaños Nancy Villegas ◽  
Igor Malikov ◽  
...  

Sea surface temperature anomalies (SSTA) at four coastal hydrographic stations in the Colombian Pacific Ocean were analyzed. The selected stations were Tumaco (1°48'N-78°45'W), Gorgona Island (2°58'N-78°11'W), Solano Bay (6°13'N-77°24'W), and Malpelo Island (4°0'N-81°36'W). SSTA time series for 1960-2015 were calculated from monthly sea surface temperatures obtained from the International Comprehensive Ocean-Atmosphere Data Set (ICOADS). The SSTA time series were compared with the Oceanic Niño Index (ONI), the Pacific Decadal Oscillation (PDO) index, the Arctic Oscillation (AO) index, and the sunspot number (associated with solar activity). The absolute SSTA minimum occurred at Tumaco (-3.93°C) in March 2009, at Gorgona (-3.71°C) in October 2007, at Solano Bay (-4.23°C) in April 2014, and at Malpelo (-4.21°C) in December 2005. The absolute SSTA maximum was observed at Tumaco (3.45°C) in January 2002, at Gorgona (5.01°C) in July 1978, at Solano Bay (5.27°C) in March 1998, and at Malpelo (3.64°C) in July 2015. A high correlation between SSTA and the ONI over a large part of the study period, followed by a good correlation with the PDO, was identified. The AO and SSTA showed an inverse relationship in some periods, and the solar cycle appeared to modulate SSTA behavior at the selected stations. Extreme SST values were found to be related to the analyzed large-scale oscillations.
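Anomalies of this kind are computed by subtracting the long-term monthly climatology from each observation. A minimal pandas sketch on a synthetic monthly SST series (the station data and values are hypothetical):

```python
import numpy as np
import pandas as pd

# Hypothetical monthly SST series for one station, 1960-2015.
idx = pd.date_range("1960-01-01", "2015-12-01", freq="MS")
sst = pd.Series(26 + 2 * np.sin(2 * np.pi * (idx.month - 3) / 12)
                + np.random.default_rng(3).normal(0, 0.5, len(idx)), index=idx)

# Anomaly = observed SST minus the long-term mean for that calendar month.
climatology = sst.groupby(sst.index.month).transform("mean")
ssta = sst - climatology
print(ssta.idxmin(), ssta.min(), ssta.idxmax(), ssta.max())
```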


2012 ◽  
Vol 197 ◽  
pp. 271-277
Author(s):  
Zhu Ping Gong

The small-data-set approach is used to estimate the largest Lyapunov exponent (LLE). First, the mean-period drawback of the small-data-set method was corrected. On this basis, the LLEs of the daily qualified-rate time series of HZ, an electronic manufacturing enterprise, were estimated; all LLEs were positive, indicating that the series is chaotic and that the corresponding production process is a chaotic process. The variance of the LLEs revealed the struggle between the divergent nature of the quality system and the quality-control effort. The LLEs increased sharply as the quality level worsened, coinciding with the company's shutdown. HZ's daily qualified rate, a chaotic time series, demonstrates that the quality system is predictable in the short run.
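A compact sketch of the small-data-set (Rosenstein) LLE estimator, including the mean-period exclusion of temporally close neighbours that the paper addresses; the parameters and the simple end-to-end slope fit are illustrative choices, and the O(n²) distance matrix limits this to short series:

```python
import numpy as np

def lle_rosenstein(x, dim=5, tau=1, mean_period=10, n_steps=30):
    """Largest Lyapunov exponent by the small-data-set (Rosenstein) method:
    follow each point's nearest neighbour (temporally separated by more than
    the mean period) and fit the slope of the mean log-divergence curve."""
    x = np.asarray(x, dtype=float)
    m = len(x) - (dim - 1) * tau
    orbit = np.array([x[i:i + (dim - 1) * tau + 1:tau] for i in range(m)])

    # Pairwise distances, excluding temporally close pairs (mean-period rule).
    dists = np.linalg.norm(orbit[:, None, :] - orbit[None, :, :], axis=2)
    for i in range(m):
        dists[i, max(0, i - mean_period):i + mean_period + 1] = np.inf
    nn = dists.argmin(axis=1)

    # Mean log separation of neighbour pairs after k steps.
    div = []
    for k in range(1, n_steps):
        ok = (np.arange(m) + k < m) & (nn + k < m)
        d = np.linalg.norm(orbit[np.arange(m)[ok] + k] - orbit[nn[ok] + k], axis=1)
        div.append(np.log(d[d > 0]).mean())

    slope = np.polyfit(np.arange(1, n_steps), div, 1)[0]
    return slope  # LLE estimate per time step; > 0 suggests chaos
```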


Author(s):  
Simona Babiceanu ◽  
Sanhita Lahiri ◽  
Mena Lockwood

This study uses a suite of performance measures that was developed by taking into consideration various aspects of congestion and reliability, to assess impacts of safety projects on congestion. Safety projects are necessary to help move Virginia’s roadways toward safer operation, but can contribute to congestion and unreliability during execution, and can affect operations after execution. However, safety projects are assessed primarily for safety improvements, not for congestion. This study identifies an appropriate suite of measures, and quantifies and compares the congestion and reliability impacts of safety projects on roadways for the periods before, during, and after project execution. The paper presents the performance measures, examines their sensitivity based on operating conditions, defines thresholds for congestion and reliability, and demonstrates the measures using a set of Virginia safety projects. The data set consists of 10 projects totalling 92 mi and more than 1M data points. The study found that, overall, safety projects tended to have a positive impact on congestion and reliability after completion, and the congestion variability measures were sensitive to the threshold of reliability. The study concludes with practical recommendations for primary measures that may be used to measure overall impacts of safety projects: percent vehicle miles traveled (VMT) reliable with a customized threshold for Virginia; percent VMT delayed; and time to travel 10 mi. However, caution should be used when applying the results directly to other situations, because of the limited number of projects used in the study.
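The measures themselves reduce to simple VMT-weighted aggregations of segment speed records. The toy sketch below computes percent VMT reliable, percent VMT delayed, and time to travel 10 mi; the speed thresholds, column names, and data are illustrative assumptions, not VDOT's calibrated values:

```python
import pandas as pd

# Hypothetical probe records for road segments in a project area:
# segment length (mi), observed speed, reference (free-flow) speed, and volume.
df = pd.DataFrame({
    "length_mi":    [0.5, 0.8, 1.2, 0.5, 0.8, 1.2],
    "speed_mph":    [55, 32, 60, 48, 25, 58],
    "freeflow_mph": [60, 55, 65, 60, 55, 65],
    "volume":       [300, 420, 380, 310, 400, 390],
})

df["vmt"] = df["length_mi"] * df["volume"]
# "Reliable"/"delayed" thresholds here are illustrative placeholders.
reliable = df["speed_mph"] >= 0.75 * df["freeflow_mph"]
delayed = df["speed_mph"] < 0.6 * df["freeflow_mph"]

pct_vmt_reliable = df.loc[reliable, "vmt"].sum() / df["vmt"].sum() * 100
pct_vmt_delayed = df.loc[delayed, "vmt"].sum() / df["vmt"].sum() * 100
# Time to travel 10 mi at the VMT-weighted average speed, in minutes.
avg_speed = (df["speed_mph"] * df["vmt"]).sum() / df["vmt"].sum()
time_10mi_min = 10 / avg_speed * 60
print(pct_vmt_reliable, pct_vmt_delayed, time_10mi_min)
```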


Algorithms ◽  
2021 ◽  
Vol 14 (2) ◽  
pp. 37
Author(s):  
Shixun Wang ◽  
Qiang Chen

Boosting of ensemble learning models has made great progress, but most methods boost a single modality. For this reason, a simple multiclass boosting framework that uses local similarity as its weak learner is extended to multimodal multiclass boosting. First, with local similarity as the weak learner, the loss function is used to compute a baseline loss, and the logarithmic data points are binarized. Then the optimal local similarity and its corresponding loss are found; whichever of the two losses is smaller is kept as the best so far. Second, the local similarity between pairs of points is calculated, and the loss is then computed from that pairwise similarity. Finally, text and images are retrieved from each other, and the retrieval accuracy for text and for images is obtained, respectively. The experimental results show that the multimodal multiclass boosting framework with local similarity as the weak learner, evaluated on standard data sets and compared with other state-of-the-art methods, demonstrates the effectiveness of this approach.

