Scalable Clustering of Individual Electrical Curves for Profiling and Bottom-Up Forecasting

Benjamin Auder; Jairo Cugliari; Yannig Goude; Jean-Michel Poggi

doi:10.3390/en11071893

Scalable Clustering of Individual Electrical Curves for Profiling and Bottom-Up Forecasting

Energies ◽

10.3390/en11071893 ◽

2018 ◽

Vol 11 (7) ◽

pp. 1893 ◽

Cited By ~ 10

Author(s):

Benjamin Auder ◽

Jairo Cugliari ◽

Yannig Goude ◽

Jean-Michel Poggi

Keyword(s):

Data Analysis ◽

Data Storage ◽

Smart Grids ◽

Load Forecasting ◽

Nonparametric Model ◽

Software Environment ◽

Bottom Up ◽

Nested Partitions ◽

Scalable Clustering ◽

The One

Smart grids require flexible, data-driven forecasting methods. We propose clustering tools for bottom-up short-term load forecasting. We focus on the analysis of individual consumption data, which plays a major role in energy management and electricity load forecasting. The first section is dedicated to the industrial context and a review of individual electrical data analysis. Then, we focus on hierarchical time series for bottom-up forecasting. The idea is to decompose the global signal and obtain disaggregated forecasts in such a way that their sum enhances the prediction. This is done in three steps: identify a rather large number of super-consumers by clustering their energy profiles, generate a hierarchy of nested partitions, and choose the one that minimizes a prediction criterion. Using a nonparametric model to handle forecasting, and wavelets to define various notions of similarity between load curves, this disaggregation strategy gives a 16% improvement in forecasting accuracy when applied to French individual consumers. Then, this strategy is implemented using R, the free software environment for statistical computing, so that it can scale when dealing with massive datasets. The proposed solution to make the algorithm scalable combines data storage, parallel computing and a double clustering step to define the super-consumers. The resulting software is openly available.
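The selection step described in this abstract (choose, among nested partitions, the one whose summed cluster forecasts best predict the total load) can be illustrated with a minimal Python sketch. It is not the authors' R implementation: the nearest-neighbour forecaster `analog_forecast` is a toy stand-in for their nonparametric model, the hierarchy is written by hand instead of being produced by wavelet-based clustering, and every name and number below is hypothetical illustration data.

```python
from typing import Callable, List, Sequence, Tuple

Series = List[float]
Partition = List[List[int]]  # each inner list holds the consumer indices of one cluster


def analog_forecast(history: Sequence[float]) -> float:
    """Toy nonparametric one-step forecast (an assumption, not the paper's model):
    find the past value closest to the latest one and return what followed it."""
    last = history[-1]
    best = min(range(len(history) - 1), key=lambda i: abs(history[i] - last))
    return history[best + 1]


def aggregate(curves: Sequence[Series], members: Sequence[int]) -> Series:
    """Sum the curves of one cluster into a single 'super-consumer' curve."""
    return [sum(curves[m][t] for m in members) for t in range(len(curves[0]))]


def bottom_up_error(
    curves: Sequence[Series],
    partition: Partition,
    n_test: int,
    forecast: Callable[[Sequence[float]], float] = analog_forecast,
) -> float:
    """Mean absolute percentage error of the summed cluster forecasts,
    evaluated one step ahead over the last n_test time steps."""
    total = aggregate(curves, range(len(curves)))
    clusters = [aggregate(curves, group) for group in partition]
    errors = []
    for t in range(len(total) - n_test, len(total)):
        predicted = sum(forecast(c[:t]) for c in clusters)
        errors.append(abs(total[t] - predicted) / abs(total[t]))
    return sum(errors) / len(errors)


def select_partition(
    curves: Sequence[Series],
    hierarchy: Sequence[Partition],
    n_test: int,
) -> Tuple[float, Partition]:
    """Return (error, partition) for the partition with the smallest error."""
    scored = [(bottom_up_error(curves, p, n_test), p) for p in hierarchy]
    return min(scored, key=lambda pair: pair[0])


# Hypothetical load curves for four consumers over twelve time steps.
curves = [
    [1.0, 3.0, 1.2, 3.1, 0.9, 2.9, 1.1, 3.2, 1.0, 3.0, 1.1, 3.1],
    [1.1, 2.8, 1.0, 3.0, 1.2, 3.1, 0.9, 2.9, 1.1, 3.0, 1.0, 2.9],
    [4.0, 2.0, 4.1, 1.9, 3.9, 2.1, 4.2, 2.0, 4.0, 1.8, 4.1, 2.0],
    [3.9, 2.1, 4.0, 2.2, 4.1, 1.9, 3.8, 2.0, 4.1, 2.1, 3.9, 2.2],
]

# Hand-written nested partitions, from one global cluster down to singletons.
hierarchy = [
    [[0, 1, 2, 3]],
    [[0, 1], [2, 3]],
    [[0], [1], [2], [3]],
]

best_error, best_partition = select_partition(curves, hierarchy, n_test=4)
print(best_partition, round(best_error, 4))
```

The forecaster is deliberately nonlinear: with a linear forecaster the sum of the cluster forecasts would equal the forecast of the sum, and every partition would score the same.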


Scalable Clustering of Individual Electrical Curves for Profiling and Bottom-up Forecasting

10.20944/preprints201807.0019.v1 ◽

2018 ◽

Author(s):

Benjamin Auder ◽

Jairo Cugliari ◽

Yannig Goude ◽

Jean-Michel Poggi

Keyword(s):

Data Analysis ◽

Data Storage ◽

Smart Grids ◽

Load Forecasting ◽

Forecast Accuracy ◽

Nonparametric Model ◽

Software Environment ◽

Bottom Up ◽

Scalable Clustering ◽

Curve Clustering

Smart grids require flexible, data-driven forecasting methods. We propose clustering tools for bottom-up short-term load forecasting. We focus on the analysis of individual consumption data, which plays a major role in energy management and electricity load forecasting. The first two sections are dedicated to the industrial context and a review of individual electrical data analysis. We are interested in hierarchical time series for bottom-up forecasting. The idea is to disaggregate the signal in such a way that the sum of the disaggregated forecasts improves on the direct prediction. The three-step strategy defines numerous super-consumers by curve clustering, builds a hierarchy of partitions, and selects the one that minimizes a forecast criterion. Using a nonparametric model to handle forecasting, and wavelets to define various notions of similarity between load curves, this disaggregation strategy applied to French individual consumers leads to a gain of 16% in forecast accuracy. We then explore the capacity of this strategy to scale up to massive data and implement our proposals using R, the free software environment for statistical computing. The proposed solutions to make the algorithm scalable combine data storage, parallel computing and a double clustering step to define the super-consumers.


Metaphor Translation towards Cilinaye Manuscript

International Journal of Linguistics Literature and Culture ◽

10.21744/ijllc.v3i4.521 ◽

2017 ◽

Vol 3 (4) ◽

pp. 43

Author(s):

Sarwadi

Keyword(s):

Data Analysis ◽

Bottom Up ◽

Metaphorical Expression ◽

East West ◽

Culture And Language

The Cilinaye manuscript is a script in the Sasak language written in Aksara Jejawen, also known as Aksara Hanacaraka. Its metaphorical expressions carry remarkable meaning: not everyone is able to use metaphors, yet everyone who shares the same culture and language, such as Suku Sasak (the Sasak tribe), can understand their meaning. The present study aimed to identify the metaphors found in the Cilinaye manuscript and the concepts behind them. The results were as follows: 1) The meaning of a metaphor in the Sasak language can differ, even when the same symbol is used, once a morpheme is attached, e.g. 'lauk daye' with the morpheme 'be' attached becomes 'belauk bedaye'. 2) The concept of metaphor according to Ching, Ed. (1980), which includes human, animate, living, objective, terrestrial, substantial, energy, cosmic, and being, is incomplete, because the data analysis revealed metaphors that use directions such as bottom up, front behind, east west, and south north.


Upgraded Railway Front Searchlight Design Plastic Deformations by its Vibrations with Resonance Frequencies

Key Engineering Materials ◽

10.4028/www.scientific.net/kem.684.111 ◽

2016 ◽

Vol 684 ◽

pp. 111-119 ◽

Cited By ~ 4

Author(s):

Stanislav Rafaelevich Abulkhanov ◽

Dmitrii Sergeevich Goryainov

Keyword(s):

Natural Frequencies ◽

Ansys Software ◽

Software Environment ◽

Low Frequencies ◽

Plastic Deformations ◽

Vibration Resistance ◽

First Case ◽

Resonance Frequencies ◽

The One ◽

Electric Lamp

Natural frequencies of four upgraded front searchlight designs were obtained in the ANSYS software environment. In the first case, the incandescent electric lamp of the serial front searchlight was replaced by an LED group mounted on a one-piece cylinder backing. The second front searchlight design had a backing upgraded with radial ribs and concentric rigidity ferrules. Analysis of how the backing deforms under vibrations at the natural frequencies yielded a number of design solutions that make it possible to raise the vibration resistance of the front searchlight. The front searchlight model established that the natural frequencies of the searchlight with the one-piece backing fall within the whole range of train vibrations. The natural frequencies of the backing with perforation, rigidity ferrules, and radial ribs fall within the low frequencies of the railway locomotive vibration spectrum. On the basis of the devised methodology for analyzing the deformation and natural frequencies of the surface carrying an LED group, a vibration-proof searchlight design was introduced and studied.


Linking climate, gross primary productivity, and site index across forests of the western United States

Canadian Journal of Forest Research ◽

10.1139/x11-086 ◽

2011 ◽

Vol 41 (8) ◽

pp. 1710-1721 ◽

Cited By ~ 53

Author(s):

Aaron R. Weiskittel ◽

Nicholas L. Crookston ◽

Philip J. Radtke

Keyword(s):

Climate Change ◽

Growth Models ◽

Gross Primary Production ◽

Site Index ◽

Nonparametric Model ◽

Climate Change Scenarios ◽

Geographic Scale ◽

Potential Productivity ◽

The One ◽

The Relationship

Assessing forest productivity is important for developing effective management regimes and predicting future growth. Despite some important limitations, the most common means for quantifying forest stand-level potential productivity is site index (SI). Another measure of productivity is gross primary production (GPP). In this paper, SI is compared with GPP estimates obtained from 3-PG and NASA’s MODIS satellite. Models were constructed that predict SI and both measures of GPP from climate variables. Results indicated that a nonparametric model with two climate-related predictor variables explained over 68% and 76% of the variation in SI and GPP, respectively. The relationship between GPP and SI was limited (R2 of 36%–56%), while the relationship between GPP and climate (R2 of 76%–91%) was stronger than the one between SI and climate (R2 of 68%–78%). The developed SI model was used to predict SI under varying expected climate change scenarios. The predominant trend was an increase of 0–5 m in SI, with some sites experiencing reductions of up to 10 m. The developed model can predict SI across a broad geographic scale and into the future, which statistical growth models can use to represent the expected effects of climate change more effectively.


Challenges of Big Data analysis

National Science Review ◽

10.1093/nsr/nwt032 ◽

2014 ◽

Vol 1 (2) ◽

pp. 293-314 ◽

Cited By ~ 468

Author(s):

Jianqing Fan ◽

Fang Han ◽

Han Liu

Keyword(s):

Big Data ◽

Data Analysis ◽

Measurement Errors ◽

Modern Society ◽

Big Data Analysis ◽

Small Scale ◽

Paradigm Change ◽

Statistical Inferences ◽

The One ◽

And Storage

Big Data bring new opportunities to modern society and challenges to data scientists. On the one hand, Big Data hold great promise for discovering subtle population patterns and heterogeneities that are not possible with small-scale data. On the other hand, the massive sample size and high dimensionality of Big Data introduce unique computational and statistical challenges, including scalability and storage bottlenecks, noise accumulation, spurious correlation, incidental endogeneity and measurement errors. These challenges are distinctive and require new computational and statistical paradigms. This paper gives an overview of the salient features of Big Data and of how these features drive paradigm change in statistical and computational methods as well as in computing architectures. We also provide various new perspectives on Big Data analysis and computation. In particular, we emphasize the viability of the sparsest solution in a high-confidence set and point out that exogenous assumptions in most statistical methods for Big Data cannot be validated due to incidental endogeneity. They can lead to wrong statistical inferences and consequently wrong scientific conclusions.
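The spurious correlation mentioned in this abstract can be illustrated with a small Python simulation. It is a sketch under stated assumptions rather than anything taken from the paper: the sample size, the number of predictors, the seed and all function names are hypothetical. A response and many predictors are drawn independently, so every true correlation is zero, yet the largest sample correlation can only grow as more predictors are scanned.

```python
import random
from math import sqrt


def pearson(x, y):
    """Sample Pearson correlation between two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def max_abs_correlation(response, predictors):
    """Largest absolute sample correlation between the response and any predictor."""
    return max(abs(pearson(response, p)) for p in predictors)


rng = random.Random(0)  # fixed seed so the run is reproducible
n_samples = 30          # hypothetical small sample size
n_predictors = 500      # hypothetical high dimension

# Response and predictors are independent standard normal draws.
response = [rng.gauss(0, 1) for _ in range(n_samples)]
predictors = [
    [rng.gauss(0, 1) for _ in range(n_samples)] for _ in range(n_predictors)
]

few = max_abs_correlation(response, predictors[:5])
many = max_abs_correlation(response, predictors)
print(f"max |corr| over 5 predictors:   {few:.2f}")
print(f"max |corr| over 500 predictors: {many:.2f}")
```

Because the five predictors are a subset of the five hundred, the second figure is never smaller than the first. With many more predictors than observations, it is typically large enough to be mistaken for a real association.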


Bottom up proteomics data analysis strategies to explore protein modifications and genomic variants

PROTEOMICS ◽

10.1002/pmic.201400186 ◽

2015 ◽

Vol 15 (11) ◽

pp. 1789-1792 ◽

Cited By ~ 3

Author(s):

Ana Sofia Carvalho ◽

Deborah Penque ◽

Rune Matthiesen

Keyword(s):

Data Analysis ◽

Proteomics Data ◽

Bottom Up ◽

Protein Modifications ◽

Genomic Variants ◽

Analysis Strategies ◽

Proteomics Data Analysis


Using a distributed deep learning algorithm for analyzing big data in smart cities

Smart and Sustainable Built Environment ◽

10.1108/sasbe-04-2019-0040 ◽

2020 ◽

Vol ahead-of-print (ahead-of-print) ◽

Author(s):

Mohammed Anouar Naoui ◽

Brahim Lejdel ◽

Mouloud Ayad ◽

Abdelfattah Amamra ◽

Okba Kazar

Keyword(s):

Big Data ◽

Deep Learning ◽

Data Analysis ◽

Data Storage ◽

Smart City ◽

Smart Cities ◽

Smart Environment ◽

Data Systems ◽

Content Type ◽

Big Data Systems

Purpose: The purpose of this paper is to propose a distributed deep learning architecture for smart cities in big data systems.

Design/methodology/approach: We have proposed a multilayer architecture to describe distributed deep learning for smart cities in big data systems. The components of our system are the Smart city layer, the big data layer, and the deep learning layer. The Smart city layer is responsible for the Smart city components, its Internet of things, sensors and effectors, and their integration in the system. The big data layer concerns data characteristics and their distribution over the system. The deep learning layer is the model of our system. It is responsible for data analysis.

Findings: We apply our proposed architecture to a Smart environment and to Smart energy. In the Smart environment, we study Toluene forecasting in the Madrid Smart city. For Smart energy, we study wind energy forecasting in Australia. Our proposed architecture can reduce the execution time and improve the deep learning model, such as Long Short-Term Memory.

Research limitations/implications: This research needs the application of other deep learning models, such as convolutional neural networks and autoencoders.

Practical implications: Findings of the research will be helpful in Smart city architecture. It can provide a clear view into a Smart city, data storage, and data analysis. Toluene forecasting in a Smart environment can help the decision-maker to ensure environmental safety. The Smart energy part of our proposed model can give a clear prediction of power generation.

Originality/value: The findings of this study are expected to contribute valuable information to decision-makers for a better understanding of the key to Smart city architecture and its relation with data storage, processing, and data analysis.


The Use of Distributed Data Storage and Processing Systems in Bioinformatic Data Analysis

Beyond Databases, Architectures and Structures. Facing the Challenges of Data Proliferation and Growing Variety - Communications in Computer and Information Science ◽

10.1007/978-3-319-99987-6_2 ◽

2018 ◽

pp. 18-32

Author(s):

Michał Bochenek ◽

Kamil Folkert ◽

Roman Jaksik ◽

Michał Krzesiak ◽

Marcin Michalak ◽

...

Keyword(s):

Data Analysis ◽

Data Storage ◽

Distributed Data ◽

Distributed Data Storage


Differences in Entrepreneurial Skills of College Students in the Mexican Intercultural Context

International Journal of Business and Management ◽

10.5539/ijbm.v11n7p120 ◽

2016 ◽

Vol 11 (7) ◽

pp. 120

Author(s):

Marco Alberto Nunez Ramirez ◽

Teodoro Rafael Wendlandt Amezaga ◽

Maria Trinidad Alvarez Medina ◽

Jorge Ortega Arreola

Keyword(s):

Higher Education ◽

College Students ◽

Data Analysis ◽

Empirical Evidence ◽

Sampling Method ◽

Educational Programs ◽

Sustainable Rural Development ◽

Indigenous Students ◽

Entrepreneurial Skills ◽

The One

The purpose of this study is to describe the development of entrepreneurial skills of college students in the intercultural context of Mexico. Using a non-probability sampling method, a sample of 120 students from an intercultural institution of higher education in Southeastern Mexico was selected, from which two groups (Indigenous and Mestizos) were formed to perform the corresponding statistical analyses. The first group consisted of indigenous students (n = 55) and the second group of mestizos (n = 65). For data analysis, the Student t test and one-way analysis of variance (ANOVA) were used. The results showed no significant differences in entrepreneurial skills between the two groups. However, significant differences were obtained when considering the educational programs offered by the intercultural institution, where the program in sustainable rural development was the one that obtained the highest level of development of entrepreneurial skills. This research contributes empirical evidence to the knowledge on interculturality in this country.


Design on Big data Platform-based in Higher Education Institute

Higher Education Studies ◽

10.5539/hes.v10n4p36 ◽

2020 ◽

Vol 10 (4) ◽

pp. 36

Author(s):

Sajeewan Pratsri ◽

Prachyanun Nilsook

Keyword(s):

Higher Education ◽

Big Data ◽

Data Analysis ◽

Digital Media ◽

Data Storage ◽

Security And Privacy ◽

Digital Learning ◽

Learning Platform ◽

Data Platform ◽

Tools And Techniques

Given the continuously increasing amount of information in all areas, whether it comes from inside or outside the organization, a platform should be provided to automate the whole process of collecting, storing, and processing Big Data. Creating the tools for Big Data is itself a Big Data challenge. Furthermore, the security and privacy of Big Data, and Big Data analysis in organizations, government agencies, and educational institutions, also affect the design of a Big Data platform for a higher education institute (HEi). It is a digital learning platform offering online instruction and the use of digital media for educational reform, including a module that provides information on the functions of various modules between computers and humans. 1) Big Data architecture is a framework for an architecture of large volumes of data, consisting of Big Data Infrastructure (BDI), Data Storage (cloud-based), processing by a computer system that uses all of its resources for optimal efficiency (High-Performance Computing: HPC), and a network system to detect the target device network. When Big Data is handled with Hadoop's tools and techniques, the Big Data platform can provide the desired data analysis by retrieving existing information, for example the large amounts of student and teaching information, for use in accurate forecasting.
