New Fast ApEn and SampEn Entropy Algorithms Implementation and Their Application to Supercomputer Power Consumption

Entropy ◽  
2020 ◽  
Vol 22 (8) ◽  
pp. 863 ◽  
Author(s):  
Jiří Tomčala

Approximate Entropy and, especially, Sample Entropy have recently become frequently used algorithms for measuring the complexity of a time series. A lesser-known fact is that accelerated modifications of these two algorithms also exist, namely Fast Approximate Entropy and Fast Sample Entropy. All of these algorithms are efficiently implemented in the R software package TSEntropies. This paper explains all of these algorithms as well as the principle behind their acceleration. It also describes the functions of the package and their parameters, and gives simple examples of using the package to calculate these complexity measures for an artificial time series and for the time series of a complex real-world system, namely the power consumption of a supercomputer infrastructure over time. These time series were also used to test the speed of the package and to compare it with another R package, pracma. The results show that TSEntropies is up to 100 times faster than pracma, and that the computational times of the new Fast Approximate Entropy and Fast Sample Entropy algorithms are up to 500 times lower than those of their original versions. The paper closes by proposing possible uses of the TSEntropies package.

2021 ◽  
Author(s):  
Qingqing Chen ◽  
Ate Poorthuis

Identifying meaningful locations, such as home or work, from human mobility data has become an increasingly common prerequisite for geographic research. Although location-based services (LBS) and other mobile technology have rapidly grown in recent years, it can be challenging to infer meaningful places from such data, which, compared to conventional datasets, can be devoid of context. Existing approaches are often developed ad hoc and can lack transparency and reproducibility. To address this, we introduce an R software package for inferring home locations from LBS data. The package implements pre-existing algorithms and provides building blocks to make writing algorithmic ‘recipes’ more convenient. We evaluate this approach by analyzing a de-identified LBS dataset from Singapore that aims to balance ethics and privacy with the research goal of identifying meaningful locations. We show that ensemble approaches, combining multiple algorithms, can be especially valuable in this regard, as the resulting patterns of inferred home locations closely correlate with the distribution of residential population. We hope this package, and others like it, will contribute to an increase in the use and sharing of comparable algorithms, research code and data. This will increase transparency and reproducibility in mobility analyses and further the ongoing discourse around ethical big data research.


Author(s):  
D. Cuesta-Frau ◽  
P. Miro-Martinez ◽  
S. Oltra-Crespo ◽  
M. Varela-Entrecanales ◽  
M. Aboy ◽  
...  

Entropy ◽  
2021 ◽  
Vol 24 (1) ◽  
pp. 73
Author(s):  
Dragana Bajić ◽  
Nina Japundžić-Žigon

Approximate and sample entropies are acclaimed tools for quantifying the regularity and unpredictability of time series. This paper analyses the causes of their inconsistencies. It is shown that the major problem is a coarse quantization of matching probabilities, causing a large error between their estimated and true values. The error distribution is symmetric, so in sample entropy, where matching probabilities are directly summed, errors cancel each other. In approximate entropy, errors accumulate, as the sums involve logarithms of matching probabilities. Increasing the time series length increases the number of quantization levels, and the errors in entropy disappear in both approximate and sample entropies. The distribution of the time series also affects the errors. If it is asymmetric, the matching probabilities are asymmetric as well, so the matching probability errors no longer cancel each other and cause a persistent entropy error. Contrary to the accepted opinion, the influence of self-matching is marginal, as it merely shifts the error distribution along the error axis by one matching probability quantum. Artificially lengthening the time series by interpolation, on the other hand, induces a large error, as interpolated samples are statistically dependent and destroy the level of unpredictability that is inherent to the original signal.
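The quantization described in this abstract can be seen in a minimal Python sketch of approximate entropy. This is not the authors' code: the function names, the Chebyshev (maximum) distance, and the "<= r" matching rule are assumptions taken from the commonly cited definition of ApEn. With N - m + 1 templates, every matching probability is a multiple of 1/(N - m + 1), and ApEn averages the logarithms of these coarse values.

```python
import math


def matching_probabilities(x, m, r):
    """Fraction of length-m templates within tolerance r of each template.

    Self-matches are counted (as in the usual ApEn definition), so every
    value is a multiple of 1/n with n = len(x) - m + 1 and is never zero.
    """
    n = len(x) - m + 1
    templates = [x[i:i + m] for i in range(n)]
    probs = []
    for a in templates:
        matches = sum(
            1 for b in templates
            if max(abs(u - v) for u, v in zip(a, b)) <= r  # Chebyshev distance
        )
        probs.append(matches / n)
    return probs


def approximate_entropy(x, m=2, r=0.2):
    """ApEn(m, r) = phi_m - phi_{m+1}, where phi is the mean log matching probability."""
    def phi(length):
        probs = matching_probabilities(x, length, r)
        return sum(math.log(p) for p in probs) / len(probs)

    return phi(m) - phi(m + 1)


if __name__ == "__main__":
    series = [0.0, 1.0] * 10  # strictly periodic toy series, N = 20
    print(sorted(set(matching_probabilities(series, 2, 0.1))))
    print(approximate_entropy(series, m=2, r=0.1))
```

For a series of 20 points and m = 2 there are only 19 possible probability levels, which is the coarse grid the abstract refers to; a longer series makes the grid finer.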


2020 ◽  
Vol 5 ◽  
pp. 252
Author(s):  
Jim R. Broadbent ◽  
Christopher N. Foley ◽  
Andrew J. Grant ◽  
Amy M. Mason ◽  
James R. Staley ◽  
...  

The MendelianRandomization package is a software package written for the R software environment that implements methods for Mendelian randomization based on summarized data. In this manuscript, we describe functions that have been added to the package or updated in recent years. These features can be divided into four categories: robust methods for Mendelian randomization, methods for multivariable Mendelian randomization, functions for data visualization, and the ability to load data into the package seamlessly from the PhenoScanner web-resource. We provide examples of the graphical output produced by the data visualization commands, as well as syntax for obtaining suitable data and performing a Mendelian randomization analysis in a single line of code.


2000 ◽  
Vol 278 (6) ◽  
pp. H2039-H2049 ◽  
Author(s):  
Joshua S. Richman ◽  
J. Randall Moorman

Entropy, as it relates to dynamical systems, is the rate of information production. Methods for estimation of the entropy of a system represented by a time series are not, however, well suited to analysis of the short and noisy data sets encountered in cardiovascular and other biological studies. Pincus introduced approximate entropy (ApEn), a set of measures of system complexity closely related to entropy, which is easily applied to clinical cardiovascular and other time series. ApEn statistics, however, lead to inconsistent results. We have developed a new and related complexity measure, sample entropy (SampEn), and have compared ApEn and SampEn by using them to analyze sets of random numbers with known probabilistic character. We have also evaluated cross-ApEn and cross-SampEn, which use cardiovascular data sets to measure the similarity of two distinct time series. SampEn agreed with theory much more closely than ApEn over a broad range of conditions. The improved accuracy of SampEn statistics should make them useful in the study of experimental clinical cardiovascular and other biological time series.
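As an illustration of the measure introduced in this abstract, the following is a minimal, unoptimized Python sketch of sample entropy. It is a sketch under stated assumptions rather than the authors' implementation: the function name, the "<= r" matching rule, and the handling of the undefined case are choices made here. The features it relies on are the commonly cited ones: templates of length m and m + 1 are compared with the maximum (Chebyshev) distance, self-matches are excluded, and the result is the negative logarithm of the ratio of match counts.

```python
import math


def sample_entropy(x, m=2, r=0.2):
    """SampEn(m, r) = -ln(A / B) for the series x.

    B counts template pairs of length m within tolerance r, A counts pairs
    of length m + 1. Only the first len(x) - m start positions are used for
    both lengths, and a template is never compared with itself.
    """
    n_templates = len(x) - m

    def count_pairs(length):
        count = 0
        for i in range(n_templates):
            for j in range(i + 1, n_templates):  # j > i excludes self-matches
                if max(abs(x[i + k] - x[j + k]) for k in range(length)) <= r:
                    count += 1
        return count

    b = count_pairs(m)
    a = count_pairs(m + 1)
    if a == 0 or b == 0:
        # Assumption of this sketch: report the undefined case as infinity.
        return float("inf")
    return -math.log(a / b)


if __name__ == "__main__":
    import random

    rng = random.Random(0)
    noise = [rng.random() for _ in range(200)]
    periodic = [0.0, 1.0] * 100
    print("noise:   ", sample_entropy(noise, m=2, r=0.2))
    print("periodic:", sample_entropy(periodic, m=2, r=0.2))
```

In practice the tolerance r is often chosen as a fraction of the standard deviation of the series (around 0.2 times the standard deviation is a common convention); the fixed value used here is only for illustration.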


2014 ◽  
Vol 13 ◽  
pp. CIN.S13495 ◽  
Author(s):  
Ying Hu ◽  
Chunhua Yan ◽  
Chih-Hao Hsu ◽  
Qing-Rong Chen ◽  
Kelvin Niu ◽  
...  

Summary: OmicCircos is an R software package used to generate high-quality circular plots for visualizing genomic variations, including mutation patterns, copy number variations (CNVs), expression patterns, and methylation patterns. Such variations can be displayed as scatterplot, line, or text-label figures. Relationships among genomic features in different chromosome positions can be represented in the forms of polygons or curves. Utilizing the statistical and graphic functions in an R/Bioconductor environment, OmicCircos performs statistical analyses and displays results using cluster, boxplot, histogram, and heatmap formats. In addition, OmicCircos offers a number of unique capabilities, including independent track drawing for easy modification and integration, zoom functions, link-polygons, and position-independent heatmaps supporting detailed visualization.
Availability and Implementation: OmicCircos is available through Bioconductor at http://www.bioconductor.org/packages/devel/bioc/html/OmicCircos.html . An extensive vignette in the package describes installation, data formatting, and workflow procedures. The software is open source under the Artistic-2.0 license.


Entropy ◽  
2020 ◽  
Vol 22 (6) ◽  
pp. 694
Author(s):  
Sebastian Żurek ◽  
Waldemar Grabowski ◽  
Klaudia Wojtiuk ◽  
Dorota Szewczak ◽  
Przemysław Guzik ◽  
...  

Relative consistency is a notion related to entropic parameters, most notably to Approximate Entropy and Sample Entropy. It is a central characteristic assumed for, e.g., biomedical and economic time series, since it allows the comparison between different time series at a single value of the threshold parameter r. There is no formal proof of this property, yet it is generally accepted to be true. Relative consistency in both Approximate Entropy and Sample Entropy was first tested with the MIX process. In the seminal paper by Richman and Moorman, it was shown that Approximate Entropy lacked the property for cases in which Sample Entropy did not. In the present paper, we show that relative consistency is not preserved for MIX processes if enough noise is added, yet it is preserved for another process, which we define as a sum of a sinusoidal and a stochastic element, no matter how much noise is present. The analysis presented in this paper is only possible because of the existence of the very fast NCM algorithm for calculating correlation sums and thus also Sample Entropy.
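The abstract does not spell out the MIX process. For orientation, the Python sketch below generates it under the definition usually attributed to Pincus, which is an assumption here and not taken from this paper: each sample is either a point of a period-12 sinusoid of amplitude sqrt(2) or, with probability p, a uniform random value on [-sqrt(3), sqrt(3)]. The function name and the seed parameter are hypothetical.

```python
import math
import random


def mix_process(n, p, seed=None):
    """Generate n samples of MIX(p) under the assumed Pincus-style definition.

    MIX(p)_j = (1 - Z_j) * X_j + Z_j * Y_j, where
      X_j = sqrt(2) * sin(2 * pi * j / 12)       (deterministic part)
      Y_j ~ Uniform(-sqrt(3), sqrt(3))           (stochastic part)
      Z_j = 1 with probability p, otherwise 0    (random switch)
    """
    rng = random.Random(seed)
    samples = []
    for j in range(n):
        x = math.sqrt(2) * math.sin(2 * math.pi * j / 12)
        y = rng.uniform(-math.sqrt(3), math.sqrt(3))
        z = 1 if rng.random() < p else 0
        samples.append((1 - z) * x + z * y)
    return samples


if __name__ == "__main__":
    # p = 0 gives a pure sinusoid, p = 1 gives pure uniform noise.
    for p in (0.0, 0.5, 1.0):
        print(p, [round(v, 3) for v in mix_process(6, p, seed=1)])
```

Both components have zero mean and unit variance under this definition, so changing p alters the degree of randomness without changing the overall scale of the series.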


2021 ◽  
Author(s):  
Zuguang Gu ◽  
Daniel Huebschmann

The spiral layout has two major advantages for data visualization. First, it can visualize data with long axes, which greatly improves the resolution of the visualization. Second, it is effective at revealing periodic patterns in time series data. Here we present the R package spiralize, which provides a general solution for visualizing data on spirals. spiralize implements numerous graphics functions so that users can easily build self-defined high-level graphics. The power of spiralize is demonstrated on five real-world datasets.

