Some Order Preserving Inequalities for Cross Entropy and Kullback–Leibler Divergence

Entropy ◽  
2018 ◽  
Vol 20 (12) ◽  
pp. 959 ◽  
Author(s):  
Mateu Sbert ◽  
Min Chen ◽  
Jordi Poch ◽  
Anton Bardera

Cross entropy and Kullback–Leibler (K-L) divergence are fundamental quantities of information theory, and they are widely used in many fields. Since cross entropy is the negated logarithm of likelihood, minimizing cross entropy is equivalent to maximizing likelihood, and thus, cross entropy is applied for optimization in machine learning. K-L divergence also stands independently as a commonly used metric for measuring the difference between two distributions. In this paper, we introduce new inequalities regarding cross entropy and K-L divergence by using the fact that cross entropy is the negated logarithm of the weighted geometric mean. We first apply the well-known rearrangement inequality, followed by a recent theorem on weighted Kolmogorov means, and, finally, we introduce a new theorem that directly applies to inequalities between K-L divergences. To illustrate our results, we show numerical examples of distributions.
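
The identity the abstract relies on, that cross entropy is the negated logarithm of a weighted geometric mean, can be checked numerically. The following is a minimal sketch rather than the authors' code: the function names and the two example distributions `p` and `q` are assumptions chosen for illustration, and natural logarithms are used throughout.

```python
import math

def cross_entropy(p, q):
    """H(p, q) = -sum_i p_i * log(q_i), in nats; terms with p_i = 0 contribute nothing."""
    return -sum(pi * math.log(qi) for pi, qi in zip(p, q) if pi > 0)

def weighted_geometric_mean(values, weights):
    """prod_i values_i ** weights_i, for non-negative weights that sum to 1."""
    return math.prod(v ** w for v, w in zip(values, weights))

def kl_divergence(p, q):
    """D_KL(p || q) = H(p, q) - H(p), where H(p) = H(p, p)."""
    return cross_entropy(p, q) - cross_entropy(p, p)

# Hypothetical example distributions (not taken from the paper).
p = [0.5, 0.3, 0.2]
q = [0.4, 0.4, 0.2]

h_pq = cross_entropy(p, q)
gm = weighted_geometric_mean(q, p)  # geometric mean of q, weighted by p

print(h_pq, -math.log(gm))   # the two values agree up to rounding
print(kl_divergence(p, q))   # non-negative, zero only when p == q
```

Because the logarithm is monotonic, any inequality between weighted geometric means of this kind carries over, with its direction reversed, to the corresponding cross entropies.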

2011 ◽  
Vol 139 (7) ◽  
pp. 2156-2162 ◽  
Author(s):  
Steven V. Weijs ◽  
Nick van de Giesen

Abstract Recently, an information-theoretical decomposition of Kullback–Leibler divergence into uncertainty, reliability, and resolution was introduced. In this article, this decomposition is generalized to the case where the observation is uncertain. Along with a modified decomposition of the divergence score, a second measure, the cross-entropy score, is presented, which measures the estimated information loss with respect to the truth instead of relative to the uncertain observations. The difference between the two scores is equal to the average observational uncertainty and vanishes when observations are assumed to be perfect. Failing to account for observational uncertainty can lead to both overestimation and underestimation of forecast skill, depending on the nature of the noise process.
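
The stated relation between the two scores follows from the general identity H(o, f) = D_KL(o || f) + H(o). The sketch below illustrates it under stated assumptions: each observation is represented as a probability vector `o`, each forecast as a probability vector `f`, scores are in bits, and the three forecast-observation pairs are invented. The exact score definitions in the article may differ in detail.

```python
import math

def entropy(o):
    """Uncertainty of an observation distribution, in bits."""
    return -sum(x * math.log2(x) for x in o if x > 0)

def cross_entropy_score(o, f):
    """-sum_i o_i * log2(f_i)."""
    return -sum(x * math.log2(y) for x, y in zip(o, f) if x > 0)

def divergence_score(o, f):
    """Kullback-Leibler divergence D_KL(o || f), in bits."""
    return sum(x * math.log2(x / y) for x, y in zip(o, f) if x > 0)

# Hypothetical (observation, forecast) pairs for a binary event.
pairs = [
    ([0.9, 0.1], [0.7, 0.3]),
    ([0.2, 0.8], [0.4, 0.6]),
    ([1.0, 0.0], [0.8, 0.2]),  # a perfect (certain) observation
]

n = len(pairs)
mean_ce = sum(cross_entropy_score(o, f) for o, f in pairs) / n
mean_ds = sum(divergence_score(o, f) for o, f in pairs) / n
mean_uncertainty = sum(entropy(o) for o, _ in pairs) / n

print(mean_ce - mean_ds, mean_uncertainty)  # equal up to rounding
```

For the third pair the observation is certain, its entropy is zero, and the two scores coincide, which matches the statement that the difference vanishes for perfect observations.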


2019 ◽  
Vol 490 (1) ◽  
pp. 331-342 ◽  
Author(s):  
Luisa Lucie-Smith ◽  
Hiranya V Peiris ◽  
Andrew Pontzen

ABSTRACT We present a generalization of our recently proposed machine-learning framework, aiming to provide new physical insights into dark matter halo formation. We investigate the impact of the initial density and tidal shear fields on the formation of haloes over the mass range 11.4 ≤ log (M/M⊙) ≤ 13.4. The algorithm is trained on an N-body simulation to infer the final mass of the halo to which each dark matter particle will later belong. We then quantify the difference in the predictive accuracy between machine-learning models using a metric based on the Kullback–Leibler divergence. We first train the algorithm with information about the density contrast in the particles’ local environment. The addition of tidal shear information does not yield an improved halo collapse model over one based on density information alone; the difference in their predictive performance is consistent with the statistical uncertainty of the density-only based model. This result is confirmed as we verify the ability of the initial conditions-to-halo mass mapping learnt from one simulation to generalize to independent simulations. Our work illustrates the broader potential of developing interpretable machine-learning frameworks to gain physical understanding of non-linear large-scale structure formation.


Author(s):  
Elena V Esaulenko ◽  
Aleksey A Yakovlev ◽  
Genady A Volkov ◽  
Anastasia A Sukhoruk ◽  
Kirill G Surkov ◽  
...  

Abstract Background This study compares the immunogenicity and safety of a 3-antigen (S/pre-S1/pre-S2) hepatitis B (HepB) vaccine (3AV) to a single-antigen vaccine (1AV) in adults to support the registration of 3AV in Russia. Methods We conducted a randomized, double-blind, comparative study of 3-dose regimens of 3AV (10 µg) and 1AV (20 µg) in adults aged 18–45 years. We evaluated immunogenicity based on hepatitis B surface (HBs) antibody titers at days 1, 28, 90, 180, and 210, and safety based on adverse events and serious adverse events (SAEs) up to study day 210. The primary outcome was based on the difference in rates of seroconversion at day 210 (lower bound of the 95% confidence interval [CI]: >−4%). Secondary outcomes were seroprotection rates (SPR), defined as anti-HBs ≥10 mIU/mL, and anti-HBs geometric mean concentration (GMC). Results The rate of seroconversion with 3AV (100%) was noninferior to that with 1AV (97.9%) at study day 210 (difference: 2.1%; 95% CI: −2.0%, 6.3%) but significantly higher at study day 28. SPR at study day 210 was >97% in both arms. Anti-HBs titers were significantly higher at study days 90 (P = .001) and 180 (P = .0001) with 3AV. Sex, age, and body mass index (BMI) had no impact on anti-HBs titers. The rates of local reactions related to vaccination were similar between vaccine arms (3AV vs 1AV) after the first (30% vs 18.8%, P = .15), second (20.0% vs 14.6%, P = .33), and third vaccination (14.9% vs 23.4%, P = .22). No SAEs were reported. Conclusions 3AV was noninferior to 1AV. 3AV induced high SPR, and there were no safety concerns. Clinical Trials Registration. NCT04209400.


2021 ◽  
Vol 11 (1) ◽  
Author(s):  
Yersultan Mirasbekov ◽  
Adina Zhumakhanova ◽  
Almira Zhantuyakova ◽  
Kuanysh Sarkytbayev ◽  
Dmitry V. Malashenkov ◽  
...  

Abstract A machine learning approach was employed to detect and quantify Microcystis colonial morphospecies using FlowCAM-based imaging flow cytometry. The system was trained and tested using samples from a long-term mesocosm experiment (LMWE, Central Jutland, Denmark). The statistical validation of the classification approaches was performed using Hellinger distances, Bray–Curtis dissimilarity, and Kullback–Leibler divergence. The semi-automatic classification based on well-balanced training sets from the Microcystis seasonal bloom provided a high level of intergeneric accuracy (96–100%) but relatively low intrageneric accuracy (67–78%). Our results provide a proof of concept of how machine learning approaches can be applied to analyze colonial microalgae. This approach allowed us to evaluate the Microcystis seasonal bloom in individual mesocosms with a high level of temporal and spatial resolution. The observation that some Microcystis morphotypes completely disappeared and re-appeared along the mesocosm experiment timeline supports the hypothesis of the main transition pathways of colonial Microcystis morphoforms. We demonstrated that significant changes in the training sets of colonial images were required for accurate classification of Microcystis spp. from time points that differed by only two weeks, owing to the high phenotypic heterogeneity of Microcystis during the bloom. We conclude that automatic methods not only can reach the performance level of a human taxonomist, and thus be a valuable time-saving tool in the routine identification of colonial phytoplankton taxa, but also can be applied to increase the temporal and spatial resolution of the study.
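
The three validation measures named in the abstract can be computed from per-class counts as in the sketch below. This is an illustration under assumptions, not the authors' pipeline: the count vectors `manual` and `predicted` are invented, and the standard textbook definitions of each measure are used.

```python
import math

def normalize(counts):
    """Convert class counts to a probability distribution."""
    total = sum(counts)
    return [c / total for c in counts]

def hellinger(p, q):
    """Hellinger distance between two distributions, in [0, 1]."""
    s = sum((math.sqrt(a) - math.sqrt(b)) ** 2 for a, b in zip(p, q))
    return math.sqrt(s) / math.sqrt(2)

def bray_curtis(u, v):
    """Bray-Curtis dissimilarity between two non-negative count vectors."""
    return sum(abs(a - b) for a, b in zip(u, v)) / sum(a + b for a, b in zip(u, v))

def kl_divergence(p, q):
    """D_KL(p || q) in nats; requires q_i > 0 wherever p_i > 0."""
    return sum(a * math.log(a / b) for a, b in zip(p, q) if a > 0)

# Hypothetical counts per morphospecies class: expert labels vs. classifier output.
manual = [120, 45, 30, 5]
predicted = [110, 50, 32, 8]

p, q = normalize(manual), normalize(predicted)
print(hellinger(p, q))
print(bray_curtis(manual, predicted))
print(kl_divergence(p, q))
```

All three values are zero when the classifier reproduces the manual counts exactly; Kullback–Leibler divergence is the only one of the three that is asymmetric in its arguments.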


2021 ◽  
Vol 11 (9) ◽  
pp. 4251
Author(s):  
Jinsong Zhang ◽  
Shuai Zhang ◽  
Jianhua Zhang ◽  
Zhiliang Wang

In digital microfluidic experiments, droplet characteristics and flow patterns are generally identified and predicted by empirical methods, which are ill-suited to mining large amounts of data. In addition, because human intervention is inevitable, inconsistent judgment standards make comparison between different experiments cumbersome and almost impossible. In this paper, we used machine learning to build algorithms that automatically identify, judge, and predict flow patterns and droplet characteristics, so that empirical judgment is turned into an intelligent process. Unlike the usual machine learning approaches, we introduced a generalized variable system to describe the different geometric configurations of digital microfluidics. Specifically, Buckingham’s theorem was adopted to obtain multiple groups of dimensionless numbers as the input variables of the machine learning algorithms. In verification, the SVM and BPNN algorithms successfully classified and predicted the different flow patterns and droplet characteristics (length and frequency). Compared with the primitive parameter system, the dimensionless number system was superior in predictive capability. The traditional dimensionless numbers selected for the machine learning algorithms should carry strong physical meaning rather than purely mathematical meaning. The machine learning algorithms using the dimensionless numbers reduced the dimensionality of the system and the amount of computation without losing the information in the primitive parameters.


2021 ◽  
Vol 11 (6) ◽  
pp. 2784
Author(s):  
Shahnaz TayebiHaghighi ◽  
Insoo Koo

In this paper, the combination of an indirect self-tuning observer, smart signal modeling, and machine learning-based classification is proposed for rolling element bearing (REB) anomaly identification. The proposed scheme has three main stages. In the first stage, the original signal is resampled, and the root mean square (RMS) signal is extracted from it. In the second stage, the normal resampled RMS signal is approximated using the AutoRegressive with eXternal Uncertainty (ARXU) technique. Moreover, the nonlinearity of the bearing signal is addressed by combining ARXU with machine learning-based regression, which is called AMRXU. After signal modeling by AMRXU, the RMS resampled signal is estimated using a combination of the proportional multi-integral (PMI) technique, the variable structure (VS) Lyapunov technique, and a self-tuning network-fuzzy system (SNFS). Finally, in the third stage, the difference between the original signal and the estimated one is calculated to generate the residual signal. A machine learning-based classification technique is utilized to classify the residual signal. The Case Western Reserve University (CWRU) dataset is used to evaluate the anomaly identification performance of the proposed scheme. In the experiments, the average accuracy for REB crack identification is 98.65%, 97.7%, 97.35%, and 97.67% at motor torque loads of 0, 1, 2, and 3 hp, respectively.
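
The first and third stages (RMS extraction and residual generation) can be sketched as follows. This is a simplified illustration only: the non-overlapping window, the sample values, and the constant `estimated` series are assumptions, and the constant series merely stands in for the output of the observer described in the abstract, which is not reproduced here.

```python
import math

def windowed_rms(signal, window):
    """RMS over consecutive non-overlapping windows; a trailing partial window is dropped."""
    return [
        math.sqrt(sum(x * x for x in signal[i:i + window]) / window)
        for i in range(0, len(signal) - window + 1, window)
    ]

def residual(measured, estimated):
    """Element-wise difference between the measured and the estimated signal."""
    return [m - e for m, e in zip(measured, estimated)]

# Hypothetical vibration samples; the last window has a larger amplitude.
raw = [3.0, 4.0, 3.0, 4.0, 6.0, 8.0]
rms = windowed_rms(raw, window=2)

# Placeholder for the estimate of the normal RMS signal (assumed constant here).
estimated = [3.5, 3.5, 3.5]
res = residual(rms, estimated)
print(rms)
print(res)  # the residual is largest for the high-amplitude window
```

In a scheme of this kind the residual, rather than the raw signal, is what the classifier receives, so deviations from normal behaviour stand out from the baseline.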


2021 ◽  
Vol 14 (1) ◽  
Author(s):  
Jung Eun Huh ◽  
Seunghee Han ◽  
Taeseon Yoon

Abstract Objective In this study we compare the amino acid and codon sequences of SARS-CoV-2, SARS-CoV and MERS-CoV using different statistics programs to understand their characteristics. Specifically, we are interested in how differences in the amino acid and codon sequences can lead to different incubation periods and outbreak periods. Our initial aim was to compare SARS-CoV-2 to different viruses in the coronavirus family using the NCBI BLAST program and machine learning algorithms. Results The experiments using BLAST, Apriori and Decision Tree showed that SARS-CoV-2 had high similarity with SARS-CoV while having comparably low similarity with MERS-CoV. We decided to compare the codons of SARS-CoV-2 and MERS-CoV to see the difference. Though the viruses are very alike according to the BLAST and Apriori experiments, SVM showed that they can be effectively classified using non-linear kernels. The Decision Tree experiment revealed several remarkable properties of the SARS-CoV-2 amino acid sequence that cannot be found in the MERS-CoV amino acid sequence. The ultimate purpose of this paper is to minimize the damage to humanity from SARS-CoV-2. Hence, further studies can focus on comparing SARS-CoV-2 with other viruses that can also be transmitted during latent periods.


Entropy ◽  
2019 ◽  
Vol 21 (8) ◽  
pp. 721 ◽  
Author(s):  
YuGuang Long ◽  
LiMin Wang ◽  
MingHui Sun

Due to the simplicity and competitive classification performance of naive Bayes (NB), researchers have proposed many approaches to improve NB by weakening its attribute independence assumption. A theoretical analysis based on Kullback–Leibler divergence shows that the difference between NB and its variations lies in the different orders of conditional mutual information represented by the augmenting edges in the tree-shaped network structure. In this paper, we propose to relax the independence assumption by further generalizing tree-augmented naive Bayes (TAN) from 1-dependence Bayesian network classifiers (BNC) to arbitrary k-dependence. Sub-models of TAN that are built to respectively represent specific conditional dependence relationships may “best match” the conditional probability distribution over the training data. Extensive experimental results reveal that the proposed algorithm achieves a bias-variance trade-off and substantially better generalization performance than state-of-the-art classifiers such as logistic regression.


2009 ◽  
Vol 76-78 ◽  
pp. 459-464
Author(s):  
Jae Won Baik ◽  
Chang Wook Kang

Chemical mechanical polishing (CMP) is a technique used in semiconductor fabrication for planarizing the top surface of an in-process semiconductor wafer. In particular, post-CMP thickness variations are known to have a severe impact on the stability of downstream processes and ultimately on device yield. Hence, understanding how to quantify and characterize this non-uniformity is a significant step towards statistical process control to achieve higher quality and enhanced productivity. The main reason is that the non-uniform interface between the wafer and the machine pad adversely affects the polishing performance and the ultimate surface uniformity. The purpose of this paper is to suggest a new measure that estimates the uniformity of the wafer surface, considering the difference in the amount of abrasion between the center and the edge. This new measure, called the Coefficient of Uniformity, is defined as the following ratio: Geometric Mean (GM) / Arithmetic Mean (AM). This metric can be evaluated regionally to quantify the non-uniformity on the wafer surface from the center to the edge. Further simulations show that this new measure is insensitive to a shift of the wafer center and sensitive to a shift of the wafer edge. This trend indicates that the new measure is very useful for testing the non-uniformity of a wafer after CMP.
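
The GM/AM ratio defined above is straightforward to compute. The sketch below is an illustration under assumptions: the thickness readings and the split into a center and an edge region are invented, and the function name is not taken from the paper. By the AM-GM inequality the ratio is at most 1 for positive values and equals 1 only when all readings are identical.

```python
import math

def coefficient_of_uniformity(thicknesses):
    """Geometric mean divided by arithmetic mean of positive thickness readings."""
    n = len(thicknesses)
    arithmetic_mean = sum(thicknesses) / n
    geometric_mean = math.exp(sum(math.log(t) for t in thicknesses) / n)
    return geometric_mean / arithmetic_mean

# Hypothetical post-CMP thickness readings (arbitrary units) for two wafer regions.
center = [500.0, 501.0, 499.0, 500.0]
edge = [480.0, 520.0, 450.0, 540.0]

print(coefficient_of_uniformity(center))  # very close to 1: nearly uniform
print(coefficient_of_uniformity(edge))    # lower: more thickness variation
```

Evaluating the ratio separately per region, as here, is one way to read the abstract's remark that the metric can be applied regionally from the center to the edge.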

