Data Cost Games as an Application of 1-Concavity in Cooperative Game Theory

2014 ◽  
Vol 2014 ◽  
pp. 1-5 ◽  
Author(s):  
Dongshuang Hou ◽  
Theo Driessen

The main goal is to reveal the 1-concavity property for a subclass of cost games called data cost games. The motivation for studying the 1-concavity property lies in the appealing theoretical results for both the core and the nucleolus, in particular their geometrical characterization as well as their additivity property. The characteristic cost function of the original data cost game assigns to every coalition the additive cost of reproducing the data the coalition does not own. The underlying data and cost sharing situation is composed of three components, namely, the player set, the collection of data sets for individuals, and the additive cost function on the whole data set. The proof of 1-concavity is direct, and it carries over to a suitable generalization of the characteristic cost function. As an adjunct, the 1-concavity property is shown for the subclass of so-called "bicycle" cost games, including the data cost games in which the individual data sets are nested in decreasing order.
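As a formal anchor for the terms used above, the following is a minimal sketch of the definitions, assuming the standard gap-function formulation of 1-concavity from Driessen's earlier work; the notation (k for the additive cost function, D for the whole data set, D_i for player i's individual data set) is ours, not fixed by the abstract.

```latex
% Data cost game: each coalition S pays for reproducing the data it lacks.
\[
  c(S) \;=\; k\Bigl(D \setminus \bigcup_{i \in S} D_i\Bigr)
  \qquad \text{for all } S \subseteq N .
\]
% 1-concavity via marginal costs and the gap function (standard form):
\[
  m_i \;=\; c(N) - c(N \setminus \{i\}), \qquad
  g(S) \;=\; c(S) - \sum_{i \in S} m_i ,
\]
\[
  (N, c) \text{ is 1-concave} \iff g(S) \;\ge\; g(N) \;\ge\; 0
  \quad \text{for all } \emptyset \ne S \subseteq N .
\]
```

It is this gap-function structure that underlies the geometric characterization of the core and nucleolus referred to in the abstract.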

Author(s):  
Danlei Xu ◽  
Lan Du ◽  
Hongwei Liu ◽  
Penghui Wang

A Bayesian classifier for sparsity-promoting feature selection is developed in this paper, where a set of nonlinear mappings of the original data is applied as a pre-processing step. The linear classification model with such mappings from the original input space to a nonlinear transformation space can not only construct a nonlinear classification boundary, but also realize feature selection for the original data. A zero-mean Gaussian prior with Gamma precision and a finite approximation of the Beta process prior are used to promote sparsity in the utilization of features and nonlinear mappings in our model, respectively. We derive the variational Bayesian (VB) inference algorithm for the proposed linear classifier. Experimental results on a synthetic data set, a measured radar data set, a high-dimensional gene expression data set, and several benchmark data sets demonstrate the aggressive and robust feature selection capability of our method and its classification accuracy comparable to that of other existing classifiers.
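A sketch of the hierarchical prior described above may help; the hyperparameter names (a, b, c, d) and the Beta-Bernoulli finite approximation of the Beta process are written in a common textbook form and are assumptions, since the abstract does not spell them out.

```latex
% Sparsity over the weights w_j (one per feature):
\[
  w_j \mid \alpha_j \sim \mathcal{N}\bigl(0, \alpha_j^{-1}\bigr), \qquad
  \alpha_j \sim \mathrm{Gamma}(a, b).
\]
% Sparsity over which of the K nonlinear mappings are used, via a
% finite (K-component) approximation of the Beta process:
\[
  z_k \mid \pi_k \sim \mathrm{Bernoulli}(\pi_k), \qquad
  \pi_k \sim \mathrm{Beta}\bigl(c/K,\; d(K-1)/K\bigr),
  \qquad k = 1, \dots, K.
\]
```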


2019 ◽  
Vol 622 ◽  
pp. A172 ◽  
Author(s):  
F. Murgas ◽  
G. Chen ◽  
E. Pallé ◽  
L. Nortmann ◽  
G. Nowak

Context. Rayleigh scattering in a hydrogen-dominated exoplanet atmosphere can be detected using ground- or space-based telescopes. However, stellar activity in the form of spots can mimic Rayleigh scattering in the observed transmission spectrum. Quantifying this phenomenon is key to the correct interpretation of exoplanet atmospheric properties. Aims. We use the ten-meter Gran Telescopio Canarias (GTC) to carry out a ground-based survey of the transmission spectra of extrasolar planets in order to characterize their atmospheres. In this paper we investigate the exoplanet HAT-P-11b, a Neptune-sized planet orbiting an active K-type star. Methods. We obtained long-slit optical spectroscopy of two transits of HAT-P-11b with the Optical System for Imaging and low-Intermediate-Resolution Integrated Spectroscopy (OSIRIS) on August 30, 2016 and September 25, 2017. We integrated the spectrum of HAT-P-11 and one reference star in several spectroscopic channels across the λ ~ 400–785 nm region, creating numerous light curves of the transits. We fit analytic transit curves to the data, taking into account the systematic effects and red noise present in the time series, in an effort to measure the change of the planet-to-star radius ratio (Rp/Rs) with wavelength. Results. By fitting both transits together, we find a slope in the transmission spectrum showing an increase of the planetary radius towards blue wavelengths. Closer inspection of the transmission spectra of the individual data sets reveals that the first transit presents this slope while the transmission spectrum of the second data set is flat. Additionally, we detect hints of Na absorption on the first night, but not on the second. We conclude that the transmission spectrum slope and Na absorption excess found in the first transit observation are caused by unocculted stellar spots. Modeling the contribution of unocculted spots to reproduce the results of the first night, we find a spot filling factor of δ = 0.62 (+0.20/−0.17) and a spot-to-photosphere temperature difference of ΔT = 429 (+184/−299) K.
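For intuition on how unocculted spots mimic Rayleigh scattering, a standard first-order contamination model (not necessarily the exact parameterization used in the paper) relates the observed and true transit depths through the spot filling factor δ:

```latex
\[
  \left(\frac{R_p}{R_s}\right)^{2}_{\mathrm{obs}}(\lambda)
  \;=\;
  \frac{\left(R_p/R_s\right)^{2}_{\mathrm{true}}}
       {1 \;-\; \delta\left(1 - \dfrac{F_{\mathrm{spot}}(\lambda)}{F_{\mathrm{phot}}(\lambda)}\right)} ,
\]
% F_spot, F_phot: spot and photosphere spectra; delta: spot filling factor.
```

Because cool spots are relatively darker at blue wavelengths, the denominator drops below unity there and the apparent planetary radius is inflated towards the blue, producing a slope like the one seen on the first night.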


2014 ◽  
Vol 31 (8) ◽  
pp. 1778-1789
Author(s):  
Hongkang Lin

Purpose – The clustering/classification method proposed in this study, designated as the PFV-index method, provides the means to solve the following problems for a data set characterized by imprecision and uncertainty: first, discretizing the continuous values of all the individual attributes within the data set; second, evaluating the optimality of the discretization results; third, determining the optimal number of clusters per attribute; and fourth, improving the classification accuracy (CA) of data sets characterized by uncertainty. The paper aims to discuss these issues. Design/methodology/approach – The proposed method for solving the clustering/classification problem, designated as the PFV-index method, combines a particle swarm optimization algorithm, the fuzzy C-means method, variable precision rough sets theory, and a new cluster validity index function. Findings – This method clusters the values of the individual attributes within the data set and achieves both the optimal number of clusters and the optimal CA. Originality/value – The validity of the proposed approach is investigated by comparing the classification results obtained for UCI data sets with those obtained by supervised classification methods, namely backpropagation neural networks (BPNN) and decision-tree methods.
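The abstract does not specify the PFV-index itself, but the clustering engine at its core is fuzzy C-means; a minimal sketch of that building block, applied to a single continuous attribute, might look as follows (the PSO search over cluster numbers and the validity index are omitted):

```python
import numpy as np

def fuzzy_c_means(values, n_clusters, m=2.0, n_iter=100, seed=0):
    """Plain fuzzy C-means on one continuous attribute: the clustering
    building block of the PFV-index method (the PSO search and the
    validity index are not specified in the abstract)."""
    rng = np.random.default_rng(seed)
    x = np.asarray(values, dtype=float).reshape(-1, 1)
    # random initial membership matrix; each row sums to 1
    u = rng.random((len(x), n_clusters))
    u /= u.sum(axis=1, keepdims=True)
    for _ in range(n_iter):
        um = u ** m
        centers = (um.T @ x) / um.sum(axis=0).reshape(-1, 1)
        dist = np.abs(x - centers.T) + 1e-12          # (n_samples, n_clusters)
        # membership update: u_ik proportional to d_ik^(-2/(m-1))
        inv = dist ** (-2.0 / (m - 1.0))
        u = inv / inv.sum(axis=1, keepdims=True)
    return centers.ravel(), u
```

Discretization then amounts to assigning each attribute value to its highest-membership cluster.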


2019 ◽  
Vol 34 (9) ◽  
pp. 1369-1383 ◽  
Author(s):  
Dirk Diederen ◽  
Ye Liu

Abstract With the ongoing development of distributed hydrological models, flood risk analysis calls for synthetic, gridded precipitation data sets. The availability of large, coherent, gridded re-analysis data sets, in combination with the increase in computational power, accommodates the development of new methodology to generate such synthetic data. We tracked moving precipitation fields and classified them using self-organising maps. For each class, we fitted a multivariate mixture model and generated a large set of synthetic, coherent descriptors, which we used to reconstruct moving synthetic precipitation fields. We introduced randomness by replacing the observed precipitation fields in the original data set with the synthetic ones. The output is a continuous, gridded, hourly precipitation data set of much longer duration, containing physically plausible and spatio-temporally coherent precipitation events. The proposed methodology implicitly provides an important improvement in the spatial coherence of precipitation extremes. We investigate the issue of unrealistic, sudden changes on the grid and demonstrate how a dynamic spatio-temporal generator can provide spatial smoothness in the probability distribution parameters and hence in the return level estimates.
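A compact sketch of the classify-then-resample pipeline, under stated assumptions: MiniSom as the self-organising map implementation, a Gaussian mixture standing in for the "multivariate mixture model" (the abstract does not commit to Gaussian components), a hypothetical descriptor file, and an invented descriptor set.

```python
import numpy as np
from minisom import MiniSom                      # assumed SOM implementation
from sklearn.mixture import GaussianMixture

# One row per tracked precipitation field, e.g. (duration, mean intensity,
# area, advection speed, direction, ...); the exact descriptors and the
# file name are assumptions, not taken from the paper.
descriptors = np.load("event_descriptors.npy")

# 1. classify events with a self-organising map (3x3 grid of classes)
som = MiniSom(3, 3, descriptors.shape[1], sigma=1.0, learning_rate=0.5)
som.train_random(descriptors, 10_000)
labels = np.array([np.ravel_multi_index(som.winner(d), (3, 3))
                   for d in descriptors])

# 2. per class, fit a mixture model and draw synthetic descriptors
synthetic = []
for c in np.unique(labels):
    members = descriptors[labels == c]           # needs enough events per class
    gmm = GaussianMixture(n_components=3, random_state=0).fit(members)
    draws, _ = gmm.sample(10 * len(members))     # oversample 10x
    synthetic.append(draws)
synthetic = np.vstack(synthetic)                 # feeds field reconstruction
```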


2020 ◽  
Vol 36 (4) ◽  
pp. 1175-1188
Author(s):  
Pierre Lamarche ◽  
Friderike Oehler ◽  
Irene Rioboo

Poverty indicators purely based on income statistics do not reflect the full picture of households' economic well-being. Consumption and wealth are two additional key dimensions that determine the economic opportunities of people and material inequalities. We use non-parametric statistical matching methods to join consumption data from the Household Budget Survey to micro data from the European Union Statistics on Income and Living Conditions. In a second step, micro data from the Household Finance and Consumption Survey are joined to produce a common distribution of income, consumption and wealth variables. A variety of indicators, in particular household saving rates, is then produced based on this joint data set. Care has to be taken when interpreting the indicators, since the statistical matching is based on strong assumptions and a limited number of variables common to all three original data sets. We are able to show, however, that the assumptions made are justified by the use of strong proxies as matching variables. Thus, the resulting indicators have the potential to contribute to the analysis of inequality patterns and enhance the possibilities of social, and possibly fiscal, policy impact analysis.
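A minimal sketch of the first matching step, assuming a nearest-neighbour (distance hot deck) variant of non-parametric matching and hypothetical column names; the real exercise uses the strong proxy variables common to the three surveys.

```python
import pandas as pd
from sklearn.neighbors import NearestNeighbors

# Hypothetical matching variables shared by EU-SILC and the HBS.
COMMON = ["income", "household_size", "age_head", "tenure_status"]

def match_consumption(silc: pd.DataFrame, hbs: pd.DataFrame) -> pd.DataFrame:
    """Attach to each SILC household the consumption of its nearest HBS
    neighbour in the space of common variables (distance hot deck)."""
    # standardise the common variables so no single one dominates
    mu, sd = hbs[COMMON].mean(), hbs[COMMON].std()
    nn = NearestNeighbors(n_neighbors=1).fit((hbs[COMMON] - mu) / sd)
    _, idx = nn.kneighbors((silc[COMMON] - mu) / sd)
    out = silc.copy()
    out["consumption"] = hbs["consumption"].to_numpy()[idx.ravel()]
    # household saving rate from the fused record
    out["saving_rate"] = (out["income"] - out["consumption"]) / out["income"]
    return out
```

The HFCS wealth variables would be attached in a second pass of the same procedure, after which joint income-consumption-wealth indicators can be derived.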


Author(s):  
CHANGHUA YU ◽  
MICHAEL T. MANRY ◽  
JIANG LI

In the neural network literature, many preprocessing techniques, such as feature de-correlation, input unbiasing, and normalization, are suggested to accelerate multilayer perceptron training. In this paper, we show that a network trained with an original data set and one trained with a linear transformation of the original data will go through the same training dynamics, as long as they start from equivalent states. Thus preprocessing techniques may not be helpful; they are merely equivalent to using a different weight set to initialize the network. Theoretical analyses of such preprocessing approaches are given for conjugate gradient, backpropagation, and Newton's method. In addition, an efficient Newton-like training algorithm is proposed for hidden layer training. Experiments on various data sets confirm the theoretical analyses and verify the improvement of the new algorithm.
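The "equivalent states" claim can be illustrated in a few lines of NumPy: if the inputs undergo an invertible affine map, a matching change of first-layer weights and biases reproduces the hidden activations exactly, so the two networks sit at equivalent states and (per the paper's analysis, for the algorithms listed) evolve identically from there. A sketch, not the paper's code:

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 5))          # original data
A = rng.normal(size=(5, 5))            # any invertible linear map
b = rng.normal(size=5)
Xt = X @ A.T + b                       # "preprocessed" data: x' = A x + b

W = rng.normal(size=(3, 5))            # first-layer weights, 3 hidden units
c = rng.normal(size=3)

# equivalent first-layer state for the transformed inputs:
W2 = W @ np.linalg.inv(A)              # W' = W A^-1
c2 = c - W2 @ b                        # c' = c - W A^-1 b

h1 = np.tanh(X @ W.T + c)              # hidden activations, original net
h2 = np.tanh(Xt @ W2.T + c2)           # hidden activations, transformed net
assert np.allclose(h1, h2)             # identical states -> identical dynamics
```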


1969 ◽  
Vol 63 (4) ◽  
pp. 1106-1119 ◽  
Author(s):  
Michael J. Shapiro

In recent years the welter of data accumulated on American voting behavior has been continually reanalyzed by social scientists interested in building theories of electoral choice. Most of the original data-gathering enterprises were guided by general theoretical frameworks which, for the most part, were not developed to a point where the ensuing analyses addressed themselves unambiguously to the overall conceptions by which they were guided. As a result much of our knowledge about voting behavior is in the form of generalizations about what social and psychological variables account for voting choices, while we lack conceptual frameworks which systematically interrelate these generalizations and provide comprehensive and parsimonious explanation. If any one unifying conception has emerged from the original large scale studies it is that the average voter is irrational. This inference has been derived from a variety of empirical relationships coupled with varying conceptions of rationality.

The more recent reanalyses of these data sets have been characterized by a theoretical sophistication that was lacking heretofore. One of these, a theory of the calculus of voting, has applied some formal rigor to the question of the rationality of the decision to vote, selected empirical equivalents of theoretical entities from survey data on national elections, and conducted a successful test of the theory. Unlike traditional approaches to the rationality question, which infer the degree of rationality from quantities of information possessed or from correlates of decisions (background, party affiliation, group memberships, etc.), this investigation conceived of rationality in terms of the kind of calculus employed by the individual in deciding among alternatives (in this case, whether or not to vote).
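The calculus-of-voting theory alluded to here is conventionally written in the Riker-Ordeshook form; the abstract does not spell the expression out, so the following is offered only as the standard reference point.

```latex
\[
  R \;=\; pB \;-\; C \;+\; D
\]
```

Here R is the expected net reward of voting, p the probability that one's vote is decisive, B the differential benefit of the preferred candidate winning, C the cost of voting, and D the intrinsic (civic-duty) rewards of the act itself; the citizen votes when R > 0.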


2019 ◽  
Vol 16 (2) ◽  
pp. 445-452
Author(s):  
Kishore S. Verma ◽  
A. Rajesh ◽  
Adeline J. S. Johnsana

K-anonymization is one of the most widely used approaches for protecting individual records from privacy leakage attacks in the Privacy Preserving Data Mining (PPDM) arena. Typically, an anonymized data set impacts the effectiveness of data mining results, so researchers in PPDM are currently directing their efforts toward finding the optimum trade-off between privacy and utility. This work aims to bring out the optimum classifier, from a set of well-performing data mining classifiers, capable of generating value-added classification results on utility-aware k-anonymized data sets. We perform the analysis on data sets anonymized with respect to utility factors such as null-value count and transformation pattern loss. The experimentation is done with three widely used classifiers, HNB, PART, and J48, which are evaluated with accuracy, F-measure, and ROC-AUC, well-established measures of classification performance. Our experimental analysis reveals the best classifiers on the utility-aware anonymized data sets produced by Cell oriented Anonymization (CoA), Attribute oriented Anonymization (AoA), and Record oriented Anonymization (RoA).
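A sketch of the evaluation loop, with stated substitutions: HNB and PART are Weka classifiers with no scikit-learn equivalents, so a decision tree (close kin of J48/C4.5) and Gaussian naive Bayes stand in here; a binary, numerically encoded data set is assumed so that ROC-AUC applies directly.

```python
import pandas as pd
from sklearn.model_selection import cross_validate
from sklearn.naive_bayes import GaussianNB
from sklearn.tree import DecisionTreeClassifier

SCORING = ["accuracy", "f1_macro", "roc_auc"]   # the three measures used above

def evaluate(anonymized: pd.DataFrame, target: str) -> None:
    """10-fold cross-validated scores on one anonymized data set
    (run once each on the CoA, AoA and RoA outputs).
    Assumes features are numeric: encode generalized values beforehand."""
    X, y = anonymized.drop(columns=[target]), anonymized[target]
    for name, clf in [("tree (~J48)", DecisionTreeClassifier(random_state=0)),
                      ("naive Bayes (HNB stand-in)", GaussianNB())]:
        scores = cross_validate(clf, X, y, cv=10, scoring=SCORING)
        print(name, {s: round(scores[f"test_{s}"].mean(), 3) for s in SCORING})
```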


Geophysics ◽  
2017 ◽  
Vol 82 (2) ◽  
pp. Q1-Q12 ◽  
Author(s):  
Carlos Alberto da Costa Filho ◽  
Giovanni Angelo Meles ◽  
Andrew Curtis

Conventional seismic processing aims to create data that contain only primary reflections, whereas real seismic recordings also contain multiples. As such, it is desirable to predict, identify, and attenuate multiples in seismic data. This task is more difficult in elastic (solid) media because mode conversions create families of internal multiples not present in the acoustic case. We have developed a method to predict prestack internal multiples in general elastic media based on the Marchenko method and convolutional interferometry. It can be used to identify multiples directly in prestack data or migrated sections, as well as to attenuate internal multiples by adaptively subtracting them from the original data set. We demonstrate the method on two synthetic data sets, the first composed of horizontal density layers and constant velocities, and the second containing horizontal and vertical density and velocity variations. The full-elastic method is computationally expensive and ideally uses data components that are not usually recorded. We therefore test an acoustic approximation to the method on the synthetic elastic data from the second model and find that although the spatial resolution of the resulting image is reduced by this approximation, it provides images with relatively fewer artifacts. We conclude that in most cases where cost is a factor and we are willing to sacrifice some resolution, it may be sufficient to apply the acoustic version of this demultiple method.
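The Marchenko prediction itself is too involved for a short snippet, but the adaptive subtraction step mentioned above is compact: estimate a short least-squares matching filter that shapes the predicted multiples to the data, then subtract. A single-trace sketch under that assumption:

```python
import numpy as np

def adaptive_subtract(data, pred, filt_len=11, eps=1e-3):
    """Least-squares adaptive subtraction of predicted multiples from one
    trace: find a short matching filter f minimising ||data - f * pred||^2,
    then subtract the shaped multiples. A minimal sketch; production codes
    solve this in overlapping space-time windows so the filter can vary."""
    assert filt_len % 2 == 1, "odd filter length keeps zero lag centred"
    n, pad = len(data), filt_len // 2
    padded = np.pad(pred, (pad, pad))
    # columns of P are shifted copies of the prediction, so P @ f == f * pred
    P = np.stack([padded[2 * pad - j : 2 * pad - j + n]
                  for j in range(filt_len)], axis=1)
    # damped normal equations for numerical stability
    f = np.linalg.solve(P.T @ P + eps * np.eye(filt_len), P.T @ data)
    return data - P @ f                      # estimated primaries
```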


2021 ◽  
Vol 12 ◽  
Author(s):  
Haoyang Li ◽  
Juexiao Zhou ◽  
Yi Zhou ◽  
Qiang Chen ◽  
Yangyang She ◽  
...  

Periodontitis is a prevalent and irreversible chronic inflammatory disease in both developed and developing countries, affecting about 20–50% of the global population. A tool for automatically diagnosing periodontitis is in high demand for screening at-risk people, and early detection could prevent the onset of tooth loss, especially in local communities and health care settings with limited dental professionals. In the medical field, doctors need to understand and trust the decisions made by computational models, so developing interpretable models is crucial for disease diagnosis. Based on these considerations, we propose an interpretable method called Deetal-Perio to predict the severity degree of periodontitis in dental panoramic radiographs. In our method, alveolar bone loss (ABL), the clinical hallmark for periodontitis diagnosis, serves as the key interpretable feature. To calculate ABL, we also propose a method for teeth numbering and segmentation. First, Deetal-Perio segments and indexes the individual teeth via Mask R-CNN combined with a novel calibration method. Next, Deetal-Perio segments the contour of the alveolar bone and calculates a ratio for each individual tooth to represent its ABL. Finally, Deetal-Perio predicts the severity degree of periodontitis given the ratios of all the teeth. The macro F1-score and accuracy of the periodontitis prediction task reach 0.894 and 0.896, respectively, on the Suzhou data set, and 0.820 and 0.824, respectively, on the Zhongshan data set. The entire architecture not only outperforms state-of-the-art methods and shows robustness on two data sets in both the periodontitis prediction and the teeth numbering and segmentation tasks, but is also interpretable, allowing doctors to understand why Deetal-Perio works so well.
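Of the three stages, only the last reduces to a few lines; the sketch below maps per-tooth ABL ratios to a severity degree, with a random forest standing in for whatever classifier Deetal-Perio actually uses, and a padding/sorting feature encoding that is our assumption.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

MAX_TEETH = 32    # upper bound on teeth per radiograph

def ratios_to_features(abl_ratios):
    """Fixed-length feature vector from a variable number of teeth:
    sort ratios so the most affected teeth come first and zero-pad the
    missing positions (this encoding is an assumption, not the paper's)."""
    r = np.sort(np.asarray(abl_ratios, dtype=float))[::-1][:MAX_TEETH]
    v = np.zeros(MAX_TEETH)
    v[:len(r)] = r
    return v

def fit_severity(train_ratios, train_degrees):
    """train_ratios: list of per-patient ABL ratio lists (one value per
    segmented tooth); train_degrees: periodontitis severity labels."""
    feats = np.stack([ratios_to_features(r) for r in train_ratios])
    return RandomForestClassifier(random_state=0).fit(feats, train_degrees)
```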

