Assessing data change in scientific datasets

Author(s):  
Juliane Müller ◽  
Boris Faybishenko ◽  
Deborah Agarwal ◽  
Stephen Bailey ◽  
Chongya Jiang ◽  
...  
2015 ◽  
Vol 28 (8) ◽  
pp. 2546-2563 ◽  
Author(s):  
Cameron Tolooee ◽  
Matthew Malensek ◽  
Sangmi Lee Pallickara

2008 ◽  
pp. 1250-1268
Author(s):  
Cyrus Shahabi ◽  
Mehrdad Jahangiri ◽  
Dimitris Sacharidis

Data analysis systems require range-aggregate query answering over large multidimensional datasets. We provide the framework needed to build a retrieval system that delivers fast answers with progressively increasing accuracy for range-aggregate queries. In addition, with error forecasting, we provide estimates of the accuracy of the generated approximate results. Our framework uses the wavelet transformation of query and data hypercubes. While prior work focused on ordering either the query or the data coefficients, we propose a class of hybrid ordering techniques that exploits both query and data wavelets to answer queries progressively. This work effectively subsumes and extends most current work in which wavelets are used as a tool for approximate or progressive query evaluation. Our experimental results show that, independent of the characteristics of the dataset, data coefficient ordering is, contrary to common belief, the inferior approach. Hybrid ordering, on the other hand, performs best for scientific datasets that are inter-correlated. For an entirely random dataset with no inter-correlation, query ordering is the superior approach.
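The orderings compared above all build on the same primitive: a Haar wavelet decomposition of the data, from which a range-aggregate can be answered approximately by reconstructing from only the largest coefficients. A minimal 1-D sketch of data-coefficient ordering for a range-sum query (an illustration of the general technique, not the authors' implementation):

```python
import numpy as np

def haar_transform(x):
    # Orthonormal 1-D Haar wavelet transform (length must be a power of two).
    out = np.asarray(x, dtype=float).copy()
    n = len(out)
    while n > 1:
        half = n // 2
        a = (out[:n:2] + out[1:n:2]) / np.sqrt(2)  # averages
        d = (out[:n:2] - out[1:n:2]) / np.sqrt(2)  # details
        out[:half], out[half:n] = a, d
        n = half
    return out

def inverse_haar(w):
    # Invert the transform level by level.
    w = np.asarray(w, dtype=float).copy()
    n, N = 1, len(w)
    while n < N:
        a, d = w[:n].copy(), w[n:2 * n].copy()
        rec = np.empty(2 * n)
        rec[0::2] = (a + d) / np.sqrt(2)
        rec[1::2] = (a - d) / np.sqrt(2)
        w[:2 * n] = rec
        n *= 2
    return w

data = np.array([2., 2., 0., 2., 3., 5., 4., 4.])
coeffs = haar_transform(data)
order = np.argsort(-np.abs(coeffs))  # data ordering: largest magnitude first

def approx_range_sum(lo, hi, k):
    # Progressive answer: keep the k largest coefficients, reconstruct, sum.
    kept = np.zeros_like(coeffs)
    kept[order[:k]] = coeffs[order[:k]]
    return inverse_haar(kept)[lo:hi].sum()

exact = data[2:7].sum()  # 14.0
```

Query ordering would instead transform the query's characteristic vector and retain its largest coefficients; hybrid ordering draws on both rankings.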


Mathematics ◽  
2020 ◽  
Vol 8 (6) ◽  
pp. 956 ◽  
Author(s):  
Shahryar Rahnamayan ◽  
Sedigheh Mahdavi ◽  
Kalyanmoy Deb ◽  
Azam Asilian Bidgoli

The ranking of multi-metric scientific achievements is a challenging task. For example, the scientific ranking of researchers relies on two major types of indicators: publication counts and citation counts. Existing approaches focus on how to select proper indicators, considering either a single indicator or a combination of them. The majority of ranking methods combine several indicators, but these methods face a challenging concern: the assignment of suitable/optimal weights to the targeted indicators. Pareto optimality is a measure of efficiency in multi-objective optimization that seeks optimal solutions by considering multiple criteria/objectives simultaneously. The performance of the basic Pareto dominance depth ranking strategy degrades as the number of criteria increases (generally speaking, beyond three criteria). In this paper, a new, modified Pareto dominance depth ranking strategy is proposed that uses dominance metrics obtained from the basic Pareto dominance depth ranking together with sorted statistical metrics to rank scientific achievements. It attempts to find clusters in the compared data by using all indicators simultaneously. Furthermore, we apply the proposed method to the multi-source ranking resolution problem, which is very common today; for example, several worldwide institutions rank the world's universities every year, but their rankings are not consistent. As case studies, the proposed method was used to rank several scientific datasets (i.e., researchers, universities, and countries) as a proof of concept.
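The basic Pareto dominance depth ranking referred to above is the non-dominated sorting used in multi-objective optimization: front 0 holds all non-dominated items, front 1 the items that become non-dominated once front 0 is removed, and so on. A minimal sketch of that baseline (generic non-dominated sorting under the assumption that larger indicator values are better; not the modified strategy the paper proposes):

```python
def dominates(a, b):
    # a dominates b if a is at least as good in every criterion
    # and strictly better in at least one (larger is better here).
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))

def pareto_depth_ranking(points):
    # Assign each point its dominance depth (front index).
    remaining = dict(enumerate(points))
    depth, level = {}, 0
    while remaining:
        # Points not dominated by any other remaining point form the next front.
        front = [i for i in remaining
                 if not any(dominates(remaining[j], remaining[i])
                            for j in remaining if j != i)]
        for i in front:
            depth[i] = level
        for i in front:
            del remaining[i]
        level += 1
    return depth

# Hypothetical researchers scored by (publications, citations).
scores = [(50, 1200), (30, 2000), (40, 800), (20, 500)]
depths = pareto_depth_ranking(scores)
```

Here the first two researchers are mutually non-dominated (each is better on one criterion), so both land in front 0; the paper's contribution layers additional dominance and sorted statistical metrics on top of depths like these to discriminate as the number of criteria grows.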


2018 ◽  
Author(s):  
Xavier Delaunay ◽  
Aurélie Courtois ◽  
Flavien Gouillon

Abstract. The increasing volume of scientific datasets necessitates compression to reduce data storage and transmission costs, particularly for the oceanographic and meteorological datasets generated by Earth observation mission ground segments. These data are mostly produced as NetCDF files. Indeed, the NetCDF-4/HDF5 file formats are widespread in the global scientific community because of the useful features they offer. In particular, HDF5 provides a dynamically loaded filter plugin mechanism that allows users to write filters, such as compression/decompression filters, to process data before it is written to or read from disk. In this work, we evaluate the performance of lossy and lossless compression/decompression methods through NetCDF-4 and HDF5 tools on analytical and real scientific floating-point datasets. We also introduce the Digit Rounding algorithm, a new relative-error-bounded data reduction method inspired by the Bit Grooming algorithm. The Digit Rounding algorithm achieves a high compression ratio while preserving a given number of significant digits in the dataset. It achieves a higher compression ratio than the Bit Grooming algorithm while maintaining similar compression speed.
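To illustrate the general idea behind relative-error-bounded reduction (a simplified sketch, not the published Digit Rounding algorithm): to preserve roughly nsd significant decimal digits, each value's mantissa can be quantized to about nsd·log2(10) bits. This zeroes the trailing mantissa bits, leaving the array far more compressible by a lossless back end such as the HDF5 deflate filter.

```python
import math
import numpy as np

def digit_round(values, nsd):
    # Simplified sketch: quantize each value's mantissa so that roughly
    # `nsd` significant decimal digits survive. The zeroed tail bits make
    # the data much more compressible by a subsequent lossless codec.
    keep_bits = math.ceil(nsd * math.log2(10))  # decimal digits -> binary bits
    out = np.empty(len(values), dtype=np.float64)
    for i, v in enumerate(values):
        if v == 0:
            out[i] = 0.0
            continue
        exp = math.floor(math.log2(abs(v)))     # binary exponent of the value
        scale = 2.0 ** (exp - keep_bits)        # quantization step
        out[i] = round(v / scale) * scale       # round to the nearest step
    return out

x = np.array([3.14159265, 0.00012345678, 12345.6789])
y = digit_round(x, 4)
# Relative error is bounded by about 2**(-keep_bits) per value.
```

Rounding in binary rather than decimal is what makes the tail bits exactly zero, which is the property both Bit Grooming and Digit Rounding exploit for compression.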


Author(s):  
Mojgan Ghanavati ◽  
Raymond K. Wong ◽  
Fang Chen ◽  
Yang Wang

2002 ◽  
Vol 32 (2) ◽  
pp. 165-190 ◽  
Author(s):  
Joan Slottow ◽  
Ali Shahriari ◽  
Michael Stein ◽  
Xiao Chen ◽  
Chris Thomas ◽  
...  
