query estimation Latest Research Papers

Robust Cardinality: a novel approach for cardinality prediction in SQL queries

Journal of the Brazilian Computer Society ◽

10.1186/s13173-021-00115-9 ◽

2021 ◽

Vol 27 (1) ◽

Author(s):

Francisco D. B. S. Praciano ◽

Paulo R. P. Amora ◽

Italo C. Abreu ◽

Francisco L. F. Pereira ◽

Javam C. Machado

Keyword(s):

Estimation Error ◽

Experimental Tests ◽

Query Execution ◽

New Approach ◽

Modern Technique ◽

Lower Estimation ◽

Novel Approach ◽

Execution Engine ◽

Query Estimation ◽

Query Operations

Abstract

Background: Database Management Systems (DBMSs) use a declarative language to execute queries over stored data. The DBMS defines how data will be processed and ultimately retrieved. Therefore, it must choose the best option from the different possibilities based on an estimation process. The optimization process uses estimated cardinalities to make optimization decisions, such as choosing predicate order.

Methods: In this paper, we propose Robust Cardinality, an approach to calculate cardinality estimates of query operations to guide the execution engine of the DBMS to choose the best possible form or at least avoid the worst one. By using machine learning instead of the current histogram heuristics, it is possible to improve these estimates, leading to more efficient query execution.

Results: We perform experimental tests using PostgreSQL, comparing both estimators and a modern technique proposed in the literature. With Robust Cardinality, a lower estimation error for a batch of queries was obtained, and PostgreSQL executed these queries more efficiently than when using the default estimator. We observed a 3% reduction in execution time after reducing the query estimation error by a factor of 4.

Conclusions: From the results, it is possible to conclude that this new approach results in improvements in query processing in DBMSs, especially in the generation of cardinality estimates.
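The abstract contrasts machine learning with histogram heuristics but gives no detail on either. As a rough illustration of what a histogram-based cardinality estimate looks like, the sketch below builds an equi-width histogram over one column and estimates the row count of a range predicate by assuming values are spread uniformly inside each bucket. It also computes the q-error ratio, a metric often used in the cardinality-estimation literature; the abstract does not say which error metric the paper uses, so that choice is an assumption. All function names are hypothetical, and this is neither PostgreSQL's estimator nor the Robust Cardinality method.

```python
def build_histogram(values, num_buckets):
    """Build an equi-width histogram; returns (lo, width, counts)."""
    lo, hi = min(values), max(values)
    # Fall back to a width of 1.0 when every value is identical.
    width = (hi - lo) / num_buckets or 1.0
    counts = [0] * num_buckets
    for v in values:
        # The maximum value would land one past the end, so clamp it.
        idx = min(int((v - lo) / width), num_buckets - 1)
        counts[idx] += 1
    return lo, width, counts


def estimate_range(hist, a, b):
    """Estimate the number of rows with a <= value < b.

    Assumes values are uniformly distributed within each bucket,
    which is exactly where skewed data produces large errors.
    """
    lo, width, counts = hist
    total = 0.0
    for i, count in enumerate(counts):
        bucket_lo = lo + i * width
        bucket_hi = bucket_lo + width
        overlap = max(0.0, min(b, bucket_hi) - max(a, bucket_lo))
        total += count * overlap / width
    return total


def q_error(estimate, actual):
    """Symmetric ratio error; 1.0 means a perfect estimate."""
    est = max(estimate, 1.0)  # guard against division by zero
    act = max(actual, 1.0)
    return max(est / act, act / est)
```

For a skewed column such as `[1] * 90 + list(range(2, 12))` with 10 buckets, the predicate `1 <= value < 1.5` matches 90 rows, while `estimate_range` returns 45, a q-error of 2. Errors of this kind are what a learned estimator would aim to reduce.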

Scalable Correlated Sampling for Join Query Estimations on Big Data

10.29007/87vt ◽

2019 ◽

Author(s):

David Wilson ◽

Wen-Chi Hou ◽

Feng Yu

Keyword(s):

Big Data ◽

Random Samples ◽

Join Queries ◽

Correlated Sampling ◽

Speed Up ◽

Data Environment ◽

Relative Errors ◽

Query Estimation ◽

Simple Selection ◽

Query Length

Estimating query results within limited time constraints is a challenging problem in big data management research. Query estimation based on simple random samples performs well for simple selection queries; however, it returns results with extremely high relative errors for complex join queries. Existing methods only work well with foreign key joins, and the sample size can grow dramatically as the dataset gets larger. This research implements a scalable sampling scheme in a big data environment, namely correlated sampling in map-reduce, that can speed up search query length results, give precise join query estimations, and minimize storage costs when presented with big data. Extensive experiments with large TPC-H datasets in Apache Hive show that our sampling method produces fast and accurate query estimations on big data.
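The abstract names correlated sampling without describing it. A minimal single-process sketch of the general idea, not the authors' map-reduce implementation, follows: each table is sampled by hashing the join key, so a key is either kept in both samples or dropped from both, and the join size of the samples is scaled by 1/p. The hash function, the function names, and the row layout are all assumptions made for illustration.

```python
import hashlib


def key_hash(key):
    """Map a join-key value to a pseudo-uniform number in [0, 1)."""
    digest = hashlib.sha256(repr(key).encode("utf-8")).digest()
    # Keep 53 bits so the quotient is an exact float strictly below 1.0.
    return (int.from_bytes(digest[:8], "big") >> 11) / 2**53


def correlated_sample(rows, key_of, p):
    """Keep a row if and only if the hash of its join key is below p.

    Both tables use the same hash, so their samples agree on which
    join-key values survive.
    """
    return [row for row in rows if key_hash(key_of(row)) < p]


def join_size(left, right, left_key, right_key):
    """Exact equi-join size, computed by counting keys on the right side."""
    counts = {}
    for row in right:
        k = right_key(row)
        counts[k] = counts.get(k, 0) + 1
    return sum(counts.get(left_key(row), 0) for row in left)


def estimate_join_size(left, right, left_key, right_key, p):
    """Join the two correlated samples and scale the result by 1/p."""
    sample_left = correlated_sample(left, left_key, p)
    sample_right = correlated_sample(right, right_key, p)
    return join_size(sample_left, sample_right, left_key, right_key) / p
```

Because each key value survives with probability p in both samples at once, the expected size of the sampled join is p times the true join size, which is why dividing by p gives an unbiased estimate. Independent random samples of each table would instead keep a matching pair with probability p squared, which is one reason they do poorly on joins.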

Efficient histogram-based range query estimation for dirty data

Frontiers of Computer Science ◽

10.1007/s11704-016-5551-1 ◽

2018 ◽

Vol 12 (5) ◽

pp. 984-999

Author(s):

Yan Zhang ◽

Hongzhi Wang ◽

Long Yang ◽

Jianzhong Li

Keyword(s):

Range Query ◽

Dirty Data ◽

Query Estimation

Query estimation structures in geo-social networks

10.14711/thesis-991012615767203412 ◽

2018 ◽

Author(s):

Christos Koutras

Keyword(s):

Social Networks ◽

Query Estimation

A Technique for Efficient Query Estimation over Distributed Data Streams

IEEE Transactions on Parallel and Distributed Systems ◽

10.1109/tpds.2017.2693983 ◽

2017 ◽

Vol 28 (10) ◽

pp. 2770-2783 ◽

Cited By ~ 6

Author(s):

Zubair Shah ◽

Abdun Naser Mahmood ◽

Zahir Tari ◽

Albert Y. Zomaya

Keyword(s):

Data Streams ◽

Distributed Data ◽

Distributed Data Streams ◽

Query Estimation

Online prime labeling and generation of synopsis for XML query estimation

2015 6th International Conference on Information and Communication Systems (ICICS) ◽

10.1109/iacs.2015.7103195 ◽

2015 ◽

Author(s):

Salahadin Mohammed ◽

El-Sayed M. El-Alfy ◽

Ahmad F. Barradah

Keyword(s):

Prime Labeling ◽

Query Estimation

Range query estimation with data skewness for top-k retrieval

Decision Support Systems ◽

10.1016/j.dss.2013.09.005 ◽

2014 ◽

Vol 57 ◽

pp. 258-273 ◽

Cited By ~ 2

Author(s):

Anteneh Ayanso ◽

Paulo B. Goes ◽

Kumar Mehta

Keyword(s):

Range Query ◽

Data Skewness ◽

Query Estimation

Range Query Estimation for Dirty Data Management System

Web-Age Information Management - Lecture Notes in Computer Science ◽

10.1007/978-3-642-32281-5_15 ◽

2012 ◽

pp. 152-164 ◽

Cited By ~ 4

Author(s):

Yan Zhang ◽

Long Yang ◽

Hongzhi Wang

Keyword(s):

Data Management ◽

Management System ◽

Range Query ◽

Data Management System ◽

Dirty Data ◽

Query Estimation

Spatial distance join query estimation without data access

2011 IEEE International Conference on Computer Science and Automation Engineering ◽

10.1109/csae.2011.5952459 ◽

2011 ◽

Author(s):

Ye Wu ◽

Wei Xiong ◽

Ning Jing ◽

Hongsheng Chen

Keyword(s):

Data Access ◽

Spatial Distance ◽

Query Estimation

Cost Modeling and Range Estimation for Top-k Retrieval in Relational Databases

Theoretical and Practical Advances in Information Systems Development ◽

10.4018/978-1-60960-521-6.ch012 ◽

2011 ◽

pp. 295-315

Author(s):

Anteneh Ayanso ◽

Paulo B. Goes ◽

Kumar Mehta

Keyword(s):

Relational Databases ◽

Estimation Method ◽

Synthetic Data ◽

Cost Modeling ◽

Data Sets ◽

Range Estimation ◽

Mapping Techniques ◽

The Cost ◽

Query Estimation ◽

Query Mapping

Relational databases have increasingly become the basis for a wide range of applications that require efficient methods for exploratory search and retrieval. Top-k retrieval addresses this need and involves finding a limited number of records whose attribute values are the closest to those specified in a query. One of the approaches in the recent literature is query mapping, which converts top-k queries into equivalent range queries that relational database management systems (RDBMSs) normally support. This approach is both simple and practical because it avoids the need for modifications to the query engine, or for specialized data structures and indexing techniques to handle top-k queries separately. This paper reviews existing query-mapping techniques in the literature and presents a range query estimation method based on cost modeling. Experiments on real-world and synthetic data sets show that the cost-based range estimation method performs at least as well as prior methods and avoids the need to calibrate workloads on specific database contents.
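To make the query-mapping idea concrete, the sketch below answers a top-k query using only a range filter: it selects rows inside a box around the query point, widens the box until at least k rows qualify, then ranks the candidates by distance. The initial radius is simply a parameter here, whereas the paper's contribution is estimating that range with a cost model, which this sketch does not attempt. The widening loop, the maximum-difference distance, and all names are assumptions for illustration.

```python
def range_query(rows, center, radius):
    """Rows whose every attribute lies in [center - radius, center + radius]."""
    return [
        row for row in rows
        if all(abs(x - c) <= radius for x, c in zip(row, center))
    ]


def distance(row, center):
    """Largest per-attribute difference, which matches the box-shaped range."""
    return max(abs(x - c) for x, c in zip(row, center))


def top_k_via_range(rows, center, k, initial_radius, growth=2.0, max_rounds=32):
    """Answer a top-k query with range queries only.

    A box of a given radius contains exactly the rows within that
    distance, so once it holds k rows the true top-k are among them.
    """
    if initial_radius <= 0:
        raise ValueError("initial_radius must be positive")
    radius = initial_radius
    candidates = []
    for _ in range(max_rounds):
        candidates = range_query(rows, center, radius)
        # Stop when enough rows qualify or the whole table is covered.
        if len(candidates) >= k or len(candidates) == len(rows):
            break
        radius *= growth  # range was too narrow, so widen and retry
    candidates.sort(key=lambda row: distance(row, center))
    return candidates[:k]
```

Each retry corresponds to another range query against the database, so a good initial radius matters: too small costs extra round trips, too large returns many rows that are then discarded.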
