approximate query processing Latest Research Papers

Accelerating approximate aggregation queries with expensive predicates

Proceedings of the VLDB Endowment ◽ 10.14778/3476249.3476285 ◽ 2021 ◽ Vol 14 (11) ◽ pp. 2341-2354

Author(s): Daniel Kang ◽ John Guibas ◽ Peter Bailis ◽ Tatsunori Hashimoto ◽ Yi Sun ◽ ...

Keyword(s): Neural Networks ◽ Query Processing ◽ Deep Neural Networks ◽ Optimal Allocation ◽ Approximate Query Processing ◽ Approximate Aggregation ◽ Approximate Query ◽ Real World Datasets ◽ Aggregation Queries ◽ Processing Techniques

Researchers and industry analysts are increasingly interested in computing aggregation queries over large, unstructured datasets with selective predicates that are computed using expensive deep neural networks (DNNs). Because these DNNs are expensive and many applications can tolerate approximate answers, analysts are interested in accelerating these queries via approximations. Unfortunately, standard approximate query processing techniques are not applicable because they assume the results of the predicates are available ahead of time. Furthermore, recent work using cheap approximations (i.e., proxies) does not support aggregation queries with predicates. To accelerate aggregation queries with expensive predicates, we develop and analyze ABAE, a query processing algorithm that leverages proxies. ABAE must account for the key challenge that it may sample records that do not satisfy the predicate. To address this challenge, we first use the proxy to group records into strata so that records satisfying the predicate are ideally grouped into few strata. Given these strata, ABAE uses pilot sampling and plugin estimates to sample according to the optimal allocation. We show that ABAE converges at an optimal rate in a novel analysis of stratified sampling with draws that may not satisfy the predicate. We further show that ABAE outperforms baselines on six real-world datasets, reducing labeling costs by up to 2.3×.
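
The stratify, pilot, allocate, and combine steps named in this abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function name abae_sketch, its parameters, the equal-size strata, the allocation weight sqrt(p_k) * sigma_k, and the reuse of pilot draws in the final estimate are all assumptions made for illustration.

```python
import math
import random


def abae_sketch(records, proxy_score, oracle, value, num_strata=4,
                pilot_per_stratum=30, budget=400, seed=0):
    """Estimate AVG(value) over records where oracle(record) is True.

    Hypothetical sketch: `oracle` stands in for the expensive predicate
    and `proxy_score` for the cheap approximation.
    """
    rng = random.Random(seed)

    # 1. Stratify: sort by proxy score and cut into equal-size strata.
    ordered = sorted(records, key=proxy_score)
    size = math.ceil(len(ordered) / num_strata)
    strata = [ordered[i:i + size] for i in range(0, len(ordered), size)]

    def draw(stratum, n):
        # Sample with replacement; every draw costs one oracle call.
        picks = [rng.choice(stratum) for _ in range(n)]
        return [(oracle(r), value(r)) for r in picks]

    def stats(sample):
        # Plug-in estimates: predicate rate, mean and std of matching values.
        hits = [v for ok, v in sample if ok]
        if not sample or not hits:
            return 0.0, 0.0, 0.0
        p = len(hits) / len(sample)
        mu = sum(hits) / len(hits)
        sigma = math.sqrt(sum((v - mu) ** 2 for v in hits) / len(hits))
        return p, mu, sigma

    # 2. Pilot stage: a fixed number of draws from every stratum.
    samples = [draw(s, pilot_per_stratum) for s in strata]
    pilot = [stats(s) for s in samples]

    # 3. Allocate the remaining budget (assumed weight: sqrt(p_k) * sigma_k).
    weights = [math.sqrt(p) * sigma for p, _, sigma in pilot]
    total_w = sum(weights)
    remaining = max(0, budget - pilot_per_stratum * len(strata))
    for k, stratum in enumerate(strata):
        share = weights[k] / total_w if total_w > 0 else 1 / len(strata)
        samples[k].extend(draw(stratum, int(remaining * share)))

    # 4. Combine per-stratum means, weighted by estimated matching records.
    final = [stats(s) for s in samples]
    num = sum(len(s) * p * mu for s, (p, mu, _) in zip(strata, final))
    den = sum(len(s) * p for s, (p, _, _) in zip(strata, final))
    return num / den if den > 0 else float("nan")
```

With a proxy that ranks matching records last, nearly all of the second-stage budget goes to the strata that contain them, so fewer oracle calls are spent on records that fail the predicate.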

QoS-Aware Approximate Query Processing for Smart Cities Spatial Data Streams

Sensors ◽ 10.3390/s21124160 ◽ 2021 ◽ Vol 21 (12) ◽ pp. 4160

Author(s): Isam Mashhour Al Jawarneh ◽ Paolo Bellavista ◽ Antonio Corradi ◽ Luca Foschini ◽ Rebecca Montanari

Keyword(s): Data Streams ◽ Spatial Data ◽ Data Stream ◽ Smart Cities ◽ Stream Processing ◽ Processing System ◽ Strategic Decision ◽ Approximate Query Processing ◽ Data Stream Processing ◽ Mobility Data

Large amounts of georeferenced data streams arrive daily at stream processing systems, owing to the abundance of affordable Internet of Things (IoT) devices. Practitioners want to exploit these IoT data streams for strategic decision-making. However, mobility data are highly skewed and their arrival rates fluctuate. This poses an extra challenge for data stream processing systems, which are required to achieve pre-specified latency and accuracy goals. In this paper, we propose ApproxSSPS, a system for approximate processing of georeferenced mobility data at scale with quality-of-service guarantees. We focus on stateful aggregations (e.g., means, counts) and top-N queries. ApproxSSPS features a controller that interactively learns the latency statistics and calculates proper sampling rates to meet latency and/or accuracy targets. An overarching trait of ApproxSSPS is its ability to strike a plausible balance between latency and accuracy targets. We evaluate ApproxSSPS on Apache Spark Structured Streaming with real mobility data. We also compare ApproxSSPS against a state-of-the-art online adaptive processing system. Our extensive experiments show that ApproxSSPS can fulfill latency and accuracy targets with varying sets of parameter configurations and load intensities (i.e., transient peaks in data loads versus slow-arriving streams). Moreover, our results show that ApproxSSPS significantly outperforms the baseline. In short, ApproxSSPS is a novel spatial data stream processing system that can deliver accurate results in a timely manner by dynamically specifying the limits on data samples.
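
The idea of a controller that learns latency statistics and adjusts the sampling rate can be sketched as a simple feedback loop. This is a hypothetical illustration, not the ApproxSSPS controller: the class name, the exponential smoothing, the proportional update rule, and the use of min_rate as a stand-in for an accuracy floor are all assumptions.

```python
class SamplingRateController:
    """Hypothetical latency-driven controller for a per-batch sampling rate."""

    def __init__(self, target_latency_s, min_rate=0.05, max_rate=1.0,
                 smoothing=0.5):
        self.target = target_latency_s
        self.min_rate = min_rate      # assumed floor that protects accuracy
        self.max_rate = max_rate
        self.smoothing = smoothing    # weight given to the newest observation
        self.rate = max_rate          # start by processing everything
        self.avg_latency = None

    def update(self, observed_latency_s):
        # Learn the latency statistic as an exponentially weighted average.
        if self.avg_latency is None:
            self.avg_latency = observed_latency_s
        else:
            self.avg_latency = (self.smoothing * observed_latency_s
                                + (1 - self.smoothing) * self.avg_latency)
        # Scale the rate by how far the smoothed latency is from the target,
        # then clamp it to the allowed range.
        if self.avg_latency > 0:
            proposed = self.rate * self.target / self.avg_latency
            self.rate = min(self.max_rate, max(self.min_rate, proposed))
        return self.rate
```

In a micro-batch setting, update would be called once per batch with the measured processing time, and the returned fraction would be used to sample the next batch.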

Combining Aggregation and Sampling (Nearly) Optimally for Approximate Query Processing

Proceedings of the 2021 International Conference on Management of Data ◽ 10.1145/3448016.3457277 ◽ 2021

Author(s): Xi Liang ◽ Stavros Sintos ◽ Zechao Shang ◽ Sanjay Krishnan

Keyword(s): Query Processing ◽ Approximate Query Processing ◽ Approximate Query

Approximate Query Processing over Static Sets and Sliding Windows

Theoretical Computer Science ◽ 10.1016/j.tcs.2021.06.015 ◽ 2021

Author(s): Ran Ben Basat ◽ Seungbum Jo ◽ Srinivasa Rao Satti ◽ Shubham Ugare

Keyword(s): Query Processing ◽ Approximate Query Processing ◽ Sliding Windows ◽ Approximate Query

Towards crowd-aware indoor path planning

Proceedings of the VLDB Endowment ◽ 10.14778/3457390.3457401 ◽ 2021 ◽ Vol 14 (8) ◽ pp. 1365-1377

Author(s): Tiantian Liu ◽ Huan Li ◽ Hua Lu ◽ Muhammad Aamir Cheema ◽ Lidan Shou

Keyword(s): Path Planning ◽ Query Processing ◽ Travel Time ◽ Real Data ◽ Experimental Results ◽ Search Process ◽ Unified Framework ◽ Approximate Query Processing ◽ Approximate Query ◽ Processing Algorithms

Indoor venues accommodate many people who collectively form crowds. Such crowds in turn influence people's routing choices, e.g., people may prefer to avoid crowded rooms when walking from A to B. This paper studies two types of crowd-aware indoor path planning queries. The Indoor Crowd-Aware Fastest Path Query (FPQ) finds a path with the shortest travel time in the presence of crowds, whereas the Indoor Least Crowded Path Query (LCPQ) finds a path encountering the fewest objects en route. To process the queries, we design a unified framework with three major components. First, an indoor crowd model organizes indoor topology and captures object flows between rooms. Second, a time-evolving population estimator derives room populations for a future timestamp to support crowd-aware routing cost computations in query processing. Third, two exact and two approximate query processing algorithms process each type of query. All algorithms are based on graph traversal over the indoor crowd model and use the same search framework with different strategies for updating the populations during the search process. All proposals are evaluated experimentally on synthetic and real data. The experimental results demonstrate the efficiency and scalability of our framework and query processing algorithms.
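
A graph traversal with crowd-dependent costs, as described in this abstract, can be sketched with Dijkstra's algorithm. The sketch below is an illustration under stated assumptions, not the paper's algorithm: the function name, the room graph layout, the static population map, and the linear cost model base * (1 + crowd_penalty * population) are all hypothetical.

```python
import heapq


def fastest_path(graph, population, source, target, crowd_penalty=0.1):
    """Dijkstra over a room graph with an assumed crowd-dependent cost.

    graph: {room: [(neighbor, base_seconds), ...]}
    population: {room: estimated head count}, assumed fixed during the search
    """
    inf = float("inf")
    dist = {source: 0.0}
    prev = {}
    heap = [(0.0, source)]
    while heap:
        d, room = heapq.heappop(heap)
        if room == target:
            break
        if d > dist.get(room, inf):
            continue  # stale heap entry
        for nxt, base in graph.get(room, []):
            # Entering a crowded room is assumed to slow walking linearly.
            cost = base * (1 + crowd_penalty * population.get(nxt, 0))
            nd = d + cost
            if nd < dist.get(nxt, inf):
                dist[nxt] = nd
                prev[nxt] = room
                heapq.heappush(heap, (nd, nxt))
    if target not in dist:
        return None, inf
    # Walk the predecessor links back from the target to rebuild the path.
    path = [target]
    while path[-1] != source:
        path.append(prev[path[-1]])
    path.reverse()
    return path, dist[target]
```

The paper's algorithms additionally update room populations during the search; this sketch keeps them fixed to stay short.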

LAQP: Learning-based approximate query processing

Information Sciences ◽ 10.1016/j.ins.2020.09.070 ◽ 2021 ◽ Vol 546 ◽ pp. 1113-1134

Author(s): Meifan Zhang ◽ Hongzhi Wang

Keyword(s): Query Processing ◽ Approximate Query Processing ◽ Approximate Query

Database Native Approximate Query Processing Based on Machine-Learning

10.1007/978-3-030-87571-8_7 ◽ 2021 ◽ pp. 74-86

Author(s): Yang Duan ◽ Yong Zhang ◽ Jiacheng Wu

Keyword(s): Machine Learning ◽ Query Processing ◽ Approximate Query Processing ◽ Approximate Query

Approximate computation for big data analytics

ACM SIGWEB Newsletter ◽ 10.1145/3447879.3447883 ◽ 2021 ◽ pp. 1-8

Author(s): Shuai Ma ◽ Jinpeng Huai

Keyword(s): Big Data ◽ Data Analytics ◽ Big Data Analytics ◽ Optimal Solution ◽ Approximate Computation ◽ Approximate Query Processing ◽ Data Approximation ◽ Approximation Techniques ◽ Approximate Query ◽ Data Analytic

Over the past few years, research and development have made significant progress in big data analytics. A fundamental issue for big data analytics is efficiency. If the optimal solution is unattainable, unnecessary, or too costly to compute, it is reasonable to trade optimality for a "good" feasible solution that can be computed efficiently. Existing approximation techniques can generally be classified into approximation algorithms, approximate query processing for aggregate SQL queries, and approximate computing for multiple layers of the system stack. In this article, we systematically introduce approximate computation, i.e., query approximation and data approximation, for efficient and effective big data analytics. We explain the ideas and rationales behind query and data approximation, and show that, for certain data analytic tasks, efficiency can be obtained with high effectiveness, and even without sacrificing effectiveness.

Approximate Query Processing for Lambda Architecture

Proceedings of the 6th International Conference on Internet of Things, Big Data and Security ◽ 10.5220/0010465802530261 ◽ 2021

Author(s): Aleksey Burdakov ◽ Uriy Grigorev ◽ Andrey Ploutenko ◽ Oleg Ermakov

Keyword(s): Query Processing ◽ Approximate Query Processing ◽ Lambda Architecture ◽ Approximate Query

APPROXIMATE QUERY PROCESSING TECHNIQUE FOR EXECUTING JOINAGGREGATE QUERIES ON BIG DATA

Indian Journal of Computer Science and Engineering ◽ 10.21817/indjcse/2020/v11i6/201106014 ◽ 2020 ◽ Vol 11 (6) ◽ pp. 719-734

Author(s): Praveen Kumar Sadineni

Keyword(s): Big Data ◽ Query Processing ◽ Processing Technique ◽ Approximate Query Processing ◽ Approximate Query
