The Complexity of Counting Problems Over Incomplete Databases

Marcelo Arenas; Pablo Barceló; Mikaël Monet

doi:10.1145/3461642

The Complexity of Counting Problems Over Incomplete Databases

ACM Transactions on Computational Logic ◽

10.1145/3461642 ◽

2021 ◽

Vol 22 (4) ◽

pp. 1-52

Author(s):

Marcelo Arenas ◽

Pablo Barceló ◽

Mikaël Monet

Keyword(s):

Polynomial Time ◽

Relational Databases ◽

Approximation Scheme ◽

Query Languages ◽

Complexity Classes ◽

Conjunctive Query ◽

Counting Problems ◽

Incomplete Databases ◽

Boolean Query ◽

The Impact

We study the complexity of various fundamental counting problems that arise in the context of incomplete databases, i.e., relational databases that can contain unknown values in the form of labeled nulls. Specifically, we assume that the domains of these unknown values are finite and, for a Boolean query q, we consider the following two problems: Given as input an incomplete database D, (a) return the number of completions of D that satisfy q; or (b) return the number of valuations of the nulls of D yielding a completion that satisfies q. We obtain dichotomies between #P-hardness and polynomial-time computability for these problems when q is a self-join–free conjunctive query and study the impact on the complexity of the following two restrictions: (1) every null occurs at most once in D (what is called Codd tables); and (2) the domain of each null is the same. Roughly speaking, we show that counting completions is much harder than counting valuations: For instance, while the latter is always in #P, we prove that the former is not in #P under some widely believed theoretical complexity assumption. Moreover, we find that both (1) and (2) can reduce the complexity of our problems. We also study the approximability of these problems and show that, while counting valuations always has a fully polynomial-time randomized approximation scheme (FPRAS), in most cases counting completions does not. Finally, we consider more expressive query languages and situate our problems with respect to known complexity classes.
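The difference between the two counting problems in this abstract can be seen on a tiny instance. The sketch below is a brute-force illustration only, not the paper's algorithms: it enumerates every valuation, so it is exponential in the number of nulls. The encoding is an assumption made for this example (facts as `(relation, args)` pairs, nulls as strings starting with `?`, the query as a Python predicate), and the function name is hypothetical.

```python
from itertools import product


def count_valuations_and_completions(facts, domains, query):
    """Brute-force both counts for a Boolean query (illustrative only).

    facts:   list of (relation_name, args) pairs; args may contain nulls
    domains: dict mapping each null to its finite list of candidate values
    query:   predicate taking a completion (a frozenset of facts)
    """
    nulls = sorted(domains)
    satisfying_valuations = 0
    satisfying_completions = set()
    for values in product(*(domains[n] for n in nulls)):
        valuation = dict(zip(nulls, values))
        # Applying a valuation yields a completion; as a set of facts,
        # duplicates collapse, so several valuations may share one completion.
        completion = frozenset(
            (rel, tuple(valuation.get(arg, arg) for arg in args))
            for rel, args in facts
        )
        if query(completion):
            satisfying_valuations += 1
            satisfying_completions.add(completion)
    return satisfying_valuations, len(satisfying_completions)


# Hypothetical Codd table R(?x), R(?y) with the same domain {a, b} for both nulls.
facts = [("R", ("?x",)), ("R", ("?y",))]
domains = {"?x": ["a", "b"], "?y": ["a", "b"]}
query = lambda db: ("R", ("a",)) in db  # Boolean query: "R(a) holds"

result = count_valuations_and_completions(facts, domains, query)
print(result)  # (3, 2)
```

Here three valuations satisfy the query, but two of them (x=a, y=b and x=b, y=a) produce the same completion {R(a), R(b)}, so only two distinct completions are counted.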

On the Complexity of the Smallest Grammar Problem over Fixed Alphabets

Theory of Computing Systems ◽

10.1007/s00224-020-10013-w ◽

2020 ◽

Author(s):

Katrin Casel ◽

Henning Fernau ◽

Serge Gaspers ◽

Benjamin Gras ◽

Markus L. Schmid

Keyword(s):

Polynomial Time ◽

Approximation Scheme ◽

Time Algorithm ◽

Polynomial Time Approximation Scheme ◽

Context Free Grammar ◽

Constant Size ◽

Size Measure ◽

The Right ◽

The Impact ◽

Context Free

In the smallest grammar problem, we are given a word w and we want to compute a preferably small context-free grammar G for the singleton language {w} (where the size of a grammar is the sum of the sizes of its rules, and the size of a rule is measured by the length of its right side). It is known that, for unbounded alphabets, the decision variant of this problem is NP-hard and the optimisation variant does not allow a polynomial-time approximation scheme, unless P = NP. We settle the long-standing open problem whether these hardness results also hold for the more realistic case of a constant-size alphabet. More precisely, it is shown that the smallest grammar problem remains NP-complete (and its optimisation version is APX-hard), even if the alphabet is fixed and has size at least 17. The corresponding reduction is robust in the sense that it also works for an alternative size-measure of grammars that is commonly used in the literature (i.e., a size measure also taking the number of rules into account), and it also allows us to conclude that even computing the number of rules required by a smallest grammar is a hard problem. On the other hand, if the number of nonterminals (or, equivalently, the number of rules) is bounded by a constant, then the smallest grammar problem can be solved in polynomial time, which is shown by encoding it as a problem on graphs with interval structure. However, treating the number of rules as a parameter (in terms of parameterised complexity) yields W[1]-hardness. Furthermore, we present an $\mathcal{O}(3^{|w|})$ exact exponential-time algorithm, based on dynamic programming. These three main questions are also investigated for 1-level grammars, i.e., grammars for which only the start rule contains nonterminals on the right side; thus, investigating the impact of the “hierarchical depth” of grammars on the complexity of the smallest grammar problem. In this regard, we obtain for 1-level grammars similar, but slightly stronger results.
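The size measure defined in this abstract (sum of the lengths of the right sides of all rules) can be made concrete with a small sketch. The dictionary encoding of a grammar, the start symbol `"S"`, and the helper names below are assumptions chosen for illustration; the code only evaluates a given grammar and does not search for a smallest one.

```python
def expand(grammar, symbol="S"):
    """Return the unique word derived from `symbol`.

    Assumes an acyclic grammar with exactly one rule per nonterminal,
    i.e. a grammar for a singleton language {w}.
    """
    rhs = grammar.get(symbol)
    if rhs is None:
        return symbol  # not a key of the grammar, so it is a terminal
    return "".join(expand(grammar, s) for s in rhs)


def grammar_size(grammar):
    """Size = sum over all rules of the length of the right side."""
    return sum(len(rhs) for rhs in grammar.values())


# Hypothetical grammar for w = "abababab": S -> AA, A -> BB, B -> ab
grammar = {"S": ["A", "A"], "A": ["B", "B"], "B": ["a", "b"]}
trivial = {"S": list("abababab")}  # the single rule S -> w

word = expand(grammar)
size = grammar_size(grammar)
print(word, size, grammar_size(trivial))  # abababab 6 8
```

Both grammars generate the same word, but the hierarchical one has size 6 against 8 for the trivial grammar, which is the kind of saving the optimisation problem asks for.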

Knowledge-Preserving Certain Answers for SQL-like Queries

Proceedings of the Seventeenth International Conference on Principles of Knowledge Representation and Reasoning ◽

10.24963/kr.2020/78 ◽

2020 ◽

Author(s):

Etienne Toussaint ◽

Paolo Guagliardo ◽

Leonid Libkin

Keyword(s):

Information Content ◽

Data Model ◽

Incomplete Data ◽

Relational Databases ◽

Missing Values ◽

A Priori ◽

Real Life ◽

Query Languages ◽

Incomplete Databases ◽

Certain Answers

Answering queries over incomplete data is based on finding answers that are certainly true, independently of how missing values are interpreted. This informal description has given rise to several different mathematical definitions of certainty. To unify them, a framework based on "explanations", or extra information about incomplete data, was recently proposed. It partly succeeded in justifying query answering methods for relational databases under set semantics, but had two major limitations. First, it was firmly tied to the set data model, and a fixed way of comparing incomplete databases with respect to their information content. These assumptions fail for real-life database queries in languages such as SQL that use bag semantics instead. Second, it was restricted to queries that only manipulate data, while in practice most analytical SQL queries invent new values, typically via arithmetic operations and aggregation. To leverage our understanding of the notion of certainty for queries in SQL-like languages, we consider incomplete databases whose information content may be enriched by additional knowledge. The knowledge order among them is derived from their semantics, rather than being fixed a priori. The resulting framework allows us to capture and justify existing notions of certainty, and extend these concepts to other data models and query languages. As natural applications, we provide for the first time a well-founded definition of certain answers for the relational bag data model and for value-inventing queries on incomplete databases, addressing the key shortcomings of previous approaches.

A new linear storage, polynomial-time approximation scheme for the subset-sum problem

Discrete Applied Mathematics ◽

10.1016/0166-218x(90)90021-4 ◽

1990 ◽

Vol 26 (1) ◽

pp. 61-77 ◽

Cited By ~ 4

Author(s):

Matteo Fischetti

Keyword(s):

Polynomial Time ◽

Approximation Scheme ◽

Polynomial Time Approximation Scheme ◽

Time Approximation ◽

Subset Sum Problem ◽

Subset Sum ◽

Polynomial Time Approximation

Polynomial-Time Approximation Scheme for a Problem of Searching for the Largest Subset with the Constraint on Quadratic Variation

Lecture Notes in Computer Science - Numerical Computations: Theory and Algorithms ◽

10.1007/978-3-030-40616-5_36 ◽

2020 ◽

pp. 400-405

Author(s):

Vladimir Khandeev

Keyword(s):

Polynomial Time ◽

Approximation Scheme ◽

Quadratic Variation ◽

Polynomial Time Approximation Scheme ◽

Time Approximation ◽

Polynomial Time Approximation

Introducing Databases in Context Through Customizable Visualizations

Frontiers in Education ◽

10.3389/feduc.2021.719134 ◽

2021 ◽

Vol 6 ◽

Author(s):

Suzanne W. Dietrich ◽

Don Goelman ◽

Jennifer Broatch ◽

Sharon Crook ◽

Becky Ball ◽

...

Keyword(s):

Learning Outcomes ◽

Visual Cues ◽

Relational Databases ◽

Unique Feature ◽

Control Group ◽

Attitudes And Beliefs ◽

Learning Tools ◽

Positive Attitudes ◽

The Impact ◽

Assessment Questions

The goal of the Databases for Many Majors project is to engage a broad audience in understanding fundamental database concepts using visualizations with color and visual cues to present these topics to students across many disciplines. There are three visualizations: introducing relational databases, querying, and design. A unique feature of these learning tools is the ability for instructors in diverse disciplines to customize the content of the visualization’s example data, supporting text, and formative assessment questions to promote relevance to their students. This paper presents a study on the impact of the customized introduction to relational databases visualization on both conceptual learning and attitudes towards databases. The assessment was performed in three different courses across two universities. The evaluation shows that learning outcomes are met with any visualization, which appears to be counter to expectations. However, students using a visualization customized to the course context had more positive attitudes and beliefs towards the usefulness of databases than the control group.

Single-machine Pareto-scheduling with multiple weighting vectors for minimizing the total weighted late works

Journal of Industrial and Management Optimization ◽

10.3934/jimo.2021192 ◽

2021 ◽

Vol 0 (0) ◽

pp. 0

Author(s):

Shuen Guo ◽

Zhichao Geng ◽

Jinjiang Yuan

Keyword(s):

Polynomial Time ◽

Single Machine ◽

Approximation Scheme ◽

Pareto Frontier ◽

Dynamic Programming Algorithm ◽

Programming Algorithm ◽

Polynomial Time Approximation Scheme ◽

Late Work ◽

Weighting Vector ◽

Late Works

In this paper, we study the single-machine Pareto-scheduling of jobs with multiple weighting vectors for minimizing the total weighted late works. Each weighting vector has its corresponding weighted late work. The goal of the problem is to find the Pareto-frontier for the weighted late works of the multiple weighting vectors. When the number of weighting vectors is arbitrary, it is implied in the literature that the problem is unary NP-hard. Then we concentrate on our research under the assumption that the number of weighting vectors is a constant. For this problem, we present a dynamic programming algorithm running in pseudo-polynomial time and a fully polynomial-time approximation scheme (FPTAS).
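To illustrate what a Pareto frontier is in this setting, the sketch below filters a list of objective vectors down to the non-dominated ones under minimization. It is a generic quadratic-time filter, not the paper's dynamic programming algorithm or its FPTAS; the sample vectors are made-up values standing in for the weighted late works that different schedules might achieve under two weighting vectors.

```python
def pareto_frontier(points):
    """Return the non-dominated vectors (all objectives minimized), sorted."""

    def dominates(p, q):
        # p dominates q if p is no worse in every objective and differs from q
        return all(a <= b for a, b in zip(p, q)) and p != q

    unique = set(points)
    return sorted(
        p for p in unique if not any(dominates(q, p) for q in unique)
    )


# Hypothetical objective vectors, one per candidate schedule
candidates = [(3, 5), (4, 4), (5, 3), (5, 5), (4, 6)]
frontier = pareto_frontier(candidates)
print(frontier)  # [(3, 5), (4, 4), (5, 3)]
```

Here (5, 5) is dominated by (4, 4) and (4, 6) is dominated by (3, 5), so neither belongs to the frontier.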

BUILDING ONTOLOGIES OVER RELATIONAL DATABASES

International Journal of Research -GRANTHAALAYAH ◽

10.29121/granthaalayah.v6.i11.2018.1123 ◽

2018 ◽

Vol 6 (11) ◽

pp. 254-265

Author(s):

Damitha D Karunaratna

Keyword(s):

Relational Databases ◽

World Wide ◽

Query Languages ◽

Public Access ◽

Prototype System ◽

Data Bases ◽

Domain Specific ◽

The World ◽

Access Data ◽

Relational Data Bases

Relational databases are typically created to fulfil the information requirements of a community of users who generally belong to a single organization. Data stored in these databases are typically accessed by using Structured Query Languages or through customized interfaces. With the popularity of the World Wide Web and the availability of a large number of relational databases for public access, there is a need for users to retrieve data from these databases by using text-based queries, possibly by using the terms that they are familiar with. However, the inherent limitations of the Structured Query Languages used to create and access data in relational databases do not allow users to access data by using text-based queries. Also, the terms used in queries must be limited to those used during the construction of the databases. This paper proposes an architecture to generate ontologies over relational databases and shows how they could be enhanced semantically by using available domain-specific or top-level ontologies so that the data managed by the databases can be accessed by using text-based queries. The feasibility of the proposed architecture was demonstrated by building a prototype system over a sample MySQL database.

A polynomial time approximation scheme for euclidean minimum cost k-connectivity

Automata, Languages and Programming - Lecture Notes in Computer Science ◽

10.1007/bfb0055093 ◽

1998 ◽

pp. 682-694 ◽

Cited By ~ 5

Author(s):

Artur Czumaj ◽

Andrzej Lingas

Keyword(s):

Polynomial Time ◽

Approximation Scheme ◽

Minimum Cost ◽

Polynomial Time Approximation Scheme ◽

Time Approximation ◽

Polynomial Time Approximation

A Polynomial Time Approximation Scheme for the Square Packing Problem

Integer Programming and Combinatorial Optimization - Lecture Notes in Computer Science ◽

10.1007/978-3-540-68891-4_13 ◽

2008 ◽

pp. 184-198 ◽

Cited By ~ 7

Author(s):

Klaus Jansen ◽

Roberto Solis-Oba

Keyword(s):

Polynomial Time ◽

Approximation Scheme ◽

Packing Problem ◽

Polynomial Time Approximation Scheme ◽

Time Approximation ◽

Square Packing ◽

Polynomial Time Approximation

Polynomial Time Approximation Scheme for Connected Vertex Cover in Unit Disk Graph

Combinatorial Optimization and Applications - Lecture Notes in Computer Science ◽

10.1007/978-3-540-85097-7_24 ◽

2008 ◽

pp. 255-264

Author(s):

Zhao Zhang ◽

Xiaofeng Gao ◽

Weili Wu

Keyword(s):

Unit Disk ◽

Polynomial Time ◽

Approximation Scheme ◽

Vertex Cover ◽

Unit Disk Graph ◽

Polynomial Time Approximation Scheme ◽

Time Approximation ◽

Connected Vertex Cover ◽

Connected Vertex ◽

Polynomial Time Approximation
