Thesis
On the complexity of queries with intersection joins
- Abstract:
-
This thesis studies the complexity of join processing on interval data. It defines a class of queries, called Conjunctive Queries with Intersections Joins (IJQs). An IJQ is a query in which the variables range both over scalars and intervals with real-valued endpoints. The joins are expressed through intersection predicates; an intersection predicate over a multi-set that consists of both scalars and intervals is a true assertion, if the elements in the multi-set intersect; otherwise, it is false. The class of IJQs includes queries that are often asked in practice.
This thesis introduces techniques for obtaining reductions from the problem of evaluating IJQs to the problem of evaluating Conjunctive Queries with Equality Joins (CQs). The key idea is the rewriting of an intersection predicate over a set of intervals into an equivalent predicate with equality conditions. This rewriting is achieved by building a segment tree where the nodes hierarchically encode intervals using bit-strings. Given a multi-set of intervals, their intersection is captured by certain equality conditions on the encoding of the nodes. Following that, it turns out that the problem of evaluating an IJQ on an input database containing intervals can be reduced to the problem of evaluating a union of CQs on a database containing scalars and vice versa. Such reductions lead to upper and lower bounds for the data complexity of Boolean IJQs, given upper and lower bounds for the data complexity Boolean CQs. The upper bounds are obtained using a reduction called forward reduction, which reduces any Boolean IJQ to a disjunction of Boolean CQs. The lower bounds are obtained by a reduction called backward reduction, in which any Boolean CQ from the aforementioned disjunction is reduced to the input Boolean IJQ. Overall, the two findings suggest that a Boolean IJQ is as difficult as the forward disjunctions' most difficult Boolean CQ. Last but not least, this thesis identifies an interesting subclass of Boolean IJQs that admit quasi-linear time computation in data complexity. They are referred to as $\iota$-acyclic IJQs.
Actions
Authors
Contributors
- Institution:
- University of Oxford
- Division:
- MPLS
- Department:
- Computer Science
- Role:
- Supervisor
- ORCID:
- 0000-0003-2964-0880
- Institution:
- University of Oxford
- Division:
- MPLS
- Department:
- Computer Science
- Role:
- Supervisor
- Funder identifier:
- https://ror.org/0439y7842
- Funding agency for:
- Kormpa, A
- Grant:
- 17000156
- Programme:
- EPSRC ICASE SUPPORTED BY ORDNANCE SURVEY
- DOI:
- Type of award:
- DPhil
- Level of award:
- Doctoral
- Awarding institution:
- University of Oxford
- Language:
-
English
- Keywords:
- Subjects:
- Deposit date:
-
2024-01-09
If you are the owner of this record, you can report an update to it here: Report update to this record