An illustration of properties of join algorithms. When performing a join between more than two relations on more than two attributes, binary join algorithms such as hash join operate over two relations at a time, and join them on all attributes in the join condition; worst-case optimal algorithms such as generic join operate on a single attribute at a time but join all the relations on this attribute.[1]

A worst-case optimal join algorithm is an algorithm for computing relational joins with a runtime that is bounded by the worst-case output size of the join. Traditional binary join algorithms such as hash join operate over two relations at a time; joins between more than two relations are implemented by repeatedly applying binary joins. Worst-case optimal join algorithms are asymptotically faster in worst case than any join algorithm based on such iterated binary joins.

The first worst-case optimal join algorithm, generic join, was published in 2012.[2] Worst-case optimal join algorithms have been implemented in commercial database systems, including the LogicBlox system.[3][4] Worst-case optimal joins have been applied to build a worst-case optimal algorithm for e-matching.[5]

References

edit

Notes

edit
  1. ^ Wang, Yisu Remy; Willsey, Max; Suciu, Dan (2023-01-27). "Free Join: Unifying Worst-Case Optimal and Traditional Joins". arXiv:2301.10841 [cs.DB].
  2. ^ Ngo, Hung Q.; Porat, Ely; Ré, Christopher; Rudra, Atri (2012-03-08). "Worst-case Optimal Join Algorithms". arXiv:1203.1952 [cs.DB].
  3. ^ Veldhuizen, Todd L. (2013-12-20). "Leapfrog Triejoin: a worst-case optimal join algorithm". arXiv:1210.0481 [cs.DB].
  4. ^ Freitag, Michael; Bandle, Maximilian; Schmidt, Tobias; Kemper, Alfons; Neumann, Thomas (2020-07-01). "Adopting worst-case optimal joins in relational database systems" (PDF). Proceedings of the VLDB Endowment. 13 (12): 1891–1904. doi:10.14778/3407790.3407797. ISSN 2150-8097. S2CID 221115321. Archived from the original on 2024-02-01.
  5. ^ Zhang, Yihong; Wang, Yisu Remy; Willsey, Max; Tatlock, Zachary (2022-01-12). "Relational e-matching". Proceedings of the ACM on Programming Languages. 6 (POPL): 35:1–35:22. arXiv:2108.02290. doi:10.1145/3498696. S2CID 236924583.

Sources

edit
edit

📚 Artikel Terkait di Wikipedia

Join (SQL)

nested loop join, sort-merge join and hash join. Worst-case optimal join algorithms are asymptotically faster than binary join algorithms for joins between

Dijkstra's algorithm

instance-optimal, meaning no correct algorithm can asymptotically relax fewer edges on the same graph instance. Although Dijkstra's algorithm is optimal for

LogicBlox

Stratified negation Aggregate functions Evaluation using a novel worst-case optimal join algorithm. Data constructors Static typing A module system Probabilistic

Red–black tree

coloring properties that constrain how unbalanced the tree can become in the worst case. The properties are designed such that this rearranging and recoloring

Huffman coding

optimal prefix code that is commonly used for lossless data compression. The process of finding or using such a code is Huffman coding, an algorithm developed

Merge sort

one of the first sorting algorithms where optimal speed up was achieved, with Richard Cole using a clever subsampling algorithm to ensure O(1) merge. Other

Matrix multiplication algorithm

idealized case of a fully associative cache consisting of M bytes and b bytes per cache line (i.e. ⁠M/b⁠ cache lines), the above algorithm is sub-optimal for

Prim's algorithm

In computer science, Prim's algorithm is a greedy algorithm that finds a minimum spanning tree for a weighted undirected graph. This means it finds a