Nested Loops Revisited Again (2023)

40 points | by feeblefakie 5 days ago

7 comments

Log_out_ 4 days ago
Convert all nested loops and funccalls to one large do whilw loop with chacherowsized bundled input and output in the compiler ?
cryptonector 4 days ago
> Hash joins and sort-merge joins have been considered the algorithms of choice for analytical relational queries in most parallel database systems because of their performance robustness and ease of parallelization. On the other hand, nested loop joins have been considered less attractive and are conservatively used. In this paper, we revisit the potential of nested loop joins [...]
Uh, a hash join is basically a nested loop where first you create a hash table index of one side. When I finally realized that I suddenly understood why certain PG queries were slow!
So right off in the abstract I find this paper suspect.
[-]
- ramchip 3 days ago
  Hash join is not a nested loop, it's two separate loops one after the other. This makes a huge difference since the complexity is O(N+M) rather than O(N*M). See these examples:
  https://vladmihalcea.com/hash-join-algorithm/
  https://vladmihalcea.com/nested-loop-join-algorithm/
  [-]
  - cryptonector 3 days ago
    Oh, I see, thanks. I did screw up my understanding.
gnabgib 5 days ago
(2023)
[-]
- 4 days ago
  [deleted]
4 days ago
[deleted]