
exec: hash joiner #1

Open · changangela wants to merge 7 commits into master

Conversation

@changangela commented Oct 15, 2018

A toy hash joiner for joins on int-int columns: we build the hash table from the left relation (which requires a unique key on the join column) and probe with the right relation.
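For orientation, here is a minimal sketch of the build/probe scheme described above, using a plain Go map instead of the PR's chained hash table; all names here are illustrative:

// hashJoinIntInt joins two relations on a single int column.
// buildKeys must be unique (the left relation); probeKeys may repeat.
// It returns matching (left row, right row) index pairs.
func hashJoinIntInt(buildKeys, probeKeys []int) (out [][2]int) {
	// Build phase: map each unique left key to its row index.
	ht := make(map[int]int, len(buildKeys))
	for i, k := range buildKeys {
		ht[k] = i
	}
	// Probe phase: emit one output pair per matching right row.
	for j, k := range probeKeys {
		if i, ok := ht[k]; ok {
			out = append(out, [2]int{i, j})
		}
	}
	return out
}

For example, hashJoinIntInt([]int{1, 2, 3}, []int{2, 2, 3, 4}) yields [[1 0] [1 1] [2 2]].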

Results of go test -bench=BenchmarkHashJoin:

Using the hashJoinBuilder:

BenchmarkHashJoin/name=random_source/rows=0-8  	   30000	     53737 ns/op
BenchmarkHashJoin/name=random_source/rows=4096-8         	    5000	    325867 ns/op	 804.45 MB/s
BenchmarkHashJoin/name=random_source/rows=16384-8        	    1000	   1405501 ns/op	 746.05 MB/s
BenchmarkHashJoin/name=random_source/rows=262144-8       	     100	  22095146 ns/op	 759.32 MB/s
BenchmarkHashJoin/name=random_source/rows=4194304-8      	       3	 360555462 ns/op	 744.51 MB/s
BenchmarkHashJoin/name=random_source/rows=67108864-8     	       1	10443118639 ns/op	 411.27 MB/s
BenchmarkHashJoin/name=uniformly_distinct_source/rows=0-8         	   30000	     54243 ns/op
BenchmarkHashJoin/name=uniformly_distinct_source/rows=4096-8      	    5000	    287274 ns/op	 912.52 MB/s
BenchmarkHashJoin/name=uniformly_distinct_source/rows=16384-8     	    1000	   1286287 ns/op	 815.20 MB/s
BenchmarkHashJoin/name=uniformly_distinct_source/rows=262144-8    	     100	  21394291 ns/op	 784.19 MB/s
BenchmarkHashJoin/name=uniformly_distinct_source/rows=4194304-8   	       3	 351375238 ns/op	 763.96 MB/s
BenchmarkHashJoin/name=uniformly_distinct_source/rows=67108864-8  	       1	5903473387 ns/op	 727.53 MB/s

Using the hashJoinGroupBuilder:

BenchmarkHashJoin/name=random_source/rows=0-8  	   30000	     53264 ns/op
BenchmarkHashJoin/name=random_source/rows=4096-8         	    5000	    309597 ns/op	 846.73 MB/s
BenchmarkHashJoin/name=random_source/rows=16384-8        	    1000	   1246564 ns/op	 841.17 MB/s
BenchmarkHashJoin/name=random_source/rows=262144-8       	     100	  19342936 ns/op	 867.36 MB/s
BenchmarkHashJoin/name=random_source/rows=4194304-8      	       5	 317785006 ns/op	 844.71 MB/s
BenchmarkHashJoin/name=random_source/rows=67108864-8     	       1	8037002408 ns/op	 534.40 MB/s
BenchmarkHashJoin/name=uniformly_distinct_source/rows=0-8         	   30000	     56612 ns/op
BenchmarkHashJoin/name=uniformly_distinct_source/rows=4096-8      	    5000	    257718 ns/op	1017.17 MB/s
BenchmarkHashJoin/name=uniformly_distinct_source/rows=16384-8     	    2000	   1098873 ns/op	 954.23 MB/s
BenchmarkHashJoin/name=uniformly_distinct_source/rows=262144-8    	     100	  18678830 ns/op	 898.19 MB/s
BenchmarkHashJoin/name=uniformly_distinct_source/rows=4194304-8   	       5	 310974801 ns/op	 863.21 MB/s
BenchmarkHashJoin/name=uniformly_distinct_source/rows=67108864-8  	       1	5158706255 ns/op	 832.57 MB/s

@changangela changed the title from "Hash joiner" to "[wip] exec: hash joiner" on Oct 16, 2018
@changangela changed the title from "[wip] exec: hash joiner" to "exec: hash joiner" on Oct 16, 2018
@jordanlewis (Owner) left a comment:

Very very cool stuff. I think this is looking good for a first cut, but I don't fully understand everything in the implementation yet. Many of my comments are about improving the documentation.

I would recommend going through and adding documentation to each of the main components, cleaning things up as you go. Then I think you should switch the implementation to use the exec.Operator version, so we don't keep diverging from what's in the main repo, and PR it. We can continue the review there.

const hashTableBucketSize = 1 << 16

type hashTableInt struct {
	first []int
@jordanlewis (Owner):

I know the other code in this repo isn't well commented, but we'll need to add comments to all of these fields once we productionize it - otherwise, it'll be impossible for people to understand what's going on.

@jordanlewis (Owner):

You should describe the contract of hashTableInt as well. How is it used? What does it do exactly? At least a few sentences would be helpful.

@jordanlewis (Owner):

Specifically, what are first and next? How do they work? What's the overall structure of the hash table, and what guarantees does it provide?

@changangela (Author):

Thanks for the review! I've started migrating this code out to cockroachdb and I promise there is better documentation in there 😆


func (hashTable *hashTableInt) grow(amount int) {
	hashTable.next = append(hashTable.next, make([]int, amount)...)
	hashTable.keys = append(hashTable.keys, make(intColumn, amount)...)
@jordanlewis (Owner):

This may cause multiple allocations. Depending on whether you want fine-grained control over how much the slice grows, I think the right way to do this is to allocate a new slice if the old slice's capacity is too small to fit the new amount, and copy the old slice into the new one.

I hear Go 1.11 handles this case better (https://go-review.googlesource.com/c/go/+/109517), but unfortunately we're still on 1.10 for other reasons. If this is your bottleneck I'd consider changing it; otherwise I guess we can leave it for now.
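A minimal sketch of the allocate-and-copy approach suggested above; the method name and shape are illustrative, not the PR's code:

// growNextWithCopy grows hashTable.next by amount with at most one
// allocation: it reslices in place if the capacity suffices, and
// otherwise allocates a single larger array and copies. Since the
// table only ever grows, spare capacity still holds its zero values.
func (hashTable *hashTableInt) growNextWithCopy(amount int) {
	newLen := len(hashTable.next) + amount
	if newLen <= cap(hashTable.next) {
		hashTable.next = hashTable.next[:newLen]
		return
	}
	grown := make([]int, newLen)
	copy(grown, hashTable.next)
	hashTable.next = grown
}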

@changangela (Author):

Yeah, that makes sense. The implementation's a little different with the ColVec stuff and different types, which we can discuss later.


// hashJoinerIntInt performs a hash join on two integer columns, where the
// left table is the build relation. It does not support N-N joins.
type hashJoinerIntInt struct {
@jordanlewis (Owner):

nit: I would stick this at the top of the file, since it's the first thing somebody will want to read. The rest of the stuff might even belong in a separate file.


hashJoiner.hashTable = makeHashTableInt(hashTableBucketSize, len(hashJoiner.leftCols))

hashJoiner.build()
@jordanlewis (Owner):

This doesn't belong in Init, which is designed to run before execution starts at all - more of a setup phase than a do-work phase. You should put this into Next behind a conditional that only runs once. In distsql we do this with a little state machine infrastructure.
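A minimal sketch of that run-once guard, assuming a built flag and a probe method on the joiner, and that Next returns a ColBatch per the exec.Operator interface; all three are illustrative:

// Next runs the build phase lazily, exactly once, on the first call,
// which leaves Init as a pure setup phase.
func (hashJoiner *hashJoinerIntInt) Next() ColBatch {
	if !hashJoiner.built {
		hashJoiner.build()
		hashJoiner.built = true
	}
	return hashJoiner.probe()
}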


// build performs the build phase of the hash join using the left relation.
// Different builders use different heuristics for the build phase, allowing us
// to evaluate CPU-memory trade-offs.
@jordanlewis (Owner):

Great idea to have multiple builders! Perhaps we will be able to select a builder at plan time depending on the characteristics of the tables. Do you see any opportunities like that?

@changangela (Author):

Yes, definitely :D I found some cool literature on how to optimize the build phase for various trade-offs... and we will also need different builders/probers when we expand to N-N joins.

valCol := builder.hashTable.values[valIdx]

for i := 0; i < batchSize; i++ {
	valCol[i+builder.totalSize+1] = outCol[i]
@jordanlewis (Owner):

Another +1 - why? It seems like you could get rid of these everywhere, maybe, unless I'm missing something.

@changangela (Author) commented Oct 18, 2018:

Sorry for the lack of documentation :P The whole point of the +1 is that in our hash table, index 0 is reserved to represent the end of a chain. So every row index is offset by 1, such that next[i + 1] holds the next row in the corresponding bucket chain. Then, for consistency, keys and values carry the same offset of 1. This is how the paper implemented it, but now that I think about it, we could just use -1 as the end-of-chain marker.
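To make that concrete, here is a minimal self-contained sketch of the first/next layout with the 1-based offset described above; this is toy code, not the PR's implementation, and it assumes non-negative keys:

// toyHashTable is a chained hash table over int keys. first[bucket]
// holds the 1-based id of the head row of that bucket (0 = empty);
// next[id] holds the 1-based id of the following row in the chain
// (0 = end of chain). Slot 0 of next and keys is reserved, hence the
// +1 offset on row indices.
type toyHashTable struct {
	first []int // indexed by bucket
	next  []int // indexed by 1-based row id
	keys  []int // keys[id] is the key of row id-1
}

func newToyHashTable(numBuckets, numRows int) *toyHashTable {
	return &toyHashTable{
		first: make([]int, numBuckets),
		next:  make([]int, numRows+1),
		keys:  make([]int, numRows+1),
	}
}

// insert prepends 0-based row i with the given key to its bucket chain.
func (ht *toyHashTable) insert(i, key, numBuckets int) {
	id := i + 1 // shift by 1 so that 0 can mean end-of-chain
	bucket := key % numBuckets
	ht.keys[id] = key
	ht.next[id] = ht.first[bucket]
	ht.first[bucket] = id
}

// lookup returns the 0-based row index holding key, or -1 if absent.
func (ht *toyHashTable) lookup(key, numBuckets int) int {
	for id := ht.first[key%numBuckets]; id != 0; id = ht.next[id] {
		if ht.keys[id] == key {
			return id - 1
		}
	}
	return -1
}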

	break
}

builder.insertBatch(flow, eqColIdx, outCols, batchSize)
@jordanlewis (Owner):

There's a slight complication here: you need to examine the selection vector if it's set. Since we don't have a standard way to do that yet, I suggest leaving it out for now, but add a TODO so we make sure this is fixed later.
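For reference, a minimal sketch of the usual columnar pattern for honoring a selection vector when consuming a batch; the helper and its signature are hypothetical, and the repo's batch API may differ:

// consumeBatch visits only the live rows of a column: if sel is
// non-nil it lists the selected row indices, otherwise rows 0..n-1
// are all live.
func consumeBatch(col []int, sel []int, n int, visit func(v int)) {
	if sel != nil {
		for _, i := range sel[:n] {
			visit(col[i])
		}
		return
	}
	for i := 0; i < n; i++ {
		visit(col[i])
	}
}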

builder.hashTable.growNext(builder.totalSize)

for i := 0; i < builder.totalSize; i++ {
	builder.hashTable.insertKey(builder.bucket[i], i + 1)
@jordanlewis (Owner):

It seems like this loop is over the same bounds as the one above. Is there a reason you can't/shouldn't do both these steps in one loop?

@changangela (Author):

I don't think so. I was trying to figure out why the paper's implementation split the build process into hash -> bucket -> insert loops when it could easily have been combined into a single loop. Any idea why they might've done that?
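For illustration, the split build under discussion could look like the following, reusing the toyHashTable sketch above (with ht created via newToyHashTable(numBuckets, len(keys))). One plausible reason papers split the phases, though this is speculation here: the hash-only loop has no cross-iteration dependences, so it is friendlier to vectorization and prefetching than a fused loop.

// buildSplit builds the table in two passes: pass 1 computes every
// bucket, pass 2 links rows into chains. A fused build would do both
// per row; the resulting chains are identical either way.
func buildSplit(ht *toyHashTable, keys []int, numBuckets int) {
	buckets := make([]int, len(keys))
	for i, k := range keys { // pass 1: hash only
		buckets[i] = k % numBuckets
	}
	for i := range keys { // pass 2: insert only
		id := i + 1 // 1-based ids; 0 means end-of-chain
		ht.keys[id] = keys[i]
		ht.next[id] = ht.first[buckets[i]]
		ht.first[buckets[i]] = id
	}
}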

	hashTable.next = append(hashTable.next, make([]int, amount)...)
}

func (hashTable *hashTableInt) insertKey(hashKey int, id int) {
@jordanlewis (Owner):

comment on this - what does insertKey do exactly?
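The body isn't shown in this hunk, but given the first/next chain layout discussed above, a plausible implementation would be the following; this is an inference, not the PR's actual code:

// insertKey links the row with 1-based id into the chain for hashKey's
// bucket by prepending it: the old chain head becomes its successor.
func (hashTable *hashTableInt) insertKey(hashKey int, id int) {
	hashTable.next[id] = hashTable.first[hashKey]
	hashTable.first[hashKey] = id
}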

	hashTable.allocated += amount
}

func (hashTable *hashTableInt) insert(hashKey int, key int) (id int) {
@jordanlewis (Owner):

This seems to be unused by the group implementation - why? Also, please add a comment on what it does.

@changangela (Author):

I wrote this function before implementing the second builder, and it does a bit too much: it also inserts into keys, which we don't want to do if we are preloading everything.

@changangela (Author) left a comment:

Thanks for the review! I'll continue this in the cockroach repo.
