Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix sources of nondeterminism in egglog #439

Merged
merged 6 commits into from
Nov 1, 2024

Conversation

oflatt
Copy link
Member

@oflatt oflatt commented Oct 8, 2024

No description provided.

@oflatt oflatt requested a review from a team as a code owner October 8, 2024 23:50
@oflatt oflatt requested review from Alex-Fischman and removed request for a team October 8, 2024 23:50
@oflatt oflatt marked this pull request as draft October 9, 2024 00:10
@oflatt oflatt changed the title Another iteration over a hashmap, this time in the remove globals pass Fix sources on nondeterminism in egglog Oct 9, 2024
@oflatt
Copy link
Member Author

oflatt commented Oct 9, 2024

@ezrosent, I ended up having to use indexmap in generic join, which might slow down egglog. Should we just merge it for now?
I would love to add testing to egglog that somehow catches nondeterminism as well.

@oflatt oflatt requested a review from ezrosent October 9, 2024 00:17
@yihozhang
Copy link
Collaborator

Can we instead just use a build-time flag which users can use to opt in for determinism?

@oflatt
Copy link
Member Author

oflatt commented Oct 9, 2024

That's a good idea, but perhaps determinism should be opt-out?

@oflatt oflatt requested a review from yihozhang October 9, 2024 20:23
@oflatt oflatt marked this pull request as ready for review October 9, 2024 20:26
@oflatt oflatt changed the title Fix sources on nondeterminism in egglog Fix sources of nondeterminism in egglog Oct 9, 2024
@yihozhang
Copy link
Collaborator

yihozhang commented Oct 10, 2024

Some thoughts:

  • In general, when performance and determinism are in conflict, I think performance should be the default option since most users don't care about determinism (e.g., they don't run snapshot tests), and users who need determinism know that they need it.
  • That being said, in this case, IndexMap may actually help performance since it's faster for iterating through! A quick run of cargo build --release && time target/release/egglog tests/eggcc-extraction.egg 2> /dev/null vs cargo build --release -F nondeterministic && time target/release/egglog tests/eggcc-extraction.egg 2> /dev/null shows that the deterministic version is slightly faster. If this result is consistent then we should just merge this PR without a separate compiler option!
    • However, if in the future performance and determinism are in conflict, I'd still prefer the default build to prioritize performance over determinism.
  • Related to the last one, I noticed that by changing TypeInfo::sorts from HashMap to IndexMap, the time it takes to run eggcc-extraction.egg goes down from 16.6s to 11.8s, which is huge!! This is because one of our bottleneck, as shown by cargo flamegraph, is in eval_lit, which linearly scans the TypeInfo::sorts map. A better implementation should not do a linear scan.

@oflatt
Copy link
Member Author

oflatt commented Oct 15, 2024

Reverted the feature, so this should be good to go!
Excited to see better performance testing in the future.

Copy link
Collaborator

@yihozhang yihozhang left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you!

@Alex-Fischman
Copy link
Collaborator

@oflatt Update to main to get benchmarking!

@oflatt oflatt force-pushed the oflatt-nondeterminism-fix branch from 3f0507f to f10226f Compare October 16, 2024 22:09
Copy link

codspeed-hq bot commented Oct 16, 2024

CodSpeed Performance Report

Merging #439 will degrade performances by 14.5%

Comparing oflatt-nondeterminism-fix (1b550ab) with main (60342af)

Summary

❌ 6 regressions
✅ 2 untouched benchmarks

⚠️ Please fix the performance issues or acknowledge them on CodSpeed.

Benchmarks breakdown

Benchmark main oflatt-nondeterminism-fix Change
eggcc-extraction 4.1 s 4.4 s -7.45%
herbie 282.6 ms 304.8 ms -7.28%
lambda 145.1 ms 159.2 ms -8.89%
math-microbenchmark 3.6 s 4.2 s -14.5%
python_array_optimize 6.1 s 6.9 s -11.25%
typeinfer 408.7 ms 433.5 ms -5.72%

@saulshanabrook
Copy link
Member

I imagine some of the changes on shorter running benchmarks might not be significant - FWIW I just opened a PR to reduce our benchmarks to just the longer more meaningful ones (#444). Otherwise, it seems like there is too much uncertainty introduced by the indeterminacy of the memory allocator.

The slowdown in the math-microbenchmark is possibly more relevant.

@oflatt
Copy link
Member Author

oflatt commented Oct 24, 2024

We should try:

  • using a hash function with a default seed
  • forbid using hash maps except the one defined in a util file

@saulshanabrook
Copy link
Member

reference to ahash as fast hash map with ability to set seed https://docs.rs/ahash/latest/ahash/

@saulshanabrook saulshanabrook mentioned this pull request Oct 24, 2024
@thaliaarchi
Copy link
Contributor

thaliaarchi commented Oct 25, 2024

reference to ahash as fast hash map with ability to set seed https://docs.rs/ahash/latest/ahash/

hashbrown moved from ahash to foldhash in the past release, so that seems to be the new consensus. foldhash has a fixed seed initializer too: foldhash::fast::FixedState::with_seed(0). For comparison, I switched symbol_table (used by egglog) to use this.

@oflatt oflatt force-pushed the oflatt-nondeterminism-fix branch from f10226f to 334f911 Compare October 25, 2024 21:36
Copy link
Collaborator

@Alex-Fischman Alex-Fischman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do we still depend on rustc-hash? We might as well use the same hasher for everything right?

@oflatt
Copy link
Member Author

oflatt commented Oct 25, 2024

Not sure why, but things are still non-deterministic with this change
https://github.com/egraphs-good/eggcc/actions/runs/11525853271/job/32088922445?pr=637

@thaliaarchi
Copy link
Contributor

thaliaarchi commented Oct 25, 2024

Since I updated the hash function in PR#445 Update hashbrown, I'd recommend you rebase onto main. My goal there was to update to the latest versions of the hashing dependencies.

I didn't notice that egglog also uses rustc-hash, so that ought to be updated, since they recently did significant improvements improving both performance and hash strength. It's no longer the Firefox-derived hash function; its collision weaknesses, which were problematic for rustc, have been fixed. Now, rustc-hash is a mildly stripped down version of foldhash, making tradeoffs for usage in a compiler (I've written on the differences between them in my notes). It's probably only worth using one of the two.

Update: fixed in PR#456 Update symbol_table. Max's symbol_table has some upstream changes which haven't been released to crates.io, so the version used by egglog is still on an old version of hashbrown and is also missing performance improvements contributed by someone else. For that, we should either ping him again or switch to a Git dependency on it in Cargo.toml.

Copy link
Collaborator

@Alex-Fischman Alex-Fischman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See Thalia's comment

@ezrosent
Copy link
Contributor

One thought:

Another source of nondeterminism that can happen if you're running more than one instance of egglog in a single process is symbol_table. The numbers symbol_table hands out will not be deterministic (there are various process-global counters being incremented and the like. Different numbers will cause hashmap to store strings or symbols in different spots, hence iteration order will be different too.

@oflatt
Copy link
Member Author

oflatt commented Oct 28, 2024

@thaliaarchi thanks for the suggestions

@ezrosent hmm yeah, that sounds like it would do it. I assumed wrongly that symbol gen was deterministic.

Sounds like I should make a "nondeterministic" feature flag for now, and we can revisit this question at a later date. The best option is probably to somehow sort the resulting serialized egraph, since that would work even with parallel egglog.

@oflatt oflatt force-pushed the oflatt-nondeterminism-fix branch from c9727fc to 798fe04 Compare October 28, 2024 21:54
@oflatt oflatt force-pushed the oflatt-nondeterminism-fix branch from 8593ad5 to 9f34210 Compare October 28, 2024 22:06
@oflatt oflatt requested a review from Alex-Fischman November 1, 2024 18:35
Copy link
Collaborator

@Alex-Fischman Alex-Fischman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Approved, but please add a regression comment

@oflatt
Copy link
Member Author

oflatt commented Nov 1, 2024

Regression is caused by the swap from hashmap to index maps, which have a slower insertion and lookup speed. To recover the performance, use the nondeterministic flag.

Looking to the future, we'd like to remove the dependency on symbol gen library and use the fixed seed solution.

@Alex-Fischman Alex-Fischman merged commit 2c8d947 into main Nov 1, 2024
8 of 9 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants