Test Enzyme and reexport ADTypes.AutoEnzyme
#1887
Pull Request Test Coverage Report for Build 12181846564 (Coveralls)
Codecov Report: All modified and coverable lines are covered by tests ✅
Additional details and impacted files:
@@            Coverage Diff             @@
##           master    #1887      +/-   ##
==========================================
- Coverage   44.72%   44.46%   -0.26%
==========================================
  Files          22       22
  Lines        1554     1554
==========================================
- Hits          695      691       -4
- Misses        859      863       +4
Also, if you want to disable the warnings, you can set it like so (https://github.com/EnzymeAD/Enzyme.jl/blob/c29e6119c7963ddb22f1363726f762455748e193/src/api.jl#L414).
You may also want to set the version to 0.11.2, since your CI is currently running 0.11.0.
@devmotion this PR (EnzymeAD/Enzyme.jl#914) should fix the immediate issues you see on CI, if you want to try it.
using AdvancedPS: AdvancedPS

include("container.jl")

export @model,
    @varname,
    AutoEnzyme,
Let's export this as Turing.Experimental.AutoEnzyme until Enzyme becomes more stable.
What is the threshold for being considered stable here?
@mhauru and @penelopeysm probably have a lot more experience on this.
My heuristic threshold:
- Enzyme passes all Distributions.jl and Turing.jl tests
- No known segfaults for Enzyme
for a continuous period of 8 weeks.
I've merged the latest master and upgraded to Enzyme v0.12. We are still being held back from v0.13 by Bijectors.jl. There are a number of new test failures, because
I think getting Bijectors.jl to support Enzyme v0.13 has to be the next step. Otherwise, any of the failures we see here might already be fixed on v0.13, which would make minimising and reporting them pointless.
Gentle bump here.
It would be good to address EnzymeAD/Enzyme.jl#1812, #2307 and TuringLang/Bijectors.jl#341 before merging this PR.
TuringLang/Bijectors.jl#341 at the very least needs addressing, because it's currently holding us back from running a recent Enzyme version here, and thus we don't know if the test suite would pass on a recent version. TuringLang/Bijectors.jl#341 passes tests on 1.10, but on 1.11 the Enzyme tests fail because of the accursed extension load order issue. The fix for that is currently waiting on this one: TuringLang/Bijectors.jl#346. EDIT: and consequently on this one: TuringLang/Bijectors.jl#349.
With the Bijectors fix landed, I suppose this is (again) ready to go?
Separately, @yebai, you appear to have removed my permissions to run tests; could that be restored?
I don't know what happened precisely -- some changes were made to the TuringLang repos' permissions to make CI work more robustly.
I'd try to help, but I don't have permission to edit things or rerun CI xD
It should resolve with Mooncake 0.4.54, as that allows for DPPL=0.31.0. Don't know why CI isn't picking up the new version:
└─Mooncake [da2b9cff] log:
  ├─possible versions are: 0.3.0 - 0.4.53 or uninstalled
0.4.54 should have been available a few hours ago.
The registry issue is sorted, now merrily running with the latest Enzyme.
Check ADType: Error During Test at /home/runner/work/Turing.jl/Turing.jl/test/mcmc/hmc.jl:334
Got exception outside of a @test
ArgumentError: Unsupported ADType: ADTypes.AutoEnzyme{Nothing, Nothing}
Stacktrace:
[1] Main.ADUtils.ADTypeCheckContext(adbackend::ADTypes.AutoEnzyme{Nothing, Nothing}, child::DynamicPPL.DefaultContext)
@ Main.ADUtils ~/work/Turing.jl/Turing.jl/test/test_utils/ad_utils.jl:102
[2] macro expansion
@ ~/work/Turing.jl/Turing.jl/test/mcmc/hmc.jl:336 [inlined]
[3] macro expansion
@ /opt/hostedtoolcache/julia/1.11.2/x86/share/julia/stdlib/v1.11/Test/src/Test.jl:1704 [inlined]
[4] macro expansion
@ ~/work/Turing.jl/Turing.jl/test/mcmc/hmc.jl:335 [inlined]
[5] macro expansion
@ /opt/hostedtoolcache/julia/1.11.2/x86/share/julia/stdlib/v1.11/Test/src/Test.jl:1793 [inlined]
[6] top-level scope
@ ~/work/Turing.jl/Turing.jl/test/mcmc/hmc.jl:22
[7] include(fname::String)
@ Main ./sysimg.jl:38
[8] macro expansion
@ ~/.julia/packages/TimerOutputs/6KVfH/src/TimerOutput.jl:237 [inlined]
[9] macro expansion
@ ~/work/Turing.jl/Turing.jl/test/runtests.jl:26 [inlined]
[10] macro expansion
@ /opt/hostedtoolcache/julia/1.11.2/x86/share/julia/stdlib/v1.11/Test/src/Test.jl:1704 [inlined]
[11] macro expansion
@ ~/work/Turing.jl/Turing.jl/test/runtests.jl:56 [inlined]
[12] macro expansion
@ ~/.julia/packages/TimerOutputs/6KVfH/src/TimerOutput.jl:237 [inlined]
[13] macro expansion
@ ~/work/Turing.jl/Turing.jl/test/runtests.jl:54 [inlined]
[14] macro expansion
@ /opt/hostedtoolcache/julia/1.11.2/x86/share/julia/stdlib/v1.11/Test/src/Test.jl:1704 [inlined]
[15] top-level scope
@ ~/work/Turing.jl/Turing.jl/test/runtests.jl:34
[16] include(fname::String)
@ Main ./sysimg.jl:38
[17] top-level scope
@ none:6
[18] eval
@ ./boot.jl:430 [inlined]
[19] exec_options(opts::Base.JLOptions)
@ Base ./client.jl:296
[20] _start()
@ Base ./client.jl:531
Looks like something in Turing needs to be updated?
Fixed the above issue that @wsmoses pointed out. We are seeing a lot of illegal type analysis errors, which I suspect are all instances of EnzymeAD/Enzyme.jl#2169.
So this is indicative of a union, which isn't presently fully supported (at least not without setting Enzyme.API.strictAliasing!(false), which may permit it). Something around here https://github.com/TuringLang/DynamicPPL.jl/blob/2252a9b6012da8e2ac56353770a0f848f6874357/src/abstract_varinfo.jl#L791 is sometimes an int and other times a double. I think this will need to be fixed on the Turing side.
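To make the "sometimes an int and other times a double" pattern concrete, here is a minimal sketch in plain Julia. It is not the DynamicPPL code at the linked line; `unstable_sum` and `stable_sum` are hypothetical names used only for illustration. The first function starts its accumulator as an integer literal, so inference gives a union of an integer and a floating-point type; the second initialises the accumulator from the element type and infers a single concrete type.

```julia
# Hypothetical illustration, not the DynamicPPL code: `acc` starts as an Int
# and becomes a Float64 once a Float64 is added to it.
function unstable_sum(xs)
    acc = 0
    for x in xs
        acc += x
    end
    return acc
end

# Stable variant: the accumulator has the element type from the start.
function stable_sum(xs)
    acc = zero(eltype(xs))
    for x in xs
        acc += x
    end
    return acc
end

# Inferred return types for a Vector{Float64} argument.
unstable_type = only(Base.return_types(unstable_sum, (Vector{Float64},)))
stable_type = only(Base.return_types(stable_sum, (Vector{Float64},)))

println(unstable_type)  # a Union of Int and Float64
println(stable_type)    # Float64
```

Whether removing such a union is enough to avoid the illegal type analysis errors discussed here is an assumption; the sketch only shows how the union arises and one common way to avoid it.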
If

@model function gdemo_copy()
    s ~ InverseGamma(2, 3)
end

fails, and it does, then I assume most Turing models are affected, since they don't really get simpler than that. We could look into trying to chase down that
@wsmoses has something in Enzyme gotten stricter so that these illegal type analysis errors come up more often nowadays? Some of the errors are from tests that already passed at an earlier point.
Ideally, a proper fix should be added to Enzyme instead of requiring packages like Turing.jl / DynamicPPL.jl to work around it. One good reason is that Turing allows arbitrary Julia code inside the
Note: This does not work yet
I opened this PR to make it easier to debug (and possibly fix) issues with Enzyme.
Currently, the following example does not work (note that the snippet does not require the PR, which solely reexports AutoEnzyme at this point):
With Enzyme#main my Julia (1.8.1) segfaults. An incomplete (it filled my whole terminal) output: https://gist.github.com/devmotion/1352197f2354c6fecddd7b778ec4bcf7#file-log-txt
The example works (latest releases of Turing, Enzyme, and ADTypes on Julia 1.10.0) but the following warnings show up: