Adding support for ADTypes.jl #88
It's a bit unclear to me in which scenarios this integration would be helpful. Naively, it seems to me that if a package supports only … It's also a bit unclear to me whether such …
It would, I think, be very helpful from the viewpoint of a third package that needs (e.g.) Optimization (SciML) and also wants to do some of its own AD for other parts of its algorithms using AbstractDifferentiation.jl (since ADTypes.jl doesn't provide any functionality). And this third package wants to give its users a choice regarding AD and not depend on specific AD implementations. So currently, the user would then have to specify a … But if …

The goal here would be to avoid the growth of silos in the ecosystem (like we often see in the Python world). We can only take full advantage of multiple dispatch and make packages compose well if we use common types and functions for fundamental things - I would argue that AD falls in that category. A lot of packages currently roll their own AD-backend wrappers (LogDensityProblemsAD.jl, for example, uses …).
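To make the third-package scenario concrete, here is a rough sketch of what such a package could do if a common selector union existed (`ADSelector` and `MyAlgorithm` are hypothetical names, not an existing API):

```julia
import ADTypes
import AbstractDifferentiation as AD

# Hypothetical selector union: accept an AD choice from either system.
const ADSelector = Union{ADTypes.AbstractADType, AD.AbstractBackend}

# Hypothetical third-party algorithm that stores the user's AD choice and
# later forwards it to SciML code (ADTypes) or AD.jl code as appropriate.
struct MyAlgorithm{T<:ADSelector}
    adsel::T
end

MyAlgorithm(ADTypes.AutoForwardDiff())   # SciML-style selector
# MyAlgorithm(AD.ForwardDiffBackend())   # AD.jl-style backend (needs `using ForwardDiff`)
```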
I would argue that it should be one or the other; splitting the convert methods between the two depending on which argument comes first would make the whole thing harder to maintain. I would consider AbstractDifferentiation.jl the natural place to host this because it already has the necessary package extensions. Adding it to ADTypes.jl would mean creating lots of additional extensions there, which I fear would not be accepted.

In a perfect world, we'd have one common abstract API for this, the AD packages would implement it, and we wouldn't need any "extension magic". But I don't think we'll get there anytime soon, so I'd like to create what cohesion we can without breaking anything or requiring major changes across packages (which may then never happen).
A package that only hosts types with no functions seems odd to me. For these types to be useful, they need to be passed to functions which specialise their behaviour according to the input types. I think the path forward should be something like this: …
From what I understand (I'm not affiliated with ADTypes, I come at this more from the user point of view), ADTypes just defines the types, and then SciML packages like Optimization each define internal implementations for them. So ADTypes is not usable by external packages directly - but if they want to use, e.g., Optimization, they may still need those types.
I do agree with you, @mohamed82008. I'm just not sure on what time scale that can happen, so I thought a compatibility layer would be a useful first step.
If the intention is there, it shouldn't take that much time at all. If there is no intention, then it will likely never happen. If there is no intention, you can roll your own conversion functions in your package or in a third-party package which pirates both.
That's what I had hoped to avoid. :-)
I think most of the above action items are for the ADTypes maintainers.
That would be the long-term goal, but ultimately AbstractDifferentiation.jl needs a lot more work to be viable for that in most circumstances. It's just way too slow with way too much overhead. It's also too constrained right now. If you look at, say, the functions needed for MOI wrappers, it's missing most of them. I hope in the future we can use AbstractDifferentiation.jl for everything, but that's way too big of a lift right now, though I would support anyone who wants to try.

In the meantime, we needed an agreed way to talk about AD choices that is higher level and leaves the implementation to the package. At a higher level, there's ForwardDiff, ReverseDiff, Enzyme, Zygote, etc. And within those, ReverseDiff has compiled mode and interpreted mode, Enzyme has forward and reverse, etc. SparseDiffTools, Optimization, OrdinaryDiffEq, etc. all need to evolve away from …

If someone would like to start on the path to unification, the first problem to solve is #41. Doing this even for just Jacobians would enable OrdinaryDiffEq, NonlinearSolve, and a whole lot of other packages to make use of AbstractDifferentiation.jl internally. However, without a way to define and initialize configs, the current setup with FiniteDiff.jl and ForwardDiff.jl is just so much faster in some cases (clocks in at like 10x) that it's simply not viable to adopt right now. So a caching interface is a first step.

But then the next step would be that AbstractDifferentiation would need to adopt the standard options of the AD libraries. For example, compiled mode on ReverseDiff.jl is used a lot in SciMLSensitivity.jl (in conjunction with a Cassette pass to prove its correctness), and so it cannot adopt AbstractDifferentiation for vjp code until all of these options are there. If those pieces were there, Optimization.jl would probably still have to carry around some custom code for defining some weird MOI-specific functions until they get upstreamed, but we'd be in a much better spot.

But for now, there's a bit too much of a lift and no one to solve it, so we'll need to take a shortcut for a few years. Also, I'm not convinced that in 3 years we will still need anything other than Enzyme, so investing too much time into solving this unification isn't something I would personally want to spend my limited free time on. So AbstractDifferentiation is a nice high-level API as is, and is good to recommend to many individuals, but optimized libraries need something else until it either catches up to all of the tricks that can be done, or Enzyme just becomes the standard.
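For reference, the ReverseDiff.jl compiled mode mentioned above looks roughly like this (a minimal sketch with an illustrative function; the point is the tape reuse, which AD.jl currently has no way to express):

```julia
using ReverseDiff

f(x) = sum(abs2, x)
x = rand(10)

# Record f's operations once and compile the tape for fast repeated replay.
tape = ReverseDiff.GradientTape(f, x)
ctape = ReverseDiff.compile(tape)

# Gradient calls now reuse the compiled tape instead of re-tracing f.
out = similar(x)
ReverseDiff.gradient!(out, ctape, x)
```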
It would be nice to open issues with specific examples where AD.jl introduces too much overhead or doesn't provide an API that's needed. I think I get the general idea, but we need issues with MWEs and discussions to give new potential contributors a chance to improve the status quo. @oschulz are you willing to put in some time?
I pointed to the issue and gave step-by-step instructions. It's rather straightforward: there's just no config type handling in AD.jl, while the ForwardDiff.jl/FiniteDiff.jl calls with a pre-built config don't allocate the config, and that can cut a good chunk of time out. Adding a caching interface to AD.jl that is able to carry forward options like chunk-size choices and tags probably wouldn't take more than a week, but someone still has to do it. So solving #41 means making an init function, but that init may be function-specific until the tricks from PreallocationTools are added. It would be a starting point, though.
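A minimal sketch of the pre-built config pattern being referred to (function, input size, and chunk size are illustrative):

```julia
using ForwardDiff

f(x) = sum(abs2, x)
x = rand(100)

# Without a config, every call sets up the dual-number work buffers anew:
ForwardDiff.gradient(f, x)

# Pre-building the config once (here with an explicit chunk size) lets
# repeated gradient calls reuse those buffers:
cfg = ForwardDiff.GradientConfig(f, x, ForwardDiff.Chunk{10}())
out = similar(x)
ForwardDiff.gradient!(out, f, x, cfg)
```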
I don't think I'm expert enough regarding the issues @ChrisRackauckas mentioned. I do get that packages, or systems of packages, may need to use their own specialized AD implementations beyond the default backend bindings that AbstractDifferentiation provides. At the same time, the AbstractDifferentiation machinery can certainly be useful for many generic applications.

As for Enzyme (which I like very much), I hope it has a very bright future and may really become the default AD. But there may always be other ways of performing AD, so having a selection mechanism will probably still be useful in the future (my guess).

My goal here is making things easy for AD users, and for packages depending on packages that use AD - by adding a non-intrusive way that allows them to use either/both AbstractDifferentiation and ADTypes backend types to specify AD. My proposal above would have no measurable load-time impact on AbstractDifferentiation and ADTypes, would be usable right now, and could always go away if and when the two become one.
I went with your advice and made AutoDiffOperators.jl. It supports AD "selectors" from both AbstractDifferentiation.jl and ADTypes.jl (so far only a few). The backend implementations are still very unoptimized, though. AutoDiffOperators is meant to be as general as AbstractDifferentiation and is geared towards use cases that mostly deal with flat real vectors and that do both JVP and VJP with the same function (though it also has a function for gradients).
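To illustrate the "JVP and VJP with the same function" use case, here is a deliberately naive sketch built only on AbstractDifferentiation's `jacobian` primitive (this is not the AutoDiffOperators.jl API, which avoids materializing the Jacobian):

```julia
import AbstractDifferentiation as AD

# Naive sketch: build J*v and J'*w closures for the same f at x by
# materializing the dense Jacobian once. Operator-based implementations
# use JVP/VJP primitives instead, so J never has to be formed explicitly.
function jvp_vjp_naive(ad::AD.AbstractBackend, f, x)
    J = only(AD.jacobian(ad, f, x))  # AD.jacobian returns one Jacobian per input
    return v -> J * v, w -> J' * w
end
```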
The SciML ecosystem has had its own "AD-backend" types since before AbstractDifferentiation.jl was created. Those have more recently been exposed in the standalone package ADTypes.jl.
From a user point of view that means we currently have two incompatible ways of specifying which AD to use, and in packages/algorithms/applications that rely on both SciML and non-SciML packages we can't easily offer a way for users to select AD.
SciML is unlikely to switch to AbstractDifferentiation (SciML/ADTypes.jl#8): it would require breaking changes, and some corresponding types in ADTypes.jl and AbstractDifferentiation.jl have different content (so type aliases also won't work).
Speaking from a user point of view again, it would be really nice to have a consistent way of selecting AD across the ecosystem. Could we add bi-directional conversion between `ADTypes.AbstractADType` and `AbstractDifferentiation.AbstractBackend` in AbstractDifferentiation.jl?

ADTypes.jl is extremely lightweight (no dependencies and a load time of 0.7 ms), so we could make it a (non-weak) dependency of AbstractDifferentiation.jl, add a union of the two abstract types in `AbstractDifferentiation` itself (so packages can use that as a field type for algorithm configuration, etc.), and then provide bi-directional `Base.convert` methods for the AD types that do correspond (not all are supported by both "systems" at the moment) in the `AbstractDifferentiation` package extensions.

Would a PR in this direction be welcome?
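As a minimal sketch of what one such conversion pair might look like (assuming it lives in AbstractDifferentiation's ForwardDiff package extension; default chunk/tag handling is glossed over):

```julia
import ADTypes
import AbstractDifferentiation as AD
using ForwardDiff  # makes AD.ForwardDiffBackend available

Base.convert(::Type{AD.AbstractBackend}, ::ADTypes.AutoForwardDiff) =
    AD.ForwardDiffBackend()

Base.convert(::Type{ADTypes.AbstractADType}, ::AD.ForwardDiffBackend) =
    ADTypes.AutoForwardDiff()
```

A downstream package could then accept either kind of selector and normalize internally, e.g. via `convert(AD.AbstractBackend, adsel)`.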
CC @devmotion, @ChrisRackauckas, @Vaibhavdixit02