Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Expectations for implementations of the interface #13

Closed
sethaxen opened this issue Oct 2, 2021 · 10 comments
Closed

Expectations for implementations of the interface #13

sethaxen opened this issue Oct 2, 2021 · 10 comments
Labels
design Package structure and correctness question Inquiries and discussions

Comments

@sethaxen
Copy link
Member

sethaxen commented Oct 2, 2021

Is it expected that all packages that implement the interface use AD.@primitive? e.g., ForwardDiff can use DiffResults to enable returning a value and a gradient and/or hessian. In cases like this, what is advised?

I can see a few possibilities:

  1. define @primitive pushforward_function(...) only to keep things simple
  2. define @primitive pushforward_function(...) but then overload methods like value_and_gradient for which the AD package has a more efficient approach.
  3. implement all necessary interface methods by hand

Edit: I'm guessing (2) is the preferred option? For many (most? all?) AD engines, the fallback to working in terms of jacobians will be much less efficient than overloading the corresponding methods.

@mohamed82008
Copy link
Member

I prefer 1 because it's simple and it works. But yes if 2 is faster, then by all means, define methods to your heart's content.

@devmotion
Copy link
Member

While working on PRs such as #93 and #95 I started to dislike the @primitive macro more and more and became more and more convinced that it would be better to require AD backends to implement the full interface explicitly. Different AD backends often already provide optimized versions of gradient, jacobian, hessian, pullback_function etc., so it is often advantageous to implement these additional functions anyway (related #97). Without a macro it would be much less likely to run into such undesired performance issues and package authors would be required to actually think about how to implement the interface instead of relying on some undocumented default implementations.

Instead of the macro which might result in a sub-optimal implementation of jacobian, I think it would be helpful to provide helper functions that make it easy to define e.g. jacobian in terms of pullback_function, gradient etc. - something like jacobian_via_pullback_function(ad::AbstractBackend, f, xs...) etc. would make it easy for package authors to define the interface without having to think about these lower-level relations. I think it would also be more transparent how the interface is implemented for anyone who reads the code.

@mohamed82008
Copy link
Member

The macro is a feature but it is not compulsory. We should document how to define the interface without the macro though.

@devmotion
Copy link
Member

Sure but if such helper functions would exist, I don't think there's a completely convincing argument for keeping the macro anymore since arguably

derivative(...) = ...
gradient(...) = gradient_via_pullback_function(...)
function pullback_function(...)
    ...
end
function jacobian(...)
    return jacobian_via_pullback_function(...)
end

is clearer than

derivative(...) = ...
gradient(...) = pullback_function(...)(1)
@primitive function pullback_function(...)
    ...
end

and not significantly less concise.

To summarize, I think such utilities are clearer and more powerful than a macro.

@devmotion
Copy link
Member

Some of the existing functionality of the macro is a bit useless anyway: #95 (And even less general than just defining the respective functions since the macro eg. does not support arguments without name)

@mohamed82008
Copy link
Member

I think both can co-exist for now. Then we can deprecate the macro if it's not needed anymore. Although it's a shame because I like the idea of defining one primitive in a macro from which all AD functionality comes to life.

@gdalle gdalle added the design Package structure and correctness label Oct 5, 2023
@gdalle
Copy link
Member

gdalle commented Oct 5, 2023

I think the macro obscures the package code quite a bit, and would be in favor of removing it. At least that way the expectations on AD backends would be clear, with no hidden performance footguns

@gdalle gdalle added the question Inquiries and discussions label Oct 5, 2023
@adrhill
Copy link
Contributor

adrhill commented Jan 8, 2024

I feel like AbstractDifferentiation.jl won't get widespread adoption if there are substantial differences in performance compared to using individual backends directly.

it would be better to require AD backends to implement the full interface explicitly. Different AD backends often already provide optimized versions of gradient, jacobian, hessian, pullback_function etc., so it is often advantageous to implement these additional functions anyway.

@gdalle suggested to me that AbstractDifferentiation could just define an empty interface.

Individual AD backends could then define package extensions on AbstractDifferentiation (or add it as a light-weight dependency). The current approach of having all package extensions in AbstractDifferentiation surprised me, as it must put a lot of burden on the maintainers of AbstractDifferentiation to keep up with changes in a growing amount of backends.

@devmotion
Copy link
Member

devmotion commented Jan 8, 2024

I feel like AbstractDifferentiation.jl won't get widespread adoption if there are substantial differences in performance compared to using individual backends directly.

Many performance issues have been fixed, e.g. #97, but if there are more problems, feel free to open an issue or PR.

Individual AD backends could then define package extensions on AbstractDifferentiation (or add it as a light-weight dependency).

AFAIK the idea was from the start that these concrete implementations should live in the respective AD packages. But they wanted to add code for an interface that is still in flux and not used (much) in the ecosystem, hence currently the extensions are part of AbstractDifferentations.

@mohamed82008
Copy link
Member

I think this issue is largely fixed now. Currently, jacobian is no longer a primitive and @primitive itself is just a thin wrapper over a pushforward or pullback definition. It can probably be phased out in a future release.

Currently, one can simply define a pullback or pushforward as a function and make the backed type a subtype of AbstractForwardMode or AbstractReverseMode. This should give you working default fallbacks. However, the current philosophy of AD.jl is to make it as efficient as its wrapped packages so if the wrapped AD package provides a jacobian function, AD.jacobian should be defined to just call that instead of using its own potentially sub-optimal fallback. Fallbacks are only useful when AD packages don't have their own implementation.

I am closing this issue now, but please open it again or open another one if you think an action needs to be taken.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
design Package structure and correctness question Inquiries and discussions
Projects
None yet
Development

No branches or pull requests

5 participants