Use multiple arguments instead of a tuple for pushforward and pullback function? #53

devmotion · 2022-02-09T21:37:33Z

It seems annoying that the pushforward and pullback function accept tuples of co-tangents instead of multiple arguments. Is there a compelling reason for doing so or was this a design decision that could be changed? In my opinion the main annoyance is that one has to handle the case of tuples of length 1 in a special way (as e.g. in #51) (it also makes it impossible to work with actual single-argument functions that take a tuple as only argument but maybe this is not needed anyway). Arguably it is also cleaner to provide multiple arguments as, well, multiple arguments instead of a tuple.
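To make the complaint concrete, here is a minimal sketch in plain Julia. The pullbacks are written by hand, and every name in it (`f`, `g`, `pullback_tuple_f`, `pullback_varargs_g`, and so on) is hypothetical; none of this is the package's actual API. It only contrasts the two calling conventions for a single-output and a two-output function.

```julia
# Hypothetical functions: f has one output, g returns a tuple of two outputs.
f(x) = 2 .* x
g(x) = (2 .* x, sum(x))

# Tuple convention: the co-tangents always arrive packed in one tuple,
# so the single-output case has to unwrap a tuple of length 1.
pullback_tuple_f(x) = dys -> (2 .* only(dys),)
pullback_tuple_g(x) = dys -> (2 .* dys[1] .+ dys[2],)

# Multiple-argument convention: one positional argument per output,
# so the single-output case needs no special handling.
pullback_varargs_f(x) = dy -> (2 .* dy,)
pullback_varargs_g(x) = (dy1, dy2) -> (2 .* dy1 .+ dy2,)

x = [1.0, 2.0]
pullback_tuple_f(x)(([1.0, 1.0],))      # caller must build the 1-tuple
pullback_varargs_f(x)([1.0, 1.0])       # caller passes the co-tangent directly
pullback_varargs_g(x)([1.0, 0.0], 3.0)  # two outputs, two arguments
```

Both conventions compute the same values; they differ only in how the caller has to package the co-tangents.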

mohamed82008 · 2022-02-09T23:35:36Z

Yes I think this can be considered along with #35.

sethaxen · 2022-02-09T23:38:17Z

> It seems annoying that the pushforward and pullback function accept tuples of co-tangents instead of multiple arguments. Is there a compelling reason for doing so or was this a design decision that could be changed?

While Julia functions may take multiple inputs, no Julia function returns multiple outputs. Instead, they might return a tuple of outputs. The (co)tangent of a tuple is like a tuple itself. FWIW, this is consistent with how ChainRules behaves, hence why in ChainRules, it would be represented as a Tangent, and here it would be represented as a tuple. One could make the case that since AD.jl supports only functions whose inputs and outputs are arrays, then if such a function returns a tuple it can only be interpreted as multiple outputs, but that would be inconsistent at least with ChainRules and Zygote.

> it also makes it impossible to work with actual single-argument functions that take a tuple as only argument but maybe this is not needed anyway

I don't think functions with tuple inputs would be supported anyway.

devmotion · 2022-02-10T00:22:43Z

> While Julia functions may take multiple inputs, no Julia function returns multiple outputs. Instead, they might return a tuple of outputs. The (co)tangent of a tuple is like a tuple itself. FWIW, this is consistent with how ChainRules behaves, hence why in ChainRules, it would be represented as a Tangent, and here it would be represented as a tuple. One could make the case that since AD.jl supports only functions whose inputs and outputs are arrays, then if such a function returns a tuple it can only be interpreted as multiple outputs, but that would be inconsistent at least with ChainRules and Zygote.

Sure, multiple outputs are in fact just a tuple of outputs. But it does not necessarily mean that we have to use a tuple as input to the pullback and pushforward function.

The current design is also not completely consistent with ChainRules: In ChainRules one does not have to consider tuples of co-tangents of length 1 - the pullback function of a function with a single output just takes a single co-tangent without wrapping it as a tuple. Neglecting/not supporting tuples of length 1 would already solve the special case in #51, even if we stick with tuples in case of multiple outputs.

sethaxen · 2022-02-10T09:49:42Z

I think in general AD.jl has a funny relationship with inputs and outputs. Like gradient for a single input returns a tuple, and hessian only supports single inputs and yet still returns a tuple. IMO this should be changed.

The pushforward of a function (talking about the actual pushforward, not the fusion of the pushforward and primal that frule encodes) should be structured the same as the primal in terms of inputs and outputs. The pullback is the adjoint of the pushforward and vice versa, so a useful check of consistency is whether the rules we choose are symmetric.

i.e., these rules would maintain this symmetry, and perhaps they make sense:

The adjoint of a single-argument function returns a single output (not a tuple)
The adjoint of a multi-argument function returns a tuple of outputs
The adjoint of a single-output (non-tuple) function takes a single input
The adjoint of a multi-output (tuple) function takes multiple inputs

This is almost consistent with ChainRules, the key differences being that 1) in ChainRules, the function is treated as an argument, so there are no single-argument functions (or at least, I don't know of any examples where a rule is defined for a 0-argument function), hence all pullbacks return tuples and 2) a function might actually return a tuple directly, so it's not safe to interpret a tuple return value as being multiple outputs.
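A minimal sketch of the four rules above, using hand-written derivatives in plain Julia. The names (`h`, `s`, `pushforward_h`, `pullback_h`, and so on) are hypothetical and not any package's API. The last lines check the adjoint relationship numerically: pairing the co-tangents with the pushforward output gives the same number as pairing the pullback output with the tangents.

```julia
using LinearAlgebra: dot

# Hypothetical two-input, two-output function.
h(x1, x2) = (x1 .* x2, sum(x1))

# Pushforward: structured like the primal (two tangent inputs, tuple of two tangent outputs).
pushforward_h(x1, x2) = (dx1, dx2) -> (dx1 .* x2 .+ x1 .* dx2, sum(dx1))

# Pullback: one co-tangent argument per output, a tuple with one co-tangent per input.
pullback_h(x1, x2) = (dy1, dy2) -> (dy1 .* x2 .+ dy2, dy1 .* x1)

# Hypothetical single-input, single-output function: no tuples on either side.
s(x) = sum(abs2, x)
pushforward_s(x) = dx -> 2 * dot(x, dx)
pullback_s(x) = dy -> 2 .* dy .* x

# Consistency check: <dy, J dx> == <J' dy, dx>.
x1, x2 = [1.0, 2.0], [3.0, 4.0]
dx1, dx2 = [1.0, 0.0], [0.0, 1.0]
dy1, dy2 = [1.0, 1.0], 2.0

out1, out2 = pushforward_h(x1, x2)(dx1, dx2)
x1bar, x2bar = pullback_h(x1, x2)(dy1, dy2)

lhs = dot(dy1, out1) + dy2 * out2
rhs = dot(x1bar, dx1) + dot(x2bar, dx2)
lhs ≈ rhs  # true when the pullback is the adjoint of the pushforward
```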

gdalle · 2023-08-09T13:20:04Z

I could try to give this a shot once #93 is merged

gdalle · 2023-12-21T11:09:51Z

Starting to work on this, and I'm wondering what to do with the lazy derivatives. Only allow them for a single input / output? It's a bit counterintuitive to apply matrix multiplication to a tuple anyway.

devmotion mentioned this issue Aug 9, 2023: Only support tuples and not single inputs/outputs #103 (Closed)

This was referenced Sep 19, 2023: Remove the jacobian and primal_value primitives #95 (Merged); Fix #99 #102 (Closed)

gdalle added the design, question, and help wanted labels Oct 5, 2023

gdalle mentioned this issue Oct 12, 2023: New version timeline #116 (Closed)

gdalle mentioned this issue Feb 5, 2024: Maybe AbstractDifferentiation should shrink to a collection of names? #129 (Open)

adrhill mentioned this issue Feb 6, 2024: Generality JuliaDiff/DifferentiationInterface.jl#6 (Closed)

gdalle mentioned this issue Mar 13, 2024: Comparison with DifferentiationInterface.jl #131 (Open)
