Use multiple arguments instead of a tuple for pushforward and pullback function? #53
Yes, I think this can be considered along with #35.
While Julia functions may take multiple inputs, no Julia function returns multiple outputs. Instead, they might return a tuple of outputs. The (co)tangent of a tuple is like a tuple itself. FWIW, this is consistent with how ChainRules behaves, hence why in ChainRules, it would be represented as a
I don't think functions with tuple inputs would be supported anyway.
Sure, multiple outputs are in fact just a tuple of outputs. But that does not necessarily mean we have to use a tuple as the input to the pullback and pushforward functions. The current design is also not completely consistent with ChainRules: in ChainRules one does not have to consider tuples of co-tangents of length 1, because the pullback function of a function with a single output just takes a single co-tangent without wrapping it in a tuple. Not supporting tuples of length 1 would already solve the special case in #51, even if we stick with tuples in the case of multiple outputs.
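To make the convention described above concrete, here is a minimal sketch in plain Julia. The names `pullback_f` and `pullback_g` are hypothetical and hand-written for illustration; they are not part of AbstractDifferentiation.jl or ChainRules. The sketch only shows the shape of the arguments: a bare co-tangent for a single output, and a tuple-structured co-tangent only when the function itself returns a tuple.

```julia
# Hypothetical hand-written pullbacks, for illustration only.

# f has a single output, so its pullback takes the bare co-tangent `ybar`.
f(x, y) = x * y
pullback_f(x, y) = ybar -> (ybar * y, ybar * x)

# g returns a tuple, so its pullback takes a co-tangent with the same
# tuple structure as the output.
g(x, y) = (x + y, x * y)
function pullback_g(x, y)
    return function (ybar)
        ybar1, ybar2 = ybar
        # Contributions of both outputs to each input.
        return (ybar1 + ybar2 * y, ybar1 + ybar2 * x)
    end
end

pb_f = pullback_f(2.0, 3.0)
pb_g = pullback_g(2.0, 3.0)

pb_f(1.0)          # no 1-tuple wrapping needed
pb_g((0.0, 1.0))   # tuple only because g returns a tuple
```

With this shape there is no length-1 tuple to special-case: `pb_f` is called with a number, and only `pb_g` ever sees a tuple.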
I think in general AD.jl has a funny relationship with inputs and outputs. Like The pushforward of a function (talking about the actual pushforward, not the fusion of the pushforward and primal that i.e., these rules would maintain this symmetry, and perhaps they make sense:
This is almost consistent with ChainRules, the key differences being that 1) in ChainRules, the function is treated as an argument, so there are no single-argument functions (or at least, I don't know of any examples where a rule is defined for a 0-argument function), hence all pullbacks return tuples and 2) a function might actually return a tuple directly, so it's not safe to interpret a tuple return value as being multiple outputs.
I could try to give this a shot once #93 is merged.
I'm starting to work on this and I'm wondering what to do with the lazy derivatives. Only allow them for a single input / output? It's a bit counterintuitive to apply matrix multiplication to a tuple anyway.
It seems annoying that the pushforward and pullback function accept tuples of co-tangents instead of multiple arguments. Is there a compelling reason for doing so or was this a design decision that could be changed? In my opinion the main annoyance is that one has to handle the case of tuples of length 1 in a special way (as e.g. in #51) (it also makes it impossible to work with actual single-argument functions that take a tuple as only argument but maybe this is not needed anyway). Arguably it is also cleaner to provide multiple arguments as, well, multiple arguments instead of a tuple.
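The difference between the two calling conventions can be sketched in plain Julia as follows. All names here (`pushforward_tuple`, `pushforward_args`, and so on) are hypothetical, hand-written closures used only to show the call shapes; they are not the package's actual API.

```julia
# Hypothetical hand-written pushforwards, for illustration only.
h(x, y) = x * y

# Tuple convention: all tangents arrive wrapped in one tuple `ds`.
pushforward_tuple(x, y) = ds -> ds[1] * y + x * ds[2]

# Multiple-argument convention: one tangent per input.
pushforward_args(x, y) = (dx, dy) -> dx * y + x * dy

pf_t = pushforward_tuple(2.0, 3.0)
pf_a = pushforward_args(2.0, 3.0)

pf_t((1.0, 0.0))   # caller builds a tuple
pf_a(1.0, 0.0)     # caller passes tangents directly

# Single-input function: the tuple convention forces a 1-tuple.
sq(x) = x^2
pushforward_sq_tuple(x) = ds -> 2x * ds[1]
pushforward_sq_args(x) = dx -> 2x * dx

pushforward_sq_tuple(3.0)((1.0,))   # length-1 tuple at the call site
pushforward_sq_args(3.0)(1.0)       # plain tangent
```

Under the multiple-argument convention, a function whose only argument is itself a tuple would receive a single tuple-valued tangent, so it would no longer be confused with a function of several inputs.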