Automatic Differentiation tools #1
Description
I think it would be smart to use the great tools already created if they fit our needs, or can be improved to fit them. Here's a summary of what I hope to see in JuliaML, and my first impressions of the options.
I want:
- simple macros/methods to take a complex function and produce a type-stable, fast gradient calculation with respect to all free/learnable parameters (a rough sketch of what I mean follows this list)
- simple definition of new "primitives" (losses, penalties, transformations, etc.) for inclusion in AD
- clean underlying internals (computation graph) so that we can:
  - parallelize
  - plot
  - evolve the structure dynamically
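To make that first bullet concrete, here's a hand-written sketch of the kind of output I'd want a macro to generate for a toy linear model with a squared loss: type-stable, allocation-light, and covering every learnable parameter. No AD package involved, and the names (including the imagined "@grad-style" macro in the comment) are just for illustration.

```julia
# Toy example: loss(w, b) = 0.5 * (sum(w .* x) + b - y)^2
# Hand-derived gradient with respect to the learnable parameters w and b.
# This is what I'd hope an @grad-style macro could generate automatically.
function lossgrad(w, b, x, y)
    r  = sum(w .* x) + b - y   # residual of the prediction
    dw = r * x                 # ∂loss/∂w
    db = r                     # ∂loss/∂b
    return dw, db
end

w, x = randn(5), randn(5)
dw, db = lossgrad(w, 0.0, x, 1.0)
```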
AutoGrad.jl (@denizyuret)
I get the impression that this is very flexible and full-featured, but the speed worries me. It seems that the function g in g = grad(f) performs the forward pass, graph construction, and backward pass all in a single call. Clearly that's not optimal if the graph structure is unchanged from iteration to iteration. If there's a clean way to separate graph construction from the backward gradient calculation, then this package might be really nice. I don't know enough about it to comment on the flexibility of the internal representation, though.
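For reference, here's the minimal usage pattern as I understand it from the README (assuming the exported grad function; I may well be missing an option that caches the tape):

```julia
using AutoGrad

# A scalar loss; grad(f) differentiates with respect to the first argument.
f(w, x, y) = sum((w .* x .- y).^2)

g = grad(f)                    # g returns ∂f/∂w
w, x, y = randn(3), randn(3), randn(3)
dw = g(w, x, y)                # my understanding: this records the tape
                               # (forward pass) and runs the backward pass
                               # on every single call
```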
ForwardDiff (@jrevels @lbenet)
Forward-mode cost scales with the number of inputs, so computing gradients with respect to thousands (or millions) of learnable parameters doesn't seem practical for anything beyond toy problems. Unless I'm missing something?
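To be fair, the API itself is clean (a minimal sketch below, using the ForwardDiff.gradient entry point); the issue is just that the work grows with the length of the input vector:

```julia
using ForwardDiff

x, y = randn(1000), randn(1000)
loss(w) = sum((w .* x .- y).^2)        # closure over the data

dw = ForwardDiff.gradient(loss, randn(1000))
# Each forward pass only carries a small "chunk" of partial derivatives,
# so the total cost scales with length(w) -- fine for 10 parameters,
# painful for 10^6.
```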
ReverseDiffSource
My gut is that it will be possible to evolve, manipulate, and plot the underlying graph, since it's built explicitly from the expression (as opposed to being traced at runtime). I worry about type stability, though... I think I remember reading about looping through Vector{Any} objects in hot loops.
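If I'm reading the README right, usage is roughly as below (hedged: I haven't tried it, and the exact rdiff keywords are from memory). The appealing part is that the expression-to-expression step happens once, up front, so the graph could in principle be inspected, modified, or plotted before we ever enter a hot loop:

```julia
using ReverseDiffSource

# rdiff takes an expression plus example values for the variables (given as
# keyword arguments, if I remember the docs correctly) and returns a new
# expression that computes the value and its derivatives.
ex  = :( sum((w - y).^2) )
dex = rdiff(ex, w=zeros(3), y=zeros(3))

# The generated expression can then be spliced into an ordinary function,
# so graph construction is paid once, outside the training loop.
@eval lossandgrad(w, y) = $dex
```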
ReverseDiffOverload
The last commit was 2 years ago.
ReverseDiffSparse (@mlubin)
I don't know much about this yet
Please add your thoughts and opinions.