Introduce @compact macro to easily create custom layers #4
Conversation
Edit: nevermind, looks like a bug in the nightly version of Julia.
It might be nice to add a reserved keyword that defines the string representation. Rather than, e.g.,

julia> print(model)
@Magic(w1 = Dense(n_in, 128), w2 = [Dense(128, 128) for i = 1:nlayers], w3 = Dense(128, n_out), act = relu) do x
  embed = act(w1(x))
  for w = w2
    embed = act(w(embed))
  end
  out = w3(embed)
  return out
end

which might be too much detail when a simple function name would do, it might be nice to allow for model = @Magic(..., name="MLP") ... so that

julia> print(model)
MLP(w1 = Dense(n_in, 128), w2 = [Dense(128, 128) for i = 1:nlayers], w3 = Dense(128, n_out), act = relu)

which would make it a way to define new custom layers for …
Added the name keyword:

model = @Magic(w=randn(32, 32), name="Linear") do x, y
  tmp = sum(w .* x)
  return tmp + y
end
@test string(model) == "Linear(w = randn(32, 32))"

The default string representation is still the verbatim printout of the definition. But if you would like to name the model, such as if you are using this API to build up layers that you want to stack in a Chain, you can now do so.
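For instance, a rough sketch of that use case (hypothetical code, not from this PR, assuming the name keyword behaves as above and that @Magic from this PR is in scope):

using Flux  # @Magic is assumed to be in scope alongside Flux

encoder = @Magic(w = Dense(32 => 64), act = relu, name = "Encoder") do x
  act(w(x))
end
head = @Magic(w = Dense(64 => 10), name = "Head") do x
  w(x)
end

# Stacking the named layers; printing the Chain can then show the short
# names "Encoder" and "Head" instead of the full verbatim definitions.
model = Chain(encoder, head)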
Will look more later, but I have one quick printing comment. The goal in most of Flux is to make the printing re-digestible.
A user could choose to define the name to make it re-digestible in their library of custom layers:

function RectifiedLinear(; n_in, n_out)
  name = "RectifiedLinear(; n_in=$(n_in), n_out=$(n_out))"
  @eval @Magic(w=Dense($n_in, $n_out), name=$name) do x
    relu(w(x))
  end
end

which gives us:

julia> RectifiedLinear(; n_in=3, n_out=5)
RectifiedLinear(; n_in=3, n_out=5)

I'm not sure it is practical to always print out the full string representation used to construct custom layers, especially when things get very complex – otherwise you would be printing out an entire codebase. But you could still choose to do so by not setting name.
Tweaked the printing a bit:

model = @Magic(w=randn(32, 32), name="Linear(...)") do x, y
  tmp = sum(w .* x)
  return tmp + y
end
@test string(model) == "Linear(...)"

Without a name, the verbatim definition is still printed:

julia> m = @Magic(w1=Dense(5 => 100, relu), w2=Dense(100 => 1)) do x
           w2(w1(x))
       end
@Magic(
  w1 = Dense(5 => 100, relu),
  w2 = Dense(100 => 1),
) do x
  w2(w1(x))
end
It might also be nice to rewrite …. That way, you could have printouts like this:

julia> m = @Magic(mlp=Chain(Dense(32 => 100, relu), Dense(100 => 1)), offset=randn(32)) do x
           mlp(x .+ offset)
       end;

julia> println(m)
@Magic(
  mlp=Chain(
    Dense(32 => 100, relu),  # 3_300 parameters
    Dense(100 => 1),  # 101 parameters
  ),
  offset=randn(32),  # 32 parameters
) do x
  mlp(x .+ offset)
end

@mcabbott do you know if there's a way to check if something has been …?
@marius311 proposed tweaking the name (which I am in favor of!) and suggested …. I wonder if something with …
That's a good one too. Or maybe just …. Also, should the macro be capitalized because it's creating an object?
I think you could do something like https://github.com/FluxML/Metalhead.jl/blob/master/src/Metalhead.jl if you want to participate in the Flux fancy show methods.
I have now renamed the macro to be @compact.
- See discussion in FluxML/Flux.jl#2107

Co-authored-by: Michael Abbott <[email protected]>
Co-authored-by: Kyle Daruwalla <[email protected]>
@darsnack @mcabbott I updated the printing to overload show:

julia> model = @compact(w1=Dense(32, 32, relu), w2=Dense(32, 32)) do x
           w2(w1(x))
       end
@compact(
  w1 = Dense(32 => 32, relu),  # 1_056 parameters
  w2 = Dense(32 => 32),  # 1_056 parameters
) do x
  w2(w1(x))
end  # Total: 4 arrays, 2_112 parameters, 8.602 KiB.

It also works inside other Flux models:

julia> Chain(model, Dense(32, 32))
Chain(
  @compact(
    w1 = Dense(32 => 32, relu),  # 1_056 parameters
    w2 = Dense(32 => 32),  # 1_056 parameters
  ) do x
    w2(w1(x))
  end,
  Dense(32 => 32),  # 1_056 parameters
)  # Total: 6 arrays, 3_168 parameters, 12.961 KiB.

Or even with a hierarchy of @compact models:

julia> model1 = @compact(w1=Dense(32=>32, relu), w2=Dense(32=>32, relu)) do x
           w2(w1(x))
       end;

julia> model2 = @compact(w1=model1, w2=Dense(32=>32, relu)) do x
           w2(w1(x))
       end
@compact(
  w1 = @compact(
    w1 = Dense(32 => 32, relu),  # 1_056 parameters
    w2 = Dense(32 => 32, relu),  # 1_056 parameters
  ) do x
    w2(w1(x))
  end,
  w2 = Dense(32 => 32, relu),  # 1_056 parameters
) do x
  w2(w1(x))
end  # Total: 6 arrays, 3_168 parameters, 13.047 KiB.

This is re-digestible too! (For the most part, unless you start passing arrays of ….)
Got arrays working too 🎉

julia> model = @compact(x=randn(5), w=Dense(32=>32)) do s
           x .* s
       end;

julia> model
@compact(
  x = randn(5),  # 5 parameters
  w = Dense(32 => 32),  # 1_056 parameters
) do s
  x .* s
end  # Total: 3 arrays, 1_061 parameters, 4.527 KiB.
if get(io, :typeinfo, nothing) === nothing  # e.g., top level of REPL
    Flux._big_show(io, m)
elseif !get(io, :compact, false)  # e.g., printed inside a Vector, but not a matrix
    Flux._layer_show(io, m)
Should this be overloaded too? What is the difference in _layer_show?
- This is because the size depends on the indentation of the model, which might change in the future (and result in confusing errors!)
In the spirit of Fluxperimental.jl being somewhere to explore and possibly break new APIs, I think we should go ahead. Thanks @MilesCranmer, will merge tomorrow morning if there are no further objections.
No objections from me! Really appreciate this big effort.
Awesome, thanks! Also, could I ask what the expected timeline or required milestones would be for this to eventually join the Flux.jl tree? I've been using it for local stuff and it's extremely useful and has helped me get started quicker.
Ping regarding my question above 🙂 Would be fantastic to have this in the normal Flux.jl library. Others even mentioned they would be interested in using it for non-deep learning tasks.
Edits:
- Added the name keyword for naming custom layers.
- Renamed @Magic to @compact. The macro is also no longer exported, so Flux.@compact should be used in the future.
- Updated the printing to match Chain, Dense, etc., with parameter counts listed and re-digestibility.

Introduction
This creates the @compact macro to easily allow building of complex layers without needing to first create a struct. It is completely compatible with Flux.Chain. The @compact macro specifically was contributed by @mcabbott following a lengthy discussion in FluxML/Flux.jl#2107 between us as well as @ToucheSir and @darsnack. This code is copied here, along with some basic unit tests.

Here are some examples:
Linear model:
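A minimal sketch of what this looks like (the import of the unexported macro from Fluxperimental is an assumption, and this is not necessarily the PR's exact example):

using Flux
using Fluxperimental: @compact

# A linear model: a single weight matrix applied to the input.
model = @compact(w = randn(32, 32)) do x
  w * x
end

model(randn(32))  # forward pass, returns a length-32 vector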
Here is a linear model with bias and activation:
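A sketch along the same lines (assumed, not the verbatim PR example), adding a bias vector and an activation:

# (imports as in the sketch above)
model = @compact(w = randn(32, 32), b = zeros(32), act = relu) do x
  act.(w * x .+ b)  # affine map followed by the stored activation, applied elementwise
end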
Finally, here is a simple MLP:
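A sketch resembling the MLP shown in the printing discussion above:

# (imports as in the sketches above)
n_in, n_out, nlayers = 1, 1, 3

model = @compact(
  w1 = Dense(n_in, 128),
  w2 = [Dense(128, 128) for i in 1:nlayers],
  w3 = Dense(128, n_out),
  act = relu,
) do x
  embed = act(w1(x))
  for w in w2
    embed = act(w(embed))  # apply each hidden layer in turn
  end
  return w3(embed)
end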
We can train this model just like any Chain:
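For example, a standard training loop (a sketch: the toy data, loss, and optimiser choices here are made up, and the explicit-style Flux.setup/Flux.train! API is assumed):

# (imports as in the sketches above; `model` is the MLP from the previous sketch)
data = [([x], [2x - x^3]) for x in -2f0:0.1f0:2f0]  # toy 1-D regression data
opt_state = Flux.setup(Adam(), model)

for epoch in 1:100
  Flux.train!((m, x, y) -> Flux.mse(m(x), y), model, data, opt_state)
end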
Discussion and Motivation
To see detailed discussion on this idea, please see threads FluxML/Flux.jl#2107 and #2.
The key motivation is that, while Chain is a really nice way to build many different complex layers in Flux.jl, it is sometimes significantly easier to write down models as forward functions in regular ol' code.

Most popular deep learning frameworks in Python have a simple and extensible API for creating complex neural network layers, such as PyTorch, where the forward function is a regular Python function that allows arbitrary code (i.e., not Sequential/Chain). However, Flux.jl does not have something like this. The equivalent Flux implementation (without using Chain) would be:
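A sketch of such a struct-based implementation (the Net name matches the text below; the exact fields and constructor are assumptions):

using Flux

struct Net{T1,T2,T3,F}
  w1::T1
  w2::T2
  w3::T3
  act::F
end

Flux.@functor Net  # make the struct's contents visible to Flux as trainable parameters

function (m::Net)(x)
  embed = m.act(m.w1(x))
  for w in m.w2
    embed = m.act(w(embed))
  end
  return m.w3(embed)
end

Net(; n_in, n_out, nlayers) =
  Net(Dense(n_in, 128), [Dense(128, 128) for _ in 1:nlayers], Dense(128, n_out), relu)

model = Net(n_in = 1, n_out = 1, nlayers = 3)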
Compounding the difficulty is the fact that Julia structs cannot be changed without restarting the runtime. So if you are interactively developing a complex neural net, you can't add new parameters to the Net struct without restarting.

This simple @compact macro makes this all go away. Now it's even simpler to build custom layers in Flux.jl than in PyTorch:
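Something like the following (a sketch mirroring the Net struct above, not the PR's verbatim example):

# (imports as in the earlier sketches)
model = @compact(
  w1 = Dense(1, 128),
  w2 = [Dense(128, 128) for _ in 1:3],
  w3 = Dense(128, 1),
  act = relu,
) do x
  embed = act(w1(x))
  for w in w2
    embed = act(w(embed))
  end
  return w3(embed)
end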
or even, for building things quickly,
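for example (again a sketch, with made-up sizes):

model = @compact(w = randn(128, 32), b = zeros(128)) do x
  relu.(w * x .+ b)
end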
This @compact macro is completely compatible with the existing Flux API such as Chain, so it is an easy way to build complex layers inside larger models.

PR Checklist
1-to-1 comparison: