Allow `Parallel(+, f)(x, y, z)` to work like broadcasting, and enable `Chain(identity, Parallel(+, f))(x, y, z)` #2393

mcabbott · 2024-03-10T20:12:25Z

At present, Parallel accepts multiple layers with one input, but not one layer with multiple inputs. This PR extends it to allow both, much like broadcasting in connection((inputs .|> layers)...).

julia> Parallel(+, inv)(1, 2, 3)  # was an error
1.8333333333333333

julia> (1,2,3) .|> (inv,)
(1.0, 0.5, 0.3333333333333333)
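For readers less familiar with `.|>`, here is a minimal sketch in plain Julia (no Flux needed) of the broadcasting behaviour the description leans on. The variable names are only illustrative, and this shows the Base broadcasting semantics, not the PR's implementation.

```julia
# `.|>` broadcasts the pipe operator, so a length-1 tuple is stretched
# to match the other side, exactly like ordinary broadcasting.

# One function, many inputs (the case this PR wants Parallel to accept):
one_layer = (1, 2, 3) .|> (inv,)      # (1.0, 0.5, 0.3333333333333333)

# Many functions, one input (the case Parallel already accepts):
one_input = (4,) .|> (sqrt, inv)      # (2.0, 0.25)

# Equal lengths pair up element by element:
paired = (1, 4) .|> (inv, sqrt)       # (1.0, 2.0)

# The connection is then applied to the splatted results:
combined = +(one_layer...)            # about 1.8333
```

Mismatched lengths other than 1 (say three inputs against two functions) throw a `DimensionMismatch`, which is the same failure mode as any other broadcast.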

Does this have any unintended side-effects?

PR Checklist

Tests are added
Entry in NEWS.md
Documentation, if applicable

codecov · 2024-03-10T20:32:24Z

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 74.03%. Comparing base (eb6492c) to head (0544711).

❗ Current head 0544711 differs from pull request most recent head 9ee2c69. Consider uploading reports for the commit 9ee2c69 to get more accurate results

Additional details and impacted files

@@             Coverage Diff             @@
##           master    #2393       +/-   ##
===========================================
+ Coverage   43.04%   74.03%   +30.98%     
===========================================
  Files          32       32               
  Lines        1856     1918       +62     
===========================================
+ Hits          799     1420      +621     
+ Misses       1057      498      -559


src/layers/basic.jl

mcabbott · 2024-03-13T02:41:50Z

Here's the complete run-down on where Flux does & doesn't splat at present:

julia> using Flux

julia> pr(x) = begin println("arg: ", x); x end;

julia> pr(x...) = begin println(length(x), " args: ", join(x, " & "), " -> tuple"); x end;

julia> c1 = Chain(pr, pr); ########## simple chain

julia> c1(1)
arg: 1
arg: 1
1

julia> c1((1, 2))
arg: (1, 2)
arg: (1, 2)
(1, 2)

julia> c1(1, 2)
ERROR: MethodError:
Closest candidates are:
  (::Chain)(::Any)

julia> p1 = Parallel(pr, a=pr);  ########## combiner + one layer

julia> p1(1)
arg: 1
arg: 1
1

julia> p1((1, 2))  # one 2-Tuple is NOT accepted, always splatted  --> changed by PR
ERROR: ArgumentError: Parallel with 1 sub-layers can take one input or 1 inputs, but got 2 inputs

julia> p1(1, 2)  # more obvious error  --> changed by PR
ERROR: ArgumentError: Parallel with 1 sub-layers can take one input or 1 inputs, but got 2 inputs

julia> p1((a=1, b=2))  # one NamedTuple is ok
arg: (a = 1, b = 2)
arg: (a = 1, b = 2)
(a = 1, b = 2)

julia> p1((((1,),),))  # splatted many times
arg: 1
arg: 1
1

julia> p2 = Parallel(pr, a=pr, b=pr);  ########## combiner + two layers

julia> p2(1)  # one non-tuple arg is broadcasted
arg: 1
arg: 1
2 args: 1 & 1 -> tuple
(1, 1)

julia> p2(1, 2)  # 2 args sent to 2 layers
arg: 1
arg: 2
2 args: 1 & 2 -> tuple
(1, 2)

julia> p2((1, 2))  # one tuple splatted
arg: 1
arg: 2
2 args: 1 & 2 -> tuple
(1, 2)

julia> p2((a=1, b=2))  # one NamedTuple sent to both
arg: (a = 1, b = 2)
arg: (a = 1, b = 2)
2 args: (a = 1, b = 2) & (a = 1, b = 2) -> tuple
((a = 1, b = 2), (a = 1, b = 2))

julia> p2(((1,2), ((3,4),)))  # only splatted once
arg: (1, 2)
arg: ((3, 4),)
2 args: (1, 2) & ((3, 4),) -> tuple
((1, 2), ((3, 4),))

julia> Chain(pr, p2, pr)((1, 2))  # here earlier layers cannot pass p2 two arguments
arg: (1, 2)
arg: 1
arg: 2
2 args: 1 & 2 -> tuple
arg: (1, 2)
(1, 2)

This PR changes the two error cases above:

julia> p1((1, 2))  # changed by PR
arg: 1
arg: 2
2 args: 1 & 2 -> tuple
(1, 2)

julia> p1(1, 2)  # changed by PR
arg: 1
arg: 2
2 args: 1 & 2 -> tuple
(1, 2)

You could argue that p1((1, 2)) already has a plausible meaning: apply one layer to one input Tuple. But this use of Parallel is really just Chain (or, in this order, ∘). And it's an error at present.

I think p1(1, 2) has no other plausible meaning.

The rule after this PR is:

1. (p::Parallel)(input::Tuple) always splats to p(input...)
2. return combine((inputs .|> layers)...)

Step 1 is unchanged, but step 2 previously allowed only broadcasting of the input. And today, I have a use where I want to broadcast the layer instead (easier than sharing it). That's in fact the 3rd case mentioned here: #1685 (comment) but I think it never worked.
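The two-step rule above can be sketched in plain Julia as follows. `MiniParallel` is a hypothetical stand-in written for illustration, with made-up field names; it is not Flux's `Parallel` and not the code in this PR, and it ignores named layers and error messages.

```julia
# Hypothetical toy type, assumed for illustration only.
struct MiniParallel{F,T<:Tuple}
    connection::F   # the combiner, e.g. + or tuple
    layers::T       # a tuple of callables
end

# Step 1: a single Tuple input is always splatted.
(p::MiniParallel)(input::Tuple) = p(input...)

# Step 2: broadcast inputs against layers, then combine the results.
(p::MiniParallel)(inputs...) = p.connection((inputs .|> p.layers)...)

q1 = MiniParallel(+, (inv,))
q1(1, 2, 3)          # one layer broadcast over three inputs
q1((1, 2, 3))        # the same, after splatting the tuple

q2 = MiniParallel(tuple, (identity, identity))
q2(1)                # one input broadcast over two layers: (1, 1)
q2((1, 2))           # tuple splatted, one input per layer: (1, 2)
```

Because step 1 dispatches on `Tuple` only, a `NamedTuple` falls through to step 2 as a single input, which matches the `p2((a=1, b=2))` behaviour shown in the run-down.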

mcabbott · 2024-03-13T03:05:39Z

Reading old threads... around here #2101 (comment) it was agreed that adding (c::Chain)(xs...) = c(xs) would make sense, but there was never a PR.

That's the first MethodError in my list above. I would like this too, and perhaps should just add it here.
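A rough sketch of what that extra method would do, again using a hypothetical toy type (`MiniChain`) rather than Flux's `Chain`. Only the last method corresponds to the `(c::Chain)(xs...) = c(xs)` idea quoted above; the rest is assumed scaffolding.

```julia
# Hypothetical toy type, assumed for illustration only.
struct MiniChain{T<:Tuple}
    layers::T
end

# Existing behaviour: one input, passed through each layer in order.
(c::MiniChain)(x) = foldl((acc, layer) -> layer(acc), c.layers; init = x)

# Proposed addition: several arguments are packed into one Tuple.
(c::MiniChain)(xs...) = c(xs)

m = MiniChain((identity, sum))
m((1, 2, 3))   # 6, the tuple reaches `sum` unchanged
m(1, 2, 3)     # 6, the arguments are packed first
```

With a single argument the one-input method is more specific, so `m(1)` is not wrapped in a tuple. Followed by a layer that splats tuples, as `Parallel` does, this is what lets the three-argument call in the PR title go through.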

mcabbott · 2024-03-13T14:32:18Z

Anyone remember why we allow Parallel(hcat)? You can write Returns(hcat()) if you really want that...

julia> Parallel(hcat)()
Any[]

julia> Parallel(hcat)(NaN)  # ignores input, but this case is tested
Any[]

julia> Parallel(hcat)(1,2,3)
ERROR: ArgumentError: Parallel with 0 sub-layers can take one input or 0 inputs, but got 3 inputs

Can we just make this an error on construction? I think that's basically what was agreed in #1685
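One way an error on construction could look, sketched with a hypothetical toy type. The name `CheckedParallel` and the error text are invented for illustration and are not taken from the PR.

```julia
# Hypothetical toy type, assumed for illustration only.
struct CheckedParallel{F,T<:Tuple}
    connection::F
    layers::T
    # The inner constructor rejects an empty tuple of layers up front.
    function CheckedParallel(connection::F, layers::T) where {F,T<:Tuple}
        isempty(layers) &&
            throw(ArgumentError("a Parallel-like layer needs at least one sub-layer"))
        return new{F,T}(connection, layers)
    end
end

ok = CheckedParallel(hcat, (identity,))   # constructs normally

# Zero layers now fails at construction, not at call time.
err = try
    CheckedParallel(hcat, ())
catch e
    e
end
```

Failing early means the zero-layer case never reaches the call methods, so they would not need a special branch for it.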

ToucheSir

Since it was linked here, can you quickly comment on the relationship between this and FluxML/Functors.jl#80?

mcabbott · 2024-10-16T14:38:35Z

I put this on 0.15 milestone... I still think it's the right thing to do, but perhaps a breaking change is the right time to merge it.

ToucheSir reviewed Mar 13, 2024


mcabbott changed the title ~~Allow Parallel(+, f)(x, y, z) to work like broadcasting~~ Allow Parallel(+, f)(x, y, z) to work like broadcasting, and enable Chain(identity, Parallel(+, f))(x, y, z) Mar 13, 2024

mcabbott mentioned this pull request Mar 13, 2024

functor Base.Splat FluxML/Functors.jl#80

Merged

mcabbott added the run downstream test label Mar 13, 2024

mcabbott added 7 commits March 19, 2024 13:49

let Parallel(+, f)(x, y, z) work like broadcasting

d371fec

add (::Chain)(xs...) method

4b3fe90

more examples

75d239f

correction

a46da88

change implementation to dispatch

6e431e9

nicer errors when called on zero inputs

5743841

disallow zero layers, let's try this out

9ee2c69

mcabbott force-pushed the parallel_bc branch from 0544711 to 9ee2c69 Compare March 19, 2024 17:50

ToucheSir approved these changes Mar 28, 2024


mcabbott added this to the v0.15 milestone Oct 16, 2024
