
allow non-tuple data in the new train! #2119

Merged
merged 10 commits into master on Nov 24, 2022

Conversation

@CarloLucibello (Member) commented on Nov 21, 2022:

Follow up to #2082 with two changes to the new train!:

  1. allow non-tuple data, so that each element of data need not be a tuple of arguments to the loss;
  2. re-introduce a callback keyword.

I can factor out the second change if deemed controversial. Edit: removed the callback addition.
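To make the first change concrete, here is a minimal sketch of the maybe-splat rule (an illustration of the intended behaviour, not the actual src/train.jl code):

```julia
# Illustration only: tuple batches are splatted into the loss,
# anything else is passed through as a single argument.
call_loss(loss, model, d::Tuple) = loss(model, d...)
call_loss(loss, model, d)        = loss(model, d)

# data = [(x1, y1), (x2, y2)]  ->  loss(model, x, y) on each batch
# data = [x1, x2]              ->  loss(model, x)    on each batch
```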

@mcabbott (Member) left a comment:

I think this maybe-splat behaviour is a little odd, but I see the argument for keeping it in the name of making upgrades easier. Comments below are only on this.

Besides the headline change, this PR not only re-introduces callbacks, but introduces a larger more complicated callback scheme. Do we want this, or to push people towards writing a for loop?

src/train.jl — 4 resolved review comments (outdated)
@CarloLucibello changed the title from "allow non-tuple data in the new train!" to "updates to the new train!" on Nov 21, 2022
@CarloLucibello (Member, Author) commented:

> Besides the headline change, this PR not only re-introduces callbacks, but introduces a larger more complicated callback scheme. Do we want this, or to push people towards writing a for loop?

I simplified the scheme; compared to the old one it is more useful and has the same complexity.
I can factor out the callback change into another PR, or just ditch it; I'm not sure myself that it is something we want to do.

@mcabbott (Member) commented:

I agree that a callback which gets some details is more useful; the old-style ones where you wrap the global model seem more in keeping with the implicit style.

Passing this as one NamedTuple is probably a better choice than splatting it as keywords, as your function can then take what it needs.

One downside is that it does mean many more things need names, in code and not just in docs. The present list is (model, data, opt, step, loss, gradient); that's a lot. And is "opt" an optimiser, a state, or a tree? The documentation isn't completely consistent... but so far its name never matters much.

Maybe it would be clearer to call this train!(loss, m, data, opt; callback) with a different keyword. Then there's less confusion between the old and new things. The old cb is a pretty cryptic name; it could survive for now to accept a zero-arg function with a depwarn.
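For illustration, a hypothetical callback in the NamedTuple style discussed above, using the field names from the list (model, data, opt, step, loss, gradient); none of this is merged API:

```julia
# Hypothetical: the callback receives one NamedTuple of loop state
# and uses only the fields it needs.
function log_every_100(state)
    if state.step % 100 == 0
        @info "training" step = state.step loss = state.loss
    end
end

# Inside train!, the call might look like (sketch only):
#   callback((; model, data = d, opt, step, loss = l, gradient = g))
```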

@darsnack (Member) commented:

Regarding callbacks, I think it's more of a design/community issue than anything else. The proposed callback scheme does provide all the state in the loop to the user. But there is already a difficulty: as written, the callback happens before the optimizer update; wouldn't opt and model be more useful after the update? I see two options:

  1. We provide a callback as proposed here but called at the end of each loop iteration. We take a hard stance on declining to add more entry points into the loop. We do not provide built-in callbacks. This feature is described as a "quick and dirty" callback for when you do not want to refactor your code.
  2. We delete callbacks completely. I don't think the old style (close over globals) callbacks should remain as we transition to explicit. They are incompatible with immutable models, and even though most models are/will be mutable, I think it is confusing to have a schism in behavior.

This is why I describe it as a community issue: the real problem is preventing feature creep. Option (2) does this neatly, whereas (1) will always have users requesting more.

@mcabbott (Member) commented:

Re before/after, it's worth remembering the Flux.skip debacle, where apparently it was so confusing and so code-golfed that it got documented in a way which didn't actually skip anything, and the tests didn't notice.

Whereas test() && continue in a loop does exactly what it looks like it'll do.
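For comparison, a sketch of the explicit loop, where `model`, `data`, `loss` and the hypothetical predicate `should_skip` are assumed to be defined by the user:

```julia
using Flux

opt_state = Flux.setup(Adam(), model)
for (x, y) in data
    should_skip(x, y) && continue   # skipping a step is an ordinary `continue`
    grads = Flux.gradient(m -> loss(m, x, y), model)
    Flux.update!(opt_state, model, grads[1])
end
```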

@CarloLucibello (Member, Author) commented:

> But there is already a difficulty: as written, the callback happens before the optimizer update; wouldn't opt and model be more useful after the update?

My reasoning was that by executing the callback before the update, you are able to implement things such as gradient clipping.

@mcabbott (Member) commented:

For simple clipping we of course have ClipGrad.
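For reference, with Optimisers.jl the clipping composes into the rule itself, no callback needed (a sketch, assuming model, loss and data are defined elsewhere):

```julia
using Flux, Optimisers

# Clip each gradient entry to [-1, 1] before Adam applies its update.
rule = Optimisers.OptimiserChain(Optimisers.ClipGrad(1.0), Optimisers.Adam())
opt_state = Flux.setup(rule, model)
Flux.train!(loss, model, data, opt_state)
```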

Do we have good examples of callbacks being useful in the wild?

Every single use in the model zoo appears to be to print the loss, with throttle. So are all the ones I found on discourse in 5 minutes.

https://github.com/FluxML/model-zoo/search?q=cb
FluxML/Metalhead.jl#62 (comment)
https://discourse.julialang.org/search?q=flux%20cb
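The zoo pattern in question is essentially the following (old implicit-style API, shown only for context; `x, y` stand in for some fixed evaluation batch):

```julia
using Flux

# Print the loss at most once every 10 seconds, via the old `cb` keyword.
evalcb = Flux.throttle(() -> @show(loss(x, y)), 10)
Flux.train!(loss, Flux.params(model), data, opt; cb = evalcb)
```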

We could just build that into train! as an option. The goal of #2120 is to make the same thing easy in "procedural" code.

@CarloLucibello changed the title from "updates to the new train!" back to "allow non-tuple data in the new train!" on Nov 22, 2022
@CarloLucibello (Member, Author) commented:

I removed the callback addition from this PR as it is controversial.
We can merge this if we agree on supporting non-tuple data (I do).

src/train.jl — 1 resolved review comment (outdated)
Co-authored-by: Kyle Daruwalla <[email protected]>
@CarloLucibello merged commit a5e5546 into master on Nov 24, 2022
@CarloLucibello mentioned this pull request on Nov 24, 2022
@mcabbott deleted the cl/batchme branch on Nov 24, 2022