Represent the absence of bias by bias=false
#1407
Conversation
could you remove the conversion stuff? It's not related to this PR and I hope to merge #1393 soon
Yes, can do. Probably easiest as a rebase once that's merged, rather than trying to roll back smoothly enough that git doesn't notice. (It got added just because I landed in dispatch hell from the old implementation, somehow, and this was the quickest way out!)
Since we've moved on from representing bias with …
No, this is not done. It is awaiting a rebase now that #1393 is in, as mentioned. Will someone who has sufficient permissions please re-open?
@DhairyaLGandhi please upgrade @mcabbott's and @darsnack's roles; don't force me to keep asking and asking.
Both of them are part of the same team as you and I.
Either way, I think this approach isn't in line with the API changes that we have wanted to make here.
Org members' permissions should be set to Write.
I am not inclined to deprecate `Zeros` or introduce a literal `false`, and it's somewhat surprising that folks here are not willing to consider respecting existing code and its patterns; the solution proposed here also has questionable behavior with other tools. Added to that, the solution in #1402 is generic enough to be used outside of this context, without forcing any more burden on users than what this PR proposes, and it comes with more predictable behavior.

Now, I can understand that mcabbott wants to say that the non-existence of a bias means it doesn't exist, end of story. That's fine. But given a set of parameters, how would you determine the model they came from? The skeleton model that we load the parameters into was typically expected to have this information prebaked. But suppose I tell you these params come from a typical ResNet, with slight tweaks so that some biases are missing: without knowing those tweaks a priori, a singleton approach can't handle it, and would not be able to load the params correctly. Not to mention figuring out how to rewrite the model-building code to make these distinctions. This is in part because this partial model specification was not embedded in the params.

With a fake-array approach, that is mitigated: the placeholder can be turned into trainable parameters at will, and used without any extra runtime penalty. The issues with param saving and GPU support were also fixed in #1402, so I don't understand why it is being pushed back this hard. Maybe it's more flexibility than we absolutely need at this moment, but it doesn't exactly come at extra cost either.
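For readers catching up, here is a rough sketch of the fake-array idea being defended above. This is not the actual #1402 implementation, and the name `FakeZeros` is made up here; the point is only that shape information travels with the params, and that the placeholder can be materialized into a real trainable array:

```julia
# A zero "array" that exists only as a shape record: it stores its size,
# indexes as zero, and allocates no storage.
struct FakeZeros{N} <: AbstractArray{Float32,N}
    dims::NTuple{N,Int}
end
Base.size(z::FakeZeros) = z.dims
Base.getindex(z::FakeZeros{N}, I::Vararg{Int,N}) where {N} = 0f0

b = FakeZeros((5,))
size(b)      # (5,): the shape survives in saved params
collect(b)   # materialize into a real zeros(Float32, 5), trainable at will
```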
I think arguments in favour of Flux defining constant array types can be made in #1402. They would be strongest with references to deep learning models making use of them.
Do you have an MWE?
You can't. A vector of numbers has less information than a model. That's why …
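A minimal sketch of that asymmetry, using this PR's `bias=false` API and `Flux.destructure` (which flattens a model into a parameter vector plus a rebuilder):

```julia
using Flux

m1 = Dense(2, 3)               # trainable weight and bias
m2 = Dense(2, 3; bias=false)   # weight only; the bias slot holds `false`

v1, _ = Flux.destructure(m1)
v2, _ = Flux.destructure(m2)

length(v1)   # 9: a 3×2 weight plus 3 biases
length(v2)   # 6: the weight alone

# Neither flat vector records which layers had a bias; that information
# lives only in the model skeleton you rebuild into.
```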
Having been incidentally tagged on this PR, and now that I have spent over an hour catching up on the optional-bias threads scattered all over, I have to say that I am strongly in favor of the approach used in this PR over #1402. It is more maintainable and simpler (which, as the original PR notes, is something Flux should continue to strive towards). I'll comment a bit more on #1402, as my thoughts are around …
This continues the good work of #1379 to its logical conclusion, by making the storage match the API: `Dense(3, 5; bias=false)` literally stores `randn(5,3), false, identity`. The `false` is invisible to `.params`, is handled well by `loadparams!` & `destructure`, automatically.
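A quick sketch of what that buys (only `Flux.params` and this PR's constructor are used; `false` is the additive identity, so broadcasting it away is free in value terms):

```julia
using Flux

d = Dense(3, 5; bias=false)
Flux.params(d)   # only the 5×3 weight; `false` is not a trainable array

# Broadcasting `.+ false` leaves the forward pass unchanged:
W, x = randn(5, 3), randn(3)
tanh.(W * x .+ false) == tanh.(W * x)   # true
```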
The code from Zygote.jl#838 (Adjoint for broadcasting `A .+-* Bool`) is for now duplicated here, for testing purposes. `σ.(Wx .+ false)` has some cost during inference, but it's pretty hard to detect. Broadcasting `identity.(Wx)` is an easier cost to detect, since it copies the whole array. (Also only at inference; removed by Zygote.) We could avoid both of these at the cost of a little complication, like some function `mapadd(σ, Wx, b)` to handle this, which I'm happy to write if there is interest; a rough sketch follows below. But nobody has thought this worth the complication so far, which I read as saying that Flux values simplicity.
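A minimal sketch of what such a `mapadd` could look like; the name and the dispatch follow the suggestion above, and this is not an existing Flux function:

```julia
# Fuse the bias add into the activation broadcast, skipping work when
# there is nothing to add (a Bool bias is always `false` in this design).
mapadd(σ, Wx::AbstractArray, b::AbstractArray) = σ.(Wx .+ b)
mapadd(σ, Wx::AbstractArray, ::Bool) = σ.(Wx)                 # no add
mapadd(::typeof(identity), Wx::AbstractArray, ::Bool) = Wx    # no copy either
```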
Smaller points:

- `Zeros()` and `Zeros(dims...)` are deprecated.
- `create_bias` from #1379 (Fix some issues with Zeros, option 2) is retained, and now also checks the size of the bias vector, if the user provides their own.
- `conv.jl` has fewer lines of code, by moving the `reshape(c.bias, map(_->1, c.stride)..., :, 1)` thing to a function, instead of repeating it; see the sketch after this list.
- There is some overlap with #1393 (remove implicit conversions), as this was written some time ago. This PR makes the type conversion noisy & explicit, although removing it entirely might be cleaner.
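A sketch of the kind of helper the `conv.jl` bullet describes; the name `reshape_bias` is hypothetical:

```julia
# Reshape a conv bias vector to size 1×…×1×k×1 so that it broadcasts
# against an N-dimensional conv output; a `false` bias needs no reshape.
reshape_bias(c, b::AbstractVector) = reshape(b, map(_ -> 1, c.stride)..., :, 1)
reshape_bias(c, b::Bool) = b
```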