
Added Dense and Conv BatchEnsemble layers along with unit tests and example on MNIST classification using LeNet5 #4

Merged: 3 commits merged into main on Sep 7, 2021

Conversation

DwaraknathT (Collaborator):

  • Added BatchEnsemble layers -- the idea is to factorize the weight matrix of each ensemble member into three pieces: one full ("slow") matrix with the same shape as the layer's weight matrix, and two fast weights (usually rank-1). A member's weights are generated by taking the outer product of the two fast weights and then the Hadamard product of the resulting matrix with the shared full matrix (see the sketch after this list).
  • Added unit tests for both BatchEnsemble layers.
  • Added an example of MNIST classification with the BatchEnsemble layers using LeNet5.
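
For concreteness, here is a minimal sketch of the weight generation for a single ensemble member; the names W, r, s, and W_member are illustrative and not the identifiers used in this PR.

# Minimal sketch (assumed names, not the PR's code): generate one ensemble
# member's weight from a shared full matrix and two rank-1 fast weights.
W = randn(Float32, 4, 3)   # shared ("slow") weight, same shape as the layer's weight
r = randn(Float32, 4)      # fast weight along the output dimension
s = randn(Float32, 3)      # fast weight along the input dimension

W_member = W .* (r * s')   # outer product r * s', then Hadamard product with W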

@DhairyaLGandhi (Member) left a comment:

Thoughts on starting to add GPU tests along with the regular ones? In theory it should be as straightforward as gpu(layer), gpu(input), @test ...
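
A hedged sketch of what such a test could look like; the DenseBatchEnsemble constructor and its argument order here are assumptions for illustration, not necessarily the exact API added in this PR.

using Flux, CUDA, Test

# Assumed layer name and argument order (in, out, rank, ensemble_size); adjust to the actual API.
layer = DenseBatchEnsemble(10, 5, 1, 4)
x = rand(Float32, 10, 8)

gpu_layer = gpu(layer)
gpu_x = gpu(x)

# Basic forward-pass check on GPU against the CPU result's shape
@test size(cpu(gpu_layer(gpu_x))) == size(layer(x))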

Comment on lines +77 to +92
function ConvBatchEnsemble(
    k::NTuple{N,Integer},
    ch::Pair{<:Integer,<:Integer},
    rank::Integer,
    ensemble_size::Integer,
    σ = identity;
    init = glorot_normal,
    alpha_init = glorot_normal,
    gamma_init = glorot_normal,
    stride = 1,
    pad = 0,
    dilation = 1,
    groups = 1,
    bias = true,
    ensemble_bias = true,
    ensemble_act = identity,
DhairyaLGandhi (Member):

Same comment as last time about keeping things simple and general.

Maybe it makes sense to have a constructor that takes in a Conv layer directly?

DwaraknathT (Collaborator, Author):

Yeah, it does. I guess we can have both as well.

DwaraknathT (Collaborator, Author):

We actually need the input/output dimensions to create the alpha/gamma matrices. Might as well keep them in the signature, or we would have to infer them from the Conv layer's struct, and that could change at any time in the Flux source?
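
For illustration, a hedged sketch of the convenience constructor under discussion, inferring the kernel size and channel pair from a Conv layer's weight array. It relies on Flux.Conv's current field layout, which is exactly the coupling mentioned above, and it ignores grouped convolutions for brevity.

# Hedged sketch, not the PR's code: build a ConvBatchEnsemble from an existing Conv layer.
function ConvBatchEnsemble(c::Flux.Conv, rank::Integer, ensemble_size::Integer; kwargs...)
    w = c.weight                                     # size: (k..., ch_in, ch_out) for groups == 1
    k = size(w)[1:end-2]                             # kernel size tuple
    ch = size(w, ndims(w) - 1) => size(w, ndims(w))  # input => output channels
    return ConvBatchEnsemble(k, ch, rank, ensemble_size, c.σ;
                             stride = c.stride, pad = c.pad, dilation = c.dilation, kwargs...)
end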

src/layers/BatchEnsemble/conv.jl (resolved)
    ensemble_act::F = identity,
    rank = 1,
) where {M,F,L}
    ensemble_bias = create_bias(gamma, ensemble_bias, size(gamma)[1], size(gamma)[2])
DhairyaLGandhi (Member):

Can you test it with FluxML/Flux.jl#1402?

alpha = repeat(alpha, samples_per_model)
gamma = repeat(gamma, samples_per_model)
# Reshape alpha, gamma to [units, batch_size, rank]
e_b = reshape(e_b, (1, 1, out_size, batch_size))
DhairyaLGandhi (Member):

Size of the bias seems relevant here.

DhairyaLGandhi (Member):

How do we know that the shape of the allocated bias fits into the container it's expected to be in?

src/layers/BatchEnsemble/dense.jl (outdated, resolved)
outputs = sum(outputs, dims = 3)
outputs = reshape(outputs, (out_size, samples_per_model, ensemble_size))
# Reshape ensemble bias
e_b = Flux.unsqueeze(e_b, ndims(e_b))
DhairyaLGandhi (Member):

Curious: Are the sizes of bias somewhat variable in these methods?

DwaraknathT (Collaborator, Author):

Oh right, you meant the physical size in memory? No, those sizes are not variable: there is a fixed number of elements in the bias, we just change the shape of the array. If you meant the logical size (the shape, in NumPy terms), then yes, they are variable.
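
A small standalone illustration of the distinction (not code from this PR):

b = zeros(Float32, 12)            # 12 elements
b2 = reshape(b, (1, 1, 3, 4))     # same 12 elements, different logical shape
length(b) == length(b2)           # true: reshape never changes the element count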

alpha = reshape(alpha, (in_size, ensemble_size * rank))
gamma = reshape(gamma, (out_size, ensemble_size * rank))
# Repeat breaks on GPU when input dims > 2
alpha = repeat(alpha, samples_per_model)
@DhairyaLGandhi (Member), Sep 1, 2021:

Do we need to materialise this array, or can we broadcast it to higher dimensions? Something like:

julia> x = ones(3,3)
3×3 Matrix{Float64}:
 1.0  1.0  1.0
 1.0  1.0  1.0
 1.0  1.0  1.0

julia> y = zeros(3,3,3)
3×3×3 Array{Float64, 3}:
[:, :, 1] =
 0.0  0.0  0.0
 0.0  0.0  0.0
 0.0  0.0  0.0

[:, :, 2] =
 0.0  0.0  0.0
 0.0  0.0  0.0
 0.0  0.0  0.0

[:, :, 3] =
 0.0  0.0  0.0
 0.0  0.0  0.0
 0.0  0.0  0.0

julia> x .+ y
3×3×3 Array{Float64, 3}:
[:, :, 1] =
 1.0  1.0  1.0
 1.0  1.0  1.0
 1.0  1.0  1.0

[:, :, 2] =
 1.0  1.0  1.0
 1.0  1.0  1.0
 1.0  1.0  1.0

[:, :, 3] =
 1.0  1.0  1.0
 1.0  1.0  1.0
 1.0  1.0  1.0

Notice that the lower-dimensional array was broadcast to the higher dimensions automatically.

@DwaraknathT (Collaborator, Author), Sep 1, 2021:

We are already broadcasting the input over the last dimension (the rank dimension). I think we have to materialize the array because, conceptually, the idea is to take a minibatch of B samples and repeat it N times, giving an effective minibatch of size B*N. We then want each of the N copies of the B samples to be given a different ensemble member's weights, so the fast weights (alpha, gamma) have to be expanded to the batch size before they can be broadcast over the final dimension.

Also, the starting shape of the fast weights is (in_size, ensemble_size, rank) while the input shape is (in_size, batch_size), so we need the repeat call to match the dimensions for the * op.
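
To make that concrete, here is a hedged, rank-1 sketch of the dense forward pass being described; the names and the exact tiling convention are illustrative, not the PR's implementation.

# alpha: (in_size, ensemble_size), gamma: (out_size, ensemble_size),
# W: (out_size, in_size), b: (out_size,), x: tiled minibatch of size (in_size, B * ensemble_size)
function batch_ensemble_dense(W, b, alpha, gamma, x, ensemble_size)
    batch_size = size(x, 2)
    samples_per_model = batch_size ÷ ensemble_size
    # Expand the fast weights so each block of samples_per_model columns
    # sees the alpha/gamma of the ensemble member it belongs to.
    A = repeat(alpha, inner = (1, samples_per_model))   # (in_size, batch_size)
    G = repeat(gamma, inner = (1, samples_per_model))   # (out_size, batch_size)
    return G .* (W * (A .* x)) .+ b                     # per member: y = (W ∘ γαᵀ) x + b
end

# Usage: tile a minibatch of 3 samples across 4 ensemble members
W, b = randn(Float32, 5, 10), zeros(Float32, 5)
alpha, gamma = randn(Float32, 10, 4), randn(Float32, 5, 4)
x = rand(Float32, 10, 3)
y = batch_ensemble_dense(W, b, alpha, gamma, repeat(x, outer = (1, 4)), 4)   # size (5, 12)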

DhairyaLGandhi (Member):

But the samples are always the same, so why would it matter whether it's materialised or not?

1. Reduced imports and moved them to the main file
2. Renamed the test files
3. Added GPU tests for the layers -- for now just a basic forward pass, etc.
@DwaraknathT DwaraknathT merged commit ed3d16c into main Sep 7, 2021
DwaraknathT added a commit that referenced this pull request Sep 7, 2021
…ts and example on MNIST classification using LeNet5 (#4)"

This reverts commit ed3d16c.
DwaraknathT added a commit that referenced this pull request Sep 7, 2021
…ts and example on MNIST classification using LeNet5 (#4)" (#14)

This reverts commit ed3d16c.