Convert 'Vision Transformer without Attention' to Keras 3. #1855
base: master
Conversation
On TensorFlow, I am able to train and test the model, but hit this issue when loading the saved model to do inference on it. It may be the same issue as keras-team/keras#19492, but I am not 100% sure.
On PyTorch, I am hitting this issue when compiling the initial model, before training starts.
Thanks for the PR!
I believe you may be able to make the code fully backend-agnostic without implementing backend-specific train steps. Instead, you could override compute_loss() and make it work with all backends. The train step is generic; only the loss computation appears to be custom.
Thanks for the review and suggestion, Francois! I dropped the custom train and test steps; overriding the call() method combined with the native compute_loss() method was equivalent to the custom loss logic. Current issues I am debugging:
Thanks for the updates.
torch issue
This caught a bug: ops.split is supposed to return a list, but on torch it returns a tuple (same as torch.split). I fixed it. You can route around it by creating an output list and appending elements to it. Once done, the code runs with torch.
tf issue with deserialization
You need to call deserialize_keras_object on the models/layers passed to constructors, to enable deserialization / model loading, e.g.:
self.data_augmentation = keras.saving.deserialize_keras_object(data_augmentation)
jax issue
This one has to do with tracer leaks. Those problems are unique to JAX and can be tricky to debug. A first problem is using ops.linspace instead of np.linspace in build(). There are further issues down the line, however.
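The linspace fix can be sketched like this (a hypothetical layer, not the PR's code): constants computed in build() should use NumPy, because calling ops.linspace there executes under JAX tracing and can leak a tracer into persistent layer state.

```python
import numpy as np
import keras


class LinearPositions(keras.layers.Layer):
    """Hypothetical layer: build() precomputes constants with NumPy."""

    def build(self, input_shape):
        num = input_shape[-1]
        # np.linspace yields concrete float values at build time;
        # ops.linspace here could leak a JAX tracer out of the traced fn.
        self.positions = np.linspace(0.0, 1.0, num=num).astype("float32")

    def call(self, x):
        # The constant is broadcast against the backend tensor at call time.
        return x + self.positions
```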
```python
# Update the metrics
self.compiled_metrics.update_state(labels, logits)
return {m.name: m.result() for m in self.metrics}

def call(self, images):
    augmented_images = self.data_augmentation(images)
```
Surely this should only be applied at training time? Also, we may consider moving it to the data pipeline instead of inside the model.
Correct, it is only active at training time. See here for full context; the block-level comment summarizes this well.
The augmentation pipeline consists of:
- Rescaling
- Resizing
- Random cropping
- Random horizontal flipping
Note: The image data augmentation layers do not apply data transformations at inference time. This means that when these layers are called with training=False they behave differently. Refer to the [documentation](https://keras.io/api/layers/preprocessing_layers/image_augmentation/) for more details.
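The behavior described in the note can be demonstrated with a small sketch of the pipeline listed above (layer parameters here are illustrative, not the PR's actual values): the random layers transform only when training=True, and fall back to a deterministic path (e.g. a center crop) otherwise.

```python
import numpy as np
import keras
from keras import layers

# Illustrative augmentation pipeline mirroring the four steps above.
data_augmentation = keras.Sequential(
    [
        layers.Rescaling(1.0 / 255),       # rescaling
        layers.Resizing(72, 72),           # resizing
        layers.RandomCrop(64, 64),         # random cropping
        layers.RandomFlip("horizontal"),   # random horizontal flipping
    ]
)

images = np.random.randint(0, 256, size=(2, 96, 96, 3)).astype("float32")

# Random crop + flip are active only on the training path.
train_out = data_augmentation(images, training=True)
# At inference the random layers are deterministic (center crop, no flip).
infer_out = data_augmentation(images, training=False)
```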
Currently compatible with TensorFlow and PyTorch only.