RuntimeError: merge_call called while defining a new graph or a tf.function. #301
Comments
Hello, @innat! I have not had the time to add multi-GPU support to GradientAccumulator, but can make an attempt at it today. However, batch training + gradient accumulation + mixed precision works seamlessly. I have been using it for various projects already.
Thanks for your response. I noticed that @stefan-falk also faced a similar error (tensorflow/tensorflow#50454), which I reported above. He tried many ways (HERE); it may give some insight. Regarding mixed precision, as I said, I was wondering if we need to call [...]. cc @MrForExample
Hmm, that's interesting. However, can't it be argued that overloading the [...]? Will start on the multi-GPU support now. Did you have a gist I could use for debugging/testing, @innat? Also note that the GradientAccumulator (without multi-GPU) also works with TPUs. But I am only able to run tests locally, as I doubt I am allowed to use multi-GPUs in a single colab session.
Here is a gist (also mentioned above).
As mentioned in the other ticket, Graphcore had a design as an optimizer wrapper including cross-replica support: /cc @georgepaw
As the error suggests, aggregating gradients inside a nested tf.function is not yet supported.
@innat Is eager mode OK for you? It has a performance cost, but it seems this works fine here.
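A minimal sketch of that eager-mode workaround, assuming a MirroredStrategy setup like the one in the gist (build_model() here is a placeholder, not from the thread): compiling with run_eagerly=True makes Keras skip tf.function tracing of train_step, which sidesteps the merge_call restriction at the cost of speed.

import tensorflow as tf

strategy = tf.distribute.MirroredStrategy()
with strategy.scope():
    # Any model with a custom train_step that accumulates gradients.
    model = build_model()  # placeholder for the model from the gist
    model.compile(
        optimizer=tf.keras.optimizers.Adam(),
        loss="sparse_categorical_crossentropy",
        run_eagerly=True,  # run train_step eagerly; avoids the nested tf.function issue
    )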
@SuryanarayanaY |
[Info added] Optimizer.apply_gradients exposes a skip_gradients_aggregation flag:

Optimizer.apply_gradients(
    grads_and_vars, name=None, skip_gradients_aggregation=False, **kwargs
)
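For reference, a hedged sketch of how that flag can be used, assuming the TF >= 2.11 Keras optimizer API shown above (the function name apply_manually_aggregated is illustrative, not from this thread): the caller takes over cross-replica aggregation, e.g. with ReplicaContext.all_reduce, and then passes skip_gradients_aggregation=True so the optimizer does not aggregate a second time.

import tensorflow as tf

def apply_manually_aggregated(optimizer, grads, variables):
    # Sum gradients across replicas ourselves...
    replica_ctx = tf.distribute.get_replica_context()
    reduced = replica_ctx.all_reduce(tf.distribute.ReduceOp.SUM, grads)
    # ...then tell the optimizer not to aggregate them again.
    optimizer.apply_gradients(
        zip(reduced, variables),
        skip_gradients_aggregation=True,
    )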
@innat the root cause of this error is the [...]. One option is to work around it using [...]. Here's a modified version of your colab which uses this approach and seems to be working. It's probably marginally less performant than if the graph could be fully compiled with the conditional in it, but merging a subgraph which has a conditional on a synchronized variable is (I think) a fundamental limitation of running TF in distributed mode.
@ianstenbit thanks for the reply. From my gist:

Epoch 1/3
10000/10000 - 23s - loss: 0.2041 - accuracy: 0.9387
Epoch 2/3
10000/10000 - 23s - loss: 0.0937 - accuracy: 0.9708
Epoch 3/3
10000/10000 - 23s - loss: 0.0667 - accuracy: 0.9791
<keras.callbacks.History at 0x7f983006fe50>

With yours:

Epoch 1/3
10000/10000 - 68s - loss: 0.6961 - accuracy: 0.8416
Epoch 2/3
10000/10000 - 22s - loss: 0.6387 - accuracy: 0.8541
Epoch 3/3
10000/10000 - 22s - loss: 0.6387 - accuracy: 0.8541
<keras.callbacks.History at 0x7f97d41fd1d0>
@innat looks like I had a silly mistake in the line of code where I was zeroing out gradients after applying them: I had [...]. After making these changes, I got results much closer to your original ones. It occurred to me, though, that to avoid any rounding errors it's probably better to use [...].
It's still not precisely the same numerically as your original implementation. I think this may be because calling [...].
Thanks for the update. Could you please check with multiple epochs (i.e. 10)? I observe that the loss and accuracy don't change after 2 epochs. Tested with [...].
Yes, I see this behavior, and I think it's probably due to calling [...].
I think in order to correctly perform gradient accumulation, you'd likely need to subclass [...]. This seems like a constraint of [...]. @rchao to confirm.
Thanks Ian. Yes, this appears to be unsupported by tf.distribute at this time, and I would recommend filing an issue on tf.distribute if you would like such support.
Check #301 |
Here, the aim is to make it possible to execute it within custom fit (overriding the train_step).
@rchao could you please create an issue? Or, this technique should be supported: #107 cc @chenmoneygithub @4uiiurz1 I read on SO that you extended this technique for multi-GPU support. Could you please give some feedback regarding that? Thanks.
Is there a specific reason why you don't want to wrap the optimizer? The main reason why I never did that was that I failed to find a working implementation. I found quite a few attempts, some even ran (to an extent), but when running a simple benchmark, training results were quite different from regular batch training. Just now, I managed to get an optimizer wrapper working (see here). This was based on the work by @stefan-falk and @fsx950223. At least it yields extremely similar results to regular batch training. If you wish to try it out, there is a test script here, in the GradientAccumulator repo. I was unable to test multi-GPU support, as I do not have access to one until tomorrow. But I can update you on the matter, likely tomorrow. Note that right now, only SGD is supported. Will need to debug why dynamic optimizers such as Adam are not working as well as SGD. I'm not observing the same with the [...].
I don't mind using that, but I strongly prefer to override the train step. Adding a new ticket: tensorflow/tensorflow#59487
No worries. If anyone is interested in playing around with the optimizer wrapper solution, here is a gist demonstrating that the optimizer wrapping solution works with [...]. I don't have access to multiple GPUs atm, but perhaps someone else does and is interested in trying.
I quickly tested on Kaggle (2x T4 GPU) with TF 2.6.4 and got the following error.
You will not face this error in colab (with tf 2.6.4).
Oh, OK. Nice to know! Will have to do some further debugging. Cheers :] Anyways, the gist serves as a nice foundation for making a proper solution. |
I was able to reproduce the bug in Kaggle, @innat. Love that you have access to two GPUs for free on Kaggle! I've shared my Kaggle notebook here, if anyone wishes to debug this further. Any ideas would be much appreciated! It seems to work just fine with one GPU, but fails during the gradient update with multiple GPUs in MirroredStrategy. Note that switching to tf 2.8.0 yields a different error, which might be easier for some of you to unravel: [...]
@ianstenbit
Is this a possible alternative to the gradient accumulation technique? What does it mean when it says "number of batches to run during each tf.function call"? Is the corresponding gradient accumulated for each batch?

import numpy as np
import tensorflow as tf
from tensorflow import keras

strategy = tf.distribute.MirroredStrategy()  # assumed: the strategy used in the gist

class CustomModel(keras.Model):
    def train_step(self, data):
        x, y = data
        with tf.GradientTape() as tape:
            y_pred = self(x, training=True)  # Forward pass
            loss = self.compiled_loss(y, y_pred, regularization_losses=self.losses)
        print()
        print(x.shape, y.shape, tf.shape(x)[0].numpy())
        print()
        # Compute gradients
        trainable_vars = self.trainable_variables
        gradients = tape.gradient(loss, trainable_vars)
        self.optimizer.apply_gradients(zip(gradients, trainable_vars))
        self.compiled_metrics.update_state(y, y_pred)
        return {m.name: m.result() for m in self.metrics}

x = np.random.random((100, 32))
y = np.random.random((100, 1))

with strategy.scope():
    # Construct and compile an instance of CustomModel
    inputs = keras.Input(shape=(32,))
    outputs = keras.layers.Dense(1)(inputs)
    model = CustomModel(inputs, outputs)
    model.compile(
        optimizer="adam",
        loss="mse",
        metrics=["mae"],
        steps_per_execution=1,
        run_eagerly=1
    )

model.fit(
    x, y,
    validation_data=(x, y),
    epochs=1,
    batch_size=32
)

With [...]

And with [...]

It looks like a possible alternative to the gradient accumulation technique. I'd like to know what happens when [...]. Also, does [...]?
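As an aside (not part of the reply below): per the Keras docs, steps_per_execution only controls how many batches are run inside a single tf.function call; each of those batches still computes and applies its own gradient update, so on its own it is not gradient accumulation. A minimal sketch of the setting (values here are illustrative):

model.compile(
    optimizer="adam",
    loss="mse",
    metrics=["mae"],
    # Run 4 batches per tf.function call; gradients are still applied per batch,
    # so the effective batch size does not change.
    steps_per_execution=4,
)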
Hi @innat
If [...]
I think the main reason for the problem is that tensorflow does not allow control flow containing any synchronization op in the replica context wrapped by tf.function. I guess that tf.function will build a graph for each branch of the control flow, so each replica may enter a different branch, which will cause conflicts in synchronization. The key is "replica context", so switching to executing the control flow in the cross-replica context (via merge_call) avoids the error. Here is an example:

def apply_accumulated_gradients(grads_and_vars):
    # actually apply gradients logic
    pass

should_apply = ...  # a boolean flag

def apply_gradients_cross_replica(strategy, grads_and_vars):
    def _apply_fn():
        strategy.extended.call_for_each_replica(
            apply_accumulated_gradients, args=(grads_and_vars,))
    tf.cond(should_apply, _apply_fn, lambda: None)

# execute control flow with synchronization op in the cross-replica context
tf.distribute.get_replica_context().merge_call(
    apply_gradients_cross_replica, args=(grads_and_vars,))
@AIGideon |
@innat Yes, I tested that it can work perfectly with [...]. I don't know whether the keras3 implementation solves this problem, but switching to a cross-replica context from within a replica context is a very common usage in tf. I just wonder why the keras2 (tf-keras) community has been troubled by the implementation of gradient accumulation for such a long time and no solid solution has ever been given. I've seen other implementations from the community, and most of them are based on the following three approaches to avoid control flow: [...]
Back to the topic: I think the best way to implement gradient accumulation in keras2 (tf-keras) is to organize my example code above into a generic OptimizerWrapper that can receive any optimizer.
Could you please share a complete gist with your approach? |
@innat OK, I will give an example based on tensorflow==2.12.0 (which takes the new keras optimizer api under keras/optimizers/optimizer_experimental/ as the default optimizer instead of optimizer_v2):

import tensorflow as tf
from typing import Iterable, List, Tuple


class GradientAccumulationOptimizer(tf.keras.optimizers.Optimizer):
    def __init__(
        self,
        optimizer: tf.keras.optimizers.Optimizer,
        gradient_accumulation_steps: int = 1,
        name: str = 'GradientAccumulationOptimizer',
        **kwargs
    ):
        super().__init__(name=name, **kwargs)
        self.optimizer = optimizer
        self.gradient_accumulation_steps = gradient_accumulation_steps

    def apply_gradients(
        self,
        grads_and_vars: Iterable[Tuple[tf.Tensor, tf.Variable]],
        *args,
        **kwargs
    ):
        grads_and_vars = list(grads_and_vars)
        vars = [var for _, var in grads_and_vars]
        if not hasattr(self, '_built') or not self._built:
            self.build(vars)

        self.step.assign_add(1)
        should_apply = tf.equal(self.step % self.gradient_accumulation_steps, 0)

        # update accumulated gradients
        self._update_accumulated_grads(grads_and_vars)

        # apply gradients
        def _cross_replica_apply_gradients(strategy, grads_and_vars):
            def _apply_fn():
                strategy.extended.call_for_each_replica(
                    self._apply_accumulated_grads,
                    args=(grads_and_vars, *args), kwargs=kwargs)
            tf.cond(should_apply, _apply_fn, lambda: None)

        tf.distribute.get_replica_context().merge_call(
            _cross_replica_apply_gradients, args=(grads_and_vars,))

        # reset accumulated gradients if necessary
        tf.cond(should_apply, self._reset_accumulated_grads, lambda: None)
        return self.optimizer.iterations

    def _update_accumulated_grads(
        self,
        grads_and_vars: List[Tuple[tf.Tensor, tf.Variable]]
    ):
        for i, (grad, _) in enumerate(grads_and_vars):
            self.accumulated_grads[i].assign_add(grad)

    def _apply_accumulated_grads(
        self,
        grads_and_vars: List[Tuple[tf.Tensor, tf.Variable]],
        *args,
        **kwargs
    ):
        accumulated_grads_and_vars = [
            (
                self.accumulated_grads[i] / tf.cast(
                    self.gradient_accumulation_steps,
                    self.accumulated_grads[i].dtype),
                var
            )
            for i, (_, var) in enumerate(grads_and_vars)
        ]
        self.optimizer.apply_gradients(
            accumulated_grads_and_vars, *args, **kwargs)

    def _reset_accumulated_grads(self):
        for grad in self.accumulated_grads:
            grad.assign(tf.zeros_like(grad))

    def build(self, var_list: List[tf.Variable]):
        super().build(var_list)
        self.optimizer.build(var_list)
        self.accumulated_grads = [
            tf.Variable(
                initial_value=tf.zeros_like(var),
                trainable=False,
                aggregation=tf.VariableAggregation.NONE)
            for var in var_list
        ]
        self.step = tf.Variable(
            initial_value=0, trainable=False, dtype=tf.int64,
            aggregation=tf.VariableAggregation.ONLY_FIRST_REPLICA)
        self._built = True

You can use it to wrap any optimizer like [...]. I haven't tried later tensorflow versions, but if you use an earlier version, some modifications may be needed: [...]
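A usage sketch under stated assumptions (the wrapped optimizer, step count, and toy model below are illustrative, not from the comment): wrap any built-in optimizer and pass the wrapper to compile; accumulated gradients are then averaged and applied every gradient_accumulation_steps batches.

import tensorflow as tf

strategy = tf.distribute.MirroredStrategy()
with strategy.scope():
    base_opt = tf.keras.optimizers.Adam(learning_rate=1e-3)
    optimizer = GradientAccumulationOptimizer(
        base_opt, gradient_accumulation_steps=4)  # accumulate 4 micro-batches

    model = tf.keras.Sequential([
        tf.keras.layers.Dense(64, activation="relu", input_shape=(32,)),
        tf.keras.layers.Dense(1),
    ])
    model.compile(optimizer=optimizer, loss="mse")

# Train as usual; the wrapper applies the averaged gradients every 4th step.
# model.fit(x, y, epochs=3, batch_size=32)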
@AIGideon |
Thanks to @AIGideon, @innat and @andreped. I could implement GAOptimizer by modifying @AIGideon's code and referring to @andreped's implementation.
<ga.py>
<test_ga.py>
System information.

Describe the problem
I have code that works fine but gives the following error if I use with strategy.scope().

Describe the expected behavior
I think it should work.

Standalone code to reproduce the issue
The code is for gradient accumulation techniques. Here it is done by overriding the train_step with the fit method. This code works fine (as said above) without with strategy.scope(). Now, I would like to use it for multi-GPU cases, and so I use the strategy scope but ended up with the above-mentioned error. Gist.

Follow-up Questions
Is BATCH_SIZE = 32 * strategy.num_replicas_in_sync needed inside the train_step method, or will it be handled automatically?
For mixed precision, do we need to wrap the optimizer with LossScaleOptimizer and use optimizer.get_scaled_loss(loss) and optimizer.get_unscaled_gradients(gradients)?
But the official documentation talks about normal fit and custom-loop training cases. In the case of a custom loop, it is suggested to wrap the optimizer and scale the loss and gradients, but what about the combination of fit and a custom loop (overriding train_step)? Does it still need to wrap the optimizer and scale the loss and gradients, or will that be handled by the API?

Others: #107 cc @chenmoneygithub @nikitamaia @bhack
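For reference, a hedged sketch of the loss-scaling pattern the mixed precision guide describes for custom training steps; whether fit applies it automatically when train_step is overridden is exactly the open question above, so treat this as an assumption-laden illustration (the class name GAModel is made up), not the confirmed answer.

import tensorflow as tf
from tensorflow import keras

keras.mixed_precision.set_global_policy("mixed_float16")

class GAModel(keras.Model):  # illustrative name, not from the issue
    def train_step(self, data):
        x, y = data
        with tf.GradientTape() as tape:
            y_pred = self(x, training=True)
            loss = self.compiled_loss(y, y_pred)
            # Scale the loss so float16 gradients do not underflow.
            # self.optimizer is a LossScaleOptimizer here because compile
            # wraps it automatically under the mixed_float16 policy.
            scaled_loss = self.optimizer.get_scaled_loss(loss)
        scaled_grads = tape.gradient(scaled_loss, self.trainable_variables)
        # Unscale before (accumulating and) applying the gradients.
        grads = self.optimizer.get_unscaled_gradients(scaled_grads)
        self.optimizer.apply_gradients(zip(grads, self.trainable_variables))
        self.compiled_metrics.update_state(y, y_pred)
        return {m.name: m.result() for m in self.metrics}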