You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
{{ message }}
This repository has been archived by the owner on Jul 1, 2024. It is now read-only.
This is potentially an ask for addressing at the Module API level, but it is more obvious with the Keras integration.
When creating an instance of mx.optimizers.Optimizer if the value of rescale_grad is not specified, the default value of 1.0 has a significant impact on training. In fact, this is pointed out as a warning in the logs, when the optimizer is initialized.
/usr/local/lib/python2.7/site-packages/mxnet/module/bucketing_module.py:408: UserWarning: Optimizer created manually outside Module but rescale_grad is not normalized to 1.0/batch_size/num_workers (1.0 vs. 0.0078125). Is this intended?
force_init=force_init)
Since the MXNet implementations of the Keras optimizers, essentially delegate to the Module versions, this parameter should likely be configured to the normalized value, as it is not obvious from the Keras API. It is possible to provide rescale_grad as an additional argument, but that requires the user to know some of the details of both frameworks.
The text was updated successfully, but these errors were encountered:
This is potentially an ask for addressing at the Module API level, but it is more obvious with the Keras integration.
When creating an instance of mx.optimizers.Optimizer if the value of rescale_grad is not specified, the default value of 1.0 has a significant impact on training. In fact, this is pointed out as a warning in the logs, when the optimizer is initialized.
Since the MXNet implementations of the Keras optimizers, essentially delegate to the Module versions, this parameter should likely be configured to the normalized value, as it is not obvious from the Keras API. It is possible to provide rescale_grad as an additional argument, but that requires the user to know some of the details of both frameworks.
The text was updated successfully, but these errors were encountered: