Swish activation doesn't save the weight if beta is not trainable #482
It is very useful to always save the beta value in the weights file even if beta is not trainable. It is useful when converting to a lightweight inference package such as LWTNN (https://github.com/lwtnn/lwtnn).
I already have an implementation of the Swish activation that preserves all the features of the current one but also saves the non-trainable beta in the weights file, and I would like to create a pull request.
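For context, here is a minimal sketch of the behaviour being reported (this is not the actual keras-contrib code; the `trainable_beta` argument and the file name are illustrative). When beta is not trainable it is kept as a plain Python constant, so it never reaches the saved weights file, which can be confirmed by listing the HDF5 contents with h5py:

```python
import tensorflow as tf
import h5py

class Swish(tf.keras.layers.Layer):
    """Sketch of a Swish layer: beta is only a weight when it is trainable."""

    def __init__(self, beta=1.0, trainable_beta=False, **kwargs):
        super().__init__(**kwargs)
        self.beta_init = beta
        self.trainable_beta = trainable_beta

    def build(self, input_shape):
        if self.trainable_beta:
            # Only this branch puts beta into the model's weights.
            self.beta = self.add_weight(
                name="beta",
                shape=(),
                initializer=tf.keras.initializers.Constant(self.beta_init),
                trainable=True,
            )
        else:
            # A plain Python constant: invisible to save_weights().
            self.beta = self.beta_init
        super().build(input_shape)

    def call(self, inputs):
        return inputs * tf.sigmoid(self.beta * inputs)

model = tf.keras.Sequential([
    tf.keras.Input(shape=(8,)),
    tf.keras.layers.Dense(4),
    Swish(beta=1.5, trainable_beta=False),
])
model.save_weights("swish_demo.weights.h5")

# List every dataset in the saved file: no "beta" entry appears unless
# trainable_beta=True.
with h5py.File("swish_demo.weights.h5", "r") as f:
    f.visit(print)
```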
Comments
Could you expand on this?
Indeed. To apply neural networks trained in the latest DNN libraries (such as Keras) within a massive C++ framework developed over several years, it is often useful to use a lightweight inference package such as LWTNN. It is routinely used by certain experiments at CERN (the European Organization for Nuclear Research), but the interest in such a package extends beyond physics. It is lighter than adding an entire TensorFlow dependency to a production framework and avoids problems such as having to harmonise multithreading. The idea in LWTNN is simply to pick up the weights of a model trained in Keras from the weights .h5 file and use them for inference in C++. So it is useful if the weight of a swish function is always stored in the weights file when a model is saved, whether or not the beta parameter is trainable. For this reason I suggest registering beta with the `add_weight` method even when it is not trainable.
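As a rough illustration of that suggestion (again a sketch only, not the actual keras-contrib implementation; `trainable_beta` and the attribute names are placeholders), the layer could register beta unconditionally and use the trainable flag only to decide whether it is updated during training:

```python
import tensorflow as tf

class Swish(tf.keras.layers.Layer):
    """Sketch of a Swish layer whose beta is always saved with the weights."""

    def __init__(self, beta=1.0, trainable_beta=False, **kwargs):
        super().__init__(**kwargs)
        self.beta_init = beta
        self.trainable_beta = trainable_beta

    def build(self, input_shape):
        # beta is registered in every case; trainable=False keeps it fixed
        # during training but still includes it in save_weights() output.
        self.beta = self.add_weight(
            name="beta",
            shape=(),
            initializer=tf.keras.initializers.Constant(self.beta_init),
            trainable=self.trainable_beta,
        )
        super().build(input_shape)

    def call(self, inputs):
        return inputs * tf.sigmoid(self.beta * inputs)

    def get_config(self):
        config = super().get_config()
        config.update({"beta": self.beta_init,
                       "trainable_beta": self.trainable_beta})
        return config
```

With this variant, the h5py listing from the earlier sketch shows a beta dataset regardless of the flag, which is what a converter such as LWTNN needs.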
But I am open to any other solution as long as Keras always saves the beta value in the weights file.
How does it work, for example, with the …?
For the …
On further consideration, I am not convinced it's worth changing the implementation of `Swish`.
We don't plan on changing this, so no problem here. On a side note, we can't change code just for the sake of facilitating implementation in specific projects; for a change to take place, a majority of users must benefit from it.