[BUG] When the model is loaded and saved again, the argument names disappear (become 'args_0', 'args_0_1', etc.) #1230

Open
rnyak opened this issue Nov 29, 2023 · 2 comments
Labels
bug Something isn't working P0 status/needs-triage
Milestone

Comments

@rnyak
Contributor

rnyak commented Nov 29, 2023

Bug description

When the model (a transformer-based XLNet model that we are training) is loaded and saved again, the argument names disappear (they become 'args_0', 'args_0_1', etc.) and the saved model becomes invalid. As a result, we cannot export the reloaded model using Merlin Systems, and therefore cannot generate the config files needed to serve it on Triton.

Steps/Code to reproduce bug

Please run these two gists in order to repro the issue:

  1. https://gist.github.com/rnyak/0baa9e80cb419128379f5e91e8a30ed5
  2. https://gist.github.com/rnyak/d222fe2eb7b2ecbf87d924a762c1f7d4#file-sbr_xlnet_load-py

Expected behavior

Environment details

  • Merlin version: We are using merlin-tensorflow:23.08 image with the latest main branch pulled and installed.
  • Platform:
  • Python version:
  • PyTorch version (GPU?):
  • Tensorflow version (GPU?):
@rnyak rnyak added bug Something isn't working status/needs-triage P0 labels Nov 29, 2023
@rnyak rnyak added this to the Merlin 23.10 milestone Nov 29, 2023
@rnyak
Contributor Author

rnyak commented Dec 15, 2023

One quick workaround is to instantiate the model block right before loading the saved model. Something like this:

import tensorflow as tf
import merlin.models.tf as mm

_ = mm.XLNetBlock(d_model=32, n_head=2, n_layer=2)  # replace XLNetBlock with `GPT2Block` if you used a GPT2 block in model training.
loaded_model = tf.keras.models.load_model('/workspace/saved_model/')
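
To check whether a reloaded model has actually kept its argument names, one option is to scan the input names for the placeholder pattern described above. This is only a minimal sketch: `find_placeholder_names` is a hypothetical helper, not part of Merlin or TensorFlow, the regex assumes the placeholders always look like 'args_0' or 'args_0_1', and `item_id_list` is a made-up feature name used for illustration.

```python
import re

# Assumption: lost argument names follow the pattern reported in this issue,
# i.e. "args_0", "args_0_1", and so on.
_PLACEHOLDER = re.compile(r"^args_\d+(_\d+)*$")


def find_placeholder_names(input_names):
    """Return the names that match the placeholder pattern, in their original order."""
    return [name for name in input_names if _PLACEHOLDER.match(name)]


if __name__ == "__main__":
    # Hypothetical input names. With a real model they could be read from, e.g.,
    # tf.saved_model.load(path).signatures["serving_default"].structured_input_signature[1]
    # (assuming the export has a "serving_default" signature).
    names = ["args_0", "args_0_1", "item_id_list"]
    print(find_placeholder_names(names))
```

An empty result would suggest the original names survived the load/save round trip. A non-empty one points to the problem reported here.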

@CarloNicolini

> One quick solution to that is to add model block line right before loading the saved model. Something like that:

I have the same problem, but I don't understand this solution. Why does it work?
