Support for safetensors materializers #2539
Conversation
I think the thing to add now would be a section in that docs page on materializers that explains:
@strickvl Do I create a separate kind of section for these new materializers or add them with existing ones?
I think I'd do it as a section of its own, before this section: https://docs.zenml.io/user-guide/advanced-guide/data-management/handle-custom-data-types#custom-materializers
On Mon, 18 Mar 2024 at 16:38, Dev Khant wrote:
@strickvl Do I create a separate kind of section for these new materializers or add them with existing ones?
And I'll also prepare a different section explaining why to use safetensors and provide a code example.
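As a rough illustration of the kind of code example mentioned above (not the final docs content), saving and loading a dict of tensors with the safetensors torch API looks roughly like this; the filename is arbitrary:

```python
import torch
from safetensors.torch import load_file, save_file

# safetensors works on a flat dict of named tensors.
tensors = {"weight": torch.randn(16, 16), "bias": torch.zeros(16)}

# Only raw tensor data plus a small header are written -- no pickling,
# which is the main safety/speed argument for safetensors.
save_file(tensors, "example.safetensors")

restored = load_file("example.safetensors")
assert torch.equal(tensors["weight"], restored["weight"])
```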
Also note that tests are failing: https://github.com/zenml-io/zenml/actions/runs/8328772930/job/22789533265?pr=2539
Hi @strickvl, I have added the documentation, but I'm not sure if the code is correct for using materializers, as I could not find any docs on how to use integration-specific materializers. I have also made fixes for the failing tests. So could you please check and let me know if that's correct or how it can be improved? Thanks.
@Dev-Khant one thing you'll have to do is to make sure that your PR is made off the develop branch. See https://github.com/zenml-io/zenml/blob/develop/CONTRIBUTING.md#-pull-requests-rebase-your-branch-on-develop for more. At the moment this PR is listed as being based on
I'm fixing the failing test cases!
safetensors materializers
@strickvl The issue is being caused because here only … So how do you think we should handle …
Hi @strickvl @avishniakov @bcdurak, can you please guide me on how to fix this issue? Thanks.
@Dev-Khant Not sure I understand the question. The test function takes a model as input, as you've currently defined it. Btw, this is currently also failing mypy linting (https://github.com/zenml-io/zenml/actions/runs/8389112411/job/22974663097?pr=2539)
@strickvl I have fixed the lint issue and will push it in the next commit. Because when the safetensors materializer is called it uses … So here if we do this … So what is the best way to handle all materializers that also need …
I made the fix for failing test cases and for lint issues. |
Linting is still failing, btw. @Dev-Khant
@bcdurak I have made the relevant changes. Please review them. Thanks.
Thank you for your effort.
I really like the direction of the PR. I think you already solved some of the challenges you faced before. I added a few more comments for the required changes and modifications.
Additionally, I have two more questions:
- I see on their docs that safetensors also features APIs for tensorflow and numpy. Is it possible to add these to the list of materializers which have a safetensors variant? (See the sketch after this list for the numpy flavor.)
- Can we also add a test for the pytorch_lightning materializer?
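For illustration, a minimal sketch of the numpy flavor, which mirrors the torch API (the filename is arbitrary):

```python
import numpy as np
from safetensors.numpy import load_file, save_file

# Same idea as the torch flavor: a dict of named arrays in, one file out.
arrays = {"weights": np.random.rand(4, 4).astype(np.float32)}
save_file(arrays, "arrays.safetensors")

restored = load_file("arrays.safetensors")
assert np.array_equal(arrays["weights"], restored["weights"])
```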
# Save model architecture
model_arch = {"model": model}
model_filename = os.path.join(self.uri, DEFAULT_MODEL_FILENAME)
torch.save(model_arch, model_filename)
I think you are on the right path here; however, there is an issue. This is a pattern that I see in all of the new materializers: AFAIK, if you do torch.save(...), it does not only save the model architecture but also the weights. You can see this in play in the example we mentioned above. If you check your artifacts in your local artifact store manually, there are entire_model.safetensors and model_architecture.json present, which are both roughly 100 MB. Basically, it is saving the model twice in two different ways. We need to modify the torch.save and torch.load calls to only handle the architecture without the weights.
@bcdurak Here I could not find any method to store just the architecture in pytorch. So what would you recommend here?
This is a tough question, but in the current case it is really inefficient. It feels like we need to go back to the version where you used the save_model and load_model calls. And we somehow need to figure out how to save the model type in the save method. If I can think of anything, I will share it here.
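One possible direction, sketched here only as a hedged illustration: keep the weights in the safetensors file and record the model's class path next to it, so that loading can re-instantiate the class without pickling (and therefore without duplicating) the weights. The filenames, the save/load helpers, and the assumption that the model class is constructible from keyword arguments are all hypothetical, not the materializer's actual API:

```python
import importlib
import json
import os

import torch
from safetensors.torch import load_file, save_file

# Hypothetical filenames, for illustration only.
WEIGHTS_FILE = "model.safetensors"
TYPE_FILE = "model_type.json"


def save_model(model: torch.nn.Module, directory: str) -> None:
    # Weights go into the safetensors file (no pickling, no duplication).
    save_file(model.state_dict(), os.path.join(directory, WEIGHTS_FILE))
    # Record only the model's class path as the "architecture" metadata.
    with open(os.path.join(directory, TYPE_FILE), "w") as f:
        json.dump({"module": type(model).__module__, "class": type(model).__name__}, f)


def load_model(directory: str, **init_kwargs) -> torch.nn.Module:
    with open(os.path.join(directory, TYPE_FILE)) as f:
        info = json.load(f)
    model_cls = getattr(importlib.import_module(info["module"]), info["class"])
    # Assumes the class can be re-created from keyword arguments alone.
    model = model_cls(**init_kwargs)
    model.load_state_dict(load_file(os.path.join(directory, WEIGHTS_FILE)))
    return model
```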
Sure @bcdurak. Let me know when I should switch back to the previous method.
model: The Torch Model to write.
"""
# Save model weights
obj_filename = os.path.join(self.uri, DEFAULT_FILENAME)
If you check the regular variant of this materializer, you will see that it uses TemporaryDirectory() instances during the load and save calls. This is the case for many of our materializers, as they load/save from/to a temp directory and copy the contents to the proper destination. This is mainly due to the remote artifact stores. While our fileio and io_utils can handle remote operations, unfortunately this does not apply to most of the other libraries. Unless the safetensors.save_file and safetensors.load_file calls are compatible with all of the remote artifact stores that ZenML provides, I would suggest using the same paradigm.
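A minimal sketch of that paradigm on the save side, assuming zenml.io.fileio.copy can write to the artifact store URI (the helper name and filename are illustrative):

```python
import os
import tempfile

from safetensors.torch import save_file

from zenml.io import fileio


def save_tensors(tensors: dict, uri: str, filename: str = "model.safetensors") -> None:
    # safetensors only writes to local paths, so write into a temporary
    # directory first and then copy the result into the (possibly remote)
    # artifact store location with ZenML's fileio.
    with tempfile.TemporaryDirectory() as tmp_dir:
        local_path = os.path.join(tmp_dir, filename)
        save_file(tensors, local_path)
        fileio.copy(local_path, os.path.join(uri, filename))
```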
I'll first check and see how it can be added.
Here fileio doesn't work with safetensors, because fileio does not return a PathLike str. So I have changed the HF materializer only.
This is also quite critical. I have just tested the example in the docs with a simple GCP artifact store and as I predicted, it failed. If we leave it in this state, these materializers won't work in any non-local case.
Can we implement any conversion to make it work here?
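The load side could mirror the save-side sketch above: copy the file out of the artifact store into a local temporary directory, then hand safetensors a plain local path. Again, fileio.copy and the filename are assumptions for illustration:

```python
import os
import tempfile

from safetensors.torch import load_file

from zenml.io import fileio


def load_tensors(uri: str, filename: str = "model.safetensors") -> dict:
    # Copy the file from the (possibly remote) artifact store into a local
    # temp directory, then give safetensors a plain local path to read.
    with tempfile.TemporaryDirectory() as tmp_dir:
        local_path = os.path.join(tmp_dir, filename)
        fileio.copy(os.path.join(uri, filename), local_path)
        return load_file(local_path)
```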
Alright, if that is the case I'll find a way to use it. But let's first decide on which method to use for saving and loading the model.
@bcdurak Any update on this?
By the way, there is a known issue with our testing suite. We are currently fixing it. I will keep you updated as it goes. For the time being, feel free to ignore the failing tests.
…tom-data-types.md Co-authored-by: Barış Can Durak <[email protected]>
…tom-data-types.md Co-authored-by: Barış Can Durak <[email protected]>
…tom-data-types.md Co-authored-by: Barış Can Durak <[email protected]>
@bcdurak Thanks for this detailed review; I have gone through your comments. Here are my answers:
@bcdurak @avishniakov Can you please guide me on how I can change …
Hey @Dev-Khant, IMO, since you modified …
Understood, thanks @avishniakov. @bcdurak Let me know if I should make it a default dependency.
@Dev-Khant let me discuss the dependency issue with the team internally; I will update this thread asap.
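For context only, and not a decision on the question above: in the existing pattern, an integration usually pins its own requirement via an Integration subclass rather than as a core dependency. The class name, integration name, and version pin below are purely illustrative:

```python
from zenml.integrations.integration import Integration


class SafetensorsExampleIntegration(Integration):
    """Hypothetical integration that owns the safetensors requirement."""

    NAME = "safetensors_example"  # illustrative name
    REQUIREMENTS = ["safetensors>=0.4.0"]  # illustrative version pin

    @classmethod
    def activate(cls) -> None:
        """Would import the materializer modules here to register them."""
```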
@bcdurak Any update on this?
Hey @Dev-Khant, I can give you a quick update regarding the status. I think I can sum it up in three different roadblocks that still remain:
Alright @bcdurak, for the first point I'll switch back to the previous method and see how to store the model_type.
@Dev-Khant sorry for the delay here, but I will be closing this PR due to staleness. Feel free to reopen it when you work on it again!
Describe changes
I implemented support for safetensors for model serialization. It relates to #2532.
Pre-requisites
Please ensure you have done the following:
- My branch is based on develop and the open PR is targeting develop. If your branch wasn't based on develop, read the Contribution guide on rebasing your branch to develop.
Types of changes