Integrate `safetensors` for model serialization #2532
Comments
Hi @strickvl, I want to work on this task. From reading your detailed description, I gather that I could add a new materializer here, named e.g. […]. But I'm struggling to figure out how we can show users both the option of […]
Hi @Dev-Khant, good question! I think the best place to start would be simply to add new materializers that use safetensors. Then we can allow users to specify them as a custom materializer for their chosen outputs. (See here for more details on that.) We can keep the new materializers as part of the standard library, but they just wouldn't be the default. (The alternative would be to have a config option on the materializer itself, but that's a big, complicated feature to implement and I think we shouldn't start there.) So, don't change the existing materializers, but add new ones that use safetensors and update the docs so that people know how to use these parallel options. Hope that makes sense!
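For readers following along, the "parallel materializers" idea can be sketched without any dependencies: the default stays untouched and a second implementation is used only when the caller opts in for a given output. Everything below is hypothetical (`PickleMaterializer`, `JsonMaterializer`, and `save_output` are not ZenML APIs), and JSON stands in for safetensors purely so the sketch runs with the standard library.

```python
import json
import pickle
import tempfile
from pathlib import Path
from typing import Any, Optional, Type


class PickleMaterializer:
    """Stand-in for an existing default materializer, left unchanged."""

    FILENAME = "artifact.pkl"

    def __init__(self, uri: str) -> None:
        self.uri = Path(uri)

    def save(self, data: Any) -> None:
        (self.uri / self.FILENAME).write_bytes(pickle.dumps(data))

    def load(self) -> Any:
        return pickle.loads((self.uri / self.FILENAME).read_bytes())


class JsonMaterializer:
    """Hypothetical opt-in alternative; JSON stands in for safetensors."""

    FILENAME = "artifact.json"

    def __init__(self, uri: str) -> None:
        self.uri = Path(uri)

    def save(self, data: Any) -> None:
        (self.uri / self.FILENAME).write_text(json.dumps(data))

    def load(self) -> Any:
        return json.loads((self.uri / self.FILENAME).read_text())


def save_output(data: Any, uri: str, materializer: Optional[Type] = None):
    """Use the default unless the caller explicitly picks another class."""
    chosen = (materializer or PickleMaterializer)(uri)
    chosen.save(data)
    return chosen


with tempfile.TemporaryDirectory() as tmp:
    weights = {"layer.weight": [0.1, 0.2], "layer.bias": [0.0]}
    default = save_output(weights, tmp)                   # pickle, as before
    opt_in = save_output(weights, tmp, JsonMaterializer)  # chosen per output
    default_roundtrip = default.load()
    opt_in_roundtrip = opt_in.load()
```

The design choice mirrored here is that existing behaviour never changes silently; the alternative format is selected explicitly, one output at a time.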
@strickvl Totally understood. As you said, we will have parallel options for materializers, so correct me if I am wrong: we will have, let's say, two […]
Correct.
@strickvl can you assign this issue to me? Thanks.
@Saedbhati sure, go for it! I've assigned it to you. Please keep in mind the conversation in this thread, however :-)
I am working on this as well!
@htahir1 While this approach would require handling single versus multiple tensors slightly differently, I feel it would avoid the problem of saving/loading models twice. Would there be a downside to this approach?
@JasonBodzy thanks for the interest - it's an interesting suggestion! Using safetensors' native functions could potentially help avoid the double-save problem we ran into earlier. Though we'd need to solve a couple of challenges:
What do you think? If you're keen to explore this approach further, it would be great to see how these pieces could come together. Feel free to share more thoughts or suggestions on tackling these requirements :-) @bcdurak, curious about your thoughts here too!
Hi! Is this still being worked on? If it hasn't been resolved, I'd love to try and resolve it.
@Squishedmac go for it!
Open Source Contributors Welcomed!
Please comment below if you would like to work on this issue!
Contact Details [Optional]
[email protected]
What happened?
ZenML currently uses Python's `pickle` module (via the `cloudpickle` library) for model serialization and materialization. However, the safetensors library is fast becoming a standard for storing tensors and model weights, offering a reasonable alternative to `pickle`. Integrating `safetensors` into ZenML would provide users with a more efficient and secure option for model serialization.
Task Description
Implement support for using `safetensors` instead of `pickle` for model materialization in ZenML. The task involves the following:
- […] `safetensors` for model serialization.
- […] `src/zenml/integrations`) to utilize `safetensors` where appropriate.
- […] `pickle`-based serialized models.
- […] `safetensors` option.
Expected Outcome
- […] `safetensors`, providing a faster and more secure alternative to `pickle`.
- […] `pickle` and `safetensors` for model materialization.
- […] `safetensors` will be seamless, maintaining compatibility with existing ZenML workflows.
- […] `safetensors` option effectively.
Steps to Implement
- […] `safetensors` library and its usage for model serialization.
- […] `safetensors` serialization.
- […] `src/zenml/integrations` that would benefit from `safetensors` and update them accordingly.
- […] `pickle`-based serialized models can still be loaded.
- […] `safetensors` option and provide examples of its usage.
- […] `safetensors` serialization in various scenarios.
Additional Context
Integrating `safetensors` into ZenML aligns with the project's goal of providing efficient and secure tools for machine learning workflows. By offering an alternative to `pickle`, ZenML empowers users with more options for model serialization, catering to their specific needs and preferences.
Code of Conduct