Need help using custom ckpt file from S3 #33
Comments
Hey, welcome! You're totally on the right track... this should be working with Two most common reasons:
Just before a long international flight but will check in tomorrow. You can send the log in the meantime if you have it 🙏 |
Thank you for the response Gadi, hope the flight treated you well :). Re: 1, I think I am using the Re:2, I noticed that copying the URI from S3 gets a URI that looks like This is the log when I set the
This is when I set the
Thank you again for your help 🙏🏻 |
I might be looking at the wrong place, but is the correct flow in the Seems like my build is getting stuck here. I could be looking at an older version of the API (or misunderstanding something) but wanted to point it out just in case it's helpful. Thank you again! |
Hey, @DenimMazuki! Thanks for your patience and detailed reporting and troubleshooting steps. Ok great, you're indeed on the new And indeed, you want In short, all looks good. It's a pity banana doesn't give the full build logs on a successful build too, in case anything came up there. However, I'm recalling now that you said it's downloading the default stability weights... so here's my game plan:
We'll get this working and thanks for your patience too over my travels. |
Hey @gadicc , thank you for the detailed response! I really appreciate the responses :) Gameplan 1: I set the
GP 2: Failing the first, here's the runtime logs on the re-deploy (with stability weights)
GP 3: Yes! I'll send the path to you privately on discord :) Thank you again and hope that's helpful! |
Hey @DenimMazuki. Ok great, thanks so much for all this... I can see exactly what's going on now. Firstly, let me apologize 😅 It was a silly issue on my side. I know you've spent a few days on this and how frustrating that can be. But we're very close to an official I've released a fix to the
ARG FROM_IMAGE="gadicc/diffusers-api" # <- from this
ARG FROM_IMAGE="gadicc/diffusers-api:dev" # <- to this
(or overriding that build arg variable on the banana dashboard, but last time I checked, you still have to push a commit anyways to get it to use the new values)
The reason this wasn't working was a faulty assumption that checkpoints would be optimized first and saved to S3. This will work now regardless, but since we don't support banana's optimization anymore, your cold starts will be slower. There's info on creating an optimized build at https://forums.kiri.art/t/safetensors-our-own-optimization-faster-model-init/98. The short of it is, you need a 2nd deployment of the main repo (not build-downloads), which can perform the optimization for you. It's not too much extra work but is a little time consuming, especially the first time, so as a thank you for your patience, I'm happy to do the conversion for you if you send the checkpoint file.
(The reason we need a second deploy is that banana builds happen without a GPU, which is required for optimization. In banana's case, after the first stage of the build, they move the built image to separate "optimization servers" to complete the optimization, but we have no way of hooking into this.)
Anyway, hope you find all the extra info interesting, otherwise you can safely ignore it for now and just get up and running in the meantime without the optimization. I should be around to help if you experience any further issues. And thanks again for your patience :) |
Whoa, sorry, one other note... I haven't deployed to banana in quite a while, and things have improved there big time!!! First, their optimization worked, which I wasn't expecting (it was breaking on docker-diffusers-api for a really long time!)... so even if you don't use our optimization, you'll still get theirs. Secondly, wow, they've really improved things... everything completed in a few minutes instead of the 1hr+ I remember. Anyways, just wanted to correct my earlier note about things being slow without using "our" optimization. |
Hello!
I'm trying to use a custom ckpt to deploy to banana. My file is in S3 and I tried setting the CHECKPOINT_URL ARG in the Dockerfile with no luck (looks like the default stability weights got loaded instead of my ckpt in the S3 bucket).
I tried setting MODEL_URL to the S3 location as well, and am not having much luck either (it reports a tar error on the ckpt file from S3).
Am I approaching this the wrong way? I dug around and saw there's code to convert ckpt to diffusers format (and use it while building on banana). Would appreciate some guidance, thank you! 🙏🏻
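The tar error described in the post above suggests that MODEL_URL is unpacked as an archive, whereas a ckpt is a single checkpoint file. That reading is an assumption; the thread does not confirm it. Below is a minimal sketch for checking locally which kind of file you have, using only Python's standard library. The helper name `looks_like_tar` and the file names are hypothetical, and the files are dummy stand-ins, not real model weights.

```python
import io
import os
import tarfile
import tempfile


def looks_like_tar(path: str) -> bool:
    """Return True if the file at `path` can be opened as a tar archive."""
    return tarfile.is_tarfile(path)


def demo() -> dict:
    """Build one non-tar file and one tar archive, then classify both."""
    results = {}
    with tempfile.TemporaryDirectory() as tmp:
        # Stand-in for a raw checkpoint: arbitrary bytes that are not a tar archive.
        fake_ckpt = os.path.join(tmp, "model.ckpt")
        with open(fake_ckpt, "wb") as f:
            f.write(b"not a tar archive" * 64)

        # Stand-in for a packaged model: a real tar archive with one member.
        fake_tar = os.path.join(tmp, "model.tar")
        payload = b"weights"
        with tarfile.open(fake_tar, "w") as tar:
            info = tarfile.TarInfo(name="model/weights.bin")
            info.size = len(payload)
            tar.addfile(info, io.BytesIO(payload))

        results["ckpt"] = looks_like_tar(fake_ckpt)
        results["tar"] = looks_like_tar(fake_tar)
    return results


if __name__ == "__main__":
    print(demo())
```

If a file you intend to pass as MODEL_URL does not pass a check like this, the tar error is the expected outcome under the assumption above, and CHECKPOINT_URL is probably the setting to use for a raw ckpt.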