update compatibility matrices: support cuda 12.2 #1396

Merged
yorickvP merged 6 commits into main from syl/update-compat-for-cu122 on Dec 4, 2023

Conversation

technillogue
Contributor

No description provided.

Signed-off-by: technillogue <[email protected]>
@technillogue technillogue force-pushed the syl/update-compat-for-cu122 branch from f7502d5 to 4c5f067 on November 21, 2023 21:11
@technillogue technillogue requested a review from mattt November 21, 2023 21:11
@yorickvP
Contributor

This removes the suffixes from torch 2.0.0, 2.0.1, and 2.1.0. Not sure why.
There's also a fun hack here: https://github.com/replicate/cog/blob/main/pkg/config/compatibility.go#L359-L367

@technillogue
Contributor Author

Hm, I'm not sure what to do with that; I just ran the compatgen script. Getting this out would probably be helpful for H100 models.

@technillogue technillogue changed the title from "update compatibility matrices" to "update compatibility matrices: support cuda 12.2" on Nov 24, 2023
@technillogue technillogue requested review from yorickvP and cloneofsimo and removed request for cloneofsimo November 24, 2023 18:41
@yorickvP
Contributor

Suffix thing is fine:
Torch 2.0.0, 2.0.1, and 2.1.0 now go through the parsePreviousTorchVersionsCode codepath, which parses https://pytorch.org/get-started/previous-versions/. Looks like they switched to having pip look up the compatible versions in pytorch/pytorch.github.io@b6cbb18, which should work.
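
For readers unfamiliar with the "suffix" being discussed: a torch requirement such as `2.0.1+cu118` carries a local version label after the `+` that names the CUDA build. The following is a minimal sketch of splitting such a string, assuming the usual `cuNNN` naming where the last digit is the CUDA minor version. The function names are hypothetical and this is not the code in cog's compatibility.go.

```python
from typing import Optional, Tuple


def split_torch_version(spec: str) -> Tuple[str, Optional[str]]:
    """Split 'X.Y.Z+cuNNN' into (base version, suffix), or (spec, None) if no suffix."""
    base, sep, local = spec.partition("+")
    return base, (local if sep else None)


def cuda_from_suffix(local: Optional[str]) -> Optional[str]:
    """Map a suffix such as 'cu118' to '11.8'; return None for anything else.

    Assumption: the last digit is the minor version (cu118 -> 11.8, cu121 -> 12.1).
    """
    if local is None or not local.startswith("cu"):
        return None
    digits = local[2:]
    if len(digits) < 2 or not digits.isdigit():
        return None
    return f"{digits[:-1]}.{digits[-1]}"


if __name__ == "__main__":
    base, suffix = split_torch_version("2.0.1+cu118")
    print(base, suffix, cuda_from_suffix(suffix))
```

With the suffix removed from the matrix entry, the plain version (`2.0.1`) is what gets requested, and the choice of CUDA build is left to pip and the index it is pointed at.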

(side-note: it would be way nicer to parse https://conda.anaconda.org/pytorch/linux-64/repodata.json instead, but that might require some constraint solving)

Failing test
This switches the default torch to 2.1.1 and the default CUDA to 12.2. Do CoreWeave and GCP support that for older GPUs? If not, we should probably override it back to 2.1.1+cu118 or something.
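
One way to picture the override being proposed: pick the newest torch release from the matrix, but prefer a pinned CUDA version when that release offers it, and only fall back to the newest CUDA otherwise. The sketch below is an illustration under those assumptions. `TorchCompat`, `pick_default`, and the sample entries are hypothetical and do not reflect cog's actual selection logic or the real matrix contents.

```python
from typing import Iterable, NamedTuple, Optional, Tuple


class TorchCompat(NamedTuple):
    torch: str            # e.g. "2.1.1"
    cuda: Optional[str]   # e.g. "11.8"; None for a CPU-only entry


def version_key(version: str) -> Tuple[int, ...]:
    """Turn '11.8' into (11, 8) so versions compare numerically, not as strings."""
    return tuple(int(part) for part in version.split("."))


def pick_default(entries: Iterable[TorchCompat], preferred_cuda: str = "11.8") -> TorchCompat:
    """Newest torch wins; within it, use preferred_cuda if available, else the newest CUDA."""
    gpu_entries = [e for e in entries if e.cuda is not None]
    if not gpu_entries:
        raise ValueError("no CUDA-enabled entries in the matrix")
    newest_torch = max(gpu_entries, key=lambda e: version_key(e.torch)).torch
    candidates = [e for e in gpu_entries if e.torch == newest_torch]
    for entry in candidates:
        if entry.cuda == preferred_cuda:
            return entry
    return max(candidates, key=lambda e: version_key(e.cuda))


if __name__ == "__main__":
    # Illustrative entries only, not the real compatibility matrix.
    matrix = [
        TorchCompat("2.1.1", "12.1"),
        TorchCompat("2.1.1", "11.8"),
        TorchCompat("2.0.1", "11.8"),
    ]
    print(pick_default(matrix))
```

Pinning the preferred CUDA version keeps the default stable for older GPUs while still letting users opt into a newer CUDA explicitly.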

@technillogue
Contributor Author

we should probably keep 11.8 as the default

This reverts commit 66817f7.

Signed-off-by: Yorick van Pelt <[email protected]>
@yorickvP
Contributor

Okay, set the default to 11.8

Signed-off-by: Yorick van Pelt <[email protected]>
@yorickvP yorickvP merged commit 29b3d7c into main Dec 4, 2023
13 checks passed
@yorickvP yorickvP deleted the syl/update-compat-for-cu122 branch December 4, 2023 13:22

3 participants