Load a lot of models in Triton #6135

Answered by dyastremsky
mnaranjorion asked this question in Q&A
Have you tried loading with verbose logging enabled (--log-verbose=1)? That may help figure out what's going on. Triton itself does not limit how many models you can load; the only limits are those of your environment (e.g. it may be running out of memory).

It sounds like EXPLICIT model control mode works best for your use case. See here. You can load the models and see when memory runs out, and you can load or unload them as needed.
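
For illustration, here is a minimal sketch of what that could look like from a client, using only the Python standard library against Triton's HTTP model repository endpoints (`/v2/repository/models/<name>/load` and `/unload`). It assumes the server was started with `--model-control-mode=explicit` and is listening on `localhost:8000`. The helper names (`repository_url`, `control_model`, `load_until_failure`) and the model names in the usage note are made up for this example and are not part of Triton.

```python
import urllib.error
import urllib.parse
import urllib.request

# Assumption: Triton's HTTP endpoint is reachable here (8000 is the default HTTP port).
TRITON_URL = "http://localhost:8000"


def repository_url(model_name, action, base_url=TRITON_URL):
    """Build the model repository URL for a 'load' or 'unload' request."""
    if action not in ("load", "unload"):
        raise ValueError(f"unsupported action: {action!r}")
    quoted = urllib.parse.quote(model_name, safe="")
    return f"{base_url}/v2/repository/models/{quoted}/{action}"


def control_model(model_name, action, base_url=TRITON_URL, timeout=60.0):
    """POST a load/unload request. Returns (ok, error_message)."""
    request = urllib.request.Request(
        repository_url(model_name, action, base_url),
        data=b"{}",
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    try:
        with urllib.request.urlopen(request, timeout=timeout):
            return True, ""
    except urllib.error.HTTPError as err:
        # Triton reports a failed load (e.g. out of memory) as an HTTP error
        # with a JSON body describing the problem.
        return False, err.read().decode("utf-8", errors="replace")


def load_until_failure(model_names, load=control_model):
    """Load models in order and stop at the first one that fails.

    Returns the list of models that loaded successfully, so you can see
    how many fit in your environment before something runs out.
    """
    loaded = []
    for name in model_names:
        ok, message = load(name, "load")
        if not ok:
            print(f"failed to load {name}: {message}")
            break
        loaded.append(name)
    return loaded
```

With a running server you could call `load_until_failure(["model_a", "model_b"])` to see where loading stops, then `control_model("model_a", "unload")` to free memory before trying the next model. Connection errors (server not reachable) are deliberately not caught here, so they surface immediately.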

Answer selected by dyastremsky