Load a lot of models in Triton #6135
-
Good morning,
Replies: 1 comment
-
Have you tried loading with verbose logging enabled (--log-verbose=1)? That may help figure out what's going on. Triton does not impose a limit on the number of models beyond the limits of your environment (e.g. the machine running out of memory). It sounds like EXPLICIT model control mode (--model-control-mode=explicit) would work best for your use case. See here. You can load models one at a time to see when memory runs out, then load/unload them as needed.
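To make the load/unload-as-needed idea concrete, here is a minimal sketch, assuming the server was started with --model-control-mode=explicit and that its HTTP endpoint is reachable at localhost:8000 (the default HTTP port; adjust for your setup). It calls the load and unload endpoints of Triton's model repository HTTP API. The `ModelPool` class, its `max_loaded` cap, and the least-recently-used eviction policy are hypothetical helpers for illustration, not part of Triton; pick the cap from the memory you actually measure.

```python
import json
import urllib.parse
import urllib.request
from collections import OrderedDict

# Assumption: Triton's HTTP endpoint lives here (8000 is the default HTTP port).
TRITON_URL = "http://localhost:8000"


def control_url(base_url, model, action):
    """Build the model repository URL for a 'load' or 'unload' request."""
    if action not in ("load", "unload"):
        raise ValueError(f"unsupported action: {action}")
    name = urllib.parse.quote(model, safe="")
    return f"{base_url.rstrip('/')}/v2/repository/models/{name}/{action}"


def post_control(model, action, base_url=TRITON_URL):
    """POST an empty JSON body to the load/unload endpoint; raises on HTTP errors."""
    req = urllib.request.Request(
        control_url(base_url, model, action),
        data=json.dumps({}).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req, timeout=300) as resp:
        resp.read()


class ModelPool:
    """Hypothetical helper: keep at most `max_loaded` models loaded (LRU eviction)."""

    def __init__(self, max_loaded, load_fn=None, unload_fn=None):
        if max_loaded < 1:
            raise ValueError("max_loaded must be at least 1")
        self.max_loaded = max_loaded
        # The callables are injectable so the policy can be tried without a server.
        self._load = load_fn or (lambda m: post_control(m, "load"))
        self._unload = unload_fn or (lambda m: post_control(m, "unload"))
        self._loaded = OrderedDict()  # model name -> True, oldest use first

    def ensure_loaded(self, model):
        """Load `model` if needed, unloading the least recently used one first."""
        if model in self._loaded:
            self._loaded.move_to_end(model)  # mark as most recently used
            return
        while len(self._loaded) >= self.max_loaded:
            victim, _ = self._loaded.popitem(last=False)
            self._unload(victim)
        self._load(model)
        self._loaded[model] = True

    def loaded(self):
        """Return loaded model names, least recently used first."""
        return list(self._loaded)
```

With a running server you would create something like `pool = ModelPool(max_loaded=20)` and call `pool.ensure_loaded("my_model")` (a placeholder name) before each inference request. The pool only tracks models it loaded itself, so anything loaded at startup via --load-model is invisible to it.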