Replies: 1 comment 1 reply
-
There's no option to load multiple models.
-
I'm running a serverless worker that is basically just a Docker image. Adding the refiner is too slow if it has to switch back and forth between models. If I set the multiple-models setting to 2 in the UI, it is super fast after one generation, but I'd like to get to that state out of the box. I don't see a launch flag for keeping multiple models in memory, and I don't know how to load two models at startup either. Any advice or thoughts?
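One rough way to reach that state at container startup, assuming this is AUTOMATIC1111's stable-diffusion-webui launched with --api, and that the "multiple models" UI setting corresponds to the sd_checkpoints_limit option key (check GET /sdapi/v1/options to confirm; the checkpoint filenames below are placeholders):

```python
# Sketch of a startup warm-up script, not a confirmed recipe.
# Assumes AUTOMATIC1111 stable-diffusion-webui running locally with --api.
import requests

BASE_URL = "http://127.0.0.1:7860"  # adjust to the worker's webui address

# 1) Raise the number of checkpoints kept in memory,
#    equivalent to setting the value to 2 in the UI settings.
requests.post(
    f"{BASE_URL}/sdapi/v1/options",
    json={"sd_checkpoints_limit": 2},  # assumed option key; verify via /sdapi/v1/options
).raise_for_status()

# 2) Run one tiny generation per model so both checkpoints are loaded
#    before real traffic arrives; model names here are placeholders.
for checkpoint in ["sd_xl_base_1.0.safetensors", "sd_xl_refiner_1.0.safetensors"]:
    requests.post(
        f"{BASE_URL}/sdapi/v1/txt2img",
        json={
            "prompt": "warm-up",
            "steps": 1,
            "width": 64,
            "height": 64,
            "override_settings": {"sd_model_checkpoint": checkpoint},
        },
    ).raise_for_status()

print("Both models loaded; later requests should skip the model swap.")
```

Running this once after the webui starts (for example from the image's entrypoint) would leave both checkpoints resident, so the first real request doesn't pay the swap cost.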