I am thrilled to have discovered this project; it has been tremendously helpful for handling a high volume of API access. However, it seems that the serving mode of openai-manager does not currently implement `ChatCompletion`? That is the invocation method I primarily expect to use.
In addition, I have built a simple Flask reverse-proxy app for user access control and per-user usage limits on API calls. I would therefore like to modify my code to use openai-manager directly inside this Flask app to load-balance across multiple API keys.
Since I am not very familiar with asynchronous programming in Python, I have a couple of questions:

1. In `serving.py`, does `GLOBAL_MANAGER` only control the list of tasks submitted in a single submission, or all requests submitted over multiple submissions? In other words, can the current serving implementation properly handle multiple concurrent requests from a single source?
2. Is it feasible to call the submission function directly with `asyncio.run()` inside a Flask app with multi-threading enabled, roughly as in the sketch below?
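To make the second question concrete, this is roughly the pattern I have in mind. It is only a sketch: `submit_chat` is a hypothetical stand-in for the openai-manager submission coroutine, since I am not sure of the real entry point.

```python
import asyncio
from flask import Flask, jsonify, request

app = Flask(__name__)

async def submit_chat(messages):
    # Hypothetical stand-in for the openai-manager submission coroutine.
    await asyncio.sleep(0)  # placeholder for the real async API call
    return {"choices": [{"message": {"role": "assistant", "content": "stub"}}]}

@app.route("/v1/chat/completions", methods=["POST"])
def chat_completions():
    payload = request.get_json()
    # Each Flask worker thread creates and tears down its own event loop
    # per request; this asyncio.run() call is exactly what I am unsure about.
    result = asyncio.run(submit_chat(payload["messages"]))
    return jsonify(result)
```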
Thank you in advance for your time and help.
Thanks for your interest! And yes, `ChatCompletion` is currently only available through the Python package interface; serving mode does not implement it yet.

For your questions:

1. I would recommend using a message queue backed by Redis for your specific usage, as this project only considers requests from ONE source.
2. I am not sure the current Flask design allows calling external async functions. But yes, the simplest (though not the most elegant) workaround is to run openai-manager in a separate process; see the sketch below.
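For concreteness, here is a minimal sketch of that workaround: the Flask app pushes jobs onto a Redis list, and a separate worker process that owns openai-manager consumes them. The queue names, job schema, and the exact shape of the `openai_manager.ChatCompletion.create` call are illustrative assumptions, not this project's documented API; check the README for the real package-usage interface.

```python
# --- app.py (Flask side): enqueue jobs, block for results ---
import json
import uuid
import redis
from flask import Flask, jsonify, request

app = Flask(__name__)
r = redis.Redis()

@app.route("/v1/chat/completions", methods=["POST"])
def chat_completions():
    job_id = uuid.uuid4().hex
    r.rpush("chat_jobs", json.dumps(
        {"id": job_id, "messages": request.get_json()["messages"]}))
    # Block until the worker posts a result for this job (120 s timeout).
    item = r.blpop(f"chat_results:{job_id}", timeout=120)
    if item is None:
        return jsonify({"error": "timed out"}), 504
    return jsonify(json.loads(item[1]))
```

```python
# --- worker.py (separate process): owns openai-manager exclusively ---
import json
import redis
import openai_manager  # assumed package-usage import

r = redis.Redis()
while True:
    _, raw = r.blpop("chat_jobs")  # blocks until the Flask app pushes a job
    job = json.loads(raw)
    # Call shape assumed from the package-usage interface; adjust as needed.
    responses = openai_manager.ChatCompletion.create(
        model="gpt-3.5-turbo", messages=job["messages"])
    r.rpush(f"chat_results:{job['id']}", json.dumps(responses))
```

Because only the worker process touches openai-manager, its event loop never competes with Flask's threads, and the Flask side stays fully synchronous.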