Pipeline runs as background jobs #154

Open
dinmukhamedm opened this issue Nov 5, 2024 · 3 comments
Labels
enhancement New feature or request help wanted Extra attention is needed

Comments

@dinmukhamedm
Member

Currently, our network configurations in the managed versions cut TCP (TLS, to be more specific) connections after 350 seconds. For most of our APIs this is more than enough, but pipeline runs sometimes take longer, especially with the rise of larger/slower models like o1.

We need to add the ability to run pipelines as background jobs.

For now, this is a discussion rather than a call for PRs. I see two possible ways forward, but we are open to more suggestions, as always.

  1. Polling job. Client submits a job, gets a run_id and polls on it.
    • Pros:
      • No need to care about network cutting anything short, as all responses are very quick
    • Cons:
      • If polling is the user's responsibility, then this overall makes UX much worse. If polling is hidden in our SDK, then we need to choose the intervals carefully so that we don't cause too much load.
      • We'll need some infrastructure (separate DB table?) to keep the status of running jobs
  2. Websocket. Client opens a websocket connection, and it's the server's responsibility to periodically ping the connection to keep it alive.
    • Pros:
      • Job state is kept in memory, similar to now, so not much additional infra
    • Cons:
      • We need to design an extensible messaging protocol with reliable ping/pong requests to make sure the connection does not close
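
To make approach 2 more concrete, here is a minimal sketch of the keep-alive idea using only `asyncio`, with no real WebSocket library involved. The function name `stream_with_keepalive` and the frame shapes (`{"type": "ping"}`, `{"type": "result", ...}`, `{"type": "error", ...}`) are assumptions made up for illustration; they are not part of the existing API. In a real deployment, each yielded string would be sent over the socket, and the ping interval would have to stay well below the 350-second cutoff.

```python
import asyncio
import json
from typing import AsyncIterator, Awaitable


async def stream_with_keepalive(
    job: Awaitable[dict], ping_interval: float = 30.0
) -> AsyncIterator[str]:
    """Yield JSON frames: periodic pings while the job runs, then one final frame."""
    task = asyncio.ensure_future(job)
    try:
        while True:
            # Wait for the job, but wake up every ping_interval seconds.
            done, _ = await asyncio.wait({task}, timeout=ping_interval)
            if done:
                break
            # Hypothetical keep-alive frame; the job state stays in memory.
            yield json.dumps({"type": "ping"})
        try:
            frame = {"type": "result", "data": task.result()}
        except Exception as exc:
            frame = {"type": "error", "message": str(exc)}
        yield json.dumps(frame)
    finally:
        # If the consumer goes away early, stop the job as well.
        task.cancel()


async def demo() -> list:
    async def slow_pipeline() -> dict:
        await asyncio.sleep(0.25)  # stands in for a long pipeline run
        return {"output": "done"}

    return [
        json.loads(frame)
        async for frame in stream_with_keepalive(slow_pipeline(), ping_interval=0.1)
    ]


if __name__ == "__main__":
    for frame in asyncio.run(demo()):
        print(frame)
```

The "extensible protocol" concern from the cons list shows up here as the `type` field: new message kinds (progress updates, partial outputs) could be added without changing how pings are handled.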

We are open to discussion about the best way forward and to any other suggestions.

@dinmukhamedm dinmukhamedm added enhancement New feature or request help wanted Extra attention is needed labels Nov 5, 2024
@nagxsan

nagxsan commented Dec 10, 2024

Can we integrate approach 1 with some sort of notification functionality?

  • Instead of the user polling continuously on the run_id, why not keep it completely asynchronous? Once the run has completed execution, the server sends a notification object to the front-end indicating the outcome (success/failure).
  • We may need to maintain a separate table that holds the run_id and the status; the server updates this status upon run completion.
  • The front-end would get this status from the database and show it to the user. Until the notification is received, we can show the status as Pending.
  • Also, if needed, the run can have a scheduled timeout (for example, 5 minutes). If this time has passed and there is no response, we indicate that with a failure status?
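
The status table and timeout described above could look roughly like the following sketch. It uses an in-memory SQLite database purely for illustration; the table name `pipeline_runs`, its columns, and all function names are hypothetical and not taken from the project's schema. The 300-second default mirrors the 5-minute example above and is only a placeholder.

```python
import sqlite3
import time
import uuid
from typing import Optional

# Hypothetical schema: one row per run, updated by the server on completion.
SCHEMA = """
CREATE TABLE IF NOT EXISTS pipeline_runs (
    run_id     TEXT PRIMARY KEY,
    status     TEXT NOT NULL,  -- 'pending', 'success' or 'failure'
    created_at REAL NOT NULL   -- Unix timestamp in seconds
)
"""


def create_run(conn: sqlite3.Connection, now: Optional[float] = None) -> str:
    """Register a new run as pending and return its run_id."""
    run_id = str(uuid.uuid4())
    created_at = time.time() if now is None else now
    conn.execute(
        "INSERT INTO pipeline_runs (run_id, status, created_at) VALUES (?, 'pending', ?)",
        (run_id, created_at),
    )
    return run_id


def complete_run(conn: sqlite3.Connection, run_id: str, success: bool) -> None:
    """Record the outcome, but only if the run is still pending."""
    conn.execute(
        "UPDATE pipeline_runs SET status = ? WHERE run_id = ? AND status = 'pending'",
        ("success" if success else "failure", run_id),
    )


def expire_stale_runs(
    conn: sqlite3.Connection, timeout_s: float = 300.0, now: Optional[float] = None
) -> int:
    """Mark pending runs older than timeout_s as failed; return how many were expired."""
    cutoff = (time.time() if now is None else now) - timeout_s
    cur = conn.execute(
        "UPDATE pipeline_runs SET status = 'failure' "
        "WHERE status = 'pending' AND created_at < ?",
        (cutoff,),
    )
    return cur.rowcount


def get_status(conn: sqlite3.Connection, run_id: str) -> Optional[str]:
    """Return the stored status, or None if the run_id is unknown."""
    row = conn.execute(
        "SELECT status FROM pipeline_runs WHERE run_id = ?", (run_id,)
    ).fetchone()
    return row[0] if row else None
```

A periodic task could call `expire_stale_runs` so that runs which never report back do not stay Pending forever.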

Please let me know if I am missing something crucial or going wrong somewhere.

@dinmukhamedm
Member Author

Hey @nagxsan, thanks for your input and sorry for the delayed response!

> We may need to maintain a separate table

Agreed, that's likely.

> the server sends a notification object to the front-end

This issue is about the API requests from code. See the API reference. Currently, the pipeline run requests follow a simple REST request-response model. The best (and maybe the only) way I can imagine us sending a notification to the code SDKs is exactly the approach 2 I suggested.

Do you have another idea on how to send a notification?

@nagxsan

nagxsan commented Jan 6, 2025

In that situation, I think approach 1 sounds the most straightforward:

  • The user starts a pipeline run with a POST request, which returns a run_id.
  • The client makes a GET request that reads the status of the run from the relevant table (which may need to be created).
  • If the run succeeded, display the data; otherwise, just show the status as Pending.
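
The submit-then-check flow in the steps above can be sketched server-side as follows. This is only an illustration under stated assumptions: `RunRegistry`, `submit`, and `status` are hypothetical names, the state is kept in memory rather than in the table discussed earlier, and the two methods stand in for what the POST and GET handlers would do.

```python
import uuid
from concurrent.futures import Future, ThreadPoolExecutor
from typing import Any, Callable, Dict


class RunRegistry:
    """Tracks background pipeline runs by run_id (in-memory stand-in for a DB table)."""

    def __init__(self, max_workers: int = 4) -> None:
        self._pool = ThreadPoolExecutor(max_workers=max_workers)
        self._runs: Dict[str, Future] = {}

    def submit(self, pipeline: Callable[[], Any]) -> str:
        """What a POST handler might do: start the run and return immediately."""
        run_id = str(uuid.uuid4())
        self._runs[run_id] = self._pool.submit(pipeline)
        return run_id

    def status(self, run_id: str) -> dict:
        """What a GET handler might do: report the current state without blocking."""
        future = self._runs.get(run_id)
        if future is None:
            return {"status": "not_found"}
        if not future.done():
            return {"status": "pending"}
        error = future.exception()
        if error is not None:
            return {"status": "failure", "error": str(error)}
        return {"status": "success", "data": future.result()}
```

Both calls return quickly regardless of how long the pipeline takes, which is what keeps every HTTP response under the connection cutoff.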

> then this overall makes UX much worse

Can you please tell me why this is the case? Pipeline runs are long-running anyway, so the user will probably just submit the job and forget about it for a while. Even if they manually refresh after some time, the GET request will show the status of the job, right?
We can perform some sort of polling, which would mean sending the GET request at specific intervals and checking the status of the job. If the status is pending, keep polling; otherwise, return the state.
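
A polling loop like the one described could be sketched as below. Everything here is an assumption for illustration: `poll_until_done` is a hypothetical helper, `fetch_status` stands in for the GET request, and the interval values are placeholders. The exponential backoff with jitter is one possible answer to the load concern raised in the issue, not something the thread has agreed on.

```python
import random
import time
from typing import Callable


def poll_until_done(
    fetch_status: Callable[[], dict],
    initial_interval: float = 1.0,
    max_interval: float = 30.0,
    backoff: float = 2.0,
    timeout: float = 3600.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> dict:
    """Call fetch_status until the run is no longer pending, backing off between calls."""
    deadline = clock() + timeout
    interval = initial_interval
    while True:
        result = fetch_status()
        if result.get("status") != "pending":
            return result
        if clock() >= deadline:
            raise TimeoutError("run is still pending after the polling timeout")
        # Jitter spreads out clients that started polling at the same moment.
        sleep(interval * random.uniform(0.5, 1.0))
        # Grow the interval so long runs generate fewer requests over time.
        interval = min(interval * backoff, max_interval)
```

If this lived inside the SDK, the caller would still see a single blocking call, while each underlying request stays short.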
