Stuck VOD tasks #1103

Open
figintern opened this issue Apr 18, 2022 · 4 comments
Labels
type: bug Something isn't working

Comments

@figintern
Contributor

figintern commented Apr 18, 2022

Describe the bug
When requesting certain tasks in the VOD API (e.g. Import, Transcode), a task may get stuck in a false Waiting state if it was never actually queued to run. This can happen if the API was unable to access the task queues, for example because of a credentials problem.

Expected behavior
Tasks which never made it into the task-runner queues should be put into a Failed state directly.

Additional context
enqueueTask in the scheduler could fail; it should provide a failure callback that sets the task state to Failed.

https://github.com/livepeer/livepeer-com/blob/d3bd5c6e2d133436024b6df09592370a565d8e53/packages/api/src/task/scheduler.ts#L165

https://github.com/livepeer/livepeer-com/blob/d3bd5c6e2d133436024b6df09592370a565d8e53/packages/api/src/store/queue.ts#L224
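
A rough sketch of the idea, not the actual scheduler code: the `Task`, `TaskQueue` and `TaskStore` types, the phase names and the routing key format below are all assumptions made up for illustration. The only point is that the task is marked Waiting after the publish succeeds, and goes straight to Failed (with the reason recorded) if the publish rejects.

```typescript
// Hypothetical types for illustration only; the real scheduler, queue and
// store APIs in the linked files may look quite different.
type TaskPhase = "pending" | "waiting" | "running" | "failed" | "completed";

interface Task {
  id: string;
  type: "import" | "transcode";
  status: { phase: TaskPhase; errorMessage?: string; updatedAt: number };
}

// Stand-in for the queue client: publish() rejects when the queue cannot be
// reached (bad credentials, broker down, etc.).
interface TaskQueue {
  publish(routingKey: string, payload: unknown): Promise<void>;
}

// Stand-in for wherever task state is persisted.
interface TaskStore {
  update(id: string, patch: Partial<Task>): Promise<void>;
}

async function enqueueTask(
  task: Task,
  queue: TaskQueue,
  store: TaskStore
): Promise<Task> {
  let status: Task["status"];
  try {
    // The routing key format here is an assumption, not the real one.
    await queue.publish(`task.${task.type}.${task.id}`, { taskId: task.id });
    // Only report Waiting once the message has actually been accepted.
    status = { phase: "waiting", updatedAt: Date.now() };
  } catch (err) {
    // The task never reached the task-runner queue, so fail it directly
    // instead of leaving it in a false Waiting state.
    const reason = err instanceof Error ? err.message : String(err);
    status = {
      phase: "failed",
      errorMessage: `Failed to enqueue task: ${reason}`,
      updatedAt: Date.now(),
    };
  }
  await store.update(task.id, { status });
  return { ...task, status };
}
```

This handles the failure inline with try/catch rather than through a separate callback, which is just one option; a callback passed in by the caller would work the same way. It also does not cover the case where the publish succeeds but the store update fails.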

@figintern figintern added the type: bug Something isn't working label Apr 18, 2022
@github-actions github-actions bot added the status: triage this issue has not been evaluated yet label Apr 18, 2022
@pglowacky
Contributor

Last week we implemented alarms for when this happens, so in the future we should no longer be finding out about it from customers.

This is currently blocked by a decision about how best to fix this issue, pending some changes with recording and RabbitMQ. We should discuss this as a group with Victor as the decider.

@pglowacky pglowacky removed the status: triage this issue has not been evaluated yet label Apr 27, 2022
@yondonfu yondonfu transferred this issue from livepeer/go-livepeer May 3, 2022
@victorges
Member

victorges commented May 3, 2022

Oh, I did not know that we had this already and created this as well: #1031

The one I created is for a specific fix that we have to make though, which is not blocked by any decision AFAIK.

> This is currently blocked by a decision about how best to fix this issue, pending some changes with recording and RabbitMQ. We should discuss this as a group with Victor as the decider.

@pglowacky I'm not sure if this is blocked by anything. Did you mean to comment on this other issue instead? livepeer/task-runner#19
That one is more related to recordings, but I have also observed it happen on non-recording asset imports, so it is not exclusive to recordings either.

Also, what changes to RabbitMQ were you referring to?

@pglowacky
Contributor

Hm! I think this was the intended issue, and my comments were based on some conversation we had. If it's not true, disregard!

@victorges victorges transferred this issue from livepeer/studio Jun 8, 2022
@victorges victorges transferred this issue from livepeer/task-runner Jun 8, 2022
@victorges victorges removed their assignment Jun 8, 2022
@pglowacky
Contributor

We currently have an alert for this, so we have a rough idea of how often it is happening.

3 participants