Query scheduler does not remove failed extraction job in its internal queue. #605

haiqi96 · 2024-11-20T16:24:49Z

Bug

The current query scheduler uses two dictionaries to map a stream_id to a job.

When a stream extraction job fails, the query scheduler is supposed to remove the job from the dictionaries and notify the other jobs waiting on the same stream_id of the failure.

However, if an exception is thrown by the worker (see here), the scheduler simply continues without cleaning up the entries in the dictionaries. This causes a problem when the following sequence happens:

1. Webui submits stream extraction job 0 with stream ID X.
2. Job 0 fails due to an exception in the worker.
3. Webui submits another stream extraction job, job 1, with the same stream ID X.
4. The query scheduler sees that the entry {X: [0]} is still in the dictionary, so it assumes job 0 is still running (although it has actually failed).
5. The query scheduler then marks job 1 as running without submitting it to a worker, and keeps waiting on job 0 (which will never return because it has already failed).
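
To make the sequence above concrete, here is a minimal sketch of the cleanup I would expect on the exception path. It is not the actual scheduler code: `StreamJobTracker`, `poll_worker_result`, and the two dictionary names are hypothetical, and the exact shape of the real dictionaries is an assumption.

```python
from typing import Callable, Dict, List, Set


class StreamJobTracker:
    """Hypothetical stand-in for the scheduler's two stream_id dictionaries."""

    def __init__(self) -> None:
        # stream_id -> job that was actually submitted to a worker (assumed layout)
        self.active_job_for_stream: Dict[str, int] = {}
        # stream_id -> jobs waiting on the active job's result (assumed layout)
        self.waiting_jobs_for_stream: Dict[str, List[int]] = {}
        self.failed_jobs: Set[int] = set()

    def submit(self, stream_id: str, job_id: int) -> bool:
        """Return True if the job must be dispatched to a worker,
        False if it only waits on an already-running job."""
        if stream_id in self.active_job_for_stream:
            self.waiting_jobs_for_stream.setdefault(stream_id, []).append(job_id)
            return False
        self.active_job_for_stream[stream_id] = job_id
        return True

    def handle_failure(self, stream_id: str) -> List[int]:
        """Drop every entry for stream_id and mark the active job
        and all of its waiters as failed."""
        active = self.active_job_for_stream.pop(stream_id, None)
        waiting = self.waiting_jobs_for_stream.pop(stream_id, [])
        failed = ([active] if active is not None else []) + waiting
        self.failed_jobs.update(failed)
        return failed


def poll_worker_result(
    tracker: StreamJobTracker, stream_id: str, get_result: Callable[[], object]
) -> bool:
    """Fetch the worker result; on an exception, clean up instead of just continuing."""
    try:
        get_result()
    except Exception:
        # This cleanup is the step the issue describes as missing.
        tracker.handle_failure(stream_id)
        return False
    return True
```

With this cleanup in place, job 1 in the sequence above would be dispatched to a worker rather than left waiting on job 0, because the stale entry for X is gone by the time job 1 is submitted.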

CLP version

ee7e493

Environment

Ubuntu jammy

Reproduction steps

Described in the Bug description


haiqi96 added the bug label on Nov 20, 2024
