Deadlock detection fails if threads are transitively waited for #2

Open
Sipkab opened this issue Dec 31, 2019 · 0 comments
Labels
bug Something isn't working

Comments

@Sipkab
Member

Sipkab commented Dec 31, 2019

Given the following scenario:

  1. The build task T is invoked.
  2. The task starts a new thread W.
  3. Thread W starts waiting for another task D. (However, task D is never started.)
  4. The build task attempts to wait for W.
  5. The execution deadlocks, because the wait for D never completes. However, the deadlock is not detected by the build system.
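
The scenario can be imitated with plain Java threads. This is only a minimal sketch and does not use the saker.build API: the `CountDownLatch` is a stand-in for "task D has finished", and the class and method names are made up for illustration. The timeout on `join` exists only so that the demo terminates; a real task would call `join()` without one and hang.

```java
import java.util.concurrent.CountDownLatch;

public class TransitiveWaitDemo {
    /** Returns true if thread W is still blocked after the timeout, i.e. the hang is reproduced. */
    public static boolean reproduce(long timeoutMillis) throws InterruptedException {
        // Stand-in for "task D finished". Nothing ever counts it down,
        // because D is never started.
        CountDownLatch taskDFinished = new CountDownLatch(1);

        // Steps 2-3: task T starts thread W, which waits for task D.
        Thread w = new Thread(() -> {
            try {
                taskDFinished.await();
            } catch (InterruptedException e) {
                // Interrupting is the only way out of this wait.
                Thread.currentThread().interrupt();
            }
        }, "W");
        w.start();

        // Step 4: task T waits for W. A plain w.join() would block forever.
        w.join(timeoutMillis);
        boolean stillBlocked = w.isAlive();

        // Clean up by interrupting W, which mirrors the manual interruption.
        w.interrupt();
        w.join();
        return stillBlocked;
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("W still blocked: " + reproduce(200)); // prints "W still blocked: true"
    }
}
```

From the outside, the thread of T is only blocked in `Thread.join`, so nothing tells the build system that T indirectly depends on D.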

The build execution can be stopped by manually interrupting the execution thread. This can be done inside an IDE; however, it's unclear how to do it for command line execution.

This behaviour can be disruptive in case of CI builds, as the build will never stop.

Workaround

  1. Don't configure the build to deadlock. It can usually be avoided in a straightforward way.
  2. Don't implement tasks that concurrently wait for threads and wait for tasks.

Solution

In general, advising task authors to wait for input tasks first and do the work last should be enough. This is already the recommended workflow for build task implementations, so little would change.
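
A rough sketch of that ordering, using `CompletableFuture` as a stand-in for input task results (this is not the saker.build API, and `InputsFirstTask` is a hypothetical name): the task resolves every input on its own thread first, and the worker threads it starts afterwards never wait for tasks.

```java
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.stream.Collectors;

public class InputsFirstTask {
    /** Hypothetical task body: waits for all inputs first, then fans out the work. */
    public static int run(List<CompletableFuture<Integer>> inputTasks) throws InterruptedException {
        // Phase 1: wait for the input tasks on the task thread itself.
        List<Integer> inputs = inputTasks.stream()
                .map(CompletableFuture::join)
                .collect(Collectors.toList());

        // Phase 2: worker threads only compute on already-resolved values.
        int[] partial = new int[inputs.size()];
        Thread[] workers = new Thread[inputs.size()];
        for (int i = 0; i < workers.length; i++) {
            final int idx = i;
            workers[i] = new Thread(() -> partial[idx] = inputs.get(idx) * 2);
            workers[i].start();
        }

        // Phase 3: joining is safe here, since no worker can block on a task.
        int sum = 0;
        for (int i = 0; i < workers.length; i++) {
            workers[i].join();
            sum += partial[i];
        }
        return sum;
    }

    public static void main(String[] args) throws InterruptedException {
        List<CompletableFuture<Integer>> inputs = Arrays.asList(
                CompletableFuture.completedFuture(1),
                CompletableFuture.completedFuture(2));
        System.out.println(run(inputs)); // prints 6
    }
}
```

With this structure the only place a task can block on another task is its own thread, which is the kind of wait the issue describes as detectable.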

Another solution is to allow the above transitive waiting, but require build task authors to delegate the Thread.join call through the build system. In this case the build system can keep track of the number of waiting threads.
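
One possible shape for such a delegated join is sketched below. `JoinTracker` and its methods are hypothetical names, not existing saker.build API; the sketch only shows how routing the call through a shared object makes the number of waiting threads observable, and it leaves out the actual deadlock decision.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.atomic.AtomicInteger;

public class JoinTracker {
    private final AtomicInteger waitingThreads = new AtomicInteger();

    /** Joins the given thread while counting the caller as waiting. */
    public void join(Thread thread) throws InterruptedException {
        waitingThreads.incrementAndGet();
        try {
            thread.join();
        } finally {
            waitingThreads.decrementAndGet();
        }
    }

    public int getWaitingThreadCount() {
        return waitingThreads.get();
    }

    /** Returns the waiting count observed during and after a delegated join. */
    public static int[] demo() throws InterruptedException {
        JoinTracker tracker = new JoinTracker();
        CountDownLatch release = new CountDownLatch(1);

        // Worker that stays blocked until the latch is released.
        Thread worker = new Thread(() -> {
            try {
                release.await();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        worker.start();

        // Stand-in for the task thread, which joins the worker through the tracker.
        Thread taskThread = new Thread(() -> {
            try {
                tracker.join(worker);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        taskThread.start();

        // Wait until the task thread has registered itself as waiting.
        while (tracker.getWaitingThreadCount() == 0) {
            Thread.sleep(1);
        }
        int during = tracker.getWaitingThreadCount();

        release.countDown();
        taskThread.join();
        int after = tracker.getWaitingThreadCount();
        return new int[] { during, after };
    }

    public static void main(String[] args) throws InterruptedException {
        int[] counts = demo();
        System.out.println(counts[0] + " " + counts[1]); // prints "1 0"
    }
}
```

A build system could compare this count with the number of threads blocked waiting for tasks to decide that no progress is possible, though that policy is an assumption beyond what the issue specifies.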

Notice

This issue may remain open for a prolonged time as a known bug of saker.build. It occurs rarely and can be mitigated by proper build task implementations. Delegating through the build system is still a viable partial solution that has a high chance of being implemented.

@Sipkab Sipkab added the bug Something isn't working label Dec 31, 2019
Sipkab added a commit that referenced this issue Jul 5, 2020