This repository has been archived by the owner on Mar 17, 2021. It is now read-only.

Thread deadlock in TaurusPollingTimer #1178

Open
cpascual opened this issue Mar 1, 2021 · 1 comment

cpascual commented Mar 1, 2021

I suspect that some of the issues we are experiencing in the CI may be related to a deadlock in TaurusPollingTimer (thanks @reszelaz for helping with the debugging).

I can reproduce it with the following snippet:

from taurus.core import TaurusPollingTimer
import taurus
import sys

a = taurus.Attribute("eval:1")

pt = TaurusPollingTimer(period=10)
for i in range(100):
    pt.addAttribute(a)
    print(i, end=" ")
    sys.stdout.flush()
    pt.removeAttribute(a)

And here is the backtrace I get when I press CTRL+C to interrupt it after it has deadlocked:

$ python /home/cpascual/.config/JetBrains/PyCharmCE2020.2/scratches/scratch_153.py                        (taurus) 
0 1 2 3 4 5 6 7 8 9 ^CTraceback (most recent call last):
TimerLoop 10   WARNING  2021-03-01 17:55:56,591 TaurusPollingTimer[10].Timer on _pollAttributes: loop function took more than loop interval (0.01s)
  File "/home/cpascual/.config/JetBrains/PyCharmCE2020.2/scratches/scratch_153.py", line 12, in <module>
    pt.removeAttribute(a)
  File "/home/cpascual/src/taurus/lib/taurus/core/tauruspollingtimer.py", line 131, in removeAttribute
    self.stop(sync=True)
  File "/home/cpascual/src/taurus/lib/taurus/core/tauruspollingtimer.py", line 63, in stop
    self.timer.stop(sync=sync)
  File "/home/cpascual/src/taurus/lib/taurus/core/util/timer.py", line 86, in stop
    self.__thread.join()
  File "/home/cpascual/miniconda/envs/taurus/lib/python3.9/threading.py", line 1033, in join
    self._wait_for_tstate_lock()
  File "/home/cpascual/miniconda/envs/taurus/lib/python3.9/threading.py", line 1049, in _wait_for_tstate_lock
    elif lock.acquire(block, timeout):
KeyboardInterrupt

cpascual commented Mar 1, 2021

This deadlock was almost certainly introduced in PR #1002.

@cpascual cpascual added the bug label Mar 2, 2021
cpascual pushed a commit to cpascual/taurus that referenced this issue Mar 8, 2021
There is a deadlock in TaurusPollingTimer (see
taurus-org#1178) when both the _pollAttributes
and the removeAttribute() methods are waiting for the lock.

Fix it by not forcing a wait for the timer thread to stop when removing an attribute.
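
The following is a minimal sketch of how this kind of deadlock can arise, using only the standard `threading` module. It does not use Taurus code. It assumes (this is an interpretation of the backtrace, not something confirmed in the thread) that the removing thread holds the lock while it joins the timer thread, and that the timer thread needs the same lock to finish its loop iteration. The names `run_scenario`, `poll_loop` and `join_inside_lock` are hypothetical, and a join timeout is used so that the example terminates instead of hanging.

```python
import threading


def run_scenario(join_inside_lock, join_timeout=0.5):
    """Return True if the worker stopped within join_timeout, False otherwise."""
    lock = threading.Lock()
    stop = threading.Event()
    about_to_lock = threading.Event()

    def poll_loop():
        # Stand-in for the timer thread: every iteration needs the lock.
        while not stop.is_set():
            about_to_lock.set()
            with lock:
                pass  # attributes would be polled here

    worker = threading.Thread(target=poll_loop, daemon=True)
    lock.acquire()         # the "remove" side takes the lock first
    worker.start()
    about_to_lock.wait()   # the worker is now committed to acquiring the lock
    stop.set()             # ask the worker to stop

    if join_inside_lock:
        # Assumed problematic pattern: wait for the worker while holding the lock.
        # Without the timeout this join would block forever.
        worker.join(timeout=join_timeout)
        stopped = not worker.is_alive()
        lock.release()
    else:
        # Release the lock first, then wait: the worker can finish its iteration.
        lock.release()
        worker.join(timeout=join_timeout)
        stopped = not worker.is_alive()

    worker.join()  # clean up; the lock is free in both branches by now
    return stopped


if __name__ == "__main__":
    print("join while holding the lock:", run_scenario(True))
    print("join after releasing the lock:", run_scenario(False))
```

In this sketch the first scenario only ends because of the timeout, which is the situation the backtrace shows at `self.__thread.join()`. Not waiting for the thread at all, as the commit above does, avoids the problem in a similar way to the second scenario.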
cpascual pushed a commit to cpascual/taurus that referenced this issue Mar 8, 2021
Simplify TaurusPollingTimer (do not use taurus.core.util.timer.Timer)
in order to provide a more robust solution to
taurus-org#1178
The internal worker thread is never stopped.

This deprecates the start and stop methods of TaurusPollingTimer.
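
As a rough illustration of the "worker thread is never stopped" idea, here is a sketch of a poller whose daemon thread runs for the lifetime of the object and simply idles when there is nothing to poll. This is not the actual TaurusPollingTimer implementation: the class `SimplePoller`, its `add` and `remove` methods and the `period_ms` parameter are all hypothetical names chosen for the example.

```python
import threading
import time


class SimplePoller:
    """Hypothetical polling helper; NOT the real TaurusPollingTimer code."""

    def __init__(self, period_ms):
        self.period = period_ms / 1000.0
        self._lock = threading.Lock()
        self._items = {}  # name -> zero-argument callable to poll
        self._has_items = threading.Event()
        # The worker is started once and never stopped or joined.
        self._thread = threading.Thread(target=self._run, daemon=True)
        self._thread.start()

    def add(self, name, poll):
        with self._lock:
            self._items[name] = poll
            self._has_items.set()

    def remove(self, name):
        # Only the registry is updated; there is no join on the worker thread.
        with self._lock:
            self._items.pop(name, None)
            if not self._items:
                self._has_items.clear()

    def _run(self):
        while True:
            self._has_items.wait()  # idle while nothing is registered
            with self._lock:
                callbacks = list(self._items.values())  # snapshot under the lock
            for poll in callbacks:
                poll()  # called outside the lock
            time.sleep(self.period)


if __name__ == "__main__":
    poller = SimplePoller(period_ms=10)
    for i in range(100):
        poller.add("a", lambda: None)
        print(i, end=" ", flush=True)
        poller.remove("a")
```

Because `remove` never waits for the worker, the add/remove loop from the original report cannot block on a join here. The trade-off is that a callback already copied into the snapshot may still run once after its entry has been removed.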
jkotan pushed a commit to desy-fsec/taurus that referenced this issue Jun 23, 2023
Refactor TaurusPollingTimer to avoid deadlock

Closes taurus-org#1178

See merge request taurus-org/taurus!1181