Forcing a connection close() #45
Comments
@alfredodeza subprocesses are perilous when it comes to Pushy; you need to make sure that the subprocess does not inherit anything that will cause it to interact with the proxying. That includes I/O, as there's a background thread on the target that reads from the redirected I/O back to the client. If you can either point me at the offending code or, better yet, provide a self-contained minimal reproducer, then I will hopefully be able to provide some concrete suggestions.
@axw we have a few helpers around pushy to make it easier to send remote functions, the workflow can be a bit hard to follow. What we are kind of forced to do (and I can confirm is what causes all of this) is that we set Using However, the particular scenario where this blocks forever, triggering the problem described in this ticket is when that That forked call is also writing to If that script completes normally, then the whole process is able to complete and I can close the connection without blocking. If that call doesn't, then pushy sits waiting for it to complete. Capturing the remote stderr and stdout is imperative for our processes, but for this specific scenario, we know we have completed everything and we want to close all connections to the remote ends. Is there really no way to do this?
@alfredodeza Can you please try creating remote StringIOs and assigning to stdout/stderr? That way the subprocess shouldn't ever be interacting with a proxied object. i.e.
Yes, I did try that, with the same result: the connection hangs and I can't close it :(
Sorry, I'll need a reproducing test case then to look into it further. If you could provide something minimal that'd be great, otherwise I will attempt it myself when I have some spare time.
@alfredodeza Can you please try passing close_fds=True to subprocess.Popen? I created a program that does something similar to what you describe, and the connection is locking up because the forked process is inheriting and keeping the RPC channel open. Passing close_fds fixed it for my case.
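To illustrate why close_fds can matter here, the following is a minimal, POSIX-only sketch that does not use Pushy at all. It assumes a plain os.pipe() standing in for the RPC channel, and eof_delay is a hypothetical helper name. The write end is marked inheritable explicitly, because Python 3 no longer lets children inherit descriptors by default.

```python
import os
import subprocess
import sys
import time


def eof_delay(close_fds, child_seconds=1.0):
    """Return how long the parent waits for EOF on a pipe while a child runs.

    The pipe is only a stand-in for a channel such as an RPC connection.
    """
    read_fd, write_fd = os.pipe()
    # Python 3 creates non-inheritable descriptors (PEP 446); mark the
    # write end inheritable to mimic the older leak-by-default behaviour.
    os.set_inheritable(write_fd, True)
    child = subprocess.Popen(
        [sys.executable, "-c", "import time; time.sleep(%s)" % child_seconds],
        close_fds=close_fds,
    )
    os.close(write_fd)  # the parent drops its own copy of the write end
    start = time.monotonic()
    # read() reports EOF only once every copy of the write end is closed,
    # including any copy the child inherited.
    os.read(read_fd, 1)
    elapsed = time.monotonic() - start
    os.close(read_fd)
    child.wait()
    return elapsed
```

With close_fds=False the read should not return until the sleeping child exits, while with close_fds=True it should return almost immediately, since the child never holds the write end.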
The only way I can get this to work (other than not capturing stdout/stderr) is by not doing a As soon as I tried using
@alfredodeza It seems I cannot reproduce this problem based on your descriptions alone. I've managed to get something similar (again), but in my case it blocks even with Popen; it blocks because the subprocess inherits the I/O redirector pipe, which stops the server from exiting until the subprocess exits. However, it's not blocking on MessageStream.close. I will need a reproducing test case to continue analysing.
That sounds similar. I mentioned Maybe the fact that I am stepping through the code has nothing to do with the actual problem. I would be very interested to know if your fix for this could solve my problem. I attempted numerous times to create a test case for you over the past few days, but it has been really, really hard and I have not been able to. The only way I have of reproducing this is by using ceph-deploy directly on a remote server that calls the init script, which is absolutely not reproducible outside of that environment.
I don't have a "fix" for this, if it truly is the same issue. If you spawn a subprocess that inherits the stdout/stderr file descriptors, then this issue will present. If you want to capture the output but don't want to block the connection, you would have to create your own pipes and threads to read the output into the StringIO objects. |
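One way to read the "own pipes and threads" suggestion is sketched below in plain Python, independent of Pushy. The helper name run_captured and the join_timeout parameter are assumptions made for illustration. The idea is that the child writes to dedicated pipes instead of the inherited stdout/stderr, and the reader threads are daemons that the caller stops waiting for after a bounded time.

```python
import io
import subprocess
import sys
import threading


def run_captured(cmd, join_timeout=2.0):
    """Run cmd and collect its stdout and stderr into StringIO buffers."""
    proc = subprocess.Popen(
        cmd,
        stdout=subprocess.PIPE,   # dedicated pipes, not the inherited streams
        stderr=subprocess.PIPE,
        close_fds=True,           # do not leak other descriptors to the child
        universal_newlines=True,  # decode output to text
    )
    out_buf, err_buf = io.StringIO(), io.StringIO()

    def drain(pipe, buf):
        # Copy lines until EOF, i.e. until every writer has closed the pipe.
        for line in pipe:
            buf.write(line)

    readers = [
        threading.Thread(target=drain, args=(proc.stdout, out_buf), daemon=True),
        threading.Thread(target=drain, args=(proc.stderr, err_buf), daemon=True),
    ]
    for reader in readers:
        reader.start()
    returncode = proc.wait()
    # A backgrounded grandchild may keep a pipe open; wait a bounded time
    # and then return whatever has been captured so far.
    for reader in readers:
        reader.join(join_timeout)
    return returncode, out_buf.getvalue(), err_buf.getvalue()


if __name__ == "__main__":
    code = "import sys; print('out'); sys.stderr.write('err\\n')"
    print(run_captured([sys.executable, "-c", code]))
```

If a grandchild does outlive the timeout, output it writes later is simply missed by the caller, which may be acceptable for a fire-and-forget service start.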
We are having some issues when doing remote subprocess calls (actually using subprocess.check_call) where as soon as we hit the close() method on the connection object it blocks forever.
The (partial) reason for this is that the message streams are acquiring a lock and, since there are no timeouts for closing, it basically stops all threads from closing the connection and it deadlocks.
This is extremely severe for us as we can't do anything but Ctrl-C the command to exit.
On the actual machines that call we see a bunch of these processes around after a while:
All that the remote subprocess command is doing is starting a service; that service in turn backgrounds a script call (something like bash script.sh 2> /dev/null &).
That script will do a long running process which might not complete, but the user does not care; it should just be fire and forget, and then close the connection.
When starting that service manually on the remote machine everything works as expected. There are no errors, tracebacks (or exceptions), it just deadlocks.
The actual part where the deadlock occurs is on this file:
In the MessageStream class, in the close() method when trying to do this:
Is there any way I could force closing the connection? I am no longer sure what else to try.
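As a generic stopgap, and not something Pushy itself provides, a blocking close() can be wrapped so that the caller gives up after a timeout. The sketch below assumes a hypothetical helper called close_with_timeout that takes any zero-argument callable; it does not unblock the underlying close, it only stops the calling thread from waiting on it.

```python
import threading


def close_with_timeout(close_fn, timeout=5.0):
    """Call close_fn in a daemon thread; return True if it finished in time.

    close_fn is any zero-argument callable, for example a bound close()
    method. If it never returns, the daemon thread is left behind and is
    discarded when the interpreter exits.
    """
    done = threading.Event()

    def worker():
        try:
            close_fn()
        finally:
            # Set even if close_fn raises, so the caller is not kept waiting.
            done.set()

    threading.Thread(target=worker, daemon=True).start()
    return done.wait(timeout)
```

A False return means the close is still blocked; the remote side is not cleaned up in that case, so this only helps the local command exit without Ctrl-C.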