Closing the socket client-side results in SIGPIPE, R shutdown which removes the tempdir #100

Open
rfaelens opened this issue Mar 8, 2018 · 6 comments

@rfaelens

rfaelens commented Mar 8, 2018

When I connect to RServe on Linux using 5 client-side connections
And I close the 3rd one forcefully (socket.close() in Java)
Then R receives SIGPIPE
And the R tempdir is removed (R_CleanTempDir is called)

Rserve should install its own handler for SIGPIPE and fail gracefully.

Still trying to build a nice example, but it is difficult to pinpoint the exact cause of SIGPIPE.

@rfaelens
Author

rfaelens commented Mar 8, 2018

Based on strace of my application, this always happens in the same method:

15480 13:25:35 [00007f527a6d430d] sendto(4, "\x0a\x08\x01\x00\xa2\x04\x01\x00\x15\xc4\x00\x00\x22\x0c\x00\x00\x74\x72\x79\x2d\x65\x72\x72\x6f\x72\x00\x01\x01\x13\x08\x00\x00"..., 268, 0, NULL, 0) = -1 EPIPE (Broken pipe)
 > /usr/lib64/libc-2.17.so(__send+0x1d) [0xf930d]
 > /usr/lib64/R/bin/Rserve(server_send+0xe) [0x494e]
 > /usr/lib64/R/bin/Rserve(Rserve_QAP1_send_resp+0x9f) [0x4a4f]
 > /usr/lib64/R/bin/Rserve(Rserve_QAP1_connected+0xeb9) [0x7a29]
 > /usr/lib64/R/bin/Rserve(serverLoop+0x2bc) [0xa01c]
 > /usr/lib64/R/bin/Rserve(main+0x35b) [0x328b]
 > /usr/lib64/libc-2.17.so(__libc_start_main+0xf5) [0x21c05]
 > /usr/lib64/R/bin/Rserve(_start+0x29) [0x3d29]
15480 13:25:35 [00007f527a6d430d] --- SIGPIPE {si_signo=SIGPIPE, si_code=SI_USER, si_pid=15480, si_uid=1000}
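
The send-side failure in this trace can be reproduced in isolation. The following is a minimal sketch, not Rserve code: it assumes a POSIX system, uses a local `socketpair()` in place of the real client connection, and the function name `send_to_closed_peer` is made up for illustration. With SIGPIPE ignored, `send()` on a socket whose peer has gone away simply fails with EPIPE instead of delivering a fatal signal.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <signal.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper: sends one byte to a socket whose peer has already
   been closed. Returns the errno reported by send(), or 0 if send()
   unexpectedly succeeds. */
int send_to_closed_peer(void)
{
    int fds[2];
    int rc = socketpair(AF_UNIX, SOCK_STREAM, 0, fds);
    assert(rc == 0);

    /* Without this, the default SIGPIPE action would terminate the process. */
    void (*old_handler)(int) = signal(SIGPIPE, SIG_IGN);

    close(fds[1]);                          /* the "client" goes away */
    ssize_t n = send(fds[0], "x", 1, 0);    /* the "server" keeps writing */
    int err = (n < 0) ? errno : 0;

    close(fds[0]);
    signal(SIGPIPE, old_handler);           /* restore the previous disposition */
    return err;
}
```

Calling `send_to_closed_peer()` is expected to return `EPIPE`, matching the `sendto(...) = -1 EPIPE` line above. Note that ignoring SIGPIPE process-wide is only used here to keep the demonstration short.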

@s-u
Owner

s-u commented Mar 8, 2018

It's not possible to set the SIGPIPE handler because R itself resets it continuously so apps/packages cannot touch it.

There are two options

  1. use set.tempdir for unique temp dirs (useful in particular when you use user-switching - this is what we do in RCloud)
  2. use something like if (!dir.exists(tempdir())) dir.create(tempdir(),,TRUE), although that's not 100% safe in multi-user environments (since another process could blow the directory away after you started running)

@s-u
Owner

s-u commented Mar 8, 2018

I'll see if there is a way to insert a handler before R shutdown so that we can set the tempdir to /dev/null to avoid the deletion.

@s-u s-u changed the title Closing the socket client-side results in SIGPIPE, resetting the tempdir Closing the socket client-side results in SIGPIPE, R shutdown which removes the tempdir Mar 8, 2018
@s-u s-u self-assigned this Mar 8, 2018
@rfaelens
Author

rfaelens commented Mar 8, 2018

Thanks for the comments. I do not understand how Rserve can get from a SIGPIPE raised in the Rserve code to R_CleanTempDir in the main loop of R. Is libunwind/strace reporting the stack incorrectly, or did I misunderstand something?

3489  15:38:48 [00007fc9d186b37d] rt_sigaction(SIGPIPE, {sa_handler=0x7fc9d1d41cd0, sa_mask=[PIPE], sa_flags=SA_RESTORER|SA_RESTART, sa_restorer=0x7fc9d186b270}, {sa_handler=0x7fc9d1d41cd0, sa_mask=[PIPE], sa_flags=SA_RESTORER|SA_RESTART, sa_restorer=0x7fc9d186b270}, 8) = 0
 > /usr/lib64/libc-2.17.so(__GI___libc_sigaction+0xfd) [0x3537d]
 > /usr/lib64/libc-2.17.so(signal+0x66) [0x35186]
 > /usr/lib64/R/lib/libR.so(locale2charset+0x28c5) [0x148ce5]
 > /usr/lib64/libc-2.17.so(killpg+0x40) [0x35270]
 > /usr/lib64/libc-2.17.so(__send+0x1d) [0xf930d]
 > /usr/lib64/R/bin/Rserve(server_send+0xe) [0x494e]
 > /usr/lib64/R/bin/Rserve(Rserve_QAP1_send_resp+0x9f) [0x4a4f]
 > /usr/lib64/R/bin/Rserve(Rserve_QAP1_connected+0xeb9) [0x7a29]
 > /usr/lib64/R/bin/Rserve(serverLoop+0x2bc) [0xa01c]
 > /usr/lib64/R/bin/Rserve(main+0x35b) [0x328b]
 > /usr/lib64/libc-2.17.so(__libc_start_main+0xf5) [0x21c05]
 > /usr/lib64/R/bin/Rserve(_start+0x29) [0x3d29]
3489  15:38:48 [00007fc9d186b37d] rt_sigaction(SIGINT, {sa_handler=SIG_IGN, sa_mask=[], sa_flags=SA_RESTORER, sa_restorer=0x7fc9d186b270}, {sa_handler=0x7fc9d1d41cb0, sa_mask=[INT], sa_flags=SA_RESTORER|SA_RESTART, sa_restorer=0x7fc9d186b270}, 8) = 0
 > /usr/lib64/libc-2.17.so(__GI___libc_sigaction+0xfd) [0x3537d]
 > /usr/lib64/libc-2.17.so(do_system+0x90) [0x41be0]
 > /usr/lib64/R/lib/libR.so(R_system+0x6) [0x1c0a96]
 > /usr/lib64/R/lib/libR.so(R_CleanTempDir+0x5a) [0x213e9a]
 > /usr/lib64/R/lib/libR.so(R_CleanTempDir+0xe5) [0x213f25]
 > /usr/lib64/R/lib/libR.so(setup_Rmainloop+0x5ec) [0x149d8c]
 > unexpected_backtracing_error [0x6]

@rfaelens
Author

rfaelens commented Mar 9, 2018

After testing: set.tempdir does not work.

R_CleanTempDir cleans the directory specified in Sys_TempDir.
This is set in InitTempDir and is currently not modified by unixtools.
See src/main/sysutils.c and src/unix/sys-std.c

The right way to solve this, in my humble opinion, is to set R_ignore_SIGPIPE on any internal code that is using send() and recvfrom(). See also src/main/main.c in the R source tree.

/* this flag is set if R internal code is using send() and does not
   want to trigger an error on SIGPIPE (e.g., the httpd code).
   [It is safer and more portable than other methods of handling
   broken pipes on send().]
 */

#ifndef Win32
// controlled by the internal http server in the internet module
int R_ignore_SIGPIPE = 0;

See also the example of the internal HTTP server within R src/modules/internet/Rhttpd.c. RServe should fall in the same class.
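
To illustrate the idea, here is a self-contained simulation of a flag-guarded send. It is only a sketch under stated assumptions: `ignore_sigpipe` is a local stand-in for R_ignore_SIGPIPE, `on_sigpipe` is a hypothetical handler that consults the flag the way the comment above describes, and the assumed pattern is to set the flag immediately before send() and clear it right after. None of this is the actual R or Rserve code.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <signal.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical stand-ins: ignore_sigpipe plays the role of R_ignore_SIGPIPE,
   would_have_errored records what a flag-aware handler would decide. */
static volatile sig_atomic_t ignore_sigpipe = 0;
static volatile sig_atomic_t would_have_errored = 0;

static void on_sigpipe(int signo)
{
    (void)signo;
    if (!ignore_sigpipe)
        would_have_errored = 1;   /* the point where an error would be raised */
}

/* Sends to a closed peer with the flag set to `flag` around send().
   Returns 1 if the handler would have raised an error, 0 otherwise;
   *send_errno receives the errno reported by send(). */
int send_with_flag(int flag, int *send_errno)
{
    int fds[2];
    int rc = socketpair(AF_UNIX, SOCK_STREAM, 0, fds);
    assert(rc == 0);

    struct sigaction sa, old;
    sa.sa_handler = on_sigpipe;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = 0;
    sigaction(SIGPIPE, &sa, &old);

    would_have_errored = 0;
    close(fds[1]);                          /* peer disappears */

    ignore_sigpipe = flag;                  /* guard only the send() call */
    ssize_t n = send(fds[0], "x", 1, 0);
    *send_errno = (n < 0) ? errno : 0;
    ignore_sigpipe = 0;

    close(fds[0]);
    sigaction(SIGPIPE, &old, NULL);         /* restore the previous handler */
    return would_have_errored;
}
```

With the flag set, the signal still arrives but the handler does nothing, and the caller only has to deal with the EPIPE return value from send(). With the flag cleared, the handler takes the error path.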

rfaelens added a commit to rfaelens/Rserve that referenced this issue Mar 9, 2018
…eceiving EPIPE

This solution is intended to solve issue s-u#100. The method is very similar
to what is being done in the internal R http server (see Rhttpd.c).
It should therefore not cause any issues in other parts of R.
@rfaelens
Author

rfaelens commented Mar 9, 2018

I did some tests with the new code. When receiving a SIGPIPE now, the following happens:

3436  10:29:52 [00007fb764f56bad] --- SIGPIPE {si_signo=SIGPIPE, si_code=SI_USER, si_pid=3436, si_uid=1000} ---
3436  10:29:52 [00007fb764bba37d] rt_sigaction(SIGPIPE, {sa_handler=0x7fb7652accd0, sa_mask=[PIPE], sa_flags=SA_RESTORER|SA_RESTART, sa_restorer=0x7fb764bba270},  <unfinished ...>
3460  10:29:52 [00007fb764f56bad] sendto(4, "\x01\x00\x01\x00\x48\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00", 16, 0, NULL, 0 <unfinished ...>
3436  10:29:52 [00007fb764bba37d] <... rt_sigaction resumed> {sa_handler=0x7fb7652accd0, sa_mask=[PIPE], sa_flags=SA_RESTORER|SA_RESTART, sa_restorer=0x7fb764bba270}, 8) = 0
 > /usr/lib64/libc-2.17.so(__GI___libc_sigaction+0xfd) [0x3537d]
 > /usr/lib64/libc-2.17.so(signal+0x66) [0x35186]
 > /usr/lib64/R/lib/libR.so(locale2charset+0x28c5) [0x148ce5]
 > /usr/lib64/libc-2.17.so(killpg+0x40) [0x35270]
 > /usr/lib64/libpthread-2.17.so(send+0x1d) [0xebad]
 > /usr/lib64/R/bin/Rserve(server_send+0x18) [0x55a8]
 > /usr/lib64/R/bin/Rserve(Rserve_QAP1_send_resp+0xa0) [0x54d0]
 > /usr/lib64/R/bin/Rserve(Rserve_QAP1_connected+0xda0) [0xc7e0]
 > /usr/lib64/R/bin/Rserve(serverLoop+0x262) [0xe242]
 > /usr/lib64/R/bin/Rserve(main+0x359) [0x47c9]
 > /usr/lib64/libc-2.17.so(__libc_start_main+0xf5) [0x21c05]
 > /usr/lib64/R/bin/Rserve(_start+0x29) [0x5367]
3436  10:29:52 [00007fb764bba279] rt_sigreturn({mask=[]} <unfinished ...>
3436  10:29:52 [00007fb764f56bad] <... rt_sigreturn resumed> ) = -1 EPIPE (Broken pipe)
 > /usr/lib64/libpthread-2.17.so(send+0x1d) [0xebad]
 > /usr/lib64/R/bin/Rserve(server_send+0x18) [0x55a8]
 > /usr/lib64/R/bin/Rserve(Rserve_QAP1_send_resp+0xa0) [0x54d0]
 > /usr/lib64/R/bin/Rserve(Rserve_QAP1_connected+0xda0) [0xc7e0]
 > /usr/lib64/R/bin/Rserve(serverLoop+0x262) [0xe242]
 > /usr/lib64/R/bin/Rserve(main+0x359) [0x47c9]
 > /usr/lib64/libc-2.17.so(__libc_start_main+0xf5) [0x21c05]
 > /usr/lib64/R/bin/Rserve(_start+0x29) [0x5367]
3436  10:29:52 [00007fb764f56a3d] recvfrom(4,  <unfinished ...>
3436  10:29:52 [00007fb764f56a3d] <... recvfrom resumed> "", 16, 0, NULL, NULL) = 0
 > /usr/lib64/libpthread-2.17.so(recv+0x1d) [0xea3d]
 > /usr/lib64/R/bin/Rserve(server_recv+0x18) [0x55c8]
 > /usr/lib64/R/bin/Rserve(Rserve_QAP1_connected+0x170) [0xbbb0]
 > /usr/lib64/R/bin/Rserve(serverLoop+0x262) [0xe242]
 > /usr/lib64/R/bin/Rserve(main+0x359) [0x47c9]
 > /usr/lib64/libc-2.17.so(__libc_start_main+0xf5) [0x21c05]
 > /usr/lib64/R/bin/Rserve(_start+0x29) [0x5367]
3436  10:29:52 [????????????????] +++ exited with 0 +++
1238  10:29:52 [00007fb764c74783] --- SIGCHLD {si_signo=SIGCHLD, si_code=CLD_EXITED, si_pid=3436, si_uid=1000, si_status=0, si_utime=4, si_stime=4} ---

The child now gracefully shuts down, as it nicely detects the socket was closed.
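
The last step in the trace, recvfrom() returning 0, is the usual way a server notices that the client has closed the connection. A minimal sketch of that check follows, assuming a POSIX system; `peer_closed` and `demo_peer_closed` are hypothetical names, and a local `socketpair()` stands in for the real client socket.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <sys/socket.h>
#include <unistd.h>

/* Returns 1 if the peer has closed the connection (recv() returned 0),
   0 if data arrived, -1 on error. */
int peer_closed(int fd)
{
    char buf[16];
    ssize_t n = recv(fd, buf, sizeof buf, 0);
    if (n == 0)
        return 1;
    return (n > 0) ? 0 : -1;
}

/* Demo: create a socket pair, then either close one end or write to it,
   and ask the other end whether its peer is gone. */
int demo_peer_closed(int close_peer)
{
    int fds[2];
    int rc = socketpair(AF_UNIX, SOCK_STREAM, 0, fds);
    assert(rc == 0);

    if (close_peer) {
        close(fds[1]);                      /* client closes its socket */
    } else {
        ssize_t w = send(fds[1], "hi", 2, 0);
        assert(w == 2);                     /* client is alive and talking */
    }

    int result = peer_closed(fds[0]);

    close(fds[0]);
    if (!close_peer)
        close(fds[1]);
    return result;
}
```

Here `demo_peer_closed(1)` should report a closed peer and `demo_peer_closed(0)` should not, which mirrors the child process seeing a zero-length read and then exiting with status 0.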
