Produces a large number of zombie processes #177

Open
beardnick opened this issue Nov 20, 2024 · 0 comments

Environment

prove

prove --version
TAP::Harness v3.43 and Perl v5.34.0

nginx

nginx -V
nginx version: openresty/1.25.3.1
built by gcc 11.4.0 (Ubuntu 11.4.0-1ubuntu1~22.04) 
built with OpenSSL 3.2.0 23 Nov 2023
TLS SNI support enabled
configure arguments: --prefix=/usr/local/openresty/nginx --with-debug --with-cc-opt='-DNGX_LUA_USE_ASSERT -DNGX_LUA_ABORT_AT_PANIC -O2 -DAPISIX_RUNTIME_VER=1.2.0 -DNGX_GRPC_CLI_ENGINE_PATH=/usr/local/openresty/libgrpc_engine.so -DNGX_HTTP_GRPC_CLI_ENGINE_PATH=/usr/local/openresty/libgrpc_engine.so -DNGX_LUA_ABORT_AT_PANIC -I/usr/local/openresty/zlib/include -I/usr/local/openresty/pcre/include -I/usr/local/openresty/openssl3/include' --add-module=../ngx_devel_kit-0.3.3 --add-module=../echo-nginx-module-0.63 --add-module=../xss-nginx-module-0.06 --add-module=../ngx_coolkit-0.2 --add-module=../set-misc-nginx-module-0.33 --add-module=../form-input-nginx-module-0.12 --add-module=../encrypted-session-nginx-module-0.09 --add-module=../srcache-nginx-module-0.33 --add-module=../ngx_lua-0.10.26 --add-module=../ngx_lua_upstream-0.07 --add-module=../headers-more-nginx-module-0.37 --add-module=../array-var-nginx-module-0.06 --add-module=../memc-nginx-module-0.20 --add-module=../redis2-nginx-module-0.15 --add-module=../redis-nginx-module-0.3.9 --add-module=../ngx_stream_lua-0.0.14 --with-ld-opt='-Wl,-rpath,/usr/local/openresty/luajit/lib -Wl,-rpath,/usr/local/openresty/wasmtime-c-api/lib -L/usr/local/openresty/zlib/lib -L/usr/local/openresty/pcre/lib -L/usr/local/openresty/openssl3/lib -Wl,-rpath,/usr/local/openresty/zlib/lib:/usr/local/openresty/pcre/lib:/usr/local/openresty/openssl3/lib' --add-module=/tmp/tmp.8fm9BEJ9Sy/openresty-1.25.3.1/../mod_dubbo-1.0.2 --add-module=/tmp/tmp.8fm9BEJ9Sy/openresty-1.25.3.1/../ngx_multi_upstream_module-1.2.0 --add-module=/tmp/tmp.8fm9BEJ9Sy/openresty-1.25.3.1/../apisix-nginx-module-1.16.0 --add-module=/tmp/tmp.8fm9BEJ9Sy/openresty-1.25.3.1/../apisix-nginx-module-1.16.0/src/stream --add-module=/tmp/tmp.8fm9BEJ9Sy/openresty-1.25.3.1/../apisix-nginx-module-1.16.0/src/meta --add-module=/tmp/tmp.8fm9BEJ9Sy/openresty-1.25.3.1/../wasm-nginx-module-0.7.0 --add-module=/tmp/tmp.8fm9BEJ9Sy/openresty-1.25.3.1/../lua-var-nginx-module-v0.5.3 
--add-module=/tmp/tmp.8fm9BEJ9Sy/openresty-1.25.3.1/../grpc-client-nginx-module-v0.5.0 --add-module=/tmp/tmp.8fm9BEJ9Sy/openresty-1.25.3.1/../lua-resty-events-0.2.0 --with-poll_module --with-pcre-jit --with-stream --with-stream_ssl_module --with-stream_ssl_preread_module --with-http_v2_module --with-http_v3_module --without-mail_pop3_module --without-mail_imap_module --without-mail_smtp_module --with-http_stub_status_module --with-http_realip_module --with-http_addition_module --with-http_auth_request_module --with-http_secure_link_module --with-http_random_index_module --with-http_gzip_static_module --with-http_sub_module --with-http_dav_module --with-http_flv_module --with-http_mp4_module --with-http_gunzip_module --with-threads --with-compat --with-stream --without-pcre2 --with-http_ssl_module

apisix

apisix version
3.9.1

test-nginx: master branch

How to reproduce

Run the unit tests with prove. Many tests report errors such as "timeout when waiting for the process 78711 to exit".

prove -v -I ./test-nginx/lib -I./ t/plugin/openid-connect.t

ok 1 - t/plugin/openid-connect.t TEST 1: Sanity check with minimal valid configuration. - status code ok
ok 2 - t/plugin/openid-connect.t TEST 1: Sanity check with minimal valid configuration. - response_body - response is expected (repeated req 0, req 0)
ok 3 - t/plugin/openid-connect.t TEST 1: Sanity check with minimal valid configuration. - pattern "[error]" does not match a line in error.log (req 0)
t/plugin/openid-connect.t TEST 2: Missing `client_id`. - timeout when waiting for the process 78711 to exit at /workspace/test-nginx/lib/Test/Nginx/Util.pm line 681.
t/plugin/openid-connect.t TEST 2: Missing `client_id`. - WARNING: killing the child process 78711 with force... at /workspace/test-nginx/lib/Test/Nginx/Util.pm line 720.
ok 4 - t/plugin/openid-connect.t TEST 2: Missing `client_id`. - status code ok
ok 5 - t/plugin/openid-connect.t TEST 2: Missing `client_id`. - response_body - response is expected (repeated req 0, req 0)
ok 6 - t/plugin/openid-connect.t TEST 2: Missing `client_id`. - pattern "[error]" does not match a line in error.log (req 0)
t/plugin/openid-connect.t TEST 3: Wrong type for `client_id`. - timeout when waiting for the process 78899 to exit at /workspace/test-nginx/lib/Test/Nginx/Util.pm line 681.
t/plugin/openid-connect.t TEST 3: Wrong type for `client_id`. - WARNING: killing the child process 78899 with force... at /workspace/test-nginx/lib/Test/Nginx/Util.pm line 720.

Many defunct nginx processes are also left behind:

ps -ef | grep nginx
root         785       1  0 08:40 ?        00:00:00 [nginx] <defunct>
root        2248       1  0 08:41 ?        00:00:00 [nginx] <defunct>
root        2885       1  0 08:42 ?        00:00:00 [nginx] <defunct>
root        4446       1  0 08:43 ?        00:00:00 [nginx] <defunct>
root        5007       1  0 08:44 ?        00:00:00 [nginx] <defunct>
root       19585       1  0 09:00 ?        00:00:00 [nginx] <defunct>
root       19770       1  0 09:00 ?        00:00:00 [nginx] <defunct>
root       21483       1  0 09:02 ?        00:00:00 [nginx] <defunct>
root       25649       1  0 09:07 ?        00:00:00 [nginx] <defunct>
root       27841       1  0 09:09 ?        00:00:00 [nginx] <defunct>
root       27842       1  0 09:09 ?        00:00:00 [nginx] <defunct>
root       27843       1  0 09:09 ?        00:00:00 [nginx] <defunct>
root       27989       1  0 09:09 ?        00:00:00 [nginx] <defunct>
root       27990       1  0 09:09 ?        00:00:00 [nginx] <defunct>
root       27991       1  0 09:09 ?        00:00:00 [nginx] <defunct>
root       28104       1  0 09:09 ?        00:00:00 [nginx] <defunct>
root       28105       1  0 09:09 ?        00:00:00 [nginx] <defunct>
root       28106       1  0 09:09 ?        00:00:00 [nginx] <defunct>
root       28243       1  0 09:09 ?        00:00:00 [nginx] <defunct>
root       28244       1  0 09:09 ?        00:00:00 [nginx] <defunct>
root       28245       1  0 09:09 ?        00:00:00 [nginx] <defunct>
root       29939       1  0 09:11 ?        00:00:00 [nginx] <defunct>

Possible cause

After each unit test, prove kills the nginx process. However, nginx may exit before prove has waited on it, so the exited process lingers as a zombie. is_running only checks kill 0, which still succeeds for a zombie, so the process is treated as alive. The wait loop therefore keeps polling until the timeout expires, after which the process is killed by force (the WARNING lines in the log above).

    if (defined $pid) {
        if ($ENV{TEST_NGINX_FAST_SHUTDOWN}) {
            if ($Verbose) {
                warn "sending TERM signal to $pid";
            }
            kill(SIGTERM, $pid);
        } else {
            if ($Verbose) {
                warn "sending QUIT signal to $pid";
            }
            kill(SIGQUIT, $pid);
        }
    }
    if ($Verbose) {
        warn "waitpid timeout: ", timeout();
    }
    my $timeout_val = timeout();
    while ($timeout_val > 0 && is_running($pid)) {
        waitpid($pid, WNOHANG);
        sleep 0.05;
        $timeout_val -= 0.05;
    }
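
The behavior this analysis relies on can be checked in isolation. The following is a minimal sketch, not code from test-nginx: `pid_exists` and `zombie_demo` are hypothetical names introduced here for illustration. It forks a child that exits at once and then compares what `kill 0` reports before and after the parent reaps it. It assumes a Unix-like system where SIGCHLD is not ignored and where the PID is not reused between the two checks.

```perl
use strict;
use warnings;
use POSIX ();
use Time::HiRes qw(sleep);

# Same check as the original is_running(): signal 0 only tests
# whether the PID exists and may be signalled.
sub pid_exists {
    my ($pid) = @_;
    return kill(0, $pid) ? 1 : 0;
}

# Fork a child, let it exit, and record what kill(0, ...) reports
# before and after the parent reaps it.
sub zombie_demo {
    my $pid = fork();
    die "fork failed: $!" unless defined $pid;
    if ($pid == 0) {
        # Child: exit immediately, skipping END blocks and destructors.
        POSIX::_exit(0);
    }
    # Parent: give the child time to exit. Until it is reaped,
    # it remains in the process table as a zombie.
    sleep 0.2;
    my $before = pid_exists($pid);   # the zombie still occupies the PID
    my $reaped = waitpid($pid, 0);   # blocking wait collects the exit status
    my $after  = pid_exists($pid);   # the PID is released once reaped
    return ($before, $reaped == $pid ? 1 : 0, $after);
}

my ($before, $reaped, $after) = zombie_demo();
print "before reap: $before, reaped: $reaped, after reap: $after\n";
```

Under those assumptions, the first check should report the process as present even though it has already exited. Note that in the ps output above the defunct processes have PID 1 as their parent, so a waitpid call from the test harness may not be able to reap them at all.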

My workaround

I've modified the is_running function to recognize zombie processes, so the unit tests run faster and print far fewer error messages. However, this does not prevent the zombies themselves: a large number of them are still left behind.

The original function

sub is_running ($) {
    my $pid = shift;
    return kill 0, $pid;
}

The modified version

sub is_running ($) {
    my $pid = shift;
    return  (kill(0, $pid)) && (not is_defunct($pid));
}

sub is_defunct ($) {
    my $pid = shift;
    my $output = `ps -o stat= -p $pid`;
    chomp($output);
    return $output =~ /Z/;
}
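
As a possible variation on the workaround above, the zombie check could read the process state from /proc instead of spawning ps on every poll. This is only a sketch under the assumption of a Linux host with /proc mounted. `is_defunct_proc` and `state_from_stat_line` are hypothetical names, not part of test-nginx.

```perl
use strict;
use warnings;

# Hypothetical helper: extract the state field from the contents of
# /proc/<pid>/stat, whose layout is "pid (comm) state ...".
sub state_from_stat_line {
    my ($line) = @_;
    # comm may itself contain spaces or parentheses, so match greedily
    # up to the last ")" before reading the one-character state.
    return $line =~ /^\d+ \(.*\) (\S)/s ? $1 : undef;
}

# Hypothetical replacement for is_defunct(): returns 1 if the process
# is a zombie (state "Z"), 0 otherwise or if it no longer exists.
sub is_defunct_proc {
    my ($pid) = @_;
    open(my $fh, '<', "/proc/$pid/stat") or return 0;
    local $/;                        # slurp the whole file
    my $line = <$fh>;
    close($fh);
    return 0 unless defined $line;
    my $state = state_from_stat_line($line);
    return (defined $state && $state eq 'Z') ? 1 : 0;
}
```

The trade-off is portability: the ps-based version works on any system with a compatible ps, while this one avoids a fork and exec per check but only works where /proc is available.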