Delay in /tmp/pid File Creation Causes unhealthy Status on Rocky Linux 8.6 #7876

Open
sinbumu opened this issue Nov 13, 2024 · 1 comment

sinbumu commented Nov 13, 2024

I'm experiencing an issue with running Besu in a Docker/Podman container on Rocky Linux 8.6. The container stays in an unhealthy state for an extended period (approximately 9-10 minutes) before eventually switching to healthy. The delay appears to be related to the /tmp/pid file, which is not created immediately upon container startup, causing the health check to fail repeatedly.

Environment

  • OS: Rocky Linux 8.6
  • Container Engine: Podman (or Docker, also tested)
  • Docker version: 27.2.0
  • Besu version: hyperledger/besu:latest
  • System Specs: 4 vCPUs, 16 GB RAM

Steps to Reproduce

  1. Launch a Rocky Linux 8.6 instance.
  2. Install Podman/Docker.
  3. Pull the latest Besu Docker image.
  4. Run the following command to start Besu:
sudo podman run -d --name besu_node \
  -e BESU_LOGGING=TRACE \
  -p 8545:8545 -p 8546:8546 -p 30303:30303 \
  hyperledger/besu:latest \
  --rpc-http-enabled --rpc-http-host=0.0.0.0 --host-allowlist="*"
  5. Check the container health status (podman ps or docker ps) and observe that it remains in the unhealthy state.
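
To put a number on the delay instead of watching `podman ps` by hand, the last step can be scripted. The following is a minimal sketch, not part of the original report: `wait_for_status` is a hypothetical helper that polls any status-printing command and reports how many seconds passed before the wanted status appeared. The `{{.State.Health.Status}}` format string in the usage comment is the Docker one; the assumption that Podman exposes the same field may not hold on older Podman releases, where the health state may appear under a different field name.

```shell
#!/usr/bin/env bash
# Poll a status-producing command until it prints the wanted status.
# Prints the elapsed seconds on success; returns 1 on timeout.
# Usage: wait_for_status WANTED TIMEOUT_SECONDS INTERVAL_SECONDS CMD [ARGS...]
wait_for_status() {
  local wanted="$1" timeout="$2" interval="$3"
  shift 3
  local start now status
  start=$(date +%s)
  while true; do
    # A failing command (e.g. container not created yet) counts as "no status".
    status=$("$@" 2>/dev/null) || status=""
    now=$(date +%s)
    if [ "$status" = "$wanted" ]; then
      echo $(( now - start ))
      return 0
    fi
    if [ $(( now - start )) -ge "$timeout" ]; then
      return 1
    fi
    sleep "$interval"
  done
}

# Example (container name taken from the report; format string is an assumption):
#   wait_for_status healthy 900 5 \
#     sudo docker inspect --format '{{.State.Health.Status}}' besu_node
```

Running the same measurement on the Rocky Linux and Ubuntu hosts would give comparable figures for the two environments.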

Observed Behavior

The container stays in an unhealthy state for about 9-10 minutes, then transitions to healthy. During this time, the health check repeatedly fails with exitCode=1, which seems to be related to the /tmp/pid file not being available immediately.

Expected Behavior

The Besu container should create the /tmp/pid file promptly upon startup so that the health check can succeed, or Besu should provide an alternative health check method that accurately reflects the container's readiness state.

Logs

Here are relevant sections of the Docker logs showing repeated health check failures:

time="2024-11-12T06:58:15.854925216Z" level=debug msg="Health check for container done (exitCode=1)"
...
time="2024-11-12T06:58:20.905606739Z" level=debug msg="Health check for container done (exitCode=1)"

Additional Information

  • Disk Performance: Disk write speed is around 140 MB/s, so storage performance does not seem to be the issue.
  • Attempts with Alternative Configurations: Tried increasing health-interval and StartPeriod, but the delay in healthy state persists.
  • Other OS: The same configuration on Ubuntu does not exhibit this delay, so this issue might be specific to Rocky Linux or RHEL-based systems.
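
For reference, the effect of the health-check timing options mentioned above can be estimated with simple arithmetic. This is a rough sketch under the usual Docker semantics (failed probes during the start period are not counted, after which a set number of consecutive failures marks the container unhealthy); `unhealthy_after_seconds` is a hypothetical helper, and the flag values in the comment are illustrative, not the ones used in the report. Raising these values only changes how long the container is shown as starting rather than unhealthy. It does not make the /tmp/pid file appear sooner, which matches the observation that the delay persists.

```shell
#!/usr/bin/env bash
# Rough estimate (seconds) of the earliest point at which the engine can
# report "unhealthy": failures inside the start period are ignored, then
# `retries` consecutive failed probes spaced `interval` apart are needed.
# Probe timeouts and scheduling jitter are ignored.
unhealthy_after_seconds() {
  local start_period="$1" interval="$2" retries="$3"
  echo $(( start_period + interval * retries ))
}

# Example: tolerate roughly the 9-10 minute startup seen in the report.
#   unhealthy_after_seconds 600 5 10
# Corresponding run flags (illustrative values, available in docker and podman):
#   sudo podman run -d --name besu_node \
#     --health-start-period=600s --health-interval=5s --health-retries=10 \
#     -p 8545:8545 hyperledger/besu:latest \
#     --rpc-http-enabled --rpc-http-host=0.0.0.0 --host-allowlist="*"
```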

Questions

  • Could there be an OS compatibility issue with Rocky Linux, or specific configurations we should adjust?

sinbumu added the bug label Nov 13, 2024

joshuafernandes (Contributor) commented Nov 27, 2024

Hi @sinbumu, there are a few things at play here in your setup, and anything we said about them would be speculation at best. That said, 10 minutes is definitely long; we don't see this behaviour ourselves, so we can't reproduce it.
The health check is set at https://github.com/hyperledger/besu/blob/main/docker/Dockerfile#L59, so you could try switching to something like a curl readiness check, which should tell you whether the cause is the OS or something else.
cc: @siladu
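
A curl-based check along the lines suggested above could look like the sketch below. It assumes Besu serves `/liveness` and `/readiness` on the HTTP JSON-RPC port (8545 in the report's command), as described in the Besu documentation; the helper names are hypothetical. The check is written to run from the host, because whether curl is available inside the `hyperledger/besu` image (and so usable as a `--health-cmd`) has not been verified here.

```shell
#!/usr/bin/env bash
# Build the URL of a Besu health endpoint ("liveness" or "readiness").
besu_endpoint_url() {
  local base="$1" name="$2"
  echo "${base%/}/${name}"
}

# Succeed only if the endpoint answers without an HTTP error
# (status below 400) within 2 seconds.
besu_endpoint_ok() {
  local base="$1" name="$2"
  curl --silent --fail --output /dev/null --max-time 2 \
    "$(besu_endpoint_url "$base" "$name")"
}

# Example, run on the host against the port published in the report:
#   besu_endpoint_ok http://localhost:8545 liveness  && echo "RPC is up"
#   besu_endpoint_ok http://localhost:8545 readiness && echo "node is ready"
```

If the liveness endpoint responds within seconds while the container is still reported as unhealthy, the delay lies in the creation of the /tmp/pid file rather than in Besu's own startup. Readiness may stay negative on a fresh node for unrelated reasons such as peer count or sync state, so liveness is likely the closer comparison.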

macfarla removed the bug label Nov 27, 2024