Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Update start-odk.sh memory calculation to support cgroups v2 #876

Open
wants to merge 2 commits into
base: next
Choose a base branch
from

Conversation

spwoodcock
Copy link
Contributor

@spwoodcock spwoodcock commented Jan 16, 2025

Closes #875

Included in this PR

  • The original check for cgroups v1.
  • A check for cgroups v2.
    • If there are memory constraints set, then they are used.
    • If there are no memory constraints, the value will be set to max and we fallback to use the total memory available on the system.
  • Note that it's possible to determine hard and soft memory limits via cgroups, but this is probably overkill.

What has been done to verify that this works as intended?

  • Tested on my local Debian Bookworm setup (using WSL, so it's cgroups v1).
  • Tested on a Debian Bookworm remote server, using cgroups v2.

Why is this the best possible solution? Were any other approaches considered?

  • It seems to be a pretty common way to determine memory available inside a container.
  • Containers are essentially wrappers for standard tools such as cgroups, so the total memory available in the container is determined by a cgroup.
  • It doesn't require any additional tools / only uses standard Linux stuff.
  • This is backwards compatible with cgroups v1 and should work for the foreseeable future (unlikely cgroups will change behaviour for a long time now).

How does this change affect users? Describe intentional changes to behavior and behavior that could have accidentally been affected by code changes. In other words, what are the regression risks?

  • Users that deployed since the commit hotosm@7476a91, and using cgroups v2 in their deployment infrastructure, may have an incorrectly determined number of workers!
  • Likely they may be running with a single worker.
  • This would increase the number of workers, depending on the memory available.

Does this change require updates to documentation? If so, please file an issue here and include the link below.

  • Probably not.

Before submitting this PR, please make sure you have:

  • branched off and targeted the next branch OR only changed documentation/infrastructure (master is stable and used in production)
  • verified that any code or assets from external sources are properly credited in comments or that everything is internally sourced

Copy link
Member

@lognaturel lognaturel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks! This looks good to me but tests are failing in CircleCI. Is it possible that you branched off of a very old version? Could you please rebase on the latest next so we can get a CircleCI sanity check before merging?

@spwoodcock
Copy link
Contributor Author

Thanks! This looks good to me but tests are failing in CircleCI. Is it possible that you branched off of a very old version? Could you please rebase on the latest next so we can get a CircleCI sanity check before merging?

Ah possibly! I'm afk for the weekend now, but will address this on Monday 😄

@spwoodcock
Copy link
Contributor Author

You were right, I had synced master but not next on our fork - rebased to the latest next branch!

Hopefully the CircleCI build can be re-triggered without much hassle 🙏

@spwoodcock
Copy link
Contributor Author

Made a small fix for shellcheck linting:

In files/service/scripts/start-odk.sh line 76:
export WORKER_COUNT=$(determine_worker_count "$MEMTOT")
       ^----------^ SC2155: Declare and assign separately to avoid masking return values.

For more information:
  https://www.shellcheck.net/wiki/SC2155 -- Declare and assign separately to ...

@spwoodcock spwoodcock requested a review from lognaturel January 20, 2025 13:20
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

start-odk.sh memory calculation does not support cgroups v2
2 participants