Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

chore: Compare image memory limit to sum of shmem and main mem #1770

Open
wants to merge 20 commits into
base: main
Choose a base branch
from

Conversation

fregataa
Copy link
Member

@fregataa fregataa commented Dec 8, 2023

resolves #1726

This makes the behavior of the backend's interpretation of "mem" ($M$) and "shmem" ($S$) values consistent with the Web UI, which passes them as $(M-S, S)$ instead of $(M, S)$.

  • Compare ai.backend.resource.min.mem with the sum of main memory size and shared memory size instead of main memory size.
  • Configurable ratio between shared memory and memory.

Checklist: (if applicable)

  • Milestone metadata specifying the target backport version
  • Mention to the original issue
  • API server-client counterparts (e.g., manager API -> client SDK)

@fregataa fregataa self-assigned this Dec 8, 2023
@github-actions github-actions bot added comp:manager Related to Manager component effort:easy Need to understand only a specific region of codes (good first issue, easy). type:enhance Enhance component, behavior, internals without user-facing features urgency:4 As soon as feasible, implementation is essential. size:M 30~100 LoC labels Dec 8, 2023
@fregataa fregataa added this to the 24.03 milestone Dec 8, 2023
@achimnol achimnol force-pushed the feature/better-shared-memory-handling branch from 17a1621 to 3d6ae68 Compare March 22, 2024 22:55
@achimnol
Copy link
Member

achimnol commented Mar 26, 2024

I'm going to hold this PR because the calculation method change will break existing API users (now we have serious ones) and complicates which values to store in the scheduler.

EDIT:
We decided to compare the image_min_slots["mem"] and requested_slots["mem"] directly (without adding shmem value to the former one), but add a check to prevent making shmem too large.

@achimnol achimnol added the action:on hold Hold it. Wait for the restart. label Mar 26, 2024
@achimnol achimnol marked this pull request as draft March 26, 2024 09:34
@github-actions github-actions bot added the require:db-migration Automatically set when alembic migrations are added or updated label Mar 28, 2024
@fregataa fregataa requested a review from yomybaby March 28, 2024 05:22
@fregataa fregataa marked this pull request as ready for review March 28, 2024 05:22
@achimnol achimnol removed the action:on hold Hold it. Wait for the restart. label Apr 8, 2024
src/ai/backend/manager/registry.py Outdated Show resolved Hide resolved
@fregataa fregataa requested a review from achimnol April 8, 2024 08:58
src/ai/backend/manager/defs.py Outdated Show resolved Hide resolved
@github-actions github-actions bot added the comp:common Related to Common component label Jun 20, 2024
@fregataa fregataa requested a review from achimnol June 20, 2024 09:02
@fregataa fregataa modified the milestones: 24.03, 24.09 Jun 20, 2024
@fregataa fregataa changed the title feature: Compare image memory limit to sum of shmem and main mem feature: Configurable ratio between shared memory and memory Jun 20, 2024
@github-actions github-actions bot modified the milestones: 24.09, 24.03 Jun 20, 2024
@fregataa fregataa changed the title feature: Configurable ratio between shared memory and memory feature: Compare image memory limit to sum of shmem and main mem Jun 20, 2024
@fregataa fregataa added the action:on hold Hold it. Wait for the restart. label Jul 6, 2024
@fregataa fregataa removed the action:on hold Hold it. Wait for the restart. label Oct 24, 2024
@fregataa fregataa changed the title feat: Compare image memory limit to sum of shmem and main mem chore: Compare image memory limit to sum of shmem and main mem Oct 24, 2024
@achimnol achimnol modified the milestones: 24.03, 24.12 Oct 24, 2024
@fregataa fregataa force-pushed the feature/better-shared-memory-handling branch from 79a5c45 to 245de35 Compare October 24, 2024 16:10
@fregataa fregataa changed the base branch from main to fix_handle_wrong_value_of_shmem October 24, 2024 16:10
Copy link
Member Author

fregataa commented Oct 24, 2024

This stack of pull requests is managed by Graphite. Learn more about stacking.

Join @fregataa and the rest of your teammates on Graphite Graphite

Comment on lines +1218 to +1221
raise InvalidAPIParameters(
f"Too large shared memory. Maximum ratio of 'shared memory / memory' is {str(allowed_max_shmem_ratio)}. "
f"(s:{str(shmem)}, m:{str(BinarySize(requested_slots["mem"]))}"
)
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The error message string is missing a closing parenthesis. The line should be:

f"(s:{str(shmem)}, m:{str(BinarySize(requested_slots['mem']))})"

Spotted by Graphite Reviewer

Is this helpful? React 👍 or 👎 to let us know.

Base automatically changed from fix_handle_wrong_value_of_shmem to main October 25, 2024 01:13
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
comp:common Related to Common component comp:manager Related to Manager component effort:easy Need to understand only a specific region of codes (good first issue, easy). size:M 30~100 LoC type:enhance Enhance component, behavior, internals without user-facing features urgency:4 As soon as feasible, implementation is essential.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Make the shared memory setting for new sessions more intuitive
3 participants