Pod ignores limits. #180

Open
valafon opened this issue Mar 2, 2023 · 2 comments
valafon commented Mar 2, 2023

Hello! I have launched the gpu-manager DaemonSet on a node. Then I started a pod on that node which requested tencent.com/vcuda-memory: 2. As I understand from the README, one vcuda-memory unit equals 256 MiB, so I expected the process inside the container to be limited to 512 MiB. However, it uses 1500 MiB, as if no limit were applied at all. I thought I might need to use https://github.com/tkestack/vcuda-controller in some way, but when I patched thomassong/gpu-manager:1.1.4 with the vcuda-controller ./build-img.sh, the resulting image just exits with code 0 when I try to run it. I really don't understand how to use this whole thing.
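For reference, this is roughly how the pod requests the resources (a minimal sketch based on the gpu-manager README; the pod name, image name, and vcuda-core value are placeholders, and vcuda-core is included only because the README appears to expect both resources to be set together):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: vcuda-test                      # placeholder name
spec:
  containers:
    - name: cuda-app
      image: my-cuda-app:latest         # placeholder image
      resources:
        requests:
          tencent.com/vcuda-core: 10    # placeholder core share
          tencent.com/vcuda-memory: 2   # 2 x 256 MiB = 512 MiB expected cap
        limits:
          tencent.com/vcuda-core: 10
          tencent.com/vcuda-memory: 2
```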

I have been searching for a long time for a proper Kubernetes solution that would make it possible to limit GPU cores and GPU memory in the same way as CPU and host memory. On paper, this project looks exactly like what I have been looking for; unfortunately, I can't get it to work. If somebody could help me, and perhaps has the patience to contact me personally, I would be grateful.

@DennisYoung96

In my case, if a process exceeds the memory limit it gets a CUDA out-of-memory error, but the vcuda-core limit is not enforced.

@yangcheng-dev

I have encountered the same issue as you. Can I leave a contact method?
