
update mountPath in kube yaml #592

Open

wants to merge 1 commit into base: main
Conversation

@sallyom (Collaborator) commented Jan 15, 2025

I've found this change is necessary to produce a working kube.yaml for podman kube play ramalama_*.yaml.

Without the fix:

$ podman kube play ramalama_MicXluJjP3.yaml 
Pod:
starting container d5f9654b06b3a6fb209a3a9c38986e9c78313417c57a55435050b08d87ca0fab: not a directory

To test, this will serve granite-code at localhost:8888:

ramalama serve --generate kube granite-code
podman kube play ramalama_*.yaml --publish 8888:8080
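
For illustration, a minimal before/after sketch of the volumeMounts change (the paths are assumptions inferred from the dry-run output later in this thread, where the mount destination is /mnt/models and the model is loaded from /mnt/models/model.file; names in the generated YAML may differ):

# Before (hypothetical): the image volume resolves to a single model file,
# but mountPath points at a directory and adds a subPath, so podman kube
# play fails with "not a directory".
volumeMounts:
- mountPath: /mnt/models              # MNT_DIR
  subPath: models
  name: model
# After: mount directly at the file path; subPath is no longer needed.
volumeMounts:
- mountPath: /mnt/models/model.file   # MNT_FILE
  name: model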

Summary by Sourcery

Bug Fixes:

  • Fix issue where podman kube play would fail with a "not a directory" error when the mount path was set to a directory that didn't exist.

@sourcery-ai bot (Contributor) commented Jan 15, 2025

Reviewer's Guide by Sourcery

This pull request updates the mountPath in the Kubernetes YAML configuration to use MNT_FILE instead of MNT_DIR. This change ensures that podman kube play works correctly with the generated YAML.

No diagrams generated as the changes look simple and do not need a visual representation.

File-Level Changes

Change: Updated the mount path in the Kubernetes configuration.
Files: ramalama/kube.py, ramalama/quadlet.py
Details:
  • Changed mountPath from MNT_DIR to MNT_FILE in the volumeMounts section of the Kubernetes YAML.
  • Updated the gen_volume function to use MNT_FILE instead of MNT_DIR when generating the mount configuration for the AI image (see the sketch below).
  • Removed the subPath property from the volumeMounts section, as it is no longer needed with MNT_FILE.

Change: Removed unused import.
Files: ramalama/kube.py, ramalama/quadlet.py
Details:
  • Removed the now-unused MNT_DIR import from ramalama/common in both files.
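
A rough sketch of what the described kube.py change could look like; gen_volume and the MNT_FILE/MNT_DIR names come from the review above, but the body below is an assumption, not the actual implementation:

# Hypothetical sketch; the real gen_volume in ramalama/kube.py may differ.
from ramalama.common import MNT_FILE  # the unused MNT_DIR import is dropped

def gen_volume(volume_name):
    # Emit a volumeMounts entry that targets the model file directly
    # (MNT_FILE) rather than a directory (MNT_DIR) plus a subPath.
    return f"""    volumeMounts:
    - mountPath: {MNT_FILE}
      name: {volume_name}"""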


@sourcery-ai bot (Contributor) left a comment

Hey @sallyom - I've reviewed your changes and they look great!

Here's what I looked at during the review
  • 🟢 General issues: all looks good
  • 🟢 Security: all looks good
  • 🟢 Testing: all looks good
  • 🟢 Complexity: all looks good
  • 🟢 Documentation: all looks good


@ericcurtin (Collaborator) commented Jan 15, 2025

Thanks @sallyom, I've been meaning to check whether these commands:

$ ramalama --dryrun serve llama:3.1
podman run --rm -i --label RAMALAMA --security-opt=label=disable --name ramalama_7mb6H6CGJf --pull=newer -t -p 8080:8080 --device /dev/dri --mount=type=image,src=llama:3.1,destination=/mnt/models,ro,subpath=/models quay.io/ramalama/ramalama:latest llama-server --port 8080 -m /mnt/models/model.file -c 2048 --temp 0.8 -ngl 999 --host 0.0.0.0
$ ramalama --dryrun --runtime vllm serve llama:3.1
podman run --rm -i --label RAMALAMA --security-opt=label=disable --name ramalama_2gEHBiYV0T --pull=newer -t -p 8080:8080 --device /dev/dri --mount=type=image,src=llama:3.1,destination=/mnt/models,ro,subpath=/models quay.io/modh/vllm:rhoai-2.17-cuda --port 8080 --model /mnt/models/model.file --max_model_len 2048

still line up with our generator equivalents. Maybe they have drifted.
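
One hypothetical way to check for such drift is to compare the mount destination in the dry-run output against the mountPath in the generated kube YAML:

# Hypothetical drift check; the glob matches the generated kube YAML.
ramalama --dryrun serve granite-code | grep -o 'destination=[^,]*'
ramalama serve --generate kube granite-code
grep -E 'mountPath|subPath' ramalama_*.yaml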

@ericcurtin (Collaborator) commented Jan 15, 2025

@rhatdan PTAL. I think you know better than I do at this stage what the podman run, quadlet, and kube files are supposed to look like.

@rhatdan (Member) commented Jan 15, 2025

I think we have to be careful here. I want to try this out with Ollama models, HuggingFace models, OCI model-car models, and OCI raw models. I have a feeling we need to do this two ways: depending on the model, we mount at the file path or at the directory path. I cannot try it out right now since I am on a train.
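
To make the "two ways" idea concrete, here is a hypothetical sketch; the isfile test is just one possible way to dispatch, and nothing below is in this PR:

import os
from ramalama.common import MNT_DIR, MNT_FILE

def pick_mount_path(local_model_path):
    # Hypothetical: single-file models (e.g. one GGUF file) mount at the
    # file path, while directory-shaped models (e.g. an OCI model car that
    # carries extra data) mount at the directory path.
    return MNT_FILE if os.path.isfile(local_model_path) else MNT_DIR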

@sallyom (Collaborator, Author) commented Jan 15, 2025

Also, it might be better to use the actual model filename rather than model.file, and to add the ability to load multiple models and switch quickly between them. I figured we'd want to look at this PR more closely.

@rhatdan (Member) commented Jan 16, 2025

The problem is that without standardizing the OCI models, we don't know what the path is, which is why we want the models to always have a model.file within them. If we had standards, we could have an attribute on the OCI model that tells us what the path is. Certain models use additional data within the model directory to enhance the model.
