Grab accelerator set off the end of the list instead of by index #4506

Merged: 1 commit into master, Jun 26, 2023

adamnovak
Member

We were using a magic `[3]` to get the accelerator set. Then I added a new resource pool in #4461, which changed where the accelerator set sits in the list, and I didn't manually test GPU support.

This now uses a slightly less magic `[-1]` and should keep working as long as accelerators are the last resource type.
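
To illustrate why the hard-coded index broke, here is a minimal sketch. It is not the actual Toil code: the function names, the list layouts, and the position of the new pool are all assumptions made for illustration. It only shows that a fixed index silently picks up the wrong entry once another resource is added ahead of the accelerators, while `[-1]` keeps working as long as accelerators stay last.

```python
from typing import Any, List


def accelerators_by_index(acquired: List[Any]) -> Any:
    """Old approach: assumes the accelerator set is always at position 3."""
    return acquired[3]


def accelerators_from_end(acquired: List[Any]) -> Any:
    """New approach: assumes the accelerator set is the last resource type."""
    return acquired[-1]


# Hypothetical layout before the new pool: cores, memory, disk, accelerators.
old_layout = [1.0, 2**30, 2**30, {0}]

# Hypothetical layout after a new pool is added ahead of the accelerators.
new_layout = [1.0, 2**30, 2**30, "new-pool-value", {0}]

print(accelerators_by_index(old_layout))   # the accelerator set
print(accelerators_by_index(new_layout))   # the new pool's value, not accelerators
print(accelerators_from_end(new_layout))   # the accelerator set again
```

The `[-1]` version still relies on an ordering convention, so it would break the same way if a pool were ever appended after the accelerators.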

This will fix #4503.

Changelog Entry

To be copied to the draft changelog by merger:

  • Single machine batch system now works with GPUs again

Reviewer Checklist

  • Make sure it is coming from issues/XXXX-fix-the-thing in the Toil repo, or from an external repo.
    • If it is coming from an external repo, make sure to pull it in for CI with:
      contrib/admin/test-pr otheruser theirbranchname issues/XXXX-fix-the-thing
      
    • If there is no associated issue, create one.
  • Read through the code changes. Make sure that it doesn't have:
    • Addition of trailing whitespace.
    • New variable or member names in camelCase that want to be in snake_case.
    • New functions without type hints.
    • New functions or classes without informative docstrings.
    • Changes to semantics not reflected in the relevant docstrings.
    • New or changed command line options for Toil workflows that are not reflected in docs/running/{cliOptions,cwl,wdl}.rst
    • New features without tests.
  • Comment on the lines of code where problems exist with a review comment. You can shift-click the line numbers in the diff to select multiple lines.
  • Finish the review with an overall description of your opinion.

Merger Checklist

  • Make sure the PR passes tests.
  • Make sure the PR has been reviewed since its last modification. If not, review it.
  • Merge with the GitHub "Squash and merge" feature.
    • If there are multiple authors' commits, add Co-authored-by to give credit to all contributing authors.
  • Copy its recommended changelog entry to the Draft Changelog.
  • Append the issue number in parentheses to the changelog entry.

@adamnovak
Member Author

I should manually test this to make sure it actually solves the problem.

@adamnovak
Member Author

OK I manually tested this and it does in fact work great.

@DailyDreaming I'm going to merge this without waiting for a review, since it's just a couple characters and it fixes a user issue.

My test:

cat >gpu_test.wdl <<'EOF'
version 1.0
workflow GPUTest {
    input {
    }
    call testGPU {
    }
    output {
        File result = testGPU.result
    }
}

task testGPU {
    input {
    }
    command <<<
        nvidia-smi
    >>>
    output {
        File result = stdout()
    }
    runtime {
        memory: "1 GB"
        cpu: 1
        gpuType: "nvidia-tesla-t4"
        gpuCount: 1
        disks: "local-disk 1 SSD"
    }
}
EOF

echo "{}" >empty.json

toil-wdl-runner --logDebug --retryCount=0 gpu_test.wdl empty.json -o test-out

@adamnovak adamnovak merged commit 45bc3f4 into master Jun 26, 2023
Successfully merging this pull request may close these issues.

GPU support broken on SingleMachine?