Add test for open-llama-3b-v2-f16 model through sharktank. #272

Merged
ScottTodd merged 12 commits into nod-ai:main from the sharktank-testing branch on Jun 28, 2024

Conversation


@ScottTodd ScottTodd commented Jun 26, 2024

Progress on nod-ai/sharktank#22

This adds one test for a llama model running through https://github.com/nod-ai/sharktank. That project is still getting set up, so new docs for this particular workflow are coming in at nod-ai/sharktank#69 and tests in that repo are in nod-ai/sharktank#70.

Specifically, this exercises:

  • sharktank/models/llama/llama.py (https://github.com/nod-ai/sharktank/blob/main/sharktank/sharktank/models/llama/llama.py)
  • sharktank/examples/export_paged_llm_v1.py (https://github.com/nod-ai/sharktank/blob/main/sharktank/sharktank/examples/export_paged_llm_v1.py) with batch sizes == [4]
  • The open-llama-3b-v2-f16.gguf file from https://huggingface.co/SlyEcho/open_llama_3b_v2_gguf
  • Compilation and crashless execution, not numerical correctness (yet); a rough sketch of this check follows the future-work list below

Ideas for future work:

  • Test cases for the same model/parameters
    • Other batch sizes
    • decode() as well as prefill()
    • Real inputs with expected outputs (decode() crashes on some faked inputs still 🤔)
    • Other flag combinations and target configurations (starting simple though)
  • Test cases for other models/parameters
    • 8b / 70b parameter models
    • Mistral, Mixtral, Gemma, etc.
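
To make the "compilation and crashless execution" check concrete, here is a minimal sketch of that flow, assuming an already-exported MLIR file and the GGUF weights. This is not the test suite's actual implementation: the function name, the CPU target choice, and the prefill_bs4 entry point name are illustrative assumptions.

```python
# Minimal sketch of the "compile and run without crashing" check, not the
# actual conftest.py logic. The target backend, entry point name, and paths
# are assumptions for illustration only.
import subprocess
from pathlib import Path


def check_compile_and_run(mlir_path: Path, gguf_path: Path, work_dir: Path) -> None:
    vmfb_path = work_dir / "open-llama-3b-v2-f16.vmfb"
    # Compile the MLIR exported by sharktank into an IREE module (CPU target assumed).
    subprocess.run(
        ["iree-compile", str(mlir_path),
         "--iree-hal-target-backends=llvm-cpu",
         "-o", str(vmfb_path)],
        check=True,
    )
    # Run the exported prefill entry point with the GGUF weights supplied as a
    # parameter file. Only a crash-free exit is checked; outputs are not compared.
    # (The real test also passes --input= flags for the prefill arguments; the
    # values are omitted from this sketch.)
    subprocess.run(
        ["iree-run-module",
         f"--module={vmfb_path}",
         f"--parameters=model={gguf_path}",
         "--device=local-task",
         "--function=prefill_bs4"],  # entry point name is an assumption
        check=True,
    )
```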

@ScottTodd ScottTodd requested a review from saienduri June 26, 2024 21:45
@saienduri
Contributor

Nice, is there a reason that sharktank needs its own config file? PyTorch models and sharktank have the same starting point of MLIR, so maybe just having one overarching models configuration could work.

@ScottTodd
Member Author

> Nice, is there a reason that sharktank needs its own config file? PyTorch models and sharktank have the same starting point of MLIR, so maybe just having one overarching models configuration could work.

I'm going back and forth on that, thanks for noticing too.

With multiple files we keep the test lists separate. The lists are short, but unqualified right now. I think I want them to be qualified for merging, so this:

"skip_compile_tests": [
"sdxl-scheduled-unet-3-tank",
"sdxl-vae-decode-tank",
"sdxl-prompt-encoder-tank"
],

would be

    "skip_compile_tests": [
      "pytorch/models/sdxl-scheduled-unet-3-tank",
      "pytorch/models/sdxl-vae-decode-tank",
      "pytorch/models/sdxl-prompt-encoder-tank"
    ],

then we'd also have new models:

    "skip_compile_tests": [
      "pytorch/models/sdxl-scheduled-unet-3-tank",
      "pytorch/models/sdxl-vae-decode-tank",
      "pytorch/models/sdxl-prompt-encoder-tank",
      "sharktank/llama/open-llama-3b-v2-f16",
    ],

Let me check if that works... the test names are still a bit awkwardly passed through pytest / conftest.py.

@ScottTodd
Member Author

ScottTodd commented Jun 26, 2024

I'm also debating naming/grouping

PR currently:

| test suite       | exists in iree_tests? | config name    |
| ---------------- | --------------------- | -------------- |
| onnx ops         | yes                   | onnx           |
| pytorch ops      | no                    | pytorch        |
| pytorch models   | yes                   | pytorch_models |
| sharktank models | yes (this PR)         | sharktank      |

One alternative (grouping ops across frameworks and models across frameworks):

| test suite       | exists in iree_tests? | config name |
| ---------------- | --------------------- | ----------- |
| onnx ops         | yes                   | ops         |
| pytorch ops      | no                    | ops         |
| pytorch models   | yes                   | models      |
| sharktank models | yes (this PR)         | models      |
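
For reference, under that alternative the mapping from test suite subfolders to config files would look roughly like this (directory names are taken from elsewhere in this thread; pytorch ops are omitted since they don't exist in iree_tests yet):

    iree_tests/
      onnx/node/        -> ops.json     (onnx ops)
      pytorch/models/   -> models.json  (pytorch models)
      sharktank/        -> models.json  (sharktank models)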

@ScottTodd
Member Author

... and deciding how to run the tests:

By directory, reusing configs:

pytest iree_tests/pytorch/models --config-files=models.json
pytest iree_tests/sharktank --config-files=models.json

By directory, separate configs:

pytest iree_tests/pytorch/models --config-files=pytorch_models.json
pytest iree_tests/sharktank --config-files=sharktank.json

If we ran pytest iree_tests/ --config-files=models.json, that would go down into onnx/node/.

Ah! Never mind -- we can run pytest dir1 dir2. Maybe not super convenient for local use, though - you'd need to know which configs map to which test suite subfolders.
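
For example, one combined local invocation could then look like this (hypothetical command, assuming both model suites share a single models.json):

pytest iree_tests/pytorch/models iree_tests/sharktank --config-files=models.json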

@saienduri
Contributor

> Let me check if that works... the test names are still a bit awkwardly passed through pytest / conftest.py.

Yup, it would be nice to have them qualified and merged.

I think you would just have to change it to check that the test_directory path relative to repo_root is in the config file, rather than just the test_directory name as it is now.
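
Something like this minimal sketch, perhaps (the variable names and config structure mirror this thread rather than the actual conftest.py):

```python
# Hypothetical sketch of the qualified-name check described above; names and
# config layout are assumptions based on this thread, not the real conftest.py.
from pathlib import Path


def is_in_skip_list(test_directory: Path, tests_root: Path, config: dict) -> bool:
    # Before: only the leaf directory name was compared, e.g. "sdxl-vae-decode-tank".
    # After: compare the path qualified relative to the iree_tests/ root, e.g.
    # "pytorch/models/sdxl-vae-decode-tank" or "sharktank/llama/open-llama-3b-v2-f16".
    qualified_name = test_directory.relative_to(tests_root).as_posix()
    return qualified_name in config.get("skip_compile_tests", [])
```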

@saienduri
Contributor

> ... and deciding how to run the tests:
>
> By directory, reusing configs:
>
> pytest iree_tests/pytorch/models --config-files=models.json
> pytest iree_tests/sharktank --config-files=models.json

Of the options, I think "by directory, reusing configs" looks like the cleanest.

@ScottTodd
Member Author

Sorry, building up a stack of merge conflicts with #271 here x_x

@ScottTodd ScottTodd marked this pull request as ready for review June 28, 2024 22:50
@saienduri saienduri left a comment
Contributor

LGTM.

@ScottTodd ScottTodd merged commit 3603a45 into nod-ai:main Jun 28, 2024
2 of 3 checks passed
@ScottTodd ScottTodd deleted the sharktank-testing branch June 28, 2024 23:24
ScottTodd added a commit to iree-org/iree that referenced this pull request Jul 1, 2024
Progress on nod-ai/sharktank#22. See
nod-ai/SHARK-TestSuite#272 for the specifics of
what the new test is exercising.

The "models" tests now include `pytorch/models/` and `sharktank/`, so
all test names are qualified relative to `iree_tests/` in the test suite
repo. (Totally inflating my commit stats here, sorry :P)

ci-exactly: build_packages,regression_test
renxida pushed a commit that referenced this pull request Jul 18, 2024
Progress on nod-ai/sharktank#22

This adds one test for a llama model running through
https://github.com/nod-ai/sharktank. That project is still getting set
up, so new docs for this particular workflow are coming in at
nod-ai/sharktank#69 and tests in that repo are
in nod-ai/sharktank#70.

Specifically, this exercises:

* [`sharktank/models/llama/llama.py`](https://github.com/nod-ai/sharktank/blob/main/sharktank/sharktank/models/llama/llama.py)
* [`sharktank/examples/export_paged_llm_v1.py`](https://github.com/nod-ai/sharktank/blob/main/sharktank/sharktank/examples/export_paged_llm_v1.py) with batch sizes == [4]
* The `open-llama-3b-v2-f16.gguf` file from https://huggingface.co/SlyEcho/open_llama_3b_v2_gguf
* Compilation and crashless execution, _not_ numerical correctness (yet)

Ideas for future work:

* Test cases for the same model/parameters
  * Other batch sizes
  * `decode()` as well as `prefill()`
  * Real inputs with expected outputs (`decode()` crashes on some faked inputs still 🤔)
  * Other flag combinations and target configurations (starting simple though)
* Test cases for other models/parameters
  * 8b / 70b parameter models
  * Mistral, Mixtral, Gemma, etc.
LLITCHEV pushed a commit to LLITCHEV/iree that referenced this pull request Jul 30, 2024
Progress on nod-ai/sharktank#22. See
nod-ai/SHARK-TestSuite#272 for the specifics of
what the new test is exercising.

The "models" tests now include `pytorch/models/` and `sharktank/`, so
all test names are qualified relative to `iree_tests/` in the test suite
repo. (Totally inflating my commit stats here, sorry :P)

ci-exactly: build_packages,regression_test
Signed-off-by: Lubo Litchev <[email protected]>