
Commit b46e0ae (1 parent: d38b8e3)

[None][test] update nim and full test list (#7468)

Signed-off-by: Ivy Zhang <[email protected]>

File tree: 5 files changed, +136 −71 lines

tests/integration/test_lists/qa/README.md

9 additions, 3 deletions
@@ -47,12 +47,12 @@ pip3 install -r ${TensorRT-LLM_PATH}/requirements-dev.txt
 This directory contains various test configuration files:

 ### Functional Test Lists
-- `llm_function_full.txt` - Primary test list for single node multi-GPU scenarios (all new test cases should be added here)
-- `llm_function_sanity.txt` - Subset of examples for quick torch flow validation
+- `llm_function_core.txt` - Primary test list for single node multi-GPU scenarios (all new test cases should be added here)
+- `llm_function_core_sanity.txt` - Subset of examples for quick torch flow validation
 - `llm_function_nim.txt` - NIM-specific functional test cases
 - `llm_function_multinode.txt` - Multi-node functional test cases
 - `llm_function_gb20x.txt` - GB20X release test cases
-- `llm_function_rtx6kd.txt` - RTX 6000 series specific tests
+- `llm_function_rtx6k.txt` - RTX 6000 series specific tests
 - `llm_function_l20.txt` - L20 specific tests, only contains single gpu cases

 ### Performance Test Files
@@ -76,6 +76,12 @@ QA tests are executed on a regular schedule:

 - **Weekly**: Automated regression testing
 - **Release**: Comprehensive validation before each release
+- **Full Cycle Testing**: run all GPUs with `llm_function_core.txt`, plus NIM-specific GPUs with `llm_function_nim.txt`
+- **Sanity Cycle Testing**: run all GPUs with `llm_function_core_sanity.txt`
+- **NIM Cycle Testing**: run all GPUs with `llm_function_core_sanity.txt`, plus NIM-specific GPUs with `llm_function_nim.txt`
 - **On-demand**: Manual execution for specific validation needs

 ## Running Tests
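As the diff above shows, these list files hold one pytest node ID per line, optionally suffixed with a `TIMEOUT (<minutes>)` marker (e.g. `TIMEOUT (90)`). A minimal sketch of a parser for that format follows; it is an illustration of the line format only, not the repository's actual test-list loader.

```python
import re

# One pytest node ID per line, optionally suffixed with "TIMEOUT (<minutes>)".
# Sketch of the format seen in this commit, not TensorRT-LLM's real loader.
TIMEOUT_RE = re.compile(r"^(?P<node_id>\S+)(?:\s+TIMEOUT\s+\((?P<minutes>\d+)\))?\s*$")

def parse_test_list(lines):
    """Yield (node_id, timeout_minutes_or_None) for each non-empty line."""
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blanks and comments
        m = TIMEOUT_RE.match(line)
        if m:
            minutes = m.group("minutes")
            yield m.group("node_id"), int(minutes) if minutes else None

entries = list(parse_test_list([
    "examples/test_exaone.py::test_llm_exaone_2gpu[exaone_3.0_7.8b_instruct-float16-nb:1] TIMEOUT (90)",
    "examples/test_gpt.py::test_llm_gpt2_medium_1node_4gpus[tp1pp4]",
]))
```

The first entry carries a 90-minute timeout; the second has none.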

tests/integration/test_lists/qa/llm_function_full.txt renamed to tests/integration/test_lists/qa/llm_function_core.txt

4 additions, 59 deletions
@@ -35,12 +35,9 @@ examples/test_exaone.py::test_llm_exaone_1gpu[disable_weight_only-exaone_3.0_7.8
 examples/test_exaone.py::test_llm_exaone_1gpu[enable_weight_only-exaone_deep_2.4b-float16-nb:1] TIMEOUT (90)
 examples/test_exaone.py::test_llm_exaone_2gpu[exaone_3.0_7.8b_instruct-float16-nb:1] TIMEOUT (90)
 examples/test_gemma.py::test_llm_gemma_1gpu_summary[gemma-2-27b-it-other-bfloat16-8]
-examples/test_gemma.py::test_llm_gemma_1gpu_summary_vswa[gemma-3-1b-it-other-bfloat16-8]
 examples/test_gemma.py::test_llm_hf_gemma_quantization_1gpu[gemma-2-27b-it-fp8-bfloat16-8]
-examples/test_gemma.py::test_llm_hf_gemma_quantization_1gpu_vswa[gemma-3-1b-it-fp8-bfloat16-8]
 examples/test_gemma.py::test_hf_gemma_fp8_base_bf16_multi_lora[gemma-2-9b-it]
 examples/test_gemma.py::test_hf_gemma_fp8_base_bf16_multi_lora[gemma-2-27b-it]
-examples/test_gemma.py::test_hf_gemma_fp8_base_bf16_multi_lora[gemma-3-1b-it]
 examples/test_gpt.py::test_llm_gpt2_medium_1gpu[non_streaming-use_py_session-disable_gemm_plugin]
 examples/test_gpt.py::test_llm_gpt2_medium_1gpu[streaming-use_cpp_session-enable_gemm_plugin]
 examples/test_gpt.py::test_llm_gpt2_medium_1node_4gpus[tp1pp4]
@@ -52,31 +49,11 @@ examples/test_gpt.py::test_llm_gpt2_multi_lora_1gpu[900_stories]
 examples/test_gpt.py::test_llm_gpt2_next_prompt_tuning[use_cpp_session-tp1]
 examples/test_gpt.py::test_llm_gpt2_parallel_embedding_2gpu[float16-1]
 examples/test_gpt.py::test_llm_gpt2_parallel_embedding_2gpu[float16-0]
-examples/test_gpt.py::test_llm_gpt2_santacoder_1node_4gpus[parallel_build-enable_fmha-enable_gemm_plugin-enable_attention_plugin]
-examples/test_gpt.py::test_llm_gpt2_starcoder_1node_4gpus[starcoder-enable_fmha-enable_gemm_plugin-enable_attention_plugin]
-examples/test_gpt.py::test_llm_gpt2_starcoder_1node_4gpus[starcoder2-disable_fmha-enable_gemm_plugin-enable_attention_plugin]
-examples/test_gpt.py::test_llm_gpt2_starcoder_1node_4gpus[starcoderplus-enable_fmha-enable_gemm_plugin-enable_attention_plugin]
-examples/test_gpt.py::test_llm_gpt2_starcoder_weight_only[starcoder-int4-float16]
-examples/test_gpt.py::test_llm_gpt2_starcoder_weight_only[starcoder-int8-float16]
-examples/test_gpt.py::test_llm_gpt2_starcoder_weight_only[starcoder2-int4-float16]
-examples/test_gpt.py::test_llm_gpt2_starcoder_weight_only[starcoder2-int8-float16]
-examples/test_gpt.py::test_llm_gpt2_starcoder_weight_only[starcoderplus-int4-float16]
-examples/test_gpt.py::test_llm_gpt2_starcoder_weight_only[starcoderplus-int8-float16]
-examples/test_gpt.py::test_llm_gpt3_175b_1node_8gpus[parallel_build-enable_fmha-enable_gemm_plugin-enable_attention_plugin] TIMEOUT (90)
-examples/test_gpt.py::test_llm_gpt_starcoder_lora_1gpu[peft-lora-starcoder2-15b-unity-copilot-starcoder2-lora_fp16-base_fp16]
-examples/test_gpt.py::test_llm_gpt_starcoder_lora_1gpu[peft-lora-starcoder2-15b-unity-copilot-starcoder2-lora_fp16-base_fp8]
-examples/test_gpt.py::test_llm_minitron_fp8_with_pseudo_loras[4b]
-examples/test_gpt.py::test_starcoder_fp8_quantization_2gpu[starcoder]
-examples/test_gpt.py::test_starcoder_fp8_quantization_2gpu[starcoderplus]
-examples/test_gpt.py::test_starcoder_fp8_quantization_2gpu[starcoder2]
-examples/test_llama.py::test_mistral_nemo_fp8_with_bf16_lora[Mistral-Nemo-12b-Base]
-examples/test_mistral.py::test_mistral_nemo_minitron_fp8_with_bf16_lora[Mistral-NeMo-Minitron-8B-Instruct]
 examples/test_phi.py::test_phi_fp8_with_bf16_lora[phi-2]
 examples/test_phi.py::test_phi_fp8_with_bf16_lora[Phi-3-mini-128k-instruct]
 examples/test_phi.py::test_phi_fp8_with_bf16_lora[Phi-3-small-128k-instruct]
 examples/test_phi.py::test_phi_fp8_with_bf16_lora[Phi-3.5-mini-instruct]
 examples/test_phi.py::test_phi_fp8_with_bf16_lora[Phi-3.5-MoE-instruct]
-examples/test_phi.py::test_phi_fp8_with_bf16_lora[Phi-4-mini-instruct]
 examples/test_gpt.py::test_streaming_beam[batch_size_1-disable_return_all_generated_tokens-num_beams_1]
 examples/test_gpt.py::test_streaming_beam[batch_size_1-disable_return_all_generated_tokens-num_beams_4]
 examples/test_gpt.py::test_streaming_beam[batch_size_1-return_all_generated_tokens-num_beams_1]
@@ -160,8 +137,6 @@ examples/test_medusa.py::test_llm_medusa_1gpu[use_cpp_session-medusa-vicuna-7b-v
 examples/test_medusa.py::test_llm_medusa_1gpu[use_cpp_session-medusa-vicuna-7b-v1.3-4-heads-bfloat16-bs8]
 examples/test_medusa.py::test_llm_medusa_1gpu[use_py_session-medusa-vicuna-7b-v1.3-4-heads-bfloat16-bs1]
 examples/test_medusa.py::test_llm_medusa_1gpu[use_py_session-medusa-vicuna-7b-v1.3-4-heads-bfloat16-bs8]
-examples/test_mistral.py::test_llm_mistral_lora_1gpu[komt-mistral-7b-v1-lora-komt-mistral-7b-v1]
-examples/test_mistral.py::test_llm_mistral_v1_1gpu[mistral-7b-v0.1-float16-max_attention_window_size_4096-summarization_long]
 examples/test_mixtral.py::test_llm_mixtral_moe_plugin_fp8_lora_4gpus[Mixtral-8x7B-v0.1-chinese-mixtral-lora]
 examples/test_mixtral.py::test_llm_mixtral_moe_plugin_lora_4gpus[Mixtral-8x7B-v0.1-chinese-mixtral-lora]
 examples/test_mixtral.py::test_llm_mixtral_int4_awq_1gpu_summary[mixtral-8x7b-v0.1-AWQ]
@@ -178,13 +153,8 @@ examples/test_multimodal.py::test_llm_multimodal_general[kosmos-2-pp:1-tp:1-floa
 examples/test_multimodal.py::test_llm_multimodal_general[kosmos-2-pp:1-tp:1-float16-bs:8-cpp_e2e:False-nb:1]
 examples/test_multimodal.py::test_llm_multimodal_general[llava-1.5-7b-hf-pp:1-tp:1-float16-bs:1-cpp_e2e:False-nb:1]
 examples/test_multimodal.py::test_llm_multimodal_general[llava-1.5-7b-hf-pp:1-tp:1-float16-bs:8-cpp_e2e:False-nb:1]
-examples/test_multimodal.py::test_llm_multimodal_general[llava-v1.6-mistral-7b-hf-pp:1-tp:1-float16-bs:1-cpp_e2e:False-nb:1]
-examples/test_multimodal.py::test_llm_multimodal_general[llava-v1.6-mistral-7b-hf-pp:1-tp:1-float16-bs:8-cpp_e2e:False-nb:1]
-examples/test_multimodal.py::test_llm_multimodal_general[llava-v1.6-mistral-7b-hf-vision-trtllm-pp:1-tp:1-float16-bs:1-cpp_e2e:False-nb:1]
-examples/test_multimodal.py::test_llm_multimodal_general[llava-v1.6-mistral-7b-hf-vision-trtllm-pp:1-tp:2-float16-bs:1-cpp_e2e:False-nb:1]
 examples/test_multimodal.py::test_llm_multimodal_general[llava-onevision-qwen2-7b-ov-hf-pp:1-tp:1-float16-bs:1-cpp_e2e:False-nb:1]
 examples/test_multimodal.py::test_llm_multimodal_general[llava-onevision-qwen2-7b-ov-hf-video-pp:1-tp:1-float16-bs:1-cpp_e2e:False-nb:1]
-examples/test_multimodal.py::test_llm_multimodal_general[Mistral-Small-3.1-24B-Instruct-2503-pp:1-tp:1-bfloat16-bs:8-cpp_e2e:False-nb:1]
 examples/test_multimodal.py::test_llm_multimodal_general[nougat-base-pp:1-tp:1-bfloat16-bs:1-cpp_e2e:False-nb:1]
 examples/test_multimodal.py::test_llm_multimodal_general[nougat-base-pp:1-tp:1-bfloat16-bs:8-cpp_e2e:False-nb:1]
 examples/test_multimodal.py::test_llm_multimodal_general[video-neva-pp:1-tp:1-bfloat16-bs:1-cpp_e2e:False-nb:1]
@@ -197,15 +167,7 @@ examples/test_multimodal.py::test_llm_multimodal_general[fuyu-8b-pp:1-tp:1-float
 examples/test_multimodal.py::test_llm_multimodal_general[kosmos-2-pp:1-tp:1-float16-bs:8-cpp_e2e:True-nb:1]
 examples/test_multimodal.py::test_llm_multimodal_general[llava-1.5-7b-hf-pp:1-tp:1-float16-bs:8-cpp_e2e:True-nb:1]
 examples/test_multimodal.py::test_llm_fp8_multimodal_general[fp8-fp8-scienceqa-Llama-3.2-11B-Vision-Instruct-pp:1-tp:1-bfloat16-bs:1-cpp_e2e:False]
-examples/test_nemotron.py::test_llm_nemotron_3_8b_1gpu[bfloat16-full_prec]
-examples/test_nemotron.py::test_llm_nemotron_3_8b_1gpu[bfloat16-int4_awq]
-examples/test_nemotron.py::test_llm_nemotron_4_15b_1gpu[bfloat16-fp8]
-examples/test_nemotron.py::test_llm_nemotron_4_15b_1gpu[bfloat16-full_prec]
-examples/test_nemotron.py::test_llm_nemotron_4_15b_2gpus[bfloat16-fp8]
-examples/test_nemotron.py::test_llm_nemotron_4_15b_2gpus[bfloat16-full_prec]
-examples/test_nemotron.py::test_llm_nemotron_4_15b_2gpus[bfloat16-int4_awq]
-examples/test_nemotron_nas.py::test_nemotron_nas_summary_1gpu[DeciLM-7B]
-examples/test_nemotron_nas.py::test_nemotron_nas_summary_2gpu[DeciLM-7B]
+
 examples/test_phi.py::test_llm_phi_1node_2gpus_summary[Phi-3.5-MoE-instruct-nb:1]
 examples/test_phi.py::test_llm_phi_lora_1gpu[Phi-3-mini-4k-instruct-ru-lora-Phi-3-mini-4k-instruct-lora_fp16-base_fp16]
 examples/test_phi.py::test_llm_phi_lora_1gpu[Phi-3-mini-4k-instruct-ru-lora-Phi-3-mini-4k-instruct-lora_fp16-base_fp8]
@@ -305,8 +267,6 @@ accuracy/test_cli_flow.py::TestPhi3Mini128kInstruct::test_auto_dtype
 accuracy/test_cli_flow.py::TestPhi3Small8kInstruct::test_auto_dtype
 accuracy/test_cli_flow.py::TestPhi3Small128kInstruct::test_auto_dtype
 accuracy/test_cli_flow.py::TestPhi3_5MiniInstruct::test_auto_dtype
-accuracy/test_cli_flow.py::TestPhi4MiniInstruct::test_auto_dtype
-accuracy/test_cli_flow.py::TestPhi4MiniInstruct::test_tp2
 accuracy/test_cli_flow.py::TestLongAlpaca7B::test_auto_dtype
 accuracy/test_cli_flow.py::TestLongAlpaca7B::test_multiblock_aggressive
 accuracy/test_cli_flow.py::TestMamba130M::test_auto_dtype
@@ -385,9 +345,6 @@ accuracy/test_llm_api_pytorch.py::TestLlama3_2_3B::test_auto_dtype
 accuracy/test_llm_api_pytorch.py::TestLlama3_2_3B::test_fp8_prequantized
 accuracy/test_cli_flow.py::TestLlama3_3_70BInstruct::test_fp8_prequantized_tp4
 accuracy/test_cli_flow.py::TestLlama3_3_70BInstruct::test_nvfp4_prequantized_tp4
-accuracy/test_cli_flow.py::TestMistral7B::test_beam_search
-accuracy/test_cli_flow.py::TestMistral7B::test_fp8_tp4pp2
-accuracy/test_cli_flow.py::TestMistral7B::test_smooth_quant_tp4pp1
 accuracy/test_cli_flow.py::TestMixtral8x7B::test_fp8_tp2pp2
 accuracy/test_cli_flow.py::TestMixtral8x7B::test_fp8_tp2pp2_manage_weights
 accuracy/test_cli_flow.py::TestMixtral8x7B::test_fp4_plugin
@@ -421,9 +378,6 @@ accuracy/test_cli_flow.py::TestQwen2_57B_A14B::test_tp2pp2
 accuracy/test_llm_api.py::TestLlama3_1_8BInstruct::test_guided_decoding[xgrammar]
 accuracy/test_llm_api.py::TestLlama3_1_8BInstruct::test_guided_decoding_4gpus[xgrammar]
 accuracy/test_llm_api.py::TestLlama3_1_8BInstruct::test_gather_generation_logits_cuda_graph
-accuracy/test_llm_api.py::TestLlama3_1_8BInstruct::test_logprobs
-accuracy/test_llm_api.py::TestPhi4MiniInstruct::test_auto_dtype
-accuracy/test_llm_api.py::TestPhi4MiniInstruct::test_fp8
 accuracy/test_llm_api.py::TestQwen2_5_1_5BInstruct::test_auto_dtype
 accuracy/test_llm_api.py::TestQwen2_5_1_5BInstruct::test_weight_only
 accuracy/test_llm_api.py::TestLlama3_1_8B::test_fp8_rowwise
@@ -432,13 +386,6 @@ accuracy/test_llm_api.py::TestQwen2_5_0_5BInstruct::test_fp8
 accuracy/test_llm_api.py::TestQwen2_5_1_5BInstruct::test_fp8
 accuracy/test_llm_api.py::TestQwen2_5_7BInstruct::test_fp8
 accuracy/test_llm_api.py::TestQwen2_5_7BInstruct::test_fp8_kvcache
-accuracy/test_llm_api.py::TestMistral7B_0_3::test_quant_tp4[int4]
-accuracy/test_llm_api.py::TestMistral7B_0_3::test_quant_tp4[int4_awq]
-accuracy/test_llm_api.py::TestMistral7B_0_3::test_quant_tp4[int8_awq]
-accuracy/test_llm_api.py::TestMistralNemo12B::test_auto_dtype
-accuracy/test_llm_api.py::TestMistralNemo12B::test_auto_dtype_tp2
-accuracy/test_llm_api.py::TestMistralNemo12B::test_fp8
-accuracy/test_llm_api.py::TestMistral_NeMo_Minitron_8B_Instruct::test_fp8
 accuracy/test_llm_api.py::TestMixtral8x7B::test_tp2
 accuracy/test_llm_api.py::TestMixtral8x7B::test_smooth_quant_tp2pp2
 accuracy/test_llm_api.py::TestMixtral8x7BInstruct::test_awq_tp2
@@ -691,7 +638,7 @@ test_e2e.py::test_trtllm_bench_pytorch_backend_sanity[meta-llama/Llama-3.1-8B-ll
 test_e2e.py::test_ptp_scaffolding[DeepSeek-R1-Distill-Qwen-7B-DeepSeek-R1/DeepSeek-R1-Distill-Qwen-7B]
 unittest/llmapi/test_llm_pytorch.py::test_gemma3_1b_instruct_multi_lora
 examples/test_medusa.py::test_codellama_medusa_1gpu[CodeLlama-7b-Instruct]
-examples/test_medusa.py::test_mistral_medusa_1gpu[mistral-7b-v0.1]
+
 examples/test_medusa.py::test_qwen_medusa_1gpu[qwen_7b_chat]
 examples/test_medusa.py::test_qwen_medusa_1gpu[qwen1.5_7b_chat]
 examples/test_medusa.py::test_qwen_medusa_1gpu[qwen2_7b_instruct]
@@ -706,8 +653,7 @@ examples/test_eagle.py::test_codellama_eagle_1gpu[CodeLlama-7b-Instruct-eagle1]
 examples/test_eagle.py::test_llama_eagle_1gpu[llama-v2-7b-hf-eagle1]
 examples/test_eagle.py::test_llama_eagle_1gpu[llama-3.2-1b-eagle1]
 examples/test_eagle.py::test_llama_eagle_1gpu[llama-3.1-8b-eagle1]
-examples/test_eagle.py::test_mistral_eagle_1gpu[mistral-7b-v0.1-eagle1]
-examples/test_eagle.py::test_mistral_nemo_eagle_1gpu[Mistral-Nemo-12b-Base-eagle1]
+
 examples/test_eagle.py::test_qwen_eagle_1gpu[qwen_7b_chat-eagle1]
 examples/test_eagle.py::test_qwen_eagle_1gpu[qwen1.5_7b_chat-eagle1]
 examples/test_eagle.py::test_qwen_eagle_1gpu[qwen2_7b_instruct-eagle1]
@@ -721,8 +667,7 @@ examples/test_eagle.py::test_codellama_eagle_1gpu[CodeLlama-7b-Instruct-eagle2]
 examples/test_eagle.py::test_llama_eagle_1gpu[llama-v2-7b-hf-eagle2]
 examples/test_eagle.py::test_llama_eagle_1gpu[llama-3.2-1b-eagle2]
 examples/test_eagle.py::test_llama_eagle_1gpu[llama-3.1-8b-eagle2]
-examples/test_eagle.py::test_mistral_eagle_1gpu[mistral-7b-v0.1-eagle2]
-examples/test_eagle.py::test_mistral_nemo_eagle_1gpu[Mistral-Nemo-12b-Base-eagle2]
+
 examples/test_eagle.py::test_qwen_eagle_1gpu[qwen_7b_chat-eagle2]
 examples/test_eagle.py::test_qwen_eagle_1gpu[qwen1.5_7b_chat-eagle2]
 examples/test_eagle.py::test_qwen_eagle_1gpu[qwen2_7b_instruct-eagle2]
