# Description

This folder contains QA test definitions for TensorRT-LLM, which are executed on a daily/release schedule. These tests focus on end-to-end validation, accuracy verification, disaggregated testing, and performance benchmarking.

## Test Categories

QA tests are organized into three main categories:

### 1. Functional Tests
Functional tests include E2E (end-to-end), accuracy, and disaggregated test cases:

- **E2E Tests**: Complete workflow validation from model loading to inference output
- **Accuracy Tests**: Model accuracy verification against reference implementations
- **Disaggregated Tests**: Distributed deployment and multi-node scenario validation

### 2. Performance Tests
Performance tests focus on benchmarking and performance validation:
- Baseline performance measurements
- Performance regression detection
- Throughput and latency benchmarking
- Resource utilization analysis

### 3. Triton Backend Tests
Triton backend tests validate the integration with NVIDIA Triton Inference Server:
- Backend functionality validation
- Model serving capabilities
- API compatibility testing
- Integration performance testing

## Dependencies

The following Python packages are required for running QA tests:

```bash
pip install mako oyaml rouge_score lm_eval
```

### Dependency Details

- **mako**: Template engine for test generation and configuration
- **oyaml**: YAML parser with ordered dictionary support
- **rouge_score**: ROUGE evaluation metrics for text generation quality assessment
- **lm_eval**: Language model evaluation framework
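Availability of these packages can be checked programmatically before a run. A minimal sketch using only the standard library — the helper name is ours, not part of the test harness:

```python
import importlib.util

# Packages the QA tests need, per the `pip install` line above.
REQUIRED = ["mako", "oyaml", "rouge_score", "lm_eval"]

def missing_packages(names=REQUIRED):
    """Return the subset of `names` that cannot be imported."""
    return [n for n in names if importlib.util.find_spec(n) is None]
```

Calling `missing_packages()` before a QA session surfaces missing dependencies up front instead of mid-run.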

## Test Files

This directory contains various test configuration files:

### Functional Test Lists
- `llm_function_full.txt` - Primary test list for single-node multi-GPU scenarios (all new test cases should be added here)
- `llm_function_sanity.txt` - Subset of test cases for quick torch-flow validation
- `llm_function_nim.txt` - NIM-specific functional test cases
- `llm_function_multinode.txt` - Multi-node functional test cases
- `llm_function_gb20x.txt` - GB20X release test cases
- `llm_function_rtx6kd.txt` - RTX 6000 Ada specific tests
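The `.txt` lists are consumed through pytest's `--test-list` option (see "Running Tests" below). To preview which entries a keyword would select — analogous in spirit to pytest's `-k` filter — a small stdlib helper like the following can be used; the function name and the assumption that lists allow `#` comments are ours:

```python
def filter_test_list(lines, keyword):
    """Return non-comment, non-blank entries containing `keyword` (case-insensitive)."""
    matches = []
    for line in lines:
        entry = line.split("#", 1)[0].strip()  # drop trailing comments and whitespace
        if entry and keyword.lower() in entry.lower():
            matches.append(entry)
    return matches
```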

### Performance Test Files
- `llm_perf_full.yml` - Main performance test configuration
- `llm_perf_cluster.yml` - Cluster-based performance tests
- `llm_perf_sanity.yml` - Performance sanity checks
- `llm_perf_nim.yml` - NIM-specific performance tests
- `llm_trt_integration_perf.yml` - Integration performance tests
- `llm_trt_integration_perf_sanity.yml` - Integration performance sanity checks

### Triton Backend Tests
- `llm_triton_integration.txt` - Triton backend integration tests

### Release-Specific Tests
- `llm_digits_func.txt` - Functional tests for DIGITS release
- `llm_digits_perf.txt` - Performance tests for DIGITS release

## Test Execution Schedule

QA tests are executed on a regular schedule:

- **Weekly**: Automated regression testing
- **Release**: Comprehensive validation before each release
- **On-demand**: Manual execution for specific validation needs

## Running Tests

### Manual Execution

To run specific test categories:

```bash
# Change to the defs folder
cd tests/integration/defs
# Run all FP8 functional tests
pytest --no-header -vs --test-list=../test_lists/qa/llm_function_full.txt -k fp8
# Run a single test case
pytest -vs accuracy/test_cli_flow.py::TestLlama3_1_8B::test_auto_dtype
```

### Automated Execution

QA tests are typically executed through CI/CD pipelines, with test selection based on:

- Release requirements
- Hardware availability
- Test priority and scope

## Test Guidelines

### Adding New Test Cases
- **Primary Location**: For functional testing, new test cases should be added to `llm_function_full.txt` first
- **Categorization**: Test cases should be categorized based on their scope and execution time
- **Validation**: Ensure test cases are properly validated before adding them to any test list
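As a lightweight pre-commit sanity check, list entries can be verified to at least look like pytest node ids. This sketch assumes entries follow the `path/to/test.py::Class::test_name` shape shown in the examples above; the regex and helper are illustrative, not part of the harness:

```python
import re

# Loose pattern for a pytest node id, e.g.
# "accuracy/test_cli_flow.py::TestLlama3_1_8B::test_auto_dtype"
NODE_ID = re.compile(r"^[\w./\-]+\.py(::[\w\[\]\-=,. ]+)*$")

def invalid_entries(lines):
    """Return entries that do not look like pytest node ids."""
    bad = []
    for line in lines:
        entry = line.split("#", 1)[0].strip()  # ignore blanks and comments
        if entry and not NODE_ID.match(entry):
            bad.append(entry)
    return bad
```

Running a check like this before committing a list edit catches malformed entries earlier than a full pytest collection pass would.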