Results from self-hosted GitHub Actions - NVIDIA RTX 4090
arjunsuresh committed Jan 19, 2025
1 parent 28b8711 commit ebea480
Showing 11 changed files with 383 additions and 383 deletions.
@@ -1,3 +1,3 @@
-| Model | Scenario | Accuracy | Throughput | Latency (in ms) |
-|---------------------|------------|------------|--------------|-------------------|
-| stable-diffusion-xl | offline | () | 0.352 | - |
+| Model | Scenario | Accuracy | Throughput | Latency (in ms) |
+|---------|------------|------------|--------------|-------------------|
+| gptj-99 | offline | 264 | 48.533 | - |
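
(Reading note, inferred from the logs further down rather than stated anywhere: for the gptj-99 row, the Accuracy value 264 is the GEN_LEN metric, and the Throughput value 48.533 matches the "Tokens per second (inferred)" figure from the detailed results, not the raw samples-per-second rate of 0.703373.)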
@@ -16,7 +16,7 @@ pip install -U cmind

cm rm cache -f

-cm pull repo gateoverflow@mlperf-automations --checkout=2d0f337e2813814778fe386c2ade45506c154333
+cm pull repo gateoverflow@mlperf-automations --checkout=85fc982f55fc029717a901c61d580eb4f5025ef9

cm run script \
--tags=app,mlperf,inference,generic,_reference,_gptj-99,_pytorch,_cuda,_test,_r5.0-dev_default,_float16,_offline \
@@ -105,4 +105,4 @@ Model Precision: fp32
`GEN_LEN`: `264.0`, Required accuracy for closed division `>= 42.55663`

### Performance Results
-`Samples per second`: `55.3675`
+`Samples per second`: `48.5327`
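
A possible point of confusion in this file: the README labels 48.5327 as `Samples per second`, but in the LoadGen summary further down the same number appears as "Tokens per second (inferred)", while the actual sample rate is 0.703373/s. A quick arithmetic check (all four numbers come from this diff; reading the ~69x factor as tokens generated for the single test sample is an assumption, not confirmed from the MLPerf source):

```python
# Sanity check: "Tokens per second (inferred)" vs. "Samples per second".
# Values taken from this diff; the ~69x ratio is inferred, not documented.
old = (0.802428, 55.3675)   # (samples/s, tokens/s) before this commit
new = (0.703373, 48.5327)   # (samples/s, tokens/s) after this commit

for samples_ps, tokens_ps in (old, new):
    print(round(tokens_ps / samples_ps, 2))   # -> 69.0 in both runs
```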
@@ -2,17 +2,17 @@ Constructing QSL
Encoding Samples
Finished constructing QSL.
Loading PyTorch model...
-Loading checkpoint shards: 100%|██████████| 3/3 [00:01<00:00, 1.84it/s]
+Loading checkpoint shards: 100%|██████████| 3/3 [00:01<00:00, 1.86it/s]
Some weights of the model checkpoint at /home/cmuser/CM/repos/local/cache/31767c21a8f149e5/checkpoint/checkpoint-final were not used when initializing GPTJForCausalLM: ['transformer.h.0.attn.bias', 'transformer.h.0.attn.masked_bias', 'transformer.h.1.attn.bias', 'transformer.h.1.attn.masked_bias', 'transformer.h.10.attn.bias', 'transformer.h.10.attn.masked_bias', 'transformer.h.11.attn.bias', 'transformer.h.11.attn.masked_bias', 'transformer.h.12.attn.bias', 'transformer.h.12.attn.masked_bias', 'transformer.h.13.attn.bias', 'transformer.h.13.attn.masked_bias', 'transformer.h.14.attn.bias', 'transformer.h.14.attn.masked_bias', 'transformer.h.15.attn.bias', 'transformer.h.15.attn.masked_bias', 'transformer.h.16.attn.bias', 'transformer.h.16.attn.masked_bias', 'transformer.h.17.attn.bias', 'transformer.h.17.attn.masked_bias', 'transformer.h.18.attn.bias', 'transformer.h.18.attn.masked_bias', 'transformer.h.19.attn.bias', 'transformer.h.19.attn.masked_bias', 'transformer.h.2.attn.bias', 'transformer.h.2.attn.masked_bias', 'transformer.h.20.attn.bias', 'transformer.h.20.attn.masked_bias', 'transformer.h.21.attn.bias', 'transformer.h.21.attn.masked_bias', 'transformer.h.22.attn.bias', 'transformer.h.22.attn.masked_bias', 'transformer.h.23.attn.bias', 'transformer.h.23.attn.masked_bias', 'transformer.h.24.attn.bias', 'transformer.h.24.attn.masked_bias', 'transformer.h.25.attn.bias', 'transformer.h.25.attn.masked_bias', 'transformer.h.26.attn.bias', 'transformer.h.26.attn.masked_bias', 'transformer.h.27.attn.bias', 'transformer.h.27.attn.masked_bias', 'transformer.h.3.attn.bias', 'transformer.h.3.attn.masked_bias', 'transformer.h.4.attn.bias', 'transformer.h.4.attn.masked_bias', 'transformer.h.5.attn.bias', 'transformer.h.5.attn.masked_bias', 'transformer.h.6.attn.bias', 'transformer.h.6.attn.masked_bias', 'transformer.h.7.attn.bias', 'transformer.h.7.attn.masked_bias', 'transformer.h.8.attn.bias', 'transformer.h.8.attn.masked_bias', 'transformer.h.9.attn.bias', 'transformer.h.9.attn.masked_bias']
- This IS expected if you are initializing GPTJForCausalLM from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing GPTJForCausalLM from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).
Casting models to GPU...
-100%|██████████| 285/285 [00:00<00:00, 1962851.63it/s]
+100%|██████████| 285/285 [00:00<00:00, 1628578.53it/s]
Running LoadGen test...
Number of Samples in query_samples : 1
0%| | 0/1 [00:00<?, ?it/s]/home/cmuser/venv/cm/lib/python3.10/site-packages/transformers/generation/configuration_utils.py:676: UserWarning: `num_beams` is set to 1. However, `early_stopping` is set to `True` -- this flag is only used in beam-based generation modes. You should set `num_beams>1` or unset `early_stopping`.
warnings.warn(
-100%|██████████| 1/1 [00:01<00:00, 1.28s/it]
+100%|██████████| 1/1 [00:01<00:00, 1.32s/it]

No warnings encountered during test.

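Two messages in this log are worth decoding. The "weights ... were not used" block is benign, as the log itself says: recent transformers releases rebuild GPT-J's attn.bias / attn.masked_bias buffers at init instead of reading them from the checkpoint. The UserWarning fires because early_stopping=True has no effect under greedy decoding (num_beams=1). A minimal sketch of both, assuming a recent transformers release and a hypothetical checkpoint path (this is not the benchmark's actual harness code):

```python
# Hedged sketch reproducing the two warnings above; NOT the benchmark's
# harness code, and the checkpoint path below is hypothetical.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

ckpt = "/path/to/gptj-checkpoint"  # the real path lives in the CM cache

# Loads the three checkpoint shards; attn.bias / attn.masked_bias are
# rebuilt at init by recent transformers versions, so the "weights ...
# were not used" message is expected, as the log itself notes.
model = AutoModelForCausalLM.from_pretrained(ckpt, torch_dtype=torch.float16)
model.to("cuda")  # "Casting models to GPU..."

# early_stopping=True is only meaningful for beam search; unsetting it
# silences the UserWarning without changing the greedy output.
model.generation_config.early_stopping = False

tok = AutoTokenizer.from_pretrained(ckpt)
inputs = tok("Summarize the following article: ...", return_tensors="pt").to("cuda")
output_ids = model.generate(**inputs, max_new_tokens=128)
print(tok.decode(output_ids[0], skip_special_tokens=True))
```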
@@ -26,5 +26,5 @@
],
"CM_HOST_PLATFORM_FLAVOR": "x86_64",
"CM_HOST_PYTHON_BITS": "64",
"CM_HOST_SYSTEM_NAME": "943d335e77c7"
"CM_HOST_SYSTEM_NAME": "e31fe17416a4"
}
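
The only change in this host-info file is the hostname, presumably because each CI run gets a fresh container. For illustration only (not CM's actual implementation; only the key names are copied from the JSON above), fields like these can be derived from the Python standard library:

```python
# Illustrative approximation of these host fields using only the standard
# library; key names copied from the JSON above, the logic is an assumption.
import platform
import socket
import struct

host_info = {
    "CM_HOST_PLATFORM_FLAVOR": platform.machine(),         # e.g. "x86_64"
    "CM_HOST_PYTHON_BITS": str(struct.calcsize("P") * 8),  # pointer width -> "64"
    "CM_HOST_SYSTEM_NAME": socket.gethostname(),           # container ID inside Docker
}
print(host_info)
```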
@@ -2,25 +2,25 @@ Constructing QSL
Encoding Samples
Finished constructing QSL.
Loading PyTorch model...
-Loading checkpoint shards: 100%|██████████| 3/3 [00:01<00:00, 1.98it/s]
+Loading checkpoint shards: 100%|██████████| 3/3 [00:05<00:00, 1.78s/it]
Some weights of the model checkpoint at /home/cmuser/CM/repos/local/cache/31767c21a8f149e5/checkpoint/checkpoint-final were not used when initializing GPTJForCausalLM: ['transformer.h.0.attn.bias', 'transformer.h.0.attn.masked_bias', 'transformer.h.1.attn.bias', 'transformer.h.1.attn.masked_bias', 'transformer.h.10.attn.bias', 'transformer.h.10.attn.masked_bias', 'transformer.h.11.attn.bias', 'transformer.h.11.attn.masked_bias', 'transformer.h.12.attn.bias', 'transformer.h.12.attn.masked_bias', 'transformer.h.13.attn.bias', 'transformer.h.13.attn.masked_bias', 'transformer.h.14.attn.bias', 'transformer.h.14.attn.masked_bias', 'transformer.h.15.attn.bias', 'transformer.h.15.attn.masked_bias', 'transformer.h.16.attn.bias', 'transformer.h.16.attn.masked_bias', 'transformer.h.17.attn.bias', 'transformer.h.17.attn.masked_bias', 'transformer.h.18.attn.bias', 'transformer.h.18.attn.masked_bias', 'transformer.h.19.attn.bias', 'transformer.h.19.attn.masked_bias', 'transformer.h.2.attn.bias', 'transformer.h.2.attn.masked_bias', 'transformer.h.20.attn.bias', 'transformer.h.20.attn.masked_bias', 'transformer.h.21.attn.bias', 'transformer.h.21.attn.masked_bias', 'transformer.h.22.attn.bias', 'transformer.h.22.attn.masked_bias', 'transformer.h.23.attn.bias', 'transformer.h.23.attn.masked_bias', 'transformer.h.24.attn.bias', 'transformer.h.24.attn.masked_bias', 'transformer.h.25.attn.bias', 'transformer.h.25.attn.masked_bias', 'transformer.h.26.attn.bias', 'transformer.h.26.attn.masked_bias', 'transformer.h.27.attn.bias', 'transformer.h.27.attn.masked_bias', 'transformer.h.3.attn.bias', 'transformer.h.3.attn.masked_bias', 'transformer.h.4.attn.bias', 'transformer.h.4.attn.masked_bias', 'transformer.h.5.attn.bias', 'transformer.h.5.attn.masked_bias', 'transformer.h.6.attn.bias', 'transformer.h.6.attn.masked_bias', 'transformer.h.7.attn.bias', 'transformer.h.7.attn.masked_bias', 'transformer.h.8.attn.bias', 'transformer.h.8.attn.masked_bias', 'transformer.h.9.attn.bias', 'transformer.h.9.attn.masked_bias']
- This IS expected if you are initializing GPTJForCausalLM from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing GPTJForCausalLM from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).
Casting models to GPU...
-100%|██████████| 285/285 [00:00<00:00, 1998957.59it/s]
+100%|██████████| 285/285 [00:00<00:00, 1885452.11it/s]
Running LoadGen test...
Number of Samples in query_samples : 1
0%| | 0/1 [00:00<?, ?it/s]/home/cmuser/venv/cm/lib/python3.10/site-packages/transformers/generation/configuration_utils.py:676: UserWarning: `num_beams` is set to 1. However, `early_stopping` is set to `True` -- this flag is only used in beam-based generation modes. You should set `num_beams>1` or unset `early_stopping`.
warnings.warn(
-100%|██████████| 1/1 [00:01<00:00, 1.25s/it]
+100%|██████████| 1/1 [00:01<00:00, 1.42s/it]
================================================
MLPerf Results Summary
================================================
SUT name : PySUT
Scenario : Offline
Mode : PerformanceOnly
-Samples per second: 0.802428
-Tokens per second (inferred): 55.3675
+Samples per second: 0.703373
+Tokens per second (inferred): 48.5327
Result is : VALID
Min duration satisfied : Yes
Min queries satisfied : Yes
@@ -29,15 +29,15 @@ Result is : VALID
================================================
Additional Stats
================================================
-Min latency (ns) : 1246217670
-Max latency (ns) : 1246217670
-Mean latency (ns) : 1246217670
-50.00 percentile latency (ns) : 1246217670
-90.00 percentile latency (ns) : 1246217670
-95.00 percentile latency (ns) : 1246217670
-97.00 percentile latency (ns) : 1246217670
-99.00 percentile latency (ns) : 1246217670
-99.90 percentile latency (ns) : 1246217670
+Min latency (ns) : 1421721585
+Max latency (ns) : 1421721585
+Mean latency (ns) : 1421721585
+50.00 percentile latency (ns) : 1421721585
+90.00 percentile latency (ns) : 1421721585
+95.00 percentile latency (ns) : 1421721585
+97.00 percentile latency (ns) : 1421721585
+99.00 percentile latency (ns) : 1421721585
+99.90 percentile latency (ns) : 1421721585

================================================
Test Parameters Used
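
Because this offline test issues a single query, every percentile in Additional Stats is the same measurement, and the reported throughput is simply its reciprocal. A quick check against the summary values (pure arithmetic on numbers from this diff):

```python
# Single-query offline run: throughput is the reciprocal of the one latency.
latency_ns = 1_421_721_585             # every percentile above, new run
print(round(1e9 / latency_ns, 6))      # -> 0.703373 samples/s, matching the summary

latency_ns_old = 1_246_217_670         # previous run
print(round(1e9 / latency_ns_old, 6))  # -> 0.802428
```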