Commit 8f89b71
continue updating SCC'23 tutorial (added Google Colab page)
gfursin committed Nov 5, 2023
1 parent 84def49 commit 8f89b71
Showing 6 changed files with 37 additions and 12 deletions.
README.md (2 changes: 2 additions & 0 deletions)
@@ -7,6 +7,8 @@

### News

+* The ACM YouTube channel has released the ACM REP'23 keynote about the MLCommons CM automation language and CK playground:
+[toward a common language to facilitate reproducible research and technology transfer](https://youtu.be/_1f9i_Bzjmg?si=7XoXRtcU0rglRJr0).
* The MLCommons Task Force on Automation and Reproducibility is resuming its [weekly conf-calls](https://docs.google.com/document/d/1zMNK1m_LhWm6jimZK6YE05hu4VH9usdbKJ3nBy-ZPAw/edit) -
they are open to everyone!
* [The community](https://access.cknowledge.org/playground/?action=contributors) has successfully validated
docs/_generator/generate_toc.cmd (1 change: 1 addition & 0 deletions)
@@ -1,5 +1,6 @@
cd ../tutorials

+cm create-toc-from-md utils --input=scc23-mlperf-inference-bert.md
cm create-toc-from-md utils --input=sc22-scc-mlperf.md
cm create-toc-from-md utils --input=sc22-scc-mlperf-part2.md
cm create-toc-from-md utils --input=sc22-scc-mlperf-part3.md
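
For reference, each line above regenerates the inline table of contents of one tutorial in place. The generic form (a sketch with a hypothetical input file name) is:

```bash
cm create-toc-from-md utils --input=my-tutorial.md
```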
docs/news.md (5 changes: 5 additions & 0 deletions)
@@ -2,6 +2,11 @@

## MLCommons CK and CM news

+### 202311
+
+* The ACM YouTube channel has released the ACM REP'23 keynote about the MLCommons CM automation language and CK playground:
+[toward a common language to facilitate reproducible research and technology transfer](https://youtu.be/_1f9i_Bzjmg?si=7XoXRtcU0rglRJr0).
+
### 202310

* The MLCommons Task Force on Automation and Reproducibility is resuming [weekly conf-calls](https://docs.google.com/document/d/1zMNK1m_LhWm6jimZK6YE05hu4VH9usdbKJ3nBy-ZPAw/edit) -
docs/taskforce.md (2 changes: 1 addition & 1 deletion)
@@ -9,7 +9,7 @@ to develop an open-source automation technology that can help everyone [co-desig…
that can run AI and ML applications in the most efficient way across diverse models, data sets, software and hardware from any vendor
using the [MLPerf methodology](https://arxiv.org/abs/1911.02549).

-As an outcome of this community effort, we have developed the [MLCommons CM automation language](https://doi.org/10.5281/zenodo.8105339),
+As an outcome of this community effort, we have developed the [MLCommons CM automation language](https://youtu.be/_1f9i_Bzjmg?si=l0Qqon2Rt7pSji36),
[MLCommons C++ Modular Inference Library (MIL)](../cm-mlops/script/app-mlperf-inference-cpp/README-extra.md)
and the [MLCommons CK playground](https://access.cKnowledge.org).
This open-source technology was successfully validated during the [1st mass-scale community MLPerf inference submission](https://www.linkedin.com/feed/update/urn:li:activity:7112057645603119104/)
docs/tutorials/modular-image-classification.md (3 changes: 2 additions & 1 deletion)
@@ -1,6 +1,7 @@
[ [Back to index](../README.md) ]

-*This tutorial is also available in [Google Colab](https://colab.research.google.com/drive/1fPFw86BKOQ79U1-lksTkAtJHn3_jhP9o?usp=sharing).*
+*An interactive version of this tutorial is also available
+at this [Google Colab page](https://colab.research.google.com/drive/1fPFw86BKOQ79U1-lksTkAtJHn3_jhP9o?usp=sharing).*

# Trying CM: modular image classification

docs/tutorials/scc23-mlperf-inference-bert.md (36 changes: 26 additions & 10 deletions)
@@ -25,21 +25,22 @@
* [Download the SQuAD validation dataset](#download-the-squad-validation-dataset)
* [Detect or install ONNX runtime for CPU](#detect-or-install-onnx-runtime-for-cpu)
* [Download Bert-large model (FP32, ONNX format)](#download-bert-large-model-fp32-onnx-format)
-* [Pull MLPerf inference sources with reference implementations](#pull-mlperf-inference-sources-with-reference-implementations)
+* [Pull MLPerf inference sources with reference benchmark implementations](#pull-mlperf-inference-sources-with-reference-benchmark-implementations)
* [Run short reference MLPerf inference benchmark to measure accuracy (offline scenario)](#run-short-reference-mlperf-inference-benchmark-to-measure-accuracy-offline-scenario)
* [Run short MLPerf inference benchmark to measure performance (offline scenario)](#run-short-mlperf-inference-benchmark-to-measure-performance-offline-scenario)
* [Prepare minimal MLPerf submission to the SCC committee](#prepare-minimal-mlperf-submission-to-the-scc-committee)
* [Publish results at the live SCC'23 dashboard](#publish-results-at-the-live-scc23-dashboard)
* [Run optimized implementation of the MLPerf inference benchmark](#run-optimized-implementation-of-the-mlperf-inference-benchmark)
* [Showcasing CPU performance (x64 or Arm64)](#showcasing-cpu-performance-x64-or-arm64)
-* [int8](#int8)
-* [fp32](#fp32)
+* [Quantized and pruned BERT model (int8)](#quantized-and-pruned-bert-model-int8)
+* [Pruned BERT model (fp32)](#pruned-bert-model-fp32)
* [Showcasing Nvidia GPU performance](#showcasing-nvidia-gpu-performance)
* [Showcasing AMD performance](#showcasing-amd-performance)
* [Optimize benchmark yourself](#optimize-benchmark-yourself)
* [Using quantized models](#using-quantized-models)
* [Changing batch size](#changing-batch-size)
* [Adding support for multi-node execution](#adding-support-for-multi-node-execution)
* [Adding new implementation for new hardware](#adding-new-implementation-for-new-hardware)
* [The next steps](#the-next-steps)
* [Acknowledgments](#acknowledgments)
* [Nvidia MLPerf inference backend](#nvidia-mlperf-inference-backend)
* [DeepSparse MLPerf inference backend](#deepsparse-mlperf-inference-backend)
@@ -76,7 +77,8 @@ that you will submit to the SCC organizers to get points.




+*An interactive version of the short variant of this tutorial is available
+at this [Google Colab page](https://colab.research.google.com/drive/1kgw1pdKi8QcCTqPZu1Vh_ur1NOeTRdWJ?usp=sharing)*.



@@ -565,9 +567,18 @@ cm show cache --tags=get,ml-model,bert-large,_onnx

*Note that you will have a different CM UID consisting of 16 hexadecimal lowercase characters.*

+You can find the downloaded model as follows:
+
+```bash
+ls `cm find cache "download ml-model bert-large"` -l
+
+...
+1340711828 Nov 5 14:31 model.onnx
+...
+```
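
If you need that path programmatically, one option is to capture it in a shell variable (a sketch reusing the same cache tags; assumes a single matching cache entry):

```bash
# Capture the CM cache directory that holds the downloaded model
MODEL_DIR=`cm find cache "download ml-model bert-large"`
ls -l ${MODEL_DIR}/model.onnx
```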

-### Pull MLPerf inference sources with reference implementations
+### Pull MLPerf inference sources with reference benchmark implementations

You should now download and cache the MLPerf inference sources using the following command:
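
The command itself is collapsed in this diff view; a sketch of its usual form from the CM documentation (not the verbatim tutorial text) is:

```bash
# Download and cache the MLPerf inference sources via CM
cmr "get mlperf inference src"
```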

@@ -1012,23 +1023,25 @@ Don't forget to set this environment if you use Python virtual environment insta…
```bash
export CM_SCRIPT_EXTRA_CMD="--adr.python.name=mlperf"
```
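
For context, a virtual environment with this name is typically created via CM beforehand. A sketch, assuming the `install python-venv` automation from cm-mlops:

```bash
# Create a CM-managed Python virtual environment called "mlperf"
cmr "install python-venv" --name=mlperf
```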

-#### Int8 pruned BERT model
+#### Quantized and pruned BERT model (int8)

First, you can make a full (valid) run of the MLPerf inference benchmark with the quantized and pruned int8 BERT model,
a batch size of 128 and the DeepSparse backend via CM as follows:

```
cmr "run mlperf inference generate-run-cmds _submission _short" \
cmr "run mlperf inference generate-run-cmds _submission _short _dashboard" \
--submitter="SCC23" \
--hw_name=default \
--implementation=reference \
--model=bert-99 \
--backend=deepsparse \
--device=cpu \
--scenario=Offline \
- --execution-mode=valid \
+ --execution-mode=test \
--test_query_count=2000 \
--adr.mlperf-inference-implementation.max_batchsize=128 \
--env.CM_MLPERF_NEURALMAGIC_MODEL_ZOO_STUB=zoo:nlp/question_answering/mobilebert-none/pytorch/huggingface/squad/14layer_pruned50_quant-none-vnni \
+ --dashboard_wb_project=cm-mlperf-scc23-bert-offline \
--quiet \
--output_tar=mlperf_submission_1.tar.gz \
--output_summary=mlperf_submission_1_summary \
  ...
```
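
When the run completes, the submission archive and summary should appear in the current directory. A quick way to inspect the archive (plain tar usage, not part of the tutorial):

```bash
# List the first entries of the generated MLPerf submission archive
tar -tzf mlperf_submission_1.tar.gz | head
```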


-#### fp32 pruned BERT model
+#### Pruned BERT model (fp32)

```bash
cmr "run mlperf inference generate-run-cmds _submission _short" \
  ...
```
