Step-by-Step

This example loads a MobileBERT model and confirms its accuracy and speed on GLUE data.

Prerequisite

1. Environment

pip install neural-compressor
pip install -r requirements.txt

Note: Use a validated ONNX Runtime version.
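
Before moving on, it can help to confirm which package versions were actually installed. The following is a minimal sketch using only the Python standard library. The package names in the usage section are assumptions (this README names only neural-compressor; the rest come from requirements.txt), so adjust them as needed.

```python
from importlib import metadata


def installed_versions(packages):
    """Return {package: version or None} for each distribution name."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None  # not installed in this environment
    return versions


if __name__ == "__main__":
    # Assumed package names; edit to match requirements.txt.
    names = ["neural-compressor", "onnxruntime", "onnx"]
    for name, version in installed_versions(names).items():
        print(f"{name}: {version or 'NOT INSTALLED'}")
```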

2. Prepare Dataset

Download the GLUE data with the prepare_data.sh script.

export GLUE_DIR=path/to/glue_data
export TASK_NAME=MRPC

bash prepare_data.sh --data_dir=$GLUE_DIR --task_name=$TASK_NAME
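
After the download finishes, a quick check of the task directory can catch a failed or partial download early. This is a sketch only: the file names train.tsv and dev.tsv are an assumption based on the usual GLUE layout, and missing_glue_files is a hypothetical helper, not part of this example's scripts.

```python
import os
from pathlib import Path


def missing_glue_files(glue_dir, task_name, expected=("train.tsv", "dev.tsv")):
    """List the expected files that are absent from <glue_dir>/<task_name>."""
    task_dir = Path(glue_dir) / task_name
    return [name for name in expected if not (task_dir / name).is_file()]


if __name__ == "__main__":
    # Reads the same variables exported in the shell commands above.
    glue_dir = os.environ.get("GLUE_DIR", "path/to/glue_data")
    task_name = os.environ.get("TASK_NAME", "MRPC")
    missing = missing_glue_files(glue_dir, task_name)
    print("missing files:", missing if missing else "none")
```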

3. Prepare Model

Please refer to the Bert-GLUE_OnnxRuntime_quantization guide for detailed model export steps. The following is a simple example.

Use Hugging Face Transformers to fine-tune the model on the MRPC task with a command like:

export OUT_DIR=/path/to/out_dir/
python ./run_glue.py \
    --model_type mobilebert \
    --model_name_or_path google/mobilebert-uncased \
    --task_name $TASK_NAME \
    --do_train \
    --do_eval \
    --do_lower_case \
    --data_dir $GLUE_DIR/$TASK_NAME \
    --max_seq_length 128 \
    --per_gpu_eval_batch_size=8  \
    --per_gpu_train_batch_size=8  \
    --learning_rate 2e-5 \
    --num_train_epochs 5.0 \
    --save_steps 100000 \
    --output_dir $OUT_DIR

Run the prepare_model.sh script:

bash prepare_model.sh --input_dir=$OUT_DIR \
                      --task_name=$TASK_NAME \
                      --output_model=path/to/model # model path as *.onnx

Run

1. Quantization

Dynamic quantization:

# --input_model and --output_model are model paths ending in *.onnx
bash run_quant.sh --input_model=path/to/model \
                   --output_model=path/to/model_tune \
                   --dataset_location=path/to/glue_data
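
For context, dynamic quantization computes the quantization parameters (scale and zero point) of activations at inference time from each tensor's observed range, rather than from a calibration pass. The sketch below illustrates that range-to-uint8 mapping in plain Python. It is an illustration of the idea only, not the code path used by run_quant.sh, Neural Compressor, or ONNX Runtime, and the function names are hypothetical.

```python
def quantize_uint8(values):
    """Asymmetric uint8 quantization with a range taken from the data itself."""
    lo = min(min(values), 0.0)  # include 0 so that 0.0 maps to an exact integer
    hi = max(max(values), 0.0)
    scale = (hi - lo) / 255.0 or 1.0  # fall back to 1.0 for all-zero input
    zero_point = round(-lo / scale)
    quantized = [
        max(0, min(255, round(v / scale) + zero_point)) for v in values
    ]
    return quantized, scale, zero_point


def dequantize(quantized, scale, zero_point):
    """Map uint8 codes back to approximate float values."""
    return [(q - zero_point) * scale for q in quantized]


if __name__ == "__main__":
    activations = [-1.0, 0.0, 0.5, 2.0]
    q, scale, zp = quantize_uint8(activations)
    restored = dequantize(q, scale, zp)
    for original, approx in zip(activations, restored):
        print(f"{original:+.3f} -> {approx:+.3f}")
```

Because the range is recomputed for every input, no calibration dataset is needed to fix activation ranges ahead of time; the trade-off is a small amount of extra work per inference.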

2. Benchmark

# --input_model is a model path ending in *.onnx
bash run_benchmark.sh --input_model=path/to/model \
                      --dataset_location=path/to/glue_data \
                      --batch_size=batch_size \
                      --mode=performance # or accuracy
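
As a rough picture of what a performance measurement involves, the sketch below times a callable after a few warm-up runs and reports mean latency and throughput. It is a generic illustration, not the logic of run_benchmark.sh or main.py: run_once is a hypothetical stand-in for one inference call on a batch, and the iteration counts are arbitrary.

```python
import time


def measure(run_once, batch_size, iterations=100, warmup=10):
    """Time run_once() and report mean latency (ms) and throughput (samples/s)."""
    for _ in range(warmup):
        run_once()  # warm-up runs are excluded from the timing
    start = time.perf_counter()
    for _ in range(iterations):
        run_once()
    elapsed = time.perf_counter() - start
    return {
        "latency_ms": 1000.0 * elapsed / iterations,
        "throughput": batch_size * iterations / elapsed,
    }


if __name__ == "__main__":
    # Placeholder workload standing in for a model inference call.
    stats = measure(lambda: sum(range(10_000)), batch_size=8)
    print(f"latency: {stats['latency_ms']:.3f} ms per batch")
    print(f"throughput: {stats['throughput']:.1f} samples/s")
```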
