ONNX Runtime based inference optimization of a RoBERTa text classification model.
Inference optimization was analysed based on:
- Graph optimization
- Quantization
- CPU architecture: AVX2, AVX512
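
The three axes above can be sketched roughly as follows. This is a minimal illustration, not the exact setup used in this project: the helper names (`detect_simd`, `make_optimized_session`, `quantize_model`) and all file paths are hypothetical, dynamic INT8 quantization is assumed because the quantization mode is not stated here, and the CPU check assumes a Linux host that exposes `/proc/cpuinfo`.

```python
from pathlib import Path


def detect_simd(cpuinfo_text: str) -> dict:
    """Report whether avx2 / avx512f appear in the first 'flags' line
    of /proc/cpuinfo-style text (Linux only, hypothetical helper)."""
    flags = set()
    for line in cpuinfo_text.splitlines():
        if line.startswith("flags") and ":" in line:
            flags.update(line.split(":", 1)[1].split())
            break
    return {"avx2": "avx2" in flags, "avx512f": "avx512f" in flags}


def make_optimized_session(model_path: str, optimized_path: str):
    """Create a CPU session with all graph optimizations enabled and
    save the optimized graph to optimized_path."""
    # Imported here so detect_simd() stays usable without onnxruntime installed.
    import onnxruntime as ort

    opts = ort.SessionOptions()
    opts.graph_optimization_level = ort.GraphOptimizationLevel.ORT_ENABLE_ALL
    opts.optimized_model_filepath = optimized_path
    return ort.InferenceSession(
        model_path, sess_options=opts, providers=["CPUExecutionProvider"]
    )


def quantize_model(model_path: str, quantized_path: str) -> None:
    """Write a dynamically quantized copy of the model with INT8 weights
    (assumed mode; static quantization would need calibration data)."""
    from onnxruntime.quantization import QuantType, quantize_dynamic

    quantize_dynamic(model_path, quantized_path, weight_type=QuantType.QInt8)


if __name__ == "__main__":
    cpuinfo = Path("/proc/cpuinfo")
    if cpuinfo.exists():
        print(detect_simd(cpuinfo.read_text()))
    # Hypothetical paths; replace with the exported RoBERTa ONNX model.
    # make_optimized_session("roberta.onnx", "roberta.opt.onnx")
    # quantize_model("roberta.onnx", "roberta.quant.onnx")
```

Checking the CPU flags before benchmarking makes it easier to attribute timing differences to AVX2 versus AVX512 hardware rather than to the model variant.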
Metrics: conclusions are based on
- Classification accuracy
- Inference performance in milliseconds (ms)
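
As a rough sketch of how these two metrics might be computed, the snippet below measures accuracy over a labelled set and per-call latency in milliseconds. The function names, the warm-up count, and the `predict` callable (any function that runs one inference, for example a wrapper around an ONNX Runtime session) are assumptions for illustration, not the project's actual benchmark code.

```python
import statistics
import time


def accuracy(predictions, labels) -> float:
    """Fraction of predictions that match the reference labels."""
    if len(predictions) != len(labels):
        raise ValueError("predictions and labels must have the same length")
    if not labels:
        raise ValueError("labels must not be empty")
    correct = sum(p == y for p, y in zip(predictions, labels))
    return correct / len(labels)


def latency_ms(predict, inputs, warmup: int = 3) -> dict:
    """Time predict(x) for every input and summarise in milliseconds.

    The first `warmup` inputs are run once untimed so that one-off
    initialisation costs do not skew the measurements.
    """
    for x in inputs[:warmup]:
        predict(x)
    timings = []
    for x in inputs:
        start = time.perf_counter()
        predict(x)
        timings.append((time.perf_counter() - start) * 1000.0)
    return {
        "mean_ms": statistics.mean(timings),
        "median_ms": statistics.median(timings),
    }
```

Running the same two functions against the baseline, graph-optimized, and quantized models keeps the comparison like for like.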
Dataset :
Hugging Face model :