Custom model had very slow performance (fps) #107
Comments
Maybe because of the input size? Try changing to a smaller input size.
I tried YOLO11n at 640x640 and it takes ~11 ms. Are there any profiler tools I can use to investigate performance bottlenecks?
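I'm not aware of a built-in per-layer profiler here, so the simplest starting point is wall-clock timing around the forward call. A minimal sketch (the `benchmark` helper and the dummy workload are hypothetical; substitute your actual `model.forward(img)` call):

```python
import time

def benchmark(forward, warmup=3, iters=20):
    """Return the average latency of forward() in milliseconds.

    Warm-up runs are excluded so one-time setup cost (memory
    allocation, caches) does not skew the average.
    """
    for _ in range(warmup):
        forward()
    t0 = time.perf_counter()
    for _ in range(iters):
        forward()
    return (time.perf_counter() - t0) / iters * 1000.0

# Dummy workload standing in for model.forward(img)
avg_ms = benchmark(lambda: sum(i * i for i in range(10_000)))
print(f"avg forward: {avg_ms:.2f} ms")
```

Combined with the "move the output node earlier" trick suggested above, timing each truncated model lets you bisect which layer dominates.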
Maybe you can set different output nodes to debug which node spends so much time.
Your model is simple; try setting different output nodes and exporting as BF16 or INT8, both are fast. Just try it.
And don't use the --quant_input arg if you use MaixPy.
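Putting that advice together, the deploy step might look like the sketch below. This assumes tpu-mlir's model_deploy.py and that the ONNX model was already converted to MLIR with model_transform.py; the file names and the cv181x chip target are placeholders for your setup:

```shell
# BF16 deploy sketch (tpu-mlir). Note: no --quant_input, so the
# cvimodel keeps float input/output, which is what MaixPy expects.
model_deploy.py \
  --mlir superpoint.mlir \
  --quantize BF16 \
  --chip cv181x \
  --model superpoint_bf16.cvimodel
```

Re-running this with a truncated output node (set during model_transform.py) is the quickest way to bisect where the time goes.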
Hi, I tried to run a custom model and it runs very slowly compared to YOLO. I tested with
examples/vision/ai_vision/nn_forward.py
and my model had a forward time of ~280 ms, compared to 11 ms for YOLOv8n, even though my model is 4x smaller. I'm actually trying to run the SuperPoint CNN. I exported the PyTorch model to ONNX and it runs fine; on CPU it has about the same ~200 ms forward time. Here is the model structure from Netron.app:
[model structure screenshot]
Then I used this script to quantize the model to cvitek format, setting the output tensors to the last convolution layers:
convert_model.sh
Although my model has only 18 nodes compared to 80 in yolov8n, it has an enormous ION memory requirement: 46.7 MB (CviModel Need ION Memory Size: (46.68 MB)) compared to 4.4 MB for YOLO (CviModel Need ION Memory Size: (4.40 MB)).
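The ION pool has to hold intermediate activation tensors, not just weights, so a model that keeps full-resolution feature maps needs large buffers regardless of node count. A single 64-channel BF16 feature map at 640x480 is already tens of MB (channel count and dtype here are illustrative assumptions; the real numbers come from your model):

```python
def fmap_bytes(h, w, c, dtype_bytes):
    # bytes needed to hold one activation tensor of shape (c, h, w)
    return h * w * c * dtype_bytes

MB = 1024 * 1024
# one 64-channel BF16 (2-byte) feature map at full 640x480 resolution
size = fmap_bytes(480, 640, 64, 2)
print(round(size / MB, 1))  # -> 37.5
```

With two such buffers live at once (layer input plus output), a figure in the ~47 MB range is plausible, whereas YOLO downsamples early and its activations stay small.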
Also, the tensor map in the resulting cvimodel looks odd: a batch of ReLUs, where the ONNX model has a Conv→Relu→Conv→Relu structure.
I have read the "cvitek tpu quick start guide" and the tpumlir.org docs and didn't find any clue.
I'm definitely missing something, please help.
cvimodel_tool full dump