- We convert the dataset to VOC format. I use the UA-DETRAC dataset, and the files under ./VOCdevkit/ can be used to do the conversion.
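For reference, a minimal sketch of writing one VOC-style XML annotation (the field layout follows the VOC convention; the image size, class name and file names below are placeholders, and the actual conversion scripts live under ./VOCdevkit/):

```python
# Minimal sketch: write one VOC-style XML annotation for an image.
# Field names follow the VOC convention; paths and box values are placeholders.
import xml.etree.ElementTree as ET

def write_voc_xml(img_name, width, height, boxes, out_path):
    """boxes: list of (class_name, xmin, ymin, xmax, ymax) in pixels."""
    root = ET.Element("annotation")
    ET.SubElement(root, "filename").text = img_name
    size = ET.SubElement(root, "size")
    ET.SubElement(size, "width").text = str(width)
    ET.SubElement(size, "height").text = str(height)
    ET.SubElement(size, "depth").text = "3"
    for name, xmin, ymin, xmax, ymax in boxes:
        obj = ET.SubElement(root, "object")
        ET.SubElement(obj, "name").text = name
        bb = ET.SubElement(obj, "bndbox")
        for tag, val in zip(("xmin", "ymin", "xmax", "ymax"), (xmin, ymin, xmax, ymax)):
            ET.SubElement(bb, tag).text = str(int(val))
    ET.ElementTree(root).write(out_path)

# Example: one vehicle box from a UA-DETRAC frame.
write_voc_xml("img00001.jpg", 960, 540, [("car", 100, 200, 260, 320)], "img00001.xml")
```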
- In the official YOLOv4-tiny, a slice operation is used to implement the CSPNet structure, but the quantization tool does not support that operation, so I replace it with a 1x1 convolution.
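As a sketch of what this replacement looks like in Keras (illustrative only; the layer names and channel counts are assumptions, not the exact code of this repo):

```python
import tensorflow as tf
from tensorflow.keras import layers

def csp_split_slice(x):
    # Original idea: keep half of the channels via a slice
    # (the "route groups" trick in the official yolov4-tiny cfg).
    channels = x.shape[-1]
    return layers.Lambda(lambda t: t[..., channels // 2:])(x)

def csp_split_conv(x):
    # Quantizer-friendly replacement: a learnable 1x1 convolution that
    # reduces the channels to half instead of slicing them.
    channels = x.shape[-1]
    return layers.Conv2D(channels // 2, 1, padding="same", use_bias=False)(x)

# Example: a 26x26x128 feature map becomes 26x26x64 either way.
inp = tf.keras.Input((26, 26, 128))
out = csp_split_conv(inp)
print(out.shape)  # (None, 26, 26, 64)
```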
- Then we use train.py to train the model and save the model structure and weights as model.json and model.h5. I use tensorflow-gpu 2.2.0.
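Saving the structure and weights separately is plain Keras API; a short sketch (the trained `model` object is assumed to come from train.py):

```python
# Save the trained Keras model as structure + weights,
# matching the model.json / model.h5 files mentioned above.
with open("model.json", "w") as f:
    f.write(model.to_json())          # network architecture only
model.save_weights("model.h5")        # weights only

# Later the two files can be loaded back like this:
from tensorflow.keras.models import model_from_json
with open("model.json") as f:
    model = model_from_json(f.read())
model.load_weights("model.h5")
```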
- Next we generate a frozen .pb file that suits the deployment tools. See ./frozon_result/readme.md for details.
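In TF 2.x a frozen graph can typically be produced along these lines (a sketch assuming the deployment tools expect a TF1-style frozen GraphDef; the output file names are placeholders, and the actual flow is described in ./frozon_result/readme.md):

```python
import tensorflow as tf
from tensorflow.python.framework.convert_to_constants import (
    convert_variables_to_constants_v2,
)

# `model` is the loaded tf.keras model (model.json + model.h5).
full_model = tf.function(lambda x: model(x))
full_model = full_model.get_concrete_function(
    tf.TensorSpec(model.inputs[0].shape, model.inputs[0].dtype)
)

# Freeze variables into constants and dump a TF1-style GraphDef.
frozen_func = convert_variables_to_constants_v2(full_model)
tf.io.write_graph(
    frozen_func.graph.as_graph_def(),
    logdir="./frozon_result",
    name="frozen_graph.pb",
    as_text=False,
)
```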
- Then we use Vitis-AI to quantize the model: ./scripts/1_vitisAI_tf_printNode.sh finds the input and output node names, and ./scripts/2_vitisAI_tf_quantize.sh performs the quantization.
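The quantize script also needs a Python calibration input function; a minimal sketch of one is shown below (the input node name "image_input", the image folder and the 320x320 shape are assumptions, the real arguments are in the script):

```python
# graph_input_fn.py -- calibration input function for the TensorFlow quantizer.
# The quantizer calls calib_input(iter) once per calibration batch and
# expects a dict mapping the input node name to a numpy batch.
import os
import cv2
import numpy as np

CALIB_DIR = "./calib_images"   # folder of calibration images (assumption)
BATCH = 8
IMAGES = sorted(os.listdir(CALIB_DIR))

def calib_input(iter):
    batch = []
    for i in range(BATCH):
        path = os.path.join(CALIB_DIR, IMAGES[(iter * BATCH + i) % len(IMAGES)])
        img = cv2.imread(path)
        img = cv2.resize(img, (320, 320)).astype(np.float32) / 255.0
        batch.append(img)
    return {"image_input": np.array(batch)}
```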
- We then compile the model with ./scripts/3_vitisAI_tf_compile.sh.
- We use Vivado and Vitis to build the hardware platform (see ./edge/readme.md).
- Finally, we can run the model on the Ultra96-V2 board. There is an example notebook that uses the YOLO model to detect vehicles (./edge/dpu_yolo_v4_tiny.ipynb); it reaches 25 FPS with 320x320 input images.
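Roughly, the notebook drives the DPU through pynq-dpu/VART as in the sketch below (overlay and model file names are placeholders and pre/post-processing is omitted; the full pipeline is in ./edge/dpu_yolo_v4_tiny.ipynb):

```python
# Rough sketch of running inference on the DPU with pynq-dpu / VART.
import numpy as np
from pynq_dpu import DpuOverlay

overlay = DpuOverlay("dpu.bit")
overlay.load_model("yolov4_tiny.xmodel")   # compiled model, name assumed
dpu = overlay.runner

in_tensor = dpu.get_input_tensors()[0]
out_tensors = dpu.get_output_tensors()

# One 320x320 frame, already resized/normalized (preprocessing omitted).
frame = np.zeros(tuple(in_tensor.dims), dtype=np.float32)
outputs = [np.empty(tuple(t.dims), dtype=np.float32) for t in out_tensors]

job = dpu.execute_async([frame], outputs)
dpu.wait(job)
# `outputs` now holds the raw YOLO heads; decoding + NMS follow in the notebook.
```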
- To achieve a higher detection speed, I also implement Yolo-Fastest with TensorFlow and deploy it to the Ultra96-V2 board; it achieves more than 30 FPS.
- Model pruning is now supported. We use keras-surgeon 0.2.0 and nni 1.5 to prune the model; see ./Model_pruning. I modified the nni source code (compressor.py) to fix some bugs and to make it possible to choose which layers to prune, and I provide a demo that uses FPGM to prune the model.
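To illustrate the FPGM idea in a self-contained way (a simplified sketch using keras-surgeon only, not the patched-nni demo in ./Model_pruning): score each filter of a convolution by its total distance to the other filters (filters near the geometric median are considered redundant) and delete the lowest-scoring channels. The layer name and sparsity below are placeholders.

```python
# Simplified FPGM-style pruning of one Conv2D layer with keras-surgeon.
import numpy as np
from kerassurgeon.operations import delete_channels

def fpgm_prune_layer(model, layer_name, sparsity=0.3):
    layer = model.get_layer(layer_name)
    kernel = layer.get_weights()[0]                    # (kh, kw, cin, cout)
    filters = kernel.reshape(-1, kernel.shape[-1]).T   # (cout, kh*kw*cin)
    # FPGM score: sum of distances from each filter to all the others;
    # filters with the smallest score sit near the geometric median.
    dists = np.linalg.norm(filters[:, None, :] - filters[None, :, :], axis=-1)
    scores = dists.sum(axis=1)
    n_prune = int(sparsity * len(scores))
    channels = np.argsort(scores)[:n_prune].tolist()
    return delete_channels(model, layer, channels)

# Example (layer name is a placeholder): prune 30% of one backbone conv.
# model = fpgm_prune_layer(model, "conv2d_5", sparsity=0.3)
```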
References: