
Ultra96 Yolov4-tiny and Yolo-Fastest

  1. Convert the dataset to VOC format. I use the UA-DETRAC dataset; the files in ./VOCdevkit/ handle the conversion.

  2. The official yolov4-tiny uses a slice operation to implement CSPNet, but the quantization tools don't support that operation, so I replace it with a 1x1 convolution.

  3. Use train.py to train the model, and save the model structure and weights as model.json and model.h5. I use TensorFlow-gpu 2.2.0.

  4. Generate a frozen .pb file suitable for the deployment tools. See ./frozon_result/readme.md for details.

  5. Quantize the model with Vitis-AI: ./scripts/1_vitisAI_tf_printNode.sh finds the input and output nodes, and ./scripts/2_vitisAI_tf_quantize.sh performs the quantization.

  6. Compile the model with ./scripts/3_vitisAI_tf_compile.sh.

  7. Build the hardware platform with Vivado and Vitis (see ./edge/readme.md).

  8. Finally, run the model on the Ultra96-v2 board. ./edge/dpu_yolo_v4_tiny.ipynb is an example that uses the yolo model to detect vehicles; it reaches 25 fps on 320x320 images.


  9. To achieve a faster detection speed, I implement Yolo-Fastest in TensorFlow and deploy it to the Ultra96-v2 board; it achieves 30+ fps.


  10. Model pruning is now supported. I use keras-surgeon 0.2.0 and nni 1.5 to prune the model (see ./Model_pruning). I modified the nni source code (compressor.py) to fix some bugs and to allow choosing which layers to prune, and I provide a demo that prunes the model with FPGM.
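Step 1's VOC conversion can be sketched as follows. This is an illustrative stdlib-only example of emitting a Pascal-VOC-style annotation XML from a simple box list, not the actual ./VOCdevkit/ scripts; the function and field names are assumptions.

```python
import xml.etree.ElementTree as ET

def boxes_to_voc_xml(filename, width, height, boxes):
    """Build a Pascal-VOC annotation tree from (label, xmin, ymin, xmax, ymax) boxes."""
    root = ET.Element("annotation")
    ET.SubElement(root, "filename").text = filename
    size = ET.SubElement(root, "size")
    ET.SubElement(size, "width").text = str(width)
    ET.SubElement(size, "height").text = str(height)
    ET.SubElement(size, "depth").text = "3"
    for label, xmin, ymin, xmax, ymax in boxes:
        obj = ET.SubElement(root, "object")
        ET.SubElement(obj, "name").text = label
        bnd = ET.SubElement(obj, "bndbox")
        ET.SubElement(bnd, "xmin").text = str(xmin)
        ET.SubElement(bnd, "ymin").text = str(ymin)
        ET.SubElement(bnd, "xmax").text = str(xmax)
        ET.SubElement(bnd, "ymax").text = str(ymax)
    return ET.tostring(root, encoding="unicode")

# One UA-DETRAC-style frame with a single vehicle box (hypothetical values)
xml_str = boxes_to_voc_xml("img00001.jpg", 960, 540, [("car", 10, 20, 110, 80)])
```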
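Step 2's replacement works because a channel slice is a special case of a 1x1 convolution. A minimal numpy sketch of the equivalence (a 1x1 conv over an HWC tensor is just a per-pixel matrix multiply over channels; in the repo the replacement conv is a regular trainable layer, here the weights are fixed to 0/1 only to demonstrate the identity):

```python
import numpy as np

def slice_as_1x1_conv(x):
    """Reproduce y = x[..., C//2:] with a fixed-weight 1x1 convolution.

    x has shape (N, H, W, C). A 1x1 conv with weight matrix w of shape
    (C_in, C_out) computes x @ w at every spatial position.
    """
    c = x.shape[-1]
    w = np.zeros((c, c // 2), dtype=x.dtype)
    # Output channel j copies input channel j + C//2 (the "second half" slice)
    w[np.arange(c // 2) + c // 2, np.arange(c // 2)] = 1.0
    return x @ w

x = np.random.rand(1, 4, 4, 8).astype("float32")
y = slice_as_1x1_conv(x)   # identical to x[..., 4:]
```

Because the quantizer sees an ordinary convolution instead of a slice op, the CSP "route group" survives quantization unchanged.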
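The on-board detection demo in step 8 needs the usual YOLO post-processing after the DPU returns raw boxes. As one illustrative piece, a greedy non-max suppression in numpy (a generic sketch, not the notebook's exact code):

```python
import numpy as np

def nms(boxes, scores, iou_thresh=0.45):
    """Greedy non-max suppression; boxes are (N, 4) as [x1, y1, x2, y2]."""
    order = scores.argsort()[::-1]          # highest score first
    keep = []
    while order.size > 0:
        i = order[0]
        keep.append(int(i))
        if order.size == 1:
            break
        rest = order[1:]
        # Intersection of box i with all remaining boxes
        x1 = np.maximum(boxes[i, 0], boxes[rest, 0])
        y1 = np.maximum(boxes[i, 1], boxes[rest, 1])
        x2 = np.minimum(boxes[i, 2], boxes[rest, 2])
        y2 = np.minimum(boxes[i, 3], boxes[rest, 3])
        inter = np.clip(x2 - x1, 0, None) * np.clip(y2 - y1, 0, None)
        area_i = (boxes[i, 2] - boxes[i, 0]) * (boxes[i, 3] - boxes[i, 1])
        area_r = (boxes[rest, 2] - boxes[rest, 0]) * (boxes[rest, 3] - boxes[rest, 1])
        iou = inter / (area_i + area_r - inter)
        order = rest[iou < iou_thresh]      # drop heavily overlapping boxes
    return keep

boxes = np.array([[0, 0, 10, 10], [1, 1, 10, 10], [20, 20, 30, 30]], dtype=float)
scores = np.array([0.9, 0.8, 0.7])
kept = nms(boxes, scores)   # the overlapping second box is suppressed
```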
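The FPGM criterion used in step 10 prunes the filters closest to the geometric median of a layer's filters, on the idea that such filters are the most replaceable. A minimal numpy sketch of the ranking (illustrative only; the actual pruning goes through the modified nni compressor and keras-surgeon):

```python
import numpy as np

def fpgm_prune_indices(weights, prune_ratio=0.5):
    """Select filter indices to prune by the FPGM criterion.

    weights: conv kernel of shape (out_channels, k, k, in_channels).
    A filter's score is its total Euclidean distance to all other filters;
    small scores mean the filter sits near the geometric median and is
    considered redundant, so it is pruned first.
    """
    flat = weights.reshape(weights.shape[0], -1)
    # Pairwise distances between all filters
    dist = np.linalg.norm(flat[:, None, :] - flat[None, :, :], axis=-1)
    score = dist.sum(axis=1)
    n_prune = int(weights.shape[0] * prune_ratio)
    return np.argsort(score)[:n_prune]

w = np.random.randn(16, 3, 3, 8)
to_prune = fpgm_prune_indices(w, prune_ratio=0.25)   # 4 of 16 filters
```

After the indices are chosen, a tool like keras-surgeon can physically delete those channels so the pruned model actually runs faster on the DPU, rather than just being masked to zero.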

References:

Yolov4-tiny-tf2

Yolo-v3-Xilinx

Yolo-v4-tutorial-Xilinx

Yolo-v3-dnndk

UA-DETRAC to VOC

Vitis-AI 1.1

Yolo-Fastest

keras-surgeon

nni