
HW1: Implementing Tiny YOLOv2 using tensorflow API

@snowphone released this 20 Apr 08:04
· 14 commits to master since this release
94de1fc
Hw1 (#1)

* Fill out the blanked code

I haven't tested it yet, but I wrote most of the logical flow that the homework requires.
I wrote the draw function and the main logic. However, I didn't understand the 4th requirement.

* Change indentation unit to tabs

In the previous commit, two-space and four-space indentation were mixed, so I changed everything to tabs.

* Create layers, but not the weights yet

I created the YOLO-v2-tiny model, but the pretrained model is delivered in pickle format and I haven't figured out how to use it yet. I'll load it soon.
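
As a rough idea of the direction the later commits take (wrapping arrays in tf.Variable), here is a minimal sketch of unpickling the weights. The file name and key layout are hypothetical, not the actual homework format.

```python
import pickle

import tensorflow as tf

def load_pickled_weights(path="y2t_weights.pickle"):
    # Hypothetical layout: the pickle holds numpy arrays for each layer's parameters.
    with open(path, "rb") as f:
        weights = pickle.load(f, encoding="latin1")
    return weights

# Each array can then be wrapped in a tf.Variable so the tf.nn ops can consume it, e.g.:
# kernel = tf.Variable(weights[0]["kernel"], dtype=tf.float32)
```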

* Refactor duplicated code into a function

* Fix bugs in opening the video file and in the resizing function

* Set batch size to 1 because we only run inference

* (Incomplete) Consumes too much memory (more than 100 GB)

* Fix a typo

* Fix memory allocation error

I meant to increase the number of filters from 16 up to 1024, but I accidentally went from 16 up to pow(16, 9) ~ 6e10 ~ 60G.
That was why my code could not allocate enough memory.

Plus, I fixed the professor's incorrect code, whose sess.run evaluated only the last layer.
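
A minimal sketch of fetching every layer's output in one call, assuming the layer tensors are collected in a list; the names here are illustrative, not the actual homework code.

```python
def run_all_layers(sess, layers, input_ph, frame):
    # Fetching the whole list returns every intermediate output in one forward pass;
    # fetching only layers[-1] would return just the final tensor.
    return sess.run(layers, feed_dict={input_ph: frame})
```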

* Print the throughput with fixed precision

* Add some information about the yolov2tiny architecture

* (Incomplete) Find a bottleneck in non-max suppression

To find it, I wrote a tracer and attached it to some functions.
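
The tracer itself isn't shown in this message; the sketch below is just the kind of timing decorator I mean (purely illustrative).

```python
import time
from functools import wraps

def trace(fn):
    # Print how long each call to the wrapped function takes.
    @wraps(fn)
    def wrapper(*args, **kwargs):
        begin = time.time()
        result = fn(*args, **kwargs)
        print("%s took %.4f s" % (fn.__name__, time.time() - begin))
        return result
    return wrapper
```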

* Fix wrongly indented code in the nms function

* Reshape bounding boxes from (416, 416) to the original video resolution
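
A minimal sketch of that rescaling, assuming boxes are (x1, y1, x2, y2) in the 416x416 network input space; the function name is hypothetical, not the homework's restore_shape.

```python
def rescale_box(box, orig_w, orig_h, net_size=416):
    # Map a (x1, y1, x2, y2) box from the 416x416 input back to the original resolution.
    x1, y1, x2, y2 = box
    sx, sy = orig_w / float(net_size), orig_h / float(net_size)
    return (x1 * sx, y1 * sy, x2 * sx, y2 * sy)
```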

* Add parameters

* Show attributes

* Store every layer's output for the first frame in the intermediate folder
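
A minimal sketch of the dump, assuming the layer outputs come back as numpy arrays; the folder name follows this commit, the file names are illustrative.

```python
import os

import numpy as np

def save_intermediates(outputs, folder="intermediate"):
    # Save each layer's output for the first frame as a separate .npy file.
    os.makedirs(folder, exist_ok=True)
    for i, out in enumerate(outputs):
        np.save(os.path.join(folder, "layer_%02d.npy" % i), np.asarray(out))
```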

* WIP: Make room for bias, but not implemented yet because I'm still figuring out how to use the weights

* Change video codec to mp4v
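
A minimal sketch of the codec change with OpenCV; the output path, FPS, and frame size are placeholders.

```python
import cv2

width, height = 416, 416  # placeholder frame size
fourcc = cv2.VideoWriter_fourcc(*"mp4v")
writer = cv2.VideoWriter("output.mp4", fourcc, 30.0, (width, height))
```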

* WIP: Create layers hierarchically

YOLO-v2-tiny consists of nine composite layers, and each layer consists of smaller layers such as conv, batch_norm, bias, maxpool, and leakyReLU.
Therefore, I mimicked this hierarchy.
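
A minimal sketch of one such composite layer built from tf.nn ops (TF 1.x style); the parameter tensors here are placeholders for the pickled weights.

```python
import tensorflow as tf

def composite_layer(x, kernel, biases, gamma, beta, mean, variance):
    # One YOLO-v2-tiny block: conv -> bias -> batch norm -> leaky ReLU -> max pool.
    x = tf.nn.conv2d(x, kernel, strides=[1, 1, 1, 1], padding="SAME")
    x = tf.nn.bias_add(x, biases)
    x = tf.nn.batch_normalization(x, mean, variance, beta, gamma, 1e-5)
    x = tf.nn.leaky_relu(x, alpha=0.1)
    return tf.nn.max_pool2d(x, ksize=[1, 2, 2, 1], strides=[1, 2, 2, 1], padding="SAME")
```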

* Infer objects correctly

* Add explicit bias layers right after each conv layer.
* Load weights by using tf.Variable and the corresponding layers in tf.nn.
* Use the top-left and bottom-right coordinates in the draw function, since the coordinates are already reshaped in the restore_shape function (see the sketch below).
* Remove unused comments and debug lines.
* Use the original image in the draw function.
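
A minimal sketch of drawing with top-left and bottom-right coordinates on the original image; the box format and label handling are assumptions, not the homework's exact draw function.

```python
import cv2

def draw(image, boxes, labels):
    # boxes are (x1, y1, x2, y2), already mapped back to the original resolution
    for (x1, y1, x2, y2), label in zip(boxes, labels):
        cv2.rectangle(image, (int(x1), int(y1)), (int(x2), int(y2)), (0, 255, 0), 2)
        cv2.putText(image, label, (int(x1), int(y1) - 5),
                    cv2.FONT_HERSHEY_SIMPLEX, 0.5, (0, 255, 0), 1)
    return image
```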

* Measure inference time, end-to-end time, FPS, and total time
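
A minimal sketch of how the two measurements differ; the structure follows the later beg_start/beg_infer commits, but the loop body is illustrative.

```python
import time

def measure(frames, run_inference):
    # Separate pure inference time from end-to-end time (decode, preprocess, draw, write).
    beg_start = time.time()
    infer_time = 0.0
    for frame in frames:
        beg_infer = time.time()
        run_inference(frame)  # e.g. sess.run on the network outputs
        infer_time += time.time() - beg_infer
    total_time = time.time() - beg_start
    fps = len(frames) / infer_time if infer_time > 0 else float("inf")
    print("inference: %.2f s (%.2f FPS), end-to-end: %.2f s" % (infer_time, fps, total_time))
```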

* Update for LaTeX

* Add report template

* Limit YOLO to using only 70% of GPU VRAM

In a small-VRAM environment, the allow_growth option is not enough to prevent out-of-memory errors.
So, after consulting some references, I forced it not to take more than 70% of VRAM.
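
A minimal sketch of the session config in TF 1.x; the 0.7 fraction is the cap described above.

```python
import tensorflow as tf

config = tf.ConfigProto()
config.gpu_options.allow_growth = True                     # not enough on small GPUs
config.gpu_options.per_process_gpu_memory_fraction = 0.7   # hard cap at 70% of VRAM
sess = tf.Session(config=config)
```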

* Clarify which values we save

* Update yolov2tiny.py

Delete out_chan and the default value of stride.
For consistency, how about just using max_pool2d for maxpool as well? (I commented it out for now.)

* Update yolov2tiny.py

* Update __init__.py

Add beg_start, the start time for the "end-to-end time", and rename the previous beg to beg_infer.
Does renaming "beg" in obj_detection affect the measure function? (I'm not sure.)

* Update yolov2tiny.py

Put the n_... values back into post-processing.

* Create consider.txt

* Update yolov2tiny.py

Confirm that "tf.nn.max_pool2d" works correctly.
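
A tiny check of the kind I mean, assuming NHWC input; the shapes are arbitrary.

```python
import numpy as np
import tensorflow as tf

x = tf.constant(np.random.rand(1, 416, 416, 16), dtype=tf.float32)
pooled = tf.nn.max_pool2d(x, ksize=[1, 2, 2, 1], strides=[1, 2, 2, 1], padding="SAME")
print(pooled.shape)  # expect (1, 208, 208, 16)
```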

* Update consider.txt

* Update __init__.py

Move the part that saves the first frame's intermediate results (tensors) further down, to measure the necessary time.

* Update consider.txt

* Update __init__.py

Add printing of the total time.

* Add some details

* Add comment about inference FPS

* Write detailed information about why I chose the tf.nn functions

* Upload whole model visualization

I visualized the whole TF graph by using tf.train.Saver.
The only catch is that it is too verbose to see the main logic.
But I decided to save the visualized graph just in case.

* Add GPU benchmark

* Add CPU benchmark

* Add first draft of the report

* Update report.tex

Some changes.

* Update report.tex

* Change code location and table

* Edit figures

Co-authored-by: jehoon315 <[email protected]>