This repository contains an implementation of YOLO (You Only Look Once) object detection, which uses a deep convolutional neural network. Object detection is the problem where, given an image, we would like to identify all the objects in the image and draw a bounding box around each of them.
We use a ResNet backbone pretrained on ImageNet (224x224 inputs). The dataset used for fine-tuning the detector is Pascal VOC, located here: http://host.robots.ox.ac.uk/pascal/VOC/
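For concreteness, here is a minimal sketch of that setup, assuming a torchvision ResNet-50 (the exact ResNet variant isn't stated here) whose classifier head is replaced so it outputs the S * S * 30 detection tensor described below:

```python
import torch
import torch.nn as nn
import torchvision

S, B, C = 7, 2, 20  # grid size, boxes per cell, Pascal VOC classes

# ImageNet-pretrained backbone; swap the 1000-class head for a detection head.
backbone = torchvision.models.resnet50(
    weights=torchvision.models.ResNet50_Weights.IMAGENET1K_V1
)
backbone.fc = nn.Linear(backbone.fc.in_features, S * S * (C + 5 * B))

x = torch.randn(4, 3, 224, 224)   # a batch of 4 images
predictions = backbone(x)         # shape: (4, 7 * 7 * 30)
print(predictions.shape)
```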
- YoloV1 uses an architecture its authors coined 'darknet', a deep convolutional neural network with max-pooling layers and many filters that learn the spatial localities of an image.
- Yolo splits up an image into an SxS grid of cells; in this repository, I've chosen S=7.
- Each cell will predict B=2 bounding boxes, encoded as a length-30 tuple with [0:19] = 20 object classes, [20:24] = 1st bounding box, [25:29] = 2nd bounding box (these indices are inclusive on both ends; see the slicing sketch after this list).
- In the end, the model will output a `predictions_matrix` of shape (batchSize, S * S * 30).
- We use non-max suppression to filter `predictions_matrix` down to our final bounding boxes.
- Then, we pass all of this information into a loss function based off of the sum of squared differences (SSD) error for backpropagation. The loss function penalizes the bounding boxes "responsible" for a cell (a bounding box is "responsible" for a cell if it has the highest IoU out of all the boxes in that cell).
Problem to Solve: We have too many bounding boxes, and we need to suppress the redundant, overlapping ones.
We need a metric to determine how much two bounding boxes overlap: the Intersection over Union (IoU).
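A minimal sketch of both pieces, assuming corner-format [x1, y1, x2, y2] boxes and a greedy suppression loop (the box format and threshold are assumptions, not necessarily what this repo uses):

```python
def iou(box_a, box_b):
    """Intersection over Union of two [x1, y1, x2, y2] boxes."""
    x1 = max(box_a[0], box_b[0]); y1 = max(box_a[1], box_b[1])
    x2 = min(box_a[2], box_b[2]); y2 = min(box_a[3], box_b[3])
    inter = max(0.0, x2 - x1) * max(0.0, y2 - y1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    return inter / (area_a + area_b - inter + 1e-9)

def nms(boxes, scores, iou_threshold=0.5):
    """Keep the highest-scoring box, drop boxes that overlap it too much, repeat."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    while order:
        best = order.pop(0)
        keep.append(best)
        order = [i for i in order if iou(boxes[best], boxes[i]) < iou_threshold]
    return keep
```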
To quantify the performance of an object detection model, we need to evaluate all of its bounding boxes. In data science, we have 2 important metrics (note: T/F = True/False; P/N = Positive/Negative, where positive means we predicted a bbox):
- Precision = TP / (TP + FP) "Of all the bounding boxes we predicted, what percentage was correct?"
- Recall = TP / (TP + FN) "Of all the target bounding boxes that were present, what percentage did we identify correctly?"
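A tiny worked example with made-up counts:

```python
tp, fp, fn = 8, 2, 4            # true positives, false positives, false negatives

precision = tp / (tp + fp)      # 8 / 10 = 0.80 -> 80% of our predictions were correct
recall    = tp / (tp + fn)      # 8 / 12 ≈ 0.67 -> we found 67% of the real objects
```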
General Algorithm of mAP
- Calculate the running precision and recall over all the bounding boxes we predicted (for a specific class), sorted by their confidence in decreasing order.
- Calculate the area under the PR curve (precision on y-axis vs. recall on x-axis), called the AP.
- We calculate an AP for every class and average them to get the class-averaged AP. (This averaged value is for a single IoU threshold.)
- To get the mAP, we compute this class-averaged AP at multiple IoU thresholds (usually from 0.5 to 0.95 with a step size of 0.05) and average over them.
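As a sketch of those steps for one class at one IoU threshold (assuming detections have already been matched to ground truth, giving the `is_tp` flags, and using a plain trapezoidal integration rather than the interpolated precision used by the official VOC metric):

```python
import numpy as np

def average_precision(confidences, is_tp, num_ground_truth):
    """AP for one class at one IoU threshold (matching to ground truth done beforehand)."""
    order = np.argsort(-np.asarray(confidences, dtype=float))  # descending confidence
    tp = np.asarray(is_tp, dtype=float)[order]
    fp = 1.0 - tp

    cum_tp = np.cumsum(tp)                     # running true-positive count
    cum_fp = np.cumsum(fp)                     # running false-positive count
    precision = cum_tp / (cum_tp + cum_fp)
    recall = cum_tp / max(num_ground_truth, 1)

    # Trapezoidal area under the PR curve, starting from (recall=0, precision=1).
    r = np.concatenate(([0.0], recall))
    p = np.concatenate(([1.0], precision))
    return float(np.sum((r[1:] - r[:-1]) * (p[1:] + p[:-1]) / 2.0))

# Example: three detections for one class, two of which match a ground-truth box.
print(average_precision([0.9, 0.8, 0.3], [1, 0, 1], num_ground_truth=3))
```

The mAP then averages this per-class AP across all classes and across the IoU thresholds 0.50, 0.55, ..., 0.95.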
- YoloV1 heavily struggles on small objects that are densely packed, such as a flock of birds. Because each cell only outputs 2 bounding boxes (and a single class prediction), the model is limited to detecting S*S objects at most.
- Data Augmentation - Yolo struggles to generalize to objects in new or unusual aspect ratios or configurations, so data augmentation helps expose the model to more varied examples during training.
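As an illustrative sketch only (this helper is hypothetical, not the repo's pipeline): augmentations that move pixels must also move the boxes, while colour-only changes leave them alone.

```python
import random
import torch

def augment(image, boxes):
    """image: (3, H, W) float tensor in [0, 1]; boxes: (N, 4) of normalized [x, y, w, h]."""
    # Random horizontal flip: mirror the image and the box x-centres together.
    if random.random() < 0.5:
        image = torch.flip(image, dims=[2])
        boxes = boxes.clone()
        boxes[:, 0] = 1.0 - boxes[:, 0]

    # Random brightness jitter: colour-only, so the boxes are unchanged.
    image = (image * random.uniform(0.8, 1.2)).clamp(0.0, 1.0)
    return image, boxes
```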