GitHub - hdetroja/intro_to_computer_vision: Introduction to Computer Vision

Lectures

Lecture	Notebook/Slides	Required Reading/Viewing	Additional Reading/Viewing	Key Topics
1	A Brief History of Neural Networks	-	Goodfellow Chapter 1, fastai dl lesson 1	Perceptrons, Multilayer Perceptrons, Neural Networks, The Rise of Deep Learning
Optional	Introduction to Jupyter and Python	-	fastai ml lesson 1	iPython, The Jupyter Notebook, Numpy, Matplotlib, Working with Image Data
2	Computer Vision State of the Art	Alexnet Paper	-	State of the art in Classification, Detection, Pose Estimation, Image Generation, and other problems
3	Computer Vision Applications	-	-	What can we do with computer vision?

Programming Challenges

1. Image Processing Script

The main purpose of this challenge is to become familiar with the basic image processing utilities used in computer vision. Many built-in functions already do this work, but the key to this challenge is implementing these functions using the numpy package alone. This helps us see an image as a computer sees it: as a matrix of pixels.
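To make the "matrix of pixels" idea concrete, here is a minimal sketch of a few image operations written with numpy alone. The function names (to_grayscale, flip_horizontal, crop) are hypothetical illustrations and are not necessarily the methods defined in sample_student.py; the grayscale weights are the common ITU-R BT.601 luma coefficients, chosen here as an assumption.

```python
import numpy as np


def to_grayscale(img):
    """Convert an (H, W, 3) RGB array to an (H, W) float grayscale array."""
    # Assumed weighting: ITU-R BT.601 luma coefficients.
    weights = np.array([0.299, 0.587, 0.114])
    return img[..., :3].astype(np.float64) @ weights


def flip_horizontal(img):
    """Mirror the image left to right by reversing the column axis."""
    return img[:, ::-1]


def crop(img, top, left, height, width):
    """Return the sub-image starting at (top, left) with the given size."""
    return img[top:top + height, left:left + width]


# A tiny synthetic 2x3 RGB image: rows, columns, then color channels.
img = np.array(
    [
        [[255, 0, 0], [0, 255, 0], [0, 0, 255]],
        [[0, 0, 0], [128, 128, 128], [255, 255, 255]],
    ],
    dtype=np.uint8,
)

gray = to_grayscale(img)
flipped = flip_horizontal(img)
patch = crop(img, top=0, left=1, height=1, width=2)

print(img.shape, gray.shape, patch.shape)
```

Every operation here is plain array indexing or arithmetic, which is the point of the exercise: an image is just an array whose first two axes are rows and columns.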

Instructions:

In the challenge directory of this repo, you'll find a sample_student.py script. Your job is to complete the image processing methods in this script.
The package numpy will already be imported in our evaluation script.
You can test your sample_student.py script locally by running the evaluate.py script in the challenge directory.
The evaluate.py script uses the sample images in the data directory so you can see how the script will run. Please note that different images will be used for testing on our evaluation server.
You are allowed 3 submissions to the evaluation server, which will provide immediate scores and feedback.

2. State-of-the-Art on Your Own Data

In this module, we'll cover quite a few state-of-the-art computer vision algorithms. One of the really exciting things about computer vision right now is the amount of high-quality, publicly available code. For this part of your assignment, your job is to run one publicly available algorithm on your own video or images. Your deliverable is a short video, posted to YouTube, showing your results. For example, you could shoot your own video, use Mask R-CNN to process each frame, and stitch the results together into a short video.
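The frame-by-frame workflow described above can be sketched with OpenCV as follows. This is only an outline under stated assumptions: process_frame is a hypothetical placeholder (it simply inverts colors) standing in for whatever model you choose, the "mp4v" codec is one common choice that may not be available on every system, and the file names in the usage note are made up.

```python
import numpy as np


def process_frame(frame):
    """Hypothetical stand-in for a real model: invert a uint8 BGR frame."""
    return 255 - frame


def process_video(in_path, out_path, fn=process_frame):
    """Read a video, apply fn to every frame, and write the result.

    fn must return a uint8 BGR frame with the same height and width
    as its input. Returns the number of frames written.
    """
    # Imported here so process_frame can be tried without OpenCV installed.
    import cv2

    cap = cv2.VideoCapture(in_path)
    if not cap.isOpened():
        raise IOError(f"Could not open {in_path}")

    # Fall back to 30 fps if the container does not report a frame rate.
    fps = cap.get(cv2.CAP_PROP_FPS) or 30.0
    width = int(cap.get(cv2.CAP_PROP_FRAME_WIDTH))
    height = int(cap.get(cv2.CAP_PROP_FRAME_HEIGHT))

    fourcc = cv2.VideoWriter_fourcc(*"mp4v")  # assumed codec
    writer = cv2.VideoWriter(out_path, fourcc, fps, (width, height))

    count = 0
    try:
        while True:
            ok, frame = cap.read()
            if not ok:  # end of the video or a read error
                break
            writer.write(fn(frame))
            count += 1
    finally:
        cap.release()
        writer.release()
    return count
```

A call such as process_video("my_clip.mp4", "my_clip_processed.mp4") would then produce the stitched output; to use a real algorithm, replace process_frame with a function that runs the model and draws its results onto the frame.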

A Sample of The Computer Vision State of the Art in 2019

PROBLEM	PAPER	CODE
Classification	“ResNet” Deep Residual Learning for Image Recognition	Implemented in keras, pytorch, fastai
Detection	RetinaNet: Focal Loss for Dense Object Detection	Part of FAIR’s Detectron
Detection	Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks	Part of Tensorflow Object Detection API
Detection	SSD: Single Shot MultiBox Detector	Part of Tensorflow Object Detection API
Detection	YOLOv3: An Incremental Improvement	CODE
Semantic Segmentation	“Deeplab v3” Rethinking Atrous Convolution for Semantic Image Segmentation	CODE
Instance Segmentation	Mask R-CNN	CODE
Human Pose Estimation	OpenPose: Realtime Multi-Person 2D Pose Estimation using Part Affinity Fields	CODE
Hand Pose Estimation	GANerated Hands for Real-Time 3D Hand Tracking from Monocular RGB
Face Detection	Selective Refinement Network for High Performance Face Detection	CODE
Face Recognition	FaceNet: A Unified Embedding for Face Recognition and Clustering	CODE
Tracking	Fast Online Object Tracking and Segmentation: A Unifying Approach	CODE
Depth Estimation	Digging Into Self-Supervised Monocular Depth Estimation	CODE
Structure from Motion		opensfm
Image Generation	Large Scale GAN Training for High Fidelity Natural Image Synthesis
Face Generation	StyleGAN: A Style-Based Generator Architecture for Generative Adversarial Networks	CODE
Image to Image	Unpaired Image-to-Image Translation using Cycle-Consistent Adversarial Networks	CODE
Style Transfer	A Closed-form Solution to Photorealistic Image Stylization	CODE
Keypoint Detection and Tracking	SuperPoint: Self-Supervised Interest Point Detection and Description	CODE
Image Captioning	Bottom-Up and Top-Down Attention for Image Captioning and Visual Question Answering	CODE
Text to Image	StackGAN++: Realistic Image Synthesis with Stacked Generative Adversarial Networks	CODE

Setup

The Python 3 Anaconda Distribution is the easiest way to get going with the notebooks and code presented here.

(Optional) You may want to create a virtual environment for this repository:

conda create -n cv python=3 
source activate cv

You'll need to install the jupyter notebook to run the notebooks:

conda install jupyter

# You may also want to install nb_conda (enables some nice things, like changing virtual environments within the notebook)
conda install nb_conda

This repository requires a few extra packages. You can install them with:

conda install -c pytorch -c fastai fastai
conda install jupyter
conda install -c conda-forge opencv
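
After installing, a quick way to confirm that the environment is usable is to check that each package can be found by Python. The sketch below uses only the standard library; the import names listed (numpy, cv2, fastai, notebook) are assumptions about how the conda packages above expose themselves, and check_packages is a hypothetical helper, not part of this repository.

```python
import importlib.util


def check_packages(names):
    """Return a dict mapping each top-level module name to True if importable."""
    return {name: importlib.util.find_spec(name) is not None for name in names}


# Assumed import names for the packages installed above.
status = check_packages(["numpy", "cv2", "fastai", "notebook"])
for name, found in status.items():
    print(f"{name}: {'ok' if found else 'MISSING'}")
```

Run this inside the activated environment; any package reported as MISSING needs to be installed before the notebooks will run.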

(Optional) jupyterthemes can be nice when presenting notebooks, as it offers some cleaner visual themes than the stock notebook, and makes it easy to adjust the default font size for code, markdown, etc. You can install with pip:

pip install jupyterthemes

Recommended Jupyter theme for presenting these notebooks (type into the terminal before launching the notebook):

jt -t grade3 -cellw=90% -fs=20 -tfs=20 -ofs=20 -dfs=20

Recommended Jupyter theme for viewing these notebooks (type into the terminal before launching the notebook):

jt -t grade3 -cellw=90% -fs=14 -tfs=14 -ofs=14 -dfs=14
