hdetroja/intro_to_computer_vision

 
 


Lectures

| Lecture | Notebook/Slides | Required Reading/Viewing | Additional Reading/Viewing | Key Topics |
| --- | --- | --- | --- | --- |
| 1 | A Brief History of Neural Networks | - | Goodfellow Chapter 1, fastai dl lesson 1 | Perceptrons, Multilayer Perceptrons, Neural Networks, The Rise of Deep Learning |
| Optional | Introduction to Jupyter and Python | - | fastai ml lesson 1 | IPython, The Jupyter Notebook, NumPy, Matplotlib, Working with Image Data |
| 2 | Computer Vision State of the Art | AlexNet Paper | - | State of the art in Classification, Detection, Pose Estimation, Image Generation, and other problems |
| 3 | Computer Vision Applications | - | - | What can we do with computer vision? |

Programming Challenges

1. Image Processing Script

The main purpose of this challenge is to get familiar with basic image processing utilities in computer vision. Many libraries offer built-in functions for these operations, but the key to this challenge is implementing them using the numpy package alone. This will help us see an image as a computer sees it: a matrix of pixels.

Instructions:

  • In the challenge directory of this repo, you'll find a sample_student.py script. Your job is to complete the image processing methods in this script.
  • The package numpy will already be imported in our evaluation script.
  • You can test your sample_student.py script locally by running the evaluate.py script in the challenge directory.
  • The evaluate.py script uses sample images from the data directory so you can see how the script will run. Please note that different images will be used for testing on our evaluation server.
  • You are allowed 3 submissions to the evaluation server, which will provide immediate scores and feedback.
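To make "an image as a matrix of pixels" concrete, here is a minimal sketch of a few image operations written with numpy alone. The function names and signatures are hypothetical and are not the methods defined in sample_student.py; the grayscale weights are the common BT.601 luma weights, which the challenge may or may not expect.

```python
import numpy as np


def to_grayscale(rgb):
    """Convert an (H, W, 3) uint8 RGB image to an (H, W) uint8 grayscale image."""
    # Assumed BT.601 luma weights; the challenge may expect different ones.
    weights = np.array([0.299, 0.587, 0.114])
    gray = rgb.astype(np.float64) @ weights  # weighted sum over the channel axis
    return np.round(gray).astype(np.uint8)


def flip_horizontal(image):
    """Mirror an image left to right by reversing its column axis."""
    return image[:, ::-1]


def crop(image, top, left, height, width):
    """Return the height x width window whose top-left corner is (top, left)."""
    return image[top:top + height, left:left + width]


# A tiny 2 x 2 RGB "image": red, green on the first row; blue, white on the second.
image = np.array(
    [[[255, 0, 0], [0, 255, 0]],
     [[0, 0, 255], [255, 255, 255]]],
    dtype=np.uint8,
)
gray = to_grayscale(image)
print(gray.shape)  # (2, 2): one intensity value per pixel
```

Each operation is plain array arithmetic or slicing, so the same functions work on images of any size.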

2. State-of-the-Art on Your Own Data

In this module, we'll cover quite a few state-of-the-art computer vision algorithms. One of the really exciting things about computer vision right now is the amount of high-quality, publicly available code. For this part of your assignment, your job is to run one publicly available algorithm on your own video or images. Your deliverable is a short video, posted to YouTube, showing your results. For example, you could shoot your own video, use Mask R-CNN to process each frame, and stitch the results together into a short video.
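The frame-by-frame workflow described above could be sketched as follows with OpenCV, which the Setup section installs. This is an illustration under assumptions, not a reference solution: process_frame is a hypothetical placeholder that only inverts colors where a real model would run, and the "mp4v" codec and file names are assumed and may need changing on your system.

```python
import numpy as np


def process_frame(frame):
    """Placeholder per-frame step: invert the colors of a uint8 BGR frame.

    In a real submission, this is where a model such as Mask R-CNN would run.
    The returned frame must keep the input's size, dtype, and channel count.
    """
    return 255 - frame


def process_video(src_path, dst_path, frame_fn=process_frame):
    """Read src_path frame by frame, apply frame_fn, and write dst_path."""
    import cv2  # imported here so process_frame can be tried without OpenCV

    cap = cv2.VideoCapture(src_path)
    if not cap.isOpened():
        raise IOError(f"could not open {src_path}")

    fps = cap.get(cv2.CAP_PROP_FPS) or 30.0  # fall back if FPS is unknown
    width = int(cap.get(cv2.CAP_PROP_FRAME_WIDTH))
    height = int(cap.get(cv2.CAP_PROP_FRAME_HEIGHT))
    fourcc = cv2.VideoWriter_fourcc(*"mp4v")  # assumed codec
    writer = cv2.VideoWriter(dst_path, fourcc, fps, (width, height))

    count = 0
    try:
        while True:
            ok, frame = cap.read()
            if not ok:  # end of video or read error
                break
            writer.write(frame_fn(frame))
            count += 1
    finally:
        cap.release()
        writer.release()
    return count  # number of frames written
```

For example, `process_video("input.mp4", "output.mp4")` would write a color-inverted copy of a hypothetical input.mp4; swapping in your own frame_fn is the only change needed to run a different algorithm.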

A Sample of The Computer Vision State of the Art in 2019

| PROBLEM | PAPER | CODE |
| --- | --- | --- |
| Classification | “ResNet” Deep Residual Learning for Image Recognition | Implemented in keras, pytorch, fastai |
| Detection | RetinaNet: Focal Loss for Dense Object Detection | Part of FAIR’s Detectron |
| Detection | Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks | Part of Tensorflow Object Detection API |
| Detection | SSD: Single Shot MultiBox Detector | Part of Tensorflow Object Detection API |
| Detection | YOLOv3: An Incremental Improvement | CODE |
| Semantic Segmentation | “Deeplab v3” Rethinking Atrous Convolution for Semantic Image Segmentation | CODE |
| Instance Segmentation | Mask R-CNN | CODE |
| Human Pose Estimation | OpenPose: Realtime Multi-Person 2D Pose Estimation using Part Affinity Fields | CODE |
| Hand Pose Estimation | GANerated Hands for Real-Time 3D Hand Tracking from Monocular RGB | - |
| Face Detection | Selective Refinement Network for High Performance Face Detection | CODE |
| Face Recognition | FaceNet: A Unified Embedding for Face Recognition and Clustering | CODE |
| Tracking | Fast Online Object Tracking and Segmentation: A Unifying Approach | CODE |
| Depth Estimation | Digging Into Self-Supervised Monocular Depth Estimation | CODE |
| Structure from Motion | - | opensfm |
| Image Generation | Large Scale GAN Training for High Fidelity Natural Image Synthesis | - |
| Face Generation | StyleGAN: A Style-Based Generator Architecture for Generative Adversarial Networks | CODE |
| Image to Image | Unpaired Image-to-Image Translation using Cycle-Consistent Adversarial Networks | CODE |
| Style Transfer | A Closed-form Solution to Photorealistic Image Stylization | CODE |
| Keypoint Detection and Tracking | SuperPoint: Self-Supervised Interest Point Detection and Description | CODE |
| Image Captioning | Bottom-Up and Top-Down Attention for Image Captioning and Visual Question Answering | CODE |
| Text to Image | StackGAN++: Realistic Image Synthesis with Stacked Generative Adversarial Networks | CODE |

Setup

The Python 3 Anaconda Distribution is the easiest way to get going with the notebooks and code presented here.

(Optional) You may want to create a virtual environment for this repository:

conda create -n cv python=3
# On conda versions older than 4.4, use: source activate cv
conda activate cv

You'll need to install the jupyter notebook to run the notebooks:

conda install jupyter

# You may also want to install nb_conda (enables some nice things, like changing virtual environments within the notebook)
conda install nb_conda

This repository requires a few extra packages. You can install them with:

conda install -c pytorch -c fastai fastai
conda install -c conda-forge opencv
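
After installing, a quick way to check the environment is to confirm that the main modules can be found. The sketch below is an optional convenience rather than part of the repository; the module list is an assumption based on the packages installed above (the opencv package is imported as cv2 and pytorch as torch).

```python
import importlib.util

# Assumed import names for the packages installed above.
REQUIRED_MODULES = ["numpy", "torch", "fastai", "cv2"]


def missing_modules(names):
    """Return the names in `names` that cannot be found for import."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules(REQUIRED_MODULES)
    if missing:
        print("Missing:", ", ".join(missing))
    else:
        print("All required modules were found.")
```

Run it from inside the activated environment so that it checks the same Python the notebooks will use.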

(Optional) jupyterthemes can be nice when presenting notebooks, as it offers some cleaner visual themes than the stock notebook, and makes it easy to adjust the default font size for code, markdown, etc. You can install with pip:

pip install jupyterthemes

Recommended jupyterthemes settings for presenting these notebooks (type into a terminal before launching the notebook):

jt -t grade3 -cellw=90% -fs=20 -tfs=20 -ofs=20 -dfs=20

Recommended jupyterthemes settings for viewing these notebooks (type into a terminal before launching the notebook):

jt -t grade3 -cellw=90% -fs=14 -tfs=14 -ofs=14 -dfs=14
