
Simple Perception Stack for Self-Driving Cars


This is a Simple Perception Stack for Self-Driving Cars: the major task project for the CSE483 - Computer Vision course at the Faculty of Engineering, Ain Shams University, Spring 2022.
Table of Contents
  1. Foreword
  2. Phase One Details and Requirements
  3. Phase Two Details and Requirements
  4. Getting Started
  5. Usage
  6. Contributing
  7. Acknowledgments

Foreword

Self-driving cars have piqued human interest for centuries. Leonardo Da Vinci sketched out the plans for a hypothetical self-propelled cart in the late 1400s, and mechanical autopilots for airplanes emerged in the 1930s. In the 1960s an autonomous vehicle was developed as a possible moon rover for the Apollo astronauts. A true self-driving car has remained elusive until recently. Technological advancements in global positioning systems (GPS), digital mapping, computing power, and sensor systems have finally made it a reality.

In this project we are going to create a simple perception stack for self-driving cars (SDCs). Although a typical perception stack for a self-driving car may combine data from several sensors (e.g., cameras, lidar, radar, etc.), we're only going to focus on video streams from cameras for simplicity. We're mainly going to analyze the road ahead, detect the lane and its lines, detect other cars/agents on the road, and estimate some useful information that may help other SDC stacks. The project is split into two phases.

Built With

Python OpenCV Numpy Jupyter Notebooks Google Colab

Phase One Details and Requirements

(Sample frame before applying the computer vision pipeline)

This phase covers the main perception stack, focused on analyzing the video stream captured by a camera mounted on the front of the vehicle itself. This helps produce estimates for planning the next move: go left, go right, slow down, speed up, by how much, etc. Of course, these estimates are not achievable by analyzing the video stream alone, so this part is only concerned with what can be inferred from that single source of information. That is limited to detecting/identifying the boundaries of the lane the car is currently driving in, i.e., the lane lines of the road ahead, under the assumption that drivers follow the road rules.

The lane detection will be visualized by coloring the lane's borders (lines) with different colors as well as coloring the lane area that lies in between. Additionally, some metrics describing the car's location and the lane's structure should be estimated, such as the lane's radius of curvature and the car's position w.r.t. the center of the lane. Of course, this requires conversion from pixels to meters. To simplify matters, the camera is assumed to be mounted at the car's center.

For consistency, the lane should be marked in green and its borders in yellow and/or red. The lane lines should be detected in the pipeline as solid lines, interpolated to whatever function fits them (OpenCV or NumPy may be used to get the coefficients of a fitting quadratic function). Note that all lane types are to be detectable (an indicator could be added that tells us what type the detected lane is). Also note that "marking" means semi-transparent highlighting, not opaque coloring.
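As a rough illustration of the above, here is a minimal sketch (not the project's actual implementation) of fitting a quadratic to lane-line pixels, estimating the radius of curvature, and drawing the semi-transparent overlay. It assumes the lane pixels have already been extracted from a bird's-eye (perspective-transformed) binary image, and the pixel-to-meter factors are placeholder values:

    import numpy as np
    import cv2

    # Placeholder pixel-to-meter conversion factors (must be tuned to the actual camera setup)
    YM_PER_PX = 30 / 720   # meters per pixel along y
    XM_PER_PX = 3.7 / 700  # meters per pixel along x (nominal lane width / lane width in pixels)

    def fit_lane_line(ys, xs):
        """Fit x = a*y^2 + b*y + c through the detected lane-line pixel coordinates (NumPy arrays)."""
        return np.polyfit(ys, xs, 2)

    def radius_of_curvature(ys, xs, y_eval):
        """Radius of curvature in meters at image row y_eval, using a fit in world units."""
        a, b, _ = np.polyfit(ys * YM_PER_PX, xs * XM_PER_PX, 2)
        y_m = y_eval * YM_PER_PX
        return (1 + (2 * a * y_m + b) ** 2) ** 1.5 / abs(2 * a)

    def draw_lane_overlay(frame, left_fit, right_fit, alpha=0.3):
        """Highlight the lane semi-transparently: green fill, yellow/red borders (BGR colors)."""
        ys = np.arange(frame.shape[0])
        left_xs, right_xs = np.polyval(left_fit, ys), np.polyval(right_fit, ys)
        overlay = np.zeros_like(frame)
        lane_poly = np.int32(np.vstack([np.column_stack([left_xs, ys]),
                                        np.column_stack([right_xs, ys])[::-1]]))
        cv2.fillPoly(overlay, [lane_poly], (0, 255, 0))  # lane area: green
        cv2.polylines(overlay, [np.int32(np.column_stack([left_xs, ys]))], False, (0, 255, 255), 10)  # left border: yellow
        cv2.polylines(overlay, [np.int32(np.column_stack([right_xs, ys]))], False, (0, 0, 255), 10)   # right border: red
        return cv2.addWeighted(frame, 1.0, overlay, alpha, 0)  # semi-transparent blend

Since the camera is assumed to be mounted at the car's center, the car's offset from the lane center can then be taken as the difference between the image's horizontal midpoint and the midpoint of the two fitted lines at the bottom row, scaled by the x conversion factor.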

Milestones

  • Create repository (April 16th, 2022)
  • Create initial README.md (April 23rd, 2022)
  • Create pipeline and test it against static image samples in assets (April 28th, 2022)
    • straight_lines1.jpg Pass
    • straight_lines2.jpg Pass
    • test1.jpg Pass
    • test2.jpg Pass
    • test3.jpg Pass
    • test4.jpg Pass
    • test5.jpg Pass
    • test6.jpg Pass
  • Test pipeline against video samples in assets
    • project_video.mp4 Pass
    • challenge_video.mp4 Pass
    • harder_challenge_video.mp4 Pass
  • Full code clean-up and documentation with debugging
    • Show the individual image/video processing steps
    • Show any relevant statistics
  • A simple batch/shell script that calls Python to run the code, with an argument for running in debugging mode... (April 30th, 2022)
    • Example cmd for normal run: pipeline_run.sh input_path output_path
    • Example cmd for debug run: pipeline_run.bat input_path output_path -d
  • Demo jupyter notebook for demonstrating the features of the pipeline (April 27th, 2022)
  • Upload demo result images and videos as issues
  • Review README.md with respect to phase 1
    • Add/update necessary information to the Getting Started section (how to install/run the code, etc...)
    • Add/update necessary information to the Usage section (how to actually play with the code)

Random Ideas:

  • Main problem is the first frame and how to detect lanes dynamically from it... Afterwards, the process should be iterative/relative
  • Could surround the detected lines by two parallel detected lines and constrain the search within them for the edges per frame, update them per frame too
    • This could keep track of the lanes and identify left from right
    • Also can help when either lane goes poof
  • Peeking center algorithm
    • If something's above it within some limit (3 pixels for example)
    • Keep moving left or right, pixel-by-pixel until you get out of its way
    • Deciding the direction is done by scanning left and right, closest pixel = move away from it (and commit in that direction)
    • If you hit a pixel while going in that direction, terminate
  • For non-initial lane detections, use the different algorithms you have, then pick the one that overlaps best
    • Overlapping = their intersection / their union (lane-wise), i.e. IoU, OR how many polyfit points lie within range of the previous ones (see the sketch after this list)
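A minimal sketch of the lane-wise overlap scoring mentioned above, assuming each candidate detection has been rasterized into a binary lane mask of the same shape as the frame (the mask construction itself is not shown):

    import numpy as np

    def lane_iou(mask_a, mask_b):
        """Intersection-over-union of two binary lane masks of the same shape."""
        a, b = mask_a.astype(bool), mask_b.astype(bool)
        union = np.count_nonzero(a | b)
        return np.count_nonzero(a & b) / union if union else 0.0

    def pick_best_candidate(previous_mask, candidate_masks):
        """Among lane masks produced by different algorithms, keep the one that
        overlaps best with the lane detected in the previous frame."""
        return max(candidate_masks, key=lambda m: lane_iou(previous_mask, m))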

(back to top)

Phase Two Details and Requirements

In this phase we are required to do more research on the topic of object detection in computer vision with the aid of artificial intelligence, namely machine and deep learning techniques such as CNNs, R-CNNs, ViTs, YOLO, etc. We are also required to use YOLOv3 for object detection.

Like in phase 1, the object detection will be visualized with boxes surrounding the detected objects. We are only concerned with cars.
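As an illustration only, here is a minimal sketch of car detection with YOLOv3 via OpenCV's DNN module; the config/weight file names, input size, and thresholds are assumptions (the pretrained Darknet files are obtained separately), and class id 2 corresponds to "car" in the COCO label list:

    import cv2
    import numpy as np

    # Assumed file names for the pretrained Darknet model
    net = cv2.dnn.readNetFromDarknet("yolov3.cfg", "yolov3.weights")
    CAR_CLASS_ID = 2  # "car" in the COCO labels

    def detect_cars(frame, conf_threshold=0.5, nms_threshold=0.4):
        """Return [x, y, w, h] boxes for cars detected in a BGR frame."""
        h, w = frame.shape[:2]
        blob = cv2.dnn.blobFromImage(frame, 1 / 255.0, (416, 416), swapRB=True, crop=False)
        net.setInput(blob)
        outputs = net.forward(net.getUnconnectedOutLayersNames())

        boxes, scores = [], []
        for output in outputs:
            for det in output:  # det = [cx, cy, bw, bh, objectness, class scores...]
                class_scores = det[5:]
                confidence = float(class_scores[CAR_CLASS_ID])
                if np.argmax(class_scores) != CAR_CLASS_ID or confidence < conf_threshold:
                    continue
                cx, cy, bw, bh = det[0] * w, det[1] * h, det[2] * w, det[3] * h
                boxes.append([int(cx - bw / 2), int(cy - bh / 2), int(bw), int(bh)])
                scores.append(confidence)

        # Non-maximum suppression to drop overlapping duplicate boxes
        keep = cv2.dnn.NMSBoxes(boxes, scores, conf_threshold, nms_threshold)
        return [boxes[i] for i in np.array(keep).flatten()] if len(keep) > 0 else []

The surviving boxes can then be drawn on each frame with cv2.rectangle to produce the required visualization.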

Milestones are similar to those of phase 1.

(back to top)

Getting Started

Prepare the environment

Before you start, you need to install the following libraries:

  • MoviePy
    pip install moviepy
  • NumPy
    pip install numpy
  • OpenCV
    pip install opencv-python
  • Docopt
    pip install docopt
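Alternatively, all four dependencies can be installed in one go:

    pip install moviepy numpy opencv-python docopt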

(back to top)

Usage

Windows

Note: Run this script in an Anaconda PowerShell Prompt.

File paths are relative to the script's location.

./shell.ps1 [--verbose] [--debug] INPUT_PATH OUTPUT_PATH 

Linux/MacOS

python3 main.py [--verbose] [--debug] INPUT_PATH OUTPUT_PATH 

Options:

  • --verbose: show intermediate outputs such as the perspective transform and binary (thresholded) images
  • --debug: enable debugging mode
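Since docopt is among the dependencies, the command-line interface is presumably parsed from a usage docstring. A minimal sketch of what that could look like (the exact docstring in main.py may differ):

    """Simple Perception Stack for Self-Driving Cars.

    Usage:
        main.py [--verbose] [--debug] INPUT_PATH OUTPUT_PATH

    Options:
        --verbose    Show intermediate outputs (perspective transform, binary image).
        --debug      Enable debugging mode.
    """
    from docopt import docopt

    if __name__ == "__main__":
        args = docopt(__doc__)
        # args is a dict, e.g. {'--verbose': False, '--debug': True, 'INPUT_PATH': ..., 'OUTPUT_PATH': ...}
        input_path, output_path = args["INPUT_PATH"], args["OUTPUT_PATH"]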
