Object Tracking with Semantic Pixel-Wise Convolutional Neural Networks

Introduction

This project was developed for identifying vehicles in a video stream. It is a cornerstone for a real-time vehicle tracking algorithm that employs semantic pixel-wise methods. The project solves the tracking problem of the Udacity final project using a different approach than the one presented in the course. Instead of using HOG features and other features extracted from the color space of the images, we used U-Net [1], a convolutional network developed for biomedical image segmentation. We modified the original network with batch normalization in Keras in order to speed up training. There are many benefits to using neural networks for segmentation and tracking, such as improved accuracy and portability. ConvNets are generally capable of performing segmentation tasks by transforming fully connected layers into convolution layers that output a spatial map for end-to-end pixel-wise learning [6]. In this submission we also discuss the internal covariate shift issue in ConvNets and test two implementations of the same network, with and without batch normalization.

Data Pre-processing

We implemented a camera calibration routine for the video file, but we did not calibrate the images from the training datasets. Each image was normalized and then smoothed with a Gaussian filter. To train the network, we generated a binary target mask for each image that covers only the cars reported in the CSV annotations. The images were also randomly processed with a brightness filter to help the network generalize to different lighting conditions.
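
A rough sketch of these pre-processing steps is shown below (the helper name, Gaussian kernel size, and brightness range here are illustrative assumptions, not necessarily the exact values used for the reported results; per-pixel intensity normalization is applied separately when images are fed to the network):

```python
import cv2
import numpy as np

def preprocess_image(img, bboxes, kernel_size=5):
    """Smooth and randomly brighten one training frame, and build its car mask.

    img    -- RGB image as a uint8 NumPy array
    bboxes -- list of (xmin, ymin, xmax, ymax) car boxes from the CSV annotations
    """
    # Gaussian smoothing to suppress sensor noise
    img = cv2.GaussianBlur(img, (kernel_size, kernel_size), 0)
    # Random brightness change in HSV space to vary lighting conditions
    hsv = cv2.cvtColor(img, cv2.COLOR_RGB2HSV).astype(np.float32)
    hsv[:, :, 2] *= 0.25 + np.random.uniform()
    img = cv2.cvtColor(np.clip(hsv, 0, 255).astype(np.uint8), cv2.COLOR_HSV2RGB)
    # Binary target mask that is 1 only inside the annotated car boxes
    mask = np.zeros(img.shape[:2], dtype=np.float32)
    for xmin, ymin, xmax, ymax in bboxes:
        mask[int(ymin):int(ymax), int(xmin):int(xmax)] = 1.0
    return img, mask
```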

U-Net Convolutional Network

The U-Net has been proposed as a solution for the DSTL satellite imagery feature detection challenge [2]. It was primarily chosen based on the evaluation of fast segmentation ConvNets by Fabian Tschopp [9]. The U-Net has two sections:

• Contracting: two convolutions followed by one max-pooling layer.
• Expanding: a deconvolution (upsampling) step followed by a convolution that reduces the number of feature maps, a merge-crop with the matching contracting feature map, and two convolution layers.

In this project we used the original U-Net implementation as described in its paper [1] and modified it for batch normalization.
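
Schematically, and written in the same Keras 1 API that the notebook below uses, the two sections reduce to building blocks like the following (the helper names and filter counts are illustrative, not part of the original code):

```python
from keras.layers import Convolution2D, MaxPooling2D, UpSampling2D, merge

def contracting_block(x, nb_filters):
    # Two 3x3 convolutions followed by a 2x2 max-pooling layer
    c = Convolution2D(nb_filters, 3, 3, activation='relu', border_mode='same')(x)
    c = Convolution2D(nb_filters, 3, 3, activation='relu', border_mode='same')(c)
    p = MaxPooling2D(pool_size=(2, 2))(c)
    return c, p  # c is kept for the skip connection to the expanding path

def expanding_block(x, skip, nb_filters):
    # Upsample, concatenate with the matching contracting feature map,
    # then two 3x3 convolutions that reduce the number of feature maps
    up = merge([UpSampling2D(size=(2, 2))(x), skip], mode='concat', concat_axis=3)
    c = Convolution2D(nb_filters, 3, 3, activation='relu', border_mode='same')(up)
    c = Convolution2D(nb_filters, 3, 3, activation='relu', border_mode='same')(c)
    return c
```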

Batch Normalization

Internal covariate shift in deep neural networks affects the learning speed of ConvNets. Batch normalization was recently proposed to normalize the distribution of each layer's inputs and thereby accelerate training. It also reduces overfitting and lessens the need for dropout in fully connected layers. To determine how useful batch normalization is in networks that don't use fully connected layers, we evaluated the performance of our encoder-decoder ConvNet with and without batch normalization. We found that batch normalization increased the learning performance by 18% but also increased the training time of each epoch by 26%. The figures below show the training accuracy plots from the U-Net with and without batch normalization, respectively.
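
The only structural difference between the two networks we compared is whether each convolution is wrapped as convolution, batch normalization, then ReLU, or kept as a plain convolution with a fused ReLU. A minimal sketch of the two variants in the same Keras 1 style (the helper names are mine, not from the original code):

```python
from keras.layers import Convolution2D, BatchNormalization, Activation

def conv_bn_relu(x, nb_filters):
    # Batch-normalized variant: normalize the convolution output before the non-linearity
    x = Convolution2D(nb_filters, 3, 3, border_mode='same')(x)
    x = BatchNormalization()(x)
    return Activation('relu')(x)

def conv_relu(x, nb_filters):
    # Plain variant used in the network without batch normalization
    return Convolution2D(nb_filters, 3, 3, activation='relu', border_mode='same')(x)
```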

Jaccard similarity coefficient

In evaluating the model I investigated several metrics, including the mean IU [7], Intersection over Union (IoU) [5], and the Jaccard coefficient [4]. The idea is to maximize the overlap between the predicted region and the ground-truth bounding box.

We eventually decided to use the Jaccard coefficient. The Jaccard similarity coefficient is defined as the size of the intersection divided by the size of the union of two regions. This metric is used to compare the predicted labels to the set of labels in y_true [4].

The coefficient is given by

J(A,B) = |A∩B| / |A∪B| = |A∩B| / (|A| + |B| - |A∩B|)

(If A and B are both empty, we define J(A,B) = 1.)

Image source [4]
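
For intuition, here is a minimal NumPy version of the coefficient applied to two binary masks (this is not the Keras implementation used during training, which appears further down in the notebook):

```python
import numpy as np

def jaccard(mask_a, mask_b):
    """Jaccard similarity of two binary masks; defined as 1 when both are empty."""
    a, b = mask_a.astype(bool), mask_b.astype(bool)
    union = np.logical_or(a, b).sum()
    if union == 0:
        return 1.0
    return np.logical_and(a, b).sum() / float(union)

# Toy example: a predicted box shifted relative to the ground-truth box
truth = np.zeros((10, 10)); truth[2:6, 2:6] = 1   # 16 pixels
pred = np.zeros((10, 10));  pred[3:7, 3:7] = 1    # 16 pixels, 9 of which overlap
print(jaccard(truth, pred))                       # 9 / (16 + 16 - 9) ≈ 0.39
```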

Training

The datasets were obtained from [11] and contained two sets of annotated data. The first dataset contained over 65,000 labels across 9,423 frames, with a total size of 1.5 GB. The images were captured with Point Grey research cameras running at a full resolution of 1920x1200 at 2 Hz. This dataset was annotated by CrowdAI using a combination of machine learning and crowdsourcing. The second dataset was similar to the first but contained additional fields for occlusion and an additional label for traffic lights. It was annotated entirely by humans using Autti and had 15,000 frames, with a total size of 3.3 GB.

We used the Adam optimizer with a learning rate of 1e-4 and binary cross-entropy as the loss function. Training ran on an AWS GPU node with a 12 GB Nvidia Titan X, but even on this machine the batch size had to be reduced to 1 image because any larger batch crashed the machine. We trained for 5 epochs with 3,000 images each. The training time per epoch was 817 seconds with batch normalization and 611 seconds without. We limited training to 5 epochs because increasing the number to 10 improved the training accuracy only slightly, from 75% to 78%, and only for the network with batch normalization. We also limited each epoch to 3,000 images because the GPU occasionally crashed with sizes larger than 5,000 images. The training accuracy figures were shown in the Batch Normalization section above.
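
Assembled from the hyper-parameters quoted above, the training set-up roughly corresponds to the following calls. This is a sketch that reuses get_small_unet, the Jaccard metrics, and generate_train_batch defined further down in the notebook, where the actual (commented-out) training cell was run with slightly different sample counts:

```python
from keras.optimizers import Adam

smooth = 1.   # smoothing constant used by the Jaccard metrics
model = get_small_unet()
model.compile(optimizer=Adam(lr=1e-4),
              loss='binary_crossentropy',
              metrics=[jaccard_coef, jaccard_coef_int, 'accuracy'])
model.fit_generator(generate_train_batch(cars_15_35GB, 1),   # batch size of 1 image
                    samples_per_epoch=3000,                  # 3000 images per epoch
                    nb_epoch=5)                              # 5 epochs in total
```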

Results

[![Object tracking](https://img.youtube.com/vi/IbvwsHkxv8E/0.jpg)](https://www.youtube.com/watch?v=IbvwsHkxv8E)

Reflections and limitations

This was an awesome project; I've learned so much reading about and implementing various approaches to solve it. I took an unorthodox approach by using neural networks instead of classical computer vision. I'm currently improving it by applying new techniques to turn the project into a tracking algorithm with neural networks. I will also improve the algorithm to remove the false positive detections seen in some frames, and I will test the mIoU metric for evaluating the network because it is a more stringent metric than class average accuracy, since it penalizes false positive predictions [8].

Acknowledgment

I can't thank Udacity enough for this great opportunity to learn state-of-the-art technology. The course materials and provided methods were the foundation for the project. I've also learned a lot from Stanford University's deep learning course [3], which was very helpful in choosing which ConvNet to use. I would like to give credit to Artur Kuzin, whose code on Kaggle [2] was very helpful in solving sticky problems. This work would not have been possible without the fascinating posts of Vivek Yadav [12], which were an inspiration and helpful in solving countless problems. Finally, this repository [10] was a great source for segmentation techniques using ConvNets.

References

[1]https://arxiv.org/abs/1505.04597

[2]https://www.kaggle.com/drn01z3/dstl-satellite-imagery-feature-detection/end-to-end-baseline-with-u-net-keras/code

[3] Stanford University CS224d: Deep Learning for Natural Language Processing, https://www.youtube.com/watch?v=2xtx-gk3PqY

[4]https://en.wikipedia.org/wiki/Jaccard_index

[5]http://www.pyimagesearch.com/2016/11/07/intersection-over-union-iou-for-object-detection/

[6]https://arxiv.org/pdf/1605.06211.pdf

[7]https://github.com/martinkersner/py_img_seg_eval

[8]https://arxiv.org/pdf/1511.00561.pdf

[9]https://arxiv.org/pdf/1509.03371.pdf

[10]https://github.com/kjw0612/awesome-deep-vision

[11]https://github.com/edrones/self-driving-car/tree/master/annotations

[12]https://chatbotslife.com/small-u-net-for-vehicle-detection-9eec216f9fd6#.45e4cn1xk

%matplotlib inline
import cv2
import matplotlib.pyplot as plt
from matplotlib.patches import Rectangle
from moviepy.editor import VideoFileClip
from IPython.display import HTML
import numpy as np
import glob
import time
from keras.models import Model
from keras.layers import Input, merge, Convolution2D, MaxPooling2D, UpSampling2D,Lambda
from keras.optimizers import Adam
from keras.callbacks import ModelCheckpoint, LearningRateScheduler
from keras import backend as K
from scipy.ndimage.measurements import label
import pandas as pd
import keras.backend as K
from sklearn.metrics import jaccard_similarity_score
from keras.layers import Input, Dense, BatchNormalization, merge, Activation
import time
## Read in data from "object detection crowdai" and plot it to check for any issues with the datasets

data_15GB = pd.read_csv('object-detection-crowdai/labels.csv', header=0)


data_15GB.head()
xmin xmax ymin ymax Frame Label Preview URL
0 785 533 905 644 1479498371963069978.jpg Car http://crowdai.com/images/Wwj-gorOCisE7uxA/vis...
1 89 551 291 680 1479498371963069978.jpg Car http://crowdai.com/images/Wwj-gorOCisE7uxA/vis...
2 268 546 383 650 1479498371963069978.jpg Car http://crowdai.com/images/Wwj-gorOCisE7uxA/vis...
3 455 522 548 615 1479498371963069978.jpg Truck http://crowdai.com/images/Wwj-gorOCisE7uxA/vis...
4 548 522 625 605 1479498371963069978.jpg Truck http://crowdai.com/images/Wwj-gorOCisE7uxA/vis...
# The majority of labels are for cars, therefore I decided to train the classifier on the car and truck classes

car_label = data_15GB[(data_15GB['Label']=='Car')]
truck_label = data_15GB[(data_15GB['Label']=='Truck')]
print ("Car Label",len(car_label))
print ("Truck Label",len(truck_label))
Car Label 62570
Truck Label 3819
# Perform data cleanup and get rid of the fields that aren't required, such as the preview URL
cars_15GB = data_15GB[(data_15GB['Label']=='Car') | (data_15GB['Label']=='Truck')].reset_index()
cars_15GB  = cars_15GB.drop('index', 1)
cars_15GB['File_Path'] =  'object-detection-crowdai' + '/' +cars_15GB['Frame']
cars_15GB  = cars_15GB.drop('Preview URL', 1)
cars_15GB.head()
xmin xmax ymin ymax Frame Label File_Path
0 785 533 905 644 1479498371963069978.jpg Car object-detection-crowdai/1479498371963069978.jpg
1 89 551 291 680 1479498371963069978.jpg Car object-detection-crowdai/1479498371963069978.jpg
2 268 546 383 650 1479498371963069978.jpg Car object-detection-crowdai/1479498371963069978.jpg
3 455 522 548 615 1479498371963069978.jpg Truck object-detection-crowdai/1479498371963069978.jpg
4 548 522 625 605 1479498371963069978.jpg Truck object-detection-crowdai/1479498371963069978.jpg
# Read in data from "objectdataset" and plot it to check for any issues with the datasets. The datasets aren't labeled
# and contain classes other than cars and trucks 

data_35GB = pd.read_csv('object-dataset/labels.csv', header=0)
data_35GB.head()

 
1478019952686311006.jpg 950 574 1004 620 0 car Unnamed: 7
0 1478019952686311006.jpg 1748 482 1818 744 0 pedestrian NaN
1 1478019953180167674.jpg 872 586 926 632 0 car NaN
2 1478019953689774621.jpg 686 566 728 618 1 truck NaN
3 1478019953689774621.jpg 716 578 764 622 0 car NaN
4 1478019953689774621.jpg 826 580 880 626 0 car NaN
data_35GB.columns= ['Frame',  'xmin', 'xmax', 'ymin','ymax', 'class', 'Label','Unnamed']
car_label = data_35GB[(data_35GB['Label']=='car')]
truck_label = data_35GB[(data_35GB['Label']=='truck')]
pedestrian_label = data_35GB[(data_35GB['Label']=='pedestrian')]
everything_else_label = data_35GB[(data_35GB['Label']!='pedestrian')&(data_35GB['Label']!='truck')&(data_35GB['Label']!='car')]
print ("Car Labels",len(car_label))
print ("Truck Labels",len(truck_label))
print ("Pedestrian labels",len(pedestrian_label))
print ("everything else label",len(everything_else_label))
Car Labels 60787
Truck Labels 3503
Pedestrian labels 9866
everything else label 18929
cars_35GB = data_35GB[(data_35GB['Label']=='car') | (data_35GB['Label']=='truck')].reset_index()
cars_35GB = cars_35GB.drop('index', 1)
cars_35GB = cars_35GB.drop('class', 1)
cars_35GB = cars_35GB.drop('Unnamed', 1)
cars_35GB['File_Path'] =  'object-dataset' + '/' +cars_35GB['Frame']
cars_35GB.head() 
Frame xmin xmax ymin ymax Label File_Path
0 1478019953180167674.jpg 872 586 926 632 car object-dataset/1478019953180167674.jpg
1 1478019953689774621.jpg 686 566 728 618 truck object-dataset/1478019953689774621.jpg
2 1478019953689774621.jpg 716 578 764 622 car object-dataset/1478019953689774621.jpg
3 1478019953689774621.jpg 826 580 880 626 car object-dataset/1478019953689774621.jpg
4 1478019953689774621.jpg 1540 488 1680 608 car object-dataset/1478019953689774621.jpg
cars_15_35GB = pd.concat([cars_15GB,cars_35GB]).reset_index()
cars_15_35GB = cars_15_35GB.drop('index',1)
cars_15_35GB.columns =['File_Path','Frame','Label','ymin','xmin','ymax','xmax']
cars_15_35GB.head()
print(len(cars_15_35GB))
130679
#test we have the correct pathname
file_name = cars_15_35GB['File_Path'][1]
img_name = file_name.split('/',1)[1]
print(img_name)
print(file_name)
1479498371963069978.jpg
object-detection-crowdai/1479498371963069978.jpg
# Method to read an image. cv2.imread returns images in BGR, so convert them
# to the RGB domain
def read_img(img):
    img = cv2.imread(img)
    img = cv2.cvtColor(img,cv2.COLOR_BGR2RGB)
    return img

# Image brightness augmentation method, based on Vivek Yadav's [12] approach for changing image brightness
def brightness_images(img):
    post_img = cv2.cvtColor(img,cv2.COLOR_RGB2HSV)
    random_bright = .25+np.random.uniform()
    post_img[:,:,2] = np.multiply(post_img[:,:,2],random_bright)
    post_img = cv2.cvtColor(post_img,cv2.COLOR_HSV2RGB)
    return post_img
# Another approach to adjust brightness used for experimentation 
def brightness_images_2(img):
    hsv = cv2.cvtColor(img, cv2.COLOR_BGR2HSV)  
    h, s, v = cv2.split(hsv)
    # Note: v is uint8, so adding 255 wraps around rather than saturating;
    # cv2.add(v, 255) would clip at 255 instead. Kept for experimentation only.
    v += 255
    final_hsv = cv2.merge((h, s, v))
    img = cv2.cvtColor(final_hsv, cv2.COLOR_HSV2BGR)
    return img
# Resize the image to the given dimensions 
def resize_img(image, col, row):
    image = cv2.resize(image, (col,row), interpolation=cv2.INTER_AREA)
    return image
# Crop away the car hood from the original image  
def crop_img(img):
    shape = img.shape
    img = img[0:shape[0]-20,0:shape[1]]
    img = resize_img(img, 64, 64)
    return img

# Define a function to draw bounding boxes
def draw_boxes(img, bboxes, color=(0, 0, 255), thick=6):
    # Make a copy of the image
    imcopy = np.copy(img)
    # Iterate through the bounding boxes
    for bbox in bboxes:
        # Draw a rectangle given bbox coordinates
        cv2.rectangle(imcopy, bbox[0], bbox[1], color, thick)
    # Return the image copy with boxes drawn
    return imcopy
def gaussian_noise(img, kernel_size):
    """Applies Gaussian blur (smoothing) with the given kernel size"""
    return cv2.GaussianBlur(img, (kernel_size, kernel_size), 0)

def transform(img):
    imshape = img.shape
    img_size = (img.shape[1], img.shape[0])
    #src=np.float32([[160,imshape[0]],[imshape[1]/2-60, imshape[0]/2+90],[imshape[1]/2+100, imshape[0]/2+90], [imshape[1]-20,imshape[0]]])
    #dst=np.float32([[(240,imshape[0]),(240, 0),(imshape[1]-130, 0), (imshape[1]-130,imshape[0])]])
    src = np.float32([[490, 482],[810, 482],
                     [1250, 720],[40, 720]])
    dst = np.float32([[0, 0], [1280, 0], 
                     [1250, 720],[40, 720]])
    
    M = cv2.getPerspectiveTransform(src, dst)
    Minv = cv2.getPerspectiveTransform(dst, src)
    warped = cv2.warpPerspective(img,M,img_size, flags=cv2.INTER_LINEAR)
    
    return Minv, warped


#Implement calibration on the images that will be used
def undistort(img, read=True, display=True, write=False):

# Test undistortion on an image
    
    if read:
        img = cv2.imread(img)
    img_size = (img.shape[1], img.shape[0])
#img = cv2.cvtColor(dst, cv2.COLOR_BGR2RGB)
# Do camera calibration given object points and image points
    ret, mtx, dist, rvecs, tvecs = cv2.calibrateCamera(objpoints, imgpoints, img_size,None,None)
    dst = cv2.undistort(img, mtx, dist, None, mtx)
#dst = cv2.cvtColor(dst, cv2.COLOR_RGB2BGR)
    if write:
        cv2.imwrite('Undistorted/test6.jpg',dst)
# Save the camera calibration result for later use (we won't worry about rvecs / tvecs)
#dist_pickle = {}
#dist_pickle["mtx"] = mtx
#dist_pickle["dist"] = dist
#pickle.dump( dist_pickle, open( "calibration_wide/wide_dist_pickle.p", "wb" ) )
#dst = cv2.cvtColor(dst, cv2.COLOR_BGR2RGB)
# Visualize undistortion
    if display:
        f, (ax1, ax2) = plt.subplots(1, 2, figsize=(20,10))
        img_RGB=cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
        ax1.imshow(img_RGB)
        ax1.set_title('Original Image', fontsize=30)
        dst_RGB=cv2.cvtColor(dst, cv2.COLOR_BGR2RGB)
        ax2.imshow(dst_RGB)
        ax2.set_title('Undistorted Image', fontsize=30)
    else:
        return dst
## Method to get the image and resize it if required
def im_read(img_path,classID,resize = True, size=(640,300),augumentation=True, display=True):
    # Look up the file path of the requested annotation row
    file_name = img_path['File_Path'][classID]
    #img = read_img(file_name)
    img = cv2.imread(file_name)
    img_size = np.shape(img)
    #print(img_size )
    img = cv2.cvtColor(img,cv2.COLOR_BGR2RGB)
    if resize == True:
        img = cv2.resize(img,size)
    if augumentation == True:
        img = brightness_images(img)
    # Strip the directory prefix to recover the frame name used in the CSV annotations
    img_name = file_name.split('/',1)[1]
    bb_boxes = img_path[img_path['Frame'] == img_name].reset_index()
    img_size_post = np.shape(img)  
    bb_boxes['xmin'] = np.round(bb_boxes['xmin']/img_size[1]*img_size_post[1])
    bb_boxes['xmax'] = np.round(bb_boxes['xmax']/img_size[1]*img_size_post[1])
    bb_boxes['ymin'] = np.round(bb_boxes['ymin']/img_size[0]*img_size_post[0])
    bb_boxes['ymax'] = np.round(bb_boxes['ymax']/img_size[0]*img_size_post[0])
    bb_boxes['Area'] = (bb_boxes['xmax']- bb_boxes['xmin'])*(bb_boxes['ymax']- bb_boxes['ymin']) 
    if display == True:
        plt.figure(figsize=(16,10))
        plt.subplot(1,1,1)
        plot_im_bbox(img,bb_boxes)
    return img_name,img,bb_boxes

def get_mask_seg(img,bb_boxes_f):
    # Build a binary mask with 1s inside every annotated bounding box
    img_mask = np.zeros_like(img[:,:,0])
    for i in range(len(bb_boxes_f)):
        bb_box_i = [int(bb_boxes_f.iloc[i]['xmin']),int(bb_boxes_f.iloc[i]['ymin']),
                    int(bb_boxes_f.iloc[i]['xmax']),int(bb_boxes_f.iloc[i]['ymax'])]
        img_mask[bb_box_i[1]:bb_box_i[3],bb_box_i[0]:bb_box_i[2]] = 1.
    # Add a channel dimension so the mask matches the network output shape
    img_mask = np.reshape(img_mask,(np.shape(img_mask)[0],np.shape(img_mask)[1],1))
    return img_mask

def plot_bbox(bb_boxes,ind_bb,color='r',linewidth=2):
    bb_box_i = [bb_boxes.iloc[ind_bb]['xmin'],
                bb_boxes.iloc[ind_bb]['ymin'],
                bb_boxes.iloc[ind_bb]['xmax'],
                bb_boxes.iloc[ind_bb]['ymax']]
    plt.plot([bb_box_i[0],bb_box_i[2],bb_box_i[2],
                  bb_box_i[0],bb_box_i[0]],
             [bb_box_i[1],bb_box_i[1],bb_box_i[3],
                  bb_box_i[3],bb_box_i[1]],
             color,linewidth=linewidth)
def plot_im_bbox(im,bb_boxes):
    plt.imshow(im)
    for i in range(len(bb_boxes)):
        plot_bbox(bb_boxes,i,'g')

        bb_box_i = [bb_boxes.iloc[i]['xmin'],bb_boxes.iloc[i]['ymin'],
                bb_boxes.iloc[i]['xmax'],bb_boxes.iloc[i]['ymax']]
    plt.axis('off'); 
 
test_img_name,img,bb_boxes = im_read(cars_15_35GB,120,resize = True, size=(640,300), augumentation=False, display = True)

(Output: a sample training image with its bounding boxes.)

# Test an image with it's bounding boxes
#print (cars_15_35GB)
test_img_name,img,bb_boxes = im_read(cars_15_35GB,100,resize = True, size=(640,300), augumentation=False, display = False)
img_masked = get_mask_seg(img,bb_boxes)
plt.figure(figsize=(14,8))
plt.subplot(1,3,1)
plot_im_bbox(img,bb_boxes)
plt.title('Raw image')
plt.axis('off')
plt.subplot(1,3,2)
plt.imshow(img_masked[:,:,0])
plt.title('Mask')
plt.axis('off')
plt.subplot(1,3,3)
plt.imshow(cv2.bitwise_and(img,img,mask=img_masked))
plt.title('Masked image')
plt.axis('off')
(-0.5, 639.5, 299.5, -0.5)

(Output: the raw image, its mask, and the masked image.)

def generate_train_batch(data,batch_size):
    #col=size[0]
    #row=size[1]
    batch_images = np.zeros((batch_size, row, col, 3))
    #batch_steering = np.zeros(batch_size)
    batch_masks = np.zeros((batch_size, row, col, 1))
    # Use only the first len(data)-100000 annotation rows (~30k) for training
    training_batch = len(data)-100000
    while 1:
        for i_batch in range(batch_size):
            
            process_line = np.random.randint(training_batch)
            #generator_csv= data.iloc[[process_line]].reset_index()
            #x,y = all_filters_train(generator_csv)
            #print(data)
            img_name,img,bb_boxes = im_read(data,process_line,resize = True, size=(col,row), augumentation=True, display = False)
            img_masked = get_mask_seg(img,bb_boxes)
            batch_images[i_batch] = img
            batch_masks[i_batch] = img_masked
        yield batch_images, batch_masks

def generate_test_batch(data,batch_size):
    batch_images = np.zeros((batch_size, row, col, 3))
    #batch_steering = np.zeros(batch_size)
    batch_masks = np.zeros((batch_size, row, col, 1))
    training_batch = len(data)-100000
    while 1:
        for i_batch in range(batch_size):
            process_line = np.random.randint(training_batch)
            process_line = process_line+training_batch
            #generator_csv= data.iloc[[process_line]].reset_index()
            #x,y = all_filters_train(generator_csv)
            img_name,img,bb_boxes = im_read(data,process_line,resize = True, size=(col,row), augumentation=True, display = False)
            img_masked = get_mask_seg(img,bb_boxes)
            batch_images[i_batch] = img
            batch_masks[i_batch] = img_masked
        yield batch_images, batch_masks
"""        
def generate_validation_patch(data):
    while 1:
        for process_line in range(len(data)):
            generator_csv = data.iloc[[process_line]].reset_index()
            x = all_filters_validate(data)
            x = x.reshape(1, x.shape[0], x.shape[1], x.shape[2])
            y = generator_csv['steering'][0]
            y = np.array([[y]])
            yield x, y
def save_model(fileModelJSON,fileWeights):
    if Path(fileModelJSON).is_file():
        os.remove(fileModelJSON)
    json_string = model.to_json()
    with open(fileModelJSON,'w' ) as f:
        json.dump(json_string, f)
    if Path(fileWeights).is_file():
        os.remove(fileWeights)
    model.save_weights(fileWeights)
    
"""
row = 640
col = 960
# for custom metrics
#custom metrix source : https://keras.io/metrics/
#Use this as a template
#def mean_pred(y_true, y_pred):
    #return K.mean(y_pred)

#def false_rates(y_true, y_pred):
    #false_neg = ...
    #false_pos = ...
    #return {
        #'false_neg': false_neg,
        #'false_pos': false_pos,
    #}

#model.compile(optimizer='rmsprop',
              #loss='binary_crossentropy',
             # metrics=['accuracy', mean_pred, false_rates])

def jaccard_coef(y_true, y_pred):
    # __author__ = Vladimir Iglovikov
    intersection = K.sum(y_true * y_pred, axis=[0, -1, -2])
    sum_ = K.sum(y_true + y_pred, axis=[0, -1, -2])

    jac = (intersection + smooth) / (sum_ - intersection + smooth)

    return K.mean(jac)

def jaccard_coef_int(y_true, y_pred):
    # __author__ = Vladimir Iglovikov
    y_pred_pos = K.round(K.clip(y_pred, 0, 1))

    intersection = K.sum(y_true * y_pred_pos, axis=[0, -1, -2])
    sum_ = K.sum(y_true + y_pred, axis=[0, -1, -2])
    jac = (intersection + smooth) / (sum_ - intersection + smooth)
    return K.mean(jac)

#to do: fbeta_score(y_true, y_pred, beta=1) 
# U-Net with batch normalization

def get_small_unet():
    inputs = Input((row,col,3))
    # Note: this normalization layer is defined but never applied to `inputs`,
    # so the network is trained on raw pixel values
    inputs_norm = Lambda(lambda x: x/127.5 - 1.)
    
    ##
    #layer=Dense(d, init='glorot_normal', bias=True)(x)
    #layer=BatchNormalization()(layer)
    #layer=Activation('relu')(layer)
    
    conv1 = Convolution2D(8, 3, 3, border_mode='same')(inputs)
    conv1 = BatchNormalization()(conv1)
    conv1 = Activation('relu')(conv1)
    
    conv1 = Convolution2D(8, 3, 3, border_mode='same')(conv1)
    conv1 = BatchNormalization()(conv1)
    conv1 = Activation('relu')(conv1)
    pool1 = MaxPooling2D(pool_size=(2, 2))(conv1)

    conv2 = Convolution2D(16, 3, 3, border_mode='same')(pool1)
    conv2 = BatchNormalization()(conv2)
    conv2 = Activation('relu')(conv2)
    
    
    conv2 = Convolution2D(16, 3, 3, border_mode='same')(conv2)
    conv2 = BatchNormalization()(conv2)
    conv2 = Activation('relu')(conv2)
    pool2 = MaxPooling2D(pool_size=(2, 2))(conv2)

    conv3 = Convolution2D(32, 3, 3, border_mode='same')(pool2)
    conv3 = BatchNormalization()(conv3)
    conv3 = Activation('relu')(conv3)
    
    
    conv3 = Convolution2D(32, 3, 3, border_mode='same')(conv3)
    conv3 = BatchNormalization()(conv3)
    conv3 = Activation('relu')(conv3)
    pool3 = MaxPooling2D(pool_size=(2, 2))(conv3)

    conv4 = Convolution2D(64, 3, 3, border_mode='same')(pool3)
    conv4 = BatchNormalization()(conv4)
    conv4 = Activation('relu')(conv4)
    
    conv4 = Convolution2D(64, 3, 3, border_mode='same')(conv4)
    conv4 = BatchNormalization()(conv4)
    conv4 = Activation('relu')(conv4)
    pool4 = MaxPooling2D(pool_size=(2, 2))(conv4)

    conv5 = Convolution2D(128, 3, 3, border_mode='same')(pool4)
    conv5 = BatchNormalization()(conv5)
    conv5 = Activation('relu')(conv5)
    
    conv5 = Convolution2D(128, 3, 3, border_mode='same')(conv5)
    conv5 = BatchNormalization()(conv5)
    conv5 = Activation('relu')(conv5)

    up6 = merge([UpSampling2D(size=(2, 2))(conv5), conv4], mode='concat', concat_axis=3)
    conv6 = Convolution2D(64, 3, 3, border_mode='same')(up6)
    conv6 = BatchNormalization()(conv6)
    conv6 = Activation('relu')(conv6)
    
    conv6 = Convolution2D(64, 3, 3, border_mode='same')(conv6)
    conv6 = BatchNormalization()(conv6)
    conv6 = Activation('relu')(conv6)

    up7 = merge([UpSampling2D(size=(2, 2))(conv6), conv3], mode='concat', concat_axis=3)
    conv7 = Convolution2D(32, 3, 3, border_mode='same')(up7)
    conv7 = BatchNormalization()(conv7)
    conv7 = Activation('relu')(conv7)
    
    conv7 = Convolution2D(32, 3, 3, border_mode='same')(conv7)
    conv7 = BatchNormalization()(conv7)
    conv7 = Activation('relu')(conv7)

    up8 = merge([UpSampling2D(size=(2, 2))(conv7), conv2], mode='concat', concat_axis=3)
    conv8 = Convolution2D(16, 3, 3, border_mode='same')(up8)
    conv8 = BatchNormalization()(conv8)
    conv8 = Activation('relu')(conv8)
    
    
    conv8 = Convolution2D(16, 3, 3, border_mode='same')(conv8)
    conv8 = BatchNormalization()(conv8)
    conv8 = Activation('relu')(conv8)

    up9 = merge([UpSampling2D(size=(2, 2))(conv8), conv1], mode='concat', concat_axis=3)
    conv9 = Convolution2D(8, 3, 3, border_mode='same')(up9)
    conv9 = BatchNormalization()(conv9)
    conv9 = Activation('relu')(conv9)
    
    conv9 = Convolution2D(8, 3, 3, border_mode='same')(conv9)
    conv9 = BatchNormalization()(conv9)
    conv9 = Activation('relu')(conv9)

    conv10 = Convolution2D(1, 1, 1, activation='sigmoid')(conv9)

    model = Model(input=inputs, output=conv10)

    
    return model
# U-Net without batch normalization
"""
def get_small_unet():
    inputs = Input((img_rows, img_cols,3))
    inputs_norm = Lambda(lambda x: x/127.5 - 1.)
    conv1 = Convolution2D(8, 3, 3, activation='relu', border_mode='same')(inputs)
    conv1 = Convolution2D(8, 3, 3, activation='relu', border_mode='same')(conv1)
    pool1 = MaxPooling2D(pool_size=(2, 2))(conv1)

    conv2 = Convolution2D(16, 3, 3, activation='relu', border_mode='same')(pool1)
    conv2 = Convolution2D(16, 3, 3, activation='relu', border_mode='same')(conv2)
    pool2 = MaxPooling2D(pool_size=(2, 2))(conv2)

    conv3 = Convolution2D(32, 3, 3, activation='relu', border_mode='same')(pool2)
    conv3 = Convolution2D(32, 3, 3, activation='relu', border_mode='same')(conv3)
    pool3 = MaxPooling2D(pool_size=(2, 2))(conv3)

    conv4 = Convolution2D(64, 3, 3, activation='relu', border_mode='same')(pool3)
    conv4 = Convolution2D(64, 3, 3, activation='relu', border_mode='same')(conv4)
    pool4 = MaxPooling2D(pool_size=(2, 2))(conv4)

    conv5 = Convolution2D(128, 3, 3, activation='relu', border_mode='same')(pool4)
    conv5 = Convolution2D(128, 3, 3, activation='relu', border_mode='same')(conv5)

    up6 = merge([UpSampling2D(size=(2, 2))(conv5), conv4], mode='concat', concat_axis=3)
    conv6 = Convolution2D(64, 3, 3, activation='relu', border_mode='same')(up6)
    conv6 = Convolution2D(64, 3, 3, activation='relu', border_mode='same')(conv6)

    up7 = merge([UpSampling2D(size=(2, 2))(conv6), conv3], mode='concat', concat_axis=3)
    conv7 = Convolution2D(32, 3, 3, activation='relu', border_mode='same')(up7)
    conv7 = Convolution2D(32, 3, 3, activation='relu', border_mode='same')(conv7)

    up8 = merge([UpSampling2D(size=(2, 2))(conv7), conv2], mode='concat', concat_axis=3)
    conv8 = Convolution2D(16, 3, 3, activation='relu', border_mode='same')(up8)
    conv8 = Convolution2D(16, 3, 3, activation='relu', border_mode='same')(conv8)

    up9 = merge([UpSampling2D(size=(2, 2))(conv8), conv1], mode='concat', concat_axis=3)
    conv9 = Convolution2D(8, 3, 3, activation='relu', border_mode='same')(up9)
    conv9 = Convolution2D(8, 3, 3, activation='relu', border_mode='same')(conv9)

    conv10 = Convolution2D(1, 1, 1, activation='sigmoid')(conv9)

    model = Model(input=inputs, output=conv10)


    return model
"""
### Generator 

training_gen = generate_train_batch(cars_15_35GB,1)
smooth = 1.
model = get_small_unet()
model.compile(optimizer=Adam(lr=1e-4), loss='binary_crossentropy', metrics=[jaccard_coef, jaccard_coef_int, 'accuracy'])
model.summary()
____________________________________________________________________________________________________
Layer (type)                     Output Shape          Param #     Connected to                     
====================================================================================================
input_3 (InputLayer)             (None, 640, 960, 3)   0                                            
____________________________________________________________________________________________________
convolution2d_39 (Convolution2D) (None, 640, 960, 8)   224         input_3[0][0]                    
____________________________________________________________________________________________________
batchnormalization_37 (BatchNorm (None, 640, 960, 8)   32          convolution2d_39[0][0]           
____________________________________________________________________________________________________
activation_37 (Activation)       (None, 640, 960, 8)   0           batchnormalization_37[0][0]      
____________________________________________________________________________________________________
convolution2d_40 (Convolution2D) (None, 640, 960, 8)   584         activation_37[0][0]              
____________________________________________________________________________________________________
batchnormalization_38 (BatchNorm (None, 640, 960, 8)   32          convolution2d_40[0][0]           
____________________________________________________________________________________________________
activation_38 (Activation)       (None, 640, 960, 8)   0           batchnormalization_38[0][0]      
____________________________________________________________________________________________________
maxpooling2d_9 (MaxPooling2D)    (None, 320, 480, 8)   0           activation_38[0][0]              
____________________________________________________________________________________________________
convolution2d_41 (Convolution2D) (None, 320, 480, 16)  1168        maxpooling2d_9[0][0]             
____________________________________________________________________________________________________
batchnormalization_39 (BatchNorm (None, 320, 480, 16)  64          convolution2d_41[0][0]           
____________________________________________________________________________________________________
activation_39 (Activation)       (None, 320, 480, 16)  0           batchnormalization_39[0][0]      
____________________________________________________________________________________________________
convolution2d_42 (Convolution2D) (None, 320, 480, 16)  2320        activation_39[0][0]              
____________________________________________________________________________________________________
batchnormalization_40 (BatchNorm (None, 320, 480, 16)  64          convolution2d_42[0][0]           
____________________________________________________________________________________________________
activation_40 (Activation)       (None, 320, 480, 16)  0           batchnormalization_40[0][0]      
____________________________________________________________________________________________________
maxpooling2d_10 (MaxPooling2D)   (None, 160, 240, 16)  0           activation_40[0][0]              
____________________________________________________________________________________________________
convolution2d_43 (Convolution2D) (None, 160, 240, 32)  4640        maxpooling2d_10[0][0]            
____________________________________________________________________________________________________
batchnormalization_41 (BatchNorm (None, 160, 240, 32)  128         convolution2d_43[0][0]           
____________________________________________________________________________________________________
activation_41 (Activation)       (None, 160, 240, 32)  0           batchnormalization_41[0][0]      
____________________________________________________________________________________________________
convolution2d_44 (Convolution2D) (None, 160, 240, 32)  9248        activation_41[0][0]              
____________________________________________________________________________________________________
batchnormalization_42 (BatchNorm (None, 160, 240, 32)  128         convolution2d_44[0][0]           
____________________________________________________________________________________________________
activation_42 (Activation)       (None, 160, 240, 32)  0           batchnormalization_42[0][0]      
____________________________________________________________________________________________________
maxpooling2d_11 (MaxPooling2D)   (None, 80, 120, 32)   0           activation_42[0][0]              
____________________________________________________________________________________________________
convolution2d_45 (Convolution2D) (None, 80, 120, 64)   18496       maxpooling2d_11[0][0]            
____________________________________________________________________________________________________
batchnormalization_43 (BatchNorm (None, 80, 120, 64)   256         convolution2d_45[0][0]           
____________________________________________________________________________________________________
activation_43 (Activation)       (None, 80, 120, 64)   0           batchnormalization_43[0][0]      
____________________________________________________________________________________________________
convolution2d_46 (Convolution2D) (None, 80, 120, 64)   36928       activation_43[0][0]              
____________________________________________________________________________________________________
batchnormalization_44 (BatchNorm (None, 80, 120, 64)   256         convolution2d_46[0][0]           
____________________________________________________________________________________________________
activation_44 (Activation)       (None, 80, 120, 64)   0           batchnormalization_44[0][0]      
____________________________________________________________________________________________________
maxpooling2d_12 (MaxPooling2D)   (None, 40, 60, 64)    0           activation_44[0][0]              
____________________________________________________________________________________________________
convolution2d_47 (Convolution2D) (None, 40, 60, 128)   73856       maxpooling2d_12[0][0]            
____________________________________________________________________________________________________
batchnormalization_45 (BatchNorm (None, 40, 60, 128)   512         convolution2d_47[0][0]           
____________________________________________________________________________________________________
activation_45 (Activation)       (None, 40, 60, 128)   0           batchnormalization_45[0][0]      
____________________________________________________________________________________________________
convolution2d_48 (Convolution2D) (None, 40, 60, 128)   147584      activation_45[0][0]              
____________________________________________________________________________________________________
batchnormalization_46 (BatchNorm (None, 40, 60, 128)   512         convolution2d_48[0][0]           
____________________________________________________________________________________________________
activation_46 (Activation)       (None, 40, 60, 128)   0           batchnormalization_46[0][0]      
____________________________________________________________________________________________________
upsampling2d_9 (UpSampling2D)    (None, 80, 120, 128)  0           activation_46[0][0]              
____________________________________________________________________________________________________
merge_9 (Merge)                  (None, 80, 120, 192)  0           upsampling2d_9[0][0]             
                                                                   activation_44[0][0]              
____________________________________________________________________________________________________
convolution2d_49 (Convolution2D) (None, 80, 120, 64)   110656      merge_9[0][0]                    
____________________________________________________________________________________________________
batchnormalization_47 (BatchNorm (None, 80, 120, 64)   256         convolution2d_49[0][0]           
____________________________________________________________________________________________________
activation_47 (Activation)       (None, 80, 120, 64)   0           batchnormalization_47[0][0]      
____________________________________________________________________________________________________
convolution2d_50 (Convolution2D) (None, 80, 120, 64)   36928       activation_47[0][0]              
____________________________________________________________________________________________________
batchnormalization_48 (BatchNorm (None, 80, 120, 64)   256         convolution2d_50[0][0]           
____________________________________________________________________________________________________
activation_48 (Activation)       (None, 80, 120, 64)   0           batchnormalization_48[0][0]      
____________________________________________________________________________________________________
upsampling2d_10 (UpSampling2D)   (None, 160, 240, 64)  0           activation_48[0][0]              
____________________________________________________________________________________________________
merge_10 (Merge)                 (None, 160, 240, 96)  0           upsampling2d_10[0][0]            
                                                                   activation_42[0][0]              
____________________________________________________________________________________________________
convolution2d_51 (Convolution2D) (None, 160, 240, 32)  27680       merge_10[0][0]                   
____________________________________________________________________________________________________
batchnormalization_49 (BatchNorm (None, 160, 240, 32)  128         convolution2d_51[0][0]           
____________________________________________________________________________________________________
activation_49 (Activation)       (None, 160, 240, 32)  0           batchnormalization_49[0][0]      
____________________________________________________________________________________________________
convolution2d_52 (Convolution2D) (None, 160, 240, 32)  9248        activation_49[0][0]              
____________________________________________________________________________________________________
batchnormalization_50 (BatchNorm (None, 160, 240, 32)  128         convolution2d_52[0][0]           
____________________________________________________________________________________________________
activation_50 (Activation)       (None, 160, 240, 32)  0           batchnormalization_50[0][0]      
____________________________________________________________________________________________________
upsampling2d_11 (UpSampling2D)   (None, 320, 480, 32)  0           activation_50[0][0]              
____________________________________________________________________________________________________
merge_11 (Merge)                 (None, 320, 480, 48)  0           upsampling2d_11[0][0]            
                                                                   activation_40[0][0]              
____________________________________________________________________________________________________
convolution2d_53 (Convolution2D) (None, 320, 480, 16)  6928        merge_11[0][0]                   
____________________________________________________________________________________________________
batchnormalization_51 (BatchNorm (None, 320, 480, 16)  64          convolution2d_53[0][0]           
____________________________________________________________________________________________________
activation_51 (Activation)       (None, 320, 480, 16)  0           batchnormalization_51[0][0]      
____________________________________________________________________________________________________
convolution2d_54 (Convolution2D) (None, 320, 480, 16)  2320        activation_51[0][0]              
____________________________________________________________________________________________________
batchnormalization_52 (BatchNorm (None, 320, 480, 16)  64          convolution2d_54[0][0]           
____________________________________________________________________________________________________
activation_52 (Activation)       (None, 320, 480, 16)  0           batchnormalization_52[0][0]      
____________________________________________________________________________________________________
upsampling2d_12 (UpSampling2D)   (None, 640, 960, 16)  0           activation_52[0][0]              
____________________________________________________________________________________________________
merge_12 (Merge)                 (None, 640, 960, 24)  0           upsampling2d_12[0][0]            
                                                                   activation_38[0][0]              
____________________________________________________________________________________________________
convolution2d_55 (Convolution2D) (None, 640, 960, 8)   1736        merge_12[0][0]                   
____________________________________________________________________________________________________
batchnormalization_53 (BatchNorm (None, 640, 960, 8)   32          convolution2d_55[0][0]           
____________________________________________________________________________________________________
activation_53 (Activation)       (None, 640, 960, 8)   0           batchnormalization_53[0][0]      
____________________________________________________________________________________________________
convolution2d_56 (Convolution2D) (None, 640, 960, 8)   584         activation_53[0][0]              
____________________________________________________________________________________________________
batchnormalization_54 (BatchNorm (None, 640, 960, 8)   32          convolution2d_56[0][0]           
____________________________________________________________________________________________________
activation_54 (Activation)       (None, 640, 960, 8)   0           batchnormalization_54[0][0]      
____________________________________________________________________________________________________
convolution2d_57 (Convolution2D) (None, 640, 960, 1)   9           activation_54[0][0]              
====================================================================================================
Total params: 494,081
Trainable params: 492,609
Non-trainable params: 1,472
____________________________________________________________________________________________________
### Use previously trained weights. Set load_pretrained = False and increase the number of epochs to train from scratch.
load_pretrained = True
if load_pretrained == True:
    model.load_weights("model_detect_Unet_BN_5000.h5")        
"""
history = model.fit_generator(training_gen,
            samples_per_epoch=2000, 
                              nb_epoch=6)
"""                            
#model.save('model_detect_Unet_BN_5000.h5')
### Save weights 
#model.save_weights("model_segn_small_udacity_0p71.h5", overwrite=True)
#print(history.history.keys())
## summarize history for accuracy
#plt.plot(history.history['acc'])
#plt.plot(history.history['val_acc'])
#plt.title('model accuracy')
#plt.ylabel('accuracy')
#plt.xlabel('epoch')
#plt.legend(['train', 'test'], loc='upper left')
#plt.show()
## summarize history for loss
#plt.plot(history.history['loss'])
#plt.plot(history.history['val_loss'])
#plt.title('model loss')
#plt.ylabel('loss')
#plt.xlabel('epoch')
#plt.legend(['train', 'test'], loc='upper left')
#plt.show()
smooth = 1.
heatmap_prev = np.zeros((640,960))
heatmap_10 = [np.zeros((640,960))]*10
def smooth_heatmap(heatmap):
    # Credit: Vivek Yadav [12]
    # Smooth the heatmap as the average of the 10 previous frames
    global heatmap_10
    heatmap_10_1 = heatmap_10[1:]
    heatmap_10_1.append(heatmap)
    heatmap_10 = heatmap_10_1
    heatmap = np.mean(heatmap_10,axis=0)
    
    #heatmap = heatmap_prev*.2 + heatmap*.8
    #heatmap[heatmap>240] = 255
    #heatmap[heatmap<240] = 0
    return heatmap 
 

def next_img(img, resize=True):
    if resize == True:
        img = cv2.resize(img,(col,row))
        img = np.reshape(img,(1,row, col,3))
    pred = model.predict(img)
    return pred,img[0]

def draw_labeled_bboxes(img, labels):
    """Return image with bounding boxes drawn around the labelled regions.
    """
    for car_number in range(1, labels[1]+1):
        # Find pixels with each car_number label value
        nonzero = (labels[0] == car_number).nonzero()
        # Identify x and y values of those pixels
        nonzeroy = np.array(nonzero[0])
        nonzerox = np.array(nonzero[1])
        # Define a bounding box based on min/max x and y
        # Ignore regions smaller than 70 px in either dimension to reduce bounding-box noise
        if ((np.max(nonzeroy)-np.min(nonzeroy)>70) & (np.max(nonzerox)-np.min(nonzerox)>70)):
            bbox = ((np.min(nonzerox), np.min(nonzeroy)), (np.max(nonzerox), np.max(nonzeroy)))      
            cv2.rectangle(img, bbox[0], bbox[1], (0,0,255),6)
    # Return the image
    return img

def get_BB_new_img(img):
    # Take in RGB image
    pred,img = next_img(img)
    img  = np.array(img,dtype= np.uint8)
    img_pred = np.array(255*pred[0],dtype=np.uint8)
    heatmap = img_pred[:,:,0]
    # Label connected regions in the heatmap using scipy.ndimage.label
    labels = label(heatmap)
    draw_img = draw_labeled_bboxes(np.copy(img), labels)
    return draw_img

def get_labeled_bboxes(img, labels):
    # Get labeled boxes
    bbox_all = []
    for car_number in range(1, labels[1]+1):
        # Find pixels with each car_number label value
        nonzero = (labels[0] == car_number).nonzero()
        # Identify x and y values of those pixels
        nonzeroy = np.array(nonzero[0])
        nonzerox = np.array(nonzero[1])
        
        # Define a bounding box based on min/max x and y
        if ((np.max(nonzeroy)-np.min(nonzeroy)> 40) & (np.max(nonzerox)-np.min(nonzerox)> 40)):
            bbox = ((np.min(nonzerox), np.min(nonzeroy)), (np.max(nonzerox), np.max(nonzeroy)))
            # Draw the box on the image       
            #cv2.rectangle(img, bbox[0], bbox[1], (0,0,255),6)
            bbox_all.append(bbox)
    # Return the image
    return bbox_all

#credits Vivek Yadav
def get_BB_new(img):
    # Take in RGB image
    pred,img = next_img(img)
    img  = np.array(img,dtype= np.uint8)
    img_pred = np.array(255*pred[0],dtype=np.uint8)
    heatmap = img_pred[:,:,0]
    heatmap = smooth_heatmap(heatmap)
    #print(np.max(heatmap))
    heatmap[heatmap> 240] = 255
    heatmap[heatmap<=240] = 0    
    labels = label(heatmap)
    
    bbox_all = get_labeled_bboxes(np.copy(img), labels)
    return bbox_all

def get_Unet_mask(img):
    
    # Take in RGB image
    pred,img = next_img(img)
    img  = np.array(img,dtype= np.uint8)
    img_pred = np.array(255*pred[0],dtype=np.uint8)
    heatmap = img_pred[:,:,0]
    heatmap = smooth_heatmap(heatmap)
    labels = label(heatmap)
    return heatmap
start = time.time()
training_gen = generate_train_batch(cars_15_35GB,1)
batch_img,batch_mask = next(training_gen)
end = time.time()
end-start
0.209944486618042
start = time.time()
sample_imgs=1 
testing_gen = generate_test_batch(cars_15_35GB,sample_imgs)
pre_final_predictions= model.predict(batch_img)
end = time.time()
end-start
3.302438735961914
### Test on last frames of data
start = time.time()
batch_img,batch_mask = next(testing_gen)
pre_final_predictions= model.predict(batch_img)
np.shape(pre_final_predictions)
for i in range(sample_imgs):
    im=batch_img[i]
    pred,im = next_img(im)
    im  = np.array(im,dtype= np.uint8)
    im_pred = np.array(255*pred[0],dtype=np.uint8)
    im_mask = np.array(255*batch_mask[i],dtype=np.uint8)
    rgb_mask_true= cv2.cvtColor(im_mask,cv2.COLOR_GRAY2RGB)
    rgb_mask_true[:,:,0] = 0*rgb_mask_true[:,:,0]
    rgb_mask_true[:,:,2] = 0*rgb_mask_true[:,:,2]
    img_true = cv2.addWeighted(rgb_mask_true,0.70,im,0.70,0)
    rgb_mask_pred = cv2.cvtColor(im_pred,cv2.COLOR_GRAY2RGB)
    rgb_mask_pred[:,:,1:3] = 0*rgb_mask_pred[:,:,1:2]
    img_pred = cv2.addWeighted(rgb_mask_pred,0.70,im,1,0)
    draw_img = get_BB_new_img(im)
    plt.figure(figsize=(14,5))
    plt.subplot(1,4,1)
    plt.imshow(im)
    plt.title('Original')
    plt.axis('off')
    plt.subplot(1,4,2)
    plt.imshow(img_pred)
    plt.title('Segmented')
    plt.axis('off')
    plt.subplot(1,4,3)
    plt.imshow(draw_img)
    plt.title('Predicted')
    plt.axis('off')
    plt.subplot(1,4,4)
    plt.imshow(img_true)
    plt.title('Gtruth')
    plt.axis('off')

end = time.time()
end-start   
10.933016300201416

(Output: original, segmented, predicted, and ground-truth panels for a test frame.)

test_img = 'scene01021.jpg'
im = cv2.imread(test_img)
im = cv2.cvtColor(im,cv2.COLOR_BGR2RGB)
pred,im = next_img(im)
im  = np.array(im,dtype= np.uint8)
im_pred = np.array(255*pred[0],dtype=np.uint8)
rgb_mask_pred = cv2.cvtColor(im_pred,cv2.COLOR_GRAY2RGB)
rgb_mask_pred[:,:,1:3] = 0*rgb_mask_pred[:,:,1:2]

img_pred = cv2.addWeighted(rgb_mask_pred,0.70,im,1,0)

draw_img = get_BB_new_img(im)

plt.figure(figsize=(14,5))
plt.subplot(1,3,1)
plt.imshow(im)
plt.title('Original')
plt.axis('off')
plt.subplot(1,3,2)
plt.imshow(img_pred)
plt.title('Segmented')
plt.axis('off')
plt.subplot(1,3,3)
plt.imshow(draw_img)
plt.title('Bounding Box')
plt.axis('off');

(Output: original, segmented, and bounding-box panels for the test image.)

Performing camera calibration

Camera calibration is performed in order to correct the distortion in the images caused by the camera lens curvature. The first step is to print a chessboard and take pictures of it from different angles. The intersecting corners of the chessboard squares are then counted to define "objp", which holds the (x, y, z) coordinates of these corners. Z = 0 here, and the object points are the same for all images in the calibration folder. "objp" is appended to "objpoints" every time the method successfully detects all chessboard corners in a test image, and "imgpoints" is appended with the (x, y) pixel positions of the corners in the image plane for each successful detection. "objpoints" and "imgpoints" are then used to compute the camera matrix and distortion coefficients with "cv2.calibrateCamera()", which are applied to a test image with "cv2.undistort()".

# prepare object points. The number of corners are 6x9
objp = np.zeros((6*9,3), np.float32)
objp[:,:2] = np.mgrid[0:9, 0:6].T.reshape(-1,2)
# Arrays to store object points and image points from all the images.
objpoints = [] # 3d points in real world space
imgpoints = [] # 2d points in image plane.
# Make a list of calibration images, all located in camera_cal
images = glob.glob('camera_cal/calibration*.jpg')
# Step through the list and search for chessboard corners
for idx, fname in enumerate(images):
    img = cv2.imread(fname)
    # imread reads images in BGR format
    gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
    # Find the chessboard corners
    ret, corners = cv2.findChessboardCorners(gray, (9,6), None)
    # If found, add object points, image points
    if ret == True:
        objpoints.append(objp)
        imgpoints.append(corners)
        #Draw and display the corners
        cv2.drawChessboardCorners(img, (9,6), corners, ret)
        #write_name = 'corners_found'+str(idx)+'.jpg'
        #cv2.imwrite(write_name, img)
        #cv2.imshow('img', img)
        #cv2.waitKey(500)
#cv2.destroyAllWindows()

Perform distortion removal on test images

  1. Has the distortion correction been correctly applied to each image? Undistortion is performed on the provided test images before they are used in the pipeline; the same applies to the video frames. "dst" holds the undistorted frames returned by "cv2.undistort", computed using the camera matrix "mtx".
for image_name in glob.glob("Frames/*"):
    im = cv2.imread(image_name)
    im = cv2.cvtColor(im,cv2.COLOR_BGR2RGB)
    im = undistort(im,read=False, display = False, write = False)
    pred,im = next_img(im)
    im  = np.array(im,dtype= np.uint8)
    im_pred = np.array(255*pred[0],dtype=np.uint8)
    rgb_mask_pred = cv2.cvtColor(im_pred,cv2.COLOR_GRAY2RGB)
    rgb_mask_pred[:,:,1:3] = 0*rgb_mask_pred[:,:,1:2]
    img_pred = cv2.addWeighted(rgb_mask_pred,0.50,im,1,0)
    draw_img = get_BB_new_img(im)
    plt.figure(figsize=(14,5))
    plt.subplot(1,3,1)
    plt.imshow(im)
    plt.title('Original')
    plt.axis('off')
    plt.subplot(1,3,2)
    plt.imshow(img_pred)
    plt.title('Segmented')
    plt.axis('off')
    plt.subplot(1,3,3)
    plt.imshow(draw_img)
    plt.title('Bounding Box')
    plt.axis('off');

heatmap_10 = [np.zeros((640,960))]*10
    

(Output: original, segmented, and bounding-box panels for each undistorted test frame.)

def video_pipeline(image):
    #test_img = 'scene01021.jpg'
    #im = cv2.imread(im)
    #im = cv2.cvtColor(im,cv2.COLOR_BGR2RGB)
    #img = get_BB_new_img(im)
    # Apply bounding box to image
    image_bb = np.copy(image)
    bbox_cars = get_BB_new(image_bb)
    img_size = np.shape(image)
    result = image
    img_res_shape = result.shape
    for bbox in bbox_cars:
        cv2.rectangle(result,(np.int32(bbox[0][0]*img_res_shape[1]/960),np.int32(bbox[0][1]*img_res_shape[0]/640)), (np.int32(bbox[1][0]*img_res_shape[1]/960),np.int32(bbox[1][1]*img_res_shape[0]/640)),(0,255,0),6)
    #heatmap = get_Unet_mask(image_bb)
    #plt.imshow(img)
    return result
video_output = 'project_video_output.mp4'
clip1 = VideoFileClip("project_video.mp4")
white_clip = clip1.fl_image(video_pipeline) 
white_clip.write_videofile(video_output, audio=False)
[MoviePy] >>>> Building video project_video_output.mp4
[MoviePy] Writing video project_video_output.mp4


100%|█████████▉| 1260/1261 [1:13:58<00:04,  4.56s/it]


[MoviePy] Done.
[MoviePy] >>>> Video ready: project_video_output.mp4 
from IPython.display import HTML
HTML("""
<video width="640" height="360" controls>
  <source src="{0}">
</video>
""".format('project_video_output.mp4'))
