=====================================
This research is based on the supercombo model by comma.ai, commit 83112b47-3b23-48e4-b65b-8c058766f4c1/100.
The model has 4 inputs:
- An image of size (12,128,256)
- A one-hot encoded input of length 8
- rnn_state, or the vehicle state from openpilot, of length 512
- traffic_convention of length 2
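As a quick reference, the four inputs listed above can be mocked up with NumPy for shape checking. This is only a sketch: the leading batch dimension of 1, the `float32` dtype and the dictionary keys `image` and `one_hot` are assumptions made for illustration, not names taken from the model.

```python
import numpy as np

# Per-sample shapes as listed above. The keys "image" and "one_hot" are
# hypothetical names; "rnn_state" and "traffic_convention" come from the list.
INPUT_SHAPES = {
    "image": (12, 128, 256),
    "one_hot": (8,),
    "rnn_state": (512,),
    "traffic_convention": (2,),
}


def make_dummy_inputs(one_hot_index=0):
    """Return zero-filled inputs with an assumed batch dimension of 1."""
    inputs = {
        name: np.zeros((1, *shape), dtype=np.float32)
        for name, shape in INPUT_SHAPES.items()
    }
    # Set exactly one entry of the one-hot input.
    inputs["one_hot"][0, one_hot_index] = 1.0
    return inputs


dummy = make_dummy_inputs(one_hot_index=3)
for name, array in dummy.items():
    print(name, array.shape)
```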
The image input is first converted from BGR to YUV_I420, which stores the Y, U and V planes in a single 2-D array, so the image loses its channel dimension. The image is then transformed from the eon_intrinsic frame to the medmodel_intrinsic frame, and a copy is saved for the recurrent network. The transformed images are stacked together to form a (1,12,128,256) tensor.
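One way the (1,12,128,256) shape can arise is sketched below. It assumes a transformed frame of 512x256 pixels and that two consecutive frames are stacked, each split into 6 half-resolution planes (four subsampled Y planes plus U and V). Neither the frame size nor the two-frame stacking is stated above, and the function names are hypothetical, so treat this as an illustration of the shapes rather than the actual preprocessing code.

```python
import numpy as np

HEIGHT, WIDTH = 256, 512  # assumed size of the transformed frame


def i420_to_planes(frame):
    """Split a packed I420 frame (HEIGHT * 3 // 2, WIDTH) into 6 planes."""
    assert frame.shape == (HEIGHT * 3 // 2, WIDTH)
    y = frame[:HEIGHT, :]
    # U and V each hold (HEIGHT // 2) x (WIDTH // 2) samples, stored after Y.
    uv = frame[HEIGHT:, :].reshape(2, HEIGHT // 2, WIDTH // 2)
    return np.stack([
        y[0::2, 0::2],  # Y, even rows, even columns
        y[1::2, 0::2],  # Y, odd rows, even columns
        y[0::2, 1::2],  # Y, even rows, odd columns
        y[1::2, 1::2],  # Y, odd rows, odd columns
        uv[0],          # U plane
        uv[1],          # V plane
    ])


def build_model_input(prev_frame, curr_frame):
    """Stack the planes of two frames and add a batch dimension."""
    planes = np.concatenate(
        [i420_to_planes(prev_frame), i420_to_planes(curr_frame)]
    )
    return planes[np.newaxis].astype(np.float32)


prev = np.zeros((HEIGHT * 3 // 2, WIDTH), dtype=np.uint8)
curr = np.full((HEIGHT * 3 // 2, WIDTH), 128, dtype=np.uint8)
tensor = build_model_input(prev, curr)
print(tensor.shape)  # (1, 12, 128, 256)
```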
The model has 11 outputs:
- path: 192 points spaced 1 m apart, in the car reference frame
- left lane: 192 points spaced 1 m apart, in the car reference frame
- right lane: 192 points spaced 1 m apart, in the car reference frame
- lead: state vector of the lead car, of length 58
- longitudinal_x of length 200
- longitudinal_v of length 200
- longitudinal_a of length 200
- meta of length 4
- snpe_pleaser2 of length 4
- pose of length 32, used by posenet to obtain the homogeneous transformation matrix between the device frame (attached to the camera) and the vehicle frame; see the notes on calibration
- add3 (purpose currently unknown)
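For the pose output listed above, the layout of its 32 values is not documented here, so the following is only a generic sketch of what a homogeneous transformation matrix is: a 4x4 matrix that combines a rotation and a translation between two frames. The roll/pitch/yaw parameterisation, the Rz(yaw) * Ry(pitch) * Rx(roll) convention and all function names are assumptions chosen for illustration.

```python
import math


def rotation_from_euler(roll, pitch, yaw):
    """3x3 rotation matrix for Rz(yaw) * Ry(pitch) * Rx(roll), angles in radians."""
    cr, sr = math.cos(roll), math.sin(roll)
    cp, sp = math.cos(pitch), math.sin(pitch)
    cy, sy = math.cos(yaw), math.sin(yaw)
    return [
        [cy * cp, cy * sp * sr - sy * cr, cy * sp * cr + sy * sr],
        [sy * cp, sy * sp * sr + cy * cr, sy * sp * cr - cy * sr],
        [-sp, cp * sr, cp * cr],
    ]


def homogeneous_transform(rotation, translation):
    """Combine a 3x3 rotation and a 3-vector translation into a 4x4 matrix."""
    rows = [list(rotation[i]) + [translation[i]] for i in range(3)]
    rows.append([0.0, 0.0, 0.0, 1.0])
    return rows


def transform_point(matrix, point):
    """Apply a 4x4 homogeneous transform to a 3-D point."""
    vec = [point[0], point[1], point[2], 1.0]
    out = [sum(m * v for m, v in zip(row, vec)) for row in matrix]
    return out[:3]


# Example: rotate 90 degrees about the z axis, then shift 1 m along x.
T = homogeneous_transform(
    rotation_from_euler(0.0, 0.0, math.pi / 2), [1.0, 0.0, 0.0]
)
p = transform_point(T, [1.0, 0.0, 0.0])
print([round(c, 6) for c in p])  # approximately [1.0, 1.0, 0.0]
```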
Under this objective, the designed model must achieve high accuracy in lane feature extraction. It must then support transfer learning to other critical tasks, such as lead vehicle state vector estimation and driver state monitoring. The model must also include a recurrent unit.
A comparison of EfficientNet with other state-of-the-art models is shown below:
In November 2019, Andrej Karpathy, the Senior Director of Artificial Intelligence at Tesla, mentioned at the PyTorch DevCon that Autopilot has most of its models based on a ResNet50. Openpilot by comma.ai used ResNet18 for quite a while before recently switching to EfficientNet-B2.
EfficientNets are a family of models optimised for both accuracy and floating-point operations (FLOPs). The baseline model, EfficientNet-B0, was developed using a multi-objective neural architecture search and is then scaled up to obtain the larger variants. The paper demonstrates that EfficientNet-B7 achieves state-of-the-art 84.3% top-1 accuracy on ImageNet while being 8.4x smaller and 6.1x faster at inference than the best existing ConvNet. EfficientNets also transfer well, achieving state-of-the-art accuracy on CIFAR-100 (91.7%), Flowers (98.8%), and 3 other transfer learning datasets with an order of magnitude fewer parameters. For these reasons, EfficientNet is chosen for its efficiency and its broad applicability through transfer learning.
The choice among EfficientNet-B2, B3 and B4 is still under consideration, given the hardware constraints.
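To make the hardware trade-off concrete, here is a minimal sketch that picks the largest candidate fitting a parameter and FLOP budget. The parameter and FLOP figures are those reported in the EfficientNet paper and should be checked against it; the budget values and the function name `largest_variant_within` are hypothetical and do not reflect any measured constraint of the target device.

```python
# (parameters in millions, FLOPs in billions), as reported in the
# EfficientNet paper; verify against the paper before relying on them.
VARIANTS = {
    "EfficientNet-B2": (9.2, 1.0),
    "EfficientNet-B3": (12.0, 1.8),
    "EfficientNet-B4": (19.0, 4.2),
}


def largest_variant_within(max_params_m, max_flops_b):
    """Return the highest-FLOP variant that fits both budgets, or None."""
    fitting = [
        (flops, name)
        for name, (params, flops) in VARIANTS.items()
        if params <= max_params_m and flops <= max_flops_b
    ]
    return max(fitting)[1] if fitting else None


# Hypothetical budget: 15 M parameters and 2 B FLOPs per inference.
print(largest_variant_within(max_params_m=15.0, max_flops_b=2.0))  # EfficientNet-B3
```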
It is recommended to use Python 3.6 or above.
- Install all the necessary dependencies.
pip3 install -r requirements.txt
- Run the program
python3 dynamic_lane.py <path-to-sample-hevc>
- Update the requirements.txt
- Train a simple model which can do path prediction, driver monitoring and lead car state vector estimation.
- path_predict_simple_efficientnet. Input :