Visual Odometry is a crucial concept in robotics perception: it estimates the trajectory of a robot (more precisely, of the camera mounted on the robot). The concepts involved in Visual Odometry largely carry over to SLAM, which is itself an integral part of perception.
In this project we take the frames of a driving sequence captured by a camera mounted on a car, estimate the 3D motion of the camera using the pipeline developed here, and output a plot of the camera's trajectory. The dataset can be found in the data folder.
This project is under the BSD License.
This Visual Odometry package requires the following dependencies.
- Numpy
- random
- glob
- math
- matplotlib
- Python 3.5.4
- OpenCV 4.0.0
- The input images are provided in the Bayer format; we convert them to color images using the following command.
color_image = cv2.cvtColor(img, cv2.COLOR_BayerGR2BGR)
- We then extracted the camera calibration matrix K using the following command.
fx, fy, cx, cy, Gcameraimage, LUT = ReadCameraModel('./model')
- The images are undistorted using the following function in the extraction.py file.
Undistorted_image = UndistortImage(original_image,LUT)
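- Putting the three commands above together, a minimal sketch of the whole extraction step (the ./data frame location and .png extension are assumptions; ReadCameraModel and UndistortImage are the helper modules shipped with this repository):

```python
import glob
import os

import cv2

from ReadCameraModel import ReadCameraModel
from UndistortImage import UndistortImage

# Load the camera intrinsics and the undistortion look-up table once.
fx, fy, cx, cy, Gcameraimage, LUT = ReadCameraModel('./model')

os.makedirs('undistorted', exist_ok=True)

# './data/*.png' is an assumed frame location; adjust to the dataset layout.
for path in sorted(glob.glob('./data/*.png')):
    img = cv2.imread(path, cv2.IMREAD_UNCHANGED)            # raw Bayer frame
    color_image = cv2.cvtColor(img, cv2.COLOR_BayerGR2BGR)  # demosaic
    undistorted = UndistortImage(color_image, LUT)          # remove lens distortion
    cv2.imwrite(os.path.join('undistorted', os.path.basename(path)), undistorted)
```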
- We use Oriented FAST and Rotated BRIEF (ORB) to find features in the image, and BFMatcher to match those features between two images. The following commands in the code do that.
orb = cv2.ORB_create()
kp1, des1 = orb.detectAndCompute(img1,None)
kp2, des2 = orb.detectAndCompute(img2,None)
bf = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)
matches = bf.match(des1, des2)
- We then extract the points from the matches and store them in separate arrays. They are then forwarded to find the fundamental matrix.
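- A sketch of that step, using OpenCV's match indices (queryIdx indexes kp1, trainIdx indexes kp2):

```python
import numpy as np

# Sort so the strongest (lowest-distance) matches come first.
matches = sorted(matches, key=lambda m: m.distance)

# Collect the matched pixel coordinates into two N x 2 arrays.
point1 = np.float32([kp1[m.queryIdx].pt for m in matches])
point2 = np.float32([kp2[m.trainIdx].pt for m in matches])
```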
- For determining the correspondence of the key-points between two consecutive frames, we use the fundamental matrix (F). The fundamental matrix is a transformation that maps a key-point in the first image frame to its epipolar line in the next frame; the corresponding point lies on that epipolar line.
- The fundamental matrix can be determined for two consecutive frames if the corresponding key-points are given. These can be found by any feature detection algorithm (e.g. SIFT, FAST, ORB); for this project we use the ORB feature detector. These detectors are rotation and scale invariant, and provide feature description and feature matching.
- Now that we have the matching key-points, we use the 8-point algorithm to estimate the fundamental matrix.
- We know that a correct fundamental matrix will verify the correct correspondence of the feature points. Thus, the epipolar constraint x'ᵀFx = 0 will be satisfied, where F is the fundamental matrix, x is the feature point array in image 1, and x' is the corresponding point array in image 2.
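- For reference, a minimal sketch of the linear 8-point algorithm under that constraint; pts1 and pts2 are assumed to be N x 2 arrays of corresponding pixel coordinates (a production version would also normalize the coordinates first):

```python
import numpy as np

def eight_point(pts1, pts2):
    """Estimate F from at least 8 correspondences via the linear 8-point algorithm."""
    # One row of the constraint x2^T F x1 = 0 per correspondence.
    A = np.array([[x2 * x1, x2 * y1, x2,
                   y2 * x1, y2 * y1, y2,
                   x1, y1, 1.0]
                  for (x1, y1), (x2, y2) in zip(pts1, pts2)])

    # F is the null vector of A: the right singular vector with the
    # smallest singular value.
    _, _, Vt = np.linalg.svd(A)
    F = Vt[-1].reshape(3, 3)

    # Enforce the rank-2 constraint by zeroing the smallest singular value.
    U, S, Vt = np.linalg.svd(F)
    S[2] = 0.0
    return U @ np.diag(S) @ Vt
```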
- As the algorithm selects 8 points at random, there is a very high chance that some of the selected points are outliers, making the resultant F matrix erroneous.
- To improve the result, we apply RANSAC to reject the outliers.
- RANSAC is a common method of outlier rejection. The idea is to iteratively apply the 8-point algorithm on randomly selected points and to count the inliers under the generated fundamental matrix.
- Thus, by repeating this process a fixed number of times, we select the fundamental matrix that produces the highest number of inliers.
- The inliers are determined by the epipolar constraint above: for an inlier, |x'ᵀFx| is very close to zero, so it is compared against a threshold value.
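- A sketch of that loop, reusing the eight_point helper above; the iteration count and threshold here are illustrative, not the values used inside the project's findRANSAC:

```python
import numpy as np

def ransac_fundamental(pts1, pts2, n_iters=1000, threshold=0.01):
    """Keep the F that yields the most inliers over random 8-point samples."""
    best_F, best_inliers = None, np.zeros(len(pts1), dtype=bool)
    ones = np.ones((len(pts1), 1))
    x1 = np.hstack([pts1, ones])  # homogeneous coordinates, N x 3
    x2 = np.hstack([pts2, ones])

    for _ in range(n_iters):
        sample = np.random.choice(len(pts1), 8, replace=False)
        F = eight_point(pts1[sample], pts2[sample])

        # Algebraic epipolar error |x2^T F x1| for every correspondence.
        error = np.abs(np.sum(x2 * (x1 @ F.T), axis=1))
        inliers = error < threshold

        if inliers.sum() > best_inliers.sum():
            best_F, best_inliers = F, inliers

    return best_F, pts1[best_inliers], pts2[best_inliers]
```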
- This method looks as if it will generate an accurate fundamental matrix. The problem, however, is that it selects the 8 points for the 8-point algorithm completely at random.
- This may lead to repeated points, or to points selected from the same feature area. Hence, the resulting fundamental matrix is not accurate enough.
- To improve on this problem, we use the robust fundamental matrix estimation method by Zhang.
- In this method we divide the image into an 8×8 grid of cells.
- Then we randomly select a cell and randomly select a point from that cell, making sure that no cell is repeated while other options are available.
- After selecting the 8 points in this way, we run the 8-point algorithm with RANSAC (RANSAC selects its random points in the grid-wise manner described above).
- This method improves the resultant F matrix as it avoids the repeated point selection.
- This method is implemented through the command below.
fmatrix, inliersPoints1, inliersPoints2 = findRANSAC(point1.T, point2.T, w, h)
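- For reference, a hedged sketch of the grid-wise sampling idea (findRANSAC is assumed to do something along these lines internally; sample_eight_points_gridwise is a hypothetical helper, and w and h are the image width and height):

```python
import numpy as np

def sample_eight_points_gridwise(pts1, pts2, w, h, grid=8):
    """Pick 8 correspondences spread over distinct cells of a grid x grid partition."""
    # Assign every point in image 1 to a cell of the 8x8 grid.
    col = np.minimum((pts1[:, 0] * grid / w).astype(int), grid - 1)
    row = np.minimum((pts1[:, 1] * grid / h).astype(int), grid - 1)
    cell_of = row * grid + col

    occupied = np.unique(cell_of)
    # Prefer 8 distinct cells; repeat cells only if fewer than 8 are occupied.
    chosen = np.random.choice(occupied, 8, replace=len(occupied) < 8)

    idx = [np.random.choice(np.flatnonzero(cell_of == c)) for c in chosen]
    return pts1[idx], pts2[idx]
```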
- The essential matrix E is calculated from the fundamental matrix and the calibration matrix extracted in step 1, as follows.
E = KᵀFK
- The essential matrix is computed through the following command.
E1 = findEssentialMatrix(fmatrix, calibrationMatrix, calibrationMatrix.T)
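- A minimal sketch of that computation (find_essential_matrix here is a hypothetical helper, not the repository's findEssentialMatrix); re-projecting the singular values to (1, 1, 0) accounts for the noise in F:

```python
import numpy as np

def find_essential_matrix(F, K):
    """E = K^T F K, with singular values forced to (1, 1, 0)."""
    E = K.T @ F @ K
    U, _, Vt = np.linalg.svd(E)
    return U @ np.diag([1.0, 1.0, 0.0]) @ Vt
```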
- The camera poses are calculated from the singular value decomposition of the essential matrix, which yields four candidate (C, R) pairs. Whenever a candidate rotation has det(R) = −1, the pose is corrected with
C = −C and R = −R
- The camera poses are obtained by implementing the following command.
C1, R1, C2, R2, C3, R3, C4, R4 = getCameraPose(E1)
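- A sketch of the standard decomposition behind this step, assuming E already has singular values (1, 1, 0):

```python
import numpy as np

def get_camera_poses(E):
    """Decompose E into the four candidate (C, R) pairs."""
    U, _, Vt = np.linalg.svd(E)
    W = np.array([[0.0, -1.0, 0.0],
                  [1.0,  0.0, 0.0],
                  [0.0,  0.0, 1.0]])

    poses = []
    for R, C in [(U @ W @ Vt, U[:, 2]), (U @ W @ Vt, -U[:, 2]),
                 (U @ W.T @ Vt, U[:, 2]), (U @ W.T @ Vt, -U[:, 2])]:
        # A valid rotation needs det(R) = +1; otherwise negate both C and R.
        if np.linalg.det(R) < 0:
            R, C = -R, -C
        poses.append((C, R))
    return poses
```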
- Linear triangulation is done by first calculating the projection matrices for the two camera images using all four sets of camera poses.
- We then take the correspondence points, calculate their skew matrices, and multiply them with the corresponding projection matrices to obtain the matrices A1 and A2, which we stack into a single matrix and decompose via SVD.
- X is the last column of V, which we divide by its 4th element to convert it into homogeneous form, and then store.
- The cheirality check is done for all corresponding inliers, and the R and C matrices that give the maximum number of positives for the following equation (where r₃ is the third row of R) are chosen.
r₃(X − C) > 0
- The triangulation is implemented in the function named linearTriangulate.
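- A condensed sketch of the triangulation and the cheirality count (P1 and P2 are assumed to be the 3 x 4 projection matrices, and pt1 and pt2 one pixel correspondence):

```python
import numpy as np

def skew(p):
    """Cross-product (skew-symmetric) matrix of a homogeneous 2D point."""
    x, y, w = p
    return np.array([[0.0, -w, y],
                     [w, 0.0, -x],
                     [-y, x, 0.0]])

def triangulate(P1, P2, pt1, pt2):
    """Linear triangulation: stack [pt1]_x P1 on [pt2]_x P2, take the null vector."""
    A = np.vstack([skew(np.append(pt1, 1.0)) @ P1,
                   skew(np.append(pt2, 1.0)) @ P2])
    _, _, Vt = np.linalg.svd(A)
    X = Vt[-1]
    return X / X[3]  # divide by the 4th element: homogeneous form (x, y, z, 1)

def count_cheirality(C, R, points_3d):
    """Count triangulated points in front of the camera: r3 (X - C) > 0."""
    r3 = R[2, :]  # third row of R
    return sum(r3 @ (X[:3] - C) > 0 for X in points_3d)
```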
- Due to noise in the images, the final camera poses we get are noisy, and hence need to be treated with some heuristic checks.
- We apply 5 heuristic checks, which are discussed in this section.
- The first check is that the Z coordinate of the pose should be positive, since the car is always moving forward.
- The second check validates the elements of the rotation matrices.
- The third check corrects the noisy readings in the y direction.
- The fourth and fifth checks restrict the rotation to be only about the Y-axis.
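- As an illustration only, the checks have roughly this shape; the actual logic and thresholds live in visualOdometry.py, and the values below are made-up placeholders:

```python
def passes_heuristics(C, R):
    """Illustrative pose sanity checks; thresholds are placeholders."""
    # Check 1: the car drives forward, so the Z component of the pose
    # must be positive.
    if C[2] <= 0:
        return False
    # Checks 4 and 5: rotation should be almost purely about the Y-axis,
    # so the off-axis terms of R must stay small.
    if abs(R[0, 1]) > 0.1 or abs(R[1, 0]) > 0.1:
        return False
    return True
```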
- To find the translation from the origin, we first multiply the transpose of the rotation matrix obtained from disambiguating with the translation vector. We then multiply the negative of the accumulated rotation matrix from the origin with the result, and add it to the previous translation to get the total distance travelled by the camera.
- The accumulated rotation matrix is simply the previous rotation matrix multiplied by the new rotation matrix obtained after disambiguating.
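- A sketch of this accumulation as we read the description above (pose_per_frame is a hypothetical list of the disambiguated (C, R) pairs, one per frame pair):

```python
import numpy as np

R_cum = np.eye(3)    # rotation from the first frame to the current one
t_cum = np.zeros(3)  # camera position in the first frame's coordinates
trajectory = []

for C, R in pose_per_frame:
    # R.T @ C, rotated by the accumulated rotation and negated, is added
    # onto the previous translation, as described above.
    t_cum = t_cum + R_cum @ (-(R.T @ C))
    R_cum = R_cum @ R
    trajectory.append(t_cum.copy())
```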
- Our custom function performs the same steps, but distributed across various functions.
- Here the trajectory is found without knowing the scale. The scale can only be recovered if the output is fused with other sensors, such as a wheel odometer or an inertial device.
Download the dependencies and follow the commands below to run the code.
- Open the extraction.py code in an editor or terminal
- Type "python extraction.py" in the terminal or press run in an editor
- The code will write all the undistorted images to the undistorted folder
- Then open the visualOdometry.py code in an editor or terminal
- Type "python visualOdometry.py" in the terminal or press run in an editor
- This code will write all the outputs to the output folder. We have also provided a video of our code's output for your reference.
The output will be written to the folder alongside the code. We have also provided sample outputs.
The above pipeline is developed specifically for the given videos under the data folder and is not generalized. We have provided the following folders and files:
- output: Here the output plots will be stored
- undistorted: Contains the undistorted images
- extraction.py: Contains code for reading and undistorting the images
- ReadCameraModel.py: Contains code for getting camera parameters
- UndistortImage.py: Contains function that undistorts image
- visualOdometry.py: This file runs the visual odometry code using the undistorted images.