The meaning of several properties in dataset #2
Hi @kathy-lee. To make sense of those objects, I will first go over the overall architecture (sorry if this is redundant; I may be explaining things you already know). As described in the paper, VoxelNet is made of three parts: the Feature Learning Network, which takes the voxelized point cloud as input; the convolutional middle layers; and the Region Proposal Network (RPN), which is the last part. The RPN outputs two tensors: the regression map and the probability map.
Why are they called maps? Because for a given point cloud VoxelNet lays its outputs over the bird's-eye-view grid: each cell of a map corresponds to an (X, Y) position in the point cloud space.
We have already seen why the first two dimensions of each map are X and Y (bird's-eye view). Before going further, recall that anchors are the candidate bounding boxes we parameterize over the point cloud grid. The real bounding boxes will of course not match any anchor perfectly, but well-chosen anchors guarantee that every real box finds a strong (very close) match among them. We define W*H anchor positions, and each position carries two orientations (0° and 90° around the Z axis); the authors chose this for simplicity, which gives a total of W*H*2 anchors.

For the regression map, VoxelNet outputs 2 bounding boxes per cell, which is why the last dimension is 14 (2 boxes × 7 parameters each). They are boxes at the same cell but with different orientations (this is why I wanted to recall how the anchors are defined). For the probability map, we likewise output 2 probabilities per cell, each one the chance that a bounding box centered at that cell, with one of the 2 orientations, contains an object.

PS: VoxelNet does not work like object detectors such as YOLO, which learn to detect multiple object classes at once. When you build and train a VoxelNet, it is for a specific SINGLE object class.

Now that all of this has been explained, I can actually answer your question: the objects you pointed at can be considered (for some of them) as masks during training.
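To make the shapes concrete, here is a minimal NumPy sketch of the two output maps and the training masks. The grid size (176 × 200), the variable names, and the exact mask semantics are assumptions for illustration, not the repo's actual constants or code:

```python
import numpy as np

# Hypothetical bird's-eye-view grid size (not the repo's actual constants).
W, H = 176, 200          # feature-map cells along X and Y
ANCHORS_PER_CELL = 2     # one anchor at 0 deg, one at 90 deg
BOX_PARAMS = 7           # (x, y, z, h, w, l, yaw) per bounding box

# Regression map: 2 boxes per cell, 7 parameters each -> last dim is 14.
regression_map = np.zeros((W, H, ANCHORS_PER_CELL * BOX_PARAMS))

# Probability map: one objectness score per anchor orientation per cell.
probability_map = np.zeros((W, H, ANCHORS_PER_CELL))

# Training masks (assumed semantics): pos_equal_one marks anchors matched
# to a ground-truth box, neg_equal_one marks clear negatives; both share
# the probability map's shape, so the loss can be restricted per anchor.
pos_equal_one = np.zeros((W, H, ANCHORS_PER_CELL))
neg_equal_one = np.zeros((W, H, ANCHORS_PER_CELL))

print(regression_map.shape)   # (176, 200, 14)
print(probability_map.shape)  # (176, 200, 2)
```

The last dimension of 14 is just the two per-cell anchors' 7 box parameters laid side by side.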
I hope that answers your question ;)
Also, LIDAR_COORD is a shift that places the point clouds in the coordinate system we work in (for a better view of the cars or the pedestrians, I guess); it is not a transform from camera to lidar coordinates. You will find the methods that actually do that in the utils script.
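A coordinate shift of this kind is just a fixed translation added to every point. Here is a sketch; the value of `LIDAR_COORD` and the helper name are assumptions, not taken from the repo:

```python
import numpy as np

# Hypothetical offset; the real value lives in the repo's config.
LIDAR_COORD = np.array([0.0, 40.0, 3.0])  # (x, y, z) shift, assumed

def shift_pointcloud(points: np.ndarray) -> np.ndarray:
    """Translate (N, 4) lidar points (x, y, z, reflectance) into the
    working coordinate system by adding a fixed offset to x, y, z."""
    shifted = points.copy()
    shifted[:, :3] += LIDAR_COORD
    return shifted

cloud = np.array([[10.0, -40.0, -3.0, 0.5]])
out = shift_pointcloud(cloud)  # x,y,z shifted; reflectance unchanged
```

Note this is a pure translation, not the camera-to-lidar calibration transform, which also involves a rotation.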
Hi,
thanks a lot for sharing your VoxelNet code! I am not clear about several properties of the dataset. Could you explain a bit these six properties generated in data.py: "pos_equal_one", "neg_equal_one", "targets", "pos_equal_one_reg", "pos_equal_one_sum", "neg_equal_one_sum"? Many thanks in advance!