
Unusual Z-Value Predictions and Multiple Predictions per Class on Custom Dataset #22

Shromm-Gaind opened this issue Feb 13, 2024 · 7 comments

@Shromm-Gaind

Shromm-Gaind commented Feb 13, 2024

Prerequisites

  • Searched through existing Issues and Discussions, including related GitHub issues, without finding a solution.

  • Experimented with adjusting score_thr to improve visualization outcomes.

  • Confirmed that the unexpected results are not due to errors in the training dataset.

Issue Description:
While training a model on a custom dataset, I am encountering two issues: unrealistic prediction values and multiple predictions for the same object class. Specifically, some predicted bounding boxes have highly unrealistic z-values (e.g., a maximum z-value of 4000 meters), and single objects receive several predictions of the same class. I am trying to understand how the network generates these predictions and what might be causing such unusual results.

Environment

  • Docker environment built from a Dockerfile.
  • GPU: NVIDIA GeForce RTX 3090.

Dataset and Model Performance:

  • Custom dataset with 26 classes, comprising 2000 point clouds each for training and validation.
  • Achieved mAP@50 of 0.76 and mAP@25 of 0.87.

Issue Details
During visualization tests, I observed some bounding boxes with unrealistic z-values. For example, a generated text file in the format '{label} {x_min} {y_min} {z_min} {x_max} {y_max} {z_max}' contains z-values that are significantly off, such as (an example from the dataset):

(Link to the problematic bounding box file: 1840_boxes.txt)

6 0.7325 0.0170 -8.6887 0.8170 0.1027 11.7481
6 0.4817 -0.1613 -321.8918 0.5632 -0.0771 325.0209
6 0.6177 -0.0779 -9.3888 0.7026 0.0087 12.4312

In particular, class ID 6 (mAP@50 of 0.50) and class ID 7 (mAP@50 of 0.07) produced notably inaccurate predictions. While I understand that lower mAP scores might lead to poorer predictions, adjusting score_thr to 0.6 for the example above made no difference to the predicted bounding boxes; it did, however, decrease the overall mAP@50 to 0.73.

Questions:

  1. How does the network produce such unrealistic z-value predictions?
  2. How could I filter out these bad predictions?

Steps Taken
I have successfully created and trained the model on my custom dataset following the mmdetection3d documentation. The anomaly was detected during post-training visualization of the detection results.
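
To make question 2 concrete, this is roughly the kind of post-hoc filter I have in mind for the exported box files (the 2 m size limit below is a placeholder, not a value from my pipeline):

```python
# Rough post-hoc filter over an exported
# '{label} {x_min} {y_min} {z_min} {x_max} {y_max} {z_max}' file.
# The 2 m extent limit is a placeholder, not a value from the actual pipeline.
MAX_EXTENT_M = 2.0

def filter_boxes(path: str, max_extent: float = MAX_EXTENT_M) -> list[list[float]]:
    """Keep only boxes whose extent along every axis is plausible."""
    kept = []
    with open(path) as f:
        for line in f:
            if not line.strip():
                continue
            label, x_min, y_min, z_min, x_max, y_max, z_max = map(float, line.split())
            extents = (x_max - x_min, y_max - y_min, z_max - z_min)
            if all(0.0 < e <= max_extent for e in extents):
                kept.append([int(label), x_min, y_min, z_min, x_max, y_max, z_max])
    return kept

print(filter_boxes("1840_boxes.txt"))  # drops the box spanning ~647 m along z
```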

@filaPro
Contributor

filaPro commented Feb 13, 2024

Achieved mAP@50 of 0.76 and mAP@25 of 0.87

Looks like the metrics are very good. So these large boxes should have low scores. What scores do they have?

@Shromm-Gaind
Author

Clarification and Results Update:

I would like to clarify the performance metrics for class ID 6 and class ID 7. I had mistakenly referred to the metrics as mAP when I intended to say AP (Average Precision). Additionally, I found that by adjusting the score_thr to 0.7, I was able to filter out some of the inaccurate predictions mentioned above, at the cost of overall mAP.

Updated Performance Metrics:

Below are the detailed results for each class:
+----------------+---------+---------+---------+---------+
| classes        | AP_0.25 | AR_0.25 | AP_0.50 | AR_0.50 |
+----------------+---------+---------+---------+---------+
| Sitting        | 0.9978  | 0.9979  | 0.9869  | 0.9918  |
| Snout          | 0.9961  | 0.9964  | 0.9662  | 0.9704  |
| Neck           | 0.9027  | 0.9495  | 0.5060  | 0.6835  |
| Base left ear  | 0.0006  | 0.0409  | 0.0003  | 0.0288  |
| Tip left ear   | 0.9663  | 0.9796  | 0.7737  | 0.8437  |
| Left shoulder  | 0.9912  | 0.9940  | 0.9840  | 0.9880  |
| Left elbow     | 0.9711  | 0.9929  | 0.8876  | 0.9365  |
| Left hand      | 0.9878  | 0.9932  | 0.9597  | 0.9694  |
| Right hand     | 0.9784  | 0.9812  | 0.9430  | 0.9471  |
| Left flank     | 0.7280  | 0.8568  | 0.2968  | 0.5339  |
| Left hip       | 0.9896  | 0.9921  | 0.9523  | 0.9615  |
| Left knee      | 0.9594  | 0.9708  | 0.9033  | 0.9248  |
| Left foot      | 0.9766  | 0.9815  | 0.9305  | 0.9372  |
| Base tail      | 0.9162  | 0.9727  | 0.5688  | 0.7575  |
| Tip tail       | 0.8337  | 0.8668  | 0.4773  | 0.6197  |
| Base right ear | 0.0000  | 0.0000  | 0.0000  | 0.0000  |
| Tip right ear  | 0.9664  | 0.9734  | 0.7627  | 0.8299  |
| Right shoulder | 0.9858  | 0.9878  | 0.9496  | 0.9619  |
| Right hip      | 0.9911  | 0.9926  | 0.9727  | 0.9763  |
| Right knee     | 0.9845  | 0.9921  | 0.9459  | 0.9597  |
| Right foot     | 0.9743  | 0.9761  | 0.9389  | 0.9427  |
| Sternally      | 0.9478  | 0.9490  | 0.8346  | 0.8673  |
| Right elbow    | 0.9743  | 0.9828  | 0.9320  | 0.9398  |
| Right flank    | 0.8306  | 0.9115  | 0.5006  | 0.6609  |
| Laterally      | 0.9799  | 0.9838  | 0.9469  | 0.9575  |
| Standing       | 0.9980  | 0.9980  | 0.9364  | 0.9519  |
+----------------+---------+---------+---------+---------+
| Overall        | 0.8780  | 0.8967  | 0.7637  | 0.8131  |
+----------------+---------+---------+---------+---------+

@Shromm-Gaind
Author

Shromm-Gaind commented Feb 13, 2024

Here is the defined order of classes within our dataset:

CLASSES = ('Standing', 'Sitting', 'Sternally', 'Laterally', 'Snout', 'Neck', 'Base left ear', 'Base right ear', 'Tip left ear', 'Tip right ear', 'Left shoulder', 'Right shoulder', 'Left elbow', 'Right elbow', 'Left hand', 'Right hand', 'Left flank', 'Right flank', 'Left hip', 'Right hip', 'Left knee', 'Right knee', 'Left foot', 'Right foot', 'Base tail', 'Tip tail')

The specific examples I mentioned earlier, with unrealistic z-axis values in the bounding boxes, were related to the classes with ID 6 ("Base right ear") and ID 7 ("Base left ear"). Note that these results were obtained using a score_thr of 0.3.

By adjusting the score_thr to 0.7, I was able to filter out some of the inaccurate predictions for these classes, but it dropped my overall metric to 0.2152 mAP@50.

@filaPro
Contributor

filaPro commented Feb 13, 2024

I see you have great metrics on all classes except Base right ear and Base left ear. Do you have them in your training data? Are their gt boxes valid? I simply recommend removing these 2 classes from validation, as they have zero accuracy. This will solve your problem even without changing score_thr. Or you can increase score_thr only for these 2 classes.
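
For example, something like this applied to the predictions after inference (the class ids and thresholds here are only an illustration, not values from your config):

```python
import numpy as np

# Illustrative per-class score filtering applied after inference;
# the class ids (6, 7) and thresholds are placeholders.
per_class_score_thr = {6: 0.7, 7: 0.7}
default_score_thr = 0.3

def keep_mask(labels: np.ndarray, scores: np.ndarray) -> np.ndarray:
    """Boolean mask keeping predictions that pass their class-specific threshold."""
    thresholds = np.array(
        [per_class_score_thr.get(int(label), default_score_thr) for label in labels]
    )
    return scores >= thresholds

labels = np.array([6, 24, 7])
scores = np.array([0.35, 0.91, 0.80])
print(keep_mask(labels, scores))  # [False  True  True]
```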

@Shromm-Gaind
Author

I do have them in my training data; I had thought of this and will look through my data more carefully. What would you suggest for multiple predictions of one class, as shown in this example?
Right foot, despite having quite a high metric, has three predicted bounding boxes. Would the fix just be to increase the score_thr?
24 -0.06998 -0.29320 1.23844 0.02683 -0.19450 1.33211
24 -0.21474 -0.31352 1.28769 -0.03122 -0.12543 1.42425
24 -0.03547 -0.34199 1.29211 0.12538 -0.17522 1.41214
259_boxes.txt

@filaPro
Contributor

filaPro commented Feb 13, 2024

Try playing with iou_thr in your config. This is the NMS parameter that controls how much two boxes must intersect to be recognized as duplicates.
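
For intuition, a simplified greedy NMS over axis-aligned boxes (not the exact implementation in this repo) shows the role of iou_thr:

```python
import numpy as np

def iou_3d(a: np.ndarray, b: np.ndarray) -> float:
    """IoU of two axis-aligned boxes given as (x_min, y_min, z_min, x_max, y_max, z_max)."""
    lo = np.maximum(a[:3], b[:3])
    hi = np.minimum(a[3:], b[3:])
    inter = np.prod(np.clip(hi - lo, 0.0, None))
    union = np.prod(a[3:] - a[:3]) + np.prod(b[3:] - b[:3]) - inter
    return float(inter / union)

def nms_3d(boxes: np.ndarray, scores: np.ndarray, iou_thr: float) -> list[int]:
    """Greedy NMS: keep the best-scoring box, drop boxes overlapping it above iou_thr."""
    order = list(scores.argsort()[::-1])
    keep = []
    while order:
        best = order.pop(0)
        keep.append(int(best))
        order = [i for i in order if iou_3d(boxes[best], boxes[i]) < iou_thr]
    return keep
```

Lowering iou_thr makes the suppression more aggressive, so heavily overlapping duplicates like the three Right foot boxes above collapse into the single highest-scoring one.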

@Shromm-Gaind
Author

That did fix my problem, thanks a lot.
