I observed that the ground-truth box for an individual keypoint is calculated as 0.05 × max(img_width, img_height), which means the width and height of every keypoint box are essentially arbitrary. For example, an eye keypoint in a 640x640 image always gets a 32x32 box, regardless of how big or small the eye actually is.
But you still use an IoU loss to penalize the model. Why not penalize the model based only on the difference between the predicted center and the real center of the keypoint, and ignore the predicted W & H of the keypoint box in the loss computation (only for keypoint objects)?
I know there is another L2 loss that penalizes based on keypoint centers. My point is that penalizing the model for predicting the "wrong" W & H of a keypoint box seems unreasonable, given that the ground-truth W & H (for keypoints) are inconsistent.
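To make the concern concrete, here is a minimal sketch (not the repository's actual code; function names and the loss comparison are illustrative) of the fixed-size pseudo-box construction described above, and of why an IoU loss punishes a prediction whose center is exactly right but whose size differs from the arbitrary 0.05-scale box, while a center-only loss would not:

```python
import math

# Fraction of the longer image side used for the keypoint pseudo-box,
# as described in the issue (hypothetical constant name).
KPT_BOX_SCALE = 0.05

def keypoint_gt_box(cx, cy, img_w, img_h, scale=KPT_BOX_SCALE):
    """Return (x1, y1, x2, y2) for a keypoint's pseudo ground-truth box."""
    side = scale * max(img_w, img_h)
    half = side / 2.0
    return (cx - half, cy - half, cx + half, cy + half)

def iou(a, b):
    """Standard intersection-over-union of two (x1, y1, x2, y2) boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)

def center_loss(pred_box, gt_box):
    """Proposed alternative: penalize only the center offset, ignore W & H."""
    pcx, pcy = (pred_box[0] + pred_box[2]) / 2, (pred_box[1] + pred_box[3]) / 2
    gcx, gcy = (gt_box[0] + gt_box[2]) / 2, (gt_box[1] + gt_box[3]) / 2
    return math.hypot(pcx - gcx, pcy - gcy)

# In a 640x640 image the pseudo-box is always 32x32, whatever the eye's size.
gt = keypoint_gt_box(100, 100, 640, 640)
assert gt == (84.0, 84.0, 116.0, 116.0)

# A perfectly centered prediction with a different (possibly correct!) size
# still incurs IoU loss, but incurs zero center-only loss.
pred = (92.0, 92.0, 108.0, 108.0)  # 16x16 box, same center as gt
assert 1.0 - iou(pred, gt) > 0.0     # IoU loss is nonzero
assert center_loss(pred, gt) == 0.0  # center-only loss is zero
```

Under this reading, the IoU term gives the model gradient signal toward a W & H target that carries no information about the actual keypoint, which is the crux of the question.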