
results of the best checkpoint are different between training and evaluation #15

linhaojia13 opened this issue May 8, 2023 · 1 comment

@linhaojia13

At the end of the training log, the results of the best checkpoint are:

training completed...

--------------------------------------best--------------------------------------
[best] epoch: 25
[loss] loss: 44.52341
[loss] ref_loss: 16.75534
[loss] ref_mask_loss: 0.0
[loss] lang_cls_loss: 0.22115
[loss] objectness_loss: 0.33091
[loss] kps_loss: 0.0285
[loss] box_loss: 2.68898
[loss] sem_cls_loss: 5.56197
[loss] lang_cls_acc: 0.93388
[sco.] ref_acc: 0.14872
[sco.] obj_acc: 0.76845
[sco.] pos_ratio: 0.68719, neg_ratio: 0.31281
[sco.] iou_rate_0.25: 0.47397, iou_rate_0.5: 0.36692

saving checkpoint...

saving last models...

After training, I ran the evaluation command CUDA_VISIBLE_DEVICES=0 python scripts/eval.py --config ./config/sps.yaml --folder 2023-05-07_00-36_SPS/ --reference --no_nms --force and got:

unique:
unique | not_in_others | ref_acc: 0.14891243725599554
unique | not_in_others | [email protected]: 0.8120468488566648
unique | not_in_others | [email protected]: 0.6447295036252092
unique | in_others | ref_acc: 0.09615384615384616
unique | in_others | [email protected]: 0.7692307692307693
unique | in_others | [email protected]: 0.5961538461538461
unique | overall | ref_acc: 0.14742547425474256
unique | overall | [email protected]: 0.810840108401084
unique | overall | [email protected]: 0.643360433604336

multiple:
multiple | not_in_others | ref_acc: 0.07918758557736194
multiple | not_in_others | [email protected]: 0.3247375627567321
multiple | not_in_others | [email protected]: 0.26449109995435877
multiple | in_others | ref_acc: 0.2307223407497714
multiple | in_others | [email protected]: 0.4687595245352027
multiple | in_others | [email protected]: 0.32855836635172203
multiple | overall | ref_acc: 0.14406890251859586
multiple | overall | [email protected]: 0.38640219235286444
multiple | overall | [email protected]: 0.29192222367219106

overall:
overall | not_in_others | ref_acc: 0.0994331983805668
overall | not_in_others | [email protected]: 0.4662348178137652
overall | not_in_others | [email protected]: 0.3748987854251012
overall | in_others | ref_acc: 0.22862286228622863
overall | in_others | [email protected]: 0.4734473447344735
overall | in_others | [email protected]: 0.3327332733273327
overall | overall | ref_acc: 0.1447202355910812
overall | overall | [email protected]: 0.4687631468237274
overall | overall | [email protected]: 0.3601177955405974

language classification accuracy: 0.9309404022447408

The best overall accuracy during training is iou_rate_0.25: 0.47397, iou_rate_0.5: 0.36692, but the evaluation reports overall | overall | [email protected]: 0.4687631468237274 and overall | overall | [email protected]: 0.3601177955405974.
Why is there such a discrepancy? Did I make a mistake somewhere?
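
One thing I want to rule out is nondeterminism in the validation data pipeline (random point sampling / augmentation), since that alone could plausibly shift the metrics by a fraction of a percent between two passes over the same checkpoint. Here is a minimal sketch of pinning every RNG source before the eval loop; these are generic PyTorch/NumPy calls, not code from this repo:

```python
import random

import numpy as np
import torch


def seed_everything(seed: int = 42) -> None:
    """Make random point sampling and augmentation repeatable."""
    random.seed(seed)
    np.random.seed(seed)
    torch.manual_seed(seed)
    torch.cuda.manual_seed_all(seed)
    # Trade speed for reproducible cuDNN kernels.
    torch.backends.cudnn.deterministic = True
    torch.backends.cudnn.benchmark = False


seed_everything(42)  # call once, before building the dataloader and model
```

Even with seeds pinned, some CUDA ops (e.g. scatter-based pooling) are nondeterministic, so a small residual drift would not be surprising.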

@xuxiaoxxxx

I am running into the same problem. Did you solve it?
