
some issues about dataset mismatch #2

Open
xxyzll opened this issue Dec 4, 2024 · 4 comments
@xxyzll

xxyzll commented Dec 4, 2024

Hi, I found some data mismatch issues in your data validation code. Specifically, I printed the annotation counts in MOWODB and found a mismatch here:
[screenshot: printed annotation counts]
The printing code is as follows:

    def print_total_annotations(imagenames, known_classes, recs):
        # Count per-class annotations plus aggregated 'known'/'unknown' totals.
        known_classes_un = known_classes + ['known'] + ['unknown']
        total_ann = [0 for _ in known_classes_un]
        for imagename in imagenames:
            for obj in recs[imagename]:
                if obj["name"] in known_classes:
                    total_ann[known_classes.index(obj["name"])] += 1
                    total_ann[-2] += 1  # aggregate 'known' counter
                else:
                    total_ann[-1] += 1  # aggregate 'unknown' counter
        print('valid annotations:')
        # Print three "category | count" columns per line.
        for i in range(0, len(known_classes_un), 3):
            line = ""
            for j in range(3):
                if i + j < len(known_classes_un):
                    category, count = known_classes_un[i + j], total_ann[i + j]
                    line += f"{category.ljust(15)} | {str(count).rjust(5)} | "
            print(line)

    # first load gt
    # read list of images
    with PathManager.open(imagesetfile, "r") as f:
        lines = f.readlines()
    imagenames = [x.strip() for x in lines]

    imagenames_filtered = []
    # load annots
    recs = {}
    mapping = {}  # follow RandBox to map image id to image name
    for imagename in imagenames:
        rec = parse_rec(annopath.format(imagename), tuple(known_classes))
        if rec is not None and int(imagename) not in mapping:
            recs[imagename] = rec
            imagenames_filtered.append(imagename)
            mapping[int(imagename)] = imagename

    imagenames = imagenames_filtered
    # print annotation counts on the first load
    if print_annotations:
        print_total_annotations(imagenames, known_classes, recs)

    # the voc_eval call in the evaluator (annotations printed only for the first class):
    rec, prec, ap, unk_det_as_known, num_unk, tp_plus_fp_closed_set, fp_open_set = voc_eval(
                    res_file_template,
                    self._anno_file_template,
                    self._image_set_path,
                    cls_name,
                    ovthresh=thresh / 100.0,
                    use_07_metric=self._is_2007,
                    known_classes=self.known_classes,
                    print_annotations=(cls_id==0)  # in first load
                )
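
For reference, here is a toy invocation of the print helper above; the input data is made up purely to show the output format:

    # Hypothetical toy input, only to illustrate what the helper prints.
    known_classes = ["aeroplane", "bicycle", "bird"]
    recs = {
        "000001": [{"name": "aeroplane"}, {"name": "zebra"}],
        "000002": [{"name": "bird"}],
    }
    print_total_annotations(["000001", "000002"], known_classes, recs)
    # Expected counts: aeroplane 1, bicycle 0, bird 1, known 2, unknown 1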

However, the per-category annotation counts in the original ORE implementation are:
[screenshot: ORE annotation counts]

These unknown-class counts also appear in ORE's issues (1) and (2), and in the log.txt from the Google Cloud link the author provided. In fact, the same information is also printed by the detectron2 evaluation tool:
[screenshot: detectron2 evaluation log]

Perhaps the relevant code and settings should be reviewed again to ensure consistency. Thanks again for your influential work.

@343gltysprk
Owner

Hi,
Thank you for your questions.
I follow the experimental setting and data split of https://github.com/feifeiobama/OrthogonalDet, and the results of existing works are taken directly from the paper Exploring Orthogonality in Open World Object Detection, so the settings should be consistent.

Also, can you confirm that you are using my code? My method does not make so many predictions on aeroplanes.
[screenshot: aeroplane predictions]

Thanks again for your comment.

@xxyzll
Author

xxyzll commented Dec 4, 2024

Thank you for your response. After matching the code line by line, I found an inconsistency:
yours:
[screenshot of the code]
ORE:
[screenshot of the code]
There is an additional deduplication operation here. However, in the ORE test set there are duplicate filenames, as many as 1127 (link). Despite these duplicates, the ORE evaluation protocol has been widely accepted as the benchmark. So isn't it unreasonable to use the deduplicated data, given that most papers still adopt the ORE evaluation?
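
For reference, a minimal self-contained sketch (the file path in the comment is an assumption on my side) that counts how many entries in an image-set file are duplicates, i.e. how many images the deduplication step would drop:

    from collections import Counter

    def count_duplicate_imagenames(imageset_path):
        # One image id per line, as read by voc_eval.
        with open(imageset_path, "r") as f:
            imagenames = [line.strip() for line in f if line.strip()]
        counts = Counter(imagenames)
        # Entries that repeat an earlier id and would be filtered out.
        return sum(c - 1 for c in counts.values())

    # Hypothetical path to the test split:
    # print(count_duplicate_imagenames("datasets/VOC2007/ImageSets/Main/all_task_test.txt"))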

PS: In the standard ORE evaluation, using all unknown class names for zero-shot testing does not reach the results reported in the paper, yielding only 59 U-Recall.
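
For context, U-Recall here is the usual OWOD metric, the recall of the unknown class; a minimal sketch of that definition (not code from this repository):

    def unknown_recall(num_unknown_matched, num_unknown_gt):
        # Fraction of ground-truth unknown objects covered by detections
        # labeled as 'unknown' (standard OWOD U-Recall definition).
        return num_unknown_matched / max(num_unknown_gt, 1)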

@343gltysprk
Owner

Thank you for your question.

As I mentioned before, I followed the experimental setup of the 2024 SOTA method. Please refer to https://github.com/feifeiobama/OrthogonalDet.

An interesting fact is that the data split in the ORE paper is closer to my setting.
[screenshot: data split from the ORE paper]

I hope this addresses your concerns.

@xxyzll
Author

xxyzll commented Dec 5, 2024

However, before that, none of the implementations filtered samples, so the divergence seems to have been introduced in ORTH:
ORE (CVPR 21): [screenshot]
OW-DETR (CVPR 22): [screenshot]
PROB (CVPR 23): [screenshot]
