
Questions about evaluating cityscapes dataset #31

Open
weiaicunzai opened this issue Dec 2, 2020 · 2 comments

Comments

@weiaicunzai

Thanks for your great work. I was just wondering how you evaluate the Cityscapes dataset. After reading your code, it seems you trained the model on a 512x512 input size and directly evaluated on the original image size (1024 x 2048):

    if opts.crop_val:
        val_transform = et.ExtCompose([
            et.ExtResize(opts.crop_size),      # resize shorter side to crop_size
            et.ExtCenterCrop(opts.crop_size),  # then center-crop to crop_size x crop_size
            et.ExtToTensor(),
            et.ExtNormalize(mean=[0.485, 0.456, 0.406],
                            std=[0.229, 0.224, 0.225]),
        ])
    else:
        val_transform = et.ExtCompose([
            et.ExtToTensor(),
            et.ExtNormalize(mean=[0.485, 0.456, 0.406],
                            std=[0.229, 0.224, 0.225]),
        ])

Why does the same model work when evaluated on a different input image size? Thanks.
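For context on why one model can consume both sizes: DeepLab is fully convolutional, so every layer's output spatial size simply tracks its input's spatial size. A minimal NumPy sketch (not from this repository; a toy stride-1, padding-1 convolution) makes this concrete:

```python
import numpy as np

def conv2d_same(x, w):
    """Naive 2D convolution: 3x3 kernel, stride 1, zero padding 1.
    The output spatial size equals the input spatial size, whatever it is —
    the same property that lets a model trained on 512x512 crops run on
    full 1024x2048 frames."""
    k = w.shape[0]
    p = k // 2
    xp = np.pad(x, p)               # zero-pad both spatial dims by p
    h, ww = x.shape
    out = np.zeros((h, ww))
    for i in range(h):
        for j in range(ww):
            out[i, j] = np.sum(xp[i:i + k, j:j + k] * w)
    return out

kernel = np.ones((3, 3)) / 9.0      # simple box filter as a stand-in for learned weights
small = conv2d_same(np.random.rand(8, 8), kernel)
large = conv2d_same(np.random.rand(16, 32), kernel)
print(small.shape, large.shape)     # (8, 8) (16, 32)
```

No layer here (or in a fully convolutional segmentation head) bakes in a fixed input resolution, so the same weights apply at any size.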

@13717630148

I have the same question.

@VainF
Owner

VainF commented Jun 9, 2021

The DeepLab models were trained on 512x512 patches and evaluated on full images (1024 x 2048).

The training protocol from the DeepLabv3 paper:

    We adopt the same training protocol as before except that we employ 90K training iterations, crop size equal to 769, and running inference on the whole image.

There are two main reasons for training on cropped images: 1) a larger batch size yields more accurate BN statistics, and 2) training is more efficient and consumes fewer resources. Of course, if you have sufficient GPU resources, it would be better to train models on larger images, e.g., full images or 769x769 patches.
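To illustrate the batch-size argument, here is a hypothetical crop-sampling sketch (names and sizes are illustrative assumptions, not the repository's actual code): a 512x512 crop holds 1/8 of the pixels of a full 1024x2048 Cityscapes frame, so roughly 8x more samples fit in a batch at the same memory budget.

```python
import numpy as np

# Assumed sizes for illustration: full Cityscapes frame vs. training crop.
FULL_H, FULL_W, CROP = 1024, 2048, 512

def random_crop(img, label, size=CROP):
    """Take one random size x size patch from an image/label pair,
    applying the same window to both so they stay aligned."""
    h, w = img.shape[:2]
    top = np.random.randint(0, h - size + 1)
    left = np.random.randint(0, w - size + 1)
    return (img[top:top + size, left:left + size],
            label[top:top + size, left:left + size])

img = np.zeros((FULL_H, FULL_W, 3), dtype=np.uint8)
label = np.zeros((FULL_H, FULL_W), dtype=np.uint8)
patch, patch_label = random_crop(img, label)

# Pixel ratio between a full frame and one crop:
ratio = (FULL_H * FULL_W) / (CROP * CROP)
print(patch.shape, patch_label.shape, ratio)  # (512, 512, 3) (512, 512) 8.0
```

At inference time no cropping is needed, so the fully convolutional model is simply run once over the whole 1024x2048 image.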
