Recently I've discovered that cv2.resize and PIL.Image.resize have quite different implementations.
In particular, OpenCV does not scale the size of the filter when downscaling with its standard interpolation modes (such as the default cv2.INTER_LINEAR): even if you downscale by a factor of 100, only the neighbouring pixels are used to compute the interpolation, which results in aliasing artifacts. Most notably, sharp edges get 'staircase' artifacts, which grow with the downscaling factor. (Funny sidenote: downscaling by 2x twice gives better results than downscaling by 4x once.) Pillow uses an adaptive filter size, which avoids aliasing by 'smoothing' the image. The downscaled image will not show these artifacts but will be a bit 'blurred'.
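To make the difference concrete, here is a minimal pure-Python sketch on a 1D signal. It is not the actual OpenCV or Pillow code: `resize_fixed_linear` and `resize_adaptive_linear` are hypothetical helpers that only mimic the two strategies (a fixed 2-pixel linear filter versus a triangle filter whose support grows with the downscale factor), and the test signal is an assumption chosen to make the effect obvious.

```python
import math


def resize_fixed_linear(signal, out_len):
    """Linear interpolation that always uses only the 2 nearest source pixels."""
    scale = len(signal) / out_len
    out = []
    for i in range(out_len):
        # Map the output pixel centre back into source coordinates.
        src = (i + 0.5) * scale - 0.5
        i0 = math.floor(src)
        frac = src - i0
        lo = min(max(i0, 0), len(signal) - 1)
        hi = min(max(i0 + 1, 0), len(signal) - 1)
        out.append((1 - frac) * signal[lo] + frac * signal[hi])
    return out


def resize_adaptive_linear(signal, out_len):
    """Triangle filter whose support is scaled by the downscale factor."""
    scale = len(signal) / out_len
    support = max(scale, 1.0)
    out = []
    for i in range(out_len):
        center = (i + 0.5) * scale - 0.5
        first = max(math.ceil(center - support), 0)
        last = min(math.floor(center + support), len(signal) - 1)
        indices = range(first, last + 1)
        weights = [max(0.0, 1 - abs(j - center) / support) for j in indices]
        total = sum(weights)
        out.append(sum(w * signal[j] for w, j in zip(weights, indices)) / total)
    return out


# One bright pixel every 4 pixels: the true local mean is 0.25.
signal = [1.0, 0.0, 0.0, 0.0] * 16

once_4x = resize_fixed_linear(signal, 16)
twice_2x = resize_fixed_linear(resize_fixed_linear(signal, 32), 16)
adaptive = resize_adaptive_linear(signal, 16)

print("fixed filter, 4x once :", once_4x[:4])
print("fixed filter, 2x twice:", twice_2x[:4])
print("adaptive filter, 4x   :", adaptive[1:5])
```

On this signal the fixed filter at 4x samples only the dark pixels, so the thin bright lines vanish completely, while two 2x passes and the adaptive filter both recover the local mean of 0.25 away from the borders.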
This also impacts deep learning training. I've found that mAP can differ by up to 5% when resizing with PIL on the train set and with cv2 on the test set, compared to using the same resize method for both (downscaling by a factor of 4).
Long story short, we should probably use the same downsampling method consistently throughout the repo, so that these artifacts do not influence inference/test performance.
I've now used Pillow's bicubic interpolation in the coco tools and suggest making that the default throughout the repo.
@Victorlouisdg, the image_transforms in the camera toolkit is the first place in the repo that comes to mind, but there might be other places where we downscale images.
I ran some quick performance tests. OpenCV is at least 10x faster than PIL at downscaling from (1920, 1080) to (320, 240). Because the ImageTransforms are also intended for real-time visualization, I'll likely leave OpenCV as the default. However, I have to investigate the quality concerns further. Possible actions:
Use cv2.INTER_BICUBIC instead of the default cv2.INTER_LINEAR, because the speed is similar.
Add a constructor argument to the Resize class to select between the OpenCV and PIL resize implementations.
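The second action could look roughly like the sketch below. This is not the repo's actual Resize class: the class layout, the `backend` argument, and its `"opencv"`/`"pil"` values are assumptions made for illustration. Only `cv2.resize`, `cv2.INTER_CUBIC`, `Image.fromarray`, `Image.resize` and `Image.BICUBIC` are real library calls, and the PIL branch assumes a uint8 image array.

```python
class Resize:
    """Hypothetical resize transform with a selectable backend (illustrative only)."""

    BACKENDS = ("opencv", "pil")

    def __init__(self, size, backend="opencv"):
        # size is (width, height), the order both cv2.resize and PIL expect.
        if backend not in self.BACKENDS:
            raise ValueError(f"backend must be one of {self.BACKENDS}, got {backend!r}")
        self.size = tuple(size)
        self.backend = backend

    def __call__(self, image):
        width, height = self.size
        if self.backend == "opencv":
            # Fast path: fixed-size bicubic kernel, may alias on strong downscaling.
            import cv2

            return cv2.resize(image, (width, height), interpolation=cv2.INTER_CUBIC)

        # Slower path: Pillow widens the bicubic kernel with the downscale factor.
        import numpy as np
        from PIL import Image

        resized = Image.fromarray(image).resize((width, height), resample=Image.BICUBIC)
        return np.asarray(resized)


# Real-time visualization could keep the fast default,
# while dataset preparation opts into the Pillow backend.
fast_resize = Resize((320, 240))
dataset_resize = Resize((320, 240), backend="pil")
```

Importing each library only inside its own branch keeps OpenCV-only users free of a Pillow dependency; whether that trade-off suits the toolkit is a design choice left open here.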
More information: https://arxiv.org/pdf/2104.11222.pdf, https://zuru.tech/blog/the-dangers-behind-image-resizing