
Slightly different Box down-sampling result #8587

Open
AlttiRi opened this issue Dec 7, 2024 · 4 comments

Comments

@AlttiRi

AlttiRi commented Dec 7, 2024

What did you do?

I have written an image down-scaler in JavaScript using the Box algorithm. I compared its result with the result of resampling with PIL's resize using the Image.Resampling.BOX option. However, the results are slightly different.

Here is my TypeScript code:

type SingleChannelImageData = { data: Uint8Array; width: number; height: number; };

export function scaleDownLinearAverage(from: SingleChannelImageData, to: SingleChannelImageData) {
    const {data: orig, width, height} = from;
    const {data: dest, width: newWidth, height: newHeight} = to;
    const xScale = width  / newWidth;
    const yScale = height / newHeight;
    for (let newY = 0; newY < newHeight; newY++) {
        for (let newX = 0; newX < newWidth; newX++) {
            // "+ 0.5 << 0" rounds to the nearest integer (the "<< 0" shift truncates the float)
            const fromY = yScale * newY       + 0.5 << 0;
            const fromX = xScale * newX       + 0.5 << 0;
            const toY   = yScale * (newY + 1) + 0.5 << 0;
            const toX   = xScale * (newX + 1) + 0.5 << 0;
            const count = (toY - fromY) * (toX - fromX);
            let value = 0;
            for (let y = fromY; y < toY; y++) {
                for (let x = fromX; x < toX; x++) {
                    value += orig[y * width + x];
                }
            }
            dest[newY * newWidth + newX] = value / count + 0.5 << 0;
        }
    }
}

When PIL works as expected

To make things simpler, let's use 1D (1 pixel height) gray-scaled (1 channel) images.

In most cases PIL produces the expected result, for example:

  • 4 pixels to 2 pixels: [255, 0, 255, 0] -> [(255 + 0) / 2, (0 + 255) / 2] -> [127.5, 127.5] -> rounding (+ 0.5 << 0) -> [128, 128]
  • 5 pixels to 2 pixels: [255, 0, 255, 0, 255] -> [(255 + 0 + 255) / 3, (0 + 255) / 2] -> [170, 127.5] -> [170, 128]

As well as for the most simple transform — the transform from 1D gray image to 1x1 image:

  • 8 pixels to 1 pixel: [1,2,3,4,5,6,7,8] -> [(1+2+3+4+5+6+7+8) / 8] -> [4.5] -> [5]
  • 9 pixels to 1 pixel: [0,0,0,0, 0,0,0,0, 220] -> [220 / 9] -> [24.444444444444443] -> [24]
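The worked examples above can be checked with a small Python sketch (a hypothetical helper for illustration, not PIL code) that box-averages a 1-pixel-high grayscale row and rounds half up, mirroring the TypeScript scaler:

```python
def scale_down_1d(pixels, new_width):
    """Box-average a 1-pixel-high grayscale row, rounding half up."""
    scale = len(pixels) / new_width
    out = []
    for i in range(new_width):
        # Box boundaries, rounded to the nearest source index.
        lo = int(scale * i + 0.5)
        hi = int(scale * (i + 1) + 0.5)
        group = pixels[lo:hi]
        # Average the box and round half up, like "+ 0.5 << 0" in JS.
        out.append(int(sum(group) / len(group) + 0.5))
    return out

print(scale_down_1d([255, 0, 255, 0], 2))       # [128, 128]
print(scale_down_1d([255, 0, 255, 0, 255], 2))  # [170, 128]
print(scale_down_1d([1, 2, 3, 4, 5, 6, 7, 8], 1))  # [5]
```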

What actually happened?

However, when the group/box/area size is 10, 12, 14, 20, 22, 26, 30, 36, 38, 42, ... pixels, PIL produces an unexpected result — it rounds .5 down.

For example:

  • 10 pixels to 1 pixel: [0,0,0,0, 0,0,0,0, 0,255] -> [25.5] -> [25]
from PIL import Image

pixel_values = [0,0,0,0, 0,0,0,0, 0,255]
image = Image.new("L", (len(pixel_values), 1))
image.putdata(pixel_values)
image = image.resize((1, 1), Image.Resampling.BOX)
pixel = image.getdata()[0]
avg = sum(pixel_values) / len(pixel_values)
avg_round = int(avg + 0.5)
print(avg, avg_round, pixel, avg_round == pixel)

It prints:

25.5 26 25 False

What did you expect to happen?

Rounding of 25.5 should be 26 (Math.round(25.5) or 25.5 + 0.5 << 0 in JavaScript, int(25.5 + 0.5) in Python).


Here is a more complex example:

from PIL import Image

for uint in range(0xFFFF + 1):
    # Note: with this range only the two low bytes vary; v1 and v2 stay 0.
    v1 = (uint >> 24) & 0xFF
    v2 = (uint >> 16) & 0xFF
    v3 = (uint >>  8) & 0xFF
    v4 = (uint >>  0) & 0xFF
    pixel_values   = [v1, v2, v3, v4,  0, 0, 0, 0,  0, 0, ] # 10
    # pixel_values = [v1, v2, v3, v4,  0, 0, 0, 0,  0, 0, 0, 0,] # 12
    # pixel_values = [v1, v2, v3, v4,  0, 0, 0, 0,  0, 0, 0, 0,  0, 0,] # 14
    # pixel_values = [v1, v2, v3, v4,  0, 0, 0, 0,  0, 0, 0, 0,  0, 0, 0, 0,  0, 0, 0, 0,] # 20
    # pixel_values = [v1, v2, v3, v4,  0, 0, 0, 0,  0, 0, 0, 0,  0, 0, 0, 0,  0, 0, 0, 0,  0, 0,] # 22

    # print(pixel_values)

    width = len(pixel_values)
    image = Image.new("L", (width, 1))
    image.putdata(pixel_values)
    image = image.resize((1, 1), Image.Resampling.BOX)

    pixel = image.getdata()[0]
    avg = sum(pixel_values) / width
    # avg_round = round(avg)   # Python 3' "round half to even" or "banker's rounding"
    avg_round = int(avg + 0.5) # "round" like it is in other languages (0.5 -> 1)

    # "fix"
    # magic_numbers = {10, 12, 14, 20, 22, 26, 30, 36, 38, 42, ... }
    # if width in magic_numbers and avg % 1 == 0.5:
    #     avg_round = avg_round - 1

    if avg_round != pixel:
        print(  "avg_round", avg_round, f"({sum(pixel_values)} / {width} = {avg})",
              "  pil_pixel", pixel,
              "  pixels",    pixel_values,
              )

Try adding/removing zeros in the pixel_values array to change its length, and you will see that there are "magic" array sizes for which PIL behaves strangely when rounding .5.
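For reference, the two rounding modes mentioned in the code comments above differ only on exact .5 cases:

```python
# Python's built-in round() uses "round half to even" (banker's rounding),
# while int(x + 0.5) rounds half up, like Math.round in JavaScript.
print(round(24.5), round(25.5))          # 24 26
print(int(24.5 + 0.5), int(25.5 + 0.5))  # 25 26
```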

What are your OS, Python and Pillow versions?

  • OS: Windows 10
  • Python: 3.12.4
  • Pillow: 11.0.0
@radarhere
Member

I suspect that rounding has been sacrificed for the sake of performance in Pillow, but I wonder if @homm could confirm?
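As a sketch of where the .5 might be lost (this assumes the fixed-point resampling in Pillow's Resample.c, with coefficients stored at PRECISION_BITS = 32 - 8 - 2 = 22 bits; it is a pure-Python simulation, not Pillow's actual code), the box-size-10 case can be reproduced like this:

```python
# Simulation of a fixed-point box-resampling pass (an assumption based on
# Pillow's src/libImaging/Resample.c, not Pillow's actual code).
PRECISION_BITS = 32 - 8 - 2  # 22

def fixed_point_box(pixels):
    n = len(pixels)
    # Each box weight is 1/n, rounded to fixed point. For n = 10 this is
    # round(0.1 * 2**22) = 419430, slightly less than exact 419430.4.
    coeff = int(1 / n * (1 << PRECISION_BITS) + 0.5)
    # The accumulator starts at 0.5 in fixed point; the final shift truncates.
    acc = 1 << (PRECISION_BITS - 1)
    for p in pixels:
        acc += p * coeff
    return acc >> PRECISION_BITS

print(fixed_point_box([0]*9 + [255]))  # 25, matching PIL (true average is 25.5)
print(fixed_point_box([0]*8 + [220]))  # 24, as expected for 220 / 9
```

Because the rounded coefficient for 1/10 undershoots the exact value, the accumulated sum lands just below the .5 threshold and truncates to 25; sizes whose coefficients round up (like 9) or are exact (powers of two) behave as expected, which would explain the "magic" sizes.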

@AlttiRi
Author

AlttiRi commented Dec 10, 2024

BTW, is the filter-based approach really optimal/needed?

Down-scaling a gray-scaled image (https://i.imgur.com/DR94LKg.jpeg) 1000 times with my code above takes 19 seconds on Node.js; PIL's Box filter takes 12 seconds. I expected a bigger difference.

# ...
print(size) # (961, 1266)

# benchmark gray:
image_gray = image.convert("L")
print("bench gray")

start_time = time.time()
for i in range(1000):
    image_box = image_gray.resize(size, Image.Resampling.BOX)
print(time.time() - start_time)  # 12.385019302368164

start_time = time.time()
for i in range(1000):
    image_lanc = image_gray.resize(size, Image.Resampling.LANCZOS)
print(time.time() - start_time)  # 26.357529163360596

And Lanczos is only ~2 times slower than Box resampling.

@AlttiRi
Author

AlttiRi commented Dec 10, 2024

Here is a picture of a table of 51x51 px squares (1 px borders); the image has odd width and height (2755x1837).
When down-scaling it by a factor of 2 per side (to 1376x918), PIL's Box filter produces a result in which the central vertical line is missing.
It looks like the central vertical 1 px line was removed from the computation.
My code works as expected.

from PIL import Image

image_path = "51-squares/_original.png"

image = Image.open(image_path)
newHeight = image.height // 2
size = (int(image.width / image.height * newHeight), newHeight)
print(image.width, image.height) # 2755 1837
print(image.width  / size[0]) # 2.0021802325581395
print(image.height / size[1]) # 2.0010893246187362
print(size) # (1376, 918)

# image_gray = image.convert("L")

image_box = image.resize(size, Image.Resampling.BOX)
image_box.save("pil-box.png")

image_lanc = image.resize(size, Image.Resampling.LANCZOS)
image_lanc.save("pil-lanczos.png")

51-squares.zip

@AlttiRi
Author

AlttiRi commented Dec 10, 2024

One more note about the central lines (the central cross that divides the image into 4 parts) and odd image dimensions.

I compared down-scaling with the BOX and LANCZOS filters. This image https://i.imgur.com/DR94LKg.jpeg is 1923x2533.
After halving each side, I found that there is a "shift" of the central cross lines toward the edges (scheme.jpg) with Lanczos scaling.

You need to toggle between DR94LKg-box.png and DR94LKg-lanczos.png to see it. It's pretty visible.


My scaler produces a visually identical result to PIL's Box filter one, except for the horizontal central 1 px line (you need to zoom in to see it). It's possibly the same bug as in my previous message.

Or look at this:


DR94LKg.zip

from PIL import Image

image_path = "DR94LKg.jpeg"        # 1923x2533

image = Image.open(image_path)
newHeight = image.height // 2
size = (int(image.width / image.height * newHeight), newHeight)
print(image.width, image.height) # 1923 2533
print(image.width  / size[0])    # 2.001040582726327
print(image.height / size[1])    # 2.000789889415482
print(size)                      # (961, 1266)

image_gray = image.convert("L")

image_box = image_gray.resize(size, Image.Resampling.BOX)
image_box.save("DR94LKg-box.png")

image_lanc = image_gray.resize(size, Image.Resampling.LANCZOS)
image_lanc.save("DR94LKg-lanczos.png")

If I crop the image by 1 px on both sides, to 1922x2532, then everything is fine.
Both box-scaled images are the same (except for .5-rounding differences invisible to the eye), and the box resample result is very close to the Lanczos result (the Box filter is really good for integer down-scaling, when a constant count of source pixels maps to each result pixel).

DR94LKg-crop.zip
