crop pdf page not getting the expected result #980
Allow me to convert this to a discussion.
You actually have no dealings with the transformation matrix here. When you use clip rectangles to insert parts of a source page, you of course need to know whether they are relative to the rotated or the unrotated source page. The following script uses two clip rectangles that are known to be in de-rotated coordinates and that split the source page into top and bottom halves. The two resulting output pages are then rotated the same way the source page was (this also works if there was no original rotation):
import fitz
src = fitz.open("table.test.pdf")
srcpage = src[0]
# remember the original rotation, then work on the de-rotated page
old_rot = srcpage.rotation
srcpage.set_rotation(0)
srcrect = srcpage.rect
# two clip rectangles in de-rotated coordinates, each covering half of the page
top = fitz.Rect(0, srcrect.height/2, srcrect.width, srcrect.height)
btm = fitz.Rect(0, 0, srcrect.width, srcrect.height/2)
out = fitz.open()
# first output page: show the first clip, then restore the rotation
outp = out.new_page(width=top.width, height=top.height)
outp.show_pdf_page(outp.rect, src, 0, clip=top)
outp.set_rotation(old_rot)
# second output page: same procedure for the other half
outp = out.new_page(width=btm.width, height=btm.height)
outp.show_pdf_page(outp.rect, src, 0, clip=btm)
outp.set_rotation(old_rot)
out.save("x.pdf", garbage=4, deflate=True)
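If a clip rectangle is instead given relative to the rotated page, it first has to be mapped back to de-rotated coordinates before it can be used as above. The following is only a sketch of that mapping in plain Python, not part of the script: the helper name `derotate_rect` is hypothetical, rectangles are plain `(x0, y0, x1, y1)` tuples with a top-left origin, and only rotations in multiples of 90 degrees are handled. PyMuPDF pages also expose a `derotation_matrix` property that should serve the same purpose.

```python
def derotate_rect(rect, rotation, width, height):
    """Map a rectangle from rotated-page to unrotated-page coordinates.

    rect     -- (x0, y0, x1, y1), top-left origin, y grows downward
    rotation -- clockwise page rotation in degrees (multiple of 90)
    width, height -- size of the *unrotated* page
    """
    x0, y0, x1, y1 = rect
    rotation %= 360
    if rotation == 0:
        corners = [(x0, y0), (x1, y1)]
    elif rotation == 90:
        corners = [(y0, height - x0), (y1, height - x1)]
    elif rotation == 180:
        corners = [(width - x0, height - y0), (width - x1, height - y1)]
    elif rotation == 270:
        corners = [(width - y0, x0), (width - y1, x1)]
    else:
        raise ValueError("rotation must be a multiple of 90 degrees")
    # Re-normalize so that (x0, y0) is again the top-left corner.
    xs = sorted(c[0] for c in corners)
    ys = sorted(c[1] for c in corners)
    return (xs[0], ys[0], xs[1], ys[1])


# A 600 x 800 page rotated by 90 degrees is displayed as 800 x 600.
# The top half of the displayed page is the left half of the unrotated one.
left_half = derotate_rect((0, 0, 800, 300), 90, 600, 800)
```

The result could then be turned into a `fitz.Rect` and passed as `clip` once the source page rotation has been set to 0.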
Hi @zorzigio
No, only according to the PDF standard. (Py)MuPDF counts from the top-left. This is the reason why we have that transformation matrix.
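To make the difference between the two origins concrete, here is a minimal sketch of the vertical flip in plain Python. It assumes an unrotated page whose MediaBox starts at (0, 0); the function name `pdf_to_mupdf_rect` is hypothetical. In PyMuPDF itself, multiplying a rectangle by the page's `transformation_matrix` is the intended way to do this.

```python
def pdf_to_mupdf_rect(rect, page_height):
    """Convert a rectangle from PDF coordinates (origin bottom-left,
    y grows upward) to MuPDF coordinates (origin top-left, y grows
    downward).

    Assumes an unrotated page whose MediaBox starts at (0, 0).
    rect is (x0, y0, x1, y1) with y0 <= y1 in PDF coordinates.
    """
    x0, y0, x1, y1 = rect
    # x is unchanged; y is mirrored, so the old upper edge (y1)
    # becomes the new upper edge measured from the top of the page.
    return (x0, page_height - y1, x1, page_height - y0)


# An area 50 to 150 points above the bottom edge of an 800 point high page
# lies 650 to 750 points below the top edge.
area = pdf_to_mupdf_rect((100, 50, 300, 150), 800)
```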
Hi @JorjMcKie, yes, I understand that the bottom-left corner is the origin according to the PDF standard. And if the transformation matrix was created to transform between these two coordinate systems, I think I was on the right track. You see, in the code I posted originally, the Rect
I am trying to crop an area of a PDF page, and I am not able to get the expected result using the transformation matrix.
The position of the area I am trying to extract is relative to the bottom-left corner of the page.
The page is also rotated by 90 degrees.
In the code below, the first page contains the area extracted using the transformation matrix, which does not work properly. The second page is extracted by manually deriving the position of the area from the known rotation of the page, which extracts the area correctly.
I would much prefer using the transformation matrix, but I am not sure what I am doing wrong here.
Also, I was wondering if there is a method to deal with the rotation of the page automatically, rather than having to rotate the page back and forth?
table test.pdf