Document select keeps the pdf with the same size. #884
-
Hi!
The saved page: page.pdf. After I save the file, the PDF has almost the same size, despite having only one page. PyMuPDF 1.18.6.
Replies: 2 comments
-
There may be a number of reasons for this - even apart from a potential PyMuPDF bug 😉. Garbage collection can only remove objects which are no longer referenced by anything whatsoever ... and that may be all sorts of things. Here are a few steps to get closer to an explanation:
import fitz  # PyMuPDF
new = fitz.open()  # new, empty PDF
new.insert_pdf(doc, from_page=75, to_page=75)  # doc is the already opened source PDF; or insertPDF(...)
new.save(...)
-
Found the reason for this unexpected size behaviour:
The page contains a /B entry. In this case it references annotations, which in turn reference other pages, and yet other pages, etc. So you have option 3 from the previous post, which gives you a PDF size below 70 KB.
>>> import fitz
>>> doc = fitz.open("page.pdf")
>>> page = doc[0]
>>> print(doc.xref_get_keys(page.xref))
('Annots', 'B', 'Contents', 'CropBox', 'MediaBox', 'Parent', 'Resources', 'Rotate', 'Type')
>>> doc.xref_set_key(page.xref, "B", "null")  # remove the /B entry from the page object
>>> page.clean_contents()
>>> doc.save("x.pdf", garbage=4, deflate=True)
Which results in a PDF of 66 KB.
Semi-manual might also mean:
for key in doc.xref_get_keys(page.xref):
    if key not in ('Annots', 'Contents', 'CropBox', 'MediaBox', 'Parent', 'Resources', 'Rotate', 'Type'):
        doc.xref_set_key(page.xref, key, "null")
So this would only keep PDF keys also included by the …
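The semi-manual cleanup could also be wrapped into a small helper, roughly like the sketch below. The names KEEP_KEYS, keys_to_drop and strip_extra_page_keys are made up for illustration and are not part of PyMuPDF; only xref_get_keys and xref_set_key are taken from the post above, and the keep list is simply the one used there.

```python
# Keys kept on the page object in the post above; anything else gets nulled.
# KEEP_KEYS and both function names are hypothetical, not PyMuPDF API.
KEEP_KEYS = ('Annots', 'Contents', 'CropBox', 'MediaBox',
             'Parent', 'Resources', 'Rotate', 'Type')


def keys_to_drop(keys, keep=KEEP_KEYS):
    """Return the keys that are not in the keep list, in original order."""
    return [k for k in keys if k not in keep]


def strip_extra_page_keys(doc, page, keep=KEEP_KEYS):
    """Set every non-kept key of the page object to null.

    doc and page are assumed to be a PyMuPDF Document and Page
    (v1.18.7 or later, where xref_get_keys / xref_set_key exist).
    Returns the list of keys that were nulled.
    """
    dropped = keys_to_drop(doc.xref_get_keys(page.xref), keep)
    for key in dropped:
        doc.xref_set_key(page.xref, key, "null")
    return dropped


# The key list printed in the post above:
example_keys = ('Annots', 'B', 'Contents', 'CropBox', 'MediaBox',
                'Parent', 'Resources', 'Rotate', 'Type')
print(keys_to_drop(example_keys))  # ['B']
```

With a real file this would presumably be called as strip_extra_page_keys(doc, doc[0]), followed by doc.save("x.pdf", garbage=4, deflate=True) as in the session above.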
Found the reason for this unexpected size behaviour:
The page contains a /B entry. In this case it references annotations, which in turn reference other pages, and yet other pages, etc.
Result: the full PDF is still there - only the page pointers have been reduced to that one page.
So you have option 3 from the previous post, which gives you a PDF size below 70 KB.
OR with functions new in v1.18.7, you …
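To illustrate why garbage collection could not shrink the file, here is a toy model in plain Python (no PyMuPDF involved). The object names and the graph are invented for illustration and do not reflect the real structure of page.pdf; the only assumption taken from the thread is that the remaining page has a /B entry leading, via annotations, to other pages.

```python
def reachable(objects, root):
    """Return the set of object ids reachable from root by following references."""
    seen = set()
    stack = [root]
    while stack:
        oid = stack.pop()
        if oid in seen:
            continue
        seen.add(oid)
        stack.extend(objects[oid])
    return seen


# Toy object graph: id -> ids it references (all names hypothetical).
objects = {
    "catalog": ["pages"],
    "pages": ["page76"],               # page tree after select(): one page left
    "page76": ["content76", "B"],      # but the page still carries a /B entry
    "B": ["annot"],
    "annot": ["page1", "page2"],       # annotations pointing to other pages
    "page1": ["content1"],
    "page2": ["content2"],
    "content76": [],
    "content1": [],
    "content2": [],
}

# With /B in place every object is still reachable, so nothing can be collected.
kept_before = reachable(objects, "catalog")
print(len(kept_before), "of", len(objects))  # 10 of 10

# After dropping the /B reference, only the selected page and its content remain.
objects["page76"] = ["content76"]
kept_after = reachable(objects, "catalog")
print(sorted(kept_after))  # ['catalog', 'content76', 'page76', 'pages']
```

Everything outside kept_after is what a garbage-collecting save would be free to discard in this model.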