Metadata gets updated in the PDF structure, but doesn't reflect in the Adobe Reader (some properties) #1254
-
Which version are you using?
-
PyMuPDF Details:
Platform: Windows 10, Python 3.9.2
-
Ok, looks good. When I run this:

import fitz
doc = fitz.open()
page = doc.new_page()
m = doc.metadata
m["keywords"] = "kw1 kw2"
doc.set_metadata(m)
doc.save("x.pdf")

the file shows no irregularities in whatever viewer.
-
If you want, please share the file and let me have a look.
-
Could you please check whether the same works on the attached PDF?
-
Hm, same thing happening for me 🤔.

Python 3.9.6 (tags/v3.9.6:db3ff76, Jun 28 2021, 15:26:21) [MSC v.1929 64 bit (AMD64)] on win32
Type "help", "copyright", "credits" or "license()" for more information.
>>> import fitz
>>> doc=fitz.open("untagged-mod.pdf")
>>> from pprint import pprint
>>> pprint(doc.metadata)
{'author': 'Martie Shrader',
'creationDate': 'D:19040229023654Z',
'creator': 'Adobe® PageMaker® 6.5',
'encryption': None,
'format': 'PDF 1.6',
'keywords': 'kw1,kw2',
'modDate': "D:20150217093420-07'00'",
'producer': 'Acrobat PDFWriter 4.05 for Power Macintosh',
'subject': '',
'title': '',
'trapped': ''}
>>> print(doc.xref_object(-1)) # the trailer
<<
/Size 28
/Info 14 0 R
/Root 15 0 R
/ID [ <182A58473815524CCFA528451E12F3C6> <5BE0747C168AEBD2D843611781631B79> ]
>>
>>> print(doc.xref_object(14)) # info object
<<
/Author (Martie Shrader)
/CreationDate (D:19040229023654Z)
/Creator (Adobe\256 PageMaker\256 6.5)
/Keywords (kw1,kw2)
/ModDate (D:20150217093420-07'00')
/Producer (Acrobat PDFWriter 4.05 for Power Macintosh)
/Subject null
/Title null
/Trapped null
>>
>>>

Out of good advice, I must say ...
-
Ha! I have the reason:
-
Thanks @JorjMcKie.
-
Well, at least I recommend it. But you are on your own regarding any risk of losing potentially important information. But again: PyMuPDF is not in the business of dealing with XML syntax; use e.g. lxml if you need something there.
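
For readers who do need to look inside the XML, here is a minimal sketch. It makes several assumptions: that the XML metadata is an XMP packet, that keywords are stored there as `pdf:Keywords`, and that the packet has already been obtained as a string (for example via PyMuPDF's `get_xml_metadata()`). It uses the standard library's `xml.etree.ElementTree` in place of the lxml suggested above; `xmp_keywords` and `SAMPLE` are hypothetical names and data for illustration only.

```python
import xml.etree.ElementTree as ET

PDF_NS = "http://ns.adobe.com/pdf/1.3/"
RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"


def xmp_keywords(xmp_packet):
    """Return the pdf:Keywords value found in an XMP packet, or None."""
    root = ET.fromstring(xmp_packet)
    for desc in root.iter(f"{{{RDF_NS}}}Description"):
        # Form 1: the property is an attribute of rdf:Description.
        value = desc.get(f"{{{PDF_NS}}}Keywords")
        if value is not None:
            return value
        # Form 2: the property is a child element of rdf:Description.
        child = desc.find(f"{{{PDF_NS}}}Keywords")
        if child is not None:
            return child.text
    return None


# Made-up packet for illustration; not taken from the file in this thread.
SAMPLE = """<x:xmpmeta xmlns:x="adobe:ns:meta/">
 <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
  <rdf:Description rdf:about="" xmlns:pdf="http://ns.adobe.com/pdf/1.3/">
   <pdf:Keywords>old1, old2</pdf:Keywords>
  </rdf:Description>
 </rdf:RDF>
</x:xmpmeta>"""

print(xmp_keywords(SAMPLE))  # prints: old1, old2
```

Comparing this value with the /Keywords entry of the Info dictionary shows whether the two metadata copies disagree.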
-
Ok! But when we set the
-
Sure, and it does. I just tested it: saved the resulting file, added keywords information and, voilà, Adobe did show them.
-
Issue:
I have been updating the metadata of my PDF document and saving it back to file using the set_metadata() function. Before setting the metadata, I remove the complete metadata using
Here's how it looks inside the PDF structure.
In this example, the keywords property is not reflected in Adobe Reader (it works for some PDFs).
Please let me know if I am doing something wrong, or whether there is anything I need to do to make it work.
Thanks in advance!