docxtpl generates broken docx if some core_properties was changed #471

Open
av-gantimurov opened this issue Dec 21, 2022 · 1 comment

av-gantimurov commented Dec 21, 2022

Describe the bug

docxtpl generates a broken docx if any of the core_properties are changed before saving.
Opening the result in Microsoft Office Word 2010-2019 produces an error.
This is not a bug in python-docx: used on its own, it generates normal documents.

It works fine with docxtpl 0.11.5, but with 0.12.0 and newer the generated document is broken.

To Reproduce

Install python-docx and create an empty test document:

python -m pip install python-docx
from docx import Document
doc = Document()
doc.core_properties.keywords = "keywords; works; fine"
doc.save("empty.docx")

empty.docx opens in Microsoft Office Word without any failures or errors.

success

Installing properly working docxtpl 0.11.5

python -m pip install docxtpl==0.11.5
from docxtpl import DocxTemplate
tpl = DocxTemplate("empty.docx")
tpl.render({})
tpl.docx.core_properties.keywords = "keywords; still; works"
tpl.save("success.docx")

error

The bug occurs in docxtpl from 0.12 to 0.16.4 and in the current development version.

python -m pip install docxtpl
from docxtpl import DocxTemplate
tpl = DocxTemplate("empty.docx")
tpl.render({})
tpl.docx.core_properties.keywords = "keywords; still; works"
tpl.save("error.docx")

The problem appears when the file is opened in Microsoft Office Word 2010-2019.
Opening error.docx in Microsoft Office Word shows an error message.
When I open the extended properties tab in the file properties, the keywords that were set are not shown.
In LibreOffice, error.docx opens without errors. ExifTool shows the keywords set correctly.

I compared empty.docx, success.docx and error.docx.
They differ only in the docProps/core.xml file.
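
For anyone who wants to repeat this comparison, here is a minimal sketch using only the Python standard library. It assumes a .docx is an ordinary zip package containing a docProps/core.xml part; the helper names read_core_xml and diff_core_xml are made up for illustration and are not part of docxtpl or python-docx.

```python
import difflib
import zipfile
from xml.dom import minidom

CORE_PART = "docProps/core.xml"  # part name inside the .docx zip package


def read_core_xml(docx_path):
    """Return docProps/core.xml from a .docx file as pretty-printed text."""
    with zipfile.ZipFile(docx_path) as package:
        raw = package.read(CORE_PART)
    return minidom.parseString(raw).toprettyxml(indent="  ")


def diff_core_xml(path_a, path_b):
    """Return a unified diff of the core properties of two .docx files."""
    a = read_core_xml(path_a).splitlines()
    b = read_core_xml(path_b).splitlines()
    return "\n".join(difflib.unified_diff(a, b, path_a, path_b, lineterm=""))
```

Calling print(diff_core_xml("success.docx", "error.docx")) should then print only the lines of core.xml that differ between the two files.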

Below is the difference, with the XML beautified:

  1. empty.docx
<cp:coreProperties
	xmlns:cp="http://schemas.openxmlformats.org/package/2006/metadata/core-properties"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:dcterms="http://purl.org/dc/terms/"
	xmlns:dcmitype="http://purl.org/dc/dcmitype/"
	xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
	<dc:title/>
	<dc:subject/>
	<dc:creator>python-docx</dc:creator>
	<cp:keywords>success; fine</cp:keywords>
	<dc:description>generated by python-docx</dc:description>
	<cp:lastModifiedBy/>
	<cp:revision>1</cp:revision>
	<dcterms:created xsi:type="dcterms:W3CDTF">2013-12-23T23:15:00Z</dcterms:created>
	<dcterms:modified xsi:type="dcterms:W3CDTF">2013-12-23T23:15:00Z</dcterms:modified>
	<cp:category/>
</cp:coreProperties>
  2. success.docx
<cp:coreProperties
	xmlns:cp="http://schemas.openxmlformats.org/package/2006/metadata/core-properties"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:dcterms="http://purl.org/dc/terms/"
	xmlns:dcmitype="http://purl.org/dc/dcmitype/"
	xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
	<dc:title/>
	<dc:subject/>
	<dc:creator>python-docx</dc:creator>
	<cp:keywords>error</cp:keywords>
	<dc:description>generated by python-docx</dc:description>
	<cp:lastModifiedBy/>
	<cp:revision>1</cp:revision>
	<dcterms:created xsi:type="dcterms:W3CDTF">2013-12-23T23:15:00Z</dcterms:created>
	<dcterms:modified xsi:type="dcterms:W3CDTF">2013-12-23T23:15:00Z</dcterms:modified>
	<cp:category/>
</cp:coreProperties>
  3. error.docx
<cp:coreProperties
	xmlns:cp="http://schemas.openxmlformats.org/package/2006/metadata/core-properties"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:dcterms="http://purl.org/dc/terms/"
	xmlns:dcmitype="http://purl.org/dc/dcmitype/"
	xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
	<dc:title/>
	<dc:subject/>
	<dc:creator>python-docx</dc:creator>
	<cp:keywords>success; fine</cp:keywords>
	<dc:description>generated by python-docx</dc:description>
	<cp:lastModifiedBy/>
	<cp:revision>1</cp:revision>
	<dcterms:created xsi:type="dcterms:W3CDTF">2013-12-23T23:15:00Z</dcterms:created>
	<dcterms:modified xsi:type="dcterms:W3CDTF">2013-12-23T23:15:00Z</dcterms:modified>
	<cp:category/>
	<cp:keywords
		xmlns:cp="http://schemas.openxmlformats.org/officeDocument/2006/custom-properties">error
	</cp:keywords>
</cp:coreProperties>

As you can see, in error.docx the <cp:keywords> element is duplicated, and the duplicate carries an additional xmlns:cp="http://schemas.openxmlformats.org/officeDocument/2006/custom-properties" declaration.
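
The same check can be automated. The sketch below is standard library only, and inspect_core_xml is a hypothetical helper name. It reports which child elements of core.xml appear more than once and which ones sit in the custom-properties namespace instead of the core-properties one.

```python
import xml.etree.ElementTree as ET
from collections import Counter

CORE_NS = "http://schemas.openxmlformats.org/package/2006/metadata/core-properties"
CUSTOM_NS = "http://schemas.openxmlformats.org/officeDocument/2006/custom-properties"


def split_tag(tag):
    """Split ElementTree's '{namespace}local' tag into (namespace, local)."""
    if tag.startswith("{"):
        ns, local = tag[1:].split("}", 1)
        return ns, local
    return "", tag


def inspect_core_xml(xml_bytes):
    """Return (duplicated local names, local names in the custom-properties namespace)."""
    root = ET.fromstring(xml_bytes)
    tags = [split_tag(child.tag) for child in root]
    counts = Counter(local for _, local in tags)
    duplicated = sorted(name for name, n in counts.items() if n > 1)
    foreign = sorted({local for ns, local in tags if ns == CUSTOM_NS})
    return duplicated, foreign
```

Fed the error.docx listing above, this returns (["keywords"], ["keywords"]); for the other two listings both lists are empty.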

Expected behavior

The script above should produce a docx document that opens without consistency errors.

Screenshots

Error in Microsoft Office 2010 (screenshot not included)

Additional context

Python 3.10

abhishekjain12 commented May 8, 2024

Same issue. Is there any fix?

Two custom properties make a difference:

  • lastModifiedBy
  • revision

If I remove these, it works fine.

Both properties have xmlns:cp="http://schemas.openxmlformats.org/officeDocument/2006/custom-properties"
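
Until there is a fix, one possible workaround is to post-process the saved file. The sketch below is an assumption-laden illustration, not a fix in docxtpl, and it has not been verified against Word. It rewrites docProps/core.xml so that any child in the custom-properties namespace is folded back into the core-properties namespace, keeping the value from the misplaced element. The names repair_core_xml and repair_docx are hypothetical.

```python
import zipfile
import xml.etree.ElementTree as ET

CORE_NS = "http://schemas.openxmlformats.org/package/2006/metadata/core-properties"
CUSTOM_NS = "http://schemas.openxmlformats.org/officeDocument/2006/custom-properties"
CORE_PART = "docProps/core.xml"

# Keep the usual prefixes on output; xsi:type="dcterms:W3CDTF" relies on "dcterms".
for prefix, uri in {
    "cp": CORE_NS,
    "dc": "http://purl.org/dc/elements/1.1/",
    "dcterms": "http://purl.org/dc/terms/",
    "dcmitype": "http://purl.org/dc/dcmitype/",
    "xsi": "http://www.w3.org/2001/XMLSchema-instance",
}.items():
    ET.register_namespace(prefix, uri)


def repair_core_xml(xml_bytes):
    """Fold children in the custom-properties namespace back into the core one."""
    root = ET.fromstring(xml_bytes)
    for child in list(root):  # iterate over a copy, since children may be removed
        if not child.tag.startswith("{" + CUSTOM_NS + "}"):
            continue
        local = child.tag.split("}", 1)[1]
        value = (child.text or "").strip()
        existing = root.find("{%s}%s" % (CORE_NS, local))
        if existing is None:
            # No core-namespace twin: move the element into the core namespace.
            child.tag = "{%s}%s" % (CORE_NS, local)
            child.text = value
        else:
            # Assumption: the misplaced element holds the newer value.
            existing.text = value
            root.remove(child)
    return ET.tostring(root, encoding="UTF-8", xml_declaration=True)


def repair_docx(src_path, dst_path):
    """Copy a .docx, replacing docProps/core.xml with its repaired version."""
    with zipfile.ZipFile(src_path) as src, \
            zipfile.ZipFile(dst_path, "w", zipfile.ZIP_DEFLATED) as dst:
        for item in src.infolist():
            data = src.read(item.filename)
            if item.filename == CORE_PART:
                data = repair_core_xml(data)
            dst.writestr(item, data)
```

Usage would be repair_docx("error.docx", "repaired.docx") after tpl.save("error.docx"). Note that ET.tostring(..., xml_declaration=True) needs Python 3.8 or newer.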

python-openxml/python-docx#1037