Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add custom properties support + unit-tests, feature-tests #580

Closed
wants to merge 0 commits into from

Conversation

michael-koeller
Copy link

Hi @scanny,

I patched the implementation of @renejsum into a new fork and added the missing test cases.
Along the way I also added proper type handling for string/int/bool properties.

Regards,
Michael

@renejsum
Copy link

renejsum commented Nov 23, 2018 via email

@Ricyteach
Copy link

I haven't reviewed the PR but wanted to say thanks, @michael-koeller!

@jekozyra
Copy link

@scanny any chance of this ever making it into a release? It would save me a lot of time.

@BastienFaure
Copy link

Merging this pull-request would we fantastic :) Thanks to all you guys for your job !

@geobeo
Copy link

geobeo commented Jan 29, 2020

This would basically be the one and only reason for me to use python-docx (a lot). I consider this a killer feature :P Please merge (or give someone write access to merge :P).

aleksandarbos added a commit to openlawlibrary/python-docx that referenced this pull request Feb 29, 2020
applied custom properties pr
python-openxml#580 for storing
private metadata within custom xml part.
@nozokada
Copy link

I'm so excited for this to be merged to master. Thank you for the work.

@Ricyteach
Copy link

I'm so excited for this to be merged to master. Thank you for the work.

Seconded. I intend to use the stink out of this feature. THANK YOU!!!

@paweljasinski
Copy link

Can this be merged?
If not, what is outstanding?
Thanks in advance

@HansBambel
Copy link

Would be nice to have this feature merged in :)

@mohitmathew
Copy link

Can we get this merged please ? this has been open for a very long time now.

@hugoe
Copy link

hugoe commented Feb 10, 2022

Yes, I need it too, please merge.

@boomesq
Copy link

boomesq commented Mar 7, 2022

I need it as well.

@lsaint
Copy link

lsaint commented May 19, 2022

It's been four years. Merge, please.

lsaint referenced this pull request in lsaint/python-docx-oss May 25, 2022
Add custom properties support + unit-tests, feature-tests #580
@Apteryks
Copy link
Contributor

Hi, and thanks for this work!

It looks rather complete, with the acceptance tests and unit tests already in! I've taken the freedom to rebase it and get rid of the merge commits; I believe I've preserved the correct authors and the tests still passes; @michael-koeller could you hard reset your branch to https://github.com/Apteryks/python-docx/tree/add-support-for-custom-properties, after verifying that it looks alright?

After which we can look into some details.

@sanzoghenzo
Copy link

Hi, there is still a fix to do in the docstring that refers to pptx package instead of docx. See my review above.

I'm glad you're taking the time and effort to keep the ball rolling!

@michael-koeller
Copy link
Author

@Apteryks: Nice to hear, that things might get some traction after laying sleeping for so long. 👍

I just did a rebase on my pull request branch and I also added a separate feature branch in my repo wich I didn't do originally (don't know any more why).

@Apteryks
Copy link
Contributor

Hi @michael-koeller; for tidiness, could you please split the acceptance tests in a first commit, and everything else into a 2nd commit? Basically the way I did it here: https://github.com/Apteryks/python-docx/commits/add-support-for-custom-properties.

@michael-koeller
Copy link
Author

for tidiness, could you please split the acceptance tests in a first commit, and everything else into a 2nd commit?

Yes, sure. Done.

Copy link
Contributor

@Apteryks Apteryks left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi! This looks good; I added a few minor comments.

Testing it though, I seem to trigger a hang this way:

from docx import Document
doc = Document('./features/steps/test_files/doc-customprops.docx')
list(doc.custom_properties)

Is this expected?

Thanks.

@Apteryks
Copy link
Contributor

@michael-koeller One last thing: it'd be nice to add documentation for the new feature; it seems it'd go here: https://github.com/python-openxml/python-docx/blob/master/docs/api/document.rst, next to the CoreProperties section. A usage example would also be neat in https://raw.githubusercontent.com/python-openxml/python-docx/master/docs/index.rst, demonstrating how to access or set custom properties on a document. Thanks again!

@mohitmathew
Copy link

Almost there team can we push this through 👍 . Thanks for everyone's effort ..

@R00118189
Copy link

R00118189 commented Nov 6, 2022

@celkas, what is this feature's estimated merge date?

@Apteryks
Copy link
Contributor

Apteryks commented Nov 8, 2022

Could someone volunteer to test this branch? Especially w.r.t. #580 (review).

@celkas
Copy link

celkas commented Nov 10, 2022

@Apteryks I tested it on Windows. TL;DR same result that You have in #580 (review)

from docx import Document

print("-------------------------------------------")
print("opening doc with custom-props => works fine")
print("-------------------------------------------")
name = r"C:\Python\python-docx-master\features\steps\test_files\doc-customprops.docx"
doc = Document(name)
print(name)
print(doc)
print(doc.custom_properties)

print("-------------------------------------------")
print("access using known values => works fine")
print("-------------------------------------------")
known_propvals = (
    ("CustomPropBool", True),
    ("CustomPropInt", 13),
    ("CustomPropString", "Test String"),
)
for name, expected_value in known_propvals:
    print(name, doc.custom_properties[name])

print("-------------------------------------------")
print("access like a list or dict => fails")
print("-------------------------------------------")
i = 0
for prop in doc.custom_properties:
    if i > 10:
        break
    print(prop)
    i += 1
list(doc.custom_properties)

print("-------------------------------------------")
print("end => never reached")
print("-------------------------------------------")

results in

C:\...>python docx-test.py
-------------------------------------------
opening doc with custom-props => works fine
-------------------------------------------
C:\Python\python-docx-master\features\steps\test_files\doc-customprops.docx
<docx.document.Document object at 0x000001E784E8FF40>
<docx.opc.customprops.CustomProperties object at 0x000001E7870DBE20>
-------------------------------------------
access using known values => works fine
-------------------------------------------
CustomPropBool True
CustomPropInt 13
CustomPropString Test String
-------------------------------------------
access like a list or dict => fails
-------------------------------------------
None
None
None
None
None
None
None
None
None
None
None
Traceback (most recent call last):
  File "C:\...\docx-test.py", line 32, in <module>
    list(doc.custom_properties)
  File "C:\Python\Python310\lib\site-packages\docx\opc\customprops.py", line 26, in __getitem__
    prop = self.lookup(item)
  File "C:\Python\Python310\lib\site-packages\docx\opc\customprops.py", line 70, in lookup
    if child.get("name") == item:
KeyboardInterrupt
^C
C:\...>

@Apteryks
Copy link
Contributor

@Apteryks I tested it on Windows. TL;DR same result that You have in #580 (review)
Thanks for testing. So I think this should be fixed before we poke the maintainer to have a look and hopefully merge; @michael-koeller, is this something you'd be interested looking into? Giving a more dict-like interface to the properties (or storing them in a plain dict directly)

@celkas
Copy link

celkas commented Nov 14, 2022

@michael-koeller, easiest way I see is adding this code to docx\opc\customprops.py:

    def __iter__(self):
        return iter(self._element)

This would make it possible to iterate all custom props in a file.

One needs to be aware, though, that the access to a custom-prop is (then) still not straight forward:

for prop in doc.custom_properties:
    #print(prop) # does not work!
    print(type(prop), prop.items(), prop.get('name'))

@michael-koeller
Copy link
Author

I just added the iteration support on custom properties, and added the relevant test cases for this feature.

@celkas : Thanks for your input. My version is slightly different from yours. The iteration just returns the names of existing custom properties. To obtain the values, any client code may then explicitely access the getter method.

@celkas
Copy link

celkas commented Nov 21, 2022

@Apteryks, looks good to me. Tested the branch, everything OK.

@jakeformico
Copy link

I might be a little out of my depth here, but I'm hoping someone can clear this up. With this, I was successfully able to get and update my custom property, which in my case was used for updating the existing header, but I noticed inside the actual word document nothing changed. The old header stayed the same. Then I checked, and it looks like I need to update the field even after I've updated the custom property itself. Does this functionality also handle updating the fields associated with the custom properties?

@R00118189
Copy link

R00118189 commented Dec 2, 2022

I might be a little out of my depth here, but I'm hoping someone can clear this up. With this, I was successfully able to get and update my custom property, which in my case was used for updating the existing header, but I noticed inside the actual word document nothing changed. The old header stayed the same. Then I checked, and it looks like I need to update the field even after I've updated the custom property itself. Does this functionality also handle updating the fields associated with the custom properties?

I believe you have to refresh it manually in MSWord and save it to keep it displayed or set MSWord to update fields before printing Click FILE > Options > Display, and under Printing options, select the check box for Update fields before printing. see here

Edit: Your header footers won't update unless you print the document with the option ↑. Maybe there is an option to refresh all programmatically; who knows?

@R00118189
Copy link

@michael-koeller, how do we remove or delete custom properties from <class 'docx.opc.customprops.CustomProperties'> object?

@michael-koeller
Copy link
Author

@michael-koeller, how do we remove or delete custom properties from <class 'docx.opc.customprops.CustomProperties'> object?

Good question. That's not covered yet, as it seems.

@michael-koeller
Copy link
Author

@michael-koeller, how do we remove or delete custom properties from <class 'docx.opc.customprops.CustomProperties'> object?

OK, should be working now. 😉

I also added corresponding test cases.

@R00118189
Copy link

R00118189 commented Dec 2, 2022

@michael-koeller do we delete by setting the value (that does not work, as properties stay with empty values) to None or is there a way to access the delete function and get rid of the property altogether?
edit: is it del doc.custom_properties['CustomPropString']

@michael-koeller
Copy link
Author

@michael-koeller do we delete by setting the value (that does not work, as properties stay with empty values) to None or is there a way to access the delete function and get rid of the property altogether?

Ah, sorry. I forgot to explain:

    custom_properties = document.custom_properties
    # delete custom property "Foo"
    del custom_properties["Foo"]

The implementation is defensive. So if there's no custom property to the given key, nothing happens.

@celkas
Copy link

celkas commented Dec 21, 2022

What can be done to make this happen?

@Apteryks
Copy link
Contributor

Apteryks commented Dec 31, 2022

    custom_properties = document.custom_properties
    # delete custom property "Foo"
    del custom_properties["Foo"]

I'm wondering, is there a reason the properties can't be kept in a true Python dict object, instead of a look-alike class? I think the API would be more intuitive if this was the case (we can simply document that the properties are exposed as a dictionary, the same way the environment variables are exposed as such via os.environ). What do you think?

@michael-koeller ^

@michael-koeller
Copy link
Author

I'm wondering, is there a reason the properties can't be kept in a true Python dict object, instead of a look-alike class?

That's already part of the original implementation by @renejsum.

The util class actually operates directly on the XML document tree. So while it might as well be possible to keep the custom properties in a plain dictionary, you would still need code to couple this dictionary instance to the XML elements in the tree.

@Apteryks
Copy link
Contributor

Apteryks commented Jan 2, 2023

Hi everyone, and happy New Year!

I've tested this again, and the test suite passes (============================= 1477 passed in 5.10s =============================), and the implementation seems to work alright.

My preference would be to have a better emulated dict-like interface (where you could call .values() or .items()), but that can always be added later.

@scanny in case you haven't followed this discussion, this is about adding support for custom properties; the change includes tests and has been reviewed and manually tested by myself and others.

@R00118189
Copy link

Lads, what is the roadmap for merging it into the repo?

@Apteryks
Copy link
Contributor

Lads, what is the roadmap for merging it into the repo?

@scanny as the owner of the project has the last word; it seems they haven't had the time to catch up with this thread yet; I hope they do, as it seems to be a much wanted feature :-).

@jzzsxm
Copy link

jzzsxm commented Feb 24, 2023

@michael-koeller and @Apteryks I appreciate all the hard work on this. I've just implemented this new functionality at work and it's a huge time saver. Very well done!

@Kasimashi
Copy link

Any news about it ?

@sHermanGriffiths
Copy link

@michael-koeller @scanny, any updates?

@celkas
Copy link

celkas commented Oct 20, 2023

@michael-koeller it seems there are conflicts to be resolved? At least that is a new message appearing in my feed...

@scanny
Copy link
Contributor

scanny commented Oct 20, 2023

Hi All, apologies for the long hiatus. This PR looks potentially viable. @michael-koeller are you still interested in driving this?

I have a few comments but no sense getting into that if it's gone stale. It would need a rebase given all the changes in getting from 0.8 to 1.0 and dropping the Python 2 support.

Let me know if you're still interested and we can see where to go next.

@michael-koeller
Copy link
Author

michael-koeller commented Oct 23, 2023 via email

@michael-koeller
Copy link
Author

Oops.

While working on a rebase for this MR, I accidentally closed this MR. Sorry for any inconveniences. 😞

As a possible followup, I created a new MR on my recent branch: #1273

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.