Enable copying of a run's font #1308

Open
luissantosHCIT opened this issue Nov 29, 2023 · 4 comments

Comments

@luissantosHCIT

luissantosHCIT commented Nov 29, 2023

Hello,

I am new to this project and have been enjoying the work done.
For my own project, I replace some text fields with specific values.
Unfortunately, I noticed that the formatting gets lost when using the Paragraph::text property.
For example,
[image: paragraph before the replacement]
becomes
[image: paragraph after the replacement]

Inspection of the XML output revealed that the run's embedded font information gets lost.
[image: XML output]

Finally, I found this comment in the source code for the Paragraph class:
[image: source code comment]

Looking deeper, I see why the call to clear() was needed.
In my testing, I can see how Word can sometimes split text over multiple runs.
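To make the split-run problem concrete, here is a minimal sketch in plain Python with no python-docx involved. `FakeRun`, `replace_keeping_first_format`, and the `{{name}}` placeholder are hypothetical names used only for illustration, and the sketch assumes every run in the paragraph shares the same formatting, so collapsing them into one run loses nothing.

```python
from dataclasses import dataclass, field


@dataclass
class FakeRun:
    """Hypothetical stand-in for a run: a piece of text plus its formatting."""
    text: str
    fmt: dict = field(default_factory=dict)


def replace_keeping_first_format(runs, key, value):
    """Replace `key` with `value` across runs, collapsing them into one run.

    A per-run replace misses a key that was split over several runs, so the
    run texts are joined first. The first run's formatting is reused.
    """
    if not runs:
        return []
    joined = "".join(r.text for r in runs)
    return [FakeRun(joined.replace(key, value), dict(runs[0].fmt))]


# The placeholder "{{name}}" stored as three runs with identical formatting.
runs = [
    FakeRun("Dear {{", {"bold": True}),
    FakeRun("name", {"bold": True}),
    FakeRun("}},", {"bold": True}),
]

# Replacing run by run never sees the whole key, so nothing changes.
per_run = [r.text.replace("{{name}}", "Ana") for r in runs]

# Joining first finds the key; the result is a single bold run.
merged = replace_keeping_first_format(runs, "{{name}}", "Ana")
```

This mirrors the trade-off described in this issue: the text replacement works, but only one run's formatting survives.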

As a result, I resorted to logic like the following in my program to recover the font information.


if len(p.runs):
    old_run = p.runs[0]
    p.text = p.text.replace(str(key), str(value))
    new_run = p.runs[0]
    new_run.italic = old_run.italic
    new_run.bold = old_run.bold
    new_run.underline = old_run.underline
    new_run.font = old_run.font
    new_run.style = old_run.style
else:
    p.text = p.text.replace(str(key), str(value))

The one issue is that new_run.font = old_run.font was returning an error.
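That error is what Python raises for any property defined with a getter but no setter. The sketch below reproduces it with hypothetical stand-in classes (`Font` and `ReadOnlyRun` are not python-docx's actual classes), and shows that copying individual attributes through the getter still works.

```python
class Font:
    """Hypothetical stand-in for a font object."""

    def __init__(self, name=None):
        self.name = name


class ReadOnlyRun:
    """Hypothetical stand-in run whose `font` property has a getter only."""

    def __init__(self):
        self._font = Font("Arial")

    @property
    def font(self):
        return self._font


old_run, new_run = ReadOnlyRun(), ReadOnlyRun()
new_run.font.name = "Calibri"

try:
    new_run.font = old_run.font  # no setter is defined for `font`
    error = None
except AttributeError as exc:
    error = exc

# Assigning to attributes of the object returned by the getter is still allowed.
new_run.font.name = old_run.font.name
```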

I went ahead and added the missing font.setter and color.setter to the respective properties in run.py and font.py.
I am going to open a pull request to associate with this issue.

Ideally, changing the text through a Paragraph instance would retain the formatting of one of its runs. I understand this is tricky and probably not worth the time, so I think the ability to copy font formatting from one run to another would help those who assume a single formatting per block of text (meaning that even if Word splits the text into multiple runs, all runs have the same formatting). I think this is a good compromise for me at this time.

Let me know what you think, how I can be of assistance, or how I can adjust my solution in #1309 to better match expectations for this project.

Thank you for your excellent work!

@luissantosHCIT
Author

I should mention that I tested my changes in #1309 and they work. They allowed me to recover the font information and pass it to the new run generated via Paragraph::text.

@scanny
Contributor

scanny commented Dec 1, 2023

@luissantosHCIT if you work at the lxml level you can make copies of that original <w:rPr> element and apply them to new runs. That would generally be more reliable and less tedious than inspecting all the known attributes.

You'll need to search around for examples, but some of the things you'll need to know about are the lxml.etree._Element interface: https://lxml.de/api/lxml.etree._Element-class.html. All elements in python-docx are instances of these, so all these methods are available.

So something roughly like:

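import copy

# Deep-copy the original run's <w:rPr> so the new run gets its own element.
original_rPr = run._r.get_or_add_rPr()
new_rPr = copy.deepcopy(original_rPr)
new_r = para.add_run()._r
# Swap the new run's (empty) <w:rPr> for the copy.
new_r.replace(new_r.get_or_add_rPr(), new_rPr)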
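The same idea can be tried without a .docx file. The sketch below swaps in the standard library's `xml.etree.ElementTree` for lxml and python-docx so that it runs with no dependencies; `qn` and `copy_rpr` are hypothetical helpers written for this illustration, and the XML fragment is a hand-made example rather than output from Word.

```python
import copy
import xml.etree.ElementTree as ET

W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"


def qn(tag):
    """Expand 'w:rPr' into the '{namespace}rPr' form ElementTree uses."""
    _, local = tag.split(":")
    return f"{{{W}}}{local}"


def copy_rpr(src_r, dst_r):
    """Give dst_r a deep copy of src_r's <w:rPr>, replacing any it already has."""
    src_rpr = src_r.find(qn("w:rPr"))
    old = dst_r.find(qn("w:rPr"))
    if old is not None:
        dst_r.remove(old)
    if src_rpr is not None:
        # <w:rPr> is placed first, ahead of the run's text content.
        dst_r.insert(0, copy.deepcopy(src_rpr))


p = ET.fromstring(
    f'<w:p xmlns:w="{W}">'
    '<w:r><w:rPr><w:b/><w:rFonts w:ascii="Georgia"/></w:rPr><w:t>old</w:t></w:r>'
    '<w:r><w:t>new</w:t></w:r>'
    '</w:p>'
)
old_r, new_r = p.findall(qn("w:r"))
copy_rpr(old_r, new_r)
```

Because the whole `<w:rPr>` subtree is copied, every formatting child comes along (bold and font name here), which is why this approach is less tedious than copying known attributes one by one.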

@luissantosHCIT
Author

Hi @scanny. Thank you for taking the time to share the snippet.

Yes, I have used lxml for a while now. I was trying to avoid the lower-level API.

It makes more sense to me to abstract away the private member access with getters and setters.
To me, the natural way to access the elements is through python-docx's Document->{Paragraphs, Tables}->{Runs, Rows, Cells} structure layout.

Do you think it would be better to have a property like formatting that lets users do exactly what you propose at the Run level? I am happy to update my fork and push such changes.

@property
def formatting(self) -> CT_RPr:
    return copy.deepcopy(self._r.get_or_add_rPr())

@formatting.setter
def formatting(self, rPr: CT_RPr):
    text = self._r.text
    new_run = self.add_run()
    new_run._r._insert_rPr(rPr)
    new_run.text = text

What do you think?

@luissantosHCIT
Author

Updated my branch.

I only needed to add the following.
Tested to be working as desired.

    @property
    def formatting(self) -> CT_RPr:
        """The |CT_RPr| object providing access to the full range of formatting properties for
        this run, such as font name and size."""
        return copy.deepcopy(self._r.get_or_add_rPr())

    @formatting.setter
    def formatting(self, new_rPr: CT_RPr):
        self._r.replace(self._r.get_or_add_rPr(), new_rPr)

@scanny Let me know if it looks good or if you would like me to do something else.
