Acute accent is missing in PDF #231

Open
Intelligent2013 opened this issue Dec 22, 2023 · 0 comments
Labels: bug (Something isn't working)

Moved from https://github.com/metanorma/pdfa-iso-32000-2/issues/29.

Source: https://github.com/metanorma/pdfa-iso-32000-2/issues/28#issuecomment-1836104849

When the PDF is generated for the whole document, the acute accent is missing:
image

When the PDF is generated for Annex D only (to save time while debugging), it renders correctly:
image

The whole document contains MathML. mn2pdf generates Apache IF XML when the source XML contains MathML (it adds hidden math text for the copy-paste feature). The IF XML contains the wrong character sequence:

<text x="84469" y="185852" dp="1 4532 Z3" foi:struct-ref="38">ᴀ</text>

(the XSL-FO contains the correct sequence ᴀ́)

instead of:

<text x="84469" y="185852" dp="1 4532 Z3" foi:struct-ref="38">ᴀ́</text>

I'll investigate what causes the character conversion: Apache FOP, the Xalan processor, or mn2pdf.
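To narrow down which stage drops or reorders the combining mark, it can help to dump the code points of the same text as it appears in each intermediate file (XSL-FO, IF XML). A minimal Python sketch; `describe` is a hypothetical helper for inspection only, not part of mn2pdf, Xalan, or FOP:

```python
import unicodedata


def describe(text):
    """Return (code point, Unicode name, combining class) for each character."""
    return [
        (f"U+{ord(ch):04X}", unicodedata.name(ch, "<unnamed>"), unicodedata.combining(ch))
        for ch in text
    ]


# The sequence the XSL-FO is expected to contain: base letter, then the mark.
for row in describe("\u1D00\u0301"):
    print(row)
```

A combining class of 0 marks a base character; a non-zero class marks a combining mark, so a missing or leading mark is easy to spot in the output.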

For this AsciiDoc source:

Test Aacutesmall: &#x1D00;&#x0301;

Test Acircumflexsmall: &#x1D00;&#x0302;

Test Adieresissmall: &#x1D00;&#x0308;
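Each test line pairs U+1D00 (small capital A) with a separate combining mark. As far as Unicode normalization is concerned, this pair has no precomposed form, so the accent exists only as long as the two code points stay together and in order. A small Python check of that assumption, using only the standard library:

```python
import unicodedata

small_cap_a_acute = "\u1D00\u0301"  # the 'Aacutesmall' test sequence
capital_a_acute = "A\u0301"         # ordinary capital A + combining acute

# NFC composes A + acute into the single code point U+00C1 ...
print(len(unicodedata.normalize("NFC", capital_a_acute)))    # 1
# ... but leaves the small-capital sequence as two code points.
print(len(unicodedata.normalize("NFC", small_cap_a_acute)))  # 2
```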

The PDF renders as follows:
image

Apache IF XML:

<text x="0" y="185852" dp="10 Z2 -891 Z35 -88 0" foi:struct-ref="37">Test Aacutesmall: </text>
<text x="84469" y="185852" dp="1 4532 Z3" foi:struct-ref="38">ᴀ</text>
...
<text x="0" y="207052" dp="16 Z2 -891 Z19 -77 Z11 -154 Z23 -121 Z3 -99 0" foi:struct-ref="3a">Test Acircumflexsmall: </text>
<text x="108218" y="207052" dp="1 4532 Z3" foi:struct-ref="3b">̂ᴀ</text>
...
<text x="0" y="228252" dp="10 Z2 -891 Z19 -77 Z15 -154 0" foi:struct-ref="3d">Test Adieresissmall: </text>
<text x="95854" y="228252" dp="1 4532 Z3" foi:struct-ref="3e">̈ᴀ</text>
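In the IF XML above, the circumflex and diaeresis runs start with the combining mark instead of the base letter. A rough Python sketch for flagging such runs in an IF XML dump follows. It uses a regular expression instead of a real XML parser, and `misordered_runs` and the shortened `sample` are illustrative assumptions, not mn2pdf or FOP code:

```python
import re
import unicodedata

# Matches the character content of <text> elements in an IF XML snippet.
TEXT_RUN = re.compile(r"<text\b[^>]*>(.*?)</text>", re.DOTALL)


def misordered_runs(if_xml):
    """Return text runs that begin with a combining mark (no base before it)."""
    return [
        run
        for run in TEXT_RUN.findall(if_xml)
        if run and unicodedata.combining(run[0]) != 0
    ]


# First run: correct order, for contrast. The other two mirror the
# mark-before-base order shown in the IF XML above.
sample = (
    '<text x="84469" y="185852">\u1D00\u0301</text>\n'
    '<text x="108218" y="207052">\u0302\u1D00</text>\n'
    '<text x="95854" y="228252">\u0308\u1D00</text>\n'
)
print(len(misordered_runs(sample)))  # 2
```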

AsciiDoc with the full set of combining characters:
document.zip

PDF:
document.presentation.pdf
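A test document of this kind could be generated with a few lines of Python. The contents of the attached document.zip are not reproduced here, so the line format below (modelled on the three test lines earlier) and the helper name `build_lines` are assumptions:

```python
def build_lines(base=0x1D00, first=0x0300, last=0x036F):
    """Build one AsciiDoc test line per combining mark in [first, last]."""
    return [
        f"Test U+{mark:04X}: &#x{base:04X};&#x{mark:04X};"
        for mark in range(first, last + 1)
    ]


lines = build_lines()
print(len(lines))  # 112
print(lines[1])    # Test U+0301: &#x1D00;&#x0301;

# Blank lines keep each test in its own AsciiDoc paragraph.
document = "\n\n".join(lines)
```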

This issue occurs because the sequence 'char x' + 'combining char y' is enclosed in the element <fo:inline xml:lang="none">...</fo:inline>:

		<!-- enclose sequence of 'char x' + 'combining char y' to <lang_none>xy</lang_none> -->
		<xsl:variable name="regex_combining_chars">(.[&#x300;-&#x36f;])</xsl:variable>
		<xsl:variable name="element_name_lang_none">lang_none</xsl:variable>
		<xsl:variable name="tag_element_name_lang_none_open">###<xsl:value-of select="$element_name_lang_none"/>###</xsl:variable>
		<xsl:variable name="tag_element_name_lang_none_close">###/<xsl:value-of select="$element_name_lang_none"/>###</xsl:variable>

		<xsl:template match="text()" mode="update_xml_step2">
			<xsl:variable name="text_" select="java:replaceAll(java:java.lang.String.new(.), $regex_combining_chars, concat($tag_element_name_lang_none_open,'$1',$tag_element_name_lang_none_close))"/>
			<xsl:call-template name="replace_text_tags">
				<xsl:with-param name="tag_open" select="$tag_element_name_lang_none_open"/>
				<xsl:with-param name="tag_close" select="$tag_element_name_lang_none_close"/>
				<xsl:with-param name="text" select="$text_"/>
			</xsl:call-template>
		</xsl:template>

...
	<!-- for correct rendering combining chars -->
	<xsl:template match="*[local-name() = 'lang_none']">
		<fo:inline xml:lang="none"><xsl:value-of select="."/></fo:inline>
	</xsl:template>

This workaround was added specifically to fix the combining character positioning issue (https://issues.apache.org/jira/browse/FOP-3065).
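The text-marking step of the XSLT above can be mimicked outside the stylesheet to see exactly which characters end up inside the `lang_none` element. This Python sketch uses `re` as a stand-in for the Java `String.replaceAll` call; `mark_combining_pairs` is a hypothetical name, and the later conversion of the placeholder tags into elements (`replace_text_tags`) is not reproduced:

```python
import re

# Same pattern as regex_combining_chars: any character followed by one
# mark from the range U+0300..U+036F.
COMBINING_PAIR = re.compile(r"(.[\u0300-\u036f])")

OPEN_TAG = "###lang_none###"
CLOSE_TAG = "###/lang_none###"


def mark_combining_pairs(text):
    """Wrap each 'char x' + 'combining char y' pair in placeholder tags."""
    return COMBINING_PAIR.sub(lambda m: OPEN_TAG + m.group(1) + CLOSE_TAG, text)


print(mark_combining_pairs("Test: \u1D00\u0301"))
```

Text without combining marks passes through unchanged, so only the base-plus-mark pairs are later wrapped in <fo:inline xml:lang="none">.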

Without xml:lang="none", combining characters render with a slight horizontal shift:
image

I'll investigate why only a few combining characters render as #.

Intelligent2013 added the bug (Something isn't working) label Dec 22, 2023
Intelligent2013 self-assigned this Dec 22, 2023
github-project-automation bot moved this to 🆕 New in Metanorma Dec 22, 2023