resolves #2485 convert string with non-ASCII characters to NFD when applying smallcaps transformation #2486

Merged
merged 1 commit into from
Mar 10, 2024
1 change: 1 addition & 0 deletions CHANGELOG.adoc
@@ -31,6 +31,7 @@ Enhancements::
* add symbols for shift, command, option, and return keys to the fallback font
* add support for `nowrap` and `nobreak` roles on formatted text
* do not insert break opportunities into bare URL with `nobreak` role
* convert string with non-ASCII characters to NFD when applying smallcaps transformation to support diacritics (#2485)

Improvements::

39 changes: 5 additions & 34 deletions docs/modules/theme/pages/text.adoc
@@ -144,44 +144,15 @@ It accepts the following keywords:
capitalize:: Transforms the first letter of each word to a capital letter.
lowercase:: Transforms all the text to lowercase letters.
none:: Clears an inherited value and no case transformation is applied to the text.
smallcaps:: Replaces lowercase ASCII letters (a-z) with their small capital variant.
Lowercase letters outside the ASCII range are not transformed.
smallcaps:: Normalizes text to NFD (Normalization Form D, canonical decomposition) and replaces lowercase ASCII letters (a-z) with their small capital variant.
Because the text is normalized to NFD, a diacritical mark is applied to the small capital letter using a combining character.
uppercase:: Transforms all the text to capital letters.

The `text-transform` key can't be set on the xref:base.adoc[base category].

If you want the smallcaps transformation to support letters beyond the a-z range, you can do so by overriding the `smallcaps` method in an extended converter.

[,ruby]
----
class MyPDFConverter < (Asciidoctor::Converter.for 'pdf')
  register_for 'pdf'

  def smallcaps string
    string = super
    string = string.gsub 'é', %(\u1d07\u0301)
    string
  end
end
----

This transformation can be automated using `String#unicode_normalize` with the `:nfd` form.
This method will rewrite all characters with diacritical marks so that the diacritical mark is added using a combining character (i.e., a two code point form: the base letter followed by the combining mark).

[,ruby]
----
class MyPDFConverter < (Asciidoctor::Converter.for 'pdf')
  register_for 'pdf'

  def smallcaps string
    string = string.unicode_normalize :nfd unless string.ascii_only?
    super
  end
end
----

The smallcaps transformation for extended Latin characters (e.g., characters that include an accent) typically requires the addition of a combining character, such as the combining acute accent in the example above.
Therefore, you must ensure that the font you're using supports these combining characters (meaning it provides the necessary glyphs).
The smallcaps transformation for extended Latin characters (e.g., characters outside the a-z range that include an accent) requires the use of a combining character, such as the combining acute accent.
Therefore, if you want the smallcaps transformation to support letters beyond the a-z range, you need to ensure the font you're using provides the required https://en.wikipedia.org/wiki/Combining_character[combining characters^] to support diacritics (meaning it provides the necessary glyphs).
Otherwise, these characters will appear as missing glyph boxes in the transformed text.
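
The decomposition this paragraph relies on can be sketched in plain Ruby. The snippet below is only an illustration and is not part of this change; `codepoint_labels` is a hypothetical helper introduced here to print code points, while `String#unicode_normalize` is the standard Ruby method the change uses.

```ruby
# Illustrative sketch only; codepoint_labels is a hypothetical helper,
# not part of Asciidoctor PDF.
def codepoint_labels string
  # Format each code point as U+XXXX so the two forms can be compared.
  string.codepoints.map { |cp| format('U+%04X', cp) }
end

composed = "\u00e9" # é as a single precomposed code point
decomposed = composed.unicode_normalize :nfd # base letter e plus combining acute accent

puts codepoint_labels(composed).join(' ')   # prints U+00E9
puts codepoint_labels(decomposed).join(' ') # prints U+0065 U+0301
```

The second code point, U+0301, is the combining acute accent, which is the glyph the font has to provide for the accent to render over the small capital letter.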

[#superscript]
== Superscript
1 change: 1 addition & 0 deletions lib/asciidoctor/pdf/text_transformer.rb
@@ -66,6 +66,7 @@ def smallcaps_pcdata string
end

def smallcaps string
  string = string.unicode_normalize :nfd unless string.ascii_only?
  string.tr LowerAlphaChars, SmallCapsChars
end
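
To see the effect of the added line in isolation, here is a minimal standalone sketch. The constants `SKETCH_LOWER_ALPHA` and `SKETCH_SMALL_CAPS` and the method `smallcaps_sketch` are hypothetical stand-ins defined only for this example. The mapping is an assumption inferred from the expected strings in the spec and is not copied from the library's `LowerAlphaChars` and `SmallCapsChars` constants.

```ruby
# Standalone sketch; these names are hypothetical and exist only in this example.
SKETCH_LOWER_ALPHA = 'a-z'
# Assumed mapping: one replacement per letter a-z (26 characters);
# q, s, and x are left as-is here.
SKETCH_SMALL_CAPS = 'ᴀʙᴄᴅᴇғɢʜɪᴊᴋʟᴍɴᴏᴘqʀsᴛᴜᴠᴡxʏᴢ'

def smallcaps_sketch string
  # Decompose accented letters so the base letter falls inside a-z;
  # pure ASCII input skips normalization.
  string = string.unicode_normalize :nfd unless string.ascii_only?
  # Replace each lowercase ASCII letter; combining marks pass through untouched.
  string.tr SKETCH_LOWER_ALPHA, SKETCH_SMALL_CAPS
end

puts smallcaps_sketch 'Références'
```

Without the normalization step, the precomposed é would fall outside the a-z range and be left unchanged by `tr`. With it, the base e is replaced and the combining accent follows the small capital letter.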

5 changes: 5 additions & 0 deletions spec/formatted_text_formatter_spec.rb
@@ -1024,6 +1024,11 @@
  (expect pdf.lines).to eql ['HTML stands for HʏᴘᴇʀTᴇxᴛ Mᴀʀᴋᴜᴘ Lᴀɴɢᴜᴀɢᴇ']
end

it 'should decompose non-ASCII characters when applying smallcaps text transform' do
  pdf = to_pdf '== Références', pdf_theme: { heading_text_transform: 'smallcaps' }, analyze: true
  (expect pdf.lines).to eql [%(R\u1d07\u0301ғ\u1d07\u0301ʀᴇɴᴄᴇs)]
end

it 'should allow custom role to specify relative font size' do
  pdf_theme = {
    heading_h2_font_size: 24,