dnf5 does not use localized package summaries and descriptions (or field labels etc.) #1687

Open
musicinmybrain opened this issue Sep 10, 2024 · 7 comments
Labels
Priority: LOW RFE Request For Enhancement (as opposed to a bug) Triaged Someone on the DNF 5 team has read the issue and determined the next steps to take

Comments

@musicinmybrain

As reported in #1685 (comment):

There are also differences in localization, which kind of worked in rpm, kind of worked differently in dnf 4, and doesn’t work at all in dnf5. You can compare these in Fedora 40:

  • LC_ALL=zh_CN.utf8 rpm -qi python3-fastapi: uses localized summary and description text and build date; does not localize its own fixed strings (field labels)
  • LC_ALL=zh_CN.utf8 dnf info python3-fastapi: localizes all of its own strings (dates, field labels), but does not use the localized summary and description text
  • LC_ALL=zh_CN.utf8 dnf5 info python3-fastapi: ignores locale completely, hex-escapes in the default English description

(Don’t try other locales for now; I just fixed an issue with the other localized descriptions in the python-fastapi spec file, and the updates for this are not yet in stable repositories.)

Displaying localized text will require proper Unicode support, #1685, for most locales.

@musicinmybrain musicinmybrain changed the title dnf5 does not use localized package summaries and descriptions dnf5 does not use localized package summaries and descriptions (or field labels etc.) Sep 10, 2024
@ppisar
Contributor

ppisar commented Sep 13, 2024

There are two separate issues in the "dnf info" output:

  • Localizing the DNF5 user interface, i.e. field names (e.g. "Summary") and locale-dependent values (e.g. the build time).
  • Presenting summary and description in the variants matching a current locale.

The first is solely a DNF problem.

The latter depends on whether or not the package is already installed. For a package that is not installed, it depends on what is stored in the repository.

Looking at Fedora repositories, I cannot find any translations in them. Hence, I worry DNF5 won't be able to display translations of summary and description. That confirms your DNF4 tests.

I tried creating a local repository with a package that has a summary in multiple languages, and the result was disappointing: the repository stored only the text for the locale that was in use when the repository was created, and stored it as if it were English text. That means the tool Fedora uses for creating repositories cannot create repositories with translations.

@ppisar ppisar added RFE Request For Enhancement (as opposed to a bug) Priority: LOW Triaged Someone on the DNF 5 team has read the issue and determined the next steps to take labels Sep 13, 2024
ppisar added a commit to ppisar/dnf5 that referenced this issue Sep 13, 2024
Displaying a package size, an installation size, a download speed etc.
formats decimal numbers to 1-digit precision after the decimal point
(49.5 KiB). However, users expect the number to be formatted according
to their locale. E.g. in cs_CZ.UTF-8, it is "49,5 KiB".

DNF5 formats these values with fmt::format(), which uses the C++ locale
if the "L" formatting option is used.

The C++ locale (std::locale::global()) and the C locale (setlocale(),
C++-wrapped as std::setlocale()) are two different things, and DNF5
has only set the C locale up to now.

This patch starts setting the C++ locale, which also implicitly sets
the C locale. This patch also modifies
libdnf5::cli::utils::units::format_size_aligned() to use the
locale-dependent decimal separator (available since fmt-8.0.0).

I manually tested dnf5 and dnf5daemon-client and they work for me.

There is a risk, though, that the new C++ locale will affect some code
unknown to me, like regular expression matching, or thread-specific
locales. If the affected code turns out to be unfixable, we can resort
to saving the desired C++ locale into a dedicated object accessible to
format_size_aligned() and passing it explicitly to fmt::format.
Thorough testing is welcome.

Related: rpm-software-management#1687
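
The fallback mentioned in the commit message (keeping the desired C++ locale in a dedicated object and passing it explicitly) could look roughly like the sketch below. This is an illustration only, not the patch itself: it uses standard iostreams instead of fmt::format(), and the names CommaDecimal and format_kib are hypothetical. The custom facet stands in for an installed cs_CZ.UTF-8 locale so that the example does not depend on which langpacks are present.

```cpp
#include <iomanip>
#include <locale>
#include <sstream>
#include <string>

// Hypothetical facet: forces ',' as the decimal separator, the way a
// cs_CZ.UTF-8 locale would, without requiring that locale to be installed.
struct CommaDecimal : std::numpunct<char> {
protected:
    char do_decimal_point() const override { return ','; }
};

// Hypothetical helper (not the real format_size_aligned()): formats a size
// with one digit after the decimal point using an explicitly passed locale.
std::string format_kib(double kib, const std::locale & loc) {
    std::ostringstream out;
    out.imbue(loc);  // use the given locale instead of the global C++ locale
    out << std::fixed << std::setprecision(1) << kib << " KiB";
    return out.str();
}
```

Calling format_kib(49.5, std::locale::classic()) uses '.' as the separator, while passing std::locale(std::locale::classic(), new CommaDecimal) switches it to ','. Because the locale is an argument, no other code that reads the global locale is affected.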
ppisar added a commit to ppisar/dnf5 that referenced this issue Sep 16, 2024
@jrohel
Contributor

jrohel commented Sep 19, 2024

Fedora 40 currently has dnf5 version 5.1.17. That version does not use the locale.
Support was added upstream some time ago. I assume the problem was solved by commit 3d302e0,
that is, since dnf5 version 5.2.0.

@musicinmybrain
Author

What I see when I try this in a Fedora 41 mock chroot with --enable-network and dnf5 version 5.2.5.0 is weird.

# dnf info python3-fastapi
[…]
               :   • Fast: Very high performance, on par with NodeJS and Go (thanks to Starlette
[…]
# dnf info uv
[…]
               :   • ⚖️ Drop-in replacement for common pip, pip-tools, and virtualenv commands.
[…]

…so Unicode output is working in my default locale. But now, if I set it explicitly:

# LC_ALL=en_US.UTF-8 dnf info python3-fastapi
[…]
               :   \xe2\x80\xa2 Fast: Very high performance, on par with NodeJS and Go (thanks to Starlette
[…]
# dnf info uv
[…]
               :   \xe2\x80\xa2 \xe2\x9a\x96\xef\xb8\x8f Drop-in replacement for common pip, pip-tools, and virtualenv commands.
[…]

Ouch, now all the non-ASCII symbols are escaped as in the original report for #1685!

If I try LC_ALL=zh_CN.UTF-8 the same thing happens, and the field labels (Name/Epoch/Version/etc.) are still in English.

@jrohel
Contributor

jrohel commented Sep 19, 2024

the field labels (Name/Epoch/Version/etc.) are still in English.

That is a dnf5 problem. The newly opened PR #1698 marks these strings for translation. After that, the translations for each language will need to be added.

@jrohel
Contributor

jrohel commented Sep 19, 2024

But now, if I set it explicitly:

I tried it and got Failed to set locale, defaulting to "C":

# LC_ALL=en_US.UTF-8 dnf5 info python3-fastapi
Failed to set locale, defaulting to "C"
...
Description    : FastAPI is a modern, fast (high-performance), web framework for building APIs
               : with Python 3.8+ based on standard Python type hints.
               : 
               : The key features are:
               : 
               :   \xe2\x80\xa2 Fast: Very high performance, on par with NodeJS and Go (thanks to Starlette
               :     and Pydantic). One of the fastest Python frameworks available.
...

So I installed the langpack for English: dnf5 install glibc-langpack-en

And now it is working:

# LC_ALL=en_US.UTF-8 dnf5 info python3-fastapi
...
Description    : FastAPI is a modern, fast (high-performance), web framework for building APIs
               : with Python 3.8+ based on standard Python type hints.
               : 
               : The key features are:
               : 
               :   • Fast: Very high performance, on par with NodeJS and Go (thanks to Starlette
               :     and Pydantic). One of the fastest Python frameworks available.
...

Can you check if you have the same problem?

@musicinmybrain
Author

LC_ALL=en_US.UTF-8 dnf5 info python3-fastapi

Ok, I’m seeing the same thing. For dnf5 5.2.6, if the glibc-langpack-## package corresponding to the selected locale is not installed, everything is output to the console as ASCII with backslash-escapes because setlocale fails and we end up with the C locale. That is actually logical – and the reason I can’t reproduce it in a clean Fedora 40 chroot with dnf 4.21.1 is that the fallback locale in F41 and later is C, but the fallback locale in Fedora 40 and earlier is C.UTF-8. That surprises me, but isn’t dnf’s responsibility.
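
To make the escaped output above concrete, here is a minimal sketch of byte-wise hex-escaping. It is only an illustration of the observed behaviour; I am not claiming this is how or where dnf5 (or a library underneath it) does it, and the function name escape_non_ascii is hypothetical.

```cpp
#include <cstdio>
#include <string>

// Hypothetical helper: copies ASCII bytes unchanged and replaces every byte
// >= 0x80 with a "\xNN" escape, which is the shape of the output seen when
// the process runs in the plain "C" locale.
std::string escape_non_ascii(const std::string & text) {
    std::string out;
    for (unsigned char byte : text) {
        if (byte < 0x80) {
            out += static_cast<char>(byte);
        } else {
            char buf[5];  // backslash, 'x', two hex digits, terminating NUL
            std::snprintf(buf, sizeof(buf), "\\x%02x", byte);
            out += buf;
        }
    }
    return out;
}
```

The bullet "•" (U+2022) is the three UTF-8 bytes e2 80 a2, so it comes out as \xe2\x80\xa2, matching the dnf info output quoted earlier in this thread.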

@musicinmybrain
Author

The string Failed to set locale, defaulting to "C" is actually coming from dnf5 itself, and so the discrepancy between C.UTF-8 and C is also just a string change in dnf5, and doesn’t necessarily reflect a difference in the system.

dnf5/dnf5/main.cpp

Lines 959 to 965 in 117bc3e

static void set_locale() {
    auto * locale = setlocale(LC_ALL, "");
    if (locale) {
        return;
    }
    std::cerr << "Failed to set locale, defaulting to \"C\"" << std::endl;
}
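
For comparison, here is a sketch of a variant that returns the name of the locale that actually ended up active, so a caller could print the real fallback instead of a fixed string. This is an assumption about one possible approach, not a proposed patch; the name set_locale_or_fallback is hypothetical.

```cpp
#include <clocale>
#include <string>

// Hypothetical helper: tries to activate the requested locale ("" means
// "take it from the environment") and returns the name of the locale that
// is active afterwards.
std::string set_locale_or_fallback(const char * requested) {
    if (const char * active = std::setlocale(LC_ALL, requested)) {
        return active;
    }
    // On failure setlocale() returns a null pointer and leaves the current
    // locale unchanged, so select "C" explicitly and report that.
    const char * fallback = std::setlocale(LC_ALL, "C");
    return fallback ? fallback : "C";
}
```

With this shape, the message could be built from the returned name, so the reported fallback reflects the locale the process actually ends up in.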
