-
Notifications
You must be signed in to change notification settings - Fork 103
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Don't ignore localized strings that are the same as original #307
base: main
Are you sure you want to change the base?
Don't ignore localized strings that are the same as original #307
Conversation
Avoiding storing identical strings is a clever measure to keep the appstream file small and clean, but it ignores translation differences between languages and their specific locales. Applications tend to "fallback" from missing locales by picking a translation of the same language but from a different locale. For instance, when missing a "pt_BR" translation, some apps will pick the "pt" translation instead. That usually works but there are some cases when it doesn't, such as for international words: Brazilian Portuguese (pt_BR) tends to use them, while European Portuguese (pt) has a translation for everything. This way, "GNOME Boxes" gets translated to "Caixas GNOME" in European Portuguese (pt) but the same "GNOME Boxes" name is expected in Brazilian Portuguese (pt_BR). This was initially reported as a Flatpak issue in https://lists.freedesktop.org/archives/flatpak/2019-May/001578.html Because OP was seeing the wrong translations in Flathub and GNOME Software.
How much does this affect the size of something like the Fedora metadata? Certainly |
I'd say it's probably best to go for correctness, even if it increases the metadata size. Let's see how much bigger it gets and back this change out if it's unacceptably large? |
How about keeping localized strings equal original only when the fallback language ('pt', in Felipe's example) translation is presented? If absent, there will be no fallback and no problem for other languages (e.g. pt_BR). This way, lines wouldn't be added without being required and file size may not grow too much. |
My understanding is that the fallback is done in the application's side. All consumers of appstream.xml will read from it (GNOME Software, Flathub, other-software-store-thingies) and they are the ones using this heuristic to fallback missing translations. |
Old versions of gnome-software use appstream-glib to get the "best" translation, so it should be enough to fix it here for those older versions. For newer versions of gnome-software we parse it directly with libxmlb and so any fix would need to be copied there. I'm going to do some benchmarks compiling the entire fedora repo today with the dedepe functionality and without. It does take some time... |
I'm guessing this wasn't tested :) The XML file is the same size, even post decompression. I think this patch needs to drop |
Hey there, is there anything blocking this PR? Now and then Brazilians ask about this issue. |
Well, there's CI failure and the fact that the patch doesn't work, but other than that... |
@@ -1747,10 +1747,6 @@ as_node_get_localized (const AsNode *node, const gchar *key) | |||
if (g_strcmp0 (xml_lang, "x-test") == 0) | |||
continue; | |||
|
|||
/* avoid storing identical strings */ | |||
data_localized = data->cdata; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You can remove the data_localized variable.
/* avoid storing identical strings */ | ||
data_localized = data->cdata; | ||
if (xml_lang != NULL && g_strcmp0 (data_unlocalized, data_localized) == 0) | ||
continue; | ||
g_hash_table_insert (hash, | ||
as_ref_string_ref (xml_lang != NULL ? xml_lang : xml_lang_c), | ||
(gpointer) data_localized); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Instead of data_localized, you need to use data->cdata
Hi @felipeborges, I tried to update your branch but I don't have rights, so I just commented it. |
Avoiding storing identical strings is a clever measure to keep the
appstream file small and clean, but it ignores translation differences
between languages and their specific locales.
Applications tend to "fallback" from missing locales by picking a
translation of the same language but from a different locale.
For instance, when missing a "pt_BR" translation, some apps will
pick the "pt" translation instead. That usually works but there
are some cases when it doesn't, such as for international words:
Brazilian Portuguese (pt_BR) tends to use them, while European
Portuguese (pt) has a translation for everything.
This way, "GNOME Boxes" gets translated to "Caixas GNOME" in
European Portuguese (pt) but the same "GNOME Boxes" name is
expected in Brazilian Portuguese (pt_BR).
This was initially reported as a Flatpak issue in
https://lists.freedesktop.org/archives/flatpak/2019-May/001578.html
Because OP was seeing the wrong translations in Flathub and
GNOME Software.