[l10n] Unified Modern Standard Arabic localization #31847

Open
shadigaafar opened this issue Mar 17, 2022 · 17 comments
Labels
discussion enhancement This is not a bug, nor a new feature l10n localization

Comments

@shadigaafar
Contributor

shadigaafar commented Mar 17, 2022

Summary 💡

You currently support Arabic on a per-country basis, but Arabic has a standard form that all speakers understand. So instead of maintaining a separate locale for each Arabic-speaking country, just support Modern Standard Arabic. I don't know if there is a dedicated code for it; I think it is just 'ar'. And since you are only willing to accept up to 100 locales, I think localizing Arabic by country is a waste. After all, Arabs do not write in dialect: all books and websites use the same language, Modern Standard Arabic. If this is possible, I can open a PR.

Examples 🌈

https://mui.com/material-ui/guides/localization/

Instead of having 'ar-SD', 'ar-SA', 'ar-EG', it should only be 'ar'

Motivation 🔦

It does not make any sense to have so many locales for Arabic when all of them use Standard Arabic for writing.

@shadigaafar shadigaafar added the status: waiting for maintainer These issues haven't been looked at yet by a maintainer label Mar 17, 2022
@hbjORbj hbjORbj added docs Improvements or additions to the documentation discussion labels Mar 17, 2022
@hbjORbj hbjORbj changed the title Unified Modern Standard Arabic localization [docs] Unified Modern Standard Arabic localization Mar 17, 2022
@mnajdova
Member

We cannot just remove the existing Arabic localizations, as it would be a breaking change. What you say makes sense, but I am certainly not an expert in Arabic :) Let's wait and see if this gets more upvotes. Would that work? In the meantime, you can set the locale as you see fit in your theme overrides.

@mnajdova mnajdova removed the status: waiting for maintainer These issues haven't been looked at yet by a maintainer label Mar 24, 2022
@shadigaafar
Contributor Author

shadigaafar commented Apr 2, 2022

@mnajdova But you can: as a temporary measure, add Modern Standard Arabic to the list of locales, then remove the old ones in the next major release.

@mbrookes mbrookes added l10n localization and removed docs Improvements or additions to the documentation labels Apr 2, 2022
@mnajdova
Member

mnajdova commented Jun 1, 2022

@mnajdova But you can: as a temporary measure, add Modern Standard Arabic to the list of locales, then remove the old ones in the next major release.

For now, I am leaving this to you to handle in your style overrides. We can decide in v6 whether to make the change, based on upvotes on this issue.

@shadigaafar
Contributor Author

@mnajdova But you can: as a temporary measure, add Modern Standard Arabic to the list of locales, then remove the old ones in the next major release.

For now, I am leaving this to you to handle in your style overrides. We can decide in v6 whether to make the change, based on upvotes on this issue.

Could you please explain more? What do you want me to do exactly?

@flaviendelangle
Member

If we don't decide to drop the various ar-* locales, we could at least merge them with the UMSA version instead of with English (@mnajdova, is the behavior of the core the same as X right now? If a key is missing in a language, show the English version).
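
One way to picture the merge suggested above: a regional locale lists only the keys that differ and inherits the rest from a shared Standard Arabic base, which in turn falls back to English. This is a minimal sketch, not how MUI implements locales; the `LocaleStrings` type, the `resolveLocale` helper, the keys, and the sample strings are all hypothetical.

```typescript
// Hypothetical flat map of translation keys to strings (not MUI's real locale shape).
type LocaleStrings = { [key: string]: string };

const en: LocaleStrings = { next: 'Next page', previous: 'Previous page', close: 'Close' };

// Assumed shared Modern Standard Arabic base; the strings are illustrative only.
const ar: LocaleStrings = { next: 'الصفحة التالية', previous: 'الصفحة السابقة' };

// A regional variant that overrides a single key (made-up override).
const arSD: LocaleStrings = { next: 'التالي' };

// Later layers win, so the lookup order is: regional, then Standard Arabic, then English.
function resolveLocale(...layers: LocaleStrings[]): LocaleStrings {
  let result: LocaleStrings = {};
  for (const layer of layers) {
    result = { ...result, ...layer };
  }
  return result;
}

const resolved = resolveLocale(en, ar, arSD);
console.log(resolved.next);     // taken from arSD
console.log(resolved.previous); // inherited from ar
console.log(resolved.close);    // missing in both Arabic layers, so English is used
```

With this layering, a missing key in a regional locale surfaces Standard Arabic before it ever surfaces English.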

As for dropping the variants, how do similar projects handle the Arabic language(s)?

@shadigaafar
Contributor Author

shadigaafar commented Jun 3, 2022

If we don't decide to drop the various ar-* locales, we could at least merge them with the UMSA version instead of with English (@mnajdova, is the behavior of the core the same as X right now? If a key is missing in a language, show the English version).

As for dropping the variants, how do similar projects handle the Arabic language(s)?

Personally, I have noticed that some apps by Google and Microsoft have many ar-* locales, but switching between them does not really change anything: they are all the same and use Standard Arabic. Maybe the distinction matters to them for other information, such as time formats. The YouTube, Facebook, and Twitter apps support only Standard Arabic (Twitter also supports Feminine Standard Arabic), and almost all Arab apps have a single ar locale, which is the Standard one. WordPress has two ar locales, Standard and Moroccan, but the Moroccan Arabic one is almost dead and inactive.

From my point of view, having many ar locales is overkill and confusing, especially for a small project like this. I mean, the project is not small, but there are not many strings and texts. Too many choices confuse some people. Sometimes you will come across people with nationalist ideas asking to add an ar locale for their own country, but much of the time it ends up dead, like the Moroccan Arabic locale in WordPress. Standard Arabic is understood by all and is documented; people study it from first grade to university.

@mnajdova
Member

mnajdova commented Jun 3, 2022

If we don't decide to drop the various ar-* locales, we could at least merge them with the UMSA version instead of with English (@mnajdova, is the behavior of the core the same as X right now? If a key is missing in a language, show the English version).

The localizations simply fill the theme with default props for the components (for content, generated aria labels, etc.). If there is no default prop for a particular string, or there is no localization at all, then yes, it will be filled with English, as that is what the components have by default.
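
The behavior described above can be modeled roughly as follows. This is a simplified sketch under stated assumptions, not the library's actual code: `ThemeLike`, `PaginationProps`, `resolveLabel`, and the Arabic sample string are hypothetical, and the shape is only loosely modeled on a theme carrying per-component default props.

```typescript
// Hypothetical, simplified shapes; the real theme and component props are richer.
interface PaginationProps {
  labelRowsPerPage?: string;
}
interface ThemeLike {
  components?: { MuiTablePagination?: { defaultProps?: PaginationProps } };
}

// English text built into the component itself (assumed value for this sketch).
const BUILT_IN_LABEL = 'Rows per page:';

// Resolution order: explicit prop, then the locale's default prop, then built-in English.
function resolveLabel(theme: ThemeLike, props: PaginationProps): string {
  const localeDefaults = theme.components?.MuiTablePagination?.defaultProps ?? {};
  return props.labelRowsPerPage ?? localeDefaults.labelRowsPerPage ?? BUILT_IN_LABEL;
}

// A made-up Arabic locale that translates this one string.
const arabicTheme: ThemeLike = {
  components: {
    MuiTablePagination: { defaultProps: { labelRowsPerPage: 'عدد الصفوف في الصفحة:' } },
  },
};
// A theme whose locale does not cover the string at all.
const emptyTheme: ThemeLike = {};

console.log(resolveLabel(arabicTheme, {})); // the locale's default prop
console.log(resolveLabel(emptyTheme, {}));  // the built-in English text
```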

@oliviertassinari oliviertassinari changed the title [docs] Unified Modern Standard Arabic localization [l10n] Unified Modern Standard Arabic localization Oct 19, 2023
@oliviertassinari
Member

oliviertassinari commented Oct 19, 2023

As I understand it from https://en.wikipedia.org/wiki/Varieties_of_Arabic and https://www.youtube.com/watch?v=SDxGAH83cuo there are differences between these locales.

We could add a Modern Standard Arabic locale, maybe as just ar, but it seems that we should still distinguish between the different locales. Looking at the translations we have, I notice that our 'ar-SD', 'ar-SA', 'ar-EG' locales are not identical.
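
A check like the one mentioned above could be scripted along these lines. This is only a sketch: `differingKeys` is a hypothetical helper, the flat key/value shape is an assumption, and the sample strings are made up for illustration rather than taken from the real locale files.

```typescript
type LocaleStrings = { [key: string]: string };

// Returns the keys whose values are not identical across all the given locales.
// A key that is missing from one locale counts as a difference.
function differingKeys(locales: { [code: string]: LocaleStrings }): string[] {
  const codes = Object.keys(locales);
  const seen: { [key: string]: true } = {};
  for (const code of codes) {
    for (const key of Object.keys(locales[code])) {
      seen[key] = true;
    }
  }
  return Object.keys(seen)
    .filter((key) => {
      const first = locales[codes[0]][key];
      return codes.some((code) => locales[code][key] !== first);
    })
    .sort();
}

// Made-up sample data, not the project's actual translations.
const sample = {
  'ar-EG': { next: 'التالي', close: 'إغلاق' },
  'ar-SA': { next: 'الصفحة التالية', close: 'إغلاق' },
  'ar-SD': { next: 'التالي', close: 'إغلاق', open: 'فتح' },
};

const differences = differingKeys(sample);
console.log(differences); // keys that would need a decision before unifying
```

Native speakers could then review only the reported keys instead of the full files.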

@oliviertassinari oliviertassinari added the enhancement This is not a bug, nor a new feature label Oct 19, 2023
@shadigaafar
Contributor Author

@oliviertassinari
Well, don't get your information from Wikipedia. Arabs don't write in dialects; the dialects are only spoken. The written language is always Modern Standard Arabic. Yes, there are differences, and that is why there is a standard. The standard is not a language spoken in everyday life; it is the written language of books, websites, apps, magazines, journals, etc.

@mazenadel19

@oliviertassinari
As @shadigaafar rightly points out, these represent distinct dialects within the Arabic language, much like the variations seen in English with 'en_AU', 'en_CA', 'en_GB', and 'en_US'. Streamlining them under a unified Arabic localization could enhance consistency and accessibility across different Arabic-speaking regions.

@LukasTy
Member

LukasTy commented Dec 5, 2024

Based on mui/mui-x#15693 (comment) I would agree with the proposal.
@mui/material and @mui/x-date-pickers translations don't seem to have the notion of numbers or dates in them, hence, I don't see the point in having country-specific translations if they are exactly the same between the countries. 🤷

@oliviertassinari
Member

oliviertassinari commented Dec 11, 2024

To continue the discussion that is happening in mui/mui-x#15693.

Overall, I believe that our naming strategy is:


I asked the Material UI team, and we agree that it would be best to keep consistency and follow https://mui.com/material-ui/guides/localization/#supported-locales.

@DiegoAndai I think we should push more on this, the default should be to replicate https://mui.com/material-ui/guides/localization/#:~:text=Egypt&text=Saudi%20Arabia&text=Sudan, unless there is a direction change. Ownership of this will actually move to be a Base UI concern.

If there is a direction change, all projects should adapt as fast as they can. I don't think we should wait upstream to propagate downstream.

For languages where a common form for certain things is acceptable, I wouldn't mind landing on a generic locale (like having en instead of enUS and enGB, etc.)

@LukasTy I disagree on this one: en-US and en-GB are not the same locale. Spelling, vocabulary, etc. are different. So I think this is correct. If people want to only support "en" they can pick one of the two.

eu https://github.com/mui/mui-x/blob/master/packages/x-date-pickers/src/locales/eu.ts

This seems to make sense: the same Basque is spoken in France and Spain, so it is the same language in different regions. If we added a region, we would need to duplicate the locale, which is not pragmatic.

mk https://github.com/mui/mui-x/blob/master/packages/x-date-pickers/src/locales/mk.ts

This seems to be a mistake, I saw no justification to change, added to #32288 (comment). It's technically not wrong, but it seems to be inconsistent with the above naming strategy.

Based on mui/mui-x#15693 (comment) I would agree with the proposal.

@LukasTy I think it depends on how we look at this:

  • Locales specificity
    • Today, ar-EG, ar-SA, ar-SD have different translations. Is this correct or is this wrong?
    • If the content should be the same, then ar as a macro language makes sense.
    • If the content should be different, then ar as a macro language doesn't make sense IMHO, it's not generic enough. People can cherry-pick the version they like the most.
  • Arabic as a macro language, Generic Arabic (ara) vs. Standard Arabic (arb) an individual language https://en.wikipedia.org/wiki/ISO_639-3.

As I understand this https://www.reddit.com/r/learn_arabic/comments/193tdsa/comment/khbpjtw/?utm_source=share&utm_medium=web3x&utm_name=web3xcss&utm_term=1&utm_content=share_button, both are used in practice.

I guess the ideal is that we can kill each locale and only have Standard Arabic. I don't know if it works in practice though.

@LukasTy
Member

LukasTy commented Dec 12, 2024

For languages where a common form for certain things is acceptable, I wouldn't mind landing on a generic locale (like having en instead of enUS and enGB, etc.)

I disagree on this one: en-US and en-GB are not the same locale. Spelling, vocabulary, etc. are different. So I think this is correct. If people want to only support "en" they can pick one of the two.

Fair point. I had simple translations in mind, but generally, in most cases enUS and enGB have enough different words to justify the existence of both and keep the specificity. 👍

mk https://github.com/mui/mui-x/blob/master/packages/x-date-pickers/src/locales/mk.ts

This seems to be a mistake, I saw no justification to change, added to #32288 (comment). It's technically not wrong, but it seems to be inconsistent with the above naming strategy.

I would disagree that this is a mistake. 🤷
Why is it a mistake if the language does not have different dialects?
I found references to mk in multiple resources ([1], [2], [3], [4]).
ISO-639-1 seems to agree/allow ar, mk, and similar usages.

Based on mui/mui-x#15693 (comment) I would agree with the proposal.

I think it depends on how we look at this:

  • Locales specificity

    • Today, ar-EG, ar-SA, ar-SD have different translations. Is this correct or is this wrong?
    • If the content should be the same, then ar as a macro language makes sense.
    • If the content should be different, then ar as a macro language doesn't make sense IMHO, it's not generic enough. People can cherry-pick the version they like the most.
  • Arabic as a macro language, Generic Arabic (ara) vs. Standard Arabic (arb) an individual language https://en.wikipedia.org/wiki/ISO_639-3.

Yes, I checked that there are differences in country-specific translations.
But that is up to native speakers to identify, if having them makes sense. 🤔
Given the listed resources, it seems that it is agreed that formal communication is done in Standard Arabic, hence, I think that providing a single locale is the most optimal solution in this case. Especially when native speakers agree with it... 🤷 🙈

@SafaAlfulaij

To give the bigger picture:

  • Almost every Arab country (and districts within countries) has its own alteration of Arabic, used as the spoken language.
  • An alteration of Arabic is based on Standard Arabic, with changes that extend to ways of speech, different words, and grammar.
  • The degree of divergence varies from one country to another.
  • The spoken language is used only for spoken communication and for non-official/non-standard written communication within the same country (e.g. social media).
  • To communicate across Arab countries, spoken or written, one would lean more toward Standard Arabic.
  • All official/formal/news communication (spoken and written) is done in Standard Arabic.
  • When translating, as we have translators from all the different countries and want the knowledge to reach all Arab countries, we use Standard Arabic.
    • There are exceptions of course, as Wikipedia has a spoken Egyptian version. I can't provide exact numbers or figures, but the Egyptian versions of articles are much shorter than the Standard Arabic ones, generally because far fewer people are willing to contribute to them than to Standard Arabic. (I would personally also say that most people in Arab countries try to find the (Standard) Arabic version instead, because they may not understand what is written in the Egyptian version.)

TL;DR: We should have one Arabic localization language.

Now, some operating systems provide languages per country (ex. Android), but the differences are in numeric, date and time, currencies and locale-based settings. Since they don't want to confuse end-users by providing two settings, they had to combine them into one.

We could look into cases where MUI should be different based on locale settings (ex. number formats), and see how that can be handled. This is a rather hard subject as you could use, in the operating system, the English language as a translation, but with an Arabic locale. Something like "Created on Jun 24th, 2024" will be "Created on ‏٢٤ يناير ٢٠٢٤" (notice how it can be weird to read).
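
The mixed result described above is easy to reproduce with the standard `Intl.DateTimeFormat` API. The following is a small sketch, assuming a runtime with full ICU locale data; the `ar-EG-u-nu-arab` tag explicitly requests Arabic-Indic digits, and the date and the `Created on` text are illustrative choices.

```typescript
// A UI whose text is English but whose dates follow an Arabic locale.
const created = new Date(Date.UTC(2024, 0, 24)); // 24 January 2024

// BCP 47 tag: Arabic (Egypt), with the Arabic-Indic numbering system requested explicitly.
const dateFormatter = new Intl.DateTimeFormat('ar-EG-u-nu-arab', {
  year: 'numeric',
  month: 'long',
  day: 'numeric',
  timeZone: 'UTC',
});

// English sentence, Arabic date: left-to-right text followed by a right-to-left run.
const label = `Created on ${dateFormatter.format(created)}`;
console.log(label);
```

The text language and the formatting locale are independent inputs here, which is what makes the combination possible and also what makes it read oddly.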

@oliviertassinari
Member

oliviertassinari commented Dec 13, 2024

Why is it a mistake if the language does not have different dialects?
I found references to mk in multiple resources ([1], [2], [3], [4]).
ISO-639-1 seems to agree/allow ar, mk, and similar usages.

@LukasTy Technically, mk is correct. However, if we follow a strategy of using the most specific locale possible without being overkill, then mk is too generic; mk-MK would be more accurate.

What would be the value of such a policy? As I understand locales, they are meant to set conventions on how to communicate: human <> human, and machine <> human. Because humans barely move, and machines often have a physical fixed location, those locales are very often tied to specific GPS coordinates, so to a region.

By defaulting to be specific, we avoid the cases where we name a locale and then realize there is another frequently used one that shares the same prefix, and so is confusing.
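
To see which existing codes would be affected by a "be specific by default" policy, one could list the codes that carry no region subtag. This is a rough sketch: `regionOf` is a hypothetical helper that handles only simple tags such as `ar`, `ar-EG`, or `zh-Hant-TW`, not every BCP 47 form, and the list of codes is just the ones mentioned in this thread.

```typescript
// Minimal parser for simple language tags; a simplification of BCP 47.
function regionOf(tag: string): string | undefined {
  const subtags = tag.split('-').slice(1);
  for (const subtag of subtags) {
    if (subtag.length === 1) break; // a singleton starts an extension, stop looking
    if (/^([A-Za-z]{2}|[0-9]{3})$/.test(subtag)) return subtag.toUpperCase();
  }
  return undefined;
}

const codes = ['ar-EG', 'ar-SA', 'ar-SD', 'en-US', 'eu', 'mk'];

// Codes that name only a language and no region.
const languageOnly = codes.filter((code) => regionOf(code) === undefined);
console.log(languageOnly);
```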

it seems that it is agreed that formal communication is done in Standard Arabic, hence, I think that providing a single locale is the most optimal solution in this case.

This makes sense to me, a playbook could look like this:

  • Update the existing ar-* locales to be all the same and to Standard Arabic
  • Release, wait to see if anyone complains
  • Add ar Standard Arabic. (and we don't use general Arabic)
  • Deprecate the ar-* locales
  • Remove the ar-* locales
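
The middle of this playbook (unify, add ar, deprecate) could look roughly like the sketch below. It is only an illustration: `deprecatedLocale` is a hypothetical helper, the getter-style API and the collected `warnings` array are assumptions made so the example is self-contained, and the sample string is made up.

```typescript
type LocaleStrings = { [key: string]: string };

// The single Standard Arabic locale (sample string is illustrative).
const ar: LocaleStrings = { close: 'إغلاق' };

// Collected here instead of calling console.warn, to keep the sketch easy to inspect.
const warnings: string[] = [];

// Keeps an old name working, but warns the first time it is used.
function deprecatedLocale(oldName: string, replacement: LocaleStrings): () => LocaleStrings {
  let warned = false;
  return () => {
    if (!warned) {
      warned = true;
      warnings.push(`${oldName} is deprecated; use ar instead.`);
    }
    return replacement;
  };
}

// Every ar-* name resolves to the same Standard Arabic strings.
const getArEG = deprecatedLocale('arEG', ar);
const getArSD = deprecatedLocale('arSD', ar);

getArEG();
getArEG(); // the second call does not warn again
getArSD();
console.log(warnings);
```

Removing the ar-* locales in a later major release then amounts to deleting the aliases.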

some operating systems provide languages per country (ex. Android), but the differences are in numeric, date and time, currencies and locale-based settings

@SafaAlfulaij Then this could be a problem: today, our locale includes date and number formatting. What's the difference?

We are exploring solutions at #24495. It would make sense to decouple the number format locale from the date format locale and from the text locale. As you illustrated in the case below ⬇️

This is a rather hard subject as you could use, in the operating system, the English language as a translation, but with an Arabic locale. Something like "Created on Jun 24th, 2024" will be "Created on ‏٢٤ يناير ٢٠٢٤" (notice how it can be weird to read).

I have something similar on my phone (Android). I have set en as the locale and France as the region. It formats the dates like we do in France (strange I guess?), the numbers like English people do, and uses English for the text.
I find it frustrating that it proposes only en for the locale, I would want to set en-US.

On my laptop (macOS), I have the same setup, but this time, it feels right: locale: en-US, region: France, date: en-US, number: en-US. First day of week: France. Metric system: France.
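
A decoupled setup like the macOS one described above could be sketched with one locale per concern. This is an assumption-laden illustration, not a proposed API: `LocaleSettings`, `describeOrder`, and the `messages` table are hypothetical, while `Intl.DateTimeFormat` and `Intl.NumberFormat` are the standard formatting APIs.

```typescript
// Hypothetical settings object: one locale per concern instead of a single locale.
interface LocaleSettings {
  text: string;
  date: string;
  number: string;
}

// Made-up message table keyed by text locale.
const messages: { [locale: string]: { total: string; createdOn: string } } = {
  'en-US': { total: 'Total', createdOn: 'Created on' },
};

function describeOrder(settings: LocaleSettings, amount: number, when: Date): string {
  const text = messages[settings.text];
  const date = new Intl.DateTimeFormat(settings.date, {
    year: 'numeric',
    month: '2-digit',
    day: '2-digit',
    timeZone: 'UTC',
  }).format(when);
  const total = new Intl.NumberFormat(settings.number).format(amount);
  return `${text.createdOn} ${date}. ${text.total}: ${total}`;
}

// English text, French date order, English number grouping.
const summary = describeOrder(
  { text: 'en-US', date: 'fr-FR', number: 'en-US' },
  1234.5,
  new Date(Date.UTC(2024, 0, 24)),
);
console.log(summary);
```

Keeping the three locales separate lets a single Standard Arabic text locale coexist with country-specific date and number conventions.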

@flaviendelangle
Member

Update the existing ar-* locales to be all the same and to Standard Arabic
Release, wait to see if anyone complains
Add ar Standard Arabic. (and we don't use general Arabic)
Deprecate the ar-* locales
Remove the ar-* locales

The pickers don't have any ar-* locale, and the grid only has ar-SD, so the deprecation phase would be quite easy there.
For the core, you have more ar-* locales.
