Add i18n to RStudio (initial elaboration and work) #149

Closed
blairdrummond opened this issue Nov 2, 2020 · 12 comments · Fixed by #185
Labels
size/XL 6+ days

Comments

@blairdrummond
Contributor

Look into bilingualism options for RStudio.

@brendangadd brendangadd transferred this issue from StatCan/aaw Nov 5, 2020
@brendangadd brendangadd changed the title Bilingual RStudio? Add i18n to RStudio Nov 5, 2020
@wg102 wg102 self-assigned this Nov 10, 2020
@wg102
Contributor

wg102 commented Nov 16, 2020

The i18n of RStudio can only be partial. The menus and UI text remain in English, but the content inside (the text and the error messages) can be in French if the locale environment variables are set to French.

This is the code to add the French Locale and set it.

RUN echo "fr_CA.UTF-8 UTF-8" > /etc/locale.gen && \
    locale-gen
# Configure environment
ENV CONDA_DIR=/opt/conda \
    LC_ALL=fr_CA.UTF-8 \
    LANG=fr_CA.UTF-8 \
    LANGUAGE=fr_CA.UTF-8

The next step is to find out how to detect the language and set the environment variable at runtime.
The current idea is to create a 'transparent layer', similar to the remote-desktop dashboard but without a UI, that checks the browser language before it opens for the user.

@wg102
Contributor

wg102 commented Nov 20, 2020

To have R-studio in ‘French’, LANG needs to be set to the correct locale.
The approach that was decided on is to “take the active language in the KF UI and automatically submit it as part of the "New Server" payload, and make the controller pass that locale as an env var (as you suggest) to all notebooks it launches. Then any container can find locale information at a known location and do with it as it pleases.” In other words, the locale is sent when a new notebook is created.
This is equivalent to testing with docker run -e LANG=fr_CA.UTF-8 imageTag, which overrides whatever value the Dockerfile sets for that environment variable.
The changes therefore need to be applied in multiple places, and some things still need to be verified:
Kubeflow-container: add the French locale (TODO: decide where to add the locale: in the base file, or in the r-studio file).
Jupyter-api: add the language detection (with a controllable UI). Whatever the setting is, it will inject:

  • LANG
  • LANGUAGE
  • LC_ALL ?

R-Studio (image) only needs LANG.
The R-studio in remote desktop needs both LANG and LANGUAGE.
Other applications might be impacted when changing the locales. (to be investigated)
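
To make the injection step concrete, here is a minimal sketch of how a UI language could be turned into the variables listed above. The function name, the lookup table and the English default are assumptions for illustration only; fr_CA.UTF-8 is the only locale actually used in this thread, and LC_ALL is left out because it is still an open question.

```python
# Hypothetical mapping from a UI language code to an installed locale.
# Only fr_CA.UTF-8 appears in this thread; the English entry is an assumption.
SUPPORTED_LOCALES = {
    "en": "en_US.UTF-8",
    "fr": "fr_CA.UTF-8",
}
DEFAULT_LANGUAGE = "en"


def locale_env(ui_language):
    """Return the env vars to inject for a UI language such as 'fr' or 'fr-CA'."""
    # Keep only the primary language subtag: 'fr-CA' and 'fr_CA' both become 'fr'.
    primary = ui_language.strip().lower().replace("_", "-").split("-")[0]
    locale = SUPPORTED_LOCALES.get(primary, SUPPORTED_LOCALES[DEFAULT_LANGUAGE])
    return {"LANG": locale, "LANGUAGE": locale}


print(locale_env("fr-CA"))
```

Unknown languages fall back to the default rather than failing, so a notebook still starts if the UI reports a language with no installed locale.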

@wg102
Contributor

wg102 commented Nov 25, 2020

From what I gathered, the way to have environment variables would be through the PodDefault (see https://www.kubeflow.org/docs/notebooks/setup/ step 12)
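
As a rough sketch of what that could look like, the snippet below builds a PodDefault manifest that injects the locale variables. The resource name, namespace and selector label are hypothetical placeholders, and this is an illustration of the idea rather than the configuration that was deployed.

```python
import json


def locale_pod_default(namespace, locale="fr_CA.UTF-8"):
    """Build a PodDefault manifest that injects locale variables."""
    return {
        "apiVersion": "kubeflow.org/v1alpha1",
        "kind": "PodDefault",
        # Name and namespace are placeholders.
        "metadata": {"name": "locale-fr", "namespace": namespace},
        "spec": {
            "desc": "Use the French (Canada) locale",
            # Pods carrying this (hypothetical) label receive the variables below.
            "selector": {"matchLabels": {"locale-fr": "true"}},
            "env": [
                {"name": "LANG", "value": locale},
                {"name": "LANGUAGE", "value": locale},
            ],
        },
    }


manifest = locale_pod_default("my-namespace")
print(json.dumps(manifest, indent=2))
```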

@wg102
Contributor

wg102 commented Nov 27, 2020

The short answer for this issue is to set the environment variable LANG to the desired language. For this to work, the locale for that language must also be available in the image (e.g. fr_CA.UTF-8).
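
One way to check the second condition from inside a container is to try to activate the locale. This is only a sketch: the helper name is made up, and the result depends on which locales the image has generated.

```python
import locale


def locale_available(name):
    """Return True if the C library accepts `name` as a locale."""
    previous = locale.setlocale(locale.LC_ALL)  # query the current setting
    try:
        locale.setlocale(locale.LC_ALL, name)
        return True
    except locale.Error:
        return False
    finally:
        # Restore whatever was active before the probe.
        locale.setlocale(locale.LC_ALL, previous)


print("fr_CA.UTF-8 available:", locale_available("fr_CA.UTF-8"))
```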

@wg102
Contributor

wg102 commented Dec 1, 2020

This issue is split into two parts,

Note: The locales are now added as part of the Dockerfile, see https://github.com/StatCan/kubeflow-containers/blob/d2b7863936af5e42ae2d4f342d1524887c1703db/docker-bits/0_Spark.Dockerfile#L8

Some other components in kubeflow-container may need to do similar things. i18n might be related to LANG, LANGUAGE and LC_ALL env variables.

@ca-scribner
Contributor

Started work on internationalizing the menus and commands. Command names/labels are defined by XML in src/org/rstudio/studio/client/workbench/commands/Commands.cmd.xml, which is then used by GWT in src/org/rstudio/core/Core.gwt.xml to generate Java classes at compile(?) time. Not sure where these generated classes go yet, but should be able to modify this code to make the text getters use internationalization.

@ca-scribner
Contributor

The commands defined in the Commands.cmd.xml file are used via deferred binding to create the Java classes that actually use the commands in menus (for say the menu dropdown lists). An example of part of one of these xml files is:

<commands>
    <!-- id: not internationalizable (the command's id name; never shown in the UI, only used to access the command) -->
    <!-- menuLabel, desc: internationalizable -->
    <!-- rebindable: not internationalizable -->
    <cmd id="newPythonDoc"
        menuLabel="_Python File"
        desc="Create a new Python file"
        rebindable="false"/>
    <!-- ... -->
</commands>

The generation of code for these classes is called for in ./src/gwt/src/org/rstudio/core/Core.gwt.xml via:

<generate-with class="org.rstudio.core.rebind.command.CommandBundleGenerator" >
  <when-type-assignable class="org.rstudio.core.client.command.CommandBundle"/>
</generate-with>

This process invokes CommandBundleGenerator.generate(), which scans the xml to create java classes for everything defined.

To internationalize this, we must:

  • Modify the resulting java classes created by CommandBundleGenerator.generate() to invoke i18n references to strings (via GWT's i18n tooling), or to directly use internationalized strings (eg: create generatedFile_en.java, generatedFile_fr.java, etc., based on some available internationalization data).
    • This can be done by adding to the generate, emitConstructor and emitCommandInitializers methods to include the required I18N imports (com.google.gwt.core.client.GWT and org.rstudio.studio.client.workbench.commands.CommandConstants), include the Constants in the class definition, etc.
  • If using standard i18n tooling for above, also automatically generate a constants interface that aligns with what was specified in the xml file, like:
    import com.google.gwt.i18n.client.Constants;

    public interface CommandConstants extends Constants {
        @DefaultStringValue("_Python File")
        String newPythonDocMenuLabel();  // <- where menuLabel is an attribute of the newPythonDoc command
        @DefaultStringValue("Create a new Python file")  // matches desc in the xml example
        String newPythonDocDesc();
    }
    
    This interface should be generated automatically: if we add a new command to the xml file, we don't want to also have to add it to the interface file (avoiding this is the whole point of managing these commands through the xml file). Note that the interface includes a default translation value (which I think is required?), which is used if a corresponding _locale.properties file is not available. These defaults should be set automatically from the xml file, again to reduce duplication.
  • Similarly, we also need to automatically generate a _en.properties file which will also be built off the xml
  • Separately, one or more _XXX.properties files can then be generated via translation (using _en.properties as a template?)

I have successfully modified the generators to use i18n with a hard-coded interface file, but I'm not sure yet how best to automatically generate the constants interface or the properties files. Big questions are how to invoke GWT's generation mechanism properly and where they're placed once they're generated.
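
For illustration, one possible shape for that generation step is a standalone script that reads the command XML and emits both the constants interface and the English properties file. This is a sketch, not the approach used in the PR: the function names and the list of translatable attributes are assumptions, the package declaration is omitted, and properties-file escaping is not handled.

```python
import xml.etree.ElementTree as ET

# Attributes treated as translatable. "menuLabel" and "desc" come from the
# example above; treating exactly these two as the full list is an assumption.
TRANSLATABLE = ("menuLabel", "desc")

SAMPLE = """<commands>
    <cmd id="newPythonDoc"
         menuLabel="_Python File"
         desc="Create a new Python file"
         rebindable="false"/>
</commands>"""


def method_name(cmd_id, attr):
    # newPythonDoc + menuLabel -> newPythonDocMenuLabel
    return cmd_id + attr[0].upper() + attr[1:]


def java_escape(text):
    return text.replace("\\", "\\\\").replace('"', '\\"')


def collect_strings(xml_text):
    """Return {method name: English text} for every translatable attribute."""
    strings = {}
    for cmd in ET.fromstring(xml_text).iter("cmd"):
        for attr in TRANSLATABLE:
            if attr in cmd.attrib:
                strings[method_name(cmd.get("id"), attr)] = cmd.get(attr)
    return strings


def render_interface(strings, name="CommandConstants"):
    lines = ["import com.google.gwt.i18n.client.Constants;", "",
             "public interface %s extends Constants {" % name]
    for key, text in strings.items():
        lines.append('    @DefaultStringValue("%s")' % java_escape(text))
        lines.append("    String %s();" % key)
    lines.append("}")
    return "\n".join(lines) + "\n"


def render_properties(strings):
    # Simplified: assumes values need no properties-file escaping.
    return "".join("%s=%s\n" % (key, text) for key, text in strings.items())


strings = collect_strings(SAMPLE)
print(render_interface(strings))
print(render_properties(strings))
```

Because both outputs come from the same dictionary, the interface defaults and the English properties file cannot drift apart.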

(note: this describes commands, but I think other things are similarly generated using this file (shortcuts, others))

@ca-scribner
Contributor

As discussed here, JSON user prefs and state are built from a JSON schema file. This schema and the files generated from it are used in various locations in the UI (e.g. the Options dialog, the Command Palette) and need to be translated as well.

The developer flow around changing these is to:

  • modify the schema JSON file
  • run this script to generate new .cpp, .hpp, and .java files
  • commit the emitted files to the repo

Maybe I should update this workflow to output files for multiple languages(?). We could build translatable files from the default language version, and maybe use a git diff or similar to identify which keys are modified and need changing (so we don't completely delete the translated files every time). This could also be done in the cmd.xml workflow.
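
A rough sketch of the "git diff or similar" idea: compare the previous and the regenerated English strings, keep any translation whose English source is unchanged, and flag the rest. The function names and the simple key=value parsing are assumptions for illustration.

```python
def parse_properties(text):
    """Parse simple key=value lines; comments and blank lines are skipped."""
    entries = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith(("#", "!")) or "=" not in line:
            continue
        key, value = line.split("=", 1)
        entries[key.strip()] = value.strip()
    return entries


def merge_translations(old_en, new_en, translated):
    """Carry over translations whose English source text is unchanged."""
    merged, todo = {}, []
    for key, text in new_en.items():
        if key in translated and old_en.get(key) == text:
            merged[key] = translated[key]
        else:
            merged[key] = text  # fall back to English until translated
            todo.append(key)
    obsolete = sorted(set(translated) - set(new_en))
    return merged, todo, obsolete


old_en = parse_properties("open=Open\nclose=Close\n")
new_en = parse_properties("open=Open\nclose=Close file\nsave=Save\n")
fr = parse_properties("open=Ouvrir\nclose=Fermer\nprint=Imprimer\n")

merged, todo, obsolete = merge_translations(old_en, new_en, fr)
print(merged, todo, obsolete)
```

Here "close" is flagged because its English text changed, "save" because it is new, and "print" is reported as obsolete instead of being silently dropped.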

Need to look into what content the .cpp/.hpp files contain. No idea how to internationalize those if I have to... But if just the java classes need it, that could be handled via resource bundle setting the name of the right xml file(?).

@ca-scribner
Contributor

Easiest way forward appears to be handling building of the interface/property files for any metadata file translation (eg: XML/JSON files) the same way as the project currently uses the JSON file to build the actual user properties. We will add scripts that translate the XML/JSON files to interface/property files, using the English text in the XML/JSON files to seed the default text in the interface and English version of the property files. The property files can then be translated as needed. Typical development flow would then be:

  • Modify XML/JSON
  • Run xml-json_to_interface-property script
  • Build

@ca-scribner
Contributor

Update of progress/general work summary documented here: ca-scribner/rstudio#1 (comment)

11000 new lines and counting in the PR haha. Although a lot of that is automatically generated through scripts

@ca-scribner
Contributor

Brief summary of progress:

  • All commands/main menu headings (eg: File, File->New, ...) translated
  • Shortcuts not translated (not sure how to do this cleanly. still thinking)
  • All "preferences" are translated (each setting, like "auto save every X seconds", has a human-readable title and description, and those have been translated). These are described in metadata in a .json file rather than in code; the real Java code is then generated from them by a script. The scripts have been extended to support i18n
  • "preferences" with enumerators now have an additional enumReadable in their defining metadata, which defines the human-readable text that goes along with each enumerator. These are useful for the Global Options menus where dropdowns are used (e.g. the Autosave mode enums of [backup, nothing] have readable versions of ["Backup unsaved changes", "Do nothing"])
  • "all" options in the Options->Code menus are translated (all except one, which has a specialized handler). Headings in the menu are still to be translated (see image below)
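
To illustrate the enumReadable idea, here is a minimal sketch that pulls the translatable strings out of a preference definition. The preference name, the schema layout, the description text and the key naming scheme are all hypothetical; only the enum values and their readable labels are taken from the example above.

```python
import json

# Hypothetical schema fragment, not copied from the real prefs schema.
SCHEMA = json.loads("""
{
  "properties": {
    "auto_save_mode": {
      "title": "Autosave mode",
      "description": "What to do with unsaved changes",
      "enum": ["backup", "nothing"],
      "enumReadable": ["Backup unsaved changes", "Do nothing"]
    }
  }
}
""")


def pref_strings(schema):
    """Return {property key: English text} for every translatable pref string."""
    strings = {}
    for name, pref in schema["properties"].items():
        for field in ("title", "description"):
            if field in pref:
                strings["%s.%s" % (name, field)] = pref[field]
        values = pref.get("enum", [])
        readable = pref.get("enumReadable", [])
        # Each enum value needs exactly one readable label.
        if readable and len(readable) != len(values):
            raise ValueError("%s: enum and enumReadable differ in length" % name)
        for value, label in zip(values, readable):
            strings["%s.enum.%s" % (name, value)] = label
    return strings


for key, text in pref_strings(SCHEMA).items():
    print("%s=%s" % (key, text))
```

The enum values themselves stay untranslated, since they are stored in the preferences; only the readable labels become translatable keys.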

Next steps:

  • translating headings/misc text in Options->Code
  • find out how tabs in the quadrant panels are defined (see image below, for example the tabs of [Environment, History, Connections, Tutorial]) and translate those

Big outstanding items:

  • translate all other Options menus (the ~15 seen in below image)
  • translate all other static strings throughout the codebase
  • figure out how to handle translation of conditional shortcuts (e.g. how to handle, for i18n, a shortcut of: newFile="ctrl+f" if windows, "cmd+f" if mac)

image

@ca-scribner
Contributor

Refactoring this into an epic tracked by Statcan/daaas/510. Closing this issue to claim the work already completed (fleshing out the task, doing some of the updates, etc). Future work will be tracked in separate issues

@ca-scribner ca-scribner changed the title Add i18n to RStudio Add i18n to RStudio (initial elaboration and work) May 6, 2021
5 participants