[DRAFT] An approach to linting the translations #166

Open
wants to merge 1 commit into
base: master

Quuxplusone
Contributor

@Quuxplusone Quuxplusone commented Jun 28, 2021

As noted in #154, it's very easy to "break" the translation files. If we wanted, we could try to automate the detection of broken translations by comparing the strings in language-??.cpp against the strings in the actual program source code. I wrote a little proof-of-concept that uses a Python script to extract the string literals from the program. Run make to see the very spammy output:

Unused translation in PL,CZ,RU,DE: "\n\nOnce you collect 10 Bomberbird Eggs, stepping on a cell with no adjacent mines also reveals the adjacent cells. Collecting even more Eggs will increase the radius. Additionally, collecting 25 Bomberbird Eggs will reveal adjacent cells even in your future games."
Closest match in program: "\n\nOnce you collect a Bomberbird Egg, stepping on a cell with no adjacent mines also reveals the adjacent cells. Collecting even more Eggs will increase the radius."
Unused translation in PL,CZ,RU,TR,DE: " (E:%1)"
Closest match in program: " (e)"
Unused translation in PL,CZ,RU,TR,DE,PT: " (expired)"
Closest match in program: " (Emerald)"
Unused translation in PL,CZ,RU: " (increases treasure spawn)"
Closest match in program: " (killing increases treasure spawn)"
Unused translation in DE: " Hell: %1/9"
Closest match in program: " Hell: %1/%2"
Unused translation in PL,CZ,RU,TR,DE,PT: " [%1 turns]"
Closest match in program: "%1 turns"
Unused translation in PL,CZ,RU,TR,DE: " kills: %1"
Closest match in program: "kills: %1"
Unused translation in PL,CZ,RU,TR,DE: "\"By now, you should have your own formula, you know?\""
Closest match in program: "\"I would like to congratulate you again!\""
Unused translation in PL,CZ,RU,TR,DE: "\"I like collecting ambers at the beach.\""
Closest match in program: "finer lines at the boundary"
Unused translation in PL,CZ: "#%1, cells: %2"
Closest match in program: "bad cells: %1"
Unused translation in PL,CZ,RU,TR,DE: "%1 takes %his1 revenge on %the2!"
Closest match in program: "%The1 takes %his1 revenge on %the2!"
Unused translation in PL,CZ,RU,TR,DE: "%The1 bites %the2!"
Closest match in program: "%The1 eats %the2!"
Unused translation in PL,CZ,RU,TR,DE,PT: "%The1 breaks the mirror!"
Closest match in program: "%The1 breathes fire!"
Unused translation in PL,CZ,RU,TR,DE,PT: "%The1 disperses the cloud!"
Closest match in program: "%The1 fills the hole!"
[...and so on, and so on ... 674 lines...]

I don't expect this to be merged as-is. But it might serve as a jumping-off point for someone either to polish this automated detector, or to go through its spammy output and make a pull request fixing the low-hanging fruit, e.g. changing "%1 takes %his1 revenge on %the2!" into "%The1 takes %his1 revenge on %the2!" so that the translation key matches the string in the program.
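For readers who want the gist of the comparison without reading the diff, here is a minimal sketch of the idea. It is not the script from this PR. The function names, the regex, and the sample input are all made up for illustration, and using `difflib` for the "closest match" step is an assumption. The regex is deliberately naive: it ignores comments, raw strings, and adjacent-literal concatenation.

```python
import difflib
import re

# Matches one double-quoted C/C++ string literal on a single line,
# allowing backslash escapes such as \" and \n inside it.
STRING_LITERAL = re.compile(r'"((?:[^"\\\n]|\\.)*)"')

def extract_literals(source):
    """Return the set of string-literal bodies, with escapes left as written."""
    return {m.group(1) for m in STRING_LITERAL.finditer(source)}

def unused_translations(translation_keys, program_strings):
    """Yield (key, closest_program_string) for keys that never occur in the program."""
    candidates = sorted(program_strings)
    for key in sorted(translation_keys):
        if key in program_strings:
            continue
        # cutoff=0.0 always returns the best candidate, however poor the match.
        close = difflib.get_close_matches(key, candidates, n=1, cutoff=0.0)
        yield key, (close[0] if close else None)

# Hypothetical inputs: a one-line "program" and two translation keys.
program = 'say("%The1 takes %his1 revenge on %the2!"); say("kills: %1");'
keys = {"%1 takes %his1 revenge on %the2!", "kills: %1"}
for key, closest in unused_translations(keys, extract_literals(program)):
    print(f'Unused translation: "{key}"')
    print(f'Closest match in program: "{closest}"')
```

Because the cutoff is zero, every unused key gets some "closest match", which would explain unhelpful pairings like the amber quote matched with "finer lines at the boundary" above. A higher cutoff would suppress those at the cost of reporting no suggestion at all.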

@zenorogue
Owner

Have you seen devmods/gentrans.cpp? It does a similar job (looking for texts that should be translated).

@Quuxplusone
Contributor Author

Quuxplusone commented Jun 28, 2021

> Have you seen devmods/gentrans.cpp? It does a similar job (looking for texts that should be translated).

I had not seen it, no. IIUC, gentrans.cpp performs the opposite operation from what I did in this PR: it looks for strings in the program that lack any translations, as opposed to translations that are unreachable from anywhere in the program.

$ make mymake
$ ./mymake devmods/gentrans
$ ./hyper -gentrans
[...]
001076 // checking all the files
001099 S("HyperRogue %1: online demo", literal in hyperweb.cpp:142)
001099 S("play the game", literal in hyperweb.cpp:145)
001099 S("learn about hyperbolic geometry", literal in hyperweb.cpp:146)
001099 S("toggle high detail", literal in hyperweb.cpp:148)
001099 S("Temple of Cthulhu", literal in hyperweb.cpp:152)
001099 S("Land of Storms", literal in hyperweb.cpp:153)
001099 S("Burial Grounds", literal in hyperweb.cpp:154)
001109 S(
001109     "released under GNU General Public License version 2 and thus "
001109     "comes with absolutely no warranty; see COPYING for details\n\n"
001109     , literal in help.cpp:225)
001113 // unrecognized nonliteral: tour::slides[tour::currentslide].name in help.cpp:1046
001121 S("?", mdmodes[vid.monmode])
001123 S("One wrong move and it is game over!", literal in menus.cpp:757)
[...]

There's some low-hanging fruit in there, too; e.g. the translation files are inconsistent between "One wrong move and it is game over!" and "One wrong move, and it is game over!".
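Inconsistencies like that one could be surfaced by cross-checking the two reports: strings the program uses but no language file translates, against translation keys the program never uses. The following is a rough sketch under stated assumptions. The `near_misses` helper, the 0.9 threshold, and the sample sets are hypothetical, and neither gentrans.cpp nor the script in this PR is claimed to work this way.

```python
import difflib

def near_misses(untranslated, unused_keys, threshold=0.9):
    """Pair untranslated program strings with unused translation keys that almost match.

    Returns a list of (program_string, translation_key, similarity) tuples.
    """
    pairs = []
    for text in sorted(untranslated):
        for key in sorted(unused_keys):
            ratio = difflib.SequenceMatcher(None, text, key).ratio()
            if ratio >= threshold:
                pairs.append((text, key, round(ratio, 3)))
    return pairs

# Hypothetical inputs standing in for the two reports.
untranslated = {"One wrong move and it is game over!"}
unused_keys = {"One wrong move, and it is game over!", " (expired)"}
for text, key, ratio in near_misses(untranslated, unused_keys):
    print(f'"{text}" is probably meant to match "{key}" (similarity {ratio})')
```

The nested loop is quadratic in the number of strings, which should be tolerable for a few thousand entries but is worth keeping in mind for larger catalogs.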
