Add Italian translation #257

Open
Tnonis90 opened this issue Mar 19, 2024 · 33 comments

Comments

@Tnonis90
Contributor

Hello @NSoiffer,
this is Tommaso from VisionDept SRL, the Italian distributor for Vispero / JAWS. We are interested in tackling the Italian translation for MathCAT and are ready to start translating.
Regarding speech, I kindly ask you to set up the environment with the automatic translations, so we can get started.
As for Braille, we'd need a discussion on which code to choose: in Italy, almost everyone nowadays uses the LAMBDA Math Code (Italian version). Do you happen to have any familiarity with that?

Thanks a lot, and I look forward to getting started with this.

Tommaso

@NSoiffer
Owner

I'm very glad to help out with the Italian translations.

I'll build an initial translation in the next day or two and let you know the details.

I know a little about the LAMBDA math code, but I need a specification of how MathML maps to its linear format (I didn't see anything at https://www.lambdaproject.org/). I just finished implementing the German LaTeX braille code. For that, I didn't translate directly to braille dots, but instead to the ASCII chars for LaTeX, and then let the current braille mapping table do the translation. I was told that was preferable because each country that might use the LaTeX code would have different 8-dot mappings, or might use a 6-dot mapping. I suspect something similar is desirable for the LAMBDA code. Is that true?

@Tnonis90
Contributor Author

Thanks for letting me know. At www.veia.it you can download Lambda 1.44, which does contain an XML specification with all Braille markup code. Do not use Lambda 2, as everything's embedded in the executable file in that specific version.

Thanks!

@NSoiffer
Owner

NSoiffer commented Mar 23, 2024

I've created an "it" branch. Clone MathCAT and check out that branch. There are instructions for translators here. Here's a short list:

  1. Go to Rules/Languages/it
  2. Open unicode.yaml and look through the translations. They are likely mostly good. If a translation is good, change the "t: ..." to "T: ...". This marks the translation as verified. Some entries contain "if" tests, and the translations inside those are more likely to need work. Hopefully the syntax is understandable.
  3. Open SimpleSpeak_Rules.yaml and again look through the translations (search for "t: "). Again, there are tests here, such as for Verbosity and for Blindness. English often uses "the square root of ..." in a verbose setting, but in a terse one, it might be shortened to "square root x" (dropping "the" and "of"). If Italian never uses those extra words, just use an empty string. In a second pass, we can talk about making the rules more natural for Italian.
  4. Open all the files in SharedRules and do the same thing as for SimpleSpeak_Rules.yaml.
  5. Open definitions.yaml. This has words for cardinal numbers (one, two, three...) and ordinal numbers (first, second, third...), as well as words used in fractions ("half", ...). These translations are likely correct, but there might be some bad ones.
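
As a rough aid for tracking progress through the steps above, a small script could count how many entries still carry the unverified "t:" marker versus the verified "T:" one. This is only a sketch: it assumes the markers always appear as `t: "..."` or `T: "..."`, the sample text is made up to resemble a rules file, and the name `count_markers` is hypothetical (it is not part of MathCAT).

```python
import re

# Assumption: an unverified translation looks like  t: "..."  and a verified
# one looks like  T: "..."  (per the steps above). The lookbehind keeps keys
# that merely end in "t:" (such as "test:") from being counted.
UNVERIFIED = re.compile(r'(?<![A-Za-z_])t:\s*["\']')
VERIFIED = re.compile(r'(?<![A-Za-z_])T:\s*["\']')


def count_markers(yaml_text: str) -> tuple[int, int]:
    """Return (unverified, verified) marker counts for the given file text."""
    return len(UNVERIFIED.findall(yaml_text)), len(VERIFIED.findall(yaml_text))


# Made-up sample for illustration only; real rule files may look different.
sample = '''
 - "+": [t: "più"]
 - "=": [T: "uguale"]
 - "x":
     - test:
         if: "$Verbosity = 'Terse'"
         then: [t: "per"]
'''
unverified, verified = count_markers(sample)
print(f"{unverified} still to check, {verified} verified")
```

To run it on a real file, read the file as UTF-8 text (for example with `pathlib.Path(...).read_text(encoding="utf-8")`) and pass the result to `count_markers`.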

At any point, you can test these out in NVDA if you have the MathCAT addon. After installing the addon, copy the 'it' directory to %AppData%\nvda\addons\MathCAT\globalPlugins\MathCAT\Rules\Languages to test. Start NVDA or, if it is already running, restart it (or choose NVDA:Tools:Reload Plugins). If you have an Italian voice, it should use the Italian speech rules. Go to any page with MathML (e.g., https://it.wikipedia.org/wiki/Equazione_di_secondo_grado) and the math should be spoken in Italian. If NVDA+MathPlayer works with LAMBDA, NVDA+MathCAT should also, so that would be another source of math. If NVDA says there is an error in speaking the math, open the NVDA log (NVDA+F1). The message is a little hard to understand, but it will hopefully guide you to the place where you have a typo (e.g., an accidentally deleted quote mark).
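
If you end up copying the directory many times while testing, the copy step can be scripted. The following is a minimal sketch, not an official tool: the function name `install_language_rules` is made up, and `appdata_dir` stands in for whatever %AppData% expands to on your machine. The destination layout is the one given in the paragraph above.

```python
import shutil
from pathlib import Path


def install_language_rules(src_dir: Path, appdata_dir: Path, lang: str = "it") -> Path:
    """Copy a language rules directory into the MathCAT addon's Languages folder.

    `appdata_dir` plays the role of %AppData%; the rest of the path mirrors
    the location described above.
    """
    dest = (appdata_dir / "nvda" / "addons" / "MathCAT" / "globalPlugins"
            / "MathCAT" / "Rules" / "Languages" / lang)
    # dirs_exist_ok=True lets repeated runs overwrite an earlier copy
    # (requires Python 3.8 or later).
    shutil.copytree(src_dir, dest, dirs_exist_ok=True)
    return dest
```

On Windows this could be called as `install_language_rules(Path("Rules/Languages/it"), Path(os.environ["APPDATA"]))` after `import os`. NVDA still needs a restart (or a plugin reload) afterwards to pick up the new files.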

A similar process applies to MathCAT in JAWS. However, I haven't used JAWS much and not with MathCAT at all. I'm not sure where the MathCAT files are stored, but wherever that is, a similar process (copy the files to Rules/Languages/it) as with NVDA should be followed. I'm not sure if it picks up the Italian voice automatically. In NVDA, that's code that I wrote.

Good luck. If you want to do a teleconference call some time, I can walk you through the process and that might clear up some questions. In a few hours, you might have something that speaks ok for some expressions and in a few days, does ok for many common expressions. To get a really natural translation, you may want to add or delete some rules and I can talk you through those or write them for you with your input.

At some point, you should do some work on unicode-full.yaml. This is where less commonly used characters can be found. It is a huge file (~3,600 lines), so you may want to scan through it every now and then when you are feeling a little bored and translate characters you think are really poorly translated (anything marked with 'google translation' is more likely to be poorly translated).

And also we can talk about whether it makes sense to implement ClearSpeak or some other speech style.

@NSoiffer
Owner

@Tnonis90

Thanks for letting me know. At www.veia.it you can download Lambda 1.44, which does contain an XML specification with all Braille markup code.

I only see options for LAMBDA 2, BM2021, and EBKey. I tried BM2021 in the hope that it was an old version, but after downloading, I see the "BM" stands for "Braille Music", so that's not the right thing. The EBKey description indicates that's not right either. I didn't see anything else that I could download. Can you clarify what I should get?

@Tnonis90
Contributor Author

Tnonis90 commented Mar 25, 2024 via email

@Tnonis90
Contributor Author

Tnonis90 commented Mar 26, 2024 via email

@NSoiffer
Owner

@Tnonis90 : I don't see your commit. Did you forget to do a "git push"?

Also, I think I answered this:

So far, I have found a problem with an untranslatable string, “out of” This string speaks when you up arrow out of an inner element (e.g. denominator). Could you please tell me where to fix this string so it speaks in correct Italian?

In case I didn't, you need to translate navigate.yaml -- I left that out of my shortened instructions by accident.

@Tnonis90
Contributor Author

Tnonis90 commented Apr 15, 2024 via email

@NSoiffer
Owner

When I click on your SHA, the top of the page says "This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository."

I need to get some sleep now. If you don't beat me to it, I'll look into what's going on when I get up and see if I can correct it/bring it into the repo.

@NSoiffer
Owner

I asked my git-savvy son for help and the only thing we could come up with is to essentially clone things and copy files over. That's error-prone.

If you created a fork, and committed your changes there, I could do something, but I'd need to know what your fork is.

Maybe you can try and do a pull request from your repo or branch into the MathCAT repo. That would likely be the shortest path to getting this right. One place I saw says that the error sometimes comes from pushing a tag, not a branch.

@Tnonis90
Contributor Author

Tnonis90 commented Apr 23, 2024 via email

@Tnonis90
Contributor Author

Tnonis90 commented Apr 23, 2024 via email

@NSoiffer
Owner

Probably the best path is to do a "Pull Request" (on top, third item after "code" and "issues") on your repo's page. It will probably suggest what to do. If not, click "New pull request" (on top right) and then choose the "...compare across forks" link.

@Tnonis90
Contributor Author

Tnonis90 commented Apr 29, 2024 via email

@NSoiffer
Owner

As you probably saw, I merged your code into the 'it' branch. I know you want to use the JAWS character translations for the characters they have. However, if you think the current files are good enough to use until you do more work, let me know and I'll merge the 'it' branch into main.

@Tnonis90
Contributor Author

Tnonis90 commented Apr 30, 2024 via email

@NSoiffer
Owner

At the end of March, I downloaded Lambda from your link and tried it out and ran into several issues. Between having to write/finish a paper for ICCHP and immersing myself in the update to Nemeth for chemistry, my memory is hazy on the problems I found :-{

I do remember that Lambda didn't work on many of the MathML examples I tried to import. I remember decompiling mathml2lambda.pyc to get a better sense of what Lambda supports, but I don't remember what I concluded (if anything). I don't think I found any documentation on the lambda code itself. Do you know where there is documentation on it?

@Tnonis90
Contributor Author

Tnonis90 commented May 9, 2024 via email

@NSoiffer
Owner

I could take the list of more commonly used math symbols in MathCAT (in unicode.yaml) and pull the symbols out into a file that I could then copy and paste to see what lambda generates. Then I'd copy the results back and, with the aid of a program, put each result into the proper spot in the MathCAT table. I'm travelling right now and don't have lambda on my laptop, but if you paste in something like ÷, λ, ←, does lambda show useful dots? If so, that would not be too much work.
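
To make the round trip concrete, here is a sketch of the "program" part under some assumptions: characters go out one per line, results come back with exactly one line per character, and an empty line means "no translation". The function names are made up, and the strings "B1" and "B3" are placeholders standing in for whatever lambda would actually produce.

```python
def write_char_list(chars: list[str]) -> str:
    """Return the characters as text with one character per line."""
    return "\n".join(chars) + "\n"


def merge_results(chars: list[str], braille_text: str) -> dict[str, str]:
    """Pair each character with the result found on the same line number.

    Lines are matched purely by position, so the returned text must keep one
    line per input character, including empty lines for untranslated ones.
    """
    lines = [line.rstrip("\r") for line in braille_text.split("\n")]
    if lines and lines[-1] == "":
        lines.pop()  # drop the empty string left after the final newline
    if len(lines) != len(chars):
        raise ValueError(f"expected {len(chars)} lines, got {len(lines)}")
    # Leave out characters whose result line is empty (no translation).
    return {ch: result for ch, result in zip(chars, lines) if result}


chars = ["÷", "λ", "←"]
pasted_back = "B1\n\nB3\n"  # placeholder results; the middle one is missing
print(merge_results(chars, pasted_back))
```

Matching by line number is fragile if a line gets dropped in transit, which is why the sketch raises an error when the counts differ instead of guessing.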

@Tnonis90
Contributor Author

Tnonis90 commented May 14, 2024 via email

@NSoiffer
Owner

I've attached a list of 360 characters, one character per line. If this isn't a good format, let me know what format you would like (e.g., all chars on a single line or 40 chars per line or ...).

I tried Lambda 2 myself, but I don't know what settings I should use to get the proper braille chars. In JAWS, if I set the output table to Italian, I only see 6-dot and computer braille options. I think you need to do the conversion.

Note: there are four invisible chars in the list (U+2061 - U+2064). I don't know if lambda supports these. I suspect there might be some others that aren't supported. The list begins with a blank char (which probably translates to an empty braille cell).
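
Since invisible characters are easy to lose when copying and pasting, it may help to list where they sit before sending the file around. A minimal sketch using Python's standard `unicodedata` module follows; the function name is made up, and it assumes each list entry is a single character.

```python
import unicodedata


def describe_invisible(chars: list[str]) -> list[tuple[int, str, str]]:
    """Return (line number, code point, Unicode name) for format characters."""
    found = []
    for line_no, ch in enumerate(chars, start=1):
        # General category "Cf" (format) includes U+2061 through U+2064.
        if unicodedata.category(ch) == "Cf":
            name = unicodedata.name(ch, "<unnamed>")
            found.append((line_no, f"U+{ord(ch):04X}", name))
    return found


for entry in describe_invisible(["+", "\u2061", "\u2062", "\u2063", "\u2064"]):
    print(entry)
```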

List of characters: chars.txt

In order to know which braille char corresponds to which Unicode char (and hence create the list in MathCAT), don't delete any chars even if they don't translate. That way I can know that what is on line 137 corresponds to α. The alternative is for you to send back something like
Λ = dots 1238
for each char.
What you send back can be the actual braille char (you need to let me know what the mapping is or use the Unicode braille chars) or something like 1238, and I can convert that to dots.
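
For reference, converting a dot list to a Unicode braille character is mechanical: in the Braille Patterns block, dot n sets bit n-1 of the offset from U+2800. The sketch below shows that conversion plus a parser for the "char = dots ..." form. The parser is an assumption about how replies might be written (in particular, that multi-cell symbols are given as dot groups separated by spaces or hyphens); both function names are made up.

```python
import re

BRAILLE_BASE = 0x2800  # first code point of the Unicode Braille Patterns block


def dots_to_braille(dots: str) -> str:
    """Convert a dot list such as "1238" to one Unicode braille character."""
    offset = 0
    for d in dots:
        if d not in "12345678":
            raise ValueError(f"not a braille dot number: {d!r}")
        offset |= 1 << (int(d) - 1)  # dot n corresponds to bit n-1
    return chr(BRAILLE_BASE + offset)


def parse_reply_line(line: str) -> tuple[str, str]:
    """Parse a line like "Λ = dots 1238" into (character, braille cells).

    Assumption: several cells are written as dot groups separated by spaces
    or hyphens, e.g. "dots 46 123".
    """
    m = re.fullmatch(r"\s*(\S)\s*=\s*dots\s+([1-8]+(?:[\s-]+[1-8]+)*)\s*", line)
    if m is None:
        raise ValueError(f"unrecognized line: {line!r}")
    cells = re.split(r"[\s-]+", m.group(2))
    return m.group(1), "".join(dots_to_braille(c) for c in cells)


char, braille = parse_reply_line("Λ = dots 1238")
print(char, braille, f"U+{ord(braille):04X}")
```

Dots 1, 2, 3 and 8 give bits 0x01, 0x02, 0x04 and 0x80, so "1238" maps to U+2887.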

Hopefully this approach works.

@Tnonis90
Contributor Author

Tnonis90 commented Jul 3, 2024 via email

@Tnonis90
Contributor Author

Tnonis90 commented Jul 3, 2024 via email

@NSoiffer
Owner

NSoiffer commented Nov 8, 2024

I almost accidentally included the unfinished Italian translation in the latest MathCAT release. That reminded me of unfinished work:

  • the spoken translation (I think just the Unicode files)
  • the lambda braille work

Do you know if Vispero is doing anything to move the spoken translation forward? I haven't seen any progress from them. If they aren't doing it, is it something you would be willing to do? Because the seeded Unicode translations borrowed from earlier translations, I think (maybe wishful thinking :-) that a lot of the translations are OK, so it may be just a matter of reading them and mostly changing "t:" to "T:"... at least for those in unicode.yaml. The ones in unicode-full.yaml might be more iffy, but they are much less frequently used.

For the lambda translations, I'm not sure how to proceed. If JAWS is doing the translation, then that means it is either using a user default setting or lambda is changing the default table to use. If it is doing the latter, then if I knew which table it was using, I could grab that and use it.

In most of the braille codes I generate, I generate Unicode braille characters. However, for a few like ASCIIMath and LaTeX, I just pass through the characters (which should all be ASCII chars) and let JAWS/NVDA convert them using whatever table the user has set (6-dot or 8-dot). That means things like ∈ turn into "in", which JAWS/NVDA then translates to 6- or 8-dot braille. I don't think that would work for lambda though.
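
To illustrate the two-stage idea, here is a toy sketch, not MathCAT's or any screen reader's actual code. Stage 1 turns a symbol into ASCII text; stage 2 looks each ASCII character up in whatever table the user has chosen. The table here is a deliberately tiny stand-in containing only the two letters the example needs (i is dots 24 and n is dots 1345 in standard literary braille); a real 6-dot or 8-dot table would be supplied by JAWS/NVDA and may differ per country.

```python
def cell(dots: str) -> str:
    """Build one Unicode braille character from dot numbers, e.g. "24"."""
    offset = 0
    for d in dots:
        offset |= 1 << (int(d) - 1)  # dot n corresponds to bit n-1
    return chr(0x2800 + offset)


# Stage 1: symbol -> ASCII text. The "∈" -> "in" pair is the example from
# the paragraph above; everything else is passed through unchanged.
SYMBOL_TO_ASCII = {"∈": "in"}

# Stage 2: a toy stand-in for the user's braille table (illustration only).
TOY_SIX_DOT_TABLE = {"i": cell("24"), "n": cell("1345")}


def to_ascii(text: str) -> str:
    """Replace known symbols with ASCII and pass everything else through."""
    return "".join(SYMBOL_TO_ASCII.get(ch, ch) for ch in text)


def apply_table(ascii_text: str, table: dict[str, str]) -> str:
    """Look each ASCII character up in the given braille table."""
    return "".join(table[ch] for ch in ascii_text)


print(apply_table(to_ascii("∈"), TOY_SIX_DOT_TABLE))
```

The point of the split is that stage 1 never needs to know which table stage 2 uses, which is what makes the pass-through approach work for ASCII-based codes.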

I just tried to run Lambda again to refresh my memory, but my demo expired. Since I think we have reached a dead end, I think the next step is for me to contact them and see if they are willing to send me the tables they use or help in some other way. Does that seem like a good way to move forward? Do you think you would have better success dealing with them than I would (e.g., you have worked with them in the past)? If I contact them, can I mention your name as someone working with me on the translation?

@Tnonis90
Contributor Author

Tnonis90 commented Nov 11, 2024 via email

@NSoiffer
Owner

I had previously downloaded lambda 1.44, so the demo period for that is over and I can't run it. Before you translate the math code with all the Braille combinations, how about just translating 5 - 10 and sending those? That way you won't be wasting much time if it turns out that isn't the info I need or I need the info in a different format.

@Tnonis90
Contributor Author

Tnonis90 commented Nov 21, 2024 via email

@NSoiffer
Owner

There wasn't an HTML file attached. You may need to go to GitHub directly and attach the files, or send me email directly rather than going through GitHub.

@Tnonis90
Contributor Author

Tnonis90 commented Nov 22, 2024 via email

@NSoiffer
Owner

NSoiffer commented Nov 22, 2024 via email

@Tnonis90
Contributor Author

Tnonis90 commented Nov 22, 2024 via email

@Tnonis90
Contributor Author

I don't seem to be able to retrieve your email in any way. It's grayed out every time. Could you write to me at
Tommaso at visiondept dot it
and share this so I can attach the HTML files?
Best
Tommaso

@NSoiffer
Owner

@Tnonis90: I sent you email. If you don't see it, please check your spam folder as my email sometimes ends up in spam folders.
