Hungarian translation of the essence of calculus video #36
I fixed a few mathematical inaccuracies
-Kaszap Máté
Good to know that DeepL is a better starting point. As to how the TTS handles it, in a week or two we can make a few samples for you to see what it does by default, and perhaps that will inform how things should be corrected. Is this one good to merge?
Not yet. We are still working on the full correction.
# Conflicts: # 2017/essence-of-calculus/hungarian/sentence_translations.json
Co-authored-by: MrExpert <[email protected]>
This translation was intended to be a quick suffix-fixing one, but instead it quickly went off the rails :D
Problems of the AI translation
As in #13, the AI translates "you" as the formal you in Hungarian too, and the same problems with out-of-context words and mistranslated technical concepts come up. In addition, words or even whole clauses are sometimes omitted. Many sentences range from unnatural to incomprehensible.
Out of curiosity, I tried feeding a couple of strings to DeepL, as proposed in #10. It was better but not quite there yet (we probably can't expect that from any tool tbh). In the end, I felt there was something off in almost every sentence, so I corrected the whole thing as best I could.
Dev experience
It wasn't absurd to edit a raw JSON file by hand, but for broader community contribution I believe it would be better to have an interface that only exposes the relevant data, so that more non-technical people could contribute. Some kind of proofreading or peer review/voting for submitted fixes would also be great to have, in my opinion.
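To make the idea of a restricted interface a bit more concrete, here is a rough sketch of a helper that lets a contributor change only the translated string of each entry and leaves everything else untouched. The field names (`input`, `translatedText`, `time_range`) and the function name `apply_edits` are assumptions for illustration, not necessarily the actual schema of `sentence_translations.json`.

```python
import json

# Assumed name of the only field contributors may edit (hypothetical).
EDITABLE_FIELD = "translatedText"


def apply_edits(entries, edits):
    """Return a copy of `entries` with only the translation field updated.

    `edits` maps an entry index to the corrected translation. All other
    fields (source text, timings, ...) are copied over unchanged.
    """
    updated = [dict(entry) for entry in entries]  # shallow copy of each entry
    for index, new_text in edits.items():
        if not 0 <= index < len(updated):
            raise IndexError(f"no entry with index {index}")
        updated[index][EDITABLE_FIELD] = new_text
    return updated


# Hypothetical sample data in the assumed shape.
entries = json.loads(
    '[{"input": "Hello", "translatedText": "Helo", "time_range": [0.0, 1.2]}]'
)
fixed = apply_edits(entries, {0: "Szia"})
print(json.dumps(fixed, ensure_ascii=False, indent=1))
```

A web form or spreadsheet export could sit on top of something like this, so that contributors never see or touch the timing data.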
Questions
If these subtitles are going to be used for generating audio tracks, how should notation be transcribed in cases where writing it the way it is spoken would hurt readability? (For example, dA for a small change in area could be pronounced as "daa" by the AI voice, but writing it any other way would cause a mismatch between the video and the subtitles.) Maybe an optional pronunciation override?
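One possible shape for such an override, as a minimal sketch: keep the subtitle text as written and derive a separate string for the TTS engine from a per-entry mapping. The function name `spoken_text` and the idea of storing `overrides` next to each sentence are assumptions here, not something the current pipeline is known to support.

```python
import re


def spoken_text(subtitle: str, overrides: dict[str, str]) -> str:
    """Return the text to hand to the TTS engine.

    The subtitle itself is not modified; only whole tokens listed in
    `overrides` are swapped for their spoken form.
    """
    if not overrides:
        return subtitle
    # Longest keys first, so a longer notation wins over a shorter prefix.
    keys = sorted(overrides, key=len, reverse=True)
    pattern = re.compile(
        r"(?<!\w)(" + "|".join(re.escape(k) for k in keys) + r")(?!\w)"
    )
    return pattern.sub(lambda m: overrides[m.group(1)], subtitle)


subtitle = "Call that small change in area dA."
overrides = {"dA": "d A"}  # hypothetical per-sentence override
print(spoken_text(subtitle, overrides))
```

The subtitles would still show dA exactly as in the video, while the voice gets a form it is less likely to read as a single word.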