Add tests for English transcriptions with J and EE #26

Open
wants to merge 1 commit into
base: master
from

Conversation

dreamingfifi
Collaborator

Added Jack, happy, style, yellow, and phone to the list of English transliterations. I also fixed "green" and "there".

@vercel

vercel bot commented Oct 12, 2019

This pull request is being automatically deployed with ZEIT Now (learn more).
To see the status of your deployment, click below or on the icon next to each commit.

🔍 Inspect: https://zeit.co/kriskowal/tengwarjs/5g67rag8z
🌍 Preview: https://tengwarjs-git-fork-dreamingfifi-patch-1.kriskowal.now.sh

@kriskowal kriskowal changed the title Update general-use.js Add tests for English transcriptions with J and EE Oct 12, 2019
@@ -33,17 +33,22 @@ module.exports = {
english: {

// english (appropriate mode) (sorted)
"Jack": "anga;quesse:a,tilde-below",
"happy": "hyarmen;parma:a,tilde-below;long-carrier:y",
Owner

We will need to create mappings for y-above in dan-smith.js. The y in this notation currently refers to double dots below and can only be carried by the round carrier (which looks like a Latin "c").

https://tengwarjs-fpft2n6u3.now.sh/#y

We can change the names in this notation as long as we are thorough. It may be better to make "y" imply "above", or to make both "y-above" and "y-below" explicit.

Collaborator Author

Then for the Sindarin vowel Y, y-sindar?

Owner

Can you describe the appearance of y-sindar? I’m only aware of two dots below and the inverted chapeau above.

"cake": "quesse;quesse:a,i-below",
"cakes": "quesse;quesse:a;silme-nuquerna:e",
"cats.": "quesse;tinco:a,s-final;full-stop", // regression
"green": "ungwe;romen;long-carrier:e;numen",
"green": "ungwe;romen;short-carrier:e;numen:e",
Owner

Are there any cases where we need to preserve the long carrier for English in the mode for general use? This will require changes in general-use.js.

Collaborator Author

We don't specifically need the long carrier at all for English. Tolkien sometimes used it for pronounced vowels at the ends of words, like the Y in "by" and "history" in the title page inscription. But, other times he didn't use a long carrier at all. So, for General Use, there is no pressing need for there to be a long carrier.

I like to use it at the ends of words because I think it's pretty though. But, that's just my personal preference.

Owner

We might be able to preserve the long carrier for final Y æsthetic. Thank you for the details.

"hobbits": "hyarmen;umbar:o,tilde-below;tinco:i,s-final",
"hobbits'": "hyarmen;umbar:o,tilde-below;tinco:i,s-inverse",
"hobbits''": "hyarmen;umbar:o,tilde-below;tinco:i,s-extended",
"hobbits'''": "hyarmen;umbar:o,tilde-below;tinco:i,s-flourish",
"there": "thule;romen:e,i-below",
"there": "thule;ore:e,i-below",
Owner

Can you restate concisely the rule for romen vs ore? My understanding is that they are both mode and language dependent, and may even in this case have special consideration given that the following vowel is silent. It would be good to include test cases to exercise every variation. Otherwise, we might end up whacking moles.

Collaborator Author

When Tolkien used Rómen and Órë, he used them according to his dialect, which is an R-drop dialect. This means that the R was only retained when followed by a pronounced vowel: "there" would use Órë, but "therein" uses Rómen. Since the computer can't read pronunciation, I suggest that it automatically use Rómen when the R is followed by a vowel (excluding silent E) and Órë the rest of the time. This will get it close to being correct most of the time.
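As a rough illustration of the spelling-only fallback suggested above, here is a minimal JavaScript sketch. The function name `chooseR`, its return values, and the choice to treat "y" as a vowel and any word-final "e" as silent are all assumptions made for this example; none of it is taken from tengwarjs.

```javascript
// Hypothetical helper, not part of tengwarjs.
// Assumption: "y" counts as a vowel for this rule.
const VOWELS = "aeiouy";

// Picks "romen" or "ore" for the letter r at position `index` in `word`,
// looking only at the spelling of the following letter.
function chooseR(word, index) {
    const lower = word.toLowerCase();
    const next = lower[index + 1];
    if (next === undefined) {
        return "ore"; // r is the last letter of the word
    }
    if (!VOWELS.includes(next)) {
        return "ore"; // r is followed by a consonant
    }
    // Assumption: an "e" that ends the word is treated as silent.
    const isFinalE = next === "e" && index + 2 === lower.length;
    return isFinalE ? "ore" : "romen";
}

console.log(chooseR("there", 3));   // "ore"
console.log(chooseR("therein", 3)); // "romen"
```

A rule this simple cannot see a silent E inside a longer word, so a spelling such as "therefore" would still come out as Rómen under this sketch and would need a manual override.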

Owner

Thanks again for the details. I agree we’d have a hard time capturing all the rules of the non-rhotic dialect.

These examples are good. Let’s capture all of them in test cases, including the ones where we’d need to use a backtick or a full-word exception to force the right output, like "ther`e`in".

"these": "thule;silme-nuquerna:e;short-carrier:i-below",
"these'": "thule;silme-nuquerna:e;short-carrier:e",
"these'": "thule;silme-nuquerna:e;short-carrier:e", // invalid input
Owner

Let’s add tests for words with a final "e" that would be the exceptions to the rule like "see" and "naïve". In these cases, we will need to use a notation to allow the user to override the default behavior, and maybe even hard-code as many known exceptions as possible.

I’ll note that pattern matching vowel-consonant-vowel to distinguish the silent E case from other voiced final E cases may require extraordinary programming.

Anything valid should be expressible, even if it requires manual intervention to express. That’s the current function of the apostrophe.

Collaborator Author

Is there any way to classify the letters so it can tell "a symbol from this list is a vowel, and a symbol from this list is a consonant"? Or is this what you mean by "extraordinary programming"? I don't know enough about this sort of thing to be able to tell if this would be too complex or not.

Owner

Classifying vowels and consonants is easy, except for the cases like w and y where they’re ambiguous for the purposes of English rules.

Matching patterns of three phonemes requires considerable additional work because it involves back-tracking and also taking into account consonant and vowel clusters (not a one character to one phoneme relation). Far easier to pass the burden of distinguishing silent and voiced final e to the user, just assuming e is unvoiced and ë or e` are voiced.

As in the case of naïve, to infer that the final e is voiced from the diaeresis over the ï, we’d have to thread a hint forward, over the following consonant and into the code that matches final e. That is not so hard, but threading the additional state requires altering most of the function calls in the transcriber.
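To make the simpler option concrete, the following JavaScript sketch classifies single letters and leaves the final-E decision to the writer's markup. The names `classifyLetter` and `classifyFinalE` and their return values are hypothetical and do not come from the transcriber; the sketch assumes one letter per call and does not handle consonant or vowel clusters.

```javascript
// Hypothetical classifier; none of these names come from tengwarjs.
const VOWEL_LETTERS = new Set(["a", "e", "i", "o", "u"]);
const AMBIGUOUS_LETTERS = new Set(["w", "y"]); // vowel or consonant by context

// Classifies one letter. Diacritics are stripped first, so "ï" counts as "i".
function classifyLetter(letter) {
    const base = letter.normalize("NFD")[0].toLowerCase();
    if (!/^[a-z]$/.test(base)) return "other";
    if (VOWEL_LETTERS.has(base)) return "vowel";
    if (AMBIGUOUS_LETTERS.has(base)) return "ambiguous";
    return "consonant";
}

// Decides how to treat a final e from the spelling alone:
// a plain final "e" is assumed silent, "ë" or "e`" marks it as voiced.
function classifyFinalE(word) {
    const w = word.normalize("NFC").toLowerCase();
    if (w.endsWith("\u00eb") || w.endsWith("e`")) return "voiced";
    if (w.endsWith("e")) return "silent";
    return "none";
}

console.log(classifyLetter("y"));      // "ambiguous"
console.log(classifyFinalE("these"));  // "silent"
console.log(classifyFinalE("naïve`")); // "voiced"
```

Note that `classifyFinalE("naïve")` returns "silent" here: the sketch deliberately ignores the diaeresis on the ï, since using it would mean threading a hint forward through the transcriber as described above.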

@kriskowal
Owner

Pending discussion on design direction #31

"happy": "hyarmen;parma:a,tilde-below;long-carrier:y",
"style": "silme;tinco;lambe:y,i-below",
"yellow": "anna;lambe:e,tilde-below;vala:o",
"phone": "formenparma;numen:o,i-below",
Owner

I’ve called this parma-extended. I like formenparma and the like better, though. I would be grateful for an issue to track that.
