Add tests for English transcriptions with J and EE #26

dreamingfifi · 2019-10-12T20:44:06Z

Added Jack, happy, style, yellow, and phone to the list of English transliterations. I also fixed "green" and "there".

vercel · 2019-10-12T20:44:10Z

This pull request is being automatically deployed with ZEIT Now (learn more).
To see the status of your deployment, click below or on the icon next to each commit.

🔍 Inspect: https://zeit.co/kriskowal/tengwarjs/5g67rag8z
🌍 Preview: https://tengwarjs-git-fork-dreamingfifi-patch-1.kriskowal.now.sh

kriskowal · 2019-10-12T21:02:52Z

test/general-use.js

@@ -33,17 +33,22 @@ module.exports = {
    english: {

        // english (appropriate mode) (sorted)
+        "Jack": "anga;quesse:a,tilde-below",
+        "happy": "hyarmen;parma:a,tilde-below;long-carrier:y",


We will need to create mappings for y-above in dan-smith.js. The y in this notation currently refers to double dots below and can only be carried by the round carrier (which looks like a Latin "c").

https://tengwarjs-fpft2n6u3.now.sh/#y

We can change the names in this notation as long as we are thorough. It may be better to make "y" imply "above", or make both "y-above" and "y-below" explicit.

Then for the Sindarin vowel Y, y-sindar?

Can you describe the appearance of y-sindar? I’m only aware of two dots below and the inverted chapeau above.
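
For reference while discussing these names, here is a minimal sketch of how the notation in the test cases reads: columns are separated by ";", and the marks carried by a column follow a ":" as a comma-separated list. The helper name `parseNotation` is hypothetical and is not part of tengwarjs; it only illustrates the string format as it appears in test/general-use.js.

```javascript
// Hypothetical helper, not part of tengwarjs: splits a notation string such as
// "hyarmen;parma:a,tilde-below;long-carrier:y" into one object per column.
function parseNotation(notation) {
    return notation.split(";").map(function (column) {
        var parts = column.split(":");
        return {
            tengwa: parts[0],
            // Marks listed after the colon, e.g. "a", "tilde-below", "y".
            tehtar: parts.length > 1 ? parts[1].split(",") : []
        };
    });
}

var happy = parseNotation("hyarmen;parma:a,tilde-below;long-carrier:y");
console.log(happy[2].tengwa, happy[2].tehtar); // the long carrier and its "y" mark
```

With names like "y-above" and "y-below", the only change to a string like this would be the mark after the colon; the column structure stays the same.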

kriskowal · 2019-10-12T21:04:07Z

test/general-use.js

        "cake": "quesse;quesse:a,i-below",
        "cakes": "quesse;quesse:a;silme-nuquerna:e",
        "cats.": "quesse;tinco:a,s-final;full-stop", // regression
-        "green": "ungwe;romen;long-carrier:e;numen",
+        "green": "ungwe;romen;short-carrier:e;numen:e",


Are there any cases where we need to preserve the long carrier for English in the mode for general use? This will require changes in general-use.js.

We don't specifically need the long carrier at all for English. Tolkien sometimes used it for pronounced vowels at the ends of words, like the Y in "by" and "history" in the title page inscription. But, other times he didn't use a long carrier at all. So, for General Use, there is no pressing need for there to be a long carrier.

I like to use it at the ends of words because I think it's pretty though. But, that's just my personal preference.

We might be able to preserve the long carrier for final Y æsthetic. Thank you for the details.

kriskowal · 2019-10-12T21:06:37Z

test/general-use.js

        "hobbits": "hyarmen;umbar:o,tilde-below;tinco:i,s-final",
        "hobbits'": "hyarmen;umbar:o,tilde-below;tinco:i,s-inverse",
        "hobbits''": "hyarmen;umbar:o,tilde-below;tinco:i,s-extended",
        "hobbits'''": "hyarmen;umbar:o,tilde-below;tinco:i,s-flourish",
-        "there": "thule;romen:e,i-below",
+        "there": "thule;ore:e,i-below",


Can you restate concisely the rule for romen vs ore? My understanding is that they are both mode and language dependent, and may even in this case have special consideration given that the following vowel is silent. It would be good to include test cases to exercise every variation. Otherwise, we might end up whacking moles.

When Tolkien used Rómen and Órë, he used them according to his dialect, which is an R-drop dialect. This means that the R was only retained when followed by a pronounced vowel ("there" would use Órë, but "therein" uses Rómen). Since the computer can't read pronunciation, I suggest that an R followed by a vowel (excluding silent E) automatically uses Rómen, and that Órë is used the rest of the time. This will get it close to being correct most of the time.

Thanks again for the details. I agree we’d have a hard time capturing all the rules of the non-rhotic dialect.

These examples are good. Let’s capture all of them in cases, including the ones where we’d need to use backtick or full word exception to force the right output, like "ther`e`in".
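
The heuristic suggested in this thread could be sketched roughly as follows. This is not the transcriber's actual code: the function name `chooseR`, the vowel list, and the definition of "silent E" as an "e" that ends the word are all assumptions made for illustration.

```javascript
// Sketch only. Assumes a "silent E" is an "e" that ends the word, and that
// the vowels are a, e, i, o, u (y is deliberately left out here).
var VOWELS = "aeiou";

// Returns the tengwa name for the "r" at the given index of the word.
function chooseR(word, index) {
    var next = word.charAt(index + 1).toLowerCase();
    var isVowel = next !== "" && VOWELS.indexOf(next) !== -1;
    // An "e" in the last position of the word is treated as silent.
    var isSilentE = next === "e" && index + 2 === word.length;
    return isVowel && !isSilentE ? "romen" : "ore";
}

console.log(chooseR("there", 3));   // "ore"
console.log(chooseR("therein", 3)); // "romen"
console.log(chooseR("green", 1));   // "romen"
```

Words that this rule gets wrong would still need a manual override, which is why test cases for each variation are worth capturing.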

kriskowal · 2019-10-12T21:10:51Z

test/general-use.js

        "these": "thule;silme-nuquerna:e;short-carrier:i-below",
-        "these'": "thule;silme-nuquerna:e;short-carrier:e",
+        "these'": "thule;silme-nuquerna:e;short-carrier:e", // invalid input


Let’s add tests for words with a final "e" that would be the exceptions to the rule like "see" and "naïve". In these cases, we will need to use a notation to allow the user to override the default behavior, and maybe even hard-code as many known exceptions as possible.

I’ll note that pattern matching vowel-consonant-vowel to distinguish the silent E case from other voiced final E cases may require extraordinary programming.

Anything valid should be expressible, even if it requires manual intervention to express. That’s the current function of the apostrophe.

Is there any way to make classifications for the letters so it can tell "symbol from this list is vowel, and symbol from this list is consonant"? Or is this what you mean by "extraordinary programming"? I don't know enough about this sort of thing to be able to tell if this would be too complex or not.

Classifying vowels and consonants is easy, except for the cases like w and y where they’re ambiguous for the purposes of English rules.

Matching patterns of three phonemes requires considerable additional work because it involves back-tracking and also taking into account consonant and vowel clusters (not a one character to one phoneme relation). Far easier to pass the burden of distinguishing silent and voiced final e to the user, just assuming e is unvoiced and ë or e` are voiced.

As in the case of naïve, to infer that the final e is voiced from the diaeresis over the ï, we’d have to thread a hint forward, over the following consonant and into the code that matches final e. That is not so hard, but threading the additional state requires altering most of the function calls in the transcriber.
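
To make the simpler option concrete, here is a minimal sketch of passing the decision to the user: a plain final "e" is assumed silent, and a final "ë" or "e`" is taken as voiced. The function name `classifyFinalE` and its return values are hypothetical and do not reflect how the transcriber is structured.

```javascript
// Hypothetical helper: decides how to treat a word-final "e" from explicit
// user marks alone, with no pattern matching on the preceding letters.
function classifyFinalE(word) {
    // Normalize so that "e" plus a combining diaeresis becomes a single "ë".
    var normalized = word.normalize("NFC").toLowerCase();
    if (/(\u00eb|e`)$/.test(normalized)) {
        return "voiced";
    }
    if (/e$/.test(normalized)) {
        return "silent";
    }
    return "none";
}

console.log(classifyFinalE("these")); // "silent"
console.log(classifyFinalE("see`"));  // "voiced"
```

Under this approach an unmarked "see" is classified as silent, so the user has to write "see`" to get the voiced reading; no state needs to be threaded forward from earlier letters.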

kriskowal · 2019-10-13T05:40:54Z

Pending discussion on design direction #31

kriskowal · 2019-10-13T19:56:40Z

test/general-use.js

+        "happy": "hyarmen;parma:a,tilde-below;long-carrier:y",
+        "style": "silme;tinco;lambe:y,i-below",
+        "yellow": "anna;lambe:e,tilde-below;vala:o",
+        "phone": "formenparma;numen:o,i-below",


I’ve called this parma-extended. I like formenparma and the like better, though. I would be grateful for an issue to track that.

Update general-use.js

610de41

vercel bot deployed to staging October 12, 2019 20:44 View deployment

kriskowal changed the title ~~Update general-use.js~~ Add tests for English transcriptions with J and EE Oct 12, 2019
