Import Noto script data #165

Open
wants to merge 5 commits into
base: main

simoncozens
Contributor

Adam Twardoch pulled together a load of data about each script, which is used to power the script sections in the articles of Noto fonts. It would be helpful if we had this data in the lang repo, where it could be easily updated and edited, as part of making it easier to generate new Noto articles.

@simoncozens changed the title from "Add new proto fields" to "Import Noto script data" on Sep 13, 2024
@@ -1,3 +1,4 @@
id: "Dupl"
name: "Duployan shorthand"

family: "American"
summary: "Duployan shorthand (Sloan-Duployan shorthand, Duployan stenography) is an American alphabet, written left-to-right. Geometric stenography script created in 1860 by Father Émile Duployé for writing French, later expanded and adapted for writing English, German, Spanish, Romanian, and Chinook Jargon. Heavily cursive (connected), allows words to be written in a single stroke. Praised for simplicity and speed of writing. Needs software support for complex text layout (shaping)."

Duployan is not an American script: it is a European script. It was created in France for French.

I recommend against listing German and Spanish: they never used Duployan much. I recommend ending the list with “and many others” as in Cyrl.textproto and Latn.textproto because there are at least a couple dozen more languages for which Duployan was adapted.

@@ -1,3 +1,4 @@
id: "Hani"
name: "Han"

family: "East Asian"
summary: "Han (Hanzi, Kanji, Hanja, <span class=\'autonym\'>汉字, 漢字</span>) is an East Asian logo-syllabary, written vertically right-to-left and horizontally left-to-right (over 1.3 billion users). Used at least since the Shang dynasty (1600–1046 BCE) to write the Chinese (Sinitic) languages like Mandarin and Cantonese, but also, today or in the past, Japanese, Korean, Vietnamese, Okinawan, Zhuang, Miao and other languages. The Han script has regional variations: Traditional Chinese (since the 5th century CE, today used in Taiwan, Hong Kong, Macau), Simplified Chinese (used since 1949–1956 in mainland China, Singapore, and Malaysia), Japanese (called Hanji, used together with the Hiragana and Katakana syllabaries in Japan), Korean (called Hanja, widely used for the Korean language since 400 BCE until the mid-20th century). Fundamentally the same characters represent the same or highly related concepts across dialects and languages, which themselves are often mutually unintelligible or completely unrelated. Some 2,100–2,500 Han characters are required for basic literacy, some 5,200–6,300 for reading typical texts. Many more are needed for specialized or historical texts: the Unicode Standard encodes over 94,000 Han characters. "

It’s “Kanji” in Japanese, not “Hanji”.


historical: true
family: "Indic"
summary: "Nandinagari (<span class=\'autonym\'>𑧁𑧞𑧤𑦿𑧁𑧑𑦰𑧈𑧓</span>) is a historical Indic abugida, written left-to-right, with unconnected headstrokes. Was used in the 8th–19th centuries in South India for Sanskrit texts about philosophy, science and the arts. Closely related to Devanagari."

The autonym is “𑧁𑧞𑦿𑧒𑧁𑧑𑦰𑧈𑧓”.


historical: true
family: "Middle Eastern"
summary: "Chorasmian is a historical Middle Eastern abjad, written right-to-left. Was used in the 2nd century BCE–-9th century CE in the Khwarazm region of Central Asia for the now-extinct Chorasmian language, until the language switched to the Arabic script. Derived from Imperial Aramaic."

There is an extraneous hyphen after the dash in “BCE–-9th”.
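
A slip like this one could be caught mechanically before review. The following is a minimal sketch, assuming the summary is available as a plain Python string; `find_mixed_dashes` is a hypothetical helper and not part of the lang repo.

```python
import re

# Hypothetical lint rule (an assumption, not existing repo tooling):
# an en dash directly next to an ASCII hyphen is almost always a typo.
MIXED_DASH = re.compile(r"–-|-–")


def find_mixed_dashes(summary: str) -> list[str]:
    """Return every en-dash/hyphen pair found in the summary text."""
    return MIXED_DASH.findall(summary)


if __name__ == "__main__":
    sample = "Was used in the 2nd century BCE–-9th century CE in the Khwarazm region"
    print(find_mixed_dashes(sample))
```

Ordinary hyphenated words such as "Sloan-Duployan" and plain ranges such as "1600–1046" are left alone, since the pattern only matches the two characters side by side.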

@@ -1,3 +1,4 @@
id: "Soyo"
name: "Soyombo"

family: "Indic"
summary: "Soyombo (<span class=\'autonym\'>𑪞𑪞‎</span>) is a historical Indic abugida, written left-to-right. Was used in 1686–18th century as a ceremonial and decorative script for the Mongolian language. Also sporadically used for Tibetan and Sanskrit. Created by Bogdo Zanabazar. Needs software support for complex text layout (shaping)."

This is two instances of the Soyombo symbol, not the Soyombo autonym.

@simoncozens
Contributor Author

And this is why I wanted to have this information in a more visible repository. :-)

@@ -1,3 +1,4 @@
id: "Dupl"
name: "Duployan shorthand"

family: "American"
summary: "Duployan shorthand (Sloan-Duployan shorthand, Duployan stenography) is an European alphabet, written left-to-right. Geometric stenography script created in 1860 by Father Émile Duployé for writing French, later expanded and adapted for writing English, Chinook Jargon and many others. Heavily cursive (connected), allows words to be written in a single stroke. Praised for simplicity and speed of writing. Needs software support for complex text layout (shaping)."

“an European” should be “a European”.

The family should be "European".
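
The mismatch between the `family` field and the summary text could also be checked automatically. This is a rough sketch that assumes every summary follows the "is a(n) [historical] <family> ..." wording seen in the hunks above, and that the two field values have already been extracted as strings; `summary_names_family` is a hypothetical helper, not part of the lang repo.

```python
import re


def summary_names_family(family: str, summary: str) -> bool:
    """Check that the summary describes the script with the given family.

    Assumes the phrasing "is a/an [historical] <family> ..." used by the
    summaries in this pull request; other wordings would need more rules.
    """
    pattern = rf"\bis an? (?:historical )?{re.escape(family)}\b"
    return re.search(pattern, summary) is not None


if __name__ == "__main__":
    summary = (
        "Duployan shorthand (Sloan-Duployan shorthand, Duployan stenography) "
        "is an European alphabet, written left-to-right."
    )
    print(summary_names_family("American", summary))
    print(summary_names_family("European", summary))
```

The check is deliberately tolerant of the article ("a" or "an"), so it flags only the family disagreement and leaves the grammar fix to a separate rule.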

2 participants