xWidthAvg: Add subset support for non-latin character sets #177

Merged
merged 53 commits into from
Mar 8, 2024
Changes from 43 commits
644339e
Add support for `xWidthAvgByLang`
michaeltaranto Nov 16, 2023
314045d
Merge branch 'master' into add-language-support
michaeltaranto Nov 16, 2023
bdbeba2
Add missing newline
michaeltaranto Nov 16, 2023
3d5efb2
Fix preview site build
michaeltaranto Nov 16, 2023
2286f68
Fix build order
michaeltaranto Nov 16, 2023
17a4176
Maybe?
michaeltaranto Nov 16, 2023
114bb01
Update node for actions
michaeltaranto Nov 16, 2023
0eb5055
Update node and ts-node
michaeltaranto Nov 16, 2023
0bdedba
Scripts update
michaeltaranto Nov 17, 2023
f282371
Move to tsx
michaeltaranto Nov 17, 2023
f0a2472
Node ssl legacy fix
michaeltaranto Nov 19, 2023
4b84246
Downgrade node
michaeltaranto Nov 20, 2023
d473b1b
Merge branch 'master' into add-language-support
michaeltaranto Nov 20, 2023
9dc7d85
Sort data sets
michaeltaranto Nov 20, 2023
fa269b2
Merge branch 'master' into add-language-support
michaeltaranto Nov 21, 2023
1d9d464
Move dep and redo lock
michaeltaranto Nov 21, 2023
9f3447c
Tweaks
michaeltaranto Nov 21, 2023
77885ed
Skip corepack check on snapshot and release
michaeltaranto Nov 21, 2023
c96beb4
Named exports for languages
michaeltaranto Nov 29, 2023
f13eb21
Tweaks
michaeltaranto Nov 29, 2023
5dd780b
Add subset support
michaeltaranto Dec 12, 2023
9ddd295
Updates
michaeltaranto Dec 21, 2023
3d10fdc
End to end working
michaeltaranto Dec 22, 2023
8494c9e
Merge branch 'master' into add-subset-support
michaeltaranto Dec 22, 2023
73dcddf
Merge branch 'master' into add-subset-support
michaeltaranto Jan 18, 2024
7acffec
pnpm v7 lock file
michaeltaranto Jan 18, 2024
bfc5d7e
add more changesets
michaeltaranto Jan 31, 2024
ee8350b
Merge branch 'master' into add-subset-support
michaeltaranto Feb 5, 2024
fe45627
More changesets
michaeltaranto Feb 6, 2024
890f57f
Another changeset
michaeltaranto Feb 6, 2024
2efbf2f
Merge master
michaeltaranto Feb 7, 2024
85c1080
Merge branch 'master' into add-subset-support
michaeltaranto Feb 7, 2024
809a9dd
Merge branch 'master' into add-subset-support
michaeltaranto Feb 14, 2024
e2298c0
Merge frequencies PR
michaeltaranto Feb 15, 2024
f6dd6e7
Merge branch 'master' into add-subset-support
michaeltaranto Feb 16, 2024
9d46e47
Fix merge
michaeltaranto Feb 16, 2024
03ebb3e
Fix lock file
michaeltaranto Feb 16, 2024
3595c60
Merge branch 'master' into add-subset-support
michaeltaranto Feb 26, 2024
7e2bbd8
Update changeset and dev command workflow
michaeltaranto Feb 26, 2024
9bb5375
Merge branch 'master' into add-subset-support
michaeltaranto Feb 27, 2024
7b0f00a
Merge branch 'master' into add-subset-support
michaeltaranto Feb 27, 2024
3cf9cd8
Update changeset
michaeltaranto Feb 27, 2024
7e8b99d
Tidy up and document adding future subsets
michaeltaranto Feb 27, 2024
f269177
Apply suggestions from code review
michaeltaranto Feb 27, 2024
9ab3baf
PR feedback
michaeltaranto Feb 27, 2024
d93ffee
Additional feedback
michaeltaranto Feb 29, 2024
186780f
Refactor subsets internal to metric object
michaeltaranto Mar 4, 2024
b90b809
Include unpack in changeset
michaeltaranto Mar 4, 2024
54aa415
Add subset support to createFontStack
michaeltaranto Mar 4, 2024
a3a1900
Include space in weightings
michaeltaranto Mar 4, 2024
8a04bd2
Merge branch 'master' into add-subset-support
michaeltaranto Mar 6, 2024
42bbf67
Make subsets optional on FontMetrics type
michaeltaranto Mar 6, 2024
9d7ae2f
Apply suggestions from code review
michaeltaranto Mar 8, 2024
51 changes: 51 additions & 0 deletions .changeset/pink-plants-develop.md
@@ -0,0 +1,51 @@
---
'@capsizecss/metrics': minor
'@capsizecss/unpack': minor
---

xWidthAvg: Add `subset` support for non-latin character sets

Previously, the `xWidthAvg` metric was calculated from character frequencies measured in English text only. For languages that use a different unicode subset range, e.g. Thai, the `xWidthAvg` metric would therefore be incorrect.

Supporting Thai now enables adding support for other unicode ranges in the future.

### What's changed?

#### `@capsizecss/metrics/...`

The metrics for each subset can be imported as named exports from each font's entry point.
The default export will continue to be the `latin` subset.

```ts
// Default export provides `latin` subset
import arial from '@capsizecss/metrics/arial';

// Named exports available for all supported subsets:
import { latin as arialLatin } from '@capsizecss/metrics/arial'; // same as default above
import { thai as arialThai } from '@capsizecss/metrics/arial';
```

#### `@capsizecss/metrics/entireMetricsCollection`

The same goes for the `entireMetricsCollection`, with named exports for each subset.
The default export will continue to be the `latin` subset.

```ts
// Default export provides `latin` subset
import metrics from '@capsizecss/metrics/entireMetricsCollection';

// Named exports available for all supported subsets:
import { latin as metricsLatin } from '@capsizecss/metrics/entireMetricsCollection'; // same as default above
import { thai as metricsThai } from '@capsizecss/metrics/entireMetricsCollection';
```

#### `@capsizecss/unpack`

All APIs now accept a second argument: an options object that specifies the `subset`.
This ensures the returned `xWidthAvg` metric is accurate for the specified subset.

```ts
import { fromUrl } from '@capsizecss/unpack';

// `url` is the location of the font file to unpack
const metrics = await fromUrl(url, {
  subset: 'thai',
});
```
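
To illustrate what a subset-specific `xWidthAvg` means, here is a minimal TypeScript sketch of a frequency-weighted average glyph width. The weightings, glyph widths, and the `weightedAverageWidth` helper are hypothetical values and names chosen for illustration; this is not the implementation inside `@capsizecss/unpack`, and falling back per character is an assumption of the sketch.

```typescript
// Hypothetical per-subset character weightings (relative frequency in sample text).
type Subset = 'latin' | 'thai';
type Weightings = Record<string, number>;

const weightingsBySubset: Record<Subset, Weightings> = {
  latin: { e: 0.5, t: 0.25, ' ': 0.25 },
  thai: { 'า': 0.5, 'น': 0.5 },
};

// Hypothetical advance widths in font units.
const glyphWidths: Record<string, number> = {
  e: 500,
  t: 300,
  ' ': 250,
  'า': 450,
  'น': 650,
};

// Weighted average of glyph widths for the characters of one subset.
// Characters without a glyph use `fallbackWidth` (an assumption of this sketch).
function weightedAverageWidth(
  widths: Record<string, number>,
  subset: Subset,
  fallbackWidth: number,
): number {
  let weightedSum = 0;
  let totalWeight = 0;

  for (const [char, weight] of Object.entries(weightingsBySubset[subset])) {
    const width = widths[char] ?? fallbackWidth;
    weightedSum += width * weight;
    totalWeight += weight;
  }

  return totalWeight > 0 ? weightedSum / totalWeight : fallbackWidth;
}

const latinAvg = weightedAverageWidth(glyphWidths, 'latin', 500);
const thaiAvg = weightedAverageWidth(glyphWidths, 'thai', 500);
```

With these made-up numbers the two subsets produce different averages for the same font, which is why a value measured against English text alone is a poor fit for Thai content.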
2 changes: 1 addition & 1 deletion package.json
@@ -10,7 +10,7 @@
     "test": "jest",
     "format": "prettier --write .",
     "lint": "manypkg check && prettier --check . && tsc",
-    "dev": "pnpm %packages dev && pnpm generate",
+    "dev": "pnpm unpack:generate && pnpm %packages dev && pnpm metrics:generate",
     "build": "pnpm %packages build && pnpm generate",
     "generate": "pnpm unpack:generate && pnpm metrics:generate",
     "copy-readme": "node scripts/copy-readme",
47 changes: 34 additions & 13 deletions packages/metrics/README.md
@@ -41,17 +41,17 @@ const capsizeStyles = createStyleObject({

 The font metrics object returned contains the following properties if available:

-| Property | Type | Description |
-| ---------- | ------ | --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
-| familyName | string | The font family name as authored by font creator |
-| category | string | The style of the font: serif, sans-serif, monospace, display, or handwriting. |
-| capHeight | number | The height of capital letters above the baseline |
-| ascent | number | The height of the ascenders above baseline |
-| descent | number | The descent of the descenders below baseline |
-| lineGap | number | The amount of space included between lines |
-| unitsPerEm | number | The size of the font’s internal coordinate grid |
-| xHeight | number | The height of the main body of lower case letters above baseline |
-| xWidthAvg | number | The average width of character glyphs in the font. Calculated based on character frequencies in written text ([see below]), falling back to the built in [xAvgCharWidth] from the OS/2 table. |
+| Property | Type | Description |
+| ---------- | ------ | ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
+| familyName | string | The font family name as authored by font creator |
+| category | string | The style of the font: serif, sans-serif, monospace, display, or handwriting. |
+| capHeight | number | The height of capital letters above the baseline |
+| ascent | number | The height of the ascenders above baseline |
+| descent | number | The descent of the descenders below baseline |
+| lineGap | number | The amount of space included between lines |
+| unitsPerEm | number | The size of the font’s internal coordinate grid |
+| xHeight | number | The height of the main body of lower case letters above baseline |
+| xWidthAvg | number | The average width of character glyphs in the font for the selected unicode subset. Calculated based on character frequencies in written text ([see below]), falling back to the built in [xAvgCharWidth] from the OS/2 table. |

 #### How `xWidthAvg` is calculated

@@ -61,15 +61,36 @@ The value takes a weighted average of character glyph widths in the font, fallin
 The purpose of this metric is to support generating CSS metric overrides (e.g. [`ascent-override`], [`size-adjust`], etc) for fallback fonts, enabling inference of average line lengths so that a fallback font can be scaled to better align with a web font. This can be done either manually or using [`createFontStack`].

 For this technique to be effective, the metric factors in character frequency weightings as observed in written language, using “abstracts” from [Wikinews] articles as a data source.
-Currently only supporting English ([source](https://en.wikinews.org/)).
+Below is the source analysed for each supported subset:
+
+| Subset | Language |
+| ------- | -------------------------------------------- |
+| `latin` | English ([source](https://en.wikinews.org/)) |
+| `thai` | Thai ([source](https://th.wikinews.org/)) |
+
+For more information on how to access the metrics for different subsets, see the [subsets](#subsets) section below.

 [see below]: #how-xwidthavg-is-calculated
 [xavgcharwidth]: https://learn.microsoft.com/en-us/typography/opentype/spec/os2#xavgcharwidth
 [`ascent-override`]: https://developer.mozilla.org/en-US/docs/Web/CSS/@font-face/ascent-override
 [`size-adjust`]: https://developer.mozilla.org/en-US/docs/Web/CSS/@font-face/size-adjust
 [`createfontstack`]: ../core/README.md#createfontstack
 [wikinews]: https://www.wikinews.org/
+
+## Subsets
+
+The default export for each font's metrics is the `latin` subset; however, there are named exports available for each of the supported subsets.
+
+For example:
+
+```ts
+// Default export provides `latin` subset
+import arial from '@capsizecss/metrics/arial';
+
+// Named exports available for all supported subsets:
+import { latin as arialLatin } from '@capsizecss/metrics/arial'; // same as default above
+import { thai as arialThai } from '@capsizecss/metrics/arial';
+```

 ## Supporting APIs

 ### `fontFamilyToCamelCase`
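
The README text above says `xWidthAvg` exists to help scale a fallback font so its average line length matches the web font. As a rough sketch of that idea, the snippet below derives a `size-adjust` percentage from the ratio of the two fonts' average character widths, each normalised by `unitsPerEm`. The `sizeAdjustFor` and `toPercentage` helpers and all metric values are hypothetical; `createFontStack` may compute its overrides differently.

```typescript
// Only the two properties needed for this sketch.
interface WidthMetrics {
  xWidthAvg: number;
  unitsPerEm: number;
}

// Ratio of normalised average character widths: web font over fallback font.
function sizeAdjustFor(webFont: WidthMetrics, fallbackFont: WidthMetrics): number {
  const webFontAvg = webFont.xWidthAvg / webFont.unitsPerEm;
  const fallbackAvg = fallbackFont.xWidthAvg / fallbackFont.unitsPerEm;
  return webFontAvg / fallbackAvg;
}

// Format a ratio as a CSS percentage with two decimal places.
const toPercentage = (value: number): string => `${(value * 100).toFixed(2)}%`;

// Hypothetical `thai` subset metrics for a web font and its fallback.
const webFontThai: WidthMetrics = { xWidthAvg: 1000, unitsPerEm: 2000 };
const fallbackThai: WidthMetrics = { xWidthAvg: 800, unitsPerEm: 2000 };

const sizeAdjust = toPercentage(sizeAdjustFor(webFontThai, fallbackThai));
```

Both inputs should come from the same subset: comparing a `thai` average for one font against a `latin` average for the other would skew the ratio.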
14 changes: 8 additions & 6 deletions packages/metrics/scripts/analyse.ts
@@ -2,7 +2,9 @@ import fs from 'fs';
 import path from 'path';
 import googleFontsMetrics from './googleFonts.json';

-type FontMetrics = (typeof googleFontsMetrics)[number];
+type FontName = keyof typeof googleFontsMetrics;
+type FontMetrics = (typeof googleFontsMetrics)[FontName]['latin'];
+
 interface Report {
   name: string;
   run: (fontMetrics: FontMetrics) => boolean;
@@ -34,7 +36,7 @@ interface Report {
     },
     {
       name: 'capHeightLessThanHalfAscent',
-      run: (font) => {
+      run: (font: FontMetrics) => {
         if (
           'capHeight' in font &&
           font.capHeight &&
@@ -50,13 +52,13 @@ interface Report {

   const results: Record<string, Array<FontMetrics>> = {};

-  for (const font of googleFontsMetrics) {
+  for (const font of Object.keys(googleFontsMetrics) as FontName[]) {
     for (const report of reports) {
-      if (report.run(font)) {
+      if (report.run(googleFontsMetrics[font]['latin'])) {
         if (report.name in results) {
-          results[report.name].push(font);
+          results[report.name].push(googleFontsMetrics[font]['latin']);
         } else {
-          results[report.name] = [font];
+          results[report.name] = [googleFontsMetrics[font]['latin']];
         }
       }
     }
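
The change above reflects the generated data moving from an array of metrics to an object keyed by font and then by subset. A self-contained sketch of that pattern follows, using a small inline object in place of `googleFonts.json`. The font entries and the exact report condition are hypothetical: the real condition is truncated in the diff, so the one below is only inferred from the report's name.

```typescript
// Hypothetical stand-in for the generated JSON, keyed by font then subset.
const metricsByFamily = {
  exampleSans: {
    latin: { familyName: 'Example Sans', capHeight: 700, ascent: 1000 },
    thai: { familyName: 'Example Sans', capHeight: 700, ascent: 1000 },
  },
  exampleScript: {
    latin: { familyName: 'Example Script', capHeight: 400, ascent: 1000 },
    thai: { familyName: 'Example Script', capHeight: 400, ascent: 1000 },
  },
};

type FontName = keyof typeof metricsByFamily;
type FontMetrics = (typeof metricsByFamily)[FontName]['latin'];

interface Report {
  name: string;
  run: (fontMetrics: FontMetrics) => boolean;
}

const reports: Report[] = [
  {
    name: 'capHeightLessThanHalfAscent',
    // Assumed condition, inferred from the report name.
    run: (font) => font.capHeight < font.ascent / 2,
  },
];

const results: Record<string, FontMetrics[]> = {};

// Object.keys returns string[], so the keys are narrowed back to FontName.
for (const fontName of Object.keys(metricsByFamily) as FontName[]) {
  const latinMetrics = metricsByFamily[fontName].latin;

  for (const report of reports) {
    if (report.run(latinMetrics)) {
      if (!(report.name in results)) {
        results[report.name] = [];
      }
      results[report.name].push(latinMetrics);
    }
  }
}
```

Reading the `latin` entry once per font keeps the analysis equivalent to the previous array-based version, since these reports look at metrics that do not vary with the subset.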
38 changes: 30 additions & 8 deletions packages/metrics/scripts/buildMetrics.ts
@@ -1,4 +1,11 @@
-import { Font, fromFile, fromUrl } from '@capsizecss/unpack';
+import {
+  Font,
+  fromFile,
+  fromUrl,
+  supportedSubsets,
+  type SupportedSubsets,
+} from '@capsizecss/unpack';
+import sortKeys from 'sort-keys';

 type FontCategory =
   | 'serif'
@@ -10,6 +17,11 @@ export interface MetricsFont extends Font {
   category: FontCategory;
 }

+export type MetricsByFamilyBySubset = Record<
+  string,
+  Record<SupportedSubsets, MetricsFont>
+>;
+
 interface Options {
   fontSource: string;
   sourceType: 'file' | 'url';
@@ -30,12 +42,22 @@ export const buildMetrics = async ({
   sourceType,
   category,
   overrides = {},
-}: Options): Promise<MetricsFont> => {
-  const metrics = await extractor[sourceType](fontSource);
+}: Options): Promise<MetricsByFamilyBySubset> => {
+  const content: MetricsByFamilyBySubset = {};
+
+  await Promise.all(
+    supportedSubsets.map(async (subset) => {
+      const metrics = await extractor[sourceType](fontSource, { subset });
+      const name = overrides.familyName || metrics.familyName;
+      content[name] = content[name] || {};
+      content[name][subset] = {
+        ...metrics,
+        ...overrides,
+        category,
+      };
+      content[name] = sortKeys(content[name]);
+    }),
+  );

-  return {
-    ...metrics,
-    ...overrides,
-    category,
-  };
+  return content;
 };
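
The new `buildMetrics` extracts metrics once per supported subset and collects them under the family name. The sketch below shows the same shape with a stubbed extractor so it runs without font files. `extractExample`, `buildExampleMetrics`, the local `supportedSubsets` list, and the returned widths are all hypothetical stand-ins, not the `@capsizecss/unpack` API.

```typescript
// Local stand-ins for the subset list and types exported by the real package.
const supportedSubsets = ['latin', 'thai'] as const;
type SupportedSubset = (typeof supportedSubsets)[number];

interface ExampleMetrics {
  familyName: string;
  xWidthAvg: number;
}

type ExampleMetricsByFamilyBySubset = Record<
  string,
  Partial<Record<SupportedSubset, ExampleMetrics>>
>;

// Hypothetical extractor: echoes the source as the family name and returns
// a made-up average width for the requested subset.
async function extractExample(
  source: string,
  options: { subset: SupportedSubset },
): Promise<ExampleMetrics> {
  const widths: Record<SupportedSubset, number> = { latin: 900, thai: 1100 };
  return { familyName: source, xWidthAvg: widths[options.subset] };
}

// Run one extraction per subset concurrently and group results by family.
async function buildExampleMetrics(
  source: string,
): Promise<ExampleMetricsByFamilyBySubset> {
  const content: ExampleMetricsByFamilyBySubset = {};

  await Promise.all(
    supportedSubsets.map(async (subset) => {
      const metrics = await extractExample(source, { subset });
      // No await between this read and the write below, so concurrent
      // callbacks cannot overwrite each other's entries.
      const bySubset = content[metrics.familyName] ?? {};
      bySubset[subset] = metrics;
      content[metrics.familyName] = bySubset;
    }),
  );

  return content;
}
```

Calling `buildExampleMetrics('Example Sans')` resolves to an object with a single `'Example Sans'` key holding both `latin` and `thai` entries, mirroring the `MetricsByFamilyBySubset` shape introduced in the diff.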
42 changes: 19 additions & 23 deletions packages/metrics/scripts/extractSystemFontMetrics.ts
@@ -1,6 +1,7 @@
+import sortKeys from 'sort-keys';
 import fs from 'fs/promises';
 import path from 'path';
-import { buildMetrics } from './buildMetrics';
+import { type MetricsByFamilyBySubset, buildMetrics } from './buildMetrics';

 (async () => {
   const fontDirectory = process.env.FONT_DIRECTORY;
@@ -112,28 +113,23 @@ import { buildMetrics } from './buildMetrics';
     },
   });

-  const content = [
-    arial,
-    appleSystem,
-    blinkMacSystemFont,
-    roboto,
-    segoeui,
-    oxygen,
-    helvetica,
-    helveticaNeue,
-    timesNewRoman,
-    tahoma,
-    lucidaGrande,
-    verdana,
-    trebuchetMS,
-    georgia,
-    courierNew,
-    brushScript,
-  ].sort((a, b) => {
-    const fontA = a.familyName.toUpperCase();
-    const fontB = b.familyName.toUpperCase();
-
-    return fontA < fontB ? -1 : fontA > fontB ? 1 : 0;
+  const content: MetricsByFamilyBySubset = sortKeys({
+    ...arial,
+    ...appleSystem,
+    ...blinkMacSystemFont,
+    ...roboto,
+    ...segoeui,
+    ...oxygen,
+    ...helvetica,
+    ...helveticaNeue,
+    ...timesNewRoman,
+    ...tahoma,
+    ...lucidaGrande,
+    ...verdana,
+    ...trebuchetMS,
+    ...georgia,
+    ...courierNew,
+    ...brushScript,
   });

   await fs.writeFile(