Support node names with strings containing Unicode code points #17

Open
zoj613 opened this issue Jul 5, 2024 · 0 comments
Labels
enhancement (New feature or request), help wanted (Extra attention is needed), node (related to the node module)

Comments

Owner

zoj613 commented Jul 5, 2024

The Zarr V3 specification requires every node to have a name, which is a string of Unicode code points. It recommends that implementations restrict names to the characters a-z, A-Z, 0-9, -, _, and ., but does not enforce this. For names containing non-ASCII Unicode characters, it also recommends case-folded, NFKC-normalized strings. OCaml has several libraries that support this:

  • Decoding UTF-8 encoded strings with uutf
  • Segmentation of Unicode text with uuseg
  • Normalization with uunf
  • Inspection with uucp

There is also an introductory text with usage tips that can help make things easy to implement.
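
As a starting point, the check for the recommended character set can be sketched with the OCaml standard library alone (OCaml 4.14 or later, for `String.is_valid_utf_8`). This is only a rough sketch: the names `is_recommended_char`, `uses_recommended_chars`, `name_class` and `classify_name` are hypothetical and not part of any existing API, and it does not do case folding or NFKC normalization, which would still need something like uunf and uucp. It also does not check any other naming rules the specification may impose.

```ocaml
(* Hypothetical helpers for illustration; not an existing API. *)

(* True for the characters the spec recommends:
   a-z, A-Z, 0-9, '-', '_' and '.'. *)
let is_recommended_char = function
  | 'a' .. 'z' | 'A' .. 'Z' | '0' .. '9' | '-' | '_' | '.' -> true
  | _ -> false

(* True when every byte of [name] is in the recommended set. *)
let uses_recommended_chars name = String.for_all is_recommended_char name

type name_class =
  | Recommended    (* only recommended ASCII characters *)
  | Other_unicode  (* valid UTF-8 outside the recommended set;
                      a candidate for case folding + NFKC *)
  | Invalid_utf_8  (* not a valid UTF-8 byte sequence *)

(* Classify a candidate node name given as a UTF-8 encoded string. *)
let classify_name name =
  if not (String.is_valid_utf_8 name) then Invalid_utf_8
  else if uses_recommended_chars name then Recommended
  else Other_unicode
```

Names classified as `Other_unicode` are the ones that would go through the normalization step before being stored or compared.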

zoj613 added the enhancement, help wanted, and node labels Jul 5, 2024