Commit 279afbc

Merge pull request #131 from dart-lang/merge-characters-package
Merge `package:characters`
2 parents 0237f43 + 94061ca commit 279afbc

46 files changed

+22904
-0
lines changed

.github/labeler.yml

Lines changed: 4 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -8,6 +8,10 @@
88
- changed-files:
99
- any-glob-to-any-file: 'pkgs/async/**'
1010

11+
"package:characters":
12+
- changed-files:
13+
- any-glob-to-any-file: 'pkgs/characters/**'
14+
1115
"package:convert":
1216
- changed-files:
1317
- any-glob-to-any-file: 'pkgs/convert/**'

.github/workflows/characters.yaml

Lines changed: 72 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,72 @@
1+
name: package:characters
2+
3+
on:
4+
# Run CI on pushes to the main branch, and on PRs against main.
5+
push:
6+
branches: [ main ]
7+
paths:
8+
- '.github/workflows/characters.yaml'
9+
- 'pkgs/characters/**'
10+
pull_request:
11+
branches: [ main ]
12+
paths:
13+
- '.github/workflows/characters.yaml'
14+
- 'pkgs/characters/**'
15+
schedule:
16+
- cron: "0 0 * * 0"
17+
env:
18+
PUB_ENVIRONMENT: bot.github
19+
20+
defaults:
21+
run:
22+
working-directory: pkgs/characters/
23+
24+
jobs:
25+
# Check code formatting and static analysis on a single OS (linux)
26+
# against dev, stable, and 3.4 (the package's lower bound).
27+
analyze:
28+
runs-on: ubuntu-latest
29+
strategy:
30+
fail-fast: false
31+
matrix:
32+
sdk: [dev, stable, 3.4]
33+
steps:
34+
- uses: actions/checkout@d632683dd7b4114ad314bca15554477dd762a938
35+
- uses: dart-lang/setup-dart@0a8a0fc875eb934c15d08629302413c671d3f672
36+
with:
37+
sdk: ${{ matrix.sdk }}
38+
- id: install
39+
name: Install dependencies
40+
run: dart pub get
41+
- name: Check formatting
42+
run: dart format --output=none --set-exit-if-changed .
43+
if: matrix.sdk == 'dev' && steps.install.outcome == 'success'
44+
- name: Analyze code
45+
run: dart analyze --fatal-infos
46+
if: always() && steps.install.outcome == 'success'
47+
48+
# Run tests on a matrix consisting of two dimensions:
49+
# 1. OS: ubuntu-latest
50+
# 2. Release channel: dev, stable, and 3.4 (the package's lower bound)
51+
test:
52+
needs: analyze
53+
runs-on: ${{ matrix.os }}
54+
strategy:
55+
fail-fast: false
56+
matrix:
57+
os: [ubuntu-latest]
58+
sdk: [dev, stable, 3.4]
59+
steps:
60+
- uses: actions/checkout@d632683dd7b4114ad314bca15554477dd762a938
61+
- uses: dart-lang/setup-dart@0a8a0fc875eb934c15d08629302413c671d3f672
62+
with:
63+
sdk: ${{ matrix.sdk }}
64+
- id: install
65+
name: Install dependencies
66+
run: dart pub get
67+
- name: Run VM tests
68+
run: dart test --platform vm
69+
if: always() && steps.install.outcome == 'success'
70+
- name: Run Chrome tests
71+
run: dart test --platform chrome
72+
if: always() && steps.install.outcome == 'success'

README.md

Lines changed: 1 addition & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -10,6 +10,7 @@ This repository is home to various Dart packages under the [dart.dev](https://pu
1010
|---|---|---|
1111
| [args](pkgs/args/) | Library for defining parsers for parsing raw command-line arguments into a set of options and values. | [![pub package](https://img.shields.io/pub/v/args.svg)](https://pub.dev/packages/args) |
1212
| [async](pkgs/async/) | Utility functions and classes related to the 'dart:async' library.| [![pub package](https://img.shields.io/pub/v/async.svg)](https://pub.dev/packages/async) |
13+
| [characters](pkgs/characters/) | String replacement with operations that are Unicode/grapheme cluster aware. | [![pub package](https://img.shields.io/pub/v/characters.svg)](https://pub.dev/packages/characters) |
1314
| [convert](pkgs/convert/) | Utilities for converting between data representations. | [![pub package](https://img.shields.io/pub/v/convert.svg)](https://pub.dev/packages/convert) |
1415
| [crypto](pkgs/crypto/) | Implementations of SHA, MD5, and HMAC cryptographic functions. | [![pub package](https://img.shields.io/pub/v/crypto.svg)](https://pub.dev/packages/crypto) |
1516
| [fixnum](pkgs/fixnum/) | Library for 32- and 64-bit signed fixed-width integers. | [![pub package](https://img.shields.io/pub/v/fixnum.svg)](https://pub.dev/packages/fixnum) |

pkgs/characters/.gitignore

Lines changed: 4 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,4 @@
1+
.dart_tool/
2+
.packages
3+
pubspec.lock
4+
doc/api/

pkgs/characters/AUTHORS

Lines changed: 6 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,6 @@
1+
# Below is a list of people and organizations that have contributed
2+
# to the Dart project. Names should be added to the list like so:
3+
#
4+
# Name/Organization <email address>
5+
6+
Google LLC

pkgs/characters/CHANGELOG.md

Lines changed: 67 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,67 @@
1+
## 1.3.1
2+
3+
* Fixed README rendering on pub.dev and API docs.
4+
* Require Dart `^3.4.0`.
5+
* Move to `dart-lang/core` monorepo.
6+
7+
## 1.3.0
8+
9+
* Updated to use Unicode 15.0.0.
10+
11+
## 1.2.1
12+
13+
* Update the value of the pubspec `repository` field.
14+
15+
## 1.2.0
16+
17+
* Fix `Characters.where` which unnecessarily did the iteration and test twice.
18+
* Adds `Characters.empty` constant and makes `Characters("")` return it.
19+
* Changes the argument type of `Characters.contains` to (covariant) `String`.
20+
The implementation still accepts `Object?`, so it can be cast to
21+
`Iterable<Object?>`, but you get warned if you try to call directly with a
22+
non-`String`.
23+
24+
## 1.1.0
25+
26+
* Stable release for null safety.
27+
* Added `stringBeforeLength` and `stringAfterLength` to `CharacterRange`.
28+
* Added `CharacterRange.at` constructor.
29+
* Added `getRange(start, end)` and `characterAt(pos)` to `Characters`
30+
as alternative to `.take(end).skip(start)` and `getRange(pos, pos + 1)`.
31+
* Change some positional parameter names from `other` to `characters`.
32+
33+
## 1.0.0
34+
35+
* Core APIs deemed stable; package version set to 1.0.0.
36+
* Added `split` methods on `Characters` and `CharacterRange`.
37+
38+
## 0.5.0
39+
40+
* Change [codeUnits] getter to [utf16CodeUnits] which returns an iterable.
41+
This avoids leaking that the underlying string has efficient UTF-16
42+
code unit access in the API, and allows the same interface to be
43+
just as efficiently implemented on top of UTF-8.
44+
45+
## 0.4.0
46+
47+
* Added an extension method on `String` to allow easy access to the `Characters`
48+
of the string:
49+
50+
```dart
51+
print('The first character is: ' + myString.characters.first);
52+
```
53+
54+
* Updated Dart SDK dependency to Dart 2.6.0
55+
56+
## 0.3.1
57+
58+
* Added small example in `example/main.dart`
59+
* Enabled pedantic lints and updated code to resolve issues.
60+
61+
## 0.3.0
62+
63+
* Updated API which does not expose the underlying string indices.
64+
65+
## 0.1.0
66+
67+
* Initial release

pkgs/characters/LICENSE

Lines changed: 27 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,27 @@
1+
Copyright 2019, the Dart project authors.
2+
3+
Redistribution and use in source and binary forms, with or without
4+
modification, are permitted provided that the following conditions are
5+
met:
6+
7+
* Redistributions of source code must retain the above copyright
8+
notice, this list of conditions and the following disclaimer.
9+
* Redistributions in binary form must reproduce the above
10+
copyright notice, this list of conditions and the following
11+
disclaimer in the documentation and/or other materials provided
12+
with the distribution.
13+
* Neither the name of Google LLC nor the names of its
14+
contributors may be used to endorse or promote products derived
15+
from this software without specific prior written permission.
16+
17+
THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
18+
"AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
19+
LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
20+
A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
21+
OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
22+
SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
23+
LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
24+
DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
25+
THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
26+
(INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
27+
OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.

pkgs/characters/README.md

Lines changed: 133 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,133 @@
1+
[![Build Status](https://github.com/dart-lang/core/actions/workflows/characters.yaml/badge.svg)](https://github.com/dart-lang/core/actions/workflows/characters.yaml)
2+
[![pub package](https://img.shields.io/pub/v/characters.svg)](https://pub.dev/packages/characters)
3+
[![package publisher](https://img.shields.io/pub/publisher/characters.svg)](https://pub.dev/packages/characters/publisher)
4+
5+
[`Characters`][Characters] are strings viewed as
6+
sequences of **user-perceived character**s,
7+
also known as [Unicode (extended) grapheme clusters][Grapheme Clusters].
8+
9+
The [`Characters`][Characters] class allows access to
10+
the individual characters of a string,
11+
and a way to navigate back and forth between them
12+
using a [`CharacterRange`][CharacterRange].
13+
14+
## Unicode characters and representations
15+
16+
There is no such thing as plain text.
17+
18+
Computers only know numbers,
19+
so any "text" on a computer is represented by numbers,
20+
which are again stored as bytes in memory.
21+
22+
The meaning of those bytes is provided by layers of interpretation,
23+
building up to the *glyph*s that the computer displays on the screen.
24+
25+
| Abstraction | Dart Type | Usage | Example |
26+
| --------------------- | ------------------------------------------------------------ | ------------------------------------------------------------ | ------------------------------------------------------------ |
27+
| Bytes | [`ByteBuffer`][ByteBuffer],<br />[`Uint8List`][Uint8List] | Physical layout: Memory or network communication. | `file.readAsBytesSync()` |
28+
| [Code units][] | [`Uint8List`][Uint8List] (UTF&#x2011;8)<br />[`Uint16List`][Uint16List], [`String`][String] (UTF&#x2011;16) | Standard formats for<br /> encoding code points in memory.<br />Stored in memory using one (UTF&#x2011;8) or more (UTF&#x2011;16) bytes. One or more code units encode a code point. | `string.codeUnits`<br />`string.codeUnitAt(index)`<br />`utf8.encode(string)` |
29+
| [Code points][] | [`Runes`][Runes] | The Unicode unit of meaning. | `string.runes` |
30+
| [Grapheme Clusters][] | [`Characters`][Characters] | Human perceived character. One or more code points. | `string.characters` |
31+
| [Glyphs][] | | Visual rendering of grapheme clusters. | `print(string)` |
32+
33+
A Dart `String` is a sequence of UTF-16 code units,
34+
just like strings in JavaScript and Java.
35+
The runtime system decides on the underlying physical representation.
36+
37+
That makes plain strings inadequate
38+
when needing to manipulate the text that a user is viewing, or entering,
39+
because string operations do not work at the grapheme cluster level.
40+
41+
For example, to abbreviate a text to, say, the first 15 characters or glyphs,
42+
a string like "A 🇬🇧 text in English"
43+
should abbreviate to "A 🇬🇧 text in Eng&mldr;" when counting characters,
44+
but will become "A 🇬🇧 text in &mldr;"
45+
if counting code units using [`String`][String] operations.
46+
47+
Whenever you need to manipulate strings at the character level,
48+
you should be using the [`Characters`][Characters] type,
49+
not the methods of the [`String`][String] class.
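
As a rough illustration of the difference described above, the following sketch counts the example string at three levels of abstraction and abbreviates it both ways. The helper name `abbreviate` is hypothetical and not part of the package, and the sketch assumes `package:characters` is available as a dependency.

```dart
import 'package:characters/characters.dart';

/// Hypothetical helper (not part of the package): keeps at most
/// [maxChars] grapheme clusters and appends an ellipsis when truncating.
String abbreviate(String text, int maxChars) {
  final chars = text.characters;
  if (chars.length <= maxChars) return text;
  return '${chars.take(maxChars).string}…';
}

void main() {
  const text = 'A 🇬🇧 text in English';
  print(text.length); // 22 UTF-16 code units
  print(text.runes.length); // 20 code points
  print(text.characters.length); // 19 grapheme clusters
  print(abbreviate(text, 15)); // A 🇬🇧 text in Eng…
  print('${text.substring(0, 15)}…'); // A 🇬🇧 text in …
}
```

The flag is one grapheme cluster but four code units, which is why the `substring` version loses three visible characters.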
50+
51+
## The Characters class
52+
53+
The [`Characters`][Characters] class exposes a string
54+
as a sequence of grapheme clusters.
55+
All operations on [`Characters`][Characters] operate
56+
on entire grapheme clusters,
57+
so it removes the risk of splitting combined characters or emojis
58+
that is inherent in the code-unit based [`String`][String] operations.
59+
60+
You can get a [`Characters`][Characters] object for a string using either
61+
the constructor [`Characters(string)`][Characters constructor]
62+
or the extension getter `string.characters`.
63+
64+
At its core, the class is an [`Iterable<String>`][Iterable]
65+
where the element strings are single grapheme clusters.
66+
This allows sequential access to the individual grapheme clusters
67+
of the original string.
68+
69+
On top of that, there are operations mirroring the operations
70+
of [`String`][String] that are not index, code-unit or code-point based,
71+
like [`startsWith`][Characters.startsWith]
72+
or [`replaceAll`][Characters.replaceAll].
73+
There are some differences between these and the [`String`][String] operations.
74+
For example, the replace methods only accept characters as the pattern.
75+
Regular expressions are not grapheme cluster aware,
76+
so they cannot be used safely on a sequence of characters.
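
To make the difference concrete, here is a minimal sketch comparing `String.replaceAll` with `Characters.replaceAll` on a word that ends in a combined character. The sample word is an assumption chosen for illustration, and the sketch assumes `package:characters` is available as a dependency.

```dart
import 'package:characters/characters.dart';

void main() {
  // 'e' followed by U+0301 COMBINING ACUTE ACCENT forms one grapheme cluster.
  const word = 'cafe\u0301';

  // String.replaceAll matches code units, so it replaces the base letter
  // and leaves the accent attached to the replacement.
  final viaString = word.replaceAll('e', 'E');

  // Characters.replaceAll only matches at grapheme cluster boundaries,
  // so the lone 'e' pattern is not expected to match inside the cluster.
  final viaCharacters =
      word.characters.replaceAll('e'.characters, 'E'.characters);

  print(viaString == 'cafE\u0301');
  print(viaCharacters.string == word);
}
```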
77+
78+
Grapheme clusters have varying length in the underlying representation,
79+
so operations on a [`Characters`][Characters] sequence cannot be index based.
80+
Instead, the [`CharacterRange`][CharacterRange] *iterator*
81+
provided by [`Characters.iterator`][Characters.iterator]
82+
has been greatly enhanced.
83+
It can move both forwards and backwards,
84+
and it can span a *range* of grapheme clusters.
85+
Most operations that can be performed on a full [`Characters`][Characters]
86+
can also be performed on the grapheme clusters
87+
in the range of a [`CharacterRange`][CharacterRange].
88+
The range can be contracted, expanded or moved in various ways,
89+
not restricted to using [`moveNext`][CharacterRange.moveNext],
90+
to move to the next grapheme cluster.
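
As a small sketch of backwards movement, the following walks a string from the end one grapheme cluster at a time. The function name `reverseGraphemes` is hypothetical; the sketch relies on the `iteratorAtEnd` getter and the `moveBack` method, and assumes `package:characters` is available as a dependency.

```dart
import 'package:characters/characters.dart';

/// Hypothetical helper: reverses [text] by grapheme cluster,
/// keeping each cluster's code units in their original order.
String reverseGraphemes(String text) {
  final buffer = StringBuffer();
  // Start with an empty range positioned after the last character.
  final range = text.characters.iteratorAtEnd;
  // moveBack() selects the previous grapheme cluster, if there is one.
  while (range.moveBack()) {
    buffer.write(range.current);
  }
  return buffer.toString();
}

void main() {
  print(reverseGraphemes('A🇬🇧b'));
}
```

Reversing by code unit instead would break the surrogate pairs that make up the flag.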
91+
92+
Example:
93+
94+
```dart
95+
// Using String indices.
96+
String? firstTagString(String source) {
97+
var start = source.indexOf('<') + 1;
98+
if (start > 0) {
99+
var end = source.indexOf('>', start);
100+
if (end >= 0) {
101+
return source.substring(start, end);
102+
}
103+
}
104+
return null;
105+
}
106+
107+
// Using CharacterRange operations.
108+
Characters? firstTagCharacters(Characters source) {
109+
var range = source.findFirst('<'.characters);
110+
if (range != null && range.moveUntil('>'.characters)) {
111+
return range.currentCharacters;
112+
}
113+
return null;
114+
}
115+
```
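
A possible way to call the two functions above is sketched below. The sample input is an assumption made for illustration, and the function bodies are repeated only so that the snippet is self-contained.

```dart
import 'package:characters/characters.dart';

// Same logic as the String-index version above.
String? firstTagString(String source) {
  var start = source.indexOf('<') + 1;
  if (start > 0) {
    var end = source.indexOf('>', start);
    if (end >= 0) return source.substring(start, end);
  }
  return null;
}

// Same logic as the CharacterRange version above.
Characters? firstTagCharacters(Characters source) {
  var range = source.findFirst('<'.characters);
  if (range != null && range.moveUntil('>'.characters)) {
    return range.currentCharacters;
  }
  return null;
}

void main() {
  const source = 'Hello <b>world</b>';
  print(firstTagString(source));
  print(firstTagCharacters(source.characters));
  // Both versions return null when there is no tag to find.
  print(firstTagCharacters('no tags here'.characters));
}
```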
116+
117+
[ByteBuffer]: https://api.dart.dev/dart-typed_data/ByteBuffer-class.html "ByteBuffer class"
118+
[CharacterRange.moveNext]: https://pub.dev/documentation/characters/latest/characters/CharacterRange/moveNext.html "CharacterRange.moveNext"
119+
[CharacterRange]: https://pub.dev/documentation/characters/latest/characters/CharacterRange-class.html "CharacterRange class"
120+
[Characters constructor]: https://pub.dev/documentation/characters/latest/characters/Characters/Characters.html "Characters constructor"
121+
[Characters.iterator]: https://pub.dev/documentation/characters/latest/characters/Characters/iterator.html "Characters.iterator"
122+
[Characters.replaceAll]: https://pub.dev/documentation/characters/latest/characters/Characters/replaceAll.html "Characters.replaceAll"
123+
[Characters.startsWith]: https://pub.dev/documentation/characters/latest/characters/Characters/startsWith.html "Characters.startsWith"
124+
[Characters]: https://pub.dev/documentation/characters/latest/characters/Characters-class.html "Characters class"
125+
[Code Points]: https://unicode.org/glossary/#code_point "Unicode Code Point"
126+
[Code Units]: https://unicode.org/glossary/#code_unit "Unicode Code Units"
127+
[Glyphs]: https://unicode.org/glossary/#glyph "Unicode Glyphs"
128+
[Grapheme Clusters]: https://unicode.org/reports/tr29/#Grapheme_Cluster_Boundaries "Unicode (Extended) Grapheme Cluster"
129+
[Iterable]: https://api.dart.dev/dart-core/Iterable-class.html "Iterable class"
130+
[Runes]: https://api.dart.dev/dart-core/Runes-class.html "Runes class"
131+
[String]: https://api.dart.dev/dart-core/String-class.html "String class"
132+
[Uint16List]: https://api.dart.dev/dart-typed_data/Uint16List-class.html "Uint16List class"
133+
[Uint8List]: https://api.dart.dev/dart-typed_data/Uint8List-class.html "Uint8List class"

pkgs/characters/analysis_options.yaml

Lines changed: 5 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,5 @@
1+
include: package:dart_flutter_team_lints/analysis_options.yaml
2+
3+
analyzer:
4+
errors:
5+
prefer_single_quotes: ignore
