Merge pull request #131 from dart-lang/merge-characters-package
Merge `package:characters`
Showing 46 changed files with 22,904 additions and 0 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,72 @@ | ||
name: package:characters | ||
|
||
on: | ||
# Run CI on pushes to the main branch, and on PRs against main. | ||
push: | ||
branches: [ main ] | ||
paths: | ||
- '.github/workflows/characters.yaml' | ||
- 'pkgs/characters/**' | ||
pull_request: | ||
branches: [ main ] | ||
paths: | ||
- '.github/workflows/characters.yaml' | ||
- 'pkgs/characters/**' | ||
schedule: | ||
- cron: "0 0 * * 0" | ||
env: | ||
PUB_ENVIRONMENT: bot.github | ||
|
||
defaults: | ||
run: | ||
working-directory: pkgs/characters/ | ||
|
||
jobs: | ||
# Check code formatting and static analysis on a single OS (linux) | ||
# against dev, stable, and 3.4 (the package's lower bound). | ||
analyze: | ||
runs-on: ubuntu-latest | ||
strategy: | ||
fail-fast: false | ||
matrix: | ||
sdk: [dev, stable, 3.4] | ||
steps: | ||
- uses: actions/checkout@d632683dd7b4114ad314bca15554477dd762a938 | ||
- uses: dart-lang/setup-dart@0a8a0fc875eb934c15d08629302413c671d3f672 | ||
with: | ||
sdk: ${{ matrix.sdk }} | ||
- id: install | ||
name: Install dependencies | ||
run: dart pub get | ||
- name: Check formatting | ||
run: dart format --output=none --set-exit-if-changed . | ||
if: matrix.sdk == 'dev' && steps.install.outcome == 'success' | ||
- name: Analyze code | ||
run: dart analyze --fatal-infos | ||
if: always() && steps.install.outcome == 'success' | ||
|
||
# Run tests on a matrix consisting of two dimensions: | ||
# 1. OS: ubuntu-latest | ||
# 2. Release channel: dev, stable, and 3.4 (the package's lower bound) | ||
test: | ||
needs: analyze | ||
runs-on: ${{ matrix.os }} | ||
strategy: | ||
fail-fast: false | ||
matrix: | ||
os: [ubuntu-latest] | ||
sdk: [dev, stable, 3.4] | ||
steps: | ||
- uses: actions/checkout@d632683dd7b4114ad314bca15554477dd762a938 | ||
- uses: dart-lang/setup-dart@0a8a0fc875eb934c15d08629302413c671d3f672 | ||
with: | ||
sdk: ${{ matrix.sdk }} | ||
- id: install | ||
name: Install dependencies | ||
run: dart pub get | ||
- name: Run VM tests | ||
run: dart test --platform vm | ||
if: always() && steps.install.outcome == 'success' | ||
- name: Run Chrome tests | ||
run: dart test --platform chrome | ||
if: always() && steps.install.outcome == 'success' |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,4 @@ | ||
.dart_tool/ | ||
.packages | ||
pubspec.lock | ||
doc/api/ |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,6 @@ | ||
# Below is a list of people and organizations that have contributed | ||
# to the Dart project. Names should be added to the list like so: | ||
# | ||
# Name/Organization <email address> | ||
|
||
Google LLC |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,67 @@ | ||
## 1.3.1 | ||
|
||
* Fixed README rendering on pub.dev and API docs. | ||
* Require Dart `^3.4.0`. | ||
* Move to `dart-lang/core` monorepo. | ||
|
||
## 1.3.0 | ||
|
||
* Updated to use Unicode 15.0.0. | ||
|
||
## 1.2.1 | ||
|
||
* Update the value of the pubspec `repository` field. | ||
|
||
## 1.2.0 | ||
|
||
* Fix `Characters.where` which unnecessarily did the iteration and test twice. | ||
* Adds `Characters.empty` constant and makes `Characters("")` return it. | ||
* Changes the argument type of `Characters.contains` to (covariant) `String`. | ||
The implementation still accepts `Object?`, so it can be cast to | ||
`Iterable<Object?>`, but you get warned if you try to call directly with a | ||
non-`String`. | ||
|
||
## 1.1.0 | ||
|
||
* Stable release for null safety. | ||
* Added `stringBeforeLength` and `stringAfterLength` to `CharacterRange`. | ||
* Added `CharacterRange.at` constructor. | ||
* Added `getRange(start, end)` and `characterAt(pos)` to `Characters` | ||
as alternative to `.take(end).skip(start)` and `getRange(pos, pos + 1)`. | ||
* Change some positional parameter names from `other` to `characters`. | ||
|
||
## 1.0.0 | ||
|
||
* Core APIs deemed stable; package version set to 1.0.0. | ||
* Added `split` methods on `Characters` and `CharacterRange`. | ||
|
||
## 0.5.0 | ||
|
||
* Change [codeUnits] getter to [utf16CodeUnits] which returns an iterable. | ||
This avoids leaking that the underlying string has efficient UTF-16 | ||
code unit access in the API, and allows the same interface to be | ||
just as efficiently implemented on top of UTF-8. | ||
|
||
## 0.4.0 | ||
|
||
* Added an extension method on `String` to allow easy access to the `Characters` | ||
of the string: | ||
|
||
```dart | ||
print('The first character is: ' + myString.characters.first); | ||
``` | ||
|
||
* Updated Dart SDK dependency to Dart 2.6.0 | ||
|
||
## 0.3.1 | ||
|
||
* Added small example in `example/main.dart` | ||
* Enabled pedantic lints and updated code to resolve issues. | ||
|
||
## 0.3.0 | ||
|
||
* Updated API which does not expose the underlying string indices. | ||
|
||
## 0.1.0 | ||
|
||
* Initial release |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,27 @@ | ||
Copyright 2019, the Dart project authors. | ||
|
||
Redistribution and use in source and binary forms, with or without | ||
modification, are permitted provided that the following conditions are | ||
met: | ||
|
||
* Redistributions of source code must retain the above copyright | ||
notice, this list of conditions and the following disclaimer. | ||
* Redistributions in binary form must reproduce the above | ||
copyright notice, this list of conditions and the following | ||
disclaimer in the documentation and/or other materials provided | ||
with the distribution. | ||
* Neither the name of Google LLC nor the names of its | ||
contributors may be used to endorse or promote products derived | ||
from this software without specific prior written permission. | ||
|
||
THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS | ||
"AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT | ||
LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR | ||
A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT | ||
OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, | ||
SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT | ||
LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, | ||
DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY | ||
THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT | ||
(INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE | ||
OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,133 @@ | ||
[![Build Status](https://github.com/dart-lang/core/actions/workflows/characters.yaml/badge.svg)](https://github.com/dart-lang/core/actions/workflows/characters.yaml) | ||
[![pub package](https://img.shields.io/pub/v/characters.svg)](https://pub.dev/packages/characters) | ||
[![package publisher](https://img.shields.io/pub/publisher/characters.svg)](https://pub.dev/packages/characters/publisher) | ||
|
||
[`Characters`][Characters] are strings viewed as | ||
sequences of **user-perceived character**s, | ||
also known as [Unicode (extended) grapheme clusters][Grapheme Clusters]. | ||
|
||
The [`Characters`][Characters] class allows access to | ||
the individual characters of a string, | ||
and a way to navigate back and forth between them | ||
using a [`CharacterRange`][CharacterRange]. | ||
|
||
## Unicode characters and representations | ||
|
||
There is no such thing as plain text. | ||
|
||
Computers only know numbers, | ||
so any "text" on a computer is represented by numbers, | ||
which are again stored as bytes in memory. | ||
|
||
The meaning of those bytes is provided by layers of interpretation, | ||
building up to the *glyph*s that the computer displays on the screen. | ||
|
||
| Abstraction | Dart Type | Usage | Example | | ||
| --------------------- | ------------------------------------------------------------ | ------------------------------------------------------------ | ------------------------------------------------------------ | | ||
| Bytes | [`ByteBuffer`][ByteBuffer],<br />[`Uint8List`][Uint8List] | Physical layout: Memory or network communication. | `file.readAsBytesSync()` | | ||
| [Code units][] | [`Uint8List`][Uint8List] (UTF‑8)<br />[`Uint16List`][Uint16List], [`String`][String] (UTF‑16) | Standard formats for<br /> encoding code points in memory.<br />Stored in memory using one (UTF‑8) or two (UTF‑16) bytes. One or more code units encode a code point. | `string.codeUnits`<br />`string.codeUnitAt(index)`<br />`utf8.encode(string)` | | ||
| [Code points][] | [`Runes`][Runes] | The Unicode unit of meaning. | `string.runes` | | ||
| [Grapheme Clusters][] | [`Characters`][Characters] | Human perceived character. One or more code points. | `string.characters` | | ||
| [Glyphs][] | | Visual rendering of grapheme clusters. | `print(string)` | | ||
|
||
A Dart `String` is a sequence of UTF-16 code units, | ||
just like strings in JavaScript and Java. | ||
The runtime system decides on the underlying physical representation. | ||
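
The layers in the table above can be made concrete by counting the units of one string at each level. This is a minimal sketch, assuming `package:characters` is available as a dependency; the helper name `unitCounts` is hypothetical and not part of the package.

```dart
import 'dart:convert';

import 'package:characters/characters.dart';

/// Counts the units of [text] at each abstraction level.
/// (Hypothetical helper, for illustration only.)
Map<String, int> unitCounts(String text) => {
      'utf8Bytes': utf8.encode(text).length,
      'utf16CodeUnits': text.codeUnits.length,
      'codePoints': text.runes.length,
      'graphemeClusters': text.characters.length,
    };

void main() {
  // A flag emoji is two regional-indicator code points (U+1F1EC, U+1F1E7).
  const flag = '\u{1F1EC}\u{1F1E7}';
  // Prints: {utf8Bytes: 8, utf16CodeUnits: 4, codePoints: 2, graphemeClusters: 1}
  print(unitCounts(flag));
}
```

A single user-perceived character here occupies four UTF-16 code units, which is why code-unit based indexing can cut it apart.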
|
||
That makes plain strings inadequate | ||
when needing to manipulate the text that a user is viewing, or entering, | ||
because string operations are not working at the grapheme cluster level. | ||
|
||
For example, to abbreviate a text to, say, the first 15 characters or glyphs, | ||
a string like "A 🇬🇧 text in English" | ||
should abbreviate to "A 🇬🇧 text in Eng…" when counting characters, | ||
but will become "A 🇬🇧 text in …" | ||
if counting code units using [`String`][String] operations. | ||
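
The abbreviation example can be sketched as follows. The function names `abbreviate` and `abbreviateCodeUnits` are hypothetical helpers, not part of the package; only `characters`, `take`, and `length` come from `package:characters`.

```dart
import 'package:characters/characters.dart';

/// Keeps at most [max] grapheme clusters, appending '…' if text was cut.
String abbreviate(String text, int max) {
  final chars = text.characters;
  if (chars.length <= max) return text;
  // Characters.toString() returns the underlying string.
  return '${chars.take(max)}…';
}

/// The same idea with code-unit based String operations, for comparison.
String abbreviateCodeUnits(String text, int max) {
  if (text.length <= max) return text;
  return '${text.substring(0, max)}…';
}

void main() {
  const text = 'A 🇬🇧 text in English';
  print(abbreviate(text, 15)); // A 🇬🇧 text in Eng…
  print(abbreviateCodeUnits(text, 15)); // A 🇬🇧 text in …
}
```

The code-unit version loses three letters because the flag alone takes up four of the 15 code units. With a smaller limit, `substring` could also cut through the middle of the flag.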
|
||
Whenever you need to manipulate strings at the character level, | ||
you should be using the [`Characters`][Characters] type, | ||
not the methods of the [`String`][String] class. | ||
|
||
## The Characters class | ||
|
||
The [`Characters`][Characters] class exposes a string | ||
as a sequence of grapheme clusters. | ||
All operations on [`Characters`][Characters] operate | ||
on entire grapheme clusters, | ||
so it removes the risk of splitting combined characters or emojis | ||
that is inherent in the code-unit based [`String`][String] operations. | ||
|
||
You can get a [`Characters`][Characters] object for a string using either | ||
the constructor [`Characters(string)`][Characters constructor] | ||
or the extension getter `string.characters`. | ||
|
||
At its core, the class is an [`Iterable<String>`][Iterable] | ||
where the element strings are single grapheme clusters. | ||
This allows sequential access to the individual grapheme clusters | ||
of the original string. | ||
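
Because `Characters` is an `Iterable<String>`, the usual iteration tools apply directly. The following is a small sketch, with `graphemes` as a hypothetical helper name:

```dart
import 'package:characters/characters.dart';

/// Returns the grapheme clusters of [text] as a list of strings.
/// (Hypothetical helper, for illustration only.)
List<String> graphemes(String text) => text.characters.toList();

void main() {
  // 'e' followed by U+0301 (combining acute accent) is one grapheme cluster,
  // and the two regional indicators form a single flag.
  const text = 'cafe\u0301 \u{1F1EC}\u{1F1E7}';
  for (final character in text.characters) {
    print('"$character" uses ${character.length} code unit(s)');
  }
  print(graphemes(text).length); // 6 grapheme clusters
  print(text.length); // 10 code units
}
```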
|
||
On top of that, there are operations mirroring the operations | ||
of [`String`][String] that are not index, code-unit or code-point based, | ||
like [`startsWith`][Characters.startsWith] | ||
or [`replaceAll`][Characters.replaceAll]. | ||
There are some differences between these and the [`String`][String] operations. | ||
For example, the replace methods only accept characters as patterns. | ||
Regular expressions are not grapheme cluster aware, | ||
so they cannot be used safely on a sequence of characters. | ||
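
To illustrate the difference, the sketch below compares `String.replaceAll` with `Characters.replaceAll` on a pattern that straddles two grapheme clusters. The wrapper name `replaceWhole` is hypothetical. The sketch assumes that `Characters.replaceAll` only matches at grapheme cluster boundaries, as the description above implies.

```dart
import 'package:characters/characters.dart';

/// Replaces [pattern] in [source] only where it lines up with whole
/// grapheme clusters. (Hypothetical wrapper around Characters.replaceAll.)
String replaceWhole(String source, String pattern, String replacement) =>
    source.characters
        .replaceAll(pattern.characters, replacement.characters)
        .string;

void main() {
  // Two flags: GB (U+1F1EC U+1F1E7) followed by ES (U+1F1EA U+1F1F8).
  const flags = '\u{1F1EC}\u{1F1E7}\u{1F1EA}\u{1F1F8}';
  // The code points B+E would form a flag too, but here they straddle
  // the boundary between the two clusters above.
  const straddling = '\u{1F1E7}\u{1F1EA}';

  // String.replaceAll matches inside the clusters and breaks both flags.
  print(flags.replaceAll(straddling, '-') == flags); // false

  // The Characters version is expected to leave the text untouched.
  print(replaceWhole(flags, straddling, '-') == flags);

  // A pattern made of whole clusters is replaced as usual.
  print(replaceWhole('a-b-c', '-', '+')); // a+b+c
}
```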
|
||
Grapheme clusters have varying length in the underlying representation, | ||
so operations on a [`Characters`][Characters] sequence cannot be index based. | ||
Instead, the [`CharacterRange`][CharacterRange] *iterator* | ||
provided by [`Characters.iterator`][Characters.iterator] | ||
has been greatly enhanced. | ||
It can move both forwards and backwards, | ||
and it can span a *range* of grapheme clusters. | ||
Most operations that can be performed on a full [`Characters`][Characters] | ||
can also be performed on the grapheme clusters | ||
in the range of a [`CharacterRange`][CharacterRange]. | ||
The range can be contracted, expanded or moved in various ways, | ||
not restricted to using [`moveNext`][CharacterRange.moveNext] | ||
to move to the next grapheme cluster. | ||
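
As a small illustration of backward movement, the sketch below walks a `CharacterRange` from the end of a string toward the start, using `Characters.iteratorAtEnd` and `CharacterRange.moveBack`. The function name `reverseCharacters` is hypothetical.

```dart
import 'package:characters/characters.dart';

/// Reverses [text] one grapheme cluster at a time, so combined
/// characters and emojis stay intact. (Hypothetical helper.)
String reverseCharacters(String text) {
  final buffer = StringBuffer();
  // Start with an empty range positioned after the last character.
  final range = text.characters.iteratorAtEnd;
  // moveBack() selects the previous character, or returns false at the start.
  while (range.moveBack()) {
    buffer.write(range.current);
  }
  return buffer.toString();
}

void main() {
  // 'ab' followed by a flag made of two regional-indicator code points.
  const text = 'ab\u{1F1EC}\u{1F1E7}';
  print(reverseCharacters(text) == '\u{1F1EC}\u{1F1E7}ba'); // true
}
```

Reversing the same string by code units would reorder the surrogates that make up the flag and produce invalid text.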
|
||
Example: | ||
|
||
```dart | ||
// Using String indices. | ||
String? firstTagString(String source) { | ||
var start = source.indexOf('<') + 1; | ||
if (start > 0) { | ||
var end = source.indexOf('>', start); | ||
if (end >= 0) { | ||
return source.substring(start, end); | ||
} | ||
} | ||
return null; | ||
} | ||
// Using CharacterRange operations. | ||
Characters? firstTagCharacters(Characters source) { | ||
var range = source.findFirst('<'.characters); | ||
if (range != null && range.moveUntil('>'.characters)) { | ||
return range.currentCharacters; | ||
} | ||
return null; | ||
} | ||
``` | ||
|
||
[ByteBuffer]: https://api.dart.dev/dart-typed_data/ByteBuffer-class.html "ByteBuffer class" | ||
[CharacterRange.moveNext]: https://pub.dev/documentation/characters/latest/characters/CharacterRange/moveNext.html "CharacterRange.moveNext" | ||
[CharacterRange]: https://pub.dev/documentation/characters/latest/characters/CharacterRange-class.html "CharacterRange class" | ||
[Characters constructor]: https://pub.dev/documentation/characters/latest/characters/Characters/Characters.html "Characters constructor" | ||
[Characters.iterator]: https://pub.dev/documentation/characters/latest/characters/Characters/iterator.html "Characters.iterator" | ||
[Characters.replaceAll]: https://pub.dev/documentation/characters/latest/characters/Characters/replaceAll.html "Characters.replaceAll" | ||
[Characters.startsWith]: https://pub.dev/documentation/characters/latest/characters/Characters/startsWith.html "Characters.startsWith" | ||
[Characters]: https://pub.dev/documentation/characters/latest/characters/Characters-class.html "Characters class" | ||
[Code Points]: https://unicode.org/glossary/#code_point "Unicode Code Point" | ||
[Code Units]: https://unicode.org/glossary/#code_unit "Unicode Code Units" | ||
[Glyphs]: https://unicode.org/glossary/#glyph "Unicode Glyphs" | ||
[Grapheme Clusters]: https://unicode.org/reports/tr29/#Grapheme_Cluster_Boundaries "Unicode (Extended) Grapheme Cluster" | ||
[Iterable]: https://api.dart.dev/dart-core/Iterable-class.html "Iterable class" | ||
[Runes]: https://api.dart.dev/dart-core/Runes-class.html "Runes class" | ||
[String]: https://api.dart.dev/dart-core/String-class.html "String class" | ||
[Uint16List]: https://api.dart.dev/dart-typed_data/Uint16List-class.html "Uint16List class" | ||
[Uint8List]: https://api.dart.dev/dart-typed_data/Uint8List-class.html "Uint8List class" |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,5 @@ | ||
include: package:dart_flutter_team_lints/analysis_options.yaml | ||
|
||
analyzer: | ||
errors: | ||
prefer_single_quotes: ignore |