Commit 61b23b0

Fix typos and readability in examples.

mathewsanders committed Jan 2, 2017
1 parent 8c48f8a commit 61b23b0

Showing 3 changed files with 19 additions and 19 deletions.
25 changes: 11 additions & 14 deletions Documentation/3. Expressive matching.md
@@ -1,32 +1,28 @@
 # Example: expressive matching
 
 tokens(from:) -> [Token]
 
-The results returned by `tokens(from:)`returns an array tuples with the signature `(tokenizer: TokenType, text: String, range: Range<String.Index>)`
+The results returned by `tokens(from:)`returns an array of `Token` where `Token` is a typealias of the tuple `(tokenizer: TokenType, text: String, range: Range<String.Index>)`
 
-Which requires either type casting (using `as?`) type checking or type checking (using `is`) for the `tokenizer` element to be useful:
+To make use of the `tokenizer` element, you need to either use type casting (using `as?`) or type checking (using `is`) for the `tokenizer` element to be useful.
+
+Maybe we want to filter out only tokens that are numbers:
 
 ````Swift
 import Mustard
 
 let messy = "123Hello world&^45.67"
 let tokens = messy.tokens(from: .decimalDigits, .letters)
 // tokens.count -> 5
 
-// using type checking
-if tokens[0].tokenizer is EmojiToken {
-    print("found emoji token")
-}
-
-// using type casting
-if let _ = tokens[0].tokenizer as? NumberToken {
-    print("found number token")
-}
+let numbers = tokens.filter({ $0.tokenizer is NumberToken })
+// numbers.count -> 0
 
 ````
 
-This can lead to bugs in your logic-- in the example above neither of the print statements will be executed since the tokenizer used was actually the character sets `.decimalDigits`, and `.letters`.
+This can lead to bugs in your logic-- in the example above `numberTokens` will be empty because the tokenizers used were the character sets `.decimalDigits`, and `.letters`, so the filter won't match any of the tokens.
 
 This may seem like an obvious error, but it's the type of unexpected bug that can slip in when we're using loosely typed results.
 
-Mustard can return a strongly typed set of matches if a single `TokenType` is used.
+Thankfully, Mustard can return a strongly typed set of matches if a single `TokenType` is used:

@@ -35,6 +31,7 @@
 ````Swift
 import Mustard
 
 let messy = "123Hello world&^45.67"
 
 // call `tokens()` method on string to get matching tokens from string
 let numberTokens: [NumberToken.Match] = messy.tokens()
+// numberTokens.count -> 2
 
 ````
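The loosely typed pitfall this commit documents can be reproduced without Mustard at all. The following is a self-contained sketch: `TokenType`, `NumberToken`, and the `Token` tuple here are minimal stand-ins with assumed shapes, not the library's actual declarations.

```swift
import Foundation

// Minimal stand-ins for Mustard's types (assumed shapes, not the real library).
protocol TokenType {}
struct NumberToken: TokenType {}
extension CharacterSet: TokenType {}

typealias Token = (tokenizer: TokenType, text: String, range: Range<String.Index>)

let messy = "123Hello"
// Pretend these came back from `tokens(from: .decimalDigits, .letters)`:
let tokens: [Token] = [
    (CharacterSet.decimalDigits, "123",
     messy.startIndex..<messy.index(messy.startIndex, offsetBy: 3)),
    (CharacterSet.letters, "Hello",
     messy.index(messy.startIndex, offsetBy: 3)..<messy.endIndex),
]

// The filter compiles happily but never matches: every tokenizer
// in this array is a CharacterSet, not a NumberToken.
let numbers = tokens.filter({ $0.tokenizer is NumberToken })
print(numbers.count) // 0
```

Because `tokenizer` is typed as the protocol, the compiler cannot warn that the filter is unsatisfiable; that is exactly the kind of bug the changed prose above describes.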

2 changes: 2 additions & 0 deletions Mustard/MustardTests/CharacterSetTokenTests.swift
@@ -24,6 +24,8 @@ class CharacterSetTokenTests: XCTestCase {
 
         let tokens = "123Hello world&^45.67".tokens(from: .decimalDigits, .letters)
 
+        let numbers = tokens.filter({ $0.tokenizer is NumberToken })
+
         XCTAssert(tokens.count == 5, "Unexpected number of characterset tokens [\(tokens.count)]")
 
         XCTAssert(tokens[0].tokenizer == CharacterSet.decimalDigits)
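The test above filters for `NumberToken` tokenizers. As a rough illustration of what a number tokenizer matches (runs of digits with an interior decimal point, so `45.67` stays one token), here is a hypothetical stand-alone scanner; it is a sketch of the matching idea, not Mustard's `TokenType` implementation.

```swift
import Foundation

// Hypothetical stand-alone number scanner: collects runs of digits,
// allowing a "." only once a run has started so "45.67" is one token.
func numberTokens(in text: String) -> [String] {
    var results: [String] = []
    var current = ""
    for character in text {
        if character.isNumber || (character == "." && !current.isEmpty) {
            current.append(character)
        } else {
            // Any other character ends the current run.
            if !current.isEmpty { results.append(current) }
            current = ""
        }
    }
    if !current.isEmpty { results.append(current) }
    return results
}

print(numberTokens(in: "123Hello world&^45.67")) // ["123", "45.67"]
```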
11 changes: 6 additions & 5 deletions README.md
@@ -2,7 +2,7 @@
 
 Mustard is a Swift library for tokenizing strings when splitting by whitespace doesn't cut it.
 
-## Quick start
+## Quick start using character sets
 
 Mustard extends `String` with the method `tokens(from: CharacterSet...)` which allows you to pass in one
 or more character sets to use criteria to find tokens.
@@ -32,17 +32,18 @@ let tokens = messy.tokens(from: .decimalDigits, .letters)
 // tokens[4].range -> Range<String.Index>(19..<21)
 ````
 
+## Expressive use with custom tokenizers
+
 Creating by creating objects that implement the `TokenType` protocol we can create
-more advanced tokenizers. Here's some usage of a `DateToken` type that matches tokens
-with the a valid `MM/dd/yy` format, and also exposes a `date` property to access the
+more advanced tokenizers. Here's some usage of a `DateToken` type (see example for implementation)
+that matches tokens with the a valid `MM/dd/yy` format, and also exposes a `date` property to access the
 corresponding `Date` object.
 
 ````Swift
 import Mustard
 
 let messyInput = "Serial: #YF 1942-b 12/01/27 (Scanned) 12/03/27 (Arrived) ref: 99/99/99"
 
-let tokens:[DateToken.Token] = messyInput.tokens()
+let tokens: [DateToken.Token] = messyInput.tokens()
 // tokens.count -> 2
 // ('99/99/99' is *not* matched by `DateToken`)
 //
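The `99/99/99` rejection noted in the diff comes from `DateToken` validating candidates as real calendar dates. Below is a hedged sketch of that idea using `DateFormatter`; `validDates` is a hypothetical helper, not Mustard's implementation (see the repo's example for the real `DateToken`).

```swift
import Foundation

// Validate MM/dd/yy candidates with a strict DateFormatter, so an
// impossible date like "99/99/99" is rejected while "12/01/27" passes.
let formatter = DateFormatter()
formatter.dateFormat = "MM/dd/yy"
formatter.isLenient = false

// Hypothetical helper: split on spaces and keep the 8-character
// candidates that parse as genuine dates.
func validDates(in text: String) -> [String] {
    return text.split(separator: " ")
        .map(String.init)
        .filter { $0.count == 8 && formatter.date(from: $0) != nil }
}

let messyInput = "Serial: #YF 1942-b 12/01/27 (Scanned) 12/03/27 (Arrived) ref: 99/99/99"
print(validDates(in: messyInput)) // ["12/01/27", "12/03/27"]
```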

0 comments on commit 61b23b0
