Merge pull request #45 from shonfeder/develop
Release v1.0.0
Showing 14 changed files with 734 additions and 158 deletions.
@@ -0,0 +1,21 @@
```yaml
version: 2

jobs:
  build:
    docker:
      - image: swipl:stable

    steps:
      - run:
          # TODO Build custom image to improve build time
          name: Install Deps
          command: |
            apt update -y
            apt install git make -y
      - checkout

      - run:
          name: Run tests
          command: |
            make test
```
@@ -0,0 +1 @@
```
*~
```
@@ -0,0 +1,44 @@
```markdown
# Changelog

All notable changes to this project will be documented in this file.

The format is based on [Keep a Changelog][keep-a-change-log], and this project
adheres to [Semantic Versioning][semantic-versioning].

[keep-a-change-log]: https://keepachangelog.com/en/1.0.0/
[semantic-versioning]: https://semver.org/spec/v2.0.0.html

## [unreleased]

## [1.0.0]

### Added

- Support for numbers by [@Anniepoo](https://github.com/Anniepoo) #34
- Support for strings #37
- Code of Conduct #23

### Changed

- Spaces are now tagged with `space` instead of `spc` #41
- Tokenization of numbers and strings is enabled by default #40
- Options are now processed by a more conventional means #39
- The location for the pack's home is updated

## [0.1.2]

Prior to changelog.

## [0.1.1]

Prior to changelog.

## [0.1.0]

Prior to changelog.

[unreleased]: https://github.com/shonfeder/tokenize/compare/v1.0.0...HEAD
[1.0.0]: https://github.com/shonfeder/tokenize/compare/v0.1.2...v1.0.0
[0.1.2]: https://github.com/shonfeder/tokenize/compare/v0.1.1...v0.1.2
[0.1.1]: https://github.com/shonfeder/tokenize/compare/v0.1.0...v0.1.1
[0.1.0]: https://github.com/shonfeder/tokenize/releases/tag/v0.1.0
```
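The `spc` to `space` rename in 1.0.0 is the change most likely to break downstream pattern matches. An illustrative before/after, based on the query shown in the README synopsis (not taken from the changelog itself):

```prolog
% Illustrative only: the space/1 tag replacing spc/1 in v1.0.0.
% ?- tokenize(`Example Text.`, Tokens).
% v0.1.x: Tokens = [word(example), spc(' '), word(text), punct('.')]
% v1.0.0: Tokens = [word(example), space(' '), word(text), punct('.')]
```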
@@ -0,0 +1,19 @@
```makefile
.PHONY: all test clean

version := $(shell swipl -q -s pack -g 'version(V),writeln(V)' -t halt)
packfile = quickcheck-$(version).tgz

SWIPL := swipl

all: test

version:
	echo $(version)

check: test

install:
	echo "(none)"

test:
	@$(SWIPL) -s test/test.pl -g 'run_tests,halt(0)' -t 'halt(1)'
```
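The `version` variable works by consulting the pack's metadata file and printing its `version/1` fact. A minimal sketch of what a `pack.pl` like this contains (hypothetical values, since that file is not shown in this diff; `name/1` and `version/1` are standard SWI-Prolog pack metadata facts):

```prolog
% Hypothetical pack.pl sketch: pack metadata is a set of Prolog facts,
% and version/1 is what the Makefile's shell call reads.
name(tokenize).
version('1.0.0').
% The Makefile then runs, in effect:
%   swipl -q -s pack -g 'version(V),writeln(V)' -t halt
% which writes the version string and halts.
```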
@@ -1,30 +1,37 @@
````diff
-# Synopsis
+# `pack(tokenize) :-`
 
 A modest tokenization library for SWI-Prolog, seeking a balance between
 simplicity and flexibility.
 
+[![CircleCI](https://circleci.com/gh/shonfeder/tokenize.svg?style=svg)](https://circleci.com/gh/shonfeder/tokenize)
+
+## Synopsis
+
 ```prolog
 ?- tokenize(`\tExample Text.`, Tokens).
-Tokens = [cntrl('\t'), word(example), spc(' '), spc(' '), word(text), punct('.')]
+Tokens = [cntrl('\t'), word(example), space(' '), space(' '), word(text), punct('.')]
 ?- tokenize(`\tExample Text.`, Tokens, [cntrl(false), pack(true), cased(true)]).
-Tokens = [word('Example', 1), spc(' ', 2), word('Text', 1), punct('.', 1)]
+Tokens = [word('Example', 1), space(' ', 2), word('Text', 1), punct('.', 1)]
 ?- tokenize(`\tExample Text.`, Tokens), untokenize(Tokens, Text), format('~s~n', [Text]).
 example text.
-Tokens = [cntrl('\t'), word(example), spc(' '), spc(' '), word(text), punct('.')],
-Text = [9, 101, 120, 97, 109, 112, 108, 101, 32|...]
+Tokens = [cntrl('\t'), word(example), space(' '), space(' '), word(text), punct('.')],
+Text = [9, 101, 120, 97, 109, 112, 108, 101, 32|...]
 ```
 
-# Description
+## Description
 
 Module `tokenize` aims to provide a straightforward tool for tokenizing text into a simple format. It is the result of a learning exercise, and it is far from perfect. If there is sufficient interest from myself or anyone else, I'll try to improve it.
 
-It is packaged as an SWI-Prolog pack, available [here](http://www.swi-prolog.org/pack/list?p=tokenize). Install it into your SWI-Prolog system with the query
+It is packaged as an SWI-Prolog pack, available [here](http://www.swi-prolog.org/pack/list?p=tokenize). Install it into your SWI-Prolog system with the query
 
 ```prolog
 ?- pack_install(tokenize).
 ```
 
 Please [visit the wiki](https://github.com/aBathologist/tokenize/wiki/tokenize.pl-options-and-examples) for more detailed instructions and examples, including a full list of options supported.
 
-# Contributing
+## Contributing
 
 See [CONTRIBUTING.md](./CONTRIBUTING.md).
````
@@ -0,0 +1,4 @@
```markdown
WIP code towards tokenization of comments.

It was extracted here because it's not ready for release, but we want to keep it
available for the author to resume work on it.
```
@@ -0,0 +1,115 @@
```prolog
:- module(comment,
          [comment//2,
           comment_rec//2,
           comment_token//3,
           comment_token_rec//3]).

/** <module> Tokenizing comments

This module defines matchers for comments used by the tokenize module. (Note
that we use "matcher" as a name for DCG rules that match parts of the codes
list.)

@author Stefan Israelsson Tampe
@license LGPL v2 or later

Interface note:

A Start or End matcher is a DCG rule that is either evaluated with no extra
argument (--> call(StartMatcher)), in which case it simply matches its token,
or with one extra argument that produces the codes matched by the matcher,
e.g. --> call(StartMatcher, MatchedCodes). The matchers match the start and
end codes of a comment. The type 2matcher denotes such rules: "2" because
they support these two kinds of argument lists.

For examples, see tests/test_comments.pl.

The matcher predicates exported and defined are:

comment(+Start:2matcher, +End:2matcher)
  - anonymously match a non-recursive comment
comment_rec(+Start:2matcher, +End:2matcher)
  - anonymously match a recursive comment
comment_token(+Start:2matcher, +End:2matcher, -Matched:list(codes))
  - match a non-recursive comment, outputting the matched sequence, used
    for building a resulting comment token
comment_token_rec(+Start:2matcher, +End:2matcher, -Matched:list(codes))
  - match a recursive comment, outputting the matched sequence, used
    for building a resulting comment token
*/

%% comment(+Start:2matcher, +End:2matcher)
%  Non-recursive, non-tokenizing matcher.

comment_body(E) --> call(E), !.
comment_body(E) --> [_], comment_body(E).

comment(S,E) -->
    call(S),
    comment_body(E).

%% comment_token(+Start:2matcher, +End:2matcher, -Matched:list(codes))
%  Non-recursive, tokenizing matcher.

comment_body_token(E,Text) -->
    call(E,HE), !,
    {append(HE,[],Text)}.

comment_body_token(E,[X|L]) -->
    [X],
    comment_body_token(E,L).

comment_token(S,E,Text) -->
    call(S,HS),
    {append(HS,T,Text)},
    comment_body_token(E,T).

%% comment_token_rec(+Start:2matcher, +End:2matcher, -Matched:list(codes))
%  Recursive, tokenizing matcher.

% Used as the initial continuation; it just tidies up the matched result
% by ending the list with [].
comment_body_rec_start(_,_,[]).

comment_body_token_rec(_,E,Cont,Text) -->
    call(E,HE), !,
    {append(HE,T,Text)},
    call(Cont,T).

comment_body_token_rec(S,E,Cont,Text) -->
    call(S,HS), !,
    {append(HS,T,Text)},
    comment_body_token_rec(S,E,comment_body_token_rec(S,E,Cont),T).

comment_body_token_rec(S,E,Cont,[X|L]) -->
    [X],
    comment_body_token_rec(S,E,Cont,L).

comment_token_rec(S,E,Text) -->
    call(S,HS),
    {append(HS,T,Text)},
    comment_body_token_rec(S,E,comment_body_rec_start,T).

%% comment_rec(+Start:2matcher, +End:2matcher)
%  Recursive, non-tokenizing matcher.

comment_body_rec(_,E) -->
    call(E), !.

comment_body_rec(S,E) -->
    call(S), !,
    comment_body_rec(S,E),
    comment_body_rec(S,E).

comment_body_rec(S,E) -->
    [_],
    comment_body_rec(S,E).

comment_rec(S,E) -->
    call(S),
    comment_body_rec(S,E).
```
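To make the 2matcher convention concrete, here is a hedged usage sketch for %-to-newline line comments (not part of the commit; the real examples live in tests/test_comments.pl). Each matcher is defined twice: once plain, and once producing the codes it consumed. Backquoted strings are code lists in SWI-Prolog, which matches what `append/3` in the module expects:

```prolog
% Hypothetical matchers, for illustration only.
line_start --> "%".
line_start(`%`) --> "%".
line_end --> "\n".
line_end(`\n`) --> "\n".

% Possible queries (untested sketch):
% ?- phrase(comment(line_start, line_end), `% hi\nrest`, Rest).
% ?- phrase(comment_token(line_start, line_end, Text), `% hi\n`).
```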