Merge pull request #45 from shonfeder/develop

Release v1.0.0

shonfeder committed Jun 23, 2019
2 parents 5fd98f7 + c45fb74 commit 0e647b6
Showing 14 changed files with 734 additions and 158 deletions.
21 changes: 21 additions & 0 deletions .circleci/config.yml
@@ -0,0 +1,21 @@
version: 2

jobs:
build:
docker:
- image: swipl:stable

steps:
- run:
# TODO Build custom image to improve build time
name: Install Deps
command: |
apt update -y
apt install git make -y
- checkout

- run:
name: Run tests
command: |
make test
1 change: 1 addition & 0 deletions .gitignore
@@ -0,0 +1 @@
*~
44 changes: 44 additions & 0 deletions CHANGELOG.md
@@ -0,0 +1,44 @@
# Changelog

All notable changes to this project will be documented in this file.

The format is based on [Keep a Changelog][keep-a-change-log], and this project
adheres to [Semantic Versioning][semantic-versioning].

[keep-a-change-log]: https://keepachangelog.com/en/1.0.0/
[semantic-versioning]: https://semver.org/spec/v2.0.0.html

## [unreleased]

## [1.0.0]

### Added

- Support for numbers by [@Anniepoo](https://github.com/Anniepoo) #34
- Support for strings #37
- Code of Conduct #23

### Changed

- Spaces are now tagged with `space` instead of `spc` #41
- Tokenization of numbers and strings is enabled by default #40
- Options are now processed by a more conventional means #39
- The location for the pack's home is updated

## [0.1.2]

Prior to changelog.

## [0.1.1]

Prior to changelog.

## [0.1.0]

Prior to changelog.

[unreleased]: https://github.com/shonfeder/tokenize/compare/v1.0.0...HEAD
[1.0.0]: https://github.com/shonfeder/tokenize/compare/v0.1.2...v1.0.0
[0.1.2]: https://github.com/shonfeder/tokenize/compare/v0.1.1...v0.1.2
[0.1.1]: https://github.com/shonfeder/tokenize/compare/v0.1.0...v0.1.1
[0.1.0]: https://github.com/shonfeder/tokenize/releases/tag/v0.1.0
62 changes: 58 additions & 4 deletions CONTRIBUTING.md
@@ -5,20 +5,74 @@ reports, etc.

## Code of Conduct

-Please review and accept to our [code of conduct](CODE_OF_CONDUCT.md) prior to
+Please review and accept our [code of conduct](CODE_OF_CONDUCT.md) prior to
engaging in the project.

## Overall direction and aims

Consult the [`design_notes.md`](design_notes.md) to see the latest codified
design philosophy and principles.

## Setting up Development

-TODO
+1. Install [SWI-Prolog](http://www.swi-prolog.org/download/stable) (the `swipl` executable).
- Optionally, you may wish to use [swivm](https://github.com/fnogatz/swivm) to
manage multiple installed versions of swi-prolog.
2. Hack on the source code in [`./prolog`](./prolog).
3. Run and explore your changes by loading the file in `swipl` (or using your
   editor's IDE capabilities):
- Example in swipl

```prolog
# in ~/oss/tokenize on git:develop x [22:45:02]
$ cd ./prolog
# in ~/oss/tokenize/prolog on git:develop x [22:45:04]
$ swipl
Welcome to SWI-Prolog (threaded, 64 bits, version 8.0.2)
SWI-Prolog comes with ABSOLUTELY NO WARRANTY. This is free software.
Please run ?- license. for legal details.
For online help and background, visit http://www.swi-prolog.org
For built-in help, use ?- help(Topic). or ?- apropos(Word).
% load the tokenize module
?- [tokenize].
true.
% experiment
?- tokenize("Foo bar baz", Tokens).
Tokens = [word(foo), space(' '), word(bar), space(' '), word(baz)].
% reload the module when you make changes to the source code
?- make.
% Updating index for library /usr/local/Cellar/swi-prolog/8.0.2/libexec/lib/swipl/library/
true.
% finished
?- halt.
```

Please ask here or in `##prolog` on [freenode](https://freenode.net/) if you
need any help! :)

## Running tests

Tests are located in the [`./test`](./test) directory. To run the test suite,
-simply execute the test file:
+simply execute `make test`:

```sh
-$ ./test/test.pl
+$ make test
% PL-Unit: tokenize .. done
% All 2 tests passed
```

If you are inside the `swipl` REPL, load the test file and query `run_tests`:

```prolog
?- [test/test].
?- run_tests.
% PL-Unit: tokenize .. done
% All 2 tests passed
true.
```
19 changes: 19 additions & 0 deletions Makefile
@@ -0,0 +1,19 @@
.PHONY: all version check install test clean

version := $(shell swipl -q -s pack -g 'version(V),writeln(V)' -t halt)
packfile = tokenize-$(version).tgz

SWIPL := swipl

all: test

version:
echo $(version)

check: test

install:
echo "(none)"

test:
@$(SWIPL) -s test/test.pl -g 'run_tests,halt(0)' -t 'halt(1)'
23 changes: 15 additions & 8 deletions README.md
@@ -1,30 +1,37 @@
-# Synopsis
+# `pack(tokenize) :-`

A modest tokenization library for SWI-Prolog, seeking a balance between
simplicity and flexibility.

[![CircleCI](https://circleci.com/gh/shonfeder/tokenize.svg?style=svg)](https://circleci.com/gh/shonfeder/tokenize)

## Synopsis

```prolog
?- tokenize(`\tExample Text.`, Tokens).
-Tokens = [cntrl('\t'), word(example), spc(' '), spc(' '), word(text), punct('.')]
+Tokens = [cntrl('\t'), word(example), space(' '), space(' '), word(text), punct('.')]
?- tokenize(`\tExample Text.`, Tokens, [cntrl(false), pack(true), cased(true)]).
-Tokens = [word('Example', 1), spc(' ', 2), word('Text', 1), punct('.', 1)]
+Tokens = [word('Example', 1), space(' ', 2), word('Text', 1), punct('.', 1)]
?- tokenize(`\tExample Text.`, Tokens), untokenize(Tokens, Text), format('~s~n', [Text]).
example text.
-Tokens = [cntrl('\t'), word(example), spc(' '), spc(' '), word(text), punct('.')],
-Text = [9, 101, 120, 97, 109, 112, 108, 101, 32|...]
+Tokens = [cntrl('\t'), word(example), space(' '), space(' '), word(text), punct('.')],
+Text = [9, 101, 120, 97, 109, 112, 108, 101, 32|...]
```

-# Description
+## Description

Module `tokenize` aims to provide a straightforward tool for tokenizing text into a simple format. It is the result of a learning exercise, and it is far from perfect. If there is sufficient interest from myself or anyone else, I'll try to improve it.

It is packaged as an SWI-Prolog pack, available [here](http://www.swi-prolog.org/pack/list?p=tokenize). Install it into your SWI-Prolog system with the query

```prolog
?- pack_install(tokenize).
```
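Once installed, the pack can be loaded and tried out directly in a session. The sketch below assumes the pack exposes a `tokenize` module loadable via `library(tokenize)`, as SWI-Prolog packs conventionally do; the token output follows the synopsis above:

```prolog
% Load the pack's module, then tokenize a string.
?- use_module(library(tokenize)).
true.

?- tokenize("Hello world", Tokens).
Tokens = [word(hello), space(' '), word(world)].
```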

Please [visit the wiki](https://github.com/aBathologist/tokenize/wiki/tokenize.pl-options-and-examples) for more detailed instructions and examples, including a full list of options supported.

-# Contributing
+## Contributing

See [CONTRIBUTING.md](./CONTRIBUTING.md).
4 changes: 4 additions & 0 deletions comment-wip/README.md
@@ -0,0 +1,4 @@
WIP code towards tokenization of comments.

It was extracted here because it's not ready for release, but we want to keep it
available for the author to resume work on it.
115 changes: 115 additions & 0 deletions comment-wip/comment.pl
@@ -0,0 +1,115 @@
:- module(comment,
[comment//2,
comment_rec//2,
comment_token//3,
comment_token_rec//3]).

/** <module> Tokenizing comments
This module defines matchers for comments, used by the tokenize module. (Note
that we use "matcher" as a name for DCG rules that match parts of the codes
list.)
@author Stefan Israelsson Tampe
@license LGPL v2 or later
Interface note:
A start or end matcher is a DCG rule that can be evaluated in two ways: with
no extra argument (--> call(StartMatcher)), in which case it simply matches
its token, or with one extra argument that produces the codes it matched,
e.g. --> call(StartMatcher, MatchedCodes). The matchers match the start and
end codes of a comment. The type 2matcher denotes such rules; the "2" refers
to the two calling conventions they support.
For examples see:
@see tests/test_comments.pl
The exported matcher predicates are:
comment(+Start:2matcher,+End:2matcher)
- anonymously match a non-recursive comment
comment_rec(+Start:2matcher,+End:2matcher)
- anonymously match a recursive comment
comment_token(+Start:2matcher,+End:2matcher,-Matched:list(codes))
- match a non-recursive comment, producing the matched sequence used
to build the resulting comment token
comment_token_rec(+Start:2matcher,+End:2matcher,-Matched:list(codes))
- match a recursive comment, producing the matched sequence used
to build the resulting comment token
*/



%% comment(+Start:2matcher,+End:2matcher)
% non recursive non tokenizing matcher

comment_body(E) --> call(E),!.
comment_body(E) --> [_],comment_body(E).

comment(S,E) -->
call(S),
comment_body(E).

%% comment_token(+Start:2matcher,+End:2matcher,-Matched:list(codes))
% non recursive tokenizing matcher

comment_body_token(E,Text) -->
call(E,HE),!,
{append(HE,[],Text)}.

comment_body_token(E,[X|L]) -->
[X],
comment_body_token(E,L).

comment_token(S,E,Text) -->
call(S,HS),
{append(HS,T,Text)},
comment_body_token(E,T).

%% comment_token_rec(+Start:2matcher,+End:2matcher,-Matched:list(codes))
% recursive tokenizing matcher

% Used as the initial continuation; it tidies up the matched result by
% closing the list with [] and passing the DCG state through unchanged.
comment_body_rec_start([], S, S).

comment_body_token_rec(_,E,Cont,Text) -->
call(E,HE),!,
{append(HE,T,Text)},
call(Cont,T).

comment_body_token_rec(S,E,Cont,Text) -->
call(S,HS),!,
{append(HS,T,Text)},
comment_body_token_rec(S,E,comment_body_token_rec(S,E,Cont),T).

comment_body_token_rec(S,E,Cont,[X|L]) -->
[X],
comment_body_token_rec(S,E,Cont,L).

comment_token_rec(S,E,Text) -->
call(S,HS),
{append(HS,T,Text)},
comment_body_token_rec(S,E,comment_body_rec_start,T).

%% comment_rec(+Start:2matcher,+End:2matcher)
% recursive non tokenizing matcher

comment_body_rec(_,E) -->
call(E),!.

comment_body_rec(S,E) -->
call(S),!,
comment_body_rec(S,E),
comment_body_rec(S,E).

comment_body_rec(S,E) -->
[_],
comment_body_rec(S,E).

comment_rec(S,E) -->
call(S),
comment_body_rec(S,E).
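To illustrate the 2matcher convention described in the module documentation, here is a hypothetical pair of start/end matchers for C-style comments and how they might be used with `comment//2` and `comment_token//3`. This is an illustrative sketch, not part of the commit; `c_start` and `c_end` are invented names:

```prolog
% Hypothetical start/end 2matchers for C-style /* ... */ comments.
% Each is defined twice, once per calling convention: matching only,
% and matching while also returning the codes consumed.
c_start --> "/*".
c_start(`/*`) --> "/*".

c_end --> "*/".
c_end(`*/`) --> "*/".

% Anonymous matching, and matching that captures the full comment text:
%
% ?- phrase(comment(c_start, c_end), `/* a comment */`).
% true.
%
% ?- phrase(comment_token(c_start, c_end, Text), `/* a comment */`),
%    format("~s~n", [Text]).
% /* a comment */
```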