A left-right parser for the Go language.
Simple install the package to your $GOPATH with the go tool from shell:
$ go get github.com/cesbit/goleri
Make sure Git is installed on your machine and in your system's PATH
.
We recommend using pyleri for creating a grammar and export the grammar to goleri. This way you can create one single grammar and export the grammar to program languages like C, JavaScript, Go and Python.
package main
import (
"fmt"
"regexp"
"github.com/cesbit/goleri"
)
// Element identifiers
const (
NoGid = iota
GidHi = iota
GidName = iota
GidSTART = iota
)
// MyGrammar returns a compiled goleri grammar.
func MyGrammar() *goleri.Grammar {
name := goleri.NewRegex(GidName, regexp.MustCompile(`^(?:'(?:[^']*)')+`))
hi := goleri.NewKeyword(GidHi, "hi", false)
START := goleri.NewSequence(GidSTART, hi, name)
return goleri.NewGrammar(START, nil)
}
func main() {
// compile your grammar by creating an instance of the Grammar Class.
myGrammar := MyGrammar()
// use the compiled grammar to parse 'strings'
if res, err := myGrammar.Parse("hi 'Iris'"); err == nil {
fmt.Printf("%t\n", res.IsValid())
/*
res.IsValid()
returns true or false depending if the string is successful parsed
by the grammar or not.
res.Pos()
returns the position in the string where parsing has end.
(if successful this will be equal to the string length)
res.GetExpecting()
returns expected elements at position res.Pos(). This can be used
for auto-completion, auto correction or suggestions.
res.Tree()
returns the parse tree.
*/
}
}
The first Grammar parameter expects a START
property so the parser knows where to start parsing. The second parameter of Grammar reKeywords
is set to nil
in the Quick Usage example. In that case goleri uses the default setting of ^\w+
which is equal to ^[A-Za-z0-9_]+
. This default property can be overwritten. (See the example of keyword). Grammar has a parse method: Parse()
.
goleri.NewGrammar(elem Element, reKeywords *regexp.Regexp)
syntax:
MyGrammar.Parse("string")
The parse()
method returns a result object which has the following properties:
.GetExpecting()
.isValid()
.Pos()
.Tree()
IsValid()
returns a boolean value, true
when the given string is valid according to the given grammar, false
when not valid.
Let us take the example from Quick usage.
if res, err := MyGrammar.Parse("hi 'Iris'"); err == nil {
fmt.Printf("%t\n", res.IsValid())
Pos
returns the position where the parser had to stop. (when IsValid()
is true
this value will be equal to the length of the given string)
Let us take the example from Quick usage.
if res, err := MyGrammar.Parse("hi 'Iris'"); err == nil {
fmt.Printf("%d\n", res.Pos())
Tree()
contains the parse tree. Even when IsValid()
is false
the parse tree is returned but will only contain results as far as parsing has succeeded. The tree is the root node which can include several children
nodes.
GetExpecting()
returns an object containing elements which goleri expects at Pos
. Even if IsValid
is true
there might be elements in this set, for example when an Optional
element could be added to the string. Expecting is useful if you want to implement things like auto-completion, syntax error handling, and auto-syntax-correction.
Goleri has several elements which are all subclasses of Element and can be used to create a grammar. For every element you can define a Global Identifer (gid) for identifying the element in the parse tree. But the gid is not required and should be set to 0 if not used. Every element has two corresponding methods that can be exported: Gid()
returns the elements gid and String()
returns a string including the gid, the element and possible nested elements like <Keyword gid:0 keyword:hi>
.
syntax:
goleri.NewKeyword(gid int, keyword string, ignCase bool)
The parser needs to match the keyword which is just a string. When matching keywords we need to tell the parser what characters are allowed in keywords. By default goleri uses ^\w+
which is equal to ^[A-Za-z0-9_]+
, but in the example below we overwrite the default. NewKeyword()
accepts a parameter ignCase
to tell the parser if we should match case insensitive. The following methods in turn return the arguments that are passed to the NewKeyword()
method: IsIgnCase()
returns a boolean that indicates if case is ignored and GetKeyword()
returns the keyword.
Example:
// Let's allow keywords with alphabetic characters and dashes.
START := goleri.NewKeyword(0, "tic-tac-toe", true)
grammar := goleri.NewGrammar(START, regexp.MustCompile(`^[A-Za-z-]+`))
if res, err := grammar.Parse("Tic-Tac-Toe"); err == nil {
fmt.Printf("%t\n", res.IsValid()) // true
fmt.Printf("%s\n", START.GetKeyword()) // tic-tac-toe
fmt.Printf("%t\n", START.IsIgnCase()) // true
fmt.Printf("%s\n", START.String()) // <Keyword gid:0 keyword:tic-tac-toe>
}
syntax:
goleri.NewRegex(gid int, regex *regexp.Regexp)
The parser compiles a regular expression using the regexp
package. The GetRegex()
method in turn returns the regex that is passed to NewRegex()
.
Example:
START := goleri.NewRegex(0, regexp.MustCompile(`^(?:'(?:[^']*)')+`))
grammar := goleri.NewGrammar(START, nil)
if res, err := grammar.Parse("'Iris'"); err == nil {
fmt.Printf("%t\n", res.IsValid()) // true
fmt.Printf("%s\n", START.GetRegex()) // ^(?:'(?:[^']*)')+
fmt.Printf("%s\n", START.String()) // <Regex gid:0 regex:^(?:'(?:[^']*)')+>
}
syntax:
goleri.NewTokens(gid int, token string)
A token can be one or more characters and is usually used to match operators like +
, -
, //
and so on. The GetToken()
method in turn returns the token that is passed to NewToken()
.
Example:
START := goleri.NewToken(0, "+")
grammar := goleri.NewGrammar(START, nil)
if res, err := grammar.Parse("-"); err == nil {
fmt.Printf("%t\n", res.IsValid()) // false
fmt.Printf("%s\n", START.GetToken()) // +
fmt.Printf("%s\n", START.String()) // <Token gid:0 token:+>
}
syntax:
goleri.NewTokens(gid int, tokens string)
Can be used to register multiple tokens at once. The tokens
argument should be a string with tokens separated by spaces. If given tokens are different in size the parser will try to match the longest tokens first. The GetTokens()
method returns the tokens that are passed to NewTokens()
.
Example:
START := goleri.NewTokens(0, "== > < <= >=")
grammar := goleri.NewGrammar(START, nil)
if res, err := grammar.Parse(">"); err == nil {
fmt.Printf("%t\n", res.IsValid()) // true
fmt.Printf("%s\n", START.GetTokens()) // [== <= >= > <]
fmt.Printf("%s\n", START.String()) //<Tokens gid:0 tokens:[== <= >= > <]>
}
syntax:
goleri.NewSequence(gid int, elements ...Element)
The parser needs to match each element in a sequence.
Example:
START := goleri.NewSequence(
0,
goleri.NewKeyword(0, "tic", false),
goleri.NewKeyword(0, "tac", false),
goleri.NewKeyword(0, "toe", false))
grammar := goleri.NewGrammar(START, nil)
if res, err := grammar.Parse("tic tac toe"); err == nil {
fmt.Printf("%t\n", res.IsValid()) // true
fmt.Printf("%s\n", START.String()) /* <Sequence gid:0 elements:[<Keyword gid:0 keyword:tic> <Keyword gid:0 keyword:tac> <Keyword gid:0 keyword:toe>]> */
}
syntax:
goleri.NewChoice(gid int, mostGreedy bool, elements ...Element)
The parser needs to choose between one of the given elements. Choice accepts the parameter mostGreedy
. When mostGreedy
is set to false
the parser will stop at the first match. When true
the parser will try each element and returns the longest match. Setting mostGreedy
to false
can provide some extra performance. Note that the parser will try to match each element in the exact same order they are passed to NewChoice()
. The IsMostGreedy()
method returns the boolean of mostGreedy.
Example: let us use Choice to modify the Quick usage example to also allow the string 'bye "Iris"'
choice := goleri.NewChoice(
0,
true,
goleri.NewKeyword(0, "hi", false),
goleri.NewKeyword(0, "bye", false))
START := goleri.NewSequence(
0,
choice,
goleri.NewRegex(0, regexp.MustCompile(`^(?:'(?:[^']*)')+`)))
grammar := goleri.NewGrammar(START, nil)
if res, err := grammar.Parse("hi 'Iris'"); err == nil {
fmt.Printf("%t\n", res.IsValid()) // true
}
if res, err := grammar.Parse("bye 'Iris'"); err == nil {
fmt.Printf("%t\n", res.IsValid()) // true
fmt.Printf("%t\n", choice.IsMostGreedy()) // true
fmt.Printf("%s\n", choice.String()) // <Choice gid:0 elements:[<Keyword gid:0 keyword:hi> <Keyword gid:0 keyword:bye>]>
}
syntax:
goleri.NewRepeat(gid int, elem Element, min, max int)
The parser needs at least min
elements and at most max
elements. When max
is set to 0 we allow unlimited number of elements. min
can be any integer value equal or higher than 0 but not larger then max
. The following methods return the min
and max
arguments that are passed to the NewRepeat()
method: GetMin()
, GetMax()
.
Example:
START := goleri.NewRepeat(
0,
goleri.NewKeyword(0, "na", false),
1,
8)
grammar := goleri.NewGrammar(START, nil)
if res, err := grammar.Parse("na na na na na na"); err == nil {
fmt.Printf("%t\n", res.IsValid()) // true
fmt.Printf("%d\n", START.GetMin()) // 1
fmt.Printf("%d\n", START.GetMax()) // 8
fmt.Printf("%s\n", START.String()) // <Repeat gid:0 elem:<Keyword gid:0 keyword:na>>
}
syntax:
goleri.NewList(gid int, elem, delimiter Element, min, max int, optClose bool)
List is like Repeat but with a delimiter. The delimiter must be defined as an element first, for example as a NewToken()
or NewKeyword()
. min
and max
have the same meaning as in Repeat. The parameter optClose
can be set to true
to allow the list to end with a delimiter. The following methods return the min
, max
and optClose
arguments that are passed to the NewRepeat()
method: GetMin()
, GetMax()
, IsOptClose()
.
Example:
START := goleri.NewList(
0,
goleri.NewKeyword(0, "ni", false),
goleri.NewToken(0, ","),
0,
8,
true)
grammar := goleri.NewGrammar(START, nil)
if res, err := grammar.Parse("ni, ni, ni, ni,"); err == nil {
fmt.Printf("%t\n", res.IsValid()) // true
fmt.Printf("%d\n", START.GetMin()) // 0
fmt.Printf("%d\n", START.GetMax()) // 8
fmt.Printf("%t\n", START.IsOptClose()) // true
fmt.Printf("%s\n", START.String()) // <List gid:0 elem:<Keyword gid:0 keyword:ni> delimiter:<Token gid:0 token:,>>
}
syntax:
goleri.NewOptional(gid int, elem Element)
The parser looks for an optional element. It is like using NewRepeat(0, element, 0, 1)
but we encourage to use Optional
since it is more readable. (and slightly faster)
Example:
optional := goleri.NewOptional(
0,
goleri.NewRegex(0, regexp.MustCompile(`^(?:'(?:[^']*)')+`)))
START := goleri.NewSequence(0, goleri.NewKeyword(0, "hi", false), optional)
grammar := goleri.NewGrammar(START, nil)
if res, err := grammar.Parse("hi"); err == nil {
fmt.Printf("%t\n", res.IsValid()) // true
}
if res, err := grammar.Parse("hi 'iris'"); err == nil {
fmt.Printf("%t\n", res.IsValid()) // true
fmt.Printf("%s\n", optional.String()) // <Optional gid:0 elem:<Regex gid:0 regex:^(?:'(?:[^']*)')+>>
}
syntax:
goleri.NewRef()
The grammar can make a forward reference to make recursion possible. In the example below we create a forward reference to START but note that a reference to any element can be made.
Warning: A reference is not protected against testing the same position in a string. This could potentially lead to an infinite loop. For example:
r := goleri.NewRef() r.Set(goleri.NewOptional(0, r)) // DON'T DO THISUse Prio if such recursive construction is required.
Example:
START := goleri.NewRef()
START.Set(goleri.NewSequence(
0,
goleri.NewToken(0, "["),
goleri.NewList(
0,
goleri.NewChoice(0, true, goleri.NewKeyword(0, "ni", false), START),
goleri.NewToken(0, ","),
0,
0,
false),
goleri.NewToken(0, "]")))
grammar := goleri.NewGrammar(START, nil)
if res, err := grammar.Parse("[ni, ni, [ni, [], [ni, ni]]]"); err == nil {
fmt.Printf("%t\n", res.IsValid()) // true
fmt.Printf("%s\n", START.String()) // <Ref isSet:true>
}
syntax:
goleri.NewPrio(gid int, elements ...Element)
Choose the first match from the prio elements and allow THIS
for recursive operations. With THIS
we point to the Prio
element. Probably the example below explains how Prio
and THIS
can be used.
Note: Use a Ref when possible. A
Prio
element is required when the same position in a string is potentially checked more than once.
Example:
START := goleri.NewPrio(
0,
goleri.NewKeyword(0, "ni", false),
goleri.NewSequence(0, goleri.NewToken(0, "("), goleri.THIS, goleri.NewToken(0, ")")),
goleri.NewSequence(0, goleri.THIS, goleri.NewKeyword(0, "or", false), goleri.THIS),
goleri.NewSequence(0, goleri.THIS, goleri.NewKeyword(0, "and", false), goleri.THIS))
grammar := goleri.NewGrammar(START, nil)
if res, err := grammar.Parse("(ni or ni) and (ni or ni)"); err == nil {
fmt.Printf("%t\n", res.IsValid()) // true
fmt.Printf("%s\n", START.String()) /* <Prio gid:0 elements:[<Keyword gid:0 keyword:ni> <Sequence gid:0 elements:[<Token gid:0 token:(> <This> <Token gid:0 token:)>]> <Sequence gid:0 elements:[<This> <Keyword gid:0 keyword:or> <This>]> <Sequence gid:0 elements:[<This> <Keyword gid:0 keyword:and> <This>]>]> */
}