These build rules are used for processing ANTLR grammars with Bazel.
Different rules are available that align with the ANTLR release streams.
Add the following to your WORKSPACE file to include the external repository and load the external dependencies necessary for the antlr4 rule:
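# http_archive is not a built-in in current Bazel releases and has to be loaded first.
load("@bazel_tools//tools/build_defs/repo:http.bzl", "http_archive")
http_archive(
    name = "rules_antlr",
    sha256 = "acd2a25f31aeeea5f58cdb434ae109d03826ae7cc11fe9efce1740102e3f4531",
    strip_prefix = "rules_antlr-0.1.0",
    urls = ["https://github.com/marcohu/rules_antlr/archive/0.1.0.tar.gz"],
)
load("@rules_antlr//antlr:deps.bzl", "antlr_dependencies")
antlr_dependencies()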
More detailed instructions can be found in the Setup section.
Suppose you have the following directory structure for a simple ANTLR project:
[workspace]/
    WORKSPACE
    HelloWorld/
        BUILD
        src/
            main/
                antlr4/
                    Hello.g4
HelloWorld/src/main/antlr4/Hello.g4
grammar Hello;
r : 'hello' ID;
ID : [a-z]+;
WS : [ \t\r\n]+ -> skip;
HelloWorld/BUILD
package(default_visibility = ["//visibility:public"])
load("@rules_antlr//antlr:antlr4.bzl", "antlr4")
antlr4(
    name = "generated",
    srcs = ["src/main/antlr4/Hello.g4"],
    package = "hello.world",
)
java_library(
    name = "HelloWorld",
    srcs = [":generated"],
)
Compiling the project generates the lexer/parser files:
$ bazel build //HelloWorld
INFO: Analysed target //HelloWorld:HelloWorld (0 packages loaded).
INFO: Found 1 target...
Target //HelloWorld:HelloWorld up-to-date:
bazel-bin/HelloWorld/libHelloWorld.jar
INFO: Elapsed time: 0.940s, Critical Path: 0.76s
INFO: Build completed successfully, 4 total actions
The generated source files can be found in the generated.srcjar archive below your workspace bazel-bin/HelloWorld directory.
To just generate the source files you would use:
$ bazel build //HelloWorld:generated
Refer to the examples directory for further samples.
The ANTLR rules store all generated source files in a target-name.srcjar zip archive below your workspace bazel-bin folder.
Depending on the ANTLR version, there are three ways to control namespacing and directory structure for generated code, all with their pros and cons.
- The package rule attribute (antlr4 only). Setting the namespace via the package attribute generates the corresponding target-language-specific namespacing code (where applicable) and puts the generated source files below a corresponding directory structure. To not create the directory structure, set the layout attribute to flat.
  Very expressive and allows language-independent grammars, but only available with ANTLR 4, requires several runs for different namespaces, might complicate refactoring, and can conflict with language-specific code in @header {...} sections, as the two are mutually exclusive.
- Language-specific application code in the grammar @header {...} section. To not create the corresponding directory structure, set the layout attribute to flat.
  Allows different namespaces to be processed in a single run and does not require changes to build files upon refactoring, but ties grammars to a specific language and can conflict with the package attribute, as the two are mutually exclusive.
- The project layout (antlr4 only). Putting your grammars below a common project directory determines the namespace and the corresponding directory structure for the generated source files from the relative project path. The ANTLR rules use different defaults for the different target languages (see below), but you can define the root directory yourself via the layout attribute.
  Allows different namespaces to be processed in a single run without language coupling, but requires conformity to a specific (albeit configurable) project layout and the layout attribute for certain languages.
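As a rough sketch of the first and third options, the BUILD fragment below first sets the namespace explicitly and then relies on the project layout instead. The target names and the hello/world subdirectory are hypothetical, and the second target assumes the Java default root src/main/antlr4. Only the package and layout attributes and the flat value are taken from the descriptions above.

```starlark
load("@rules_antlr//antlr:antlr4.bzl", "antlr4")

# Option 1: explicit namespace via the package attribute.
# layout = "flat" keeps the generated files out of a hello/world/
# directory structure.
antlr4(
    name = "generated_flat",  # hypothetical target name
    srcs = ["src/main/antlr4/Hello.g4"],
    package = "hello.world",
    layout = "flat",
)

# Option 3: no package attribute. With the grammar stored below the
# Java default root (src/main/antlr4), the namespace is expected to be
# derived from the remaining relative path, here hello/world.
antlr4(
    name = "generated_by_layout",  # hypothetical target name
    srcs = ["src/main/antlr4/hello/world/Hello.g4"],  # hypothetical path
)
```

The second option needs no build file changes at all, since the namespace is then declared inside the grammar's @header {...} section.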
The antlr4 rule supports a common directory layout to figure out namespacing from the relative directory structure. The table below lists the default paths for the different target languages. The version number at the end is optional.
Language | Default Directory |
---|---|
C | src/antlr4 |
Cpp | src/antlr4 |
CSharp, CSharp2, CSharp3 | src/antlr4 |
Go | |
Java | src/main/antlr4 |
JavaScript | src/antlr4 |
Python, Python2, Python3 | src/antlr4 |
Swift | |
For languages with no default, you have to set your preference with the layout attribute.
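For a target such as Go, which has no default directory in the table above, a build file might set the root explicitly. This is only a sketch: the target name, the grammar path and the src/antlr4 root are hypothetical choices, and it assumes that layout accepts the root directory as a string, as the description of the project layout option suggests.

```starlark
load("@rules_antlr//antlr:antlr4.bzl", "antlr4")

antlr4(
    name = "generated_go",  # hypothetical target name
    srcs = ["src/antlr4/hello/Hello.g4"],  # hypothetical grammar location
    language = "Go",
    # Go has no default directory, so the root used for namespace
    # detection is stated explicitly (assumed value).
    layout = "src/antlr4",
)
```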
antlr4(name, deps=[], srcs=[], atn, depend, encoding, error,
       force_atn, imports=[], language, layout, listener, log,
       long_messages, message_format, no_listener, no_visitor,
       options={}, package, visitor)
Runs ANTLR 4 on the given grammar files.
Attribute | Description |
---|---|
name | A unique name for this rule. |
deps | The dependencies to use. Defaults to the official ANTLR 4 release, but if you need to use a different version, you can specify the dependencies here. |
srcs | The grammar files to process. |
atn | Generate rule augmented transition network diagrams. |
depend | Generate a list of file dependencies instead of parser and/or lexer. |
encoding | The grammar file encoding, e.g. euc-jp. |
error | Treat warnings as errors. |
force_atn | Use the ATN simulator for all predictions. |
imports | The grammar and .tokens files to import. Must be all in the same directory. |
language | The code generation target language. Either Cpp, CSharp, Go, JavaScript, Java, Python2, Python3 or Swift (case-sensitive). |
layout | The directory layout to match file paths against for package/namespace detection by convention. The default depends on the target language. |
listener | Generate parse tree listener. |
log | Dump lots of logging info to antlr-timestamp.log. |
long_messages | Show exception details when available for errors and warnings. |
message_format | The output style for messages. Either antlr, gnu or vs2005. |
no_listener | Do not generate parse tree listener. |
no_visitor | Do not generate parse tree visitor. |
options | Set/override grammar-level options. |
package | The package/namespace for the generated code. |
visitor | Generate parse tree visitor. |
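A minimal sketch of how a few of these attributes might be combined, reusing the Hello.g4 grammar from the example project. The attribute types are not stated above, so treating visitor, no_listener and error as booleans is an assumption, and the target name is hypothetical.

```starlark
load("@rules_antlr//antlr:antlr4.bzl", "antlr4")

antlr4(
    name = "generated_visitor",  # hypothetical target name
    srcs = ["src/main/antlr4/Hello.g4"],
    package = "hello.world",
    visitor = True,      # generate a parse tree visitor (assumed boolean)
    no_listener = True,  # skip the parse tree listener (assumed boolean)
    error = True,        # treat warnings as errors (assumed boolean)
)
```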
antlr3(name, deps=[], srcs=[], debug, depend, dfa, dump, imports=[],
       language, message_format, nfa, profile, report, trace,
       Xconversiontimeout, Xdbgst, Xdbgconversion, Xdfa, Xdfaverbose,
       Xgrtree, Xm, Xmaxdfaedges, Xmaxinlinedfastates, Xminswitchalts,
       Xmultithreaded, Xnfastates, Xnocollapse, Xnomergestopstates,
       Xnoprune, Xsavelexer, Xwatchconversion)
Runs ANTLR 3 on the given grammar files.
Attribute | Description |
---|---|
name | A unique name for this rule. |
deps | The dependencies to use. Defaults to the most recent ANTLR 3 release, but if you need to use a different version, you can specify the dependencies here. |
srcs | The grammar files to process. |
Xconversiontimeout | Set NFA conversion timeout for each decision. |
Xdbgconversion | Dump lots of info during NFA conversion. |
Xdbgst | Put tags at start/stop of all templates in output. |
Xdfa | Print DFA as text. |
Xdfaverbose | Generate DFA states in DOT with NFA configs. |
Xgrtree | Print the grammar AST. |
Xm | Max number of rule invocations during conversion. |
Xmaxdfaedges | Max "comfortable" number of edges for single DFA state. |
Xmaxinlinedfastates | Max DFA states before table used rather than inlining. |
Xminswitchalts | Don't generate switch() statements for DFAs smaller than given number. |
Xmultithreaded | Run the analysis in 2 threads. |
Xnfastates | For nondeterminisms, list NFA states for each path. |
Xnocollapse | Do not collapse incident edges into DFA states. |
Xnomergestopstates | Do not merge stop states. |
Xnoprune | Do not test EBNF block exit branches. |
Xsavelexer | Don't delete temporary lexers generated from combined grammars. |
Xwatchconversion | Print a message for each NFA before converting. |
debug | Generate a parser that emits debugging events. |
depend | Generate file dependencies; don't actually run antlr. |
dfa | Generate a DFA for each decision point. |
dump | Print out the grammar without actions. |
imports | The grammar and .tokens files to import. Must be all in the same directory. |
language | The code generation target language. Either C, Cpp, CSharp2, CSharp3, JavaScript, Java, ObjC, Python, Python3 or Ruby (case-sensitive). |
message_format | Specify output style for messages. |
nfa | Generate an NFA for each rule. |
profile | Generate a parser that computes profiling information. |
report | Print out a report about the grammar(s) processed. |
trace | Generate a parser with trace output. If the default output is not enough, you can override the traceIn and traceOut methods. |
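An ANTLR 3 target might be declared along the same lines as the antlr4 example. This sketch rests on several assumptions: the load path antlr3.bzl is inferred by analogy with antlr4.bzl, the grammar path and target name are hypothetical, and report and trace are treated as booleans. The WORKSPACE snippet at the top only covers the antlr4 rule, so additional dependency setup may be needed (see the Setup section).

```starlark
# Assumption: the ANTLR 3 rule is exported from antlr3.bzl,
# next to the antlr4.bzl file used earlier.
load("@rules_antlr//antlr:antlr3.bzl", "antlr3")

antlr3(
    name = "generated3",  # hypothetical target name
    srcs = ["src/main/antlr3/Hello.g"],  # hypothetical grammar location
    language = "Java",
    report = True,  # print a report about the processed grammar (assumed boolean)
    trace = True,   # generate a parser with trace output (assumed boolean)
)
```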
antlr2(name, deps=[], srcs=[], debug, diagnostic, docbook, html,
       imports=[], trace, traceLexer, traceParser, traceTreeParser)
Runs ANTLR 2 on the given grammar files.
Attribute | Description |
---|---|
name | A unique name for this rule. |
deps | The dependencies to use. Defaults to the final ANTLR 2 release, but if you need to use a different version, you can specify the dependencies here. |
srcs | The grammar files to process. |
debug | Launch the ParseView debugger upon parser invocation. Unless you have downloaded and unzipped the debugger over the top of the standard ANTLR distribution, the code emanating from ANTLR with this option will not compile. |
diagnostic | Generate a text file from your grammar with a lot of debugging info. |
docbook | Generate a DocBook SGML file from your grammar without actions and so on. It only works for parsers, not lexers or tree parsers. |
html | Generate an HTML file from your grammar without actions and so on. It only works for parsers, not lexers or tree parsers. |
imports | The grammar file to import. |
trace | Have all rules call traceIn/traceOut. |
traceLexer | Have lexer rules call traceIn/traceOut. |
traceParser | Have parser rules call traceIn/traceOut. |
traceTreeParser | Have tree walker rules call traceIn/traceOut. |