Introduction improvements.
Gohla committed Dec 15, 2023
1 parent 2ac9b05 commit a7ce493
Showing 4 changed files with 106 additions and 52 deletions.
2 changes: 1 addition & 1 deletion README.md
@@ -20,7 +20,7 @@ cargo install --path mdbook-diff2html
If you have [`cargo install-update`](https://github.com/nabijaczleweli/cargo-update) installed, you can instead install and/or update the external binaries with:

```shell
cargo install-update mdbook mdbook-admonish mdbook-external-links
cargo install-update -i mdbook mdbook-admonish mdbook-external-links
```

## Building
144 changes: 99 additions & 45 deletions src/0_intro/index.md
@@ -1,11 +1,12 @@
# Build your own Programmatic Incremental Build System

This is a programming tutorial where you will build your own _programmatic incremental build system_ in [Rust](https://www.rust-lang.org/).
This is a programming tutorial where you will build your own _programmatic incremental build system_, which is a mix between an incremental build system and an incremental computation system.
Programmatic incremental build systems enable programmers to write expressive build scripts and interactive programs in a regular programming language, with the system taking care of correct incrementality once and for all, freeing programmers from having to manually implement complicated and error-prone incrementality every time.

The primary goal of this tutorial is to provide understanding of programmatic incremental build systems through implementation and experimentation.

Although the tutorial uses Rust, you _don't_ need to be a Rust expert to follow it.
A secondary goal of this tutorial is to teach more about Rust through implementation and experimentation, given that you already have programming experience (in another language) and are willing to learn.
In this programming tutorial you will write [Rust](https://www.rust-lang.org/) code, but you _don't_ need to be a Rust expert to follow it.
A secondary goal of this tutorial is to teach more about Rust through implementation and experimentation, given that you already have some programming experience (in another language) and are willing to learn.
Therefore, all Rust code is available, and I try to explain and link to the relevant Rust book chapters as much as possible.

This is of course not a full tutorial or book on Rust.
Expand All @@ -17,73 +18,126 @@ However, if you like to learn through examples and experimentation, or already k

[//]: # (Where possible I will try to explain design decisions, discuss tradeoffs, or provide more info about optimizations.)

We will first motivate programmatic incremental build systems.
We will first motivate programmatic incremental build systems in more detail.

## Motivation

A programmatic incremental build system is a mix between an incremental build system and an incremental computation system, with the following key properties:

- _Programmatic_: Build scripts are regular programs written in a programming language, where parts of the build script implement an API from the build system. This enables build authors to write incremental builds with the full expressiveness of the programming language.
- _Programmatic_: Build scripts are regular programs written in a programming language, where parts of the program implement an API from the build system. This enables programmers to write incremental build scripts and interactive programs with the full expressiveness of the programming language.
- _Incremental_: Builds are truly incremental -- only the parts of a build that are affected by changes are executed.
- _Correct_: Builds are fully correct -- all parts of the build that are affected by changes are executed. Builds are free of glitches: only up-to-date (consistent) data is observed.
- _Automatic_: The build system takes care of incrementality and correctness. Build authors _do not_ have to manually implement incrementality. Instead, they only have to explicitly _declare dependencies_.
- _Multipurpose_: The same build script can be used for incremental batch builds in a terminal, but also for live feedback in an interactive environment such as an IDE. For example, a compiler implemented in this build system can provide incremental batch compilation but also incremental editor services such as syntax highlighting or code completion.
- _Automatic_: The system takes care of incrementality and correctness. Programmers _do not_ have to manually implement incrementality. Instead, they only have to explicitly _declare dependencies_.

#### Teaser Toy Example
[//]: # (- _Multipurpose_: The same build script can be used for incremental batch builds in a terminal, but also for live feedback in an interactive environment such as an IDE. For example, a compiler implemented in this build system can provide incremental batch compilation but also incremental editor services such as syntax highlighting or code completion.)

As a small teaser, here is a simplified version of a programmatic incremental toy build script that copies a text file by reading and writing:
[//]: # ()
[//]: # (#### Teaser Toy Example)

To show the benefits of a build system with these key properties, here is a simplified version of the programmatic incremental build script for compiling a formal grammar and parsing text with that compiled grammar, which is the build script you will implement in the [final project chapter](../4_example/index.md).
This simplified version removes details that are not important for understanding programmatic incremental build systems at this moment.

```admonish info
Don't worry if you do not (fully) understand this code; the tutorial will guide you through programming and understanding this kind of code.
This example is primarily here to motivate programmatic incremental build systems, as it is hard to do so without it.
```

```rust
struct ReadFile {
file: PathBuf
}
impl Task for ReadFile {
fn execute<C: Context>(&self, context: &mut C) -> Result<String, io::Error> {
context.require_file(&self.file)?;
fs::read_to_string(&self.file)
}
pub enum ParseTasks {
CompileGrammar { grammar_file_path: PathBuf },
Parse { compile_grammar_task: Box<ParseTasks>, program_file_path: PathBuf, rule_name: String }
}

struct WriteFile<T> {
task: T,
file: PathBuf
pub enum Outputs {
CompiledGrammar(CompiledGrammar),
Parsed(String)
}
impl<T: Task> Task for WriteFile<T> {
fn execute<C: Context>(&self, context: &mut C) -> Result<(), io::Error> {
let string: String = context.require_task(&self.task)?;
fs::write(&self.file, string.as_bytes())?;
context.provide_file(&self.file)

impl Task for ParseTasks {
fn execute<C: Context>(&self, context: &mut C) -> Result<Outputs, Error> {
match self {
ParseTasks::CompileGrammar { grammar_file_path } => {
let grammar_text = context.require_file(grammar_file_path)?;
let compiled_grammar = CompiledGrammar::new(&grammar_text, Some(grammar_file_path))?;
Ok(Outputs::CompiledGrammar(compiled_grammar))
}
ParseTasks::Parse { compile_grammar_task, program_file_path, rule_name } => {
let compiled_grammar = context.require_task(compile_grammar_task)?;
let program_text = context.require_file_to_string(program_file_path)?;
let output = compiled_grammar.parse(&program_text, rule_name, Some(program_file_path))?;
Ok(Outputs::Parsed(output))
}
}
}
}

fn main() {
let read_task = ReadFile {
file: PathBuf::from("in.txt")
let compile_grammar_task = Box::new(ParseTasks::CompileGrammar {
grammar_file_path: PathBuf::from("grammar.pest")
});
let parse_1_task = ParseTasks::Parse {
compile_grammar_task: compile_grammar_task.clone(),
program_file_path: PathBuf::from("test_1.txt"),
rule_name: "main".to_string()
};
let write_task = WriteFile {
task: read_task,
file: PathBuf::from("out.txt")
let parse_2_task = ParseTasks::Parse {
compile_grammar_task: compile_grammar_task.clone(),
program_file_path: PathBuf::from("test_2.txt"),
rule_name: "main".to_string()
};
Pie::default().new_session().require(&write_task);

let mut context = IncrementalBuildContext::default();
let output_1 = context.require_task(&parse_1_task).unwrap();
println!("{output_1:?}");
let output_2 = context.require_task(&parse_2_task).unwrap();
println!("{output_2:?}");
}
```

The unit of computation in a programmatic incremental build system is a _task_.
This is in essence just a normal (pure) Rust program: it has enums, a trait implementation for one of those enums, and a `main` function.
However, this program is also a build script because `ParseTasks` implements the `Task` trait, which is the core trait defining the unit of computation in a programmatic incremental build system.

##### Tasks

A task is kind of like a closure, a function along with its inputs that can be executed, but incremental.
For example, the `ReadFile` task carries the file path it reads from.
When we `execute` the task, it reads from the file and returns its text as a string.
However, due to incrementality, we mark the file as a `require_file` dependency through `context`, such that this task is only re-executed when the file changes!
For example, `ParseTasks::CompileGrammar` carries `grammar_file_path` which is the file path of the grammar that it will compile.
When we `execute` a `ParseTasks::CompileGrammar` task, it reads the text of the grammar from the file, compiles that text into a grammar, and returns a compiled grammar.

##### Incremental File Dependencies

However, we want this task to be incremental, such that this task is only re-executed when the `grammar_file_path` file changes.
Therefore, `execute` has a `context` parameter which is an _incremental build context_ that tasks use to tell the build system about dependencies.
For example, `ParseTasks::CompileGrammar` tells the build system that it _requires_ the file with `context.require_file(grammar_file_path)`, marking the file as a _read dependency_.
It is then the responsibility of the incremental build system to only execute this task if the file has changed.
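
As a rough illustration of how a build system could decide whether a required file "has changed", here is a minimal sketch that stamps the file with a hash of its contents. `FileDependency`, `stamp_of`, and `is_consistent` are hypothetical names made up for this sketch, not the API built in the tutorial, and hashing the contents is only one possible strategy (a modified-time stamp would be another).

```rust
use std::collections::hash_map::DefaultHasher;
use std::fs;
use std::hash::{Hash, Hasher};
use std::io;
use std::path::{Path, PathBuf};

/// Hypothetical record of a file read dependency: the path of the file,
/// plus a stamp (hash of the contents) taken when the file was required.
pub struct FileDependency {
    path: PathBuf,
    stamp: u64,
}

/// Hash the current contents of the file at `path`.
/// Assumption: a 64-bit hash is good enough for a sketch; collisions are ignored.
fn stamp_of(path: &Path) -> io::Result<u64> {
    let bytes = fs::read(path)?;
    let mut hasher = DefaultHasher::new();
    bytes.hash(&mut hasher);
    Ok(hasher.finish())
}

impl FileDependency {
    /// Create a dependency to `path`, stamping its current contents.
    pub fn new(path: impl Into<PathBuf>) -> io::Result<Self> {
        let path = path.into();
        let stamp = stamp_of(&path)?;
        Ok(Self { path, stamp })
    }

    /// The dependency is consistent if the contents still hash to the stored stamp.
    pub fn is_consistent(&self) -> io::Result<bool> {
        Ok(stamp_of(&self.path)? == self.stamp)
    }
}
```

Under these assumptions, a task that required the file only needs to be re-executed when `is_consistent` returns `false`.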

##### Dynamic Dependencies

Note that this file dependency is created _while the task is executing_.
We call these _dynamic dependencies_, as opposed to static dependencies that are hardcoded into the build script.
Dynamic dependencies enable the _programmatic_ part of programmatic incremental build systems, because dependencies are made while your program is running, and can thus depend on values computed earlier in your program.

Another benefit of dynamic dependencies is that they enable _exact_ dependencies: the dependencies of a task exactly describe when the task should be re-executed, increasing incrementality.
With static dependencies, you often have to over-approximate dependencies, leading to reduced incrementality.
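
To make "dependencies can depend on values computed earlier" concrete, here is a small sketch in which the second file dependency is only known after reading the first file. `RecordingContext` and `read_indirect` are hypothetical, and an in-memory map stands in for the file system so the sketch stays self-contained; the real context in the tutorial works differently.

```rust
use std::collections::HashMap;

/// Hypothetical context that records every file a task requires.
/// An in-memory map from path to contents stands in for the file system.
pub struct RecordingContext<'a> {
    files: &'a HashMap<String, String>,
    pub dependencies: Vec<String>,
}

impl<'a> RecordingContext<'a> {
    pub fn new(files: &'a HashMap<String, String>) -> Self {
        Self { files, dependencies: Vec::new() }
    }

    /// Record a read dependency to `path` and return the contents, if the file exists.
    pub fn require_file(&mut self, path: &str) -> Option<&'a str> {
        self.dependencies.push(path.to_string());
        let files: &'a HashMap<String, String> = self.files;
        files.get(path).map(|text| text.as_str())
    }
}

/// Reads `index.txt`, which names another file, and then reads that file.
/// The second dependency cannot be declared up front: its path is only
/// known after the first file has been read.
pub fn read_indirect(context: &mut RecordingContext) -> Option<String> {
    let target = context.require_file("index.txt")?.trim().to_string();
    let text = context.require_file(&target)?;
    Some(text.to_string())
}
```

After `read_indirect` runs, `dependencies` contains exactly the two files that were read, which is the kind of exact dependency a static declaration would have to over-approximate.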

##### Incremental Task Dependencies

Dynamic dependencies are also created _between tasks_.
For example, `ParseTasks::Parse` carries `compile_grammar_task` which is an instance of the `ParseTasks::CompileGrammar` task to compile a grammar.
When we `execute` a `ParseTasks::Parse` task, it tells the build system that it depends on the compile grammar task with `context.require_task(compile_grammar_task)`, but also asks the build system to return the most up-to-date (consistent) output of that task.
It is then the responsibility of the incremental build system to _check_ whether the task is _consistent_, and to _re-execute_ it only if it is _inconsistent_.

If `compile_grammar_task` was never executed before, the build system executes it, caches the compiled grammar, and returns the compiled grammar.
Otherwise, to check if the compile grammar task is consistent, we need to check the file dependency to `grammar_file_path` that `ParseTasks::CompileGrammar` created earlier.
If the contents of the `grammar_file_path` file have changed, the task is inconsistent and the build system re-executes it, caches the new compiled grammar, and returns it.
Otherwise, the build system simply returns the cached compiled grammar.
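
The check-then-execute behavior described above can be sketched as a tiny cache. This is a simplification under stated assumptions: `MiniContext` is hypothetical, a task is identified by a string key, the "file contents" are passed in directly as `current_input`, and a closure stands in for `Task::execute`. The `IncrementalBuildContext` built in the tutorial tracks dependencies itself instead of being handed the input.

```rust
use std::collections::HashMap;

/// Hypothetical cache entry: the input a task observed, plus its output.
struct Cached {
    input: String,
    output: String,
}

/// Hypothetical, heavily simplified build context.
pub struct MiniContext {
    cache: HashMap<String, Cached>,
    /// Number of times a task was actually executed, for demonstration.
    pub executions: usize,
}

impl MiniContext {
    pub fn new() -> Self {
        Self { cache: HashMap::new(), executions: 0 }
    }

    /// Return the up-to-date output of the task identified by `key`.
    /// `current_input` stands in for the contents of the file the task depends on.
    pub fn require_task(
        &mut self,
        key: &str,
        current_input: &str,
        execute: impl FnOnce(&str) -> String,
    ) -> String {
        // Consistent: executed before and the input is unchanged, so reuse the cached output.
        if let Some(cached) = self.cache.get(key) {
            if cached.input == current_input {
                return cached.output.clone();
            }
        }
        // Never executed, or inconsistent: (re-)execute and cache the new output.
        self.executions += 1;
        let output = execute(current_input);
        self.cache.insert(
            key.to_string(),
            Cached { input: current_input.to_string(), output: output.clone() },
        );
        output
    }
}
```

Requiring the same key twice with unchanged input executes the closure once; changing the input triggers a second execution and replaces the cached output.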

Note that this file read dependency is created _while the task is executing_.
We call these _dynamic dependencies_.
This is one of the main benefits of programmatic incremental build systems: you create dependencies _while the build is executing_, instead of having to declare them upfront!
The `main` function creates instances of these tasks, creates an `IncrementalBuildContext`, and asks the build system to return the up-to-date outputs for two tasks with `context.require_task`.

Dynamic dependencies are also created between tasks.
For example, `WriteFile` carries a task as input, which it requires with `context.require_task` to retrieve the text for writing to a file.
We'll cover how this works later on in the tutorial.
For now, let's zoom back out to the motivation of programmatic incremental build systems.
This is the essence of programmatic incremental build systems.
In this tutorial, we will define the `Task` trait and implement the `IncrementalBuildContext`.
However, before we start doing that, I want to first zoom back out and discuss the benefits of programmatic incremental build systems.

#### Back to Motivation
### Benefits

I prefer writing builds in a programming language like this, over having to _encode_ a build into a YAML file with underspecified semantics, and over having to learn and use a new build scripting language with limited tooling.
By _programming builds_, I can reuse my knowledge of the programming language, I get help from the compiler and IDE that I'd normally get while programming, I can modularize and reuse parts of my build as a library, and can use other programming language features such as unit testing, integration testing, benchmarking, etc.
Expand All @@ -105,7 +159,7 @@ A task is re-executed when one or more of its dependencies become inconsistent.
For example, the `WriteFile` task from the example is re-executed when the task dependency returns different text, or when the file it writes to is modified or deleted.
This is both incremental and correct.

#### Disadvantages
### Disadvantages

Of course, programmatic incremental build systems also have some disadvantages.
These disadvantages become more clear during the tutorial, but I want to list them here to be up-front about it:
Expand All @@ -120,7 +174,7 @@ We have developed [PIE, a Rust library](https://github.com/Gohla/pie) implementi
It is still under development, and has not been published to crates.io yet, but it is already usable.
If you are interested in experimenting with a programmatic incremental build system, do check it out!

In this tutorial we will implement a subset of [PIE, the Rust library](https://github.com/Gohla/pie).
In this tutorial we will implement a subset of PIE.
We simplify the internals in order to minimize distractions as much as possible, but still go over all the key ideas and concepts that make programmatic incremental build systems tick.

However, the _idea_ of programmatic incremental build systems is not limited to PIE or the Rust language.
10 changes: 5 additions & 5 deletions src/4_example/index.md
@@ -1,7 +1,7 @@
# Example: Interactive Parser Development
# Project: Interactive Parser Development

To demonstrate what can be done with the programmatic incremental build system we just created, we will create a simple "parser development" example.
In this example, we can develop a grammar for a new (programming) language, and test that grammar against several example files written in the new language.
To demonstrate what can be done with the programmatic incremental build system we just created, we will develop a "parser development" build script and interactive editor as a project.
In this project, we can develop a grammar for a new (programming) language, and test that grammar against several example files written in the new language.

It will have both a batch mode and an interactive mode.
In the batch mode, the grammar is checked and compiled, the example program files are parsed with the grammar, and the results are printed to the terminal.
Expand All @@ -10,10 +10,10 @@ We will develop tasks to perform grammar compilation and parsing, and incrementa
Both batch and interactive mode will use the same tasks!

We will use [pest](https://pest.rs/) as the parser framework, because it is written in Rust and can be easily embedded into an application.
Pest uses Parsing Expression Grammars (PEGs) which are easy to understand, which is also good for this example.
Pest uses Parsing Expression Grammars (PEGs) which are easy to understand, which is also good for this project.

For the GUI, we will use [Ratatui](https://ratatui.rs/), which is a cross-platform terminal GUI framework, along with [tui-textarea](https://github.com/rhysd/tui-textarea) for a text editor widget.
We could use a more featured GUI framework like [egui](https://github.com/emilk/egui), but for this example we'll keep it simple and runnable in a terminal.
We could use a more featured GUI framework like [egui](https://github.com/emilk/egui), but for this project we'll keep it simple and runnable in a terminal.

As a little teaser, this is what the interactive mode looks like:

2 changes: 1 addition & 1 deletion src/SUMMARY.md
Expand Up @@ -18,7 +18,7 @@
- [Prevent Overlapping File Writes](./3_min_sound/5_overlap/index.md)
- [Prevent Hidden Dependencies](./3_min_sound/6_hidden_dep/index.md)
- [Prevent Cycles](./3_min_sound/7_cycle/index.md)
- [Example: Interactive Parser Development](./4_example/index.md)
- [Project: Interactive Parser Development](./4_example/index.md)

# Appendix
