Incremental compilation model and UX improvements #716

Open
thoughtpolice opened this issue Jul 18, 2024 · 0 comments
Note: this can be seen as a follow-on to #165, because my ultimate goal is good incremental rebuilds and fast iteration times.

The documentation isn't very clear about how to do manual incremental compilation, i.e. compiling individual modules in a dependency graph one at a time and reusing the results in later builds, rather than using -u to recalculate dependencies automatically.

In short, I want to know how to build a set of Bluespec files into a "linkable object" (whatever that is, even a collection of .bo files in a directory) manually, and then reuse those in another project, and then emit Verilog or Bluesim models from that. For example, I might have a collection of .bsv files that implement AXI4, then a collection of files that implements a DDR controller, then a collection that implements a soft CPU, then a single Top file that ties them all together.

I think this is a really important feature for integration of bsc into a larger project. Most projects do not need it, but some really need it. In this case, I am using Buck2 which is a very granular and powerful build system that supports multiple languages, precisely so I can express relationships between RTL/C++/whatever. Bluespec is actually just one small part of the build graph.

However, to get the best results, I need to use Buck to compile individual source files into binary objects and reuse them, not compile whole packages at once with bsc -u.

How it works today

My understanding of how it works today is this:

  • Every .bs or .bsv file becomes one .bo file.
  • Use makedepend.tcl to get the ordered relationships between .bo files.
  • Run bsc -bdir $BDIR Module.bsv in the right order given by makedepend.tcl, which produces a bunch of .bo files.
  • You can distribute these .bo files as a "library" of Bluespec code.
  • Now your users can use -p path/to/bo/directory:+ in order to import those modules.
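The ordering step above can be sketched as a plain topological sort over the module import graph. This is an illustrative Python sketch with a hypothetical graph standing in for what makedepend.tcl would report; it is not bsc's actual tooling:

```python
from graphlib import TopologicalSorter

# Hypothetical import graph: each module maps to the set of
# modules it imports (its compile-time predecessors).
imports = {
    "Top":     {"AXI4", "DDRCtrl", "SoftCPU"},
    "SoftCPU": {"AXI4"},
    "DDRCtrl": {"AXI4"},
    "AXI4":    set(),
}

# TopologicalSorter takes node -> predecessors, which is exactly
# "compile my imports before me".
order = list(TopologicalSorter(imports).static_order())
print(order)  # AXI4 first, Top last

# Today's flow: run bsc once per module, in this order.
for mod in order:
    print(f"bsc -bdir $BDIR {mod}.bsv")
```

Each `bsc -bdir $BDIR Module.bsv` step then only ever reads .bo files that an earlier step has already produced.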

You can use this process recursively to build binary libraries (directories of .bo files) that depend on binary libraries. That works fine, I think. I know how to get this working in Buck2. But now there are two main questions:

  • How do I generate bluesim models from a set of .bo files?
  • How do I generate verilog from a set of .bo files?

The reason we're generating from a set of .bo files is that we just compiled them! Furthermore, in Buck, it's convenient to separate the "build .bo files from source" and "build .v files from .bo files" steps, because the build graph can then capture that granularity.

Right now, I don't understand how to do this. It seems like the entire UX requires you to specify an actual source file with a synthesize attribute if you want to generate Verilog from something, for instance. Shouldn't you be able to do this from the .bo file, given that you just finished compiling it?

Actually, it does seem like it's possible to do this with bluesim:

$ bsc -vdir vo -bdir bo -sim Core.bsv 
$ bsc -vdir vo -bdir bo -sim SOC.bsv 
$ bsc -vdir vo -bdir bo -sim Moon.bsv 
Elaborated module file created: bo/moon.ba
Elaborated module file created: bo/soc.ba
$ bsc -vdir vo -bdir bo -sim -e soc

In this example we can run the last command purely with .bo files as inputs. However, this seems to be based on the .ba file, and the .ba file still has to be generated from the source code in the .bsv compilation step. I see no way to generate a .ba file from a .bo file after the fact.

Is there a way to do the same with Verilog and Bluesim consistently?

Hypothetical UX

Basically, I think once you generate a set of .bo files, you need to be able to produce:

  • .v files from top-level synthesize modules
  • .ba files, similarly

That means the first step in all flows is always compiling a Bluespec module to its corresponding .bo file, which is good because it gives a simple understanding of what "compile" means, and it ensures that by default everything typechecks and is valid.

You also need a "Single shot" mode which can produce the same outputs as the above "automatic" recompilation flow. In "single shot" mode, you need more control over the inputs and outputs to each command:

  • First, you should be able to specify the imported .bo files on the command line, rather than a -bdir flag and letting the implicit dependency resolution handle it.
  • Second, you should be able to specify the output .bo file directly, too. Actually, all output files should be specified manually. You can always move them into place (i.e. a single directory) later on.

This design is much more convenient for systems like Buck and others because they want to invoke commands directly with paths to outputs they produce, and then track that. This is what lets them build such precise dependency graphs between all input files. If you instead put everything in an opaque directory, then you have to run commands to populate the directory in the right order first, and fixing that is more annoying than just writing the output file directly.

So maybe we could imagine a set of steps like:

mkdir -p out/lib1
bsc -c -o $PWD/out/lib1/Basic1.bo \
  $PWD/lib1/Basic1.bsv 
bsc -c -o $PWD/out/lib1/Basic2.bo \
  $PWD/out/lib1/Basic1.bo \
  $PWD/lib1/Basic2.bsv 

This assumes that Basic2 imports Basic1. Note that:

  • Every .bo file is compiled in "Single-shot" mode via -c; this is the same flag used by other compilers like GCC and GHC.
  • Every .bo output file is manually specified via -o. Why? Because in -c mode it's obvious that -o means the .bo file, since that's the main output.
  • Basic1.bo is passed as an input when we compile Basic2, because we import it.
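A build system would synthesize these single-shot command lines mechanically from the import graph. Here is a small Python sketch of that; the -c/-o spelling is the hypothetical UX proposed in this issue, not current bsc behaviour, and the graph and paths are made up:

```python
# Hypothetical intra-library import graph: module -> direct imports.
imports = {
    "Basic1": [],
    "Basic2": ["Basic1"],
}

def single_shot_cmd(mod, outdir="out/lib1", srcdir="lib1"):
    """Build the proposed single-shot bsc invocation for one module:
    the .bo files of its direct imports as inputs, then the source file."""
    import_bos = [f"{outdir}/{dep}.bo" for dep in imports[mod]]
    return ["bsc", "-c", "-o", f"{outdir}/{mod}.bo", *import_bos,
            f"{srcdir}/{mod}.bsv"]

print(" ".join(single_shot_cmd("Basic2")))
```

Because every input and output appears explicitly in the argument list, a build system like Buck2 can register each file as a tracked artifact without ever inspecting an output directory.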

Now, once you have a set of .bo files, you could produce a .v file, given you know a module which is exported as synthesize:

bsc -verilog -o output.v -g module \
  $PWD/out/lib1/Basic1.bo \
  $PWD/out/lib1/Basic2.bo

Note that the -p flag would behave basically the same as it does today in all of these examples; it could be used to implicitly resolve imports, like it does for the Prelude right now. For example, maybe you could add -p my/other/lib:+ and then magically import Foobar from it, and it would work just like today.

So rather than specifying extra .bo files, you could specify -p directories to find imports. That matters more when importing one library in order to compile another. When compiling modules inside a library, however, it is very convenient to take .bo files on the command line directly, as in the example above, because it's easier for a build system to model these intra-library dependencies precisely than to treat an opaque directory of random blobs as one input. For example, suppose a module M has 5 transitive imports but the library's total dependency graph contains 100 modules: the build system would have to copy around 100 .bo files instead of just 5 in order to compile M.

Also, in the future, Buck2 will be able to express fine-grained dependencies across libraries, so in theory the use of -p might one day not be necessary at all. (For example, if library B depends on A, and you modify a module in A that B does not import, then B doesn't need to be recompiled. That is only possible if the build system can see the full module dependency graph.)
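The "5 of 100" point is just a transitive closure over the import graph. A sketch, again with a hypothetical graph, of how a build system would compute exactly which .bo files module M needs:

```python
# Hypothetical library import graph: module -> direct imports.
# "Unrelated" is in the same library but not reachable from M.
imports = {
    "M":         ["A", "B"],
    "A":         ["C"],
    "B":         ["C"],
    "C":         [],
    "Unrelated": ["C"],
}

def transitive_imports(mod):
    """Depth-first walk collecting every module reachable from mod."""
    seen, stack = set(), list(imports[mod])
    while stack:
        dep = stack.pop()
        if dep not in seen:
            seen.add(dep)
            stack.extend(imports[dep])
    return seen

# Only these .bo files need to be staged to compile M.
print(sorted(transitive_imports("M")))  # ['A', 'B', 'C'] -- no 'Unrelated'
```

Shipping only this closure, rather than the whole library directory, is what makes per-module dependency tracking pay off in a system like Buck2.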

Is the above design reasonable?

I'm not sure how much the above makes sense, but it would make it much easier to model Bluespec builds in Buck2, at least, and probably Bazel and other Make-based systems as well.

It would also mean that once the .bo files are generated, Bluesim executables and Verilog could always be produced from them, which could improve build speed a lot by not recompiling everything under the -verilog and -sim options.

Prior art

This proposal is actually very similar to how OCaml handles incremental compilation: you typically run ocamldep, then compile files incrementally in the order it specifies. When a compilation step produces an output, you pass it directly on the command line as an input to any later step that needs it. It's very similar to the above in that sense.

This design requires that the build system is able to run makedepend and then incrementally compile files in the proper order. In Buck2, these cases are handled naturally by so-called "dynamic dependencies".
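The "run makedepend, then compile in order" handshake amounts to parsing dependency output into a graph the build system can walk. A sketch, using an ocamldep-style "Target: deps" format that is illustrative rather than bsc's actual output:

```python
# Hypothetical makedepend-style output: one "Target.bo: deps..." per line.
depend_output = """\
AXI4.bo:
DDRCtrl.bo: AXI4.bo
Top.bo: AXI4.bo DDRCtrl.bo
"""

def parse_depends(text):
    """Parse 'Target.bo: Dep1.bo Dep2.bo' lines into target -> deps."""
    graph = {}
    for line in text.splitlines():
        target, _, deps = line.partition(":")
        graph[target.strip()] = deps.split()
    return graph

graph = parse_depends(depend_output)
print(graph)
```

In Buck2, this parse would happen inside a dynamic-dependency rule, with the resulting graph driving the per-module compile actions.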

Other notes

The above design would also make it trivially possible to re-compile the stdlib within the Buck2 dependency graph, which would be handy to express for the most fine-grained builds possible.

Can this be achieved with bluetcl?

Basically, the above. If bluetcl can be used to hack all of this together into a consistent CLI, that would honestly be amazing.

Even if I can only do it for Verilog, that's tolerable, because at least I can simulate the synthesized netlist afterwards.
