-
Notifications
You must be signed in to change notification settings - Fork 160
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Remove Library
trait
#1422
Remove Library
trait
#1422
Conversation
Also adds validation in the constructor to ensure that the implementation of `namespace()` is valid.
…ary::write_to_dir()`
37c898c
to
64b444b
Compare
This commit fixes the issue where re-exported procedures (aliases) were not being handled properly during assembly. It now explicitly represents them in the call graph, so that topological ordering of the call graph will ensure that we only visit the aliases once the aliased procedure has been compiled/visited. As part of the solution implemented here, some refinements to `AliasTarget` were made, in order to explicitly represent whether the target is absolute or not (just like `InvocationTarget`). Additionally, to avoid confusion, `FullyQualifiedProcedureName` was renamed to `QualifiedProcedureName`, so as to make it clear that just because the path is qualified, does not mean it is absolute (and thus "fully" qualified). Some conveniences were added to `LibraryNamespace`, `LibraryPath`, and `AliasTarget` to make certain operations/conversions more ergonomic and expressive.
This commit refactors `CompiledLibrary` a bit, to remove some unnecessary restrictions leftover from the old MASL libraries: * A `CompiledLibrary` no longer has a name, but it has a content digest obtained by lexicographically ordering the exported MAST roots of the library, and merging the hashes in order. * As a consequence of being unnamed/having no namespace, a `CompiledLibrary` can now consist of procedures from many modules _and_ many namespaces. Any limitation we impose on top of that can be done via wrapper types, like how `KernelLibrary` is implemented. * Support for re-exported procedures in a `CompiledLibrary` is implemented. It is assumed that all required libraries will be provided to the `Host` when executing a program. * Some ergonomic improvements to APIs which accept libraries or sets of modules, to allow a greater variety of ways you can pass them.
Previously, we assumed that two procedures with the same MAST root, but differing numbers of locals, was a bug in the assembler. However, this is not the case, as I will elaborate on below. If you compile a program like so: ```masm use.std::u64 begin exec.u64::checked_and end ``` The resulting MAST would look something like: ```mast begin external.0x.... end ``` This MAST will have the exact same MAST root as `std::u64::checked_and`, because `external` nodes have the same digest as the node they refer to. Now, if the exec'd procedure has the same number of locals as the caller, this is presumed to be a "compatible" procedure, meaning it is fine to let both procedures refer to the same MAST. However, when the number of procedure locals _differs_, we were raising a compatible definition error, because it was assumed that due to the instructions added when procedure locals are present, two procedures with differing numbers of locals _could not_ have the same root by definition. This is incorrect, let me illustrate: ```masm export.foo.2 ... end use.foo proc.bar.3 exec.foo::foo end begin exec.foo::foo end ``` Assume that `foo.masm` is first compiled to a `CompiledLibrary`, which is then added to the assembler when assembling an executable program from `bar.masm`. Also, recall that `begin .. end` blocks are effectively an exported procedure definition, with zero locals, that has a special name - but in all other ways it is just a normal procedure. The above program is perfectly legal, but we would raise an error during assembly of the program due to the `begin .. end` block compiling to an `external` node referencing the MAST root of `foo::foo`, which has a non-zero number of procedure locals. The `bar` procedure is there to illustrate that even though it too simply "wraps" the `foo::foo` procedure, it has a non-zero number of procedure locals, and thus cannot ever have the same MAST root as a wrapped procedure with a non-zero number of locals, due to the presence of locals changing the root of the wrapping procedure. A check has been kept around that ensures we catch if ever there are two procedures with non-zero counts of procedure locals, with the same MAST root.
9ffbc80
to
ca50ab3
Compare
@plafer I broke up the commits I added into a few discrete chunks, definitely read the commit description (when present), because I go into more detail on what that commit was for. @bobbinth This is ready for your final review, and can be merged whenever IMO. Re-exported procedures now work as expected again, and are now fully supported in compiled libraries, so we're one step ahead of where we thought we'd be. Also relevant to both of you: I made some last minute changes to the way
Also, The remaining changes I made were to fix various test failures I encountered while cleaning up/validating. Virtually all of them are artifacts of needing to set up the test environment a bit differently when referencing libraries, inlining of blocks being disabled in earlier commits on this branch, etc. If you see anything that catches your eye though, let me know and I can let you know what prompted the change. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Approving the parts of this that I didn't write, I'll leave it up to one of you two to approve my changes to it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
All looks good! Thank you! I left a few comments inline - but I'll merge as is and address them in a follow-up PR.
Another general comment: I think it would be nice to have a FullyQualifiedProcedureName
type - but this may require a bigger refactoring of LibraryPath
and associated types. So, also leaving it for the future.
#[derive(Debug, Clone, PartialEq, Eq)] | ||
pub struct CompiledLibrary { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can rename this into just Library
as we don't have a concept of "uncompiled" libraries any more.
#[derive(Debug, Clone, PartialEq, Eq)] | ||
#[repr(u8)] | ||
enum Export { | ||
/// The export is contained in the [MastForest] of this library | ||
Local(MastNodeId), | ||
/// The export is a re-export of an externally-defined procedure from another library | ||
External(RpoDigest), | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is not clear to me whether we actually need to make this distinction. I think we should be able to treat them the same way (i.e., External
procedure could be represented by a single External
node in the MastForest
).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The main thing I wanted to represent here was whether the export was part of the library, or re-exported from another library. If we make the change you suggest, this information would be obscured. We don't currently expose this information, but I suspect it will come in handy when building packaging infrastructure, so that our tooling can use information like this for analysis purposes.
That said, it's not critical, so I could go either way.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we should still be able to get this info. For example, we could provide something CompiledLibrary::get_reexports() -> Vec<(QualifiedProcedureName, RpoDigest)>
. This method would just iterate over roots of the MAST forest and filter out roots which are not direct references to External
nodes.
/// Serialize to `target` using `options` | ||
pub fn write_into_with_options<W: ByteWriter>(&self, target: &mut W, options: AstSerdeOptions) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should probably rename AstSerdeOptions
into something else as these no longer refer to abstract syntax tree.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agreed, definitely feels more general now.
/// Adds the compiled library to provide modules for the compilation. | ||
pub fn add_compiled_library(&mut self, library: CompiledLibrary) -> Result<(), Report> { | ||
pub fn add_compiled_library( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Similar to one of the previous comments, I think this and similar methods could be called just add_library()
since we don't have uncompiled libraries any more.
/// TODO: add docs | ||
pub struct StdLibrary(MaslLibrary); | ||
pub struct StdLibrary(CompiledLibrary); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm wondering if we actually need to define a new struct. An alternative could to have this crate expose a single method - something like:
pub fn get_library() -> CompiledLibrary
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it is nice to use wrapper structs that contain a CompiledLibrary
, since that lets them be used in places where a CompiledLibrary
is expected, without needing to extract the actual CompiledLibrary
explicitly. In fact, I would suggest re-introducing some library traits, that looks like so:
/// Exposes `CompiledLibrary` functionality on any wrapper struct that impls AsRef,
/// but does not permit overriding the core `CompiledLibrary` APIs
///
/// Extra library-related functionality that allows customization, would be added here
pub trait Library: sealed::LibraryImpl {}
mod sealed {
/// Ensures that only types that contain a `CompiledLibrary` can implement `Library`
pub(super) trait LibraryImpl: AsRef<CompiledLibrary> {
fn digest(&self) -> RpoDigest {
self.as_ref().digest()
}
fn mast_forest(&self) -> &MastForest {
self.as_ref().mast_forest()
}
}
impl<T: AsRef<CompiledLibrary>> LibraryImpl for T {}
}
/// Marker trait for `Library` impls which are valid kernels
pub trait Kernel: Library + AsRef<KernelLibrary> {}
In virtually every place where we use CompiledLibrary
now, we actually expect something that impls AsRef<CompiledLibrary>
, but I would argue it makes sense to have a dedicated trait for this, so that we can allow custom library types to customize certain behaviors.
It would also make it possible to instantiate a static Library
impl, as opposed to constructing it every time someone calls get_library()
.
Anyway, food for thought.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One of the things I'm hoping to do is define the assembler in such a way that we construct it with a set of libraries upfront. This way, we don't need to deal with situations when libraries can be added dynamically during the assembly process (which, I think, in practice should never happen).
With Library
being a struct, we could have the following constructor for the assembler:
impl Assembler {
pub fn new(
source_manager: Arc<dyn SourceManager>,
kernel: KernelLibrary,
libraries: impl IntoIterator<Item = Library>,
) -> Result<Self, Report> {
...
}
}
We could have other, simpler constructors (e.g., with_kernel()
) which would internally call this constructor - but the main idea is that once we've instantiated the assembler, we can only add modules and assembler programs/libraries etc.
If Library
is a trait - this becomes a bit more complicated, I think.
// TODO: Add version to `Library` | ||
let _version = env!("CARGO_PKG_VERSION").parse::<Version>().expect("invalid cargo version"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should probably do this before the release.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we should probably define the semantics of library versioning first, since it isn't super clear to me just yet what one can do with a version that makes sense at the moment. The only thing useful at this point, is the digest
computed for a CompiledLibrary
, which tells you when the contents of the library has changed.
Removes the old
Library
trait andMaslLibrary
, and renamesCompiledLibrary
.