Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Draft: Dialect generation from ODS #274

Merged
merged 4 commits into from
Aug 17, 2023
Merged

Conversation

Danacus
Copy link
Contributor

@Danacus Danacus commented Aug 14, 2023

I've created an early draft of dialect generation based on the MLIR Python Bindings.

It's okay if you don't have the time to review this. I admit that there's quite a lot of ugly and hard to read code (especially the code dealing with variadic arguments), and it might not be in a state currently where you would want to maintain it within melior, hence I'm marking it as a draft.

Not much changed since my original implementation mentioned in #262, but I got a bit stuck on error handling in my TableGen wrapper library (tblgen-rs) and lost motivation for a while after that. Anyway, I decided to finish what I started, hence I am making this draft.

To build this branch, you might need to set TABLEGEN_160_PREFIX to match MLIR_SYS_160_PREFIX.

I've added the generated dialects in a new dialect_gen module for now, such that they can be compared with the original hand-written bindings in the dialect module.

There are still a few issues:

  • Some parts of the code are hacky and ugly, and may be hard to read.
  • Type inference is not always detected (but it should be as good as the Python bindings at least)
  • Need to add tests (already have them in a separate repository)
  • Need to fix some issues with the CI

I wanted complete parity with the existing hand-written dialect bindings, but there are some things that aren't generated as nicely. For example, arith::CmpiPredicate is not generated and plain Attribute is used instead for arith::cmpi. It might be feasible to generate dialect specific attributes from ODS instead. Or perhaps being able to write some function manually would be useful.

Currently I generate wrapper types around Operation that provide additional methods, for example:

        pub struct AddIOp<'c> {
            operation: ::melior::ir::operation::Operation<'c>,
        }
        impl<'c> AddIOp<'c> {
            pub fn name() -> &'static str {
                "arith.addi"
            }
            pub fn operation(&self) -> &::melior::ir::operation::Operation<'c> {
                &self.operation
            }
            pub fn builder(
                location: ::melior::ir::Location<'c>,
            ) -> AddIOpBuilder<'c, AddIOp__No__Lhs, AddIOp__No__Rhs> {
                AddIOpBuilder::new(location)
            }
            pub fn result(&self) -> ::melior::ir::operation::OperationResult<'c, '_> {
                self.operation.result(0usize).expect("operation should have this result")
            }
            pub fn lhs(&self) -> ::melior::ir::Value<'c, '_> {
                self.operation
                    .operand(0usize)
                    .expect("operation should have this operand")
            }
            pub fn rhs(&self) -> ::melior::ir::Value<'c, '_> {
                self.operation
                    .operand(1usize)
                    .expect("operation should have this operand")
            }
        }

I then provide implementations of Into<Operation> and TryFrom<Operation> to "cast" to and from an Operation.

        impl<'c> TryFrom<::melior::ir::operation::Operation<'c>> for AddIOp<'c> {
            type Error = ::melior::Error;
            fn try_from(
                operation: ::melior::ir::operation::Operation<'c>,
            ) -> Result<Self, Self::Error> {
                Ok(Self { operation })
            }
        }
        impl<'c> Into<::melior::ir::operation::Operation<'c>> for AddIOp<'c> {
            fn into(self) -> ::melior::ir::operation::Operation<'c> {
                self.operation
            }
        }

I wonder if it would be better to use an OperationLike trait, similar to AttributeLike? That way we wouldn't have to call operation or into to use the methods on Operation.

On a related note: should we also generate Ref (and RefMut?) types for these operation wrappers? It might be useful to be able to cast a OperationRef into a AddIOpRef, for example, to more easily analyze operations in external passes (or even external analyses, which is something else I'm working on as well: mlir-rust-tools).

@raviqqe
Copy link
Owner

raviqqe commented Aug 16, 2023

Thank you so much for the huge work! It's gonna take a few days to review it probably. So please wait patiently. 😄

@raviqqe
Copy link
Owner

raviqqe commented Aug 16, 2023

I wanted complete parity with the existing hand-written dialect bindings, but there are some things that aren't generated as nicely. For example, arith::CmpiPredicate is not generated and plain Attribute is used instead for arith::cmpi

This is fine for now. What we should do first is to make this implementation of dialects usable. We can deal with the issues later.

@raviqqe
Copy link
Owner

raviqqe commented Aug 16, 2023

I wonder if it would be better to use an OperationLike trait, similar to AttributeLike? That way we wouldn't have to call operation or into to use the methods on Operation.

Yeah, I think that would be a better solution. But it doesn't have to be a scope of this PR.

@raviqqe
Copy link
Owner

raviqqe commented Aug 16, 2023

On a related note: should we also generate Ref (and RefMut?) types for these operation wrappers? It might be useful to be able to cast a OperationRef into a AddIOpRef, for example, to more easily analyze operations in external passes (or even external analyses, which is something else I'm working on as well: mlir-rust-tools).

If you think they are useful, we definitely can! Actually, I have little experience in modifying operations built already. For the RefMut stuff, as described in #24, I haven't decided yet on the best solution (the list item of "Mutable operations vs dynamic checks with RefCell (vs silently potentially unsafe mutability).") So we can use Ref for now.

@raviqqe
Copy link
Owner

raviqqe commented Aug 16, 2023

It's already a pretty big PR. We can merge this one first, fix the build, tests, and CI, feature-flag these auto-generated dialects, and then improve the codes and features gradually. What do you think?

@Danacus
Copy link
Contributor Author

Danacus commented Aug 16, 2023

On a related note: should we also generate Ref (and RefMut?) types for these operation wrappers? It might be useful to be able to cast a OperationRef into a AddIOpRef, for example, to more easily analyze operations in external passes (or even external analyses, which is something else I'm working on as well: mlir-rust-tools).

If you think they are useful, we definitely can! Actually, I have little experience in modifying operations built already. For the RefMut stuff, as described in #24, I haven't decided yet on the best solution (the list item of "Mutable operations vs dynamic checks with RefCell (vs silently potentially unsafe mutability).") So we can use Ref for now.

I can't say I have much experience with modifying MLIR operation either, I have mostly been hacking the LLVM backend, and MLIR just caught my interest. Now I'm wondering how and if similar things could be achieved in MLIR, and if I can use Rust for this. But some of the things I'm trying to achieve are likely out-of-scope for the MLIR C API and/or melior.

I also don't really have enough experience with Rust and ffi to know how to properly deal with mutability accross language boundaries (e.g. what if C++ code in MLIR is also holding a mutable pointer? is are mutable reference no longer valid then? can you ever be sure?).

It's already a pretty big PR. We can merge this one first, fix the build, tests, and CI, feature-flag these auto-generated dialects, and then improve the codes and features gradually. What do you think?

I think that's a great idea! That way PR's can have a more limited scope, making them easier to review and polish. I kind of feel bad for making such a large PR. Once I get started on something, I can't seem to stop myself 😅 Feel free to tell me when a PR is too big, or when it is not written well enough, or if there is any other reason you would prefer to not merge something. I don't mean to pressure you in any way.

@Danacus
Copy link
Contributor Author

Danacus commented Aug 16, 2023

CI is failing because the documentation of dialect operations is generated from the TableGen files, which may contain some code blocks. Those code blocks are assumed to be valid rust code, although they do not contain rust code at all. There doesn't seem to be a nice way to deal with this right now (see rust-lang/rust#59867). The only solutions I can find are:

  • Hack the description strings to replace ``` with ```ignore
  • Disable the module in doctests using #[cfg(not(doctest))], but then any example in docs that references the dialect module will also fail to build.
  • Disable all doctests, but that decreases test coverage
  • Remove the autogenerated docs for now and solve this problem in a later PR.

macro/Cargo.toml Show resolved Hide resolved

// Writes `tablegen_compile_commands.yaml` for any TableGen file that is being parsed.
// See: https://mlir.llvm.org/docs/Tools/MLIRLSP/#tablegen-lsp-language-server--tblgen-lsp-server
fn emit_tablegen_compile_commands(td_file: &str, includes: &[String]) {
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure if I understand why this is built here. I don't see any .td files committed or generated. How do you use the database file in your editor? Do you view the LLVM install directory from the crate's top directory?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You should probably remove it. I was creating some TableGen files in a separate directory and including them using the dialect! macro, but it doesn't really belong here.

def: Record<'a>,
}

#[allow(unused)]
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How are you expecting this to be used?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These structs are used, there are just some functions that on these structs that aren't used. The attribute should be moved down, or these unused functions should be removed. Most of these structs are partial reimplementations of some classes in MLIR (e.g. https://mlir.llvm.org/doxygen/classmlir_1_1tblgen_1_1TypeConstraint.html), since it was easier to reimplement parts than to create bindings.

The operands and results in ODS operations are TypeConstraints, and attributes are AttributeConstraints.

macro/src/dialect/error.rs Show resolved Hide resolved
#[derive(Debug, Clone, Copy)]
pub struct TypeConstraint<'a>(Record<'a>);

#[allow(unused)]
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These contraints too.

@raviqqe raviqqe merged commit f9eece9 into raviqqe:main Aug 17, 2023
9 of 10 checks passed
@raviqqe
Copy link
Owner

raviqqe commented Aug 17, 2023

Are there any reasons you used panics rather than results in the codes especially in the macro crate?

Syn(syn::Error),
TableGen(tblgen::Error),
ExpectedSuperClass(SourceError<ExpectedSuperClassError>),
ParseError,
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why don't we directly propagate the errors from the tblgen crate?

}
}

impl From<Error> for syn::Error {
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is this implementation for?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it's a leftover from when the code was in a separate crate. I used to convert all errors to syn::Error and used syn::Error::into_compile_error. I guess this also why I made a custom error type, but it can probably be removed now.

@Danacus
Copy link
Contributor Author

Danacus commented Aug 17, 2023

Are there any reasons you used panics rather than results in the codes especially in the macro crate?

Do you mean in the macro code itself, or in the generated code. If you mean the macro code itself, those should probably be replaced with results. My code initially didn't use results, but there are still some unwraps left that should be removed.

For the generated code, I'm assuming that each "typed" operation from a dialect is valid according to the dialect spec. I was thinking about verifying this in the TryInto implementations. However, it would still be possible that these operations are modified after they are verified such that they are no longer valid according to the dialect, in which case some of those accessor functions might panic, which isn't ideal.

@Danacus
Copy link
Contributor Author

Danacus commented Aug 17, 2023

Thank you for reviewing this PR!

@raviqqe
Copy link
Owner

raviqqe commented Aug 17, 2023

Hack the description strings to replace with ignore

I've marked all the code blocks with empty info strings text. Now, build, tests, and linting are all enabled on the CI with the ods-dialects feature flag.

@Danacus
Copy link
Contributor Author

Danacus commented Aug 17, 2023

I've marked all the code blocks with empty info strings text. Now, build, tests, and linting are all enabled on the CI with the ods-dialects feature flag.

I had something similar in mind, but I forgot regex was a thing. I'm a little worried that there might be some non-markdown descriptions in some dialects, since I don't know how strict they are about this, but I guess it works for now.

For the RefMut stuff, as described in #24, I haven't decided yet on the best solution (the list item of "Mutable operations vs dynamic checks with RefCell (vs silently potentially unsafe mutability).") So we can use Ref for now.

I've been thinking about this, and I think I understand the problem now. When borrowing a Block for append_operation, ownership of the operation is moved into the block and a reference is returned, but this reference is tied to the borrow of the Block, so Block remains borrowed for as long as the operation reference exists.

If you would borrow Block mutably, the returned reference to the operation would be tied to the mutable borrow of block, meaning that we cannot access the block while the reference to the operation is live, which would be highly inconvenient.

Borrowing immutably from Block is slightly unsafe, since you do mutate the block to append the operation.

I'll think a bit about this issue, because I find it interesting, but I don't think I'll be able to come up with a better solution than anything you have in mind.

(I'm sorry if this is a bit off-topic)

@raviqqe
Copy link
Owner

raviqqe commented Aug 17, 2023

Do you mean in the macro code itself, or in the generated code. If you mean the macro code itself, those should probably be replaced with results. My code initially didn't use results, but there are still some unwraps left that should be removed.

For the generated code, I'm assuming that each "typed" operation from a dialect is valid according to the dialect spec. I was thinking about verifying this in the TryInto implementations. However, it would still be possible that these operations are modified after they are verified such that they are no longer valid according to the dialect, in which case some of those accessor functions might panic, which isn't ideal.

I see. In that case, I think we should prefer Results in both macros and "typed" operations. For the "typed" operations, we can also restrict the conversion from "typed" operations to generic operations by moving them like:

impl From<FooOperation> for Operation { 
  fn from(operation: FooOperation) -> Operation {
    // ...
  }
}

But I'm not sure if it's possible yet.

}

impl<'c> Into<::melior::ir::operation::Operation<'c>> for #class_name<'c> {
fn into(self) -> ::melior::ir::operation::Operation<'c> {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We already have impl From<FooOperation> for Operation through this impl of Into. I think From is preferred over Into though, but I'm not sure if we can implement From for melior::ir::Operation in foreign crates that might use this dialect! macro as well, which is why I chose to implement Into.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Apparently I was wrong about Into providing From, it's only the other way around. And I just read orphaning rules have become more relaxed, and you can implement From on foreign types now. This should be replaced with an impl of From, my bad!

raviqqe added a commit that referenced this pull request Aug 18, 2023
As previously suggested in #274, I replaced the `.expect` with returning
a `Result` in the operation accessors of ODS generated dialects.

I also added some variants to the `Error` enum to support this. The
`Infallible` variant was added to allow using `TryInto` to convert
`Attribute` into itself, such that we avoid needing to handle
`Attribute` differently from `StringAttribute`, `IntAttribute`, etc. But
if you prefer, I can change the macro code to deal with this case
instead, I'm honestly not a big fan of my `Infallible` hack either.

---------

Co-authored-by: Yota Toyama <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants