Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add initial mlir-format PoC #121260

Draft
wants to merge 2 commits into
base: main
Choose a base branch
from
Draft

Conversation

Wheest
Copy link
Contributor

@Wheest Wheest commented Dec 28, 2024

This is a draft PR as a PoC of a basic formatting tool for MLIR. This was first described in this forum thread.

It does not aim to be anything close to clang-format. Instead, it uses the existing dialect printers to format IR.

An example use-case might be "I'm making some small edits to this MLIR test case, and I want it to be better formatted". Right now, users would either need to use mlir-opt (and reinsert their comments and SSA names by-hand), or do the formatting manually.

// Add two values
%x1 = arith.addf       
            %x,     %cst1 : f64

⬇️ 🪄

// Add two values
%x1 = arith.addf %x, %cst1 : f64

The two features that this tool introduces that cannot be done simply using mlir-opt are the following:

  1. retain comments
  2. retain SSA names

Feature 1 is achieved by using the original source buffer, and only replacing the character range of the formatted op. This means comments are left as-is.

Feature 2 is achieved by using the recently added NameLoc support, where we can retain identifier names for debugging. %alice = op() -> %0 = op() loc("alice"), which can then be printed again as %alice = op().

From a design perspective, this tool adapts mlir-rewrite. It creates 2 rewrite buffers, the first one it uses to insert the loc, which is then used to generate the formatted ops. The second rewrite buffer is where these formatted ops are inserted. I might be able to do it with one buffer, and there are some other code improvements that could be made.

There's a laundry list of small features that would be good to have to make this a good tool:

  1. if there are in-op comments, do not format the op
  2. Identify if we have any of the cases which are unsupported by NameLoc (e.g., %group1:2 %group2:3 = return_5vals)., if so do not format those ops.
  3. text editor integration
  4. reduce fragility of named block args
  5. Add the correct amount of indentation
  6. Add support for type aliases.

For 2 I have an idea for a workaround to enable this, for 1. we could try and get clever with reinsertion of the comment, but I'd prefer to keep the tool simple for now.

For 6, I've added a mini-parser to re-insert the original type aliases. But it might be better to allow alias printing as a general feature.

For 3., this Emacs function works, something more general, also with VScode support would be desirable.

Emacs mlir-format function

(defun mlir-format-buffer ()
  "Run the mlir-format program on the current buffer and replace it with the formatted output."
  (interactive)
  (let* ((temp-file (make-temp-file "mlir-format" nil ".mlir"))
         (output-buffer (generate-new-buffer "*mlir-format-output*"))
         (program (concat (getenv "LLVM_HOME") "/build/bin/mlir-format"))
         (args `("--mlir-use-nameloc-as-prefix" ,temp-file)))
    ;; Write the current buffer to a temporary file
    (write-region (point-min) (point-max) temp-file nil 'quiet)
    ;; Call the external program
    (if (zerop (apply #'call-process program nil output-buffer nil args))
        (progn
          ;; Replace the current buffer with the output
          (erase-buffer)
          (insert-buffer-substring output-buffer)
          (message "Buffer formatted with mlir-format."))
      (message "Error running mlir-format."))
    ;; Clean up
    (kill-buffer output-buffer)
    (delete-file temp-file)))

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

1 participant