Add initial `mlir-format` PoC #121260

Wheest · 2024-12-28T10:29:35Z

This is a draft PR as a PoC of a basic formatting tool for MLIR. This was first described in this forum thread.

It does not aim to be anything close to clang-format. Instead, it uses the existing dialect printers to format IR.

An example use-case might be "I'm making some small edits to this MLIR test case, and I want it to be better formatted". Right now, users would either need to use mlir-opt (and reinsert their comments and SSA names by-hand), or do the formatting manually.

// Add two values
%x1 = arith.addf       
            %x,     %cst1 : f64

⬇️ 🪄

// Add two values
%x1 = arith.addf %x, %cst1 : f64

The two features that this tool introduces that cannot be done simply using mlir-opt are the following:

retain comments
retain SSA names

Feature 1 is achieved by using the original source buffer, and only replacing the character range of the formatted op. This means comments are left as-is.

Feature 2 is achieved by using the recently added NameLoc support, where we can retain identifier names for debugging. %alice = op() -> %0 = op() loc("alice"), which can then be printed again as %alice = op().

From a design perspective, this tool adapts mlir-rewrite. It creates 2 rewrite buffers, the first one it uses to insert the loc, which is then used to generate the formatted ops. The second rewrite buffer is where these formatted ops are inserted. I might be able to do it with one buffer, and there are some other code improvements that could be made.

There's a laundry list of small features that would be good to have to make this a good tool:

if there are in-op comments, do not format the op
Identify if we have any of the cases which are unsupported by NameLoc (e.g., %group1:2 %group2:3 = return_5vals)., if so do not format those ops.
text editor integration
reduce fragility of named block args
Add the correct amount of indentation
Add support for type aliases.

For 2 I have an idea for a workaround to enable this, for 1. we could try and get clever with reinsertion of the comment, but I'd prefer to keep the tool simple for now.

For 6, I've added a mini-parser to re-insert the original type aliases. But it might be better to allow alias printing as a general feature.

For 3., this Emacs function works, something more general, also with VScode support would be desirable.

Emacs mlir-format function

(defun mlir-format-buffer ()
  "Run the mlir-format program on the current buffer and replace it with the formatted output."
  (interactive)
  (let* ((temp-file (make-temp-file "mlir-format" nil ".mlir"))
         (output-buffer (generate-new-buffer "*mlir-format-output*"))
         (program (concat (getenv "LLVM_HOME") "/build/bin/mlir-format"))
         (args `("--mlir-use-nameloc-as-prefix" ,temp-file)))
    ;; Write the current buffer to a temporary file
    (write-region (point-min) (point-max) temp-file nil 'quiet)
    ;; Call the external program
    (if (zerop (apply #'call-process program nil output-buffer nil args))
        (progn
          ;; Replace the current buffer with the output
          (erase-buffer)
          (insert-buffer-substring output-buffer)
          (message "Buffer formatted with mlir-format."))
      (message "Error running mlir-format."))
    ;; Clean up
    (kill-buffer output-buffer)
    (delete-file temp-file)))

Wheest added 2 commits December 28, 2024 10:18

Add initial mlir-format PoC

3eebcfd

Add support for type alias via ad-hoc parser

68ae2fa

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Add initial `mlir-format` PoC #121260

Add initial `mlir-format` PoC #121260

Wheest commented Dec 28, 2024 •

edited

Loading

Add initial mlir-format PoC #121260

Are you sure you want to change the base?

Add initial mlir-format PoC #121260

Conversation

Wheest commented Dec 28, 2024 • edited Loading

Add initial `mlir-format` PoC #121260

Add initial `mlir-format` PoC #121260

Wheest commented Dec 28, 2024 •

edited

Loading