Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Generate source maps #356

Closed
wants to merge 2 commits into from
Closed

Generate source maps #356

wants to merge 2 commits into from

Conversation

Namek
Copy link
Contributor

@Namek Namek commented Jan 10, 2021

(tackles issue #184)

Source map is a piece of data for a (browser's) debugger which enables to put breakpoints in actual Mint code instead of the generated JavaScript code for arbitrary Mint application.

Technical explanation

Currently JS code is generated by concatenating strings to generate the JS code. This makes it hard to find out connections between a string represting e.g. variable name and the origins of the variable inside original Mint code. Thus, this PR changes string concatenation to building a new tree of nodes which are mainly two things:

  1. normal strings with no mapping at all
  2. objects of SourceMappable class which identify which Ast::Node is an origin of strings that those nodes contain

So, during compilation whole tree is built and then we need to build it as a string which happens in Codegen.build function (see src/utils/codegen.cr). The source map JSON itself is generated by code in src/utils/source_map.cr

Resources:

How to use

For commands compile, build and start there is a new flag boolean --source_map (-m for short).

mint start has this flag enabled by default (since it's good for development experience) so you don't have to do anything new.

Just open the website for your mint app (127.0.0.1:3000 by default). Then, in the browser's DevTools (F12) you can go to Sources tab, hit CTRL+P and type in a filename, e.g. Main.mint. The rest is just normal debugging - breakpoints and looking at variables.

Call for Review

Debugging experience is not perfect but I believe it's already valuable. I tested it only with Chrome. I would highly appreciate a review and some testing to see whether this is appropriate enough for merging.

@Namek Namek force-pushed the source-maps branch 6 times, most recently from a47ec64 to 1707fb3 Compare January 10, 2021 12:32
@gdotdesign
Copy link
Member

gdotdesign commented Jan 15, 2021

First of all thank you very much for this! 🙏 It's the biggest feature PR yet! 🎉

Since there are many files changed, and also I need to fully understand how it works so it will take a while to fully review.

I've tested it on the realworld project https://github.com/mint-lang/mint-realworld and fixed the encoder to support (it threw an error).

There are a couple of places I see breakpoints which work well like this:
breakpoints

There are a lot of places there is no breakpoint at all:
breakpoints1

I guess because it's not symbol_mapped? Or probably I just don't understand where the breakpoints show up because sometimes the "Jump to the generated location" works (sometimes it just jumps to the last jump)

I'll probably will have more questions as I am progressing.

@Namek
Copy link
Contributor Author

Namek commented Jan 17, 2021

First of all thank you very much for this! 🙏 It's the biggest feature PR yet! 🎉

<3

Since there are many files changed, and also I need to fully understand how it works so it will take a while to fully review.

👍

I've tested it on the realworld project http://github.com/mint-realworld and fixed the encoder to support (it threw an error).

Great! Some specs fail now, though.

There are a lot of places there is no breakpoint at all:
breakpoints1

We can't have breakpoints inside the JS interop code since we don't analyse it, simply copy-paste. However, I'd have to debug why there is not even a single breakpoint at where the first backtick appears (inside function). Maybe there is no return instruction and that's why.

Some explanation

I guess because it's not symbol_mapped? Or probably I just don't understand where the breakpoints show up because sometimes the "Jump to the generated location" works (sometimes it just jumps to the last jump)

So, there are 2 methods:

  • Codegen.symbol_mapped - this one includes a symbol, it should be a segment without newline characters, for example variable name, function name, argument name, etc.
  • Codegen.source_mapped - this one is kind of just a pointer to a line, for example at the beginning of if or at every Statement.

The biggest issue with mapping is that the source map format only takes positions without lengths. So if you call Codegen.source_mapped or Codegen.symbol_mapped it does take Ast::Node which includes both from and to fields but in final output only the from field is actually put to source map file. The only way to simulate to field is outputting the additional symbol which can be done by calling Codegen.symbol_mapped.

Now, some things fail.

Example 1.

For example the if/else in a function:

if (some_condition) {
  2
} else {
  3
}

would be compiled to:

return (some_condition ? 2 : 3);

Now, you would expect that it's possible to put a breakpoint at if and it should work in this sitation. You would also expect to put a breakpoint inside the true case or false case. And here's the thing - no matter what mappings we create (at least to my tests) the debugger can't put a breakpoint at number 2 or 3 because debugger doesn't work with ternary operator (the cond ? true : false syntax).

The only reason why you can put a breakpoint at if line is because we have a return statement there.

Example 2.

There is another issue, in some cases the Mint compiler puts things into IIF (immediately-invoked function). Look at this example:

try {
  predicate =
    if (some_condition) {
      2
    } else {
      3
    }

  if (predicate) {
    4
  } else {
    5
  }
}

Probably not exactly this example but in it's formula it should compile into something like:

return (() => { // try
  let a = some_condition ? 2 : 3;

  return (a ? (() => { // <--- here's our problem
    return 4;
  } : 5);
});

For the 4 inside true statement there are 2 mappings generated:

  1. exactly at the part where (() => { begins
  2. at the return 4 line

The first one is generated because it's a top view from the if/else perspective. The second is naturally what we would want to have. And because there are 2 things generated for exactly one source segment, the debugger loses a position and I think it actually makes impossible put a breakpoint in there or it moves.

Solutions

The mentioned two issues can be solved but I didn't want to go with too many changes into the compiler.

For the if/else (the Example 1) we need to output a longer code than ternary operator. The easiest approach would be an IIF with JS if/else inside. However, as the Example 2 shows this would introduce more problems with positioning because IIF introduces (() => { which is a boilerplate with an unwanted mapping which breaks stuff. I didn't want to look for all the places where we should deal with it and not output this additional source map point.

The really good solution to IIFs would be outputting JavaScript AST instead of the text directly, then process the JS AST to figure out where mappings should go. But I felt it would be a larger change to the compiler (not sure atm though).

I'll probably will have more questions as I am progressing.

Sure

@Namek Namek force-pushed the source-maps branch 2 times, most recently from 3086b1a to 975b212 Compare January 31, 2021 02:15
@Namek
Copy link
Contributor Author

Namek commented Jan 31, 2021

I've just resolved the conflicts. I have overwritten the branch after rebasing onto master because I don't like merging master to feature branch. Reasoning about conflicts is much harder when you don't keep the chronological order of commits in line.

@Namek Namek force-pushed the source-maps branch 3 times, most recently from f055213 to b1b704c Compare February 2, 2021 01:04
@Namek
Copy link
Contributor Author

Namek commented Feb 5, 2021

@Sija please have a look at my comment above.

@Sija
Copy link
Member

Sija commented Feb 5, 2021

@Namek My bad, excuse my hastiness 🙇‍♂️ You can still remove it and rebase it manually if u like :)

@gdotdesign
Copy link
Member

Sorry for taking this long to check this, will do in the next couple of days, and it will definitely be part of the next release!

@Namek
Copy link
Contributor Author

Namek commented Apr 21, 2021

@gdotdesign no problem, didn't have much time myself. I will rebase and solve the conflicts this weekend

@Sija Sija added enhancement New feature or request tooling Tooling related feature (formatter, documentation, production builder) labels Apr 25, 2021
@Namek Namek force-pushed the source-maps branch 4 times, most recently from 36f3317 to 7d19afd Compare April 25, 2021 22:54
@Namek
Copy link
Contributor Author

Namek commented Apr 25, 2021

@gdotdesign It's rebased again :)

def self.empty?(node : Node)
case node
in NodeContainer
[email protected]? { |n| empty?(n) }
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please avoid accessing @Ivars from outside of the object, use getters for such properties.

src/compiler.cr Outdated
end

def compile(nodes : Array(Ast::Node))
nodes.map { |node| compile(node).as(String) }.reject!(&.empty?)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd suggest implementing Codegen::Node#empty? instead.

@Namek
Copy link
Contributor Author

Namek commented Apr 26, 2021

@Sija thanks for review, keep posting when you spot bad practices. I will refactor such stuff when general algorithm gets accepted by @gdotdesign Beause maybe there's a better approach to achieve source mapping or this one somehow destroys idea of how code is generated. Also I am not sure about performance since I didn't benchmark-compared it.

@gdotdesign
Copy link
Member

I've taken a deeper look mostly on the mapping side, I've played around with adding Codegen.source_mapped to some places and it added a lot of break points to the debugger. Then I realized that all of the nodes go through the main compile method here https://github.com/Namek/mint/blob/source-maps/src/compiler.cr#L39 so changing it to this:

def compile(node : Ast::Node) : Codegen::Node
  if checked.includes?(node)
    Codegen.source_mapped(node, _compile(node)).as(Codegen::Node)
  else
    ""
  end
end

added most of the breakpoints that I was missing. There are some cases where it still needs manually adding the mapping but it's a lot better then it was.

Now I'm wondering if it would be sufficient to only have that, but I'm guessing not.

On the side of understanding the actual implementation of the SourceMap and VLQ I need to spend some more time. It would be helpful if there would be some spec for them I would make sense of it quicker (maybe I'll add them while I go through).


In general I think this adds a lot of value in it's current state so we should merge it. I feel that a refactor of the compiler will happen in the future and with that we can create a better architecture for this as well.

@Sija
Copy link
Member

Sija commented May 9, 2021

On the side of understanding the actual implementation of the SourceMap and VLQ I need to spend some more time. It would be helpful if there would be some spec for them I would make sense of it quicker (maybe I'll add them while I go through).

fwiw I've found some possibly useful materials:

end
end

src_from_line, src_from_col = file.get_line_column_position src_from
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This can be refactored to use the Ast::Node#location instead (added in #443).

@Sija
Copy link
Member

Sija commented Jul 3, 2021

@Namek Hi, could you please resolve the merge conflicts, so this could be possibly merged for the upcoming (0.15.0) release?

@gdotdesign
Copy link
Member

Rebased from master and tested the build times of https://github.com/mint-lang/mint-ui-website (27K LOC) and it slows down compiling considerably:

source-maps branch:

mint-dev build  25,12s user 0,20s system 104% cpu 24,136 total
mint-dev build --source-map  24,02s user 0,23s system 104% cpu 23,126 total

master branch:

mint-dev build  3,92s user 0,25s system 154% cpu 2,691 total

The development server is affected as well (obviously), so before merging it we need to see what where the bottleneck is.

@gdotdesign
Copy link
Member

gdotdesign commented Jul 5, 2021

The culprit seems to be the IndentedStringBuilder using String.build instead the build times return to normal.

@Namek any idea why it could be slow?

@Namek
Copy link
Contributor Author

Namek commented Jul 6, 2021

hey guys, big thanks for continuation of the work on this feature. Unfortunately, some unforeseen stuff happened in my life so I didn't have much time during last months. However, I'm ok now, so don't worry :)

The culprit seems to be the IndentedStringBuilder using String.build instead the build times return to normal.

@Namek any idea why it could be slow?

@gdotdesign well, I only remember I wrote that class because it covers a lot of small cases. One was actually a fix covered by change in tests with incorrect indentation. So during code generation you'd have to fall back from indentation in current line, then add empty line, then start with correct indentation size on the next line.

Where exactly did you replace using String.build? Didn't the test fail?

One suggestion I'd have for now would be to reduce the initial size of buffer in line 6 (inside initialize method): @str_builder = String::Builder.new(16384)

I picked 16k because I was probably thinking about size of a cacheline in CPU. But maybe Garbage Collection hurts too much because of this? Anyway, I wanted to pick the number big enough so I wouldn't have too many resizes of a buffer. However, I didn't benchmark it, I only knew it had to be done because I've seen the performance drop in there.

@gdotdesign
Copy link
Member

Where exactly did you replace using String.build? Didn't the test fail?

I didn't run the tests, just wanted to find what slows it down 😂

I picked 16k because I was probably thinking about size of a cacheline in CPU. But maybe Garbage Collection hurts too much because of this? Anyway, I wanted to pick the number big enough so I wouldn't have too many resizes of a buffer. However, I didn't benchmark it, I only knew it had to be done because I've seen the performance drop in there.

Thanks I'll try that 😉

@listepo
Copy link

listepo commented Mar 15, 2024

any news?

@gdotdesign
Copy link
Member

any news?

The compiler is being rewritten, so this PR will not be compatible with the new one. We will see if we can use parts of this for the new compiler.

@listepo
Copy link

listepo commented Nov 9, 2024

I think this should be a priority one

@gdotdesign gdotdesign mentioned this pull request Nov 9, 2024
4 tasks
@gdotdesign
Copy link
Member

I think this should be a priority one

I agree, that's why I started working on it a couple of days ago. See #702

@gdotdesign
Copy link
Member

gdotdesign commented Nov 9, 2024

Superseded by #702

@gdotdesign gdotdesign closed this Nov 9, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request tooling Tooling related feature (formatter, documentation, production builder)
Development

Successfully merging this pull request may close these issues.

4 participants