Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: Rule Performance Statistics #108

Merged
merged 33 commits into from
Oct 11, 2023
Merged

feat: Rule Performance Statistics #108

merged 33 commits into from
Oct 11, 2023

Conversation

mnkiefer
Copy link
Contributor

@mnkiefer mnkiefer commented Mar 13, 2023

Summary

This document describes how to expose ESLint's TIMING information as well as collect additional runtime statistics to facilitate monitoring of rule/plugin development.

Related Issues

eslint/eslint#16690

@linux-foundation-easycla
Copy link

linux-foundation-easycla bot commented Mar 13, 2023

CLA Signed

The committers listed above are authorized under a signed CLA.

@nzakas nzakas added Initial Commenting This RFC is in the initial feedback stage and removed triage labels Mar 14, 2023
Copy link
Member

@nzakas nzakas left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for putting this together. I've left a few comments on the approach.

designs/2023-rule-performance-dashboard/README.md Outdated Show resolved Hide resolved
designs/2023-rule-performance-dashboard/README.md Outdated Show resolved Hide resolved
designs/2023-rule-performance-dashboard/README.md Outdated Show resolved Hide resolved
designs/2023-rule-performance-dashboard/README.md Outdated Show resolved Hide resolved
@bradzacher
Copy link

Something I'd love is more granular performance data per rule - I.e. Breakdown time spent in create, parsing + evaluating selectors, and time spent in selector callbacks.

Currently the timing only displays the sum total of time spent in create plus selector callbacks.

Right now the time spent spent in esquery is not tracked - which means that rules can appear faster by moving logic to a selector - which can fool users into thinking a rule is faster than it is.

Time spent in a rule's create would be good to separate out because that space is used to do per-file init. Sometimes rules do expensive things that could be hoisted to the module-level, but it's currently hard to pick up on that.

There's also pre-processor performance and parser performance that isn't tracked - which would be good to see in more complex project setups.

designs/2023-rule-performance-dashboard/README.md Outdated Show resolved Hide resolved
designs/2023-rule-performance-dashboard/README.md Outdated Show resolved Hide resolved
designs/2023-rule-performance-dashboard/README.md Outdated Show resolved Hide resolved
designs/2023-rule-performance-dashboard/README.md Outdated Show resolved Hide resolved
@nzakas
Copy link
Member

nzakas commented Mar 16, 2023

After sleeping on it, I also believe that a better path forward is to create a generic way to report timing information in the report that is sent to formatters. So, rather than keying in on rule performance, I could see a generic data structure like this being added in the result object:

{
    // result object properties

    stats: {
         fixPasses: 3,
         timing: {
            lint: 123,
            parse: 123,
            fix: 456,
            rules: {
                semi: 123,
                quotes: 123
            }
        }  
     }
}

And maybe we have a --timing CLI argument that indicates the linter should be run in profiling mode to add this info into the result object?

We could always start with rule timing and add the others later (after the rewrite, we should have more granular data to use than we do right now).

@mnkiefer
Copy link
Contributor Author

mnkiefer commented Mar 19, 2023

And maybe we have a --timing CLI argument that indicates the linter should be run in profiling mode to add this info into the result object?

Ok, so instead of providing a specialized formatter, any formatter could pick up this statistics object only when the timing flag is added? (i.e. independent of TIMING?) Or should the output rather be available as JSON (as suggested in #108 (comment)) via the Node.js API?

We could always start with rule timing and add the others later (after the rewrite, we should have more granular data to use than we do right now).

Sounds good! 👍🏼 As this is my first time going through this process: Do I now adapt the RFC text before implementing the changes in the POC or should the text and code be changed simultaneously as we progress?

@nzakas
Copy link
Member

nzakas commented Mar 20, 2023

Actually, maybe --stats would be a better flag name to make it more generic?

Do I now adapt the RFC text before implementing the changes in the POC or should the text and code be changed simultaneously as we progress?

This is up to you. We don't require a POC to accept an RFC, we just want a detailed description of what an implementation might look like. So, we still want the details about which files will be modified and where, but you don't actually need to build it unless you want to. In some cases it can help to build a POC if you're not entirely sure, I myself tend to do small POCs of the more difficult parts of my RFCs just to validate I know what I'm doing.

Copy link
Member

@nzakas nzakas left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@mnkiefer it looks like you agreed with the proposals I made, but the RFC hasn't been updated. Do you still plan on updating it?

designs/2023-rule-performance-dashboard/README.md Outdated Show resolved Hide resolved
@mnkiefer mnkiefer requested review from nzakas and bmish April 10, 2023 20:02
@mnkiefer
Copy link
Contributor Author

@nzakas: Could we have another review on this? 😇

@mnkiefer
Copy link
Contributor Author

mnkiefer commented Aug 28, 2023

Yes, everything works exactly the same as with eslintrc. The link you provided 404s for me (maybe a private repo?).

I just changed the visibility to public, you should now be able to navigate to the file. 👍

@nzakas
Copy link
Member

nzakas commented Aug 29, 2023

Just to clarify, when you say the linter isn't applying fixes, do you mean it's not writing to disk? Or that it's not producing output fields in the results?

@mnkiefer
Copy link
Contributor Author

mnkiefer commented Aug 29, 2023

Just to clarify, when you say the linter isn't applying fixes, do you mean it's not writing to disk? Or that it's not producing output fields in the results?

It wasn't fixing the file, but I did have a combination of lint rules with some not auto-fixable, so that makes sense. 🙈

With your recent advice, I finally managed to come up with an example that has 2 fix passes:

/*eslint no-regex-spaces: "error", wrap-regex: "error"*/

function a() {
    return /  foo/.test("bar");
}

When I have some time this week, I will return to this, do some final testing, and report back as soon as I'm done. 👍

@nzakas
Copy link
Member

nzakas commented Aug 29, 2023

Awesome, thanks for the update.

By default, ESLint does not output fixes to disk while linting. In order to output to disk, you'd need to call FlatESLint#outputFixes(). This is how it's implemented in the core:

https://github.com/eslint/eslint/blob/10c4f85dca978b42d37619f50565a06b9a28c9ac/lib/cli.js#L394-L397

@mnkiefer
Copy link
Contributor Author

mnkiefer commented Sep 3, 2023

@nzakas: The above example produces the following stats output. Here, I'm wondering:

  1. Why is the first parse-time always the most expensive?
  2. For 2 fix passes, the length of the passes array is actually 3 (since you mentioned that ESLint does parse one last time without doing a fix do know if all fixes are done). Should this (pass) entry of the array then be discarded (i.e. if not fixedRersult.fixed)?
        "stats": {
            "directives": 0,
            "fixPasses": 2,
            "times": {
                "passes": [
                    {
                        "parse": {
                            "total": 4.526709
                        },
                        "rules": {
                            "total": 1.266792,
                            "no-regex-spaces": {
                                "total": 1.063667
                            },
                            "wrap-regex": {
                                "total": 0.203125
                            }
                        },
                        "fix": {
                            "total": 0.001333
                        },
                        "total": 5.794834
                    },
                    {
                        "parse": {
                            "total": 0.770792
                        },
                        "rules": {
                            "total": 0.22520800000000002,
                            "no-regex-spaces": {
                                "total": 0.21666600000000003
                            },
                            "wrap-regex": {
                                "total": 0.008542000000000001
                            }
                        },
                        "fix": {
                            "total": 0.000334
                        },
                        "total": 0.9963339999999999
                    },
                    {
                        "parse": {
                            "total": 0.593667
                        },
                        "rules": {
                            "total": 0.031083,
                            "no-regex-spaces": {
                                "total": 0.013791999999999999
                            },
                            "wrap-regex": {
                                "total": 0.017291
                            }
                        },
                        "fix": {
                            "total": 0
                        },
                        "total": 0.6247499999999999
                    }
                ]
            },
            "violations": 0
        }

@nzakas
Copy link
Member

nzakas commented Sep 6, 2023

  1. I think this is the cold-start of the parser. JavaScript engines work first by observing code and then optimize it for subsequent runs. So my theory is that the first pass is running the parser through the interpreter and the following passes are running through the compiled version.
  2. I don't think we need to remove the entry from passes. It's still a pass with useful data.

Unrelated: I think we can remove rules.total, as total is the only entry that doesn't correspond to a rule, so that seems confusing.

@mnkiefer
Copy link
Contributor Author

mnkiefer commented Sep 16, 2023

Unrelated: I think we can remove rules.total, as total is the only entry that doesn't correspond to a rule, so that seems confusing.

I agree! 👍 I have updated the RFC and POC accordingly and I think we have finally captured all of the requirements which have trickled in since we started. 🎉

What are the next steps and what needs to be done to move this into the final commenting phase?

Copy link
Member

@nzakas nzakas left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. I think this is a good general direction to head in.

@mdjermanovic please review when you have a moment, and if there are no objections, we should move this to Final Commenting.

designs/2023-rule-performance-statistics/README.md Outdated Show resolved Hide resolved
"fix": {
"total": 0.001792
},
"total": 5.739875
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How do we calculate total? From this example, it looks like total = parse + rules + fix. But, linter is also doing other things in each pass, e.g., processes inline configuration and adds globals, so the sum of only those parts of the process that we're individually tracking might be misleading.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@mdjermanovic: Yes, the total is computed as you say. I agree that from this perspective, it might be confusing and be interpreted as the total lint timing.

@nzakas: Should we just leave it out (as we did in rules) or keep it and make it track the overall timing of a particular run?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd keep it and make it track the overall timing of a particular pass (which should be >= parse + rules + fix).

@nzakas what do you think?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Using it for the overall timing makes sense to me. 👍

Copy link
Contributor Author

@mnkiefer mnkiefer Oct 1, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, I've updated the implementation and RFC. Each pass total time now corresponds to the time that is spent when running Linting code for ${debugTextDescription} (pass ${passNumber})...
Please let me know if this aligns with what you had in mind. 👍

designs/2023-rule-performance-statistics/README.md Outdated Show resolved Hide resolved
Copy link
Member

@mdjermanovic mdjermanovic left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The directives and violations fields aren't clearly defined.

Perhaps we could leave these fields for later, and for the first version of the stats feature focus on returning only timing data as @nzakas suggested in #108 (comment)?

designs/2023-rule-performance-statistics/README.md Outdated Show resolved Hide resolved
designs/2023-rule-performance-statistics/README.md Outdated Show resolved Hide resolved
@mnkiefer
Copy link
Contributor Author

The directives and violations fields aren't clearly defined.

Perhaps we could leave these fields for later, and for the first version of the stats feature focus on returning only timing data as @nzakas suggested in #108 (comment)?

@mdjermanovic: I'm fine with leaving them out for now. However, the suggested metrics by @bmish should definitely be included in future stats.

@nzakas: Would you be ok with leaving them out or now?

@mdjermanovic
Copy link
Member

However, the suggested metrics by @bmish should definitely be included in future stats.

eslint/eslint#14597 proposes that ESLint directly prints out stats to the console, while this RFC proposes adding stats to LintResult objects which are then further processed by formatters and other tools. Some of the data proposed in eslint/eslint#14597 may already be available in LintResult objects. For example, rule violations can be calculated from LintResult#messages (except if someone wants to know the number of violations per fix pass but I'm not sure if there's a use case for it) so adding this data to stats would be redundant.

@mnkiefer
Copy link
Contributor Author

mnkiefer commented Sep 26, 2023

However, the suggested metrics by @bmish should definitely be included in future stats.

eslint/eslint#14597 proposes that ESLint directly prints out stats to the console, while this RFC proposes adding stats to LintResult objects which are then further processed by formatters and other tools. Some of the data proposed in eslint/eslint#14597 may already be available in LintResult objects. For example, rule violations can be calculated from LintResult#messages (except if someone wants to know the number of violations per fix pass but I'm not sure if there's a use case for it) so adding this data to stats would be redundant.

I understand, I should have better phrased it as future outputs, it doesn't have to be in stats. I only wanted to make sure that information is not lost as the item was linked to this RFC and I was asked to include suppressions and violations here. This was just a heads up not to close that request. 🙏

@nzakas
Copy link
Member

nzakas commented Sep 26, 2023

Yup, I'm fine with leaving out violations and directives. 👍 Always easier to add things later than to change or remove them.

@nzakas
Copy link
Member

nzakas commented Oct 3, 2023

Because we are just making tweaks at this point, I'm promoting this RFC to Final Commenting.

@nzakas nzakas added Final Commenting This RFC is in the final week of commenting and removed Initial Commenting This RFC is in the initial feedback stage labels Oct 3, 2023
Copy link
Member

@mdjermanovic mdjermanovic left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thanks!

I left two small suggestions about the RFC text.

Also, I'm not sure if proof of concept supports the case when eslint is used with processors, but we can address that during the implementation.

designs/2023-rule-performance-statistics/README.md Outdated Show resolved Hide resolved
designs/2023-rule-performance-statistics/README.md Outdated Show resolved Hide resolved
@nzakas
Copy link
Member

nzakas commented Oct 11, 2023

The final commenting period is now closed. This RFC is accepted! 🎉

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
feature Final Commenting This RFC is in the final week of commenting
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants