
add initial lint level config #784

Closed
wants to merge 3 commits

Conversation

suaviloquence
Contributor

@suaviloquence suaviloquence commented May 29, 2024

This PR adds:

  • log level on each lint
  • handling of warnings in check_release
  • ability to set overrides for lint level/required semver bump in GlobalConfig

Questions:

  • How do we want to store the configured queries? We could store them in GlobalConfig, in the Check struct, or on their own as an argument to check_release... it may be tricky to store one copy in GlobalConfig because we need to mutably borrow it to write to the config, although we could use Rc/Arc.
  • Is this how we want to handle warnings and errors? This is mostly about the behavior of sorting and running checks and calculating the required version bump, but also about consistency in terminal output. I copied a lot of the output format from playing around with clippy, but it's super flexible.
  • Where should tests go (lib unit tests vs. binary tests), and (how) should we expose functionality for checking emitted warnings? One idea: add a suggested version update to the return value of check_release if warnings suggest a bigger version bump than failed checks do.

@suaviloquence suaviloquence marked this pull request as draft May 29, 2024 04:46
@suaviloquence
Contributor Author

Internet went out right as I opened this, so I'm writing from my phone. I will fix the lint tomorrow/once the connection comes back (I forgot to turn on the more pedantic lints when I moved editors).

src/query.rs Outdated
Comment on lines 86 to 96
/// the required version bump for this lint; see [`SemverQuery`].required_update
pub required_update: Option<RequiredSemverUpdate>,
/// the lint level for the query; see [`SemverQuery`].lint_level
pub lint_level: Option<LintLevel>,
Owner

Let's make sure to document what None means for both of these, and try to keep all the docs to a consistent style (for example, on whether we use capitalization and sentences or not).
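For illustration, one way to spell out the `None` semantics in the doc comments. This is a sketch only: the types mirror the snippet above, but the exact wording and doc style are up to the project.

```rust
// Hypothetical sketch of the override fields with the `None` cases documented.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum RequiredSemverUpdate {
    Minor,
    Major,
}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum LintLevel {
    Allow,
    Warn,
    Deny,
}

#[derive(Debug, Default, Clone, Copy)]
pub struct QueryOverride {
    /// The required version bump for this lint; see [`SemverQuery`].required_update.
    /// `None` means the lint's default required bump is kept unchanged.
    pub required_update: Option<RequiredSemverUpdate>,
    /// The lint level for the query; see [`SemverQuery`].lint_level.
    /// `None` means the lint's default level is kept unchanged.
    pub lint_level: Option<LintLevel>,
}

fn main() {
    // With no overrides configured, both fields fall back to the lint's defaults.
    let ov = QueryOverride::default();
    assert!(ov.required_update.is_none());
    assert!(ov.lint_level.is_none());
}
```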

src/query.rs Outdated
@@ -73,6 +102,9 @@ pub struct SemverQuery {

pub required_update: RequiredSemverUpdate,

/// The error level for when this lint procs (allow, warn, or deny)
Owner

Earlier, we used occurs to describe an instance of a lint being triggered. Consistent word choice helps understanding, especially because it makes a deliberately different word choice stand out much more.

So consider this instead:

Suggested change
/// The error level for when this lint procs (allow, warn, or deny)
/// The error level for when this lint occurs (allow, warn, or deny)

Comment on lines +150 to +219
pub fn apply_override(&mut self, query_override: &QueryOverride) {
if let Some(lint_level) = query_override.lint_level {
self.lint_level = lint_level;
}

if let Some(required_update) = query_override.required_update {
self.required_update = required_update;
}
}
Owner

This design doesn't feel right to me.

The overrides feel more like a property of the linter execution and its environment, not the lints themselves.

Consider a situation where the workspace has one set of overrides, and individual crates within it have other overrides. If the overrides are applied to the lints themselves, then with this design we have to iterate over hundreds of lints once per crate to apply its overrides. We also risk bugs — we may fail to apply the overrides correctly because of how many moving parts there are.

This design also seems to make it harder to have optimizations like:

  • skip running lints that are at the allow level
  • skip running lints whose SemVer requirement is already satisfied by the crate's version bump, etc.

I suspect we'll eventually want to have a two-stage linting process, similar to what cargo-nextest does for testing: first, analyze the environment to determine what needs to be done, and only then do it. In the first stage, we determine which crates to check, with which crate features, against which versions, with which lints and with which overrides. In the second stage, we execute the plan. In addition to performance gains, this will make it simpler to test our tool's behavior in complex environments: we can test that the correct plan is generated separately from testing that the plan is correctly executed.

We don't have to build that two-stage execution right away, but we should try to make it easier, not harder, to build it in the future. With that in mind, consider keeping the SemverQuery values immutable and addressing the overrides at a higher level?
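As a rough sketch of that higher-level approach (names here are hypothetical, not the crate's actual API): the lint definition stays immutable, and the effective level is resolved at check time from an overrides map.

```rust
use std::collections::HashMap;

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum LintLevel {
    Allow,
    Warn,
    Deny,
}

/// The lint definition stays immutable; it only carries its default level.
struct SemverQuery {
    id: &'static str,
    lint_level: LintLevel,
}

/// Resolve the effective level at check time without mutating the query.
fn effective_level(query: &SemverQuery, overrides: &HashMap<&str, LintLevel>) -> LintLevel {
    overrides.get(query.id).copied().unwrap_or(query.lint_level)
}

fn main() {
    let query = SemverQuery {
        id: "function_missing",
        lint_level: LintLevel::Deny,
    };
    let mut overrides = HashMap::new();
    overrides.insert("function_missing", LintLevel::Warn);

    assert_eq!(effective_level(&query, &overrides), LintLevel::Warn);
    // An `Allow` effective level would let the check phase skip the lint entirely.
    assert_ne!(effective_level(&query, &overrides), LintLevel::Allow);
}
```

Keeping override resolution in one pure function like this should also make the future "plan then execute" split easier, since the plan stage can call it without touching the lint registry.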

Owner

@obi1kenobi obi1kenobi May 29, 2024

A related piece of work-in-progress that might be of interest is this draft PR: #600

It proposes a design for describing "the plan" for what to do if various kinds of problems arise while linting. In the future, we'll want both the "what things run on which crates" plan I described earlier, and the "what if something goes wrong" bit here to be part of the same inspectable, testable plan.

Contributor Author

Sorry for the late response - school got a little crazy.

Thanks for the feedback! It definitely makes sense that when checking a crate, we want to pass in the effective levels of each lint, both for the semver requirement and the lint level. We can calculate the effective lint level for each crate to be checked and pass it as an argument (a mapping from lint name to { required_update, lint_level }) to check_release (and add it to the CrateCheck struct in the proposed API). We could also just pass a map that only has an entry for a given lint if it differs from the default value in the lint file.

Passing all effective values might be easier to use and to test in the proposed API version, and potentially easier when calculating multiple levels of overrides. Passing only the ones that differ from the default would save space and execution time, but it might be a little more complicated and more susceptible to bugs around calculating and using the right effective value.

Which of those seems like the better idea? Or perhaps something else? Thanks!

Owner

No worries at all, as you know I've been quite busy too :)

From a testing perspective, I think it would be easier to pass only the overrides ("different from default"). That way, we won't have to regenerate the snapshot test data for every newly-added lint, like we would have to if all computed values were part of the passed data.
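A sketch of that "different from default" idea (helper and type names are hypothetical): compute the sparse map once, so the data passed around, and captured in snapshots, only contains deliberate overrides.

```rust
use std::collections::HashMap;

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum LintLevel {
    Allow,
    Warn,
    Deny,
}

/// Hypothetical helper: keep only the configured entries that actually
/// differ from the lint's default, producing the sparse override map.
fn sparse_overrides(
    defaults: &HashMap<String, LintLevel>,
    configured: &HashMap<String, LintLevel>,
) -> HashMap<String, LintLevel> {
    let mut out = HashMap::new();
    for (id, level) in configured {
        if defaults.get(id) != Some(level) {
            out.insert(id.clone(), *level);
        }
    }
    out
}

fn main() {
    let defaults = HashMap::from([
        ("enum_variant_added".to_string(), LintLevel::Warn),
        ("function_missing".to_string(), LintLevel::Deny),
    ]);
    let configured = HashMap::from([
        ("enum_variant_added".to_string(), LintLevel::Warn), // same as default
        ("function_missing".to_string(), LintLevel::Allow),  // real override
    ]);
    let sparse = sparse_overrides(&defaults, &configured);
    assert_eq!(sparse.len(), 1);
    assert_eq!(sparse.get("function_missing"), Some(&LintLevel::Allow));
}
```

With this shape, adding a new lint leaves existing snapshot data untouched, since the new lint simply has no entry in the sparse map.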

Contributor Author

Okay, sounds good! Thanks!

Minor,
Major,
Contributor Author

reordering to make Major > Minor
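For context, Rust's derived comparisons follow variant declaration order, so declaring Minor before Major is what makes Major compare greater. A standalone sketch:

```rust
/// Derived ordering follows declaration order: later variants compare greater.
/// (Names mirror the enum in the diff above; this is a standalone sketch.)
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
enum RequiredSemverUpdate {
    Minor,
    Major,
}

fn main() {
    // Because `Minor` is declared before `Major`, `Major > Minor` holds,
    // so the strictest required bump is simply the maximum over all lints.
    assert!(RequiredSemverUpdate::Major > RequiredSemverUpdate::Minor);

    let required = [RequiredSemverUpdate::Minor, RequiredSemverUpdate::Major]
        .into_iter()
        .max()
        .unwrap();
    assert_eq!(required, RequiredSemverUpdate::Major);
}
```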


@suaviloquence
Contributor Author

Here's one idea for storing different levels of overrides: have a list (stack) of lint name->override mappings, where overrides at the top of the stack (later indices) have higher precedence than lower ones.

This should be more reusable/modular: we can add multiple different levels of precedence, and we can save allocations/copies when using workspace-level overrides, for instance.

Another option is to use a flattened version of this (so we only pass one QueryOverrides map of overrides to the check_release function), which has faster lookups and fewer nested data structures, but we would have to create a new one for each package in the workspace and mutate it whenever we want to add another level of precedence.

What do you think?
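A minimal sketch of the stack idea described above (type and method names are hypothetical): later layers take precedence, and lookup falls back to the lint's default.

```rust
use std::collections::HashMap;

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum LintLevel {
    Allow,
    Warn,
    Deny,
}

/// Hypothetical stack of override layers; later entries take precedence.
struct OverrideStack(Vec<HashMap<String, LintLevel>>);

impl OverrideStack {
    /// Walk from the top of the stack down and return the first matching
    /// override, falling back to the lint's default level.
    fn effective_lint_level(&self, id: &str, default: LintLevel) -> LintLevel {
        self.0
            .iter()
            .rev()
            .find_map(|layer| layer.get(id).copied())
            .unwrap_or(default)
    }
}

fn main() {
    let workspace = HashMap::from([("trait_missing".to_string(), LintLevel::Warn)]);
    let package = HashMap::from([("trait_missing".to_string(), LintLevel::Allow)]);
    // The package layer is pushed after (above) the workspace layer.
    let stack = OverrideStack(vec![workspace, package]);

    assert_eq!(
        stack.effective_lint_level("trait_missing", LintLevel::Deny),
        LintLevel::Allow
    );
    // Lints with no override in any layer keep their default level.
    assert_eq!(
        stack.effective_lint_level("other_lint", LintLevel::Deny),
        LintLevel::Deny
    );
}
```

The package-level layer only needs its own entries; nothing from the workspace layer has to be copied or mutated when a new layer is pushed.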

@suaviloquence
Contributor Author

(Also, the CI is failing because it is looking for --- failure {...} --- lines to check whether a lint is proc'ing. I changed it to use shell_error on failure instead, so it will read error: {...} failed, which is more consistent with warnings, clippy, and the annotate-snippets crate if we eventually want to use that for error messages. Do you think this is a good idea, or should I revert to the original formatting so it doesn't break CI setups like ours?)

@obi1kenobi
Owner

> (Also, the CI is failing because it is looking for --- failure {...} --- lines to check whether a lint is proc'ing. I changed it to use shell_error on failure instead, so it will read error: {...} failed, which is more consistent with warnings, clippy, and the annotate-snippets crate if we eventually want to use that for error messages. Do you think this is a good idea, or should I revert to the original formatting so it doesn't break CI setups like ours?)

It's probably not a good idea to include many kinds of changes not directly related to each other in the same PR, since it makes the code harder to review. Larger and more complex PRs take exponentially longer to review and merge, because I have to keep re-reviewing a larger amount of code on each round-trip, and it's more likely there's at least one thing worth changing before all of that code can be merged. I wouldn't recommend that approach.

So I'd strongly recommend not changing output formatting in this PR at all. Aim to make the smallest, most targeted PR you possibly can, so we can merge it and move on quickly.

@obi1kenobi
Owner

It might even be a good idea to split this PR into several:

  • a refactoring-only PR, that changes nothing at all about the observable behavior or functionality of the tool, and just moves code around to make future work easier
  • a PR that adds the APIs needed for the overrides, but doesn't do all the plumbing needed to actually perform overrides (so it lets us focus on the design instead of getting distracted by changes to lint files and lots of test cases)
  • subsequent PRs that fill in the remaining blanks, like adding test harnesses and test data, or actually reading manifest files, etc.

It's quite difficult to review a PR that does hundreds of lines of no-functional-changes refactoring while also adding new functionality in some places. It's often so hard to review such a PR that it's on par difficulty-wise with having the reviewer rewrite it from scratch, which is not a recipe for fast iteration and generally frustrates both PR authors and PR reviewers.

@obi1kenobi
Owner

The stack-of-maps approach sounds good btw 👍 I'd just suggest avoiding situations where two different types have names within 1 character of edit distance from each other, like QueryOverride and QueryOverrides. This is an easy typo to make but a hard one to spot, and I've had situations where it's been a headache to debug this because it "seems like" the compiler is complaining for "nothing" (because it's hard to see that we actually made a typo).

@suaviloquence
Contributor Author

Closing because the work is being done in other, smaller PRs (looking back, wow, this was big).
