
[pyupgrade] Add rules to use PEP 695 generics in classes and functions (UP046, UP047) #15565

Open · wants to merge 61 commits into base: main
Conversation

@ntBre (Contributor) commented Jan 18, 2025

Summary

This PR extends our PEP 695 handling from the type aliases handled by UP040 to generic function and class parameters, as suggested in the latter two examples from #4617:

# Input
T = TypeVar("T", bound=float)
class A(Generic[T]):
    ...

def f(t: T):
    ...

# Output
class A[T: float]:
    ...

def f[T: float](t: T):
    ...

I first implemented this as part of UP040, but based on a brief discussion during a very helpful pairing session with @AlexWaygood, I opted to split them into rules separate from UP040 and then also separate from each other. From a quick look, and based on this issue, I'm pretty sure neither of these rules is currently in pyupgrade, so I just took the next available codes, UP046 and UP047.

The last main TODO, noted in the rule file and in the fixture, is to handle generic method parameters not included in the class itself, S in this case:

T = TypeVar("T")
S = TypeVar("S")

class Foo(Generic[T]):
    def bar(self, x: T, y: S) -> S: ...

but Alex mentioned that that might be okay to leave for a follow-up PR.
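For reference, here is a self-contained, runnable version of that case (old-style syntax, so it works without Python 3.12); the commented line is a sketch of what a follow-up fix might produce, not output from this PR:

```python
from typing import Generic, TypeVar

T = TypeVar("T")
S = TypeVar("S")

class Foo(Generic[T]):
    # S is used only by the method, so a complete fix would need to lift it
    # into the method's own parameter list (Python 3.12+ syntax), roughly:
    #     def bar[S](self, x: T, y: S) -> S: ...
    def bar(self, x: T, y: S) -> S:
        return y
```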

I also left a TODO about handling multiple base classes instead of bailing out when more than one is present. I'm not sure how common that would be, but I can still handle it here or follow up on that too.

I think this is unrelated to the PR, but when I ran cargo dev generate-all, it removed the rule code PLW0101 from ruff.schema.json. It seemed unrelated, so I left that out, but I wanted to mention it just in case.

Test Plan

New test fixture, cargo nextest run

@ntBre ntBre added the "rule" label (Implementing or modifying a lint rule) Jan 18, 2025
github-actions bot commented Jan 18, 2025

ruff-ecosystem results

Linter (stable)

✅ ecosystem check detected no linter changes.

Linter (preview)

✅ ecosystem check detected no linter changes.

@ntBre ntBre added the "preview" label (Related to preview mode features) Jan 18, 2025
@AlexWaygood AlexWaygood self-requested a review January 19, 2025 13:22
@ntBre (PR author) commented Jan 20, 2025

I'm not quite sure what's going wrong with mkdocs. I tried pasting in the code it suggested and also adding the rule to KNOWN_FORMATTING_VIOLATIONS, but running python scripts/check_docs_formatted.py locally still reported an issue after each attempt. I'm planning to try again tomorrow morning, but otherwise this should still be ready for review.

I was also mildly surprised to see no changes in the ecosystem checks. Do those run with Python 3.12?

@dhruvmanila (Member) commented:

> I'm not quite sure what's going wrong with mkdocs. I tried pasting in the code it suggested and also adding the rule to KNOWN_FORMATTING_VIOLATIONS, but running python scripts/check_docs_formatted.py locally still reported an issue after each attempt. I'm planning to try again tomorrow morning, but otherwise this should still be ready for review.

I'm not exactly sure but making the change it suggested seems to work for me 😅

The script runs on the generated docs, which means you'll have to regenerate the docs every time you make a change in the Rust code. It's always preferable to include the --generate-docs flag when running this script, just to make sure it's running on the latest version of the docs.

> I was also mildly surprised to see no changes in the ecosystem checks. Do those run with Python 3.12?

Yes, the ecosystem checks run on Python 3.12:

ecosystem:
  name: "ecosystem"
  runs-on: depot-ubuntu-latest-8
  needs:
    - cargo-test-linux
    - determine_changes
  # Only runs on pull requests, since that is the only way we can find the base version for comparison.
  # Ecosystem check needs linter and/or formatter changes.
  if: ${{ github.event_name == 'pull_request' && needs.determine_changes.outputs.code == 'true' }}
  timeout-minutes: 20
  steps:
    - uses: actions/checkout@v4
      with:
        persist-credentials: false
    - uses: actions/setup-python@v5
      with:
        python-version: ${{ env.PYTHON_VERSION }}

PYTHON_VERSION: "3.12"

@MichaReiser (Member) left a comment:


Nice, this overall looks good. I didn't review pep695 in detail because it isn't clear what's copied over and what needs reviewing.

The title suggests to me that UP040 now handles generic functions and classes but reading the summary this doesn't seem to be true?

What I understand from the summary is that this PR simply introduces a new rule? We should change the title accordingly for a clear changelog entry.

@@ -0,0 +1,218 @@
//! Shared code for [`use_pep695_type_alias`] (UP040) and [`use_pep695_type_parameter`] (UP046)
Member:

Did you make any changes to the code in this module or did you copy it over as is?

@ntBre (PR author):

I modified expr_name_to_type_var to include the TypeVar.kind field, factored out TypeParam::from from its call site in use_pep695_type_alias.rs and added the match kind, and added the fmt_* functions. The Visitor code is untouched, and so is the TypeVarRestriction struct.

Sorry for the confusing mix of changes. I modified it all in-place and then tried splitting it out to separate the rules.

UP046.py:9:7: UP046 [*] Generic class `A` uses `Generic` subclass instead of type parameters
|
9 | class A(Generic[T]):
| ^^^^^^^^^^^^^ UP046
Member:

Should we change the diagnostic range to cover only the Generic[T] instead of the entire class header, since the name and other base classes aren't a problem?

@ntBre (PR author):

Good call. I thought this looked nice for the examples, but I think your suggestion makes more sense for longer, more realistic class names! And it will be even more important when we handle multiple base classes.

I wanted to make a similar change for functions, but I think it will require tracking the range of the parameter in TypeVar and possibly even emitting separate diagnostics for each affected parameter. I could cheat a little for classes because I only handle a single base class for now.

I'm picturing a function like this with non-contiguous, unique type parameters as a problematic case:

def h(p: T, another_param, and_another: U): ...
      ^^^^                 ^^^^^^^^^^^^^^

I think that's the ideal kind of diagnostic, but I think it requires separate diagnostics and fixes.

Member:

Hmm yeah that's an interesting case (CC @BurntSushi: While not directly related to diagnostics, it is an interesting problem how we may want multiple diagnostic ranges but a single fix).

Aren't we already tracking the ranges for the fix (by only underlining the type)? Although that still doesn't scale to multiple parameters unless we extend the range.

@ntBre (PR author):

Yes, in the generic class case, I'm using the assumption of a single base class to underline the type. This is the range used later for the diagnostic:

// TODO(brent) only accept a single, Generic argument for now. I think it should be fine to have
// other arguments, but this simplifies the fix just to delete the argument list for now
let [Expr::Subscript(ExprSubscript {
    value,
    slice,
    range,
    ..
})] = arguments.args.as_ref()

Once we add a loop to support multiple base classes, like multiple function parameters, I think we'd need to capture each range from the parameters as we loop.

for parameter in parameters {
    if let Some(annotation) = parameter.annotation() {
        let vars = {
            let mut visitor = TypeVarReferenceVisitor {
                vars: vec![],
                semantic: checker.semantic(),
            };
            visitor.visit_expr(annotation);
            visitor.vars
        };
        type_vars.extend(vars);
    }
}

But I could be misunderstanding or missing an easier way to do it.

crates/ruff_linter/src/rules/pyupgrade/rules/pep695.rs (outdated; resolved)
@AlexWaygood (Member) left a comment:

Thanks, this is great!

crates/ruff_linter/src/rules/pyupgrade/rules/pep695.rs (outdated; resolved)
Comment on lines 55 to 59
#[derive(Debug, Copy, Clone, PartialEq, Eq)]
enum GenericKind {
    GenericClass,
    GenericFunction,
}
Member:

I wonder if these should even be the same rule...? Generic functions and generic classes feel quite distinct.

I don't know how methods play into this, though -- we've obviously left them unimplemented for now, but we might want to try fixing them as a followup extension. We might have to fix methods in the same pass as the class definition...? Not sure...

@ntBre (PR author):

Yeah, I was kind of wondering that myself, especially as I was splitting UP040 and this one. The rule function bodies are just different enough that I don't see an easy way of sharing much more code. Should I separate them?

I agree, not sure about methods either. Maybe they fit with functions if we can just subtract the enclosing class's type parameters from the function code (now I'm realizing we might have a problem with nesting in general?). But I could see it going with the class code, the function code, or being a third rule even.

Member:

Ah, nesting is a very good point! I think it's fine to not offer a fix for generic functions or classes nested inside other generic functions or classes (they're rare, and complicated to get right!). You can check whether a class or function is nested inside any class or function scopes by using this method on the semantic model to iterate up through all parent statements and checking if any are a StmtClassDef or a StmtFunctionDef:

/// Returns an [`Iterator`] over the current statement hierarchy, from the current [`Stmt`]
/// through to any parents.
pub fn current_statements(&self) -> impl Iterator<Item = &'a Stmt> + '_ {
    let id = self.node_id.expect("No current node");
    self.nodes
        .ancestor_ids(id)
        .filter_map(move |id| self.nodes[id].as_statement())
}

We should definitely add tests for nested functions and classes as well!

For methods -- after thinking about it more, I think we might be okay? Whether or not there's also a fix for the class statement being applied by a different rule simultaneously, I think the same fix needs to be applied for the method definition:

  T = TypeVar("T")
  S = TypeVar("S")

  class Foo(Generic[T]):
-     def bar(self, x: T, y: S) -> S: ...
+     def bar[S](self, x: T, y: S) -> S: ...

or, if the class has also been fixed...

  S = TypeVar("S")

  class Foo[T]:
-     def bar(self, x: T, y: S) -> S: ...
+     def bar[S](self, x: T, y: S) -> S: ...

So I think it should be okay for methods to be their own rule -- which, I think, means that it would be good for free functions and classes to be separate rules too!
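To make the nesting concern concrete, here is a hedged sketch of the kind of case where skipping the fix seems safest (the names are illustrative, not taken from the PR's fixtures): a generic function nested inside a generic class, where neither scope can be rewritten in isolation without risking a mix of old- and new-style type variables.

```python
from typing import Generic, TypeVar

T = TypeVar("T")
U = TypeVar("U")

class Outer(Generic[T]):
    # `inner` is generic over U, but it is nested inside a class that is
    # generic over T; fixing either definition alone could leave the other
    # referring to an old-style TypeVar, so bailing out here seems safest.
    def make(self):
        def inner(x: T, y: U) -> U:
            return y
        return inner
```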

Member:

Sorry for being late to the party, but can you elaborate a bit on the reason why we chose to split the rule? The reasons for the split aren't clear to me as a user. What's so different between classes and functions that I'd only want this rule enabled for one or the other? I don't think it should matter how much code the implementations can share -- that would be an us problem, and we shouldn't forward it to users by "overwhelming" them with more rules.

@ntBre (PR author):

Another reason Alex mentioned down below was that the rules table in the docs uses the first format! call from Violation::message, so splitting them will give two more specialized descriptions in that table. Otherwise, we discussed that it's generally nice to lean toward more fine-grained rule selection, but I see what you mean about overwhelming users with rules. I also think you're right that users will probably want both of these together, but @AlexWaygood may have a different intuition here. I'm happy to go either way.

I think this is a side note related to rule categorization, but my ideal picture is something like a PEP 695 parent category that would activate all three of these (UP040, UP046, and UP047). That's why I put them all in UP040 initially, but the naming of UP040 (non-pep695-type-alias) was a bit too specific to include the others.

If we do leave them split, you've reminded me that I should add a See also section to both of them pointing to the other at least.

Member:

I saw the format issue. While not ideal, I don't think it's reason enough to split a rule because I consider having to configure two rules more inconvenient than a slightly confusing message in the rules table (which we can fix when we refactor Ruff's rule infrastructure to Red Knot's)

Member:

Personally I think that these are doing distinct enough things that they deserve to be separate rules -- the documentation examples for them are completely different, for example. I do see Micha's point that there's maybe not really a reason for why somebody might want to enable one but leave the other disabled -- but in general, I feel like we see a lot of requests for fine-grained configuration on the issue tracker, but not much demand for coarser configuration. I think there are many confusing things about Ruff's configuration setup, but I don't think the sheer number of rules is one of them... but I know that this is somewhere where Micha and I disagree :-)

Member:

> but I don't think the sheer number of rules is one of them

I don't think this is necessarily true, given that users ask for presets, better categorization, etc. Users struggle with configuring Ruff, and the sheer number of rules is one big factor causing that. Just speaking for myself, finding out whether a rule already exists is no easy task (you can try pasting code in the playground, but good luck if the rule requires configuration). You have to scroll through a seemingly endless list of rules. And I think it's a problem we don't understand well yet, because users simply tend not to enable rules because they're overwhelmed (or they enable all of them) -- giving them a worse experience overall. That's why I keep pushing back on defaulting to more granular rules. There's a non-zero cost.

Looking at the two rules and reading the documentation, I think either works. I have a slight preference for just having one, but considering that we split them out from UP040, maybe it makes sense to have them separate. I'd probably go with separate rules just to avoid extra work.

@AlexWaygood (Member) commented:

> I was also mildly surprised to see no changes in the ecosystem checks. Do those run with Python 3.12?

I think we run them with Python 3.12 in CI, but when the ecosystem check is running on a project that has requires-python = ">= 3.9" in its pyproject.toml file, it will assume that that project cannot use Python 3.12 syntax. I think few projects have requires-python = ">= 3.12" at this stage, unfortunately, as it's still a fairly new Python version

@ntBre ntBre changed the title [pyupgrade] Expand UP040 to handle generic functions and classes (UP040, UP046) [pyupgrade] Add rules to use PEP 695 generics in classes and functions (UP046, UP047) Jan 20, 2025
@ntBre (PR author) commented Jan 20, 2025

Okay, I think I covered all of the suggestions! For follow-ups I have:

  • Rename private type variables in generics (don't make inline generics like [_T])
  • Allow multiple base classes for generic classes
  • Handle generic methods
  • Handle default kwargs for Python >= 3.13

Plus the separate, related rule for mixed old- and new-style generics in #15620. Should I open issues for each of these, or just track them here?
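As a reference point for the defaults follow-up: PEP 696 adds the default= keyword to TypeVar in Python 3.13's typing module. The sketch below uses old-style syntax so it runs on any supported Python; the commented lines show the 3.13-only forms and are an assumption about what the eventual fix would produce:

```python
from typing import Generic, TypeVar

# On Python 3.13+ this could be TypeVar("T", default=int) per PEP 696.
T = TypeVar("T")

class Box(Generic[T]):
    # A PEP 695 fix that also handled defaults (valid only on 3.13+)
    # would produce roughly:
    #     class Box[T = int]: ...
    def __init__(self, value):
        self.value = value
```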

@MichaReiser (Member) commented:

I'd suggest creating a new issue. They all seem like extensions and can be implemented after this PR.

@AlexWaygood (Member) left a comment:

This looks very close!

@AlexWaygood (Member) commented:

> I'd suggest creating a new issue. They all seem like extensions and can be implemented after this PR.

Agree -- I think a single new issue that tracks all these followups as sub-items would be perfect

ntBre and others added 2 commits January 21, 2025 08:50
Co-authored-by: Alex Waygood <[email protected]>
@ntBre (PR author) commented Jan 21, 2025

Thanks for all of the documentation fixes and the then_some improvement! It looks like there's a tiny typo that I'm fixing locally. Otherwise, do we want to revisit splitting the rule? If not, I just want to extend the See also sections to point to each other. I opened the other issue for follow ups too.

@AlexWaygood (Member) commented Jan 21, 2025

I updated the PR against my typeshed fork to run the autofixes from the latest version of this PR. The pyright CI there is flagging a couple of other issues:

  1. /home/runner/work/typeshed/typeshed/stdlib/builtins.pyi:1547:32 - error: Type parameter "SupportsRichComparisonT" is not included in the type parameter list for "max" (reportGeneralTypeIssues).

    This is because SupportsRichComparisonT is defined in another module, so we can't see that it's a type variable. This actually means that the autofix breaks the function definition, because the function now mixes old-style type variable declarations with new-style type parameters in its annotations, and pyright correctly rejects this. I don't think there's a good solution here other than to make sure that the fix is marked unsafe, and to document that this is a known limitation of the rule.

  2. In a similar vein, it looks like we don't currently recognise typing.AnyStr as being a type variable. This is the same issue as issue (1), but it's a very commonly used type variable (and a public member of the standard-library typing module), so I think it's worth us adding some special-casing for it so that it's recognised as a type variable.
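For context, typing.AnyStr is defined in the standard library as TypeVar('AnyStr', bytes, str), so a constrained type parameter is what the special-cased fix should produce. A runnable sketch in old-style syntax (the commented line is the assumed 3.12+ output of the fix, not output verified from this PR):

```python
from typing import AnyStr

def concat(a: AnyStr, b: AnyStr) -> AnyStr:
    # After the UP047 fix (Python 3.12+), this would become roughly:
    #     def concat[AnyStr: (bytes, str)](a: AnyStr, b: AnyStr) -> AnyStr: ...
    return a + b
```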

ntBre added 7 commits January 21, 2025 12:57
the main motivation here was to differentiate early returns from
`expr_name_to_type_var` caused by default values from those caused by the
variable not being a TypeVar, but it has the nice side effect of setting us up
to handle defaults correctly in the future
initially I thought the any_skipped check would be more invasive, but carrying
around an extra bool in the cases where we don't care about it seems preferable
to duplicating the AnyStr code especially
@ntBre (PR author) commented Jan 21, 2025

Thanks again for the thorough review and for your typeshed tests! I think I have now

  • handled imported type variables (by giving only a diagnostic in classes and by giving an unsafe fix (that excludes them) in functions)
  • updated the Known issues and Fix safety sections of both rules
  • added a special case for AnyStr

The AnyStr case felt a little awkward, so hopefully it's at least close to what you had in mind.

Comment on lines +51 to +52
def generic_method(t: T):
    pass
Member:

micro-nit: it's nice to keep fixtures looking like "realistic" Python code where possible (this looks a bit strange to me, as there's generally no reason to use a TypeVar in a function's annotations like this unless either two parameters are annotated with the TypeVar or the return annotation uses the TypeVar):

Suggested change:
- def generic_method(t: T):
-     pass
+ def generic_method(var: T) -> T:
+     return var

this applies to a few other functions and classes in your fixtures, but I won't comment on all of them ;)


# This case gets a diagnostic but not a fix because we can't look up the bounds
# or constraints on the generic type from another module
class ExternalType(Generic[T, SupportsRichComparisonT]):
Member:

should SupportsRichComparisonT be imported from another module at the top of this file?

Comment on lines +106 to +111
// typing.AnyStr special case doesn't have a real range
if let Expr::Name(name) = v {
    f.write_str(name.id.as_ref())?;
} else {
    f.write_str(&self.source[v.range()])?;
}
Member:

hrm... I'm not a huge fan of the way that the special handling for AnyStr is kind-of implicit here. It also seems a shame that we're now cloning all TypeVarRestriction::Constraint Exprs just to handle the AnyStr special case.

I think we can solve this by:

  1. Adding a dedicated variant for TypeVarRestriction, and
  2. Only storing an &'a str for the name field on TypeVar, rather than a whole ast::name::Name.

Something like this? I think this patch also has the advantage that we'd recognise a fully qualified use of AnyStr as an attribute expression, e.g. import typing; x: typing.AnyStr:

Suggested patch
diff --git a/crates/ruff_linter/src/rules/pyupgrade/rules/pep695/mod.rs b/crates/ruff_linter/src/rules/pyupgrade/rules/pep695/mod.rs
index 3de4228bc..de9a6407a 100644
--- a/crates/ruff_linter/src/rules/pyupgrade/rules/pep695/mod.rs
+++ b/crates/ruff_linter/src/rules/pyupgrade/rules/pep695/mod.rs
@@ -28,7 +28,10 @@ enum TypeVarRestriction<'a> {
     /// A type variable with a bound, e.g., `TypeVar("T", bound=int)`.
     Bound(&'a Expr),
     /// A type variable with constraints, e.g., `TypeVar("T", int, str)`.
-    Constraint(Vec<Expr>),
+    Constraint(Vec<&'a Expr>),
+    /// `AnyStr` is a special case: the only public `TypeVar` defined in the standard library,
+    /// and thus the only one that we recognise when imported from another module.
+    AnyStr,
 }
 
 #[derive(Copy, Clone, Debug, Eq, PartialEq)]
@@ -40,7 +43,7 @@ enum TypeParamKind {
 
 #[derive(Debug)]
 struct TypeVar<'a> {
-    name: &'a ExprName,
+    name: &'a str,
     restriction: Option<TypeVarRestriction<'a>>,
     kind: TypeParamKind,
     default: Option<&'a Expr>,
@@ -92,23 +95,19 @@ impl Display for DisplayTypeVar<'_> {
             TypeParamKind::TypeVarTuple => f.write_str("*")?,
             TypeParamKind::ParamSpec => f.write_str("**")?,
         }
-        f.write_str(&self.type_var.name.id)?;
+        f.write_str(self.type_var.name)?;
         if let Some(restriction) = &self.type_var.restriction {
             f.write_str(": ")?;
             match restriction {
                 TypeVarRestriction::Bound(bound) => {
                     f.write_str(&self.source[bound.range()])?;
                 }
+                TypeVarRestriction::AnyStr => f.write_str("(bytes, str)")?,
                 TypeVarRestriction::Constraint(vec) => {
                     let len = vec.len();
                     f.write_str("(")?;
                     for (i, v) in vec.iter().enumerate() {
-                        // typing.AnyStr special case doesn't have a real range
-                        if let Expr::Name(name) = v {
-                            f.write_str(name.id.as_ref())?;
-                        } else {
-                            f.write_str(&self.source[v.range()])?;
-                        }
+                        f.write_str(&self.source[v.range()])?;
                         if i < len - 1 {
                             f.write_str(", ")?;
                         }
@@ -135,7 +134,7 @@ impl<'a> From<&'a TypeVar<'a>> for TypeParam {
             TypeParamKind::TypeVar => {
                 TypeParam::TypeVar(TypeParamTypeVar {
                     range: TextRange::default(),
-                    name: Identifier::new(name.id.clone(), TextRange::default()),
+                    name: Identifier::new(*name, TextRange::default()),
                     bound: match restriction {
                         Some(TypeVarRestriction::Bound(bound)) => Some(Box::new((*bound).clone())),
                         Some(TypeVarRestriction::Constraint(constraints)) => {
@@ -146,6 +145,25 @@ impl<'a> From<&'a TypeVar<'a>> for TypeParam {
                                 parenthesized: true,
                             })))
                         }
+                        Some(TypeVarRestriction::AnyStr) => {
+                            Some(Box::new(Expr::Tuple(ast::ExprTuple {
+                                range: TextRange::default(),
+                                elts: vec![
+                                    Expr::Name(ExprName {
+                                        range: TextRange::default(),
+                                        id: Name::from("str"),
+                                        ctx: ast::ExprContext::Load,
+                                    }),
+                                    Expr::Name(ExprName {
+                                        range: TextRange::default(),
+                                        id: Name::from("bytes"),
+                                        ctx: ast::ExprContext::Load,
+                                    }),
+                                ],
+                                ctx: ast::ExprContext::Load,
+                                parenthesized: true,
+                            })))
+                        }
                         None => None,
                     },
                     // We don't handle defaults here yet. Should perhaps be a different rule since
@@ -155,12 +173,12 @@ impl<'a> From<&'a TypeVar<'a>> for TypeParam {
             }
             TypeParamKind::TypeVarTuple => TypeParam::TypeVarTuple(TypeParamTypeVarTuple {
                 range: TextRange::default(),
-                name: Identifier::new(name.id.clone(), TextRange::default()),
+                name: Identifier::new(*name, TextRange::default()),
                 default: None,
             }),
             TypeParamKind::ParamSpec => TypeParam::ParamSpec(TypeParamParamSpec {
                 range: TextRange::default(),
-                name: Identifier::new(name.id.clone(), TextRange::default()),
+                name: Identifier::new(*name, TextRange::default()),
                 default: None,
             }),
         }
@@ -178,36 +196,27 @@ struct TypeVarReferenceVisitor<'a> {
 /// Recursively collects the names of type variable references present in an expression.
 impl<'a> Visitor<'a> for TypeVarReferenceVisitor<'a> {
     fn visit_expr(&mut self, expr: &'a Expr) {
+        // special case for typing.AnyStr, which is a commonly-imported type variable in the
+        // standard library with the definition:
+        //
+        // ```python
+        // AnyStr = TypeVar('AnyStr', bytes, str)
+        // ```
+        //
+        // As of 01/2025, this line hasn't been modified in 8 years, so hopefully there won't be
+        // much to keep updated here. See
+        // https://github.com/python/cpython/blob/383af395af828f40d9543ee0a8fdc5cc011d43db/Lib/typing.py#L2806
+        if self.semantic.match_typing_expr(expr, "AnyStr") {
+            self.vars.push(TypeVar {
+                name: "AnyStr",
+                restriction: Some(TypeVarRestriction::AnyStr),
+                kind: TypeParamKind::TypeVar,
+                default: None,
+            });
+            return;
+        }
+
         match expr {
-            // special case for typing.AnyStr, which is a commonly-imported type variable in the
-            // standard library with the definition:
-            //
-            // ```python
-            // AnyStr = TypeVar('AnyStr', bytes, str)
-            // ```
-            //
-            // As of 01/2025, this line hasn't been modified in 8 years, so hopefully there won't be
-            // much to keep updated here. See
-            // https://github.com/python/cpython/blob/383af395af828f40d9543ee0a8fdc5cc011d43db/Lib/typing.py#L2806
-            e @ Expr::Name(name) if self.semantic.match_typing_expr(e, "AnyStr") => {
-                self.vars.push(TypeVar {
-                    name,
-                    restriction: Some(TypeVarRestriction::Constraint(vec![
-                        Expr::Name(ExprName {
-                            range: TextRange::default(),
-                            id: Name::from("bytes"),
-                            ctx: ruff_python_ast::ExprContext::Load,
-                        }),
-                        Expr::Name(ExprName {
-                            range: TextRange::default(),
-                            id: Name::from("str"),
-                            ctx: ruff_python_ast::ExprContext::Load,
-                        }),
-                    ])),
-                    kind: TypeParamKind::TypeVar,
-                    default: None,
-                });
-            }
             Expr::Name(name) if name.ctx.is_load() => {
                 if let Some(var) = expr_name_to_type_var(self.semantic, name) {
                     self.vars.push(var);
@@ -243,7 +252,7 @@ fn expr_name_to_type_var<'a>(
         }) => {
             if semantic.match_typing_expr(subscript_value, "TypeVar") {
                 return Some(TypeVar {
-                    name,
+                    name: &name.id,
                     restriction: None,
                     kind: TypeParamKind::TypeVar,
                     default: None,
@@ -288,14 +297,14 @@ fn expr_name_to_type_var<'a>(
                     Some(TypeVarRestriction::Bound(&bound.value))
                 } else if arguments.args.len() > 1 {
                     Some(TypeVarRestriction::Constraint(
-                        arguments.args.iter().skip(1).cloned().collect(),
+                        arguments.args.iter().skip(1).collect(),
                     ))
                 } else {
                     None
                 };
 
                 return Some(TypeVar {
-                    name,
+                    name: &name.id,
                     restriction,
                     kind,
                     default,
@@ -327,7 +336,7 @@ fn check_type_vars(vars: Vec<TypeVar<'_>>) -> Option<Vec<TypeVar<'_>>> {
     // found on the type parameters
     (vars
         .iter()
-        .unique_by(|tvar| &tvar.name.id)
+        .unique_by(|tvar| tvar.name)
         .filter(|tvar| tvar.default.is_none())
         .count()
         == vars.len())
diff --git a/crates/ruff_linter/src/rules/pyupgrade/rules/pep695/use_pep695_type_alias.rs b/crates/ruff_linter/src/rules/pyupgrade/rules/pep695/use_pep695_type_alias.rs
index c2340c609..1fb63f8be 100644
--- a/crates/ruff_linter/src/rules/pyupgrade/rules/pep695/use_pep695_type_alias.rs
+++ b/crates/ruff_linter/src/rules/pyupgrade/rules/pep695/use_pep695_type_alias.rs
@@ -133,7 +133,7 @@ pub(crate) fn non_pep695_type_alias_type(checker: &mut Checker, stmt: &StmtAssig
         .map(|expr| {
             expr.as_name_expr().map(|name| {
                 expr_name_to_type_var(checker.semantic(), name).unwrap_or(TypeVar {
-                    name,
+                    name: &name.id,
                     restriction: None,
                     kind: TypeParamKind::TypeVar,
                     default: None,
@@ -199,7 +199,7 @@ pub(crate) fn non_pep695_type_alias(checker: &mut Checker, stmt: &StmtAnnAssign)
     // Type variables must be unique; filter while preserving order.
     let vars = vars
         .into_iter()
-        .unique_by(|TypeVar { name, .. }| name.id.as_str())
+        .unique_by(|tvar| tvar.name)
         .collect::<Vec<_>>();
 
     checker.diagnostics.push(create_diagnostic(

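For context, the `AnyStr` special case visible in the hunk above corresponds to the following rewrite. This is a sketch of the before/after behavior; the PEP 695 spelling is shown in comments because it only parses on Python 3.12+:

```python
from typing import AnyStr

# Old spelling: AnyStr is an implicit TypeVar("AnyStr", bytes, str),
# which is why the rule materializes a (bytes, str) constraint tuple.
def concat(a: AnyStr, b: AnyStr) -> AnyStr:
    return a + b

# PEP 695 spelling the fix would emit (Python 3.12+ only):
#   def concat[T: (bytes, str)](a: T, b: T) -> T:
#       return a + b

print(concat("a", "b"))
print(concat(b"a", b"b"))
```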
Comment on lines +151 to +153
// We don't handle defaults here yet. Should perhaps be a different rule since
// defaults are only valid in 3.13+.
default: None,

Should we also add a test for UP040 with a type alias that uses a TypeVar with a default (since that rule now shares the logic from this module)?

We might want to see if we can convert the fix there to also use simple string interpolation rather than the ExprGenerator... I feel like it makes sense for all three rules to use basically the same infrastructure. But not for this PR! Just another possible followup for you ;)
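A minimal illustration of why defaults are deferred here: the stdlib `TypeVar` only accepts the PEP 696 `default=` keyword on Python 3.13+, so a fix that emits defaulted type parameters would be invalid on earlier targets. The version branching below is purely illustrative:

```python
from typing import TypeVar

try:
    # PEP 696 default, accepted by the stdlib only on Python 3.13+
    T = TypeVar("T", default=int)
except TypeError:
    # On Python < 3.13 the keyword is rejected, which is why the
    # fix currently skips TypeVars that carry a default.
    T = TypeVar("T")

print(T.__name__)
```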

Comment on lines +195 to +206
restriction: Some(TypeVarRestriction::Constraint(vec![
Expr::Name(ExprName {
range: TextRange::default(),
id: Name::from("bytes"),
ctx: ruff_python_ast::ExprContext::Load,
}),
Expr::Name(ExprName {
range: TextRange::default(),
id: Name::from("str"),
ctx: ruff_python_ast::ExprContext::Load,
}),
])),
@AlexWaygood (Member) commented on Jan 21, 2025:

I think we have to make sure here that str and bytes haven't been overridden in an enclosing scope before we add a type parameter that uses (str, bytes) as constraints. E.g. https://mypy-play.net/?mypy=latest&python=3.12&gist=100810bb27581625b327bce54ec1b63b. If they have been overridden in an enclosing scope, we can't offer a fix to replace AnyStr with an inline type parameter. You can check for this using the SemanticModel::is_available() method:

/// Return `true` if `member` is an "available" symbol, i.e., a symbol that has not been bound
/// in the current scope currently being visited, or in any containing scope.
pub fn is_available(&self, member: &str) -> bool {
    self.is_available_in_scope(member, self.scope_id)
}
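A sketch of the shadowing hazard the comment describes (hypothetical module; the rebinding of `str` is deliberately pathological):

```python
from typing import TypeVar

# Rebinding a builtin in an enclosing scope changes what a generated
# (bytes, str) constraint tuple would resolve to -- hence the
# SemanticModel::is_available() check before offering the fix.
str = int  # hypothetical shadowing of the builtin

# The constraints are now actually (bytes, int), not (bytes, str).
T = TypeVar("T", bytes, str)
print(T.__constraints__)
```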

@AlexWaygood (Member) left a comment:

The docs look fantastic now! Thanks again for your patience!

Labels: preview (Related to preview mode features), rule (Implementing or modifying a lint rule)