Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

option for reportImportCycles to allow circular imports within a TYPE_CHECKING block #960

Open
Wizzerinus opened this issue Dec 18, 2024 · 9 comments
Labels
type checking / linting issues relating to existing diagnostic rules or proposals for new diagnostic rules

Comments

@Wizzerinus
Copy link

So here is the test case:

% pyright b.py a.py
0 errors, 0 warnings, 0 informations 

~/tests
% basedpyright b.py a.py
/home/wizzerinus/tests/b.py
  /home/wizzerinus/tests/b.py:1:8 - warning: Import "a" is not accessed (reportUnusedImport)
/home/wizzerinus/tests/a.py
  /home/wizzerinus/tests/a.py: error: Cycle detected in import chain
    /home/wizzerinus/tests/a.py
    /home/wizzerinus/tests/b.py (reportImportCycles)
  /home/wizzerinus/tests/a.py:4:12 - warning: Import "b" is not accessed (reportUnusedImport)
1 error, 2 warnings, 0 notes

~/tests
1 % cat b.py
import a

~/tests
% cat a.py 
from typing import TYPE_CHECKING

if TYPE_CHECKING:
    import b

Using TYPE_CHECKING to stop import cycles is a relatively common idiom so I'd want to see this supported :) I would attempt contributing but it seems that the import cycles rule is pretty difficult to maintain so not sure if starting with that is the best idea ever.

@DetachHead
Copy link
Owner

DetachHead commented Dec 19, 2024

you can disable the reportImportCycles rule or suppress it with a pyright:ignore comment if you disagree with its behavior, but this is one of the few cases where i agree with eric traut's approach, so i don't think it's a good idea to reduce its strictness, though i'll try to explain what he means by "architectural layering violations" as i understand it.

the purpose of the reportImportCycles rule isn't just to prevent runtime errors caused by circular imports, but to also detect a deeper issue within your your project. when you have a circular dependency, regardless of whether it's a type-only import or not, it very often means there's something wrong with how your project is structured. it means your concerns are not separated.

i know it can be quite difficult to effectively explain and understand exactly why something like this is bad when there're no runtime crash, because i've spent months trying to convince coworkers that the circular dependencies in our codebase, even the TYPE_CHECKING ones, are 100% avoidable and that eliminating them leads to a much more well-structured and maintainable codebase. so i'll give an example from my experience:

at work we had a single module containing a class with hundreds of global variables that can be passed via the CLI. these variables vary in how specific they are. some of them are generic as in they are relevant to pretty much everything in the project. but some of them are highly specific to a single module, meaning this super generic module that gets imported by everything needs to now import types from a highly specific module that is otherwise rarely imported by anything else:

# utilities/variables.py
from __future__ import annotations
from typing import TYPE_CHECKING

if TYPE_CHECKING:
    from specific_things.specific_module_1 import Foo
    from specific_things.specific_module_2 import Bar

class Variables:
    def __init__(self):
        ... # logic to parse the variables from the command line

    some_generic_thing: bool
    data_for_some_specific_module: Foo
    data_for_some_other_module: Bar
# specific_things/specific_module_1.py
from utilities.variables import Variables

class Foo:
    ...

def create_something():
    result = Foo()
    Variables.data_for_some_specific_module = result

in this example, while the circular dependency does not exist at runtime because the variables.py module does not actually depend on anything at runtime, there is still a glaring problem with the structure of this project, which is that there's a single, generic module that needs to know about every other module in the project. over time, that module will (and has) grown in size to store hundreds of variables for completely unrelated things that this module should really have no knowledge of. if we ever wanted to split variables.py into a separate project (because it contains useful generic functionality for reading variables from the command line), it would be impossible because it's tightly coupled to everything else.

not to mention that the only reason it doesn't crash at runtime is due to the fact that the types currently aren't required by that module at runtime, but what if that changes? what if we want to update the class to use pydantic to parse and validate the types of each variable? then we can't use the TYPE_CHECKING block anymore so we're screwed.

the solution here is to split it into separate classes, so that only the logic specific to the variables.py module is defined there, and the variables used by individual specific modules are defined within those modules:

# utilities/variables.py
class AbstractVariables:
    """extend this class to define variables"""
    def __init__(self):
        ... # logic to parse the variables from the command line
# specific_things/specific_module_1.py
from utilities.variables import AbstractVariables

class Foo:
    ...

class SpecificVariables(AbstractVariables):
    data_for_some_specific_module: Foo

def create_something():
    result = Foo()
    SpecificVariables.data_for_some_specific_module = result

this, from my understanding, what is meant by "separation of concerns". there are many other beneficial reasons for splitting up concerns like this, some of which vary depending on the language you're using, but in general i consider it to be very important especially when working in a large codebase.

the way i see it, your project is not a collection of modules if it's riddled with circular dependencies, but rather a single module split up arbitrarily into separate files.

@DetachHead DetachHead added awaiting response waiting for more info from the author - will eventually close if they don't respond type checking / linting issues relating to existing diagnostic rules or proposals for new diagnostic rules labels Dec 19, 2024
@Wizzerinus
Copy link
Author

The project I am working on is structured in a pretty peculiar way and I don't think it's circular dependencies, although maybe a lot of projects that need the TYPE_CHECKING crutch are.

So basically there are certain objects with bidirectional access to each other, and I want to typecheck them (this isn't the code of the project but it should probably explain what I'm talking about):

# RootObject.py
class RootObject:
  modules: list[Module]
  # 500 lines of code here

# ModuleObject.py
class Module:  # this handles some specific functionality, not a generic class
  root: RootObject
  # 500 lines of code here

This pattern appears many times in many unrelated places. Even though this is in some ways a dependency, in some places it is not a dependency at runtime, but one has to be introduced when typechecking is needed:

# GenericObject.py
class GenericObject:
  children: list["GenericObject"]
  parent: "GenericObject"
  def addChild(self, child: "GenericObject"): ...

# ComplexState.py
class ComplexState(GenericObject):
  def __init__(self):
    # this is being composed out of several different sub-objects
    # yeah I get it you can make some `createComplexState()` function to add the submodules
    # but at that point we're breaking the principle of each class being individually useful
    self.addChild(SubObject())

# SubObject.py
class SubObject(GenericObject):
  parent: ComplexState   # oh no!

This is a standard composition setup and it does not really require the child nodes knowing about the parent's type, but not having that knowledge just makes typing pretty much impossible. Even if you don't explicitly store the reference, by the way:

class SubObject(GenericObject):
  def executeAction(self, state: "ComplexState"):   # oh no!

How I see it the separation of concerns should be different from separation of code files. I agree about the separation of concerns argument but the examples why this kind of behavior is bad usually consider small code bases, and I believe in my case fixing this would either yield files that are several thousand lines of code long (which just breaks some IDEs like pycharm, which was the entire reason to have such a modular structure) or end up with some unimaginable mess just to support type checking (in which case the dev would probably not support the type checking to avoid breaking other design concerns, which leads to the code deterioration as a whole).

A compromise could be considered as having a config option that finds only import cycles that exist at runtime and can be stopped by being in the branches that always evaluate to false, so we have best of both worlds where you can find design issues separately and runtime issues separately 😊

@DetachHead
Copy link
Owner

thanks for the examples. these are tricky, and it's hard to suggest a full solution without more context for them, but maybe these circular type dependencies can be avoided if they were instead passed as generics?

example 1:

# RootObject.py
class RootObject[T]:
    modules: list[T]
    # 500 lines of code here


# ModuleObject.py
class Module:  # this handles some specific functionality, not a generic class
    root: RootObject["Module"]
    # 500 lines of code here

example 2:

# GenericObject.py
class GenericObject[Parent = "GenericObject", Child = "GenericObject"]:
    children: list[Child]
    parent: Parent

    def addChild(self, child: Child): ...



# ComplexState.py
class ComplexState(GenericObject['ComplexState', SubObject['ComplexState', GenericObject]]):
    def __init__(self):
        self.addChild(SubObject[ComplexState, GenericObject]())


# SubObject.py
class SubObject[Parent, Child](GenericObject[Parent, Child]):
    parent: Parent  # oh no!

doing so may increase type safety as well, though i realize doing stuff like this can make classes pretty complicated to work with, or it may not even work at all depending on what you're doing.

when i've encountered similar scenarios the solution that's worked for me has been to just define them all in one file (the pyright codebase does this too). but as you mentioned, depending on how many classes you have / how large they are i realize this isn't always feasible.

regardless, i'm still reluctant to reduce the strictness of this rule because i've seen the consequences of people relying entirely on what's allowed at runtime to decide when circular dependencies are acceptable.

A compromise could be considered as having a config option that finds only import cycles that exist at runtime and can be stopped by being in the branches that always evaluate to false, so we have best of both worlds where you can find design issues separately and runtime issues separately 😊

i guess we could add an allowTypeOnlyImportCycles option that's disabled by default and adequately documented warning about the dangers of enabling it. though i'd like to become more familiar with scenarios where they're unavoidable before adding such an option.

@Wizzerinus
Copy link
Author

Wizzerinus commented Dec 19, 2024

when i've encountered similar scenarios the solution that's worked for me has been to just define them all in one file (the pyright codebase does this too). but as you mentioned, depending on how many classes you have / how large they are i realize this isn't always feasible.

yeah this is a really good example, actually! this file is 3k lines because of the circular import problem, and such large files usually break IDEs (I had a 5k line file and after modifying it in Pycharm you had to wait ~20 seconds for the Intellisense to update). So you kind of have to choose between "feasibility of file editing" and "have to find runtime errors manually/run all tests every time imports are edited/etc".

After thinking abt your examples I agree that it is probably useful to flag such issues in a lot of codebases, so having a flag like that is a better solution than modifying the importCycles thing by default

@DetachHead
Copy link
Owner

yeah this is a really good example, actually! this file is 3k lines because of the circular import problem, and such large files usually break IDEs (I had a 5k line file and after modifying it in Pycharm you had to wait ~20 seconds for the Intellisense to update). So you kind of have to choose between "feasibility of file editing" and "have to find runtime errors manually/run all tests every time imports are edited/etc".

does the basedpyright language server have the same issue? i have some pretty large files in my projects too but it never seems to cause basedpyright to be noticably slower. if you aren't already, you may want to try the pyright pycharm plugin and see if that's any faster. though idk how to disable pycharm's existing python language features which may interfere with it.

@Wizzerinus
Copy link
Author

I think basedpyright is fast in this regard but I don't know because I don't have huge files anymore :(

@DetachHead
Copy link
Owner

in that case i'm leaning towards closing this issue, but let us know if you do encounter any performance issues as a result of merging modules together to work around this issue and we can either re-open this or raise a separate issue

@DetachHead DetachHead closed this as not planned Won't fix, can't repro, duplicate, stale Dec 21, 2024
@Wizzerinus
Copy link
Author

I'm sorry for bluntness, but this approach sounds kind of stupid honestly? there are people working on the same code base who use pycharm, so attempting to make basedpyright happy makes them unable to work on it 🙃

Not to mention this rule would tend devs towards monolithic code bases which sounds like the worst of both worlds D:

I am fine not enabling the rule for now, but yeah.

@DetachHead
Copy link
Owner

I'm sorry for bluntness, but this approach sounds kind of stupid honestly? there are people working on the same code base who use pycharm, so attempting to make basedpyright happy makes them unable to work on it 🙃

i do try to keep in mind that people use basedpyright with all sorts of different editors, but pycharm uses its own outdated thing for language support instead of a language server, which in my experience is riddled with problems that make it basically unusable for any modern python development. for example its type checking is infinitely worse than pyright and even mypy.

i can re-open this issue and give it some more thought, but since circular dependencies are something i feel very strongly against i just don't feel right adding an option to change this behavior just because pycharm can't handle opening a file with a few thousand lines. its type checking is far less sophisticated than pyright so it really has no excuse to be that slow.

but maybe i'm being biased because i used to be a pycharm user. i paid for jetbrains IDEs for years and watched pycharm slowly rot since they haven't addressed any of these issues in years, barely kept up with new language features and still refuse to accept that language servers are the future (their LSP support is barebones and exclusive to the paid version)

Not to mention this rule would tend devs towards monolithic code bases which sounds like the worst of both worlds D:

as i said earlier i believe when a codebase has separate modules that circularly depend on each other, it's still just one single monolithic module, but just split up into separate files to give the illusion that they are separate modules. this is the case with the codebase i deal with at work and it's an absolute nightmare to work with, it's just as difficult to untangle either way.

@DetachHead DetachHead reopened this Dec 21, 2024
@DetachHead DetachHead removed the awaiting response waiting for more info from the author - will eventually close if they don't respond label Dec 21, 2024
@DetachHead DetachHead changed the title reportImportCycles is still triggered when something is imported with TYPE_CHECKING option for reportImportCycles to allow circular imports within a TYPE_CHECKING block Dec 21, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
type checking / linting issues relating to existing diagnostic rules or proposals for new diagnostic rules
Projects
None yet
Development

No branches or pull requests

2 participants