Running pyright on a full workspace does not seem to report all existing errors. #9642

Open
rgoya opened this issue Dec 29, 2024 · 12 comments
Labels: addressed in next version, bug


rgoya commented Dec 29, 2024

Describe the bug
I am seeing some puzzling behaviour in pyright where the number of errors reported varies depending on how checks are launched. That is, launching pyright on a workspace with ~6K files will miss errors that are found when the same files are checked in small batches or individually. In the tests below, I saw more than 360 errors reported when checking files independently that are missed when launching pyright on the workspace.

The expectation is that a typing error in a file would be reported regardless of whether that file was being checked independently, or as part of a larger list of files.

Code or Screenshots
I apologize in advance that I cannot share details of our code, but I will try my best to describe the symptoms and maybe get some insights in how to debug it further. (I also apologize for the long writeup, I had some time to kill...)

Context
We have a large monorepo with the bulk of code in two folders, let's call them org and org_dev. There are some restrictions in importing relationships:

  • Code in org can import from org, not org_dev.
  • Code in org_dev can import from org and org_dev.
  • Code in org and org_dev import from third party libraries (core python, pandas, numpy, etc).

The org and org_dev trees contain 3600+ and 2400+ Python files respectively.

Originally we were running pyright on everything:

$ pyright org org_dev | grep informations
216 errors, 0 warnings, 0 informations

To speed things up, we separated the run into org and org_dev, checked in parallel:

$ pyright org | grep informations
107 errors, 0 warnings, 0 informations
$ pyright org_dev | grep informations
123 errors, 0 warnings, 0 informations

We got our speed increase, but we also got more errors reported (bulk 216, split 107+123=230). When running pyright directly on the files with new errors reported, pyright indeed reported the errors. (Note: the high number of errors is there because I am testing on 1.1.391 for this report; with our running version, 1.1.365, we went from 0 errors to 20-ish.)
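The bulk-versus-split comparison above can be checked mechanically. This is a minimal sketch that assumes the summary line keeps the format shown in the output above; `parse_summary` is a hypothetical helper name, not part of pyright.

```python
import re

# Matches summary lines such as "216 errors, 0 warnings, 0 informations"
# and the singular form "1 error, 0 warnings, 0 informations".
SUMMARY_RE = re.compile(r"(\d+) errors?, (\d+) warnings?, (\d+) informations?")


def parse_summary(line: str) -> tuple[int, int, int]:
    """Return (errors, warnings, informations) from a pyright summary line."""
    match = SUMMARY_RE.search(line)
    if match is None:
        raise ValueError(f"not a pyright summary line: {line!r}")
    errors, warnings, infos = (int(group) for group in match.groups())
    return errors, warnings, infos


# Summary lines copied from the runs above.
bulk = parse_summary("216 errors, 0 warnings, 0 informations")
split = [
    parse_summary("107 errors, 0 warnings, 0 informations"),
    parse_summary("123 errors, 0 warnings, 0 informations"),
]
split_errors = sum(errors for errors, _, _ in split)
print(f"bulk={bulk[0]} split={split_errors} difference={split_errors - bulk[0]}")
```

A positive difference only says that the counts disagree; it does not say which individual diagnostics differ between the runs.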

It would seem that some files were not being checked when running pyright on the whole codebase at once, and only checked when fewer files were being given to pyright. Is this expected? Does pyright have an upper limit on files it can check?

I then ran the checks with --verbose --stats and the stats report does suggest that all runs are inspecting the expected number of files (6175 = 3681+2494):

Analysis stats                both    org     org_dev
Errors                        216     107     123
Total files parsed and bound  16536   11856   11823
Total files checked           6175    3681    2494

Selecting a file with errors reported only when running on org, and inspecting how it appears in each run log shows something like:

$ # On org and org_dev together
$ grep FILE.py pyright-1.1.391.test.org.org_dev.verbose.stats.txt
38ms: file://FILE.py
$ # On org alone
$ grep FILE.py pyright-1.1.391.test.org.verbose.stats.txt
FILE.py
  FILE.py:94:34 - error: Unnecessary "# type: ignore" comment (reportUnnecessaryTypeIgnoreComment)
39ms: file://FILE.py
$ # Run pyright on file itself confirms report
$ pyright FILE.py
FILE.py
  FILE.py:94:34 - error: Unnecessary "# type: ignore" comment (reportUnnecessaryTypeIgnoreComment)
1 error, 0 warnings, 0 informations

The above seems to suggest that the file is being checked (time of 38ms), but the report is not being output.

More experiments
To dig into this further, I ran three more experiments:

  • Increase the max heap size to detect differences between scanning org, org_dev or both.
    • Result: heap size is not the issue.
  • Spread files between multiple CPUs by using --threads (only on org tree).
    • Result: file check distribution during thread assignment allows pyright to detect more errors.
  • Spread files between multiple pyright runs spread by folders (only on org tree).
    • Result: checking files independently detects 360+ errors missed when running on the workspace. Error detection is similar to the run with 10 threads.
Heap size test

Heap size

Our production heap size is set to 8GB by using NODE_OPTIONS="--max-old-space-size=8192". I ran the three checks, both (org org_dev), org, and org_dev, with heap sizes of 8192, 16384 and 24576. Interestingly, the combined run (both) reported 6 fewer errors with the larger heaps; the rest went unchanged. That is, checking org and org_dev separately always resulted in more errors reported than checking them both in one pyright run:

With heap of 8GB
both     216  errors,  0  warnings,  0  informations
org      107  errors,  0  warnings,  0  informations
org_dev  123  errors,  0  warnings,  0  informations
With heap of 16GB
both     210  errors,  0  warnings,  0  informations
org      107  errors,  0  warnings,  0  informations
org_dev  123  errors,  0  warnings,  0  informations
With heap of 24GB
both     210  errors,  0  warnings,  0  informations
org      107  errors,  0  warnings,  0  informations
org_dev  123  errors,  0  warnings,  0  informations
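For anyone repeating the heap-size sweep from a script, here is a minimal sketch of how the invocations could be assembled. `pyright_invocation` is a hypothetical helper; the only things taken from the report are the NODE_OPTIONS value format and the three heap sizes.

```python
import os


def pyright_invocation(targets, heap_mb):
    """Build (argv, env) for a pyright run with a given Node heap limit in MB."""
    env = dict(os.environ)
    # Note: this replaces any NODE_OPTIONS value already set in the environment.
    env["NODE_OPTIONS"] = f"--max-old-space-size={heap_mb}"
    argv = ["pyright", *targets]
    return argv, env


for heap_mb in (8192, 16384, 24576):
    argv, env = pyright_invocation(["org", "org_dev"], heap_mb)
    print(" ".join(argv), env["NODE_OPTIONS"])
    # To actually run it:
    # subprocess.run(argv, env=env, capture_output=True, text=True)
```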

Although heap size did not account for the missing errors, it did make for some interesting plots of memory management:

Memory behaviour with 8GB heap size Image
Memory behaviour with 24GB heap size Image
---
Threads test

Threads

Code block for test
for RUN in 1 2 3
do
    for THREADS in 1 5 10
    do
        echo Run $RUN, $THREADS threads
        pyright --threads $THREADS org > pyright-1.1.391.thread_test.threads_${THREADS}.run_${RUN}.txt
    done
done
echo "---"
echo "Errors by thread group:"
for THREADS in 1 5 10
do
    for FILE in pyright-1.1.391.thread_test.threads_${THREADS}.run_*.txt
    do
        echo $FILE `grep informations $FILE`;
    done
    echo "---"
done | column -t

The results seem puzzling. With one thread, they are consistent at 107 errors reported; but when multithreading, not only do I get more reports than single-threaded, but the number of reports can vary between runs! This is the output of the script above:

Run 1, 1 threads
Run 1, 5 threads
Run 1, 10 threads
Run 2, 1 threads
Run 2, 5 threads
Run 2, 10 threads
Run 3, 1 threads
Run 3, 5 threads
Run 3, 10 threads
---
Errors by thread group:
pyright-1.1.391.thread_test.threads_1.run_1.txt   107  errors,  0  warnings,  0  informations
pyright-1.1.391.thread_test.threads_1.run_2.txt   107  errors,  0  warnings,  0  informations
pyright-1.1.391.thread_test.threads_1.run_3.txt   107  errors,  0  warnings,  0  informations
---
pyright-1.1.391.thread_test.threads_5.run_1.txt   354  errors,  0  warnings,  0  informations
pyright-1.1.391.thread_test.threads_5.run_2.txt   358  errors,  0  warnings,  0  informations
pyright-1.1.391.thread_test.threads_5.run_3.txt   398  errors,  0  warnings,  0  informations
---
pyright-1.1.391.thread_test.threads_10.run_1.txt  460  errors,  0  warnings,  0  informations
pyright-1.1.391.thread_test.threads_10.run_2.txt  476  errors,  0  warnings,  0  informations
pyright-1.1.391.thread_test.threads_10.run_3.txt  460  errors,  0  warnings,  0  informations
---
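The run-to-run variability in the output above can be summarised per thread count. This is a small sketch using only the counts printed above; `spread` is a hypothetical helper name.

```python
# Error counts copied from the output above, keyed by the --threads value.
counts = {1: [107, 107, 107], 5: [354, 358, 398], 10: [460, 476, 460]}


def spread(values):
    """Return (min, max, max - min) for a list of error counts."""
    return min(values), max(values), max(values) - min(values)


for threads, values in counts.items():
    low, high, delta = spread(values)
    print(f"threads={threads:<2} min={low} max={high} spread={delta}")
```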

Here are some Venn diagrams showing the differences:

More threads result in more errors reported

Image

In three runs, using one thread had consistent results.

Image

Using five threads resulted in more errors reported with variability between runs.

Image

Using 10 threads resulted in even more errors, but variability did not increase.

Image

---
Check folders and files test

Further separating folders
Since scanning both org and org_dev gave slightly different results, I wanted to know whether splitting org into smaller chunks would have further effects.

I did two runs, one launching pyright targeting each subdirectory of org/ (40+ subdirs, "dirscan") and one targeting every independent file in the org/ tree (3500+ files, "filescan").

In short, org check reported more errors than dirscan, but filescan reported even more errors. Here is a Venn diagram of the overlap:

Image

The large number of errors found with filescan seemed reminiscent of the run with 10 threads. Indeed, most (although not all) of the errors are found by both the 10-thread scan and filescan. Here is the Venn diagram of the overlap:

Image
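The overlap counts behind Venn diagrams like these can be computed from the plain-text reports. What follows is a minimal sketch, assuming diagnostics appear as `file:line:col - severity: message` lines as in the output quoted earlier; `parse_diagnostics` and `venn_regions` are hypothetical helper names and the sample reports are made up.

```python
import re
from collections import Counter

DIAG_RE = re.compile(
    r"^\s*(?P<file>.+?):(?P<line>\d+):(?P<col>\d+) - "
    r"(?P<severity>error|warning|information): (?P<message>.*)$"
)


def parse_diagnostics(text):
    """Collect (file, line, col, first message line) keys from pyright text output."""
    found = set()
    for raw in text.splitlines():
        match = DIAG_RE.match(raw)
        if match:
            found.add(
                (match["file"], int(match["line"]), int(match["col"]), match["message"])
            )
    return found


def venn_regions(runs):
    """Count diagnostics per combination of runs that reported them."""
    regions = Counter()
    for key in set().union(*runs.values()):
        members = tuple(sorted(name for name, diags in runs.items() if key in diags))
        regions[members] += 1
    return regions


# Made-up sample reports, only to show the shape of the input.
filescan = parse_diagnostics("""
  a.py:1:1 - error: first (reportArgumentType)
  b.py:2:5 - error: second (reportCallIssue)
""")
threads10 = parse_diagnostics("""
  b.py:2:5 - error: second (reportCallIssue)
""")
print(venn_regions({"filescan": filescan, "t10": threads10}))
```

The message is part of the key because pyright can report several distinct errors at the same file position.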

Conclusion
The Venn diagrams describing the overlap between filescan and 10 threads suggest that launching pyright on a workspace is indeed either failing to detect, or report (since --stats does indicate a check time for underreported files), typing errors in the codebase.

VS Code extension or command-line
The tests in this report were run using pyright command line version 1.1.391, although similar behaviour was seen with previous versions.

@rgoya rgoya added the bug label Dec 29, 2024
@erictraut
Collaborator

erictraut commented Dec 29, 2024

Thanks for the detailed analysis.

Here's a few additional questions:

  1. Are you using a pyrightconfig.json or pyproject.toml file with a pyright configuration at the root of your project? If so, are you using an "include", "exclude", "ignore", "extraPaths", or any other config options that potentially affect which files are included in the project or how imports are resolved?
  2. Are you configuring any execution environments in your config file?
  3. Is there more than one pyright config file in your project?
  4. What directory are you running the command-line from? If there is no config file, the working directory defines the "project root". This affects import resolution behaviors and can lead to inconsistent errors if you're running pyright from different directories.
  5. Have you looked in detail at any of the errors that are reported in one run but not another? Do these errors have anything in common? For example, are they all coming from the same diagnostic rule? If you're not able to see any commonality, perhaps you could provide me with a few such errors to see if I can see some commonality.
  6. Do all of the diagnostics that you see appear to be legitimate? In other words, do you suspect that there are false negatives in the runs with fewer diagnostics, or false positives in the runs with more diagnostics?

@erictraut erictraut added the question label Dec 29, 2024
@erictraut
Collaborator

Oh, I think I see what's going on here. You mentioned that org_dev can import from org but not the other way around. That means when you tell pyright to include the files in org_dev, it's going to also implicitly include any files in org that are imported by org_dev and are under your project root. It will report errors for these files. If you then check org separately, you'll see the same errors again. This is expected behavior.

My recommendation is that you don't attempt to manually split up the type checking for your repo. Just check the entire repo and use --threads so pyright leverages the multiple cores in your system to reduce type checking times.

If you do split up type checking, you'll need to de-duplicate the resulting diagnostics yourself. You can do this by using the --outputjson command line switch and writing a small script to de-dup diagnostics, then report them in whatever format you prefer.
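A de-duplication script of the kind described above could look roughly like this. It is a sketch that assumes the --outputjson report has a top-level `generalDiagnostics` list whose entries carry `file`, `range`, `rule` and `message` fields; `diagnostic_key` and `merge_reports` are hypothetical helper names and the sample reports are made up.

```python
import json


def diagnostic_key(diag):
    """Identity of a diagnostic: file, start position, rule and message."""
    start = diag.get("range", {}).get("start", {})
    return (
        diag.get("file"),
        start.get("line"),
        start.get("character"),
        diag.get("rule"),
        diag.get("message"),
    )


def merge_reports(reports):
    """Merge parsed --outputjson reports, keeping the first copy of each diagnostic."""
    seen = set()
    merged = []
    for report in reports:
        for diag in report.get("generalDiagnostics", []):
            key = diagnostic_key(diag)
            if key not in seen:
                seen.add(key)
                merged.append(diag)
    return merged


# Made-up reports standing in for json.load() of two separate runs.
shared = {
    "file": "org/a.py",
    "severity": "error",
    "message": "example message",
    "rule": "reportArgumentType",
    "range": {"start": {"line": 9, "character": 4}, "end": {"line": 9, "character": 8}},
}
org_report = json.loads(json.dumps({"generalDiagnostics": [shared]}))
org_dev_report = json.loads(json.dumps({"generalDiagnostics": [shared, dict(shared, file="org_dev/b.py")]}))
print(len(merge_reports([org_report, org_dev_report])))
```

In practice each report would come from something like `pyright --outputjson org > org.json`, loaded with `json.load`.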

I'm going to close the issue because I'm pretty confident this explains the behaviors you're seeing, and pyright is working as intended here. If you think that I've missed something, feel free to reply.

@erictraut erictraut closed this as not planned Dec 29, 2024
@erictraut erictraut added the as designed label and removed the question label Dec 29, 2024
@rgoya
Author

rgoya commented Dec 29, 2024

Oh, I think I see what's going on here. You mentioned that org_dev can import from org but not the other way around. That means when you tell pyright to include the files in org_dev, it's going to also implicitly include any files in org that are imported by org_dev and are under your project root. It will report errors for these files. If you then check org separately, you'll see the same errors again. This is expected behavior.

I considered this reason as well, but I don't think that is the case. Most of the testing was done comparing only launching tests on org. For example, the org_8G/dirscan/filescan and the filescan/t5_r1/t10_r1 Venn diagrams I attached as part of the "Check folders and files test" section were obtained running pyright only on org, its subfolders and files; so no org_dev files were being tested.

If you do split up type checking, you'll need to de-duplicate the resulting diagnostics yourself. You can do this by using the --outputjson command line switch and writing a small script to de-dup diagnostics, then report them in whatever format you prefer.

I had checked for duplicate entries of the sort you suggested and did not find them. Here's the Venn diagram for errors found running pyright on both, org and org_dev:

Image

You can see that running on org and org_dev on their own catches more errors than running on both.

Additionally, addressing your point, double checking the subtrees where errors are reported confirms that each error is reported only by the run that is checking that subtree. During my analysis I loaded all the errors in a data frame and labelled them by which test reported them (boolean columns both_8G, org_8G, org_dev_8G, etc), as well as other features like the tree column, which indicates which folder the file is in.
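The construction of that data frame is not shown here, so the following is only a dependency-free sketch of the same labelling idea using plain dictionaries. `build_membership` and `count` are hypothetical helpers and the sample errors are made up; the run names mirror the boolean columns described above.

```python
from collections import defaultdict


def build_membership(runs):
    """Map each error key to {'tree': ..., run_name: bool, ...}."""
    table = defaultdict(dict)
    for run_name, errors in runs.items():
        for key in errors:
            table[key][run_name] = True
    for key, row in table.items():
        file_path = key[0]
        row["tree"] = file_path.split("/", 1)[0]  # top-level folder of the file
        for run_name in runs:
            row.setdefault(run_name, False)
    return dict(table)


def count(table, run_name, tree):
    """Number of errors reported by run_name for files under tree."""
    return sum(1 for row in table.values() if row[run_name] and row["tree"] == tree)


# Made-up error keys of the form (file, line, message).
runs = {
    "both_8G": {("org/a.py", 1, "m1"), ("org_dev/b.py", 2, "m2")},
    "org_8G": {("org/a.py", 1, "m1"), ("org/c.py", 3, "m3")},
    "org_dev_8G": {("org_dev/b.py", 2, "m2")},
}
table = build_membership(runs)
print(count(table, "org_8G", "org"), count(table, "org_8G", "org_dev"))
```

Splitting on the first path separator keeps org and org_dev apart, which a plain prefix test on "org" would not do.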

No errors are "crossing boundaries". both reports on both, org on org, and org_dev in org_dev:

> len(df_pivot.query("both_8G == True and tree == 'org'"))
98
> len(df_pivot.query("both_8G == True and tree == 'org_dev'"))
118

> len(df_pivot.query("org_8G == True and tree == 'org'"))
107
> len(df_pivot.query("org_8G == True and tree == 'org_dev'"))
0

> len(df_pivot.query("org_dev_8G == True and tree == 'org_dev'"))
123
> len(df_pivot.query("org_dev_8G == True and tree == 'org'"))
0

Confirming that the tree column matches the filepath (file):

> sum(df_pivot.query("tree == 'org'")["file"].apply(lambda x: x.startswith("org")))
488
> sum(df_pivot.query("tree == 'org'")["file"].apply(lambda x: x.startswith("org_dev")))
0

> sum(df_pivot.query("tree == 'org_dev'")["file"].apply(lambda x: x.startswith("org_dev")))
123
> sum(df_pivot.query("tree == 'org_dev'")["file"].apply(lambda x: x.startswith("org")))
0

I'm going to close the issue because I'm pretty confident this explains the behaviors you're seeing, and pyright is working as intended here. If you think that I've missed something, feel free to reply.

I think there's more to it.

(I'm getting/arranging the data to address the questions in your first response)

@rgoya
Author

rgoya commented Dec 30, 2024

Thanks for the very prompt response, @erictraut .

Here's a few additional questions:

  1. Are you using a pyrightconfig.json or pyproject.toml file with a pyright configuration at the root of your project? If so, are you using an "include", "exclude", "ignore", "extraPaths", or any other config options that potentially affect which files are included in the project or how imports are resolved?

We are using a sole pyproject.toml file at the root of the project. I don't believe any of the options we have there would affect imports:

[tool.pyright]
extraPaths = [
  "third_party/folder",
]

exclude = [
  "**/.ipynb_checkpoints",
  "**/__pycache__",
  ".mypy_cache",
  ".conda*",
  ".git",
  ".jupyterlab",
  "**/node_modules",
  "bazel*"
]

typeCheckingMode = "basic"
strictParameterNoneValue = false
reportUnusedExpression = false
reportPrivateImportUsage = false
reportUnnecessaryTypeIgnoreComment = true
stubPath = "third_party/stubs"

  2. Are you configuring any execution environments in your config file?

No.

  3. Is there more than one pyright config file in your project?

No.

  4. What directory are you running the command-line from? If there is no config file, the working directory defines the "project root". This affects import resolution behaviors and can lead to inconsistent errors if you're running pyright from different directories.

I run pyright from the root folder, and made sure that all tests used to gather the data for the issue report were run from the root folder.

  5. Have you looked in detail at any of the errors that are reported in one run but not another? Do these errors have anything in common? For example, are they all coming from the same diagnostic rule? If you're not able to see any commonality, perhaps you could provide me with a few such errors to see if I can see some commonality.

The bulk of the errors reported only when checking files independently (filescan) are of type reportArgumentType. That said, the enrichment of this type of error seems to be only coincidental (see bottom).

Here is the table grouping all errors in org found by the org run, dirscan and filescan, and counting how many of each type are detected:

> df_pivot.query("org_8G == True | dirscan == True | filescan == True")[
    ["report_class", "org_8G", "dirscan", "filescan"]
].value_counts().sort_index()

report_class                        org_8G  dirscan  filescan
reportArgumentType                  False   False    True        345
                                            True     True          2
                                    True    False    True         23
                                            True     True          5
reportAttributeAccessIssue          True    True     True         40
reportCallIssue                     False   False    True          7
                                    True    True     True          2
reportGeneralTypeIssues             True    True     True          1
reportIndexIssue                    True    True     True          1
reportMissingImports                True    True     True          1
reportOptionalCall                  True    True     True          2
reportOptionalSubscript             True    True     True          1
reportReturnType                    True    True     True          1
reportUnnecessaryTypeIgnoreComment  False   True     True          2
                                    True    True     True         30

Note: like in my previous reply, these runs check only files in org, so there are no org vs org_dev duplicates.

Looking at the actual files from where the errors were reported shows 108 files had errors reported only when scanning independently. Here's a Venn diagram of the following data:

> df_pivot.query("org_8G == True | dirscan == True | filescan == True")[
    ["file", "org_8G", "dirscan", "filescan"]
].drop_duplicates()
Image

  6. Do all of the diagnostics that you see appear to be legitimate? In other words, do you suspect that there are false negatives in the runs with fewer diagnostics, or false positives in the runs with more diagnostics?

This is what ultimately made me run pyright on each file independently (filescan test in main issue). They do seem to make sense.

The large number of reportArgumentType errors seems to be related to an ongoing issue we are having related to this report, but I also see errors of this type when checking both, just fewer of them.

[Edit:] I found something interesting in 5 files:

  • I have a file A.py that dirscan misses entirely (no errors reported), filescan finds 8 errors, and org_8G does report it, but only reports 2 errors out of the 8.
  • All errors are of the same general type reportArgumentType, some are duplicated in different lines, meaning there is a total of 4 different errors of counts [3, 1, 2, 2]
  • The 2 errors that org_8G detects are of the same type.

@erictraut
Collaborator

Thanks for the additional details. Reopening because it does appear there's something unexpected going on here.

@erictraut erictraut reopened this Dec 30, 2024
@erictraut erictraut removed the as designed label Dec 30, 2024
@rgoya
Author

rgoya commented Jan 3, 2025

I've tracked down one source of variability. This doesn't explain the full behaviour, only chips away at it.

I was bugged by the variability found during the threads-test described in the opening issue. That is, when launching pyright --threads 10 org/, some runs had their own set of errors. More precisely, the second run of the threads=10 test (t10_r2) found 14 errors that the other runs didn't. From the original post:
Image
I had a closer look and found that these 14 errors all came from the same file (let's call it A.py), yet only that t10_r2 test found them. Running pyright A.py reported no errors; however, I had been trying to debug other errors and had found a pyright github issue that mentioned that typechecking objects in one file can affect how typing is reported on another file. This led me to believe that there must be at least another file B.py in org/ that is triggering errors in A.py.

Hypothesis: there exists at least one file B.py that, when checked along with A.py, causes pyright to report errors in A.py that would otherwise not be reported.

Experiment: for every file B.py in the org/ tree, launch a typecheck pyright B.py A.py.

cd $PROJECT_ROOT
cat > pairwise_test.sh << EOF
echo -e "Testing with B=\$1\n\`pyright \$1 org/A.py\`"
EOF
find org/ -type f -name "*.py" | xargs -n 1 -P 14 bash pairwise_test.sh > pairwise_test.out
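The same pairwise search can be sketched in Python. This assumes the --outputjson report exposes the error total under `summary.errorCount`; `error_count` and `find_triggers` are hypothetical helpers, and `fake_count` is a stand-in for pyright so that the sketch runs without it.

```python
import json
import subprocess


def error_count(files):
    """Run pyright on files (in the given order) and return its error count."""
    result = subprocess.run(
        ["pyright", "--outputjson", *files], capture_output=True, text=True
    )
    return json.loads(result.stdout)["summary"]["errorCount"]


def find_triggers(target, candidates, count_errors=error_count):
    """Return candidates that add errors when checked before target."""
    baseline = count_errors([target])
    triggers = []
    for candidate in candidates:
        if candidate == target:
            continue
        alone = count_errors([candidate])
        # More errors together than the two files produce separately.
        if count_errors([candidate, target]) > baseline + alone:
            triggers.append(candidate)
    return triggers


# Stand-in for pyright: only "B.py before A.py" produces errors.
def fake_count(files):
    return 14 if files == ["org/B.py", "org/A.py"] else 0


print(find_triggers("org/A.py", ["org/A.py", "org/B.py", "org/C.py"], fake_count))
```

With pyright installed, the candidate list could come from `pathlib.Path("org").rglob("*.py")`, and `count_errors` can be left at its default. Unlike the shell version, this runs each candidate on its own as well, so it is slower but filters out errors that belong to the candidate itself.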

Result: indeed, out of ~3600 pairwise-tested B.py files, 14 triggered the reporting of errors in A.py when checked together. For all B.py hits, the following behaviour was observed:

$ pyright org/A.py
0 errors, 0 warnings, 0 informations
$ pyright org/B.py
0 errors, 0 warnings, 0 informations
$ pyright org/B.py org/A.py
org/A.py
  org/A.py:376:35 - error: Cannot access attribute "x" for class "Literal['data']"
    Attribute "x" is unknown (reportAttributeAccessIssue)
  org/A.py:376:35 - error: Cannot access attribute "x" for class "Literal['layout']"
    Attribute "x" is unknown (reportAttributeAccessIssue)
  org/A.py:376:35 - error: Cannot access attribute "x" for class "Literal['frames']"
    Attribute "x" is unknown (reportAttributeAccessIssue)
  org/A.py:376:60 - error: Cannot access attribute "xaxis" for class "Literal['data']"
    Attribute "xaxis" is unknown (reportAttributeAccessIssue)
  org/A.py:376:60 - error: Cannot access attribute "xaxis" for class "Literal['layout']"
    Attribute "xaxis" is unknown (reportAttributeAccessIssue)
  org/A.py:376:60 - error: Cannot access attribute "xaxis" for class "Literal['frames']"
    Attribute "xaxis" is unknown (reportAttributeAccessIssue)
  org/A.py:380:41 - error: Unnecessary "# type: ignore" comment (reportUnnecessaryTypeIgnoreComment)
  org/A.py:396:39 - error: Cannot access attribute "y" for class "Literal['data']"
    Attribute "y" is unknown (reportAttributeAccessIssue)
  org/A.py:396:39 - error: Cannot access attribute "y" for class "Literal['layout']"
    Attribute "y" is unknown (reportAttributeAccessIssue)
  org/A.py:396:39 - error: Cannot access attribute "y" for class "Literal['frames']"
    Attribute "y" is unknown (reportAttributeAccessIssue)
  org/A.py:396:64 - error: Cannot access attribute "yaxis" for class "Literal['data']"
    Attribute "yaxis" is unknown (reportAttributeAccessIssue)
  org/A.py:396:64 - error: Cannot access attribute "yaxis" for class "Literal['layout']"
    Attribute "yaxis" is unknown (reportAttributeAccessIssue)
  org/A.py:396:64 - error: Cannot access attribute "yaxis" for class "Literal['frames']"
    Attribute "yaxis" is unknown (reportAttributeAccessIssue)
  org/A.py:403:45 - error: Unnecessary "# type: ignore" comment (reportUnnecessaryTypeIgnoreComment)
14 errors, 0 warnings, 0 informations

Interestingly, inverting the order of the files in the pyright call results in no reported errors:

$ pyright org/A.py org/B.py
0 errors, 0 warnings, 0 informations

I assume when pyright typechecks B.py first, it builds its types database and then that affects the typing results when typechecking A.py.

Digging into the relationships between all B.py and A.py I found the common pattern was usage of plotly and I was able to distill the code to replicate this behaviour.

Setup trigger and target:

cat > B.trigger.py << EOF
import pandas as pd
import plotly.express as px

df = pd.DataFrame({"x_coord": [0, 1, 2, 3, 4], "y_coord": [2, 3, 4, 5, 1]})
fig = px.scatter(data_frame=df, x="x_coord", y="y_coord")

# The following line is enough to trigger the error
trigger = tuple(fig.data)
# Note that fig.data is already a tuple, so this is not changing anything
assert trigger == fig.data
print(f"{(trigger == fig.data) = }")
EOF
cat > A.target.py << EOF
import pandas as pd
import plotly.graph_objs as go
import plotly.express as px

df = pd.DataFrame({"x_coord": [0, 1, 2, 3, 4], "y_coord": [2, 3, 4, 5, 1]})
fig = px.bar(df, x="x_coord", y="y_coord")

assert isinstance(fig.layout, go.Layout)
axes_data = [f.x for f in fig.data if f.xaxis == "x"]
print(f"{axes_data = }")
# The above code doesn't make much sense in this context, but I use it to
# replicate the original error. Accessing fig.data[0].xaxis will suffice
# to get the similar A.py/B.py effect, with a different error
# Note: I had to add the assert() in place of the original code, without it
# pyright will throw an error on this file alone. It is unclear to me why.
EOF

Then test:

$ python A.target.py
axes_data = [array([0, 1, 2, 3, 4])]
$ python B.trigger.py
(trigger == fig.data) = True
$ pyright A.target.py
0 errors, 0 warnings, 0 informations
$ pyright B.trigger.py
0 errors, 0 warnings, 0 informations
$ pyright A.target.py B.trigger.py
0 errors, 0 warnings, 0 informations
$ pyright B.trigger.py A.target.py
A.target.py
  A.target.py:9:16 - error: Cannot access attribute "x" for class "Literal['data']"
    Attribute "x" is unknown (reportAttributeAccessIssue)
  A.target.py:9:16 - error: Cannot access attribute "x" for class "Literal['layout']"
    Attribute "x" is unknown (reportAttributeAccessIssue)
  A.target.py:9:16 - error: Cannot access attribute "x" for class "Literal['frames']"
    Attribute "x" is unknown (reportAttributeAccessIssue)
  A.target.py:9:41 - error: Cannot access attribute "xaxis" for class "Literal['data']"
    Attribute "xaxis" is unknown (reportAttributeAccessIssue)
  A.target.py:9:41 - error: Cannot access attribute "xaxis" for class "Literal['layout']"
    Attribute "xaxis" is unknown (reportAttributeAccessIssue)
  A.target.py:9:41 - error: Cannot access attribute "xaxis" for class "Literal['frames']"
    Attribute "xaxis" is unknown (reportAttributeAccessIssue)
6 errors, 0 warnings, 0 informations
$

(I believe the above behaviour could go into its own issue, let me know if you want me to submit it separately.)

A question still stands related to the main issue. If there are multiple B.py files that trigger reports in A.py, and all of them are supposed to be checked by pyright org/ and other tests, why are A.py errors not being triggered? I can think of some options:

  1. In all but the t10_r2 run, A.py was checked before any B.py file, and thus the A.py errors were not triggered. This depends on how pyright scans files when given a folder. The actual names of A.py and B.py files ordered alphabetically puts one B.py file before A.py and the rest of the B.py after A.py.
  2. A.py errors were not reported along with the other few hundred errors detected by filescan, but not by pyright org/. I lean towards this mainly given the overall pattern of files not being reported when everything is being checked in a single pyright org/ call (although this specific case does match better with (1.))
  3. There are other combinations of file-checking that "defuse" the B.py-triggering of errors. This seems far-fetched, but without being familiar with pyright internals I don't know how impossible it is.

erictraut added a commit that referenced this issue Jan 3, 2025
…type inference that involves recursion. This addresses #9642.
erictraut added a commit that referenced this issue Jan 3, 2025
…type inference that involves recursion. This addresses #9642. (#9658)
@erictraut
Collaborator

Thanks for the repro steps. I was able to repro the issue and determine the underlying cause. As you suspected, it was related to code within plotly. This library is unfortunately not annotated, so pyright needs to attempt to infer function and method return types. The problem occurs in the BaseFigure.__getitem__ method, which has a code path that eventually invokes the BaseFigure.data property, which in turn invokes BaseFigure.__getitem__. This type of recursion means that there's a cycle in the call graph, and without return type annotations, there's no way to deterministically infer return types for all of the nodes in the graph. The results depend on where you enter the cycle.
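To illustrate why results can depend on where a cycle is entered, here is a toy model. It is not pyright's actual inference algorithm: the two-node call graph, the type names and the "return Unknown on re-entry" rule are all assumptions made for illustration.

```python
UNKNOWN = "Unknown"

# Hypothetical call graph: each function has its own return types plus a list
# of functions whose results it may return. The two functions call each other.
GRAPH = {
    "getitem": ({"str"}, ["data"]),
    "data": ({"tuple"}, ["getitem"]),
}


def infer_all(entry_order):
    """Infer return types with a cache, visiting entry points in the given order."""
    cache = {}
    in_progress = []

    def infer(name):
        if name in cache:
            return cache[name]
        if name in in_progress:
            return {UNKNOWN}  # cycle detected: give up on this edge
        in_progress.append(name)
        own_types, callees = GRAPH[name]
        result = set(own_types)
        for callee in callees:
            result |= infer(callee)
        in_progress.pop()
        cache[name] = result
        return result

    for name in entry_order:
        infer(name)
    return cache


print(infer_all(["getitem", "data"]))
print(infer_all(["data", "getitem"]))
```

In this toy model, whichever function is inferred first ends up with the fuller union, while the other is cached with an Unknown placeholder where the cycle was cut. That mirrors the order dependence seen between `pyright B.py A.py` and `pyright A.py B.py`, without claiming to reproduce the real mechanism.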

I've updated pyright's return type inference logic so it's more deterministic in cases like this. It's still not 100% deterministic because there are hard-coded limits on recursion depth that are required for performance and stability reasons (to prevent hangs or stack overflows). If one of these limits is hit, which should rarely if ever happen in non-contrived cases, it's still possible to observe order-dependent type evaluation phenomena like the one you reported. I'm going to ignore that edge case because I don't see a good way to address it.

The issue you reported will be addressed in the next release.

@erictraut erictraut added the addressed in next version label Jan 3, 2025
@rgoya
Author

rgoya commented Jan 3, 2025

Thanks for the explanation on the source of that plotly problem, @erictraut . Definitely a head scratcher.

Just to clarify: does the "addressed in next version" flag relate only to the plotly A.py/B.py fix, or do you suspect this logic will address the generality of pyright org/ underreporting errors vs threads and filescan?

@erictraut
Collaborator

The next release will address the case that you highlighted in your repro. If you are able to repro other cases, please file separate bugs.

@rgoya
Author

rgoya commented Jan 3, 2025

OK, I'll submit other cases I find as separate bugs. It'd be great if this issue could stay open, since it has the main "workspace" vs "file" error differences which still highlights something weird happening with the way pyright is checking large codebases vs files.

@erictraut
Collaborator

I leave issues open until the next release when the bug is addressed. If you find additional bugs, please open new issues. It's fine to reference this issue if you find that useful.

@rgoya
Author

rgoya commented Jan 6, 2025

Ugh, I just found a case of this sort:

pyright org/ : shows errors in A.py
pyright A.py : shows errors in A.py
pyright B.py A.py : clears errors in A.py
pyright C.py B.py A.py: shows errors in A.py

A.py is a test script using (generic) classes from B.py and C.py, while C.py defines common test classes based on classes in B.py.

At first sight it might seem to follow the same graph cycle problem found for plotly, so I will hold on until the next pyright release to see if it also fixed this. I will try to distill a repro then if the problem still exists.
