Skip to content

Remove iostream include for C backend #1400

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
May 19, 2025

Conversation

maresb
Copy link
Contributor

@maresb maresb commented May 12, 2025

Description

The iostream include seems to be unused, and it's actually leading to #1398, so an alternative resolution is to remove the include.

We'll see if the test suite still passes...

Related Issue

Checklist

Type of change

  • New feature / enhancement
  • Bug fix
  • Documentation
  • Maintenance
  • Other (please specify):

📚 Documentation preview 📚: https://pytensor--1400.org.readthedocs.build/en/1400/

@ricardoV94
Copy link
Member

Why can't we include it and are you sure it's not used?

@maresb
Copy link
Contributor Author

maresb commented May 13, 2025

We can include it, but at the cost of a discrepancy on Windows between the libgcc v13 and v15 symbols that can lead to #1398. If we remove it, then I think we no longer have to worry about #1398, at least for now.

I am not absolutely sure it's unused. How confident are you in the test suite? There's one usage here that's commented out:

//std::cerr << (unit/256) MOD 16 << (unit / 16) MOD 16 << unit MOD 16<< '\\n';

@lucianopaz
Copy link
Member

lucianopaz commented May 13, 2025

I couldn’t find any calls to the iostream functions in pytensor. I hope that we aren’t using any symbols from namespaces that are included indirectly by iostream.

@ricardoV94
Copy link
Member

We can include it, but at the cost of a discrepancy on Windows between the libgcc v13 and v15 symbols that can lead to #1398.

What does that mean? I'm more worried about the underlying breaking change than the specific test we happened to see failing since we don't test extensively on windows. Wonder if we're just sweeping it under the rug.

Do we need to invalidate Pytensor cache for windows users for instance? How does including iostream suffice to break compilation? What's special about it vs other standard libraries?

@maresb
Copy link
Contributor Author

maresb commented May 13, 2025

What does that mean?

I don't know enough about C development to know myself. In particular I don't know what type of bug this is: I'm not sure if the DLL is just missing symbols that should be present, or if the symbols were deliberately removed. I also don't understand why there is a mismatch in the versions between the dll.a used at compile-time and the .dll used at runtime. As I wrote in Zulip under #general > random cmake test failures all of a sudden @ 💬,

I'm poking around on a horrendously slow Windows VM to diagnose an issue we're hitting in PyTensor that seems to be identical. I don't personally understand C development, but based on my conversations with ChatGPT, it seems that we compile (PyTensor compiles at runtime) against Library\lib\gcc\x86_64-w64-mingw32\13.3.0\libstdc++.dll.a which promises that libstdc++-6.dll should contain _ZSt21ios_base_library_initv, but it's missing in the DLL provided by v15.1.

Someone else (perhaps @lucianopaz) may be able to make more sense of that thread and conda-forge/ctng-compilers-feedstock#174.

@lucianopaz
Copy link
Member

lucianopaz commented May 14, 2025

Someone else (perhaps @lucianopaz) may be able to make more sense of that thread and conda-forge/ctng-compilers-feedstock#174.

That thread flied way over my head! What I managed to follow from this is that the symbol _ZSt21ios_base_library_initv from the iostream include headers is missing one leading underscore in its name. The object file that gets linked later has that symbol with two leading underscores. That made the windows linker (lld apparently) say that the requested symbol was missing, but the GNU linker (GNU ld?) somehow managed to find the symbol anyway and link it even though it wasn't really there. The rest of the thread was impossible to parse for me, but it looks like they discussed why the symbol had a different name, and how to alias it, or patch it, or make the linker find it. So as far as I can tell, the standard c++ library has this problem for all OSes, but for some magic coincidence, linux managed to cope and handle it, while windows actually did the sane thing and errored out. I was wrong, the thread actually says that the symbol is missing from the i386/windows environment only. So in the end, this problem was only present for windows and not linux.

@maresb
Copy link
Contributor Author

maresb commented May 14, 2025

Thanks @lucianopaz for the input!!!

The common fix I implemented in both conda-forge/pytensor-suite-feedstock#158 and conda-forge/conda-forge-repodata-patches-feedstock#1024 blocks only libstdcxx v15.1.0 via the constraint libstdcxx !=15.1.0. So that fix was more of just a band-aid, and the problem is likely to resurface with future releases.

Removing the unused iostream in would in contrast be a more sustainable workaround for this bug.

If you want to keep iostream, then one possible alternative that would require a lot of work would be to try and check the toolchain vs runtime versions for known conflicts and raise an informative error, or to inspect the imports of the compiled ops and verify that they're present in the runtime.

@maresb maresb force-pushed the remove-iostream branch from b136fe0 to aa114f7 Compare May 19, 2025 13:55
@maresb
Copy link
Contributor Author

maresb commented May 19, 2025

I want to gently advocate for getting this merged.

I'm fairly surprised that something seemingly so simple and innocuous like including iostream would lead to an incompatibility between libgcc v13 and v15, and I don't understand the interface/guarantees, in particular whether or not this is a bug. But it seems to me that since it's being caused by dead code, we should just remove the dead code (in this case an unused include).

Copy link

codecov bot commented May 19, 2025

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 82.10%. Comparing base (5229feb) to head (aa114f7).
Report is 1 commits behind head on main.

Additional details and impacted files

Impacted file tree graph

@@           Coverage Diff           @@
##             main    #1400   +/-   ##
=======================================
  Coverage   82.10%   82.10%           
=======================================
  Files         208      208           
  Lines       49576    49576           
  Branches     8791     8791           
=======================================
  Hits        40704    40704           
  Misses       6699     6699           
  Partials     2173     2173           
Files with missing lines Coverage Δ
pytensor/link/c/cmodule.py 61.10% <100.00%> (ø)
pytensor/tensor/blas.py 73.55% <100.00%> (ø)
pytensor/tensor/special.py 96.85% <100.00%> (ø)
🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@lucianopaz lucianopaz merged commit 5ffe17a into pymc-devs:main May 19, 2025
73 checks passed
@maresb maresb deleted the remove-iostream branch May 19, 2025 15:29
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

ImportError: DLL load failed while importing when libgcc 15.1.0 is installed
3 participants