Skip to content

Update libraries later if possible #582

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 28 commits into
base: main
Choose a base branch
from

Conversation

oraluben
Copy link
Contributor

@oraluben oraluben commented May 9, 2025

Fixes: #561, pytorch/pytorch#149471

Fixes that:

  1. When we have two dependencies: a.so -> b.so -> c.so and a.so -> c.so where b.so do not have rpath for c.so but a.so has, and if we process b.so first, c.so won't be found;
  2. The order to precess dependencies of a.so is random (for from a set):
    for soname in needed:

by:

  1. use sorted(set) and ordereddict to iterate through libraries deterministically
  2. try to update realpath for libraries if they're not found yet

Also fix an issue when we exclude based on path but excluded library may still present in DynamicExecutable.libraries with empty platform field, which breaks this check:

assert dependency.platform is not None

Copy link

codecov bot commented May 9, 2025

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 92.84%. Comparing base (93f45cc) to head (ef63c60).

Additional details and impacted files
@@            Coverage Diff             @@
##             main     #582      +/-   ##
==========================================
+ Coverage   92.82%   92.84%   +0.01%     
==========================================
  Files          21       21              
  Lines        1771     1774       +3     
  Branches      333      334       +1     
==========================================
+ Hits         1644     1647       +3     
  Misses         77       77              
  Partials       50       50              

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@oraluben oraluben force-pushed the update-found-libs branch from 9465b6d to 66ef898 Compare May 10, 2025 12:39
@oraluben oraluben force-pushed the update-found-libs branch from 66ef898 to 4272b53 Compare May 10, 2025 12:40
@oraluben oraluben force-pushed the update-found-libs branch from 282cce2 to 40c9b98 Compare May 10, 2025 14:50
@oraluben
Copy link
Contributor Author

oraluben commented May 10, 2025

cython issue: cython/cython#6861

@oraluben oraluben force-pushed the update-found-libs branch from 16f06ed to 7993e1d Compare May 11, 2025 01:29
@auvipy auvipy requested a review from Copilot May 11, 2025 05:30
Copy link

@Copilot Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR addresses issues with unpredictable dependency processing and missing library resolutions by introducing deterministic library ordering and updating RPATHs.

  • Update build commands in tests/integration/testrpath/setup.py to compile and patch new library d/libd.so.
  • Modify test expectations in tests/integration/test_manylinux.py to validate the new dependency ordering and output.
  • Update lddtree.py to use OrderedDict and sorted iteration for consistent dependency processing.

Reviewed Changes

Copilot reviewed 7 out of 7 changed files in this pull request and generated 1 comment.

Show a summary per file
File Description
tests/integration/testrpath/setup.py Updated compilation commands and adjusted linker flags for new library d/libd.so.
tests/integration/testrpath/d/d.h & d/d.c Added new header and source for library d.
tests/integration/testrpath/b/b.c Updated fb() to incorporate dependency on fd() from library d.
tests/integration/testrpath/a/a.c Updated fa() to aggregate values from fb() and fd().
tests/integration/test_manylinux.py Modified test assertions to account for the new dependency ordering and output changes.
src/auditwheel/lddtree.py Refactored dependency resolution to use OrderedDict and sorted iteration.
Comments suppressed due to low confidence (1)

tests/integration/test_manylinux.py:616

  • Relying on an exact substring in log output to validate behavior may be fragile; consider asserting on a structured error code or exit status to make the test more robust against log message changes.
assert 'lddtree:Could not locate libd.so, skipping' in repair_output, (

@oraluben oraluben force-pushed the update-found-libs branch from b118006 to 6b8e49f Compare May 14, 2025 07:29
@oraluben oraluben force-pushed the update-found-libs branch from 6b8e49f to 76b78ac Compare May 14, 2025 07:38
@mayeut
Copy link
Member

mayeut commented May 17, 2025

Thanks for digging into this.
#586 will fix the non-deterministic search order & revert the behavior to what it was before. Would that be enough to fix the PyTorch issue (I certainly hope so since it seemed to be a regression in 6.3.0) ?
If not, the way the "realpath" is updated is probably not enough w.r.t. excluded libraries (we'd probably need a second pass to filter once again previously analyzed shared objects after adding an exclusion).

@oraluben
Copy link
Contributor Author

Would that be enough to fix the PyTorch issue (I certainly hope so since it seemed to be a regression in 6.3.0) ?

it's enough for upstream PyTorch but 6.3.0 also introduced a new issue which could be fixed by: eb8773a, i.e. this:

Also fix an issue when we exclude based on path but excluded library may still present in DynamicExecutable.libraries with empty platform field, which breaks this check:

assert dependency.platform is not None

And after removing the library from _all_libs this should be fine?

(we'd probably need a second pass to filter once again previously analyzed shared objects after adding an exclusion)

I suspect a second pass may not be enough for general case. For example we have n libraries from a.so to n.so, a.so has rpath for all other libraries while all others do not.
a.so -> b.so -> c.so -> ... -> n.so
a.so -> c.so -> ... -> n.so
...
a.so -> n.so

In this case we need a n-pass iteration to find all libraries unless we use a BFS search? I can imagine there's other issue. How's libc/ld handle this?

@mayeut
Copy link
Member

mayeut commented May 18, 2025

a new issue which could be fixed

The need for this commit is based on realpath update.
If we don't update the realpath later on, then platform should always be valid when realpath is valid (and we don't reach this assert if realpath is None).
With the order fixed to respect DT_NEEDED entries, we're doing the same as ld which was broken in 6.3.0
What remains IMHO lies outside lddtree, that's the order in which we call lddtree on the various ELF objects.

@oraluben
Copy link
Contributor Author

It should be fine then? The reason is I'm using sorted(needed) rather than the DT_NEEDED order, so now it should just work?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

track pytorch issue https://github.com/pytorch/pytorch/pull/149471
3 participants