Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Performance drop for pefile.PE() #420

Open
mschwoer opened this issue Aug 27, 2024 · 5 comments
Open

Performance drop for pefile.PE() #420

mschwoer opened this issue Aug 27, 2024 · 5 comments
Assignees

Comments

@mschwoer
Copy link

Hi!

After upgrading from 2023.2.7 to 2024.8.26 the calls to pefile.PE() done by pyinstaller==6.10.0 take considerably longer on a Windows github action runner.
This is observed in pyinstaller's "Performing binary vs. data reclassification" step, which takes orders of magnitudes longer now.

logs:
2023.2.7: https://github.com/MannLabs/alphapeptdeep/actions/runs/10524645976/job/29161830971
"reclassification" step takes 3s

rerunning the same exact job through the github UI today pulled the latest version of pefile:
2024.8.26: https://github.com/MannLabs/alphapeptdeep/actions/runs/10524645976/job/29307704805
(reclassification step is still in progress at time of writing this ;-))

similar job with some extra debug logs in pyinstaller (proof that is the actual call to pefile.PE() for each single file):
https://github.com/MannLabs/alphapeptdeep/actions/runs/10578467189/job/29308669020

similiar job with downgrade to pefile==2023.2.7 (proof that 2024.8.26 is the root cause):
https://github.com/MannLabs/alphapeptdeep/actions/runs/10578920267/job/29310144178

happy to share any more details if required..

best,
Magnus

@rokm
Copy link
Contributor

rokm commented Aug 27, 2024

This performance hit is caused by 91fb851; every time pefile.PE is closed (either explicitly via close method, or due to use of context, or if the loading fails with exception), garbage collection is forced. Which is not particularly performant in loops that try to process large number of files.

@erocarrera
Copy link
Owner

Thanks @rokm !
@mschwoer can you verify that the garbage collection is what's causing the performance drop?

@rokm
Copy link
Contributor

rokm commented Aug 28, 2024

@erocarrera In case it helps with confirmation/verification:

# perftest.py
import sys
import time
import numpy.core
import pefile

import numpy as np

mode = sys.argv[1] if len(sys.argv) > 1 else 'binary'
if mode == 'binary':
    # Test with valid PE file
    file = numpy.core._multiarray_umath.__file__
else:
    # Test with non-PE file
    file = numpy.core.__file__

print(f"Test file: {file}")

durations = []
for i in range(10000):
    time_start = time.time()
    try:
        with pefile.PE(file, fast_load=True) as pe:
            pass
    except Exception:
        pass
    elapsed = time.time() - time_start 
    durations.append(elapsed)

print(f"Total: {np.sum(durations):.2f} sec")
print(f"Mean: {1000*np.mean(durations):.4f} msec")

Running with pefile 2023.2.7

>python perftest.py binary
Test file: [...]\venv\\lib\site-packages\numpy\core\_multiarray_umath.cp310-win_amd64.pyd
Total: 7.54 sec
Mean: 0.7535 msec
>python perftest.py data
Test file: [...]\venv\\lib\site-packages\numpy\core\__init__.py
Total: 2.05 sec
Mean: 0.2047 msec

Running with pefile 2024.8.26

>python perftest.py binary
Test file: [...]\venv\\lib\site-packages\numpy\core\_multiarray_umath.cp310-win_amd64.pyd
Total: 44.89 sec
Mean: 4.4892 msec
>python perftest.py data
Test file: [...]\venv\\lib\site-packages\numpy\core\__init__.py
Total: 37.94 sec
Mean: 3.7937 msec

Running with 2024.8.26 and gc.collect() line removed:

>python perftest.py binary
Test file: [...]\venv\\lib\site-packages\numpy\core\_multiarray_umath.cp310-win_amd64.pyd
Total: 8.72 sec
Mean: 0.8716 msec
>python perftest.py data
Test file: [...]\venv\\lib\site-packages\numpy\core\__init__.py
Total: 2.77 sec
Mean: 0.2769 msec

@mschwoer
Copy link
Author

mschwoer commented Aug 28, 2024

@erocarrera I can confirm it's the garbage collection:
https://github.com/MannLabs/alphadia/actions/runs/10595833207/job/29362473745
(only difference to current master is the gc commented out in l. 2964 of pefile.py and an added print statement)

@akaszynski
Copy link

akaszynski commented Aug 30, 2024

Also seeing a huge performance drop when freezing via pyinstaller with pefile==2024.8.26. Issue goes away with:
pefile<2024.8.26.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

5 participants
@erocarrera @rokm @akaszynski @mschwoer and others