[BUG] Encryption may generate functions that cause Python to crash after 1024 or 2048 calls. #1491
Comments
Try upgrading to the latest Pyarmor 7.x version, 7.7.4. It would also help if you could provide a sample script that reproduces the problem on my side. |
Also try to remove option |
Thanks for the reply. I have tried what you pointed out.
I've created reproduction code below.
Abstract
This procedure creates 10000 files that may trigger the problem, then runs and tests them all.
File structure
root
Files marked with * do not require encryption; the problem reproduces without encrypting them.
File Contents
run.py
import main
if __name__ == '__main__':
    print("start")
    main.run()
    print("end")
main.py
from concurrent.futures import ProcessPoolExecutor, ThreadPoolExecutor, as_completed
import sys
import re
def submit(n):
    import sub
    sub.run(n)
def execute(n):
    with ProcessPoolExecutor(max_workers=1) as e:
        future = e.submit(submit, n)
        future.result()
def run():
    max_workers = 12
    n_list = list(range(10000))
    continue_on_error = False
    for arg in sys.argv[1:]:
        if matched := re.search(re.compile(r"-*worker=(?P<num>\d+)"), arg):
            max_workers = int(matched.group("num"))
        elif matched := re.search(re.compile(r"-*range=(?P<begin>\d+)[,:-](?P<end>\d+)"), arg):
            n_list = n_list[int(matched.group("begin")):int(matched.group("end"))]
        elif matched := re.search(re.compile(r"-*continue-on-error"), arg):
            continue_on_error = True
        else:
            print(arg)
    e = ThreadPoolExecutor(max_workers=max_workers)
    futures = [e.submit(execute, n) for n in n_list]
    try:
        for future in as_completed(futures):
            n = n_list[futures.index(future)]
            try:
                future.result()
            except Exception as ex:
                print(f"{n},failed: {ex}")
                if not continue_on_error:
                    raise
            else:
                print(f"{n},success")
    finally:
        for future in futures:
            future.cancel()
        e.shutdown(wait=False)
sub.py
The reproduction probability will decrease if there is not some complex processing before calling the function in mod.py.
import pandas as pd
import importlib
class A:
    def a(self, a, b):
        pass
    def b(self, t):
        pass
def run(n):
    mod = importlib.import_module(f"suspects.mod{n:04}")
    for i in range(1025):
        if i > 1015:
            aaa = [pd.DataFrame({j: [0.0] for j in range(97)}).astype("category") for _ in range(4)]
            for a in aaa:
                for s in "abcdefg":
                    a[s] = ""
            bbb = [a.to_numpy() for a in aaa]
            c = pd.DataFrame({"": [0.0]}).astype("category")
            d = c.to_numpy()
        mod.f(A(), A())
        if i % 100 == 0:
            print(f"{n},{i}")
mod.py
I couldn't find a way to reproduce it in a shorter description.
from cls import C
def f(a, c):
    l = list()
    n = ""
    l.append(n)
    if c is None:
        t = C(C(a=l).x({n: None}))
    else:
        t = c
    a.a(a=True, b=True)
    r = C(a.b(t))
    r.x = l
    return r
cls.py
class C:
    def __init__(self, a):
        pass
duplicate.py
import shutil
import os
os.makedirs("suspects", exist_ok=True)
for i in range(10000):
    shutil.copy(
        r"mod.py",
        rf"suspects\mod{i:04}.py",
    )
Reproduction environment
Reproduction procedures
The following is no longer necessary because sub.py could be made faster (updated November 1). It takes about an hour.
The default setting is to stop when one problematic file is found, but if you want to get all results without stopping, specify the options as follows.
In my environment, about 1 or 2 files per 1000 produce errors. Even a problematic file does not fail on every run; with the current sub.py, some files fail 1% of the time or less. The probability of occurrence varies from file to file. |
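Since main.py prints one `n,success` or `n,failed: ...` line per file (plus `n,i` progress lines from sub.py), the failing file indices can be collected from a captured log. The following is a minimal sketch under that assumption only; `summarize` and `failed_indices` are illustrative names, not part of the reproduction code.

```python
def summarize(lines):
    """Map each file index to its final outcome ("success" or "failed")."""
    outcomes = {}
    for line in lines:
        index, _, rest = line.strip().partition(",")
        if not index.isdigit():
            continue  # skips lines such as "start" and "end"
        if rest == "success":
            outcomes[int(index)] = "success"
        elif rest.startswith("failed"):
            outcomes[int(index)] = "failed"
        # progress lines such as "12,100" fall through and are ignored
    return outcomes


def failed_indices(lines):
    """Return the sorted indices of files whose run ended in failure."""
    return sorted(n for n, state in summarize(lines).items() if state == "failed")


if __name__ == "__main__":
    # Made-up log lines in the format printed by main.py and sub.py.
    log = ["start", "3,0", "3,success", "7,0", "7,failed: example error", "end"]
    print(failed_indices(log))
```

Running the script with `continue-on-error` and feeding the captured output through `failed_indices` gives the list of files worth re-testing.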
@jondy Could you please reopen it? |
Sorry, in test scripts, there is third-party library And pyarmor obfuscated scripts has different frame, for example, And someone else has reported the issues about pandas to use So there are 2 solutions, first patch pandas, there are some examples for Pyarmor 8, but it also works for Pyarmor 7 The other solution is to use |
It seems that sys._getframe is never called in this code. I will continue to investigate. |
We have investigated the cause.
sub.py ver.2
import importlib
from concurrent.futures import ThreadPoolExecutor
class A:
    def a(self, a, b):
        pass
    def b(self, t):
        pass
def a(c):
    b = ThreadPoolExecutor()
    return ThreadPoolExecutor()
def run(n):
    mod = importlib.import_module(f"suspects.mod{n:04}")
    for i in range(1025):
        if i > 1015:
            ccc = [a(ThreadPoolExecutor()) for _ in range(512)]
        mod.f(A(), A())
sub.py ver.3
import importlib
import sqlite3
class A:
    def a(self, a, b):
        pass
    def b(self, t):
        pass
def a(c):
    b = sqlite3.connect(":memory:")
    return sqlite3.connect(":memory:")
def run(n):
    mod = importlib.import_module(f"suspects.mod{n:04}")
    for i in range(1025):
        if i > 1015:
            ccc = [a(sqlite3.connect(":memory:")) for _ in range(512)]
        mod.f(A(), A())
No error seems to occur when using "--obf-code 0". However, given the results so far, that is hard to apply in practice, because every script would have to be obfuscated with "--obf-code 0". (Once the conditions that cause the problem are met, an encrypted file elsewhere that is completely unrelated to those conditions can raise the error.)
We also checked the behavior with version 8 (trial).
old: pyarmor-7 obfuscate run.py --recursive
If there is no problem with the new command, that's enough, but I would appreciate it if you could add it and find out what the problem was. |
Although the CPU is different, there appear to be many similarities in the conditions of occurrence and problems with issue #885. It may be the same case. (Confirmed that advanced 2 and restrict 2 options have nothing to do with reproducing the problem) |
@irreg If Python > 3.6, try it with Pyarmor 8. |
I have tried it on Pyarmor 8.4.3. I can reproduce the problem by executing the encrypted file with the following command.
I could not reproduce the problem when using the new command as shown below.
However, the new command does not work with the Restrict Mode 100+ we were using. Because of this, we are seriously looking for a replacement for this feature. |
We further investigated the conditions under which the problem occurs in version 7 and found that the problem can be reproduced with any content of the mod.f function as long as the bytecode before encryption is 16, 48, 64, 192 or 448 ... instructions. Therefore, it will also occur in the following cases mod.py (48 Instructions)def f(a, c):
if None is not None:
x = None
x = None
x = None
x = None
x = None
x = None
x = None
x = None
x = None
x = None
x = None
x = None
x = None
x = None
x = None
x = None
x = None
x = None
x = None
x = None
x = None |
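To see where an instruction count like this comes from on a given interpreter, the function can be disassembled with the standard `dis` module. This is only a sketch: `opcode_histogram` and `SOURCE` are illustrative names, and the total depends on the Python version doing the compiling, so no expected output is shown. On Python 3.11 and later, `dis.get_instructions` hides inline cache entries by default, so its total can differ from `len(co_code) // 2`.

```python
import dis
from collections import Counter

# Same shape as the mod.py above: one if statement with 21 assignments inside.
SOURCE = (
    "def f(a, c):\n"
    "    if None is not None:\n"
    + "        x = None\n" * 21
)


def opcode_histogram(source, func_name="f"):
    """Compile source, then count how often each opcode appears in func_name."""
    namespace = {}
    exec(compile(source, "<mod>", "exec"), namespace)
    instructions = dis.get_instructions(namespace[func_name])
    return Counter(ins.opname for ins in instructions)


if __name__ == "__main__":
    hist = opcode_histogram(SOURCE)
    for opname, count in sorted(hist.items()):
        print(f"{opname:24} {count}")
    print("total:", sum(hist.values()))
```

Comparing the printed table between two Python versions shows which opcodes account for a difference in the total.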
How about only obfuscate First obfuscate all the scripts with normal options, save to
Thanks for your efforts, but Pyarmor 8 still has open tasks (pyarmor.man, refining RFT mode, enhancing BCC mode); Pyarmor 7 bugs come after those. The preferred solution is to upgrade to Pyarmor 8+. This issue may be treated as a known issue for Pyarmor 7. |
Thanks for the replies and suggestions. In this way, it seems most likely to move to the version 8 license, although there are some challenges in the transition. |
@irreg, Hi. Have you managed to figure out a way to detect builds that will (or are likely to) crash? Also, if you have any new details to share about this bug, I'd be happy to know, as I'm having the same issue. |
@Thoufak Compared to the above, it is still more practical to check if a function meets the conditions for a problem to occur.
import target  # Modules you want to check
def check_callable(name, obj):
    if callable(obj) and hasattr(obj, "__code__"):
        inst_len = len(obj.__code__.co_code) // 2
        if inst_len in (16, 48, 64, 192, 448):
            print(f"found: {name}")
            print(inst_len)
def check_class(obj):
    if isinstance(obj, type):
        for inner_name, inner_obj in obj.__dict__.items():
            check_class(inner_obj)
            check_callable(inner_name, inner_obj)
for name, obj in target.__dict__.items():
    check_class(obj)
    check_callable(name, obj)
|
@irreg, thank you very much for the response. I don't think I will switch to pyarmor 8 any time soon. So, it seems that I could write a tool that checks the instruction count across my whole codebase and, if it finds a function with an unwanted count, appends one meaningless instruction. Quite tedious, but I'm glad there's finally at least some hope for this issue to be gone (I've been living with it for over a year now). By the way, is this a complete list, or may there be more values to be discovered? if inst_len in (16, 48, 64, 192, 448): |
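A codebase-wide check along these lines could compile each source file and walk the nested code objects instead of importing the modules. The sketch below is one possible approach, not a tested tool: `iter_code_objects`, `find_suspects` and `SUSPECT_COUNTS` are hypothetical names, the set holds only the values reported in this thread and may be incomplete, and the scan has to run under the same Python version that will be obfuscated because counts differ between versions. It also reports the module-level code object (`<module>`), which may or may not matter.

```python
import sys
from pathlib import Path
from types import CodeType

# Values reported in this thread; assumed to be incomplete.
SUSPECT_COUNTS = frozenset({16, 48, 64, 192, 448})


def iter_code_objects(code):
    """Yield code itself and every code object nested in its constants."""
    yield code
    for const in code.co_consts:
        if isinstance(const, CodeType):
            yield from iter_code_objects(const)


def find_suspects(source, filename="<string>", counts=SUSPECT_COUNTS):
    """Return (name, first line, count) for code objects whose size is in counts."""
    hits = []
    for code in iter_code_objects(compile(source, filename, "exec")):
        n = len(code.co_code) // 2  # two bytes per instruction, as in the check above
        if n in counts:
            hits.append((code.co_name, code.co_firstlineno, n))
    return hits


if __name__ == "__main__":
    # Usage: pass one or more directories to scan recursively.
    for root in sys.argv[1:]:
        for path in sorted(Path(root).rglob("*.py")):
            text = path.read_text(encoding="utf-8")
            for name, line, n in find_suspects(text, str(path)):
                print(f"{path}:{line}: {name} has {n} instructions")
```

Walking `co_consts` also covers nested functions, lambdas and comprehensions, which a scan of module and class `__dict__` entries would not reach.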
@Thoufak
search.bat
python -m venv .venv
call .venv\Scripts\activate.bat
pip install pyarmor==*.*.*
rem echo Change the search range if necessary
for /L %%i in (3,1,512) do (
    python gen_mod.py %%i
    python duplicate.py
    cd test
    rd dist /s /q
    pyarmor obfuscate run.py --recursive
    cd dist
    python run.py > ..\..\%%i_result.txt
    cd ../../
)
pause
gen_mod.py
import sys
i = int(sys.argv[1])
with open("mod.py", "w") as f:
    f.write("def f(a, c):\n")
    if i == 130:
        extend = True
        i -= 2
    else:
        if i > 130:
            i -= 1
        extend = False
    if i <= 3:
        f.write("    pass\n")
        sys.exit()
    i -= 2  # Correct the number of instructions due to return statement
    if i < 6:
        if i > 3:
            f.write("    x = None\n")
            i -= 2
        if i == 3:
            f.write("    list()\n")
        else:
            f.write("    x = None\n")
        sys.exit()
    f.write("    if None is not None:\n")
    i -= 4
    while i > 0:
        if i == 3:
            f.write("        list()\n")
            break
        f.write("        x = None\n")
        i -= 2
    if extend:
        f.write("    x = None\n")
run.py, main.py, sub.py ver.2, duplicate.py in the above example are required for execution |
I just ran a test with
It broke with this exception:
Is it the same as your case? |
Same error contents. other than 16 instructions
16 instructions
|
Got it, thanks. |
I spent almost 2 days on this issue and still have not found the cause. But if using the python option
Without With And when I try to test plain script with Python 3.9
It seems to be related to circular imports, and there is no problem with Python 3.10.
I'm putting it aside for now until I have a new idea; maybe it only fails in Python 3.8. |
In Python 3.10, the bytecode seems to have changed slightly from previous versions.
mod.py (16 Instructions for python 3.10 or later)
def f(a, c):
    if None is not None:
        x = None
        x = None
        x = None
        x = None
gen_mod.py (for python3.10 or later)
import sys
i = int(sys.argv[1])
with open("mod.py", "w") as f:
    f.write("def f(a, c):\n")
    if i == 258:
        extend = True
        i -= 2
    else:
        if i > 258:
            i -= 1
        extend = False
    if i <= 3:
        f.write("    pass\n")
        sys.exit()
    i -= 2  # Correct the number of instructions due to return statement
    if i < 8:
        if i > 5:
            f.write("    x = None\n")
            i -= 2
        if i > 3:
            f.write("    x = None\n")
            i -= 2
        if i == 3:
            f.write("    list()\n")
        else:
            f.write("    x = None\n")
        sys.exit()
    f.write("    if None is not None:\n")
    i -= 6
    while i > 0:
        if i == 3:
            f.write("        list()\n")
            break
        f.write("        x = None\n")
        i -= 2
    if extend:
        f.write("    x = None\n")
        f.write("    x = None\n")
|
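Because the generator depends on version-specific bytecode, it may be worth confirming that a generated mod.py really has the requested size on the interpreter in use. A small sketch, assuming the same two-bytes-per-instruction measure used earlier in this thread (it overcounts on Python 3.11 and later, where `co_code` includes cache entries); `count_in_source` and the script name `verify_mod.py` are hypothetical.

```python
import sys
from types import CodeType


def count_in_source(source, func_name="f"):
    """Compile source and return len(co_code) // 2 for the named top-level function."""
    module_code = compile(source, "<generated>", "exec")
    for const in module_code.co_consts:
        if isinstance(const, CodeType) and const.co_name == func_name:
            return len(const.co_code) // 2
    raise LookupError(f"no top-level function named {func_name!r}")


if __name__ == "__main__" and len(sys.argv) == 3:
    # Hypothetical usage: python verify_mod.py mod.py 16
    path, expected = sys.argv[1], int(sys.argv[2])
    with open(path, encoding="utf-8") as fh:
        actual = count_in_source(fh.read())
    print(f"{path}: expected {expected}, got {actual}")
```

A mismatch between the two numbers would suggest that the generator's corrections do not fit the interpreter being tested.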
Encryption may generate functions that cause Python to crash with a memory access violation (0xc0000005) after 1024 or 2048 calls.
The exe is generated from source code containing about 5000 functions, and this has occurred 4 times in about 200 encryptions. (In other words, a problematic function is generated with low probability, and if we generate it again with the same py file, it will not occur.)
The function that causes the problem appears to change randomly each time it is generated.
The command at the time of generation is as follows
The above command generates a single exe file, but the problem also occurs when the exe file is disassembled back into a .py file and executed
I tried to find out as much as possible about one of the files where the problem occurred.
It seemed that the error occurred at the moment of calling a particular encrypted function from a particular encrypted function in another file. I replaced one of the two files with an unencrypted one and the problem no longer occurs.
However, I have not been able to get any detailed results because of the encryption.
Is there any possible known or resolved cause?
I will add more if I find out anything else.
OS: Windows 10 x64