Perl_sv_clear: unrolled SvREFCNT_dec and sv_free2....isn't that?! #22798
Comments
@iabyn, it appears that much of this section of code entered the codebase back in 2010 in this commit:
Can you take a look? |
I've never liked SvREFCNT_dec's or sv_free2()'s prototype. It's been on my mind since forever that it looks sketchy, and it's putting too much logic and bloat in its one billion caller fns. Why can't it read the refcount field itself? Why do all callers have to push that refcount arg on the reg-stack? Yes, by chance AMD64 register R8 sneaks into the CPU's Win64 prototype and passes through to the callee frame without a dedicated mov op, but that's just one CPU and one OS, and it doesn't apply to Win32 x86. Not going to git dig it, but I think that badly designed 3rd arg is only used for process exiting, yes proc termination, so why do all the callers need that 3rd arg to be inlined into them? |
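To make the prototype question concrete, here is a minimal sketch of the two calling styles. Everything in it (toy_sv, toy_free2_with_rc, toy_free2_self, and so on) is a hypothetical stand-in and not Perl's real code; the first style is only loosely modelled on the shape of the inline decrement, and the rc handling is an assumption made for illustration.

```c
#include <stddef.h>

/* Hypothetical stand-in for an SV head; not Perl's real struct. */
typedef struct toy_sv {
    unsigned refcnt;
    int      freed;   /* set instead of actually releasing memory */
} toy_sv;

/* Style A: every inlined caller loads the count and passes it along. */
static void toy_free2_with_rc(toy_sv *sv, unsigned rc)
{
    if (rc == 1) {            /* last reference: release */
        sv->refcnt = 0;
        sv->freed  = 1;
    }
    /* rc == 0 (already freed) would be diagnosed here in a real API */
}

static void toy_dec_a(toy_sv *sv)
{
    if (sv != NULL) {
        unsigned rc = sv->refcnt;
        if (rc > 1)
            sv->refcnt = rc - 1;
        else
            toy_free2_with_rc(sv, rc);   /* extra argument at each call site */
    }
}

/* Style B: the out-of-line callee re-reads the count itself. */
static void toy_free2_self(toy_sv *sv)
{
    if (sv->refcnt == 1) {
        sv->refcnt = 0;
        sv->freed  = 1;
    }
}

static void toy_dec_b(toy_sv *sv)
{
    if (sv != NULL) {
        if (sv->refcnt > 1)
            sv->refcnt--;
        else
            toy_free2_self(sv);          /* one argument fewer per caller */
    }
}
```

Both styles behave the same in this toy model; the difference is only how much work is pushed into each inlined call site.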
Another thought, perl does
Intel is playing catch up adding 3 part operators to x64 in 2023 https://en.wikipedia.org/wiki/X86#APX_(Advanced_Performance_Extensions) So perl's sv_ref_dec() should really be doing current blead
better not perfect
perfection would be just like MS COM ABI does its ref counting since x86/x64 is CISC and includes a free store to RAM, inside
If sv_free2() needs to ++ it for perl api global memory sanity, oh well, just ++ again; don't burden the one billion callers that inline sv_ref_dec(). The ++ is meaningless for perf if you're about to call the fat on any OS libc You're not saving anything perf-wise by trying to skip the assignment to RAM. The cache line is already sitting in secret registers, and the cache line addr in DDR RAM is already SMP lock-held in the "northbridge". Writes from a core to DDR have already been async for decades. |
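A minimal sketch of the "always store the decrement, then test" style described above. The names (toy_obj, toy_release) are hypothetical; the only thing borrowed from COM is the convention that a release call returns the new count. It assumes the count is at least 1 on entry.

```c
/* Hypothetical refcounted object; not Perl's SV and not a real COM interface. */
typedef struct toy_obj {
    unsigned refcnt;
    int      freed;   /* set instead of actually releasing memory */
} toy_obj;

/* Decrement in memory unconditionally, then branch on the result.
 * Precondition (assumed): o->refcnt >= 1. */
static unsigned toy_release(toy_obj *o)
{
    unsigned rc = --o->refcnt;   /* the store happens on every call */
    if (rc == 0)
        o->freed = 1;            /* slow path: tear the object down */
    return rc;
}
```

If the free path needs a nonzero count for its own bookkeeping, it can increment again there, which keeps that cost out of the inlined callers.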
The interp has an addiction to malloc mem and mortal stack when C autos are appropriate. examples https://github.com/Perl/perl5/blob/blead/malloc.c#L2127 [ Why on earth is this not # ifdef rmved on Win32 when Perl's Unix putenv just calls Perl Win32's P5P controlled backend putenv at https://github.com/Perl/perl5/blob/blead/win32/win32.c#L2415 ? Why does the backend win32 P5P putenv malloc and copy the string before searching it, not even knowing if it will ever use the modified 2nd copy !!! Why does the perl api act like perl's front end p5p putenv() must accept RO stored const strings when we know it's RW memory and trappable perl exceptions can't happen in syscalls? Why does Win32 Perl even bother converting from Microsoft-ese PP Native putenv API (

Perl's malloc addiction is made exponentially worse by khw's semi-recent fixes to serialize and de-race condition Unix and Win32 libc locale API vs interp's locale API vs OS getcwd() and friends. example https://github.com/Perl/perl5/blob/blead/locale.c#L5206 khw's fixes are constantly making new malloc buffers nested as layers of strings process tools/fn calls get applied to get the final correct behavior needed. All these new malloc buffers are obviously added to save stack or mortal, then tossed/libc free() within a dozen microseconds. Note khw's code is implementing Unix On my Win32 blead perl, if I look through a process memory dump of

Another really big perl XS C api design problem is that perl's keywords are correctly POSIX analogs, but over the last 25 years perl has been gaining more and more bug fixes and new features in the PP keywords. Perl's middle layer C code, and XS author facing code/API/func call prototypes, keep adding 1970 Unix C prototype arguments in new no-CPAN not exported functions, and sometimes in the CPAN-approved API. 1970 Unix API mandates 2 horrible API requirements:
#1 all incoming char *s are assumed to be immutable RO C strings owned by the fn's caller.
#2 it's sacrilegious desecration of a holy book to spend a precious 1, 2 or 4 bytes to record a string's length in RAM or spinning rust. Pascal is hellbent on genocide of the unix people, it's a struggle for survival.

Over the last 25 years Perl's middle C layer keeps implementing/adding more and more code using 1970 Unix C strings, instead of moving around SV *s, which SOLVE all those 1970 problems. It's just bad to keep extending this, instead of leaving creation of the Perl C API does need some more thinking though on SV* API for "RO" caller owned SV* arg-mode vs ownership takeover SV* arg-mode. ISO C I've profiled gmake as spending 15% of all CPU usage in libc strlen(), and 24% of all CPU usage in its string hash algorithm loop. gmake's code will never write

I'd advise all P5 core devs to once in a while get a hot cup of coffee, start your C debugger, disable profiling/stepping of libc.so in your C debugger, put a rock on key F11, and watch what perl C code flashes by for a good 3 or 5 minutes while sipping something. And think: are those lines of C code "justified" or not? You never know where that train will take you in perl C core. That's how I write all my misc core PRs. |
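As a small illustration of the string-length point only, here is a sketch of a counted string next to a plain C string. The toy_str type and its helpers are hypothetical and are not part of any Perl API; the sketch just shows that recording the length once lets later comparisons skip both strlen() and, when lengths differ, the byte comparison.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical counted-string view: a pointer plus a recorded length. */
typedef struct toy_str {
    const char *ptr;
    size_t      len;
} toy_str;

/* Pay for strlen() once, at the boundary where a C string comes in. */
static toy_str toy_str_from(const char *s)
{
    toy_str r;
    r.ptr = s;
    r.len = strlen(s);
    return r;
}

/* Later comparisons never rescan for the terminator, and strings of
 * different lengths are rejected without touching their bytes. */
static int toy_str_eq(toy_str a, toy_str b)
{
    return a.len == b.len && memcmp(a.ptr, b.ptr, a.len) == 0;
}
```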
I really want this issue to just be about the code following To achieve changes on those other topics, focussed Issues or Pull Requests (or topics on the mailing list) seem like better vehicles to me.
It's not immediately obvious that this is safe. From looking at |
I think that section of code made sense at the time. Changes to |
FWIW This builds and the test harness passes, but it might not be the best way to update this call site:
|
+1
For me this passed on an unthreaded build on Linux using both Tail of
|
Those compiler errors both indicate that 2 arguments were passed where 3 were expected. Basically, Perl_sv_free2(aTHX_ sv, 0); |
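For readers unfamiliar with why a call can pass on an unthreaded build and fail on a threaded one, here is a sketch of the mechanism. The TOY_pTHX_ and TOY_aTHX_ macros and all toy_ names are hypothetical; they only mimic the shape of Perl's context macros on a threaded build, where the interpreter pointer becomes an extra leading argument.

```c
/* Hypothetical interpreter context; it records calls so the effect is visible. */
typedef struct toy_interp {
    int      calls;
    unsigned last_rc;
} toy_interp;

typedef struct toy_sv {
    unsigned refcnt;
} toy_sv;

/* Threaded-build shape (assumed for this sketch): the context macros
 * expand to an extra parameter and an extra argument, comma included. */
#define TOY_pTHX_ toy_interp *my_interp,
#define TOY_aTHX_ my_interp,

/* Declared with three real parameters once TOY_pTHX_ is expanded. */
static void toy_sv_free2(TOY_pTHX_ toy_sv *sv, unsigned rc)
{
    (void)sv;
    my_interp->calls++;
    my_interp->last_rc = rc;
}

static void toy_caller(TOY_pTHX_ toy_sv *sv)
{
    /* Expands to toy_sv_free2(my_interp, sv, 0): context, SV, refcount.
     * Writing toy_sv_free2(TOY_aTHX_ sv) here would pass only two
     * arguments and fail to compile. */
    toy_sv_free2(TOY_aTHX_ sv, 0);
}
```

On an unthreaded build the equivalent macros would expand to nothing, so the same mistake shows up as a different argument count or not at all, depending on how the call is spelled.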
Description
Towards the bottom of Perl_sv_clear, which is frequently a very hot function, there is the following comment: /* unrolled SvREFCNT_dec and sv_free2 follows: */
Sounds good, makes sense that it would unroll SvREFCNT_dec and sv_free2. But that's not actually what it does:
At least nowadays, sv_free means calling Perl_sv_free:
SvREFCNT_dec is an inline function in sv_inline.h:
So instead of unrolling SvREFCNT_dec and sv_free2, it calls a function (probably inlined) to call SvREFCNT_dec which likely will call sv_free2. That's not unrolled at all!
The likely impact of this is some SV freeing is taking longer than it strictly has to. Looking at gcov coverage when running the test harness, there are 35099459 calls to Perl_sv_free and 797851111 calls to Perl_sv_free2, so maybe ~4% of cleared SVs.
Expected behavior
I haven't sat down to figure this out. It seems like the status quo in Perl_sv_clear must be wrong though and either the comment should be amended, (more likely) the call to sv_free is intended to be a recursive call back to sv_free2, or some other code change should happen.
Besides Perl_sv_clear, only ext/Opcode/Opcode.xs and dist/Storable/Storable.xs seem to call sv_free directly. Probably they should be using SvREFCNT_dec or calling sv_free2 instead.
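The call chain described in this report can be sketched with a toy model. None of this is Perl's source: toy_free, toy_refcnt_dec and toy_free2 are hypothetical stand-ins for sv_free, SvREFCNT_dec and sv_free2, and the counters exist only to make the number of hops visible. The "unrolled" variant is one assumed reading of what the comment promises, not a proposed patch.

```c
#include <stddef.h>

typedef struct toy_sv {
    unsigned refcnt;
    int      freed;   /* set instead of actually releasing memory */
} toy_sv;

/* Call counters, so the chain length can be observed. */
static int calls_free, calls_dec, calls_free2;

static void toy_free2(toy_sv *sv, unsigned rc)
{
    calls_free2++;
    if (rc == 1) {
        sv->refcnt = 0;
        sv->freed  = 1;
    }
}

static void toy_refcnt_dec(toy_sv *sv)
{
    calls_dec++;
    if (sv != NULL) {
        unsigned rc = sv->refcnt;
        if (rc > 1)
            sv->refcnt = rc - 1;
        else
            toy_free2(sv, rc);
    }
}

static void toy_free(toy_sv *sv)
{
    calls_free++;
    toy_refcnt_dec(sv);
}

/* What the report describes: the tail of the clear loop goes through
 * the wrapper, which goes through the decrement, which calls free2. */
static void toy_clear_tail_indirect(toy_sv *sv)
{
    toy_free(sv);
}

/* What "unrolled" would suggest: the same decision made in place. */
static void toy_clear_tail_unrolled(toy_sv *sv)
{
    if (sv == NULL)
        return;
    if (sv->refcnt > 1) {
        sv->refcnt--;
        return;
    }
    sv->refcnt = 0;   /* stand-in for the body of free2, done inline */
    sv->freed  = 1;
}
```

In this model both variants leave the SV in the same state; the indirect one simply takes three function hops to get there.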
Perl configuration
blead