pp_ref() builtin_pp_reftype(): strlen()+Newx()+memcpy()->100% pre-made COWs #23391
…e COWs

- The ref() PP keyword has extremely high usage. Grepping my blead repo shows:
  Searched "ref(" 4347 hits in 605 files of 5879 searched
- The strings the ref() keyword returns are part of the Perl 5 BNF grammar. This is not up for debate. Changing their spelling or lowercasing them is not up for debate. Neither is i18n-ing them dynamically in real time against glibc.so's current OS process global locale, or wiring inotify/kqueue into the runloop to monitor /etc or /var so this race condition works as designed in a unit test:
  $perl -E "dire('hello')"
  Routine indéfinie &cœur::dire aufgerufen bei -e Zeile 1
- sv_reftype() and sv_ref() have very badly designed prototypes. The first time a new Perl-in-C dev reads their source code, they will think these 2 will cause infinite C stack recursion and a SEGV. Most automated C code analysis tools will probably complain that these 2 functions do infinite recursion too.
- The 2 functions don't return a string length, forcing all callers to execute a libc strlen() call on a string that could be 8 bytes or 80 MB.
- The 2 functions don't split, parse, cat, or glue multiple strings to create their output. All the null-terminated strings they return are already sitting in virtual address space: either const HW RO, or RCed HEK*s from the PL_strtab pool that were found inside something similar to a GV*/HV*/HE*/CV*/AV*/GP*/OP*/SV* in an OP* (no threads).
- COW 255 buffers from Newx() under 9 chars currently can't COW, by policy. CODE is 4, SCALAR is 6, HASH is 4, ARRAY is 5. But very short SV HEK* COWs will COW-propagate without problems.
- PP code "if(ref($self) eq 'HASH') {}" should never involve all 3-4 calls Newx()/Realloc()/strlen()/memcpy().
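A minimal standalone C sketch of the idea in the bullets above, not the patch itself: when every possible return value already exists as a constant, the lookup can hand back a shared, length-carrying entry instead of a fresh copy. The names type_name, ref_kind, reftype_name() and same_type_name() are hypothetical and are not Perl API; a real HEK carries more than a length and the bytes.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a HEK: the length travels with the bytes,
 * so no caller ever needs strlen(). Not Perl's real struct. */
typedef struct {
    size_t      len;
    const char *key;
} type_name;

/* Hypothetical subset of reference kinds, for illustration only. */
typedef enum { KIND_SCALAR, KIND_ARRAY, KIND_HASH, KIND_CODE, KIND_COUNT } ref_kind;

#define NAME(s) { sizeof(s) - 1, s }

/* One pre-made entry per kind, built at compile time and never copied. */
static const type_name type_names[KIND_COUNT] = {
    [KIND_SCALAR] = NAME("SCALAR"),
    [KIND_ARRAY]  = NAME("ARRAY"),
    [KIND_HASH]   = NAME("HASH"),
    [KIND_CODE]   = NAME("CODE"),
};

/* Returns the shared entry: no allocation, no strlen(), no memcpy(). */
static const type_name *reftype_name(ref_kind kind)
{
    assert((unsigned)kind < KIND_COUNT);
    return &type_names[kind];
}

/* Two lookups of the same kind yield the same address, so an identity
 * test is enough to prove the names are equal. */
static int same_type_name(const type_name *a, const type_name *b)
{
    return a == b;
}
```

Calling reftype_name(KIND_HASH) twice returns the same address both times, which is what makes a pointer comparison a valid replacement for a byte-wise one in this model.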
So this commit fixes all of this and makes pp_ref()/PP KW ref() closer in speed to C/C++/Asm-style object type checking, which is almost always going to be 1, 2, or 3 pointer equality tests against a C constant &sum_vtbl_sum_class, or, in Microsoft ecosystem software, an equality test of a 16-byte GUID in memory against a 16-byte SSE literal stored in an SSE opcode (TL;DR version). Just convert the backends sv_ref()/sv_reftype() to HEK* retvals, and convert the front-end pp_*() ops to fetch HEK*s and return SV*s with POK_on and SvPVX() == HEK*. In all likelihood, when the PP code is "if (ref($self) eq 'HASH') {}", then during the execution of memcmp(pv1, pv2, len) as part of the eq operator (pp_seq), pv1 and pv2 are the same memory address. But I haven't single-stepped the eq operator to verify that yet.

- Inside PP(pp_reftype), the branch sv_setsv(TARG, &PL_sv_undef); previously did not fire SMG. After this commit it does. IDK why it wasn't firing before, or what the consequences are of SMG firing now on the sv_set_undef(rsv); path.
- I suspect "sv_setsv(TARG, &PL_sv_undef);" and "sv_set_undef(rsv);" are not perfect behavior copies of each other in extreme/bizarre/user-error and bad CPAN XS code situations, but I haven't found any side effects of the switch from sv_setsv(TARG, &PL_sv_undef); to sv_set_undef(rsv). Untested hypothetical cases like:
  sv_setsv(gv_star, &PL_sv_undef);
  sv_setsv(hv_star, &PL_sv_undef);
  sv_setsv(svt_regexp_star, &PL_sv_undef);
  sv_setsv(svt_invlist_star, &PL_sv_undef);
  sv_setsv(svt_object_star, &PL_sv_undef);
  sv_setsv(svt_io_star, &PL_sv_undef);
- sv_sethek() has a severe pathological performance problem if args SV* dsv and HEK* src_hek test true for if(SvPVX(dsv) == HEK_KEY(src_hek)) {}. But it's still better than a strlen()/Newx()/memcpy()/push_save_stack()/delayed_Safefree(); cycle. Any fix for this would be for the future.
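The pointer-identity argument above can be sketched in plain C. This is an illustration under assumptions, not Perl's actual comparison code: str_view, str_eq() and the slow_compares counter are hypothetical names, and the counter exists only to make the shortcut observable.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical length-carrying string view; not Perl's SV or HEK. */
typedef struct {
    const char *pv;
    size_t      len;
} str_view;

/* Counts how often the byte-wise path runs, to make the shortcut visible. */
static unsigned long slow_compares;

static int str_eq(str_view a, str_view b)
{
    if (a.len != b.len)
        return 0;               /* different lengths can never be equal */
    if (a.pv == b.pv)
        return 1;               /* same buffer: equal without reading it */
    slow_compares++;
    return memcmp(a.pv, b.pv, a.len) == 0;
}
```

If both operands point at the same shared buffer, the comparison finishes after two integer tests and never touches the string bytes; only operands that live in different buffers pay for memcmp().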
- These 2 functions are experimental for now, hence undocumented and not public API. If they are made public, the arg "const int ob" should be removed because of its confusing faux-infinite recursion (it is not real-life infinite recursion). The functions are exported so P5P hackers and CPAN XS devs (unsanctioned by P5P) can benchmark and research these 2 new functions using Inline::C/EU::PXS.
- Future improvement not done here: make sv_reftype() and sv_ref() wrappers around their HEK* counterparts. Note the HEK* must be RC++ed and stuffed in a new SV*, or a PAD TARG SV*, before the rpp_replace_1_1_NN(TARG); call, because in artificial situations/fuzzing, strange things can happen during a SvREFCNT_dec_NN(); call, and the HEK* sitting in a C auto might get freed during the SvREFCNT_dec_NN();
- Another improvement: sv_sethek(rsv, hek); is somewhat heavy and doesn't have a shortcut to RC-- an existing SVPV HEK* COW itself. Instead it uses SV_THINKFIRST_***() and sv_force_normal***() to RC-- an existing SVPV HEK* COW. If the SV* PAD TARG is being used over and over by the ref() opcode, it's always going to have a stale HEK* SvPVX() that needs to be RC--ed.
- Another improvement: check if(sv_reftypehek() == SvPVX(targ)) before calling sv_sethek(rsv, hek);
- Another improvement, beyond scope for me: make into 1 OP*/opcode: if(ref($self) eq 'HASH') and if(ref($self) eq 'ARRAY')
- Another improvement: don't deref my_perl->Iop/PL_ptr many times in a row. I didn't do any CPU opcode/instruction stripping in this commit. That's for a future commit.
- Another improvement: investigate if most of the large switch() inside Perl_sv_reftypehek() can be turned into a const I8 arr_of_PL_sv_consts_idxs[]; with a couple of tiny special cases.
- Todo: invert the "if (!rsv) {" branch so the hot path (yes, cached in PL_sv_consts) comes first in machine code/asm order.
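The ordering rule in the list above (take your own reference to the shared string before releasing whatever might be keeping it alive) and the "already holds this string" shortcut can be modelled in a few lines of standalone C. Everything here is hypothetical: shared_str, owner, target and their functions are illustrative stand-ins, not Perl's HEK, SV or PAD TARG, and the real refcounting paths are more involved.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical refcounted shared string, loosely modelled on the idea of a
 * HEK held in a string table. Not Perl's real layout or API. */
typedef struct {
    unsigned refcnt;
    size_t   len;
    char     key[16];
} shared_str;

/* Hypothetical object that owns one reference to its type name. */
typedef struct {
    unsigned    refcnt;
    shared_str *name;
} owner;

/* Hypothetical reusable result slot, standing in for a PAD TARG. */
typedef struct {
    shared_str *str;
} target;

static shared_str *shared_new(const char *s)
{
    shared_str *h = malloc(sizeof *h);
    if (!h)
        abort();
    h->refcnt = 1;
    h->len = strlen(s);
    assert(h->len < sizeof h->key);
    memcpy(h->key, s, h->len + 1);
    return h;
}

static void shared_release(shared_str *h)
{
    if (h && --h->refcnt == 0)
        free(h);
}

static owner *owner_new(const char *type_name)
{
    owner *o = malloc(sizeof *o);
    if (!o)
        abort();
    o->refcnt = 1;
    o->name = shared_new(type_name);
    return o;
}

static void owner_release(owner *o)
{
    if (--o->refcnt == 0) {
        shared_release(o->name);    /* drops the owner's reference */
        free(o);
    }
}

/* Returns 1 if the slot changed, 0 if it already held h (the shortcut). */
static int target_set(target *t, shared_str *h)
{
    if (t->str == h)
        return 0;
    h->refcnt++;                    /* own the new string first */
    shared_release(t->str);         /* then drop the stale one */
    t->str = h;
    return 1;
}

/* Stores o's type name in t, then consumes the caller's reference to o. */
static void replace_with_type_name(target *t, owner *o)
{
    shared_str *h = o->name;        /* borrowed pointer held in a C local */
    target_set(t, h);               /* take our own reference BEFORE ... */
    owner_release(o);               /* ... this call, which may free o */
}
```

Swapping the last two calls in replace_with_type_name() would leave h dangling whenever the owner held the only reference, which is the hazard the note about the C auto describes.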
On Sun, Jun 29, 2025 at 03:56:07PM -0700, bulk88 wrote:
-ref() PP keyword has extremely high usage. Greping my blead repo shows:
[ snip 100 further lines]
Please try to use meaningful commit summary lines and messages.
I tried to read the commit message. I had no idea what the commit was
about, apart from something to do with a badly designed sv_ref/sv_reftype
API perhaps? Looking at the actual diff I *guess* the commit is about
adding two new functions, sv_refhek and sv_reftypehek and then making use
of them to speed up pp_ref() etc.? And perhaps adding some new SV
constants? Who knows?
The commit also seems to have snuck in an unrelated change to pp_const().
--
31 Dec 1661: "I have newly taken a solemne oath about abstaining from plays".
1 Jan 1662: "And after ... we went by coach to the play".
-- The Diary of Samuel Pepys
All tech decisions are documented with rationale. Read them bullet point by bullet point. If I am the only Subject Matter Expert who knows the Perl VM C code, I can't really help out a React JSX SME or Go SME guru who tries to review the Perl C VM code. At that point I would have to offer a 6 hour pre-conference class at a TPRC or YAPCEU event on P5 VM C level design/optimization/O(n) complexity of interp internals to my students. Not a joke.
Correct. I didn't invent …

Current Utf8 isn't original to P5, but those 2 can't return a yes/no utf8 flag either. Also the backing storage and lifetime of those …

Returning HEKs always with the 2 new fns fixes pretty much every design problem I can think of. Returning new SV heads with RC=1, or new SV heads with RC=1+mortal, or accepting an in …

Also I decided returning the global/permanent SV*s back to callers is a bad idea, I would have to mark the …
Now what? Line 2 fatal errored. But if it's a SVPV holding a HEK, it is silently decowed on line 2 without problems. That's why the new API returns HEKs and doesn't use SV APIs.

The SpiderMonkey JS engine's src code's initial commit is 1 year, or max 2 years, after Perl 5's initial commit. So the SM JS engine and the Perl 5 engine are the same exact age. Since Netscape's/Mozilla's/Firefox's JS engine is very well used, tried, true, and tested for decades, borrowing design choices from it cannot be a bad idea. Perl's …

The rest of this is FF JS VM vs P5 VM management of CC/link time constants and how they appear at a C runtime level and at an ECMAScript/PP level. Spidermonkey calls them "Atoms", Perl calls them "HEK *"s or "U32 hash"s. Spidermonkey uses words like "Pinned" and "JSExternalString" to mean Perl's …
https://github.com/ricardoquesada/Spidermonkey/blob/master/js/src/vm/String.h

Here is a list of what Spidermonkey says are critical "" string/token/identifier literals that are required to run the JS engine.
https://github.com/ricardoquesada/Spidermonkey/blob/master/js/src/vm/CommonPropertyNames.h
https://github.com/ricardoquesada/Spidermonkey/blob/master/js/src/vm/Keywords.h
https://github.com/ricardoquesada/Spidermonkey/blob/master/js/src/jsatom.cpp#L56

Spidermonkey Immortals
https://github.com/ricardoquesada/Spidermonkey/blob/master/js/src/vm/Id.cpp

Spidermonkey has C global RW HEK*s structs baked into the engine (libperl.so or libspidermonkey.so) at CC time.
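For readers unfamiliar with the "Atoms" / shared-string idea being compared here, a minimal sketch of string interning in standalone C follows. It is neither SpiderMonkey's nor Perl's implementation: intern(), hash_bytes(), the fixed 64-slot table and the FNV-1a hash are all assumptions picked for brevity.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define INTERN_SLOTS 64   /* hypothetical fixed capacity, a power of two */

static char *intern_table[INTERN_SLOTS];

/* FNV-1a, chosen only because it is short. */
static unsigned long hash_bytes(const char *s, size_t len)
{
    unsigned long h = 2166136261UL;
    for (size_t i = 0; i < len; i++) {
        h ^= (unsigned char)s[i];
        h *= 16777619UL;
    }
    return h;
}

/* Returns the one canonical copy of s, creating it on first use.
 * Returns NULL if the table is full or memory runs out. */
static const char *intern(const char *s)
{
    size_t len = strlen(s);
    size_t start = hash_bytes(s, len) & (INTERN_SLOTS - 1);

    for (size_t probe = 0; probe < INTERN_SLOTS; probe++) {
        size_t i = (start + probe) & (INTERN_SLOTS - 1);
        if (!intern_table[i]) {
            char *copy = malloc(len + 1);
            if (!copy)
                return NULL;
            memcpy(copy, s, len + 1);
            intern_table[i] = copy;     /* first sighting: becomes canonical */
            return copy;
        }
        if (strcmp(intern_table[i], s) == 0)
            return intern_table[i];     /* seen before: reuse the same copy */
    }
    return NULL;
}
```

Once two strings have gone through intern(), comparing the returned pointers is equivalent to comparing their contents, which is the property both engines rely on for identifiers and other fixed names.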
Notice Spidermonkey has 1 byte long (latin 1) Immortal …

Currently in Perl, splitting a … 8+24+16+16=64 bytes, detailed math: 8 SV* in AV* + 24 SV head + 16 XPV body + 8 OS malloc header + 16 min buf alloc rule of newSVpvn = 72 bytes

offtopic: stolen buzzword/tech word from Perl VM lol
https://github.com/ricardoquesada/Spidermonkey/blob/master/js/src/vm/SelfHosting.cpp#L247

Note how SM burns in/attaches/binds …
https://github.com/ricardoquesada/Spidermonkey/blob/master/js/src/vm/SelfHosting.cpp#L793

SM's analog of …

Here is your (davem's) short string experiment perl branch, as production code in SM
https://github.com/ricardoquesada/Spidermonkey/blob/master/js/src/vm/String-inl.h#L46

I think machine integers 0-99 can be converted to …
https://github.com/ricardoquesada/Spidermonkey/blob/master/js/src/vm/String.h#L1043

This is for another ticket, but SM decided on > 1/4th unused space, or the 75% mark, to do a …
https://github.com/ricardoquesada/Spidermonkey/blob/master/js/src/vm/StringBuffer.cpp#L30

offtopic: the JS stack is internally the OS's C stack with some tiny Asm tricks, generic RISC and stack-grows-up HPUX PARISC compliant
https://github.com/ricardoquesada/Spidermonkey/blob/master/js/src/jsnativestack.cpp
It is unrelated, but it is a tiny, meaningless change, not worth a PR on its own and then a 2-line commit in the P5P repo. It makes no machine code difference at -O1/-O2 before and after. But I can now set a BP on that line and see what is inside the SV* struct. If someone doesn't like the change, it means they don't know what a C debugger is, or how to use one, and can't call themselves a professional C dev if the only C level diag tool they know how to use is …
Let me summarise the above: "I am very smart."