The intrinsic function _rdtsc() doesn't serialize the processor, so you'll get even more 'unstable' readings from the timestamp counter, because the read can be reordered relative to the instructions around it. It is prudent to write your own serialized version.
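A minimal sketch of such a serialized read, assuming GCC or Clang on x86 and using CPUID as the serializing instruction. The helper name `read_tsc_serialized` is hypothetical, not something from the original post:

```c
#include <stdint.h>

// Hypothetical helper: issue CPUID (a serializing instruction) right
// before RDTSC so earlier instructions have completed before the
// counter is read.
static inline uint64_t read_tsc_serialized(void)
{
  uint32_t lo = 0;   // in: CPUID leaf 0; out: low half of the TSC
  uint32_t hi;       // out: high half of the TSC
  uint32_t b;        // EBX is overwritten by CPUID
  uint32_t c = 0;    // ECX is overwritten by CPUID

  __asm__ __volatile__ (
    "cpuid\n\t"
    "rdtsc"
    : "+a" (lo), "=d" (hi), "=b" (b), "+c" (c)
    :
    : "memory"
  );
  (void)b;
  (void)c;

  return ((uint64_t)hi << 32) | lo;
}
```

CPUID is slow, so the measured interval includes its cost; calling the helper twice back to back and subtracting gives a rough estimate of that overhead.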
In newer processors (those with the Enhanced REP MOVSB/STOSB feature, which as far as I know means Ivy Bridge or later), a single REP STOSB is faster than the combination of REP STOSD and REP STOSB, and often even faster than using SIMD. So your bzero() routine can be a single macro.
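A sketch of what such a macro could look like, assuming GCC or Clang on x86 and a clear direction flag (which the usual ABIs guarantee on function entry). The name `BZERO` is hypothetical:

```c
#include <stddef.h>

// Hypothetical macro: zero `size` bytes starting at `ptr` with a
// single REP STOSB. The arguments are copied into locals because the
// instruction modifies both the destination and count registers.
#define BZERO(ptr, size)                 \
  do {                                   \
    void *bz_p_ = (ptr);                 \
    size_t bz_n_ = (size);               \
    __asm__ __volatile__ (               \
      "rep; stosb"                       \
      : "+D" (bz_p_), "+c" (bz_n_)       \
      : "a" (0)                          \
      : "memory"                         \
    );                                   \
  } while (0)
```

The "memory" clobber tells the compiler that the asm writes to memory, so it does not keep stale copies of the buffer in registers.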
If you like, this is my implementation based on your bzero approach:
#include <stddef.h>

// This is the exported symbol for our function.
void (*_bzero)(void *, size_t);

static void enhanced_bzero(void *ptr, size_t size)
{
  // A single REP STOSB; AL is zeroed through the "a" input operand.
  __asm__ __volatile__ (
    "rep; stosb"
    : "+D" (ptr), "+c" (size) : "a" (0) : "memory"
  );
}

static void my_bzero(void *ptr, size_t size)
{
  size_t n = size >> 2;   // Number of whole dwords.

  // Store as many dwords as possible.
  __asm__ __volatile__ (
    "rep; stosl"
    : "+D" (ptr), "+c" (n) : "a" (0) : "memory"
  );

  // Store the remaining (maximum 3) bytes.
  n = size & 3;
  __asm__ __volatile__ (
    "rep; stosb"
    : "+D" (ptr), "+c" (n) : "a" (0) : "memory"
  );
}

// This will be called only on program initialization, nowhere else.
__attribute__((constructor))
static void bzero_init(void)
{
  // CPUID overwrites EAX, EBX, ECX and EDX, so all four are outputs.
  // Note: this assumes the CPU supports CPUID leaf 7.
  unsigned int a = 7, b, c = 0, d;

  // The CPU has the REP MOVSB/STOSB enhancement?
  __asm__ __volatile__ (
    "cpuid"
    : "+a" (a), "=b" (b), "+c" (c), "=d" (d)
  );

  if (b & (1 << 9))
    _bzero = enhanced_bzero;
  else
    _bzero = my_bzero;
}
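The feature test can also be written without inline assembly. This is a sketch assuming a compiler that ships `<cpuid.h>` with `__get_cpuid_count` (GCC 7 or later, recent Clang), which returns 0 when the requested leaf is not supported. The function name `has_erms` is hypothetical:

```c
#include <cpuid.h>

// Hypothetical helper: returns 1 if the CPU reports Enhanced REP
// MOVSB/STOSB (CPUID leaf 7, subleaf 0, EBX bit 9), otherwise 0.
static int has_erms(void)
{
  unsigned int eax, ebx, ecx, edx;

  // __get_cpuid_count() checks the maximum supported leaf first and
  // returns 0 if leaf 7 is unavailable.
  if (!__get_cpuid_count(7, 0, &eax, &ebx, &ecx, &edx))
    return 0;

  return (ebx >> 9) & 1;
}
```

Checking the maximum leaf matters on older CPUs, where querying an unsupported leaf returns data from a different leaf and bit 9 could be set by accident.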
[]s
Fred