Skip to content

Conversation

bjosv
Copy link
Collaborator

@bjosv bjosv commented Jun 12, 2025

We have found that there are a couple of issues when building with the UndefinedBehaviorSanitizer.
Some checkers in the UBSan arsenal are recoverable and just gives a printout to stderr, i.e. ctest does not fail.
Adding -fno-sanitize-recover=all triggers tests to exit instead, and issues are visible.

While enabling this a couple issues surfaces, like unaligned access warnings in 10 test binaries, and most are fixed by a change in pprintint.h and punaligned.h.

There is an open question about how to solve the issue in monster_test.c, I'm not sure if there is proper fix.
This warning is only seen on macOS and 32bit builds on ubuntu.
Any feedback/ideas?

Weekly test runs OK.

/* The UBSan will warn about storing to misaligned address. */
#if !(defined(__has_feature) && __has_feature(undefined_behavior_sanitizer)) && \
(defined(__i386__) || defined(__x86_64__) || defined(_M_IX86) || defined(_M_X64))
#define __print_unaligned_copy_16(p, q) (*(uint16_t*)(p) = *(uint16_t*)(q))
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would be better to avoid the special handling for the sanitizer here and use memcpy as default instead, but I see this was an active choice.
An alternative would be to add a build options similar to PORTABLE_UNALIGNED_ACCESS to let users choose here too. We would then have to define theses in CI to get the test target to pass which might be suboptimal..

@mikkelfj
Copy link
Contributor

mikkelfj commented Jul 17, 2025

Thanks for digging into this.

I'll need to look a bit closer to better understand the issue.

In general we don't want #if whatever, if can be avoided, except for dedicated portable/* files.

punligned.h is the ultimate reference so clearly this need to work. I'm not sure using sanitizer flags is right here, rather we need to understand the core issue and maybe adapt to the problematic platform using another branch in the file perhaps.

pprintint.h really should be using punaligned, but that might be an unnecessary dependency, but it depends on what we can come up with.

The general solution is to use memcpy as you mention, but it is not always optimized. Once did made a larger effort to ensure all runtime used memcpy, but it was involved, and resulted in some poor performance occasionally, but that was a while back.

I don't have any answers yet, and need to look into it a bit more. I'm also not very familiar with sanitizer thingy.

Another thought:

If we get this problem in monster_test.c, then other users are going to get it in their own code, so we definitely should not fix it in the test unless there is an actual bug.

@mikkelfj
Copy link
Contributor

mikkelfj commented Jul 17, 2025

Wrt. monstertest:

I think the problem may be that we reuse the pointer after changing the underlying content:

vec3 = ns(Vec3_as_root(buffer));
/* Convert buffer to native in place - a nop on native platform. */
v = (ns(Vec3_t) *)vec3;
ns(Vec3_from_pe(v));

I don't recall the exact interface of Vec3_from_pe(v) - whether it needs a typed pointer or not, but we need a new pointer cast from char * or maybe void * before we attempt to access the data.

// declare v_pe earlier ...
v_pe = (ns(Vec3_t) *)vec3;
ns(Vec3_from_pe(v_pe));
v = (nv(Vec3_t) *)(char *)vec3;

I don't have test setup handy atm.

It may also be that the problem is inside the operation: Vec3_from_pe. That is a bit more messy because then we have to dig into the runtime access interface:

https://github.com/dvidelabs/flatcc/blob/master/include/flatcc/flatcc_accessors.h

@mikkelfj
Copy link
Contributor

mikkelfj commented Jul 17, 2025

wrt printint:

I think we should just use memcpy here, it's an isolated case and probably fast anywhere JSON printer performance matters.

EDIT: probably better to use mem_copy_word from pmemaccess.h and then just have one place to deal with.

Maybe we can do the same with punaligned, but I'm a bit concerned about performance.

bjosv added 4 commits July 24, 2025 10:51
Some checks are recoverable and will only print an error log to stderr.
Set build flag -fno-sanitize-recover=all so a testprocess halts with an
error status, triggering ctest runs to fail.

Signed-off-by: Björn Svensson <[email protected]>
@bjosv
Copy link
Collaborator Author

bjosv commented Jul 24, 2025

I have re-done some parts of the PR and investigated a bit more.

I used mem_copy_word in portable/pprintint.h as suggested, and found that the stack allocated buffer used in monster_test.c was unaligned on some targets.
Stack allocated buffers on s390x and -m32 builds are understandable not 16-byte aligned, but I'm not sure why its seen on Apple Silicon..

The last warnings comes from the json_parser optimizations, which I temporarily disabled.
I'll see if I can run some benchmarks to get their benefit.

Another finding is that we probably can get rid of portable/punaligned.h (or most of its content). I'ts only PORTABLE_UNALIGNED_ACCESS that is needed to set FLATCC_ALLOW_UNALIGNED_ACCESS in flatcc_unaligned.h.
The unaligned_xx-functions were only used by the now replaced cmetrohash, unless I have missed something..

@mikkelfj
Copy link
Contributor

mikkelfj commented Jul 25, 2025

som thoughts:

  • maybe we should add mem_copy_word_2 explicitly since we likely won't get around to token pasting optimizations. That way we can better control and fix potential loss of performance.

  • even if we don't use punaligned much, I consider the portable library a goto reference, if not used directly then to copy snippets from, clearly flatcc use case is primary.

  • I don't think we should assume anything about stack alignment at all. That sounds like a bug, and you seem to have fixed it.
    New users often get it wrong with char buf[BUFSIZE] to hold a flatbuffer, which is of course not aligned, but it seems like I also made that mistake. Note that instead of alignas(16) I think it is also possible to alignas(<type>) which might be more appropriate for struct buffers, but alignas(16) is also fine as long as the struct doesn't align more than that. The type version should be more robust if the struct were ever to be modified.

Background:
C11 stdalign (essentially alignas) is used widely in generated code, and since at the time, it was not widely supported, pstdalign provides the abstraction necessary. This should be used in any stack allocatoin that needs to be aligned:
https://github.com/dvidelabs/flatcc/blob/be/include/flatcc/portable/pstdalign.h

  • on json: you mention temporary change wrt.:
    if (n >= 8 && ((uintptr_t)buf % 8) == 0) {

sure, but for production the %8 test would defeat any kind of optimization attempted.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants