Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Remove glibc dependency on and misuse of strerror_r(3) #8379

Closed
altimeter-130ft opened this issue Jan 15, 2024 · 5 comments
Closed

Remove glibc dependency on and misuse of strerror_r(3) #8379

altimeter-130ft opened this issue Jan 15, 2024 · 5 comments

Comments

@altimeter-130ft
Copy link

altimeter-130ft commented Jan 15, 2024

Description

Case A
src/flb_network.c uses strerror_r(3) in the way depending on glibc, which breaks the build on the non-glibc platform.

Case B
Some of the strerror_r(3) calls that compile but may give unexpected results because of misuses.

Environment

Case A
Any platform without glibc.
Confirmed on FreeBSD 14.0-RELEASE.

Case B
Any platform with glibc. (Virtually all Linux distros.)

Reproduction Steps

Case A
Build Fluent Bit on the platform without glibc, eg FreeBSD.

Case B
Call strerror_r(3) with a valid errno(2) as errnum.
(NB not covered by any existing tests)

Expected Behaviour

Case A
The build succeeds.

Case B
strerror_r(3) fills in the supplied buffer.

Observed Behaviour

Case A
The build fails with src/flb_network.c.

The logs hereafter are taken out of master as of f8bd0ce.

tanimura@pkgfactory2:/home/pkgfactory2/tanimura/work/fluentbit-git/fluent-bit/build % make
[  0%] Creating directories for 'backtrace'
[  0%] No download step for 'backtrace'
[  0%] No update step for 'backtrace'

(snip)

[ 37%] Building C object src/CMakeFiles/fluent-bit-static.dir/flb_network.c.o
/home/pkgfactory2/tanimura/work/fluentbit-git/fluent-bit/src/flb_network.c:526:17: error: incompatible integer to pointer conversion assigning to 'char *' from 'int' [-Wint-conversion]
            str = strerror_r(error, so_error_buf, sizeof(so_error_buf));
                ^ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
1 error generated.
*** Error code 1

Stop.
make[2]: stopped in /home/pkgfactory2/tanimura/work/fluentbit-git/fluent-bit/build
*** Error code 1

Stop.
make[1]: stopped in /home/pkgfactory2/tanimura/work/fluentbit-git/fluent-bit/build
*** Error code 1

Stop.
make: stopped in /home/pkgfactory2/tanimura/work/fluentbit-git/fluent-bit/build
tanimura@pkgfactory2:/home/pkgfactory2/tanimura/work/fluentbit-git/fluent-bit/build % 

Case B
It is expected that the undefined content in the supplied buffer is returned as is.

Please refer to the analysis for the detail.

Analysis

Fluent Bit mixes the uses of strerror_r(3) in both the POSIX.1-2001 and glibc ways.

Case A
The use of strerror_r(3) in src/flb_network.c assumes glibc:

tanimura@pkgfactory2:/home/pkgfactory2/tanimura/work/fluentbit-git/fluent-bit/build % grep -Iwr strerror_r ../src/
../src/flb_network.c:            str = strerror_r(error, so_error_buf, sizeof(so_error_buf));
../src/flb_log.c:    strerror_r(errnum, buf, sizeof(buf) - 1);
../src/flb_io.c:                    strerror_r(error, so_error_buf, sizeof(so_error_buf) - 1);
tanimura@pkgfactory2:/home/pkgfactory2/tanimura/work/fluentbit-git/fluent-bit/build % 

The rest of the calls expect POSIX.1-2001 because their return values are not used.

Case B
The glibc implementation of strerror_r(3) returns the pointer to an internal const string without copying its content to the supplied buffer if errnum is valid.

The glibc strerror_r(3) implementation excerpt, taken out of the GNU Hurd glibc repository:

/* Return a string describing the errno code in ERRNUM.  */
char *
__strerror_r (int errnum, char *buf, size_t buflen)
{
  if (__builtin_expect (errnum < 0 || errnum >= _sys_nerr_internal
                        || _sys_errlist_internal[errnum] == NULL, 0))
    {
      /* snip, fill in buf with the message for an undefined errnum. */
      return buf;
    }

  return (char *) _(_sys_errlist_internal[errnum]);
}
weak_alias (__strerror_r, strerror_r)
libc_hidden_def (__strerror_r)

strerror_r(3) Difference Between POSIX.1-2001 and glibc

Aspect POSIX.1-2001 glibc
Return Type int char * (which is actually const)
Return Value Zero or errno(2) The pointer to a string, genrally not equal to the supplied buffer.
Supplied Buffer Always has the string. Untouched when errnum is valid.

Proposed Fix

PR: #8390

Supply the wrapper of strerror_r(3) with the POSIX.1-2001 signature and semantics. This solution gives the expected standard compatibility as well as the simple implementation to the caller. In addition, the wrapper can be implemented in a simple way under both the glibc and non-glibc libraries.

Remark

glibc changes the strerror_r(3) definition to POSIX.1-2001 if (_POSIX_C_SOURCE >= 200112L) && ! _GNU_SOURCE evaluates to true upon the preprocess. I do not recommend that as the fix because the definition of _POSIX_C_SOURCE has the risk of an unwanted side effect in the future.

@altimeter-130ft
Copy link
Author

The callers of strerror_r(3) in src use both the POSIX.1-2001 and glibc styles. This shows the lack of the unified coding style.

The strerror_r(3) symbols on Debian 12 x86-64:

tanimura@berget:~/work/fluentbit-git/fluent-bit/build$ nm src/CMakeFiles/fluent-bit-shared.dir/flb_io.c.o | grep strerror_r
                 U __xpg_strerror_r
tanimura@berget:~/work/fluentbit-git/fluent-bit/build$ nm src/CMakeFiles/fluent-bit-shared.dir/flb_log.c.o | grep strerror_r
                 U __xpg_strerror_r
tanimura@berget:~/work/fluentbit-git/fluent-bit/build$ nm src/CMakeFiles/fluent-bit-shared.dir/flb_network.c.o | grep strerror_r
                 U strerror_r
tanimura@berget:~/work/fluentbit-git/fluent-bit/build$ 

Such the workaround is so confusing to maintain. When a new call to strerror_r(3) has to be added to another source file, one has to check which standard should be followed in it. Also, it is not straightforward that each source follows the distinct standard on its own.

@altimeter-130ft
Copy link
Author

An update out of the CI fix upon #8390:

flb_strerror_r() has to cover strerror_s(3) of C11 as well, used by MSVC. This poses the following semantics change to flb_strerror_r():

  • flb_strerror_r() returns zero (success) when the returned string is truncated because the supplied buffer is too short.

This limitation comes from the boundary check design of C11. Instead of treating the string truncation as a success, it supplies strerrorlen_s() which returns the expected string length. POSIX.1-2001 does not have any equivalent functions, so flb_strerror_r() cannot support that, either.

The guide for the practical use of flb_strerror_r():

  • Prepare the buffer sufficiently long for the use after the call.
  • Accept that the returned string may be truncated.

@altimeter-130ft
Copy link
Author

Open issue:

  • An invalid errnum value.
    • This may happen when the kernel returns an errno not supported by the userland.
    • strerror_r(3) returns a valid string like Unknown error xx.
    • strerror_s(3) returns EINVAL.
    • Solution candidate:
      • Always behave in the strerror_r(3) style.
        • Under strerror_s(3), flb_strerror_r() generates the string on its own.

Copy link
Contributor

This issue is stale because it has been open 90 days with no activity. Remove stale label or comment or this will be closed in 5 days. Maintainers can add the exempt-stale label.

@github-actions github-actions bot added the Stale label Apr 19, 2024
Copy link
Contributor

github-actions bot commented May 1, 2024

This issue was closed because it has been stalled for 5 days with no activity.

@github-actions github-actions bot closed this as not planned Won't fix, can't repro, duplicate, stale May 1, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

1 participant