Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Make Envoy build with newer versions of LLVM toolchain (e.g., clang-18) #37911

Open
krinkinmu opened this issue Jan 7, 2025 · 7 comments
Open
Assignees
Labels
area/build enhancement Feature requests. Not bugs or questions.

Comments

@krinkinmu
Copy link
Contributor

Title: Allow Envoy builds using newer toolchain versions

Description:

Currently Envoy CI image uses clang-14, I also observed that Envoy builds with clang-15 (though might need to disable some compiler warnings here and there). It would be good to be able to build using newer toolchain versions as well.

Specifically, in Microsoft, we've been a bit limited in the versions of clang available in our internal release pipeline and had to use clang-18/llvm-18. We were able to patch Envoy to build using clang-18 successfully and would like to contribute those changes to the upstream.

I'm filing this bug to track upstreaming of the changes. Most of the changes are to the Envoy dependencies, so it would be good to contribute them to the upstream repositories and just bump versions of the dependencies. Some of the changes are in Envoy build configs and may need a bit more digging to come up with a better fix than the one we have.

Here is the summary of the changes I made:

  1. In V8 disable a couple of compiler warnings: -Wno-deprecated-this-capture and -Wno-deprecated-declarations
  2. In protobuf disable a compiler warning: -Wno-deprecated-declarations
  3. Fix cel-cpp builds due to incomplete types: bump cel-cpp version to include 56d2baec1be480a5186bed5f094526e64025a144 and 18cf5685e93fb8c07c807d913fabe9d1b54db87f - addressed by c7d0d5a
  4. In ippcp/crypto_mb fix/replace the usage of __INLINE macro, basically in C and C++ (I think) all identifiers starting with two underscores (like __INLINE) are reserved for the compiler and standard library implementations; ippcp/crypto_mb library defines and uses __INLINE macro, but, unfortunately, newer versions of libc++ also have __INLINE macro for internal use and the __INLINE macro defined by the toolchain breaks ippcp/crypto_mb builds. I worked around that using this change krinkinmu@673c43d, but a proper fix, I think, is to stop using a reserved identifiers in ippcp code.
  5. In hessian2-codec there seem to be an issue in the Bazel configs, I don't know why this issue does not affect current Envoy builds, but I sent a PR to the hessian2-codec repo to address it: Use def_ref_codec_lib instead of including def_ref_code.{cc|hpp} as sources alibaba/hessian2-codec#37, we should wait until it will get merged and update the version of hessian2-code that Envoy uses
  6. In librdkafka there is a known issue with newer versions of LLVM tools (see librdkafka cannot be built with LLD linker confluentinc/librdkafka#4593), it's possible to work around it locally (see krinkinmu@9376599), but probably we should wait for the upstream fix and update the version we use
  7. colm and ragel dependencies use -lstdc++ flag for builds, for not very clear reasons, in my case it resulted in an attempt to statically link with libstdc++.so - I need to dig a bit more into this. I worked around that using krinkinmu@c8dc3ed, but maybe the problem is that I didn't have static libstdc++ installed on my system. Regardless, it's a bit confusing to have libstdc++ as a flag unconditionally, even for libc++ builds.
  8. Disable a new(?) compiler warning --cxxopt=-Wno-thread-safety-reference-return (I need to double check where this warning causes an issue exactly and maybe we can fix it without disabling the warning)
  9. Make sure that dwp tool from LLVM toolchain is actually used to aggregate debugging information - this might be specific to our setup, but for some reason in my tests the GNU version of dwp tool was used and that version couldn't handle debugging information produced by clang (I suspect that it's because GNU tools don't support fission that well). I worked around the issue by just redirecting dwp symlink to the llvm-dwp, but maybe in Bazel we should just have a CC toolchain configuration that picks the right dwp. This seem to be the most confusing issue from all the problems I had with clang-18 builds - more work is needed to verify if this issue was specific to Microsoft build setup or not and, if it's not specific to Microsoft build pipelines, what the proper fix should look like.

With those changes I was able to build Envoy + Envoy contrib (everything that ci/do_ci.sh release.server_only builds) with clang-18 and libc++.

+cc @phlax and @mathetake

@krinkinmu krinkinmu added enhancement Feature requests. Not bugs or questions. triage Issue requires triage labels Jan 7, 2025
@zuercher zuercher added area/build and removed triage Issue requires triage labels Jan 7, 2025
@krinkinmu
Copy link
Contributor Author

I think the issue with ippcp/crypto_mb was already addressed upstream: intel/cryptography-primitives@ea7cd15#diff-698959b78bd16fcdcd662478d98aa000904c8da4a5574b244fea0dcdcb881258. We just need to update the dependency to [ippcp_2021.12.1](https://github.com/intel/cryptography-primitives/releases/tag/ippcp_2021.12.1) or later.

I'll check if Envoy has any issues with newer version of ippcp and will prepare a PR if it works fine (or will see how to fix things if it does not).

@mathetake
Copy link
Member

thank you for the detailed summary @krinkinmu I think we should prioritize this as clang-14's EOL seems almost there

jmarantz pushed a commit that referenced this issue Jan 13, 2025
Commit Message:

On newer version of Clang those generate a warning and with the settings
we use ultimately result in a compilation failure.

We probably should either disable the warning or just adjust the test.
Looking at the history of the test, it seems that these constants have
never been used, so I prefer to just drop them.

Additional Description:

It's related to #37911, it appears that the compiler warning that causes
this test fail to compile on clang 18 didn't exist in clang 14 that we
are currently using.

Risk Level: Low
Testing: by running changed test
Docs Changes: n/a
Release Notes: n/a
Platform Specific Features: n/a

Signed-off-by: Mikhail Krinkin <[email protected]>
@krinkinmu
Copy link
Contributor Author

/assign @krinkinmu

@krinkinmu
Copy link
Contributor Author

alibaba/hessian2-codec#37 got merged to hessian2 upstream repo, so I'm going to prepare a version bump PR for Envoy.

krinkinmu added a commit to krinkinmu/envoy that referenced this issue Jan 14, 2025
The new version includes alibaba/hessian2-codec#37
that contains a fix for one of the issues listeed in
envoyproxy#37911 (number 5 on the list).

TL;DR: the issue is that in hessian2-codec the same cc file is used as
a source for two different libraries. The code that depends on both of
those libraries at the same time might have an issue because of the same
source used several times.

Signed-off-by: Mikhail Krinkin <[email protected]>
@krinkinmu
Copy link
Contributor Author

PR for the hessian2 dependency version refresh is out for review, once it's reviewed and hopefully merged, that should close item 5 on the list.

@krinkinmu
Copy link
Contributor Author

I poked confluentinc/librdkafka#4593 and related PR about the issue 6 on the list above, it seems that the PR to fix it was out for review since Oct 2024. It looks like it fell through the cracks, so I thought it might be appropriate to remind folks of that PR and the bug.

@phlax
Copy link
Member

phlax commented Jan 14, 2025

cc @adamkotwasinski

RyanTheOptimist pushed a commit that referenced this issue Jan 14, 2025
Commit Message:

The new version includes
alibaba/hessian2-codec#37 that contains a fix
for one of the issues listeed in
#37911 (number 5 on the list).

TL;DR: the issue is that in hessian2-codec the same cc file is used as a
source for two different libraries. The code that depends on both of
those libraries at the same time might have an issue because of the same
source used several times.

Additional Description: Related to #37911

Risk Level: Low

Testing:

Tested that Envoy builds with introduced changes, additionally run unit
tests in `//test/extensions/common/dubbo/...`
`//test/extensions/filters/network/dubbo_proxy/...` and
`//test/extensions/filters/network/generic_proxy/codecs/dubbo/...`.
dubbo is the extension that uses hessian2 codec (the full test suite
will run as part of the PR checks as well).

Docs Changes: n/a
Release Notes: n/a
Platform Specific Features: n/a

---------

Signed-off-by: Mikhail Krinkin <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/build enhancement Feature requests. Not bugs or questions.
Projects
None yet
Development

No branches or pull requests

4 participants