[Bug] SampleAsyncProducer causes core dump #435

Open
1 of 2 tasks
xiaoliu1019 opened this issue Jul 18, 2024 · 12 comments

@xiaoliu1019

Search before asking

  • I searched in the issues and found nothing similar.

Version

pulsar cpp client 3.5.1

Minimal reproduce step

Run the example SampleAsyncProducer.cc.

What did you expect to see?

The example produces the messages.

What did you see instead?

The example crashes with a core dump.

Anything else?

No response

Are you willing to submit a PR?

  • I'm willing to submit a PR!
@BewareMyPower
Contributor

Please provide more info:

  • Your OS and compiler
  • How did you use the library? Installed from the official pre-built libraries or built from source? If built from source, please provide the detailed steps.

@BewareMyPower BewareMyPower changed the title [Bug] [Bug] SampleAsyncProducer causes core dump Jul 18, 2024
@xiaoliu1019
Author

The OS is TencentOS, which is based on CentOS.
I use the official pre-built libraries, and my compiler is g++ 8.5.0.
To compile successfully, I must add the compilation option --copt=-D_GLIBCXX_USE_CXX11_ABI=0.
With that option, the program core dumps when the sendAsync callback is invoked.
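
For readers unsure which libstdc++ ABI their own build ends up with, a minimal sketch follows. It assumes GCC with libstdc++; the helper name libstdcxxAbi is hypothetical and not part of the Pulsar client. Compile it with exactly the same flags as the application.

```cpp
#include <string>  // any libstdc++ header defines _GLIBCXX_USE_CXX11_ABI

// Hypothetical helper: reports which libstdc++ ABI this translation unit
// was compiled with. It says nothing about how libpulsar itself was built.
inline const char* libstdcxxAbi() {
#if !defined(__GLIBCXX__)
    return "not libstdc++";
#elif defined(_GLIBCXX_USE_CXX11_ABI) && _GLIBCXX_USE_CXX11_ABI
    return "cxx11";      // default on most modern distributions
#else
    return "pre-cxx11";  // what -D_GLIBCXX_USE_CXX11_ABI=0 selects
#endif
}
```

Printing libstdcxxAbi() once from a build with the flag and once without shows whether the flag actually reaches the compiler. Objects built with different settings exchange std::string with different layouts, so they should not be mixed across a library boundary.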

@BewareMyPower BewareMyPower self-assigned this Jul 18, 2024
@BewareMyPower
Contributor

BewareMyPower commented Jul 18, 2024

I can reproduce this issue, so I've assigned it to myself for now.

(gdb) bt
#0  std::_Function_handler<void (pulsar::Result, pulsar::MessageId const&), void (*)(pulsar::Result, pulsar::MessageId const&)>::_M_invoke(std::_Any_data const&, pulsar::Result&&, pulsar::MessageId const&) (__functor=..., 
    __args#0=<error reading variable>, __args#1=...) at /usr/include/c++/8/bits/std_function.h:297
#1  0x0000ffff820a89a8 in std::function<void (pulsar::Result, pulsar::MessageId const&)>::operator()(pulsar::Result, pulsar::MessageId const&) const (this=<optimized out>, __args#0=<optimized out>, __args#1=...)
    at /usr/include/c++/4.8.2/functional:2471
#2  0x0000ffff82079c04 in std::function<void (pulsar::Result, pulsar::MessageId const&)>::operator()(pulsar::Result, pulsar::MessageId const&) const (__args#1=..., __args#0=pulsar::ResultOk, this=0xffff7c006760)
    at /usr/include/c++/4.8.2/functional:2471
#3  pulsar::completeSendCallbacks (id=..., result=pulsar::ResultOk, callbacks=std::vector of length 1, capacity 1 = {...}) at /usr/src/debug/apache-pulsar-client-cpp-3.5.1/lib/MessageAndCallbackBatch.cc:94
#4  pulsar::MessageAndCallbackBatch::__lambda5::operator() (id=..., result=pulsar::ResultOk, __closure=0xffff7c004f10) at /usr/src/debug/apache-pulsar-client-cpp-3.5.1/lib/MessageAndCallbackBatch.cc:100

@BewareMyPower
Contributor

It seems to be a libstdc++ incompatibility with GCC 4.8.

It should already be fixed by #428. Could you try the RPM packages in https://github.com/BewareMyPower/pulsar-client-cpp/actions/runs/9535942883?

@xiaoliu1019
Author

I switched to the new pre-built libraries you linked (https://github.com/BewareMyPower/pulsar-client-cpp/actions/runs/9535942883).
When I run the program, it still core dumps:

2024-07-19 15:42:50.879 INFO  [140737353004672] ClientConnection:187 | [<none> -> pulsar://21.6.118.142:6650] Create ClientConnection, timeout=2000
2024-07-19 15:42:50.879 INFO  [140737353004672] ConnectionPool:124 | Created connection for pulsar://21.6.118.142:6650-pulsar://21.6.118.142:6650-0
[New Thread 0x7fffdc8ca700 (LWP 58501)]
2024-07-19 15:42:50.881 INFO  [140736901986048] ClientConnection:403 | [21.6.92.133:51892 -> 21.6.118.142:6650] Connected to broker
Missing separate debuginfos, use: dnf debuginfo-install bash-4.4.20-4.tl3.tencentos.x86_64 brotli-1.0.6-3.tl3.x86_64 cyrus-sasl-lib-2.1.27-6.tl3.x86_64 glibc-2.28-225.tl3.6.x86_64 keyutils-libs-1.5.10-9.tl3.x86_64 krb5-libs-1.18.2-22.tl3.x86_64 libcom_err-1.45.6-5.tl3.x86_64 libcurl-7.61.1-33.tl3.x86_64 libgcc-8.5.0-18.tl3.x86_64 libidn2-2.2.0-1.tl3.x86_64 libnghttp2-1.33.0-5.tl3.x86_64 libpsl-0.20.2-6.tl3.x86_64 libselinux-2.9-8.tl3.x86_64 libssh-0.9.6-10.tl3.x86_64 libstdc++-8.5.0-18.tl3.x86_64 libxcrypt-4.1.1-6.tl3.x86_64 pcre2-10.32-3.tl3.x86_64 zlib-1.2.11-21.tl3.x86_64
--Type <RET> for more, q to quit, c to continue without paging--

Thread 61 "Pulsar_producer" received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0x7fffdd0cb700 (LWP 58500)]
0x0000000000000000 in ?? ()
(gdb) bt
#0  0x0000000000000000 in ?? ()
#1  0x0000000000aa8c48 in google::protobuf::MessageLite::SerializePartialToArray (this=0x7fffdd0acd60, data=0x7fffcc001fb8, size=41)
    at external/protobuf_archive/src/google/protobuf/message_lite.cc:489
#2  0x0000000000aa97ba in google::protobuf::MessageLite::SerializeToArray (this=0x7fffdd0acd60, data=0x7fffcc001fb8, size=41)
    at external/protobuf_archive/src/google/protobuf/message_lite.cc:481
#3  0x00007ffff6f37305 in pulsar::Commands::writeMessageWithSize(pulsar::proto::BaseCommand const&) () from /lib/libpulsar.so
#4  0x00007ffff6f37f3a in pulsar::Commands::newConnect(std::shared_ptr<pulsar::Authentication> const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, bool, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, pulsar::Result&) () from /lib/libpulsar.so
#5  0x00007ffff6eec8b8 in pulsar::ClientConnection::handleHandshake(std::error_code const&) () from /lib/libpulsar.so
#6  0x00007ffff6eef0ab in pulsar::ClientConnection::handleTcpConnected(std::error_code const&, asio::ip::basic_resolver_iterator<asio::ip::tcp>) () from /lib/libpulsar.so
#7  0x00007ffff6ef059b in pulsar::ClientConnection::handleResolve(std::error_code const&, asio::ip::basic_resolver_iterator<asio::ip::tcp>)::{lambda(std::error_code const&)#2}::operator()(std::error_code const&) const () from /lib/libpulsar.so
#8  0x00007ffff6ef0a04 in asio::detail::reactive_socket_connect_op<pulsar::ClientConnection::handleResolve(std::error_code const&, asio::ip::basic_resolver_iterator<asio::ip::tcp>)::{lambda(std::error_code const&)#2}, asio::any_io_executor>::do_complete(void*, asio::detail::scheduler_operation*, std::error_code const&, unsigned long) ()
   from /lib/libpulsar.so
#9  0x00007ffff6f024f6 in asio::detail::epoll_reactor::descriptor_state::do_complete(void*, asio::detail::scheduler_operation*, std::error_code const&, unsigned long) ()
   from /lib/libpulsar.so
#10 0x00007ffff6f01e32 in asio::detail::scheduler::run(std::error_code&) () from /lib/libpulsar.so
#11 0x00007ffff6f7a37a in pulsar::ExecutorService::start()::{lambda()#1}::operator()() const [clone .isra.240] () from /lib/libpulsar.so
#12 0x00007ffff762cf43 in execute_native_thread_routine () from /lib/libpulsar.so
#13 0x00007ffff68e91ca in start_thread () from /lib64/libpthread.so.0
#14 0x00007ffff5a22e73 in clone () from /lib64/libc.so.6

Here are my compilation options:

build --compilation_mode=dbg
build --cxxopt="--std=c++17"
build --copt=-O2
test --cache_test_results=no --test_output=errors

@BewareMyPower
Contributor

The path external/protobuf_archive/src/google/protobuf/message_lite.cc looks suspicious. The library was built via vcpkg, and I cannot find any directory named protobuf_archive. Could you check the link paths of the dynamic library and the executable via ldd?

BTW, could you try building in release mode?

@xiaoliu1019
Author

I can build in release mode, but the same error occurs at runtime.
The ldd output:

 ldd ./bazel-bin/example/Pulsar_producer_client
        linux-vdso.so.1 (0x00007ffc4fbbc000)
        /$LIB/libonion.so => /lib64/libonion.so (0x00007f04a54d0000)
        libcurl.so.4 => /lib64/libcurl.so.4 (0x00007f04a501e000)
        libpulsar.so => /lib/libpulsar.so (0x00007f04a3fe0000)
        libpthread.so.0 => /lib64/libpthread.so.0 (0x00007f04a3dc0000)
        libm.so.6 => /lib64/libm.so.6 (0x00007f04a3a3e000)
        libdl.so.2 => /lib64/libdl.so.2 (0x00007f04a383a000)
        libstdc++.so.6 => /lib64/libstdc++.so.6 (0x00007f04a34a5000)
        libgcc_s.so.1 => /lib64/libgcc_s.so.1 (0x00007f04a328d000)
        libc.so.6 => /lib64/libc.so.6 (0x00007f04a2ec8000)
        /lib64/ld-linux-x86-64.so.2 (0x00007f04a52ad000)
        libnghttp2.so.14 => /lib64/libnghttp2.so.14 (0x00007f04a2ca1000)
        libidn2.so.0 => /lib64/libidn2.so.0 (0x00007f04a2a83000)
        libssh.so.4 => /lib64/libssh.so.4 (0x00007f04a2813000)
        libpsl.so.5 => /lib64/libpsl.so.5 (0x00007f04a2602000)
        libssl.so.1.1 => /lib64/libssl.so.1.1 (0x00007f04a2366000)
        libcrypto.so.1.1 => /lib64/libcrypto.so.1.1 (0x00007f04a1e74000)
        libgssapi_krb5.so.2 => /lib64/libgssapi_krb5.so.2 (0x00007f04a1c1f000)
        libkrb5.so.3 => /lib64/libkrb5.so.3 (0x00007f04a1935000)
        libk5crypto.so.3 => /lib64/libk5crypto.so.3 (0x00007f04a171e000)
        libcom_err.so.2 => /lib64/libcom_err.so.2 (0x00007f04a151a000)
        libldap-2.4.so.2 => /lib64/libldap-2.4.so.2 (0x00007f04a5470000)
        liblber-2.4.so.2 => /lib64/liblber-2.4.so.2 (0x00007f04a545e000)
        libbrotlidec.so.1 => /lib64/libbrotlidec.so.1 (0x00007f04a544f000)
        libz.so.1 => /lib64/libz.so.1 (0x00007f04a1302000)
        libunistring.so.2 => /lib64/libunistring.so.2 (0x00007f04a0f81000)
        librt.so.1 => /lib64/librt.so.1 (0x00007f04a0d79000)
        libkrb5support.so.0 => /lib64/libkrb5support.so.0 (0x00007f04a0b68000)
        libkeyutils.so.1 => /lib64/libkeyutils.so.1 (0x00007f04a5446000)
        libresolv.so.2 => /lib64/libresolv.so.2 (0x00007f04a0950000)
        libsasl2.so.3 => /lib64/libsasl2.so.3 (0x00007f04a5424000)
        libbrotlicommon.so.1 => /lib64/libbrotlicommon.so.1 (0x00007f04a5401000)
        libselinux.so.1 => /lib64/libselinux.so.1 (0x00007f04a0725000)
        libcrypt.so.1 => /lib64/libcrypt.so.1 (0x00007f04a53d6000)
        libpcre2-8.so.0 => /lib64/libpcre2-8.so.0 (0x00007f04a04a1000)

It looks relatively normal.
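
ldd lists which shared libraries get loaded, but not which loaded object a given function address belongs to. In the backtrace above, frames #1 and #2 (the protobuf frames) carry no "from /lib/libpulsar.so" suffix, unlike frames #3 onward. A minimal sketch for checking this at runtime follows, assuming Linux with glibc; the helper names loadedObjects and objectContaining are hypothetical.

```cpp
#include <link.h>  // dl_iterate_phdr, dl_phdr_info, PT_LOAD (glibc-specific)

#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Lists every object mapped into this process, similar to ldd but at runtime.
// glibc reports the main executable (and, on some versions, the vDSO) with an
// empty name.
inline std::vector<std::string> loadedObjects() {
    std::vector<std::string> names;
    dl_iterate_phdr(
        [](dl_phdr_info* info, std::size_t, void* data) -> int {
            auto* out = static_cast<std::vector<std::string>*>(data);
            const std::string name = info->dlpi_name ? info->dlpi_name : "";
            out->push_back(name.empty() ? "<main executable or vDSO>" : name);
            return 0;  // zero means "keep iterating"
        },
        &names);
    return names;
}

struct AddressQuery {
    std::uintptr_t address;
    std::string owner;
};

// Returns the path of the loaded object whose PT_LOAD segment contains
// `address`, "<main executable>" for the program itself, or "<unknown>".
inline std::string objectContaining(const void* address) {
    AddressQuery query{reinterpret_cast<std::uintptr_t>(address), "<unknown>"};
    dl_iterate_phdr(
        [](dl_phdr_info* info, std::size_t, void* data) -> int {
            auto* q = static_cast<AddressQuery*>(data);
            for (int i = 0; i < info->dlpi_phnum; ++i) {
                const auto& segment = info->dlpi_phdr[i];
                if (segment.p_type != PT_LOAD) continue;
                const std::uintptr_t begin = info->dlpi_addr + segment.p_vaddr;
                if (q->address >= begin && q->address < begin + segment.p_memsz) {
                    const bool named = info->dlpi_name && *info->dlpi_name;
                    q->owner = named ? info->dlpi_name : "<main executable>";
                    return 1;  // non-zero stops the iteration
                }
            }
            return 0;
        },
        &query);
    return query.owner;
}
```

Passing an address taken from a backtrace frame to objectContaining tells whether that frame lives in the executable or in a shared library. If a protobuf function resolves to the executable while its caller lives in libpulsar.so, that would be consistent with two protobuf copies being mixed in one process.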

@BewareMyPower
Contributor

This binary links against many unrelated dynamic libraries. Most of them come from libcurl and OpenSSL.

Which library did you use? Currently, it would be better to link libpulsar.so or libpulsarwithdeps.a. It seems that you're using libpulsar.a and linking to third-party dependencies from your system.

Besides, what is your compiler toolchain? Generally, if you're building directly via g++ like the guide here, it should work.

@xiaoliu1019
Author

I use libpulsar.so from https://github.com/BewareMyPower/pulsar-client-cpp/actions/runs/9535942883.
It stops working as soon as I switch libpulsar.so from version 3.5.1 to 3.6.0.
Also, it works when I build directly with g++, but it does not work when I use Bazel (with the default compiler toolchain).
Could it be that the Bazel external dependency and libpulsar.so use different versions of external/protobuf_archive/src/google/protobuf/message_lite.cc?

@BewareMyPower
Contributor

> Could it be that the Bazel external dependency and libpulsar.so use different versions of external/protobuf_archive/src/google/protobuf/message_lite.cc?

Yes, that's right. So I believe something is wrong with your Bazel project. I worked with Bazel a few years ago. Could you share a minimal reproducible Bazel project?

@xiaoliu1019
Author

xiaoliu1019 commented Aug 1, 2024

Why does lib/Commands.cc include <pulsar/Version.h>? There is no such file in the source tree,
so I can't compile the source code.

@BewareMyPower
Contributor

BewareMyPower commented Aug 1, 2024

This header is generated by CMake:

configure_file(templates/Version.h.in include/pulsar/Version.h @ONLY)
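
For readers unfamiliar with configure_file, the @ONLY mode copies the template and replaces @NAME@ placeholders with the values of CMake variables, leaving ${NAME} text untouched. The following is a toy model of that substitution, not CMake's actual implementation; the function name substituteAtOnly and the placeholder EXAMPLE_VERSION are hypothetical.

```cpp
#include <cctype>
#include <cstddef>
#include <map>
#include <string>

// Toy model of configure_file(... @ONLY): replaces each @NAME@ with the value
// stored under NAME. Unknown names become empty text, and ${NAME} is left as is.
inline std::string substituteAtOnly(const std::string& text,
                                    const std::map<std::string, std::string>& vars) {
    std::string out;
    std::size_t i = 0;
    while (i < text.size()) {
        if (text[i] != '@') {
            out += text[i++];
            continue;
        }
        // Scan a candidate name made of letters, digits, and underscores.
        std::size_t end = i + 1;
        while (end < text.size() &&
               (std::isalnum(static_cast<unsigned char>(text[end])) || text[end] == '_')) {
            ++end;
        }
        const bool closed = end > i + 1 && end < text.size() && text[end] == '@';
        if (!closed) {
            out += text[i++];  // a lone '@' is copied through unchanged
            continue;
        }
        const auto it = vars.find(text.substr(i + 1, end - i - 1));
        if (it != vars.end()) out += it->second;
        i = end + 1;  // skip past the closing '@'
    }
    return out;
}
```

For example, substituteAtOnly("#define V \"@EXAMPLE_VERSION@\"", {{"EXAMPLE_VERSION", "3.5.1"}}) yields #define V "3.5.1". This is why the header exists only in the build directory after CMake has run, and why a build that bypasses CMake has to generate or supply it by other means.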
