Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Arrow] Another try at building Arrow #5425

Merged
merged 22 commits into from
Nov 6, 2022

Conversation

evetion
Copy link
Contributor

@evetion evetion commented Sep 4, 2022

Previous attempts at #918 and #1645. This is a long shot, but I really like to have #5423 working with GeoParquet.

@giordano giordano added the long shot 🏹 This is going to be fun label Sep 4, 2022
A/Arrow/build_tarballs.jl Outdated Show resolved Hide resolved
A/Arrow/build_tarballs.jl Outdated Show resolved Hide resolved
@evetion
Copy link
Contributor Author

evetion commented Sep 5, 2022

It seems xsimd and re2 are required libraries that are not available on Yggdrasil (yet), so it will try to download and compile them on the fly, loosing our toolchain context in the process.

edit: I've added a RE2 build. xsimd seems to be header only, so that should "just work".

@evetion evetion mentioned this pull request Sep 7, 2022
@evetion
Copy link
Contributor Author

evetion commented Sep 11, 2022

Current state:

Windows builds failing on linking with:

[15:48:20] /opt/bin/x86_64-w64-mingw32-libgfortran5-cxx11/x86_64-w64-mingw32-g++ --sysroot=/opt/x86_64-w64-mingw32/x86_64-w64-mingw32/sys-root/  -Wno-noexcept-type  -fdiagnostics-color=always -O3 -DNDEBUG  -Wa,-mbig-obj -Wall -fno-semantic-interposition -mxsave  -O3 -DNDEBUG -Wl,--version-script=/workspace/srcdir/arrow-apache-arrow-9.0.0/cpp/src/arrow/symbols.map -shared -o ../../release/libarrow.dll -Wl,--out-implib,../../release/libarrow.dll.a -Wl,--major-image-version,900,--minor-image-version,0 -Wl,--whole-archive CMakeFiles/arrow_shared.dir/objects.a -Wl,--no-whole-archive @CMakeFiles/arrow_shared.dir/linklibs.rsp
[15:48:32] /opt/x86_64-w64-mingw32/bin/../lib/gcc/x86_64-w64-mingw32/8.1.0/../../../../x86_64-w64-mingw32/bin/ld: /opt/x86_64-w64-mingw32/x86_64-w64-mingw32/sys-root/lib/../lib/libmsvcrt.a(lib64_libmsvcrt_os_a-wcrtomb.o): in function `wcsrtombs':
[15:48:32] /workspace/srcdir/mingw-w64-v7.0.0/mingw-w64-crt/misc/wcrtomb.c:58: multiple definition of `wcsrtombs'; /workspace/destdir/lib/libzstd.dll.a(d000591.o):(.text+0x0): first defined here
[15:48:32] /opt/x86_64-w64-mingw32/bin/../lib/gcc/x86_64-w64-mingw32/8.1.0/../../../../x86_64-w64-mingw32/bin/ld: CMakeFiles/arrow_shared.dir/objects.a(scalar_string_utf8.cc.obj):scalar_string_utf8.cc:(.text+0x625): undefined reference to `__imp_utf8proc_category'
[15:48:32] /opt/x86_64-w64-mingw32/bin/../lib/gcc/x86_64-w64-mingw32/8.1.0/../../../../x86_64-w64-mingw32/bin/ld: CMakeFiles/arrow_shared.dir/objects.a(scalar_string_utf8.cc.obj):scalar_string_utf8.cc:(.text+0x10cc): undefined reference to `__imp_utf8proc_get_property'
[15:48:32] /opt/x86_64-w64-mingw32/bin/../lib/gcc/x86_64-w64-mingw32/8.1.0/../../../../x86_64-w64-mingw32/bin/ld: CMakeFiles/arrow_shared.dir/objects.a(scalar_string_utf8.cc.obj):scalar_string_utf8.cc:(.text+0x1129): undefined reference to `__imp_utf8proc_category'
[15:48:32] /opt/x86_64-w64-mingw32/bin/../lib/gcc/x86_64-w64-mingw32/8.1.0/../../../../x86_64-w64-mingw32/bin/ld: CMakeFiles/arrow_shared.dir/objects.a(scalar_string_utf8.cc.obj):scalar_string_utf8.cc:(.text+0x1229): undefined reference to `__imp_utf8proc_get_property'
[15:48:32] /opt/x86_64-w64-mingw32/bin/../lib/gcc/x86_64-w64-mingw32/8.1.0/../../../../x86_64-w64-mingw32/bin/ld: CMakeFiles/arrow_shared.dir/objects.a(scalar_string_utf8.cc.obj):scalar_string_utf8.cc:(.text+0x1308): undefined reference to `__imp_utf8proc_category'
...

FreeBSD failing (and some linux musl builds) on

[15:45:25] cd /workspace/srcdir/arrow-apache-arrow-9.0.0/cpp/build_dir/src/arrow && /usr/bin/ccache /opt/bin/x86_64-unknown-freebsd12.2-libgfortran5-cxx11/x86_64-unknown-freebsd12.2-clang++ --sysroot=/opt/x86_64-unknown-freebsd12.2/x86_64-unknown-freebsd12.2/sys-root/ -DARROW_EXPORTING -DARROW_HAVE_RUNTIME_SSE4_2 -DARROW_WITH_BACKTRACE -DARROW_WITH_BZ2 -DARROW_WITH_LZ4 -DARROW_WITH_RE2 -DARROW_WITH_SNAPPY -DARROW_WITH_TIMING_TESTS -DARROW_WITH_UTF8PROC -DARROW_WITH_ZLIB -DARROW_WITH_ZSTD -DBOOST_ALL_NO_LIB -DURI_STATIC_BUILD -I/workspace/srcdir/arrow-apache-arrow-9.0.0/cpp/build_dir/src -I/workspace/srcdir/arrow-apache-arrow-9.0.0/cpp/src -I/workspace/srcdir/arrow-apache-arrow-9.0.0/cpp/src/generated -isystem /workspace/srcdir/arrow-apache-arrow-9.0.0/cpp/thirdparty/flatbuffers/include -isystem /workspace/srcdir/arrow-apache-arrow-9.0.0/cpp/thirdparty/hadoop/include -isystem /workspace/srcdir/arrow-apache-arrow-9.0.0/cpp/build_dir/xsimd_ep/src/xsimd_ep-install/include -Qunused-arguments -fcolor-diagnostics -O3 -DNDEBUG  -Wall -Wno-unknown-warning-option -Wno-pass-failed  -O3 -DNDEBUG -fPIC -pthread -std=c++11 -MD -MT src/arrow/CMakeFiles/arrow_objlib.dir/compute/exec/sink_node.cc.o -MF CMakeFiles/arrow_objlib.dir/compute/exec/sink_node.cc.o.d -o CMakeFiles/arrow_objlib.dir/compute/exec/sink_node.cc.o -c /workspace/srcdir/arrow-apache-arrow-9.0.0/cpp/src/arrow/compute/exec/sink_node.cc
[15:45:27] /workspace/srcdir/arrow-apache-arrow-9.0.0/cpp/src/arrow/util/cpu_info.cc:303:11: error: use of undeclared identifier '_SC_LEVEL1_DCACHE_SIZE'
[15:45:27]           _SC_LEVEL1_DCACHE_SIZE,
[15:45:27]           ^
...

macOS and some Linux builds are failing on trying to configure jemalloc, so it wasn't found.

Some Linux arm builds failing on finding BZip2.

@giordano
Copy link
Member

giordano commented Sep 11, 2022

If you want to make everybody's life easier, when you show linking errors you also want to show the compiler invocation line. Because the undefined reference to error means that the linker can't find the mentioned symbol in any of the linked libraries, which often simply means that the library which is supposed to implement the missing symbol wasn't passed as argument. But if I don't see the compiler invocation I can't tell you "see, that's the problem".

@evetion
Copy link
Contributor Author

evetion commented Sep 11, 2022

If you want to make everybody's life easier, when you show linking errors you also want to show the compiler invocation line.

Yep 👍🏻, thanks for the feedback, updated my post and linked to the logs.

@giordano
Copy link
Member

Yeah, the compiler invocation

[15:48:20] /opt/bin/x86_64-w64-mingw32-libgfortran5-cxx11/x86_64-w64-mingw32-g++ --sysroot=/opt/x86_64-w64-mingw32/x86_64-w64-mingw32/sys-root/  -Wno-noexcept-type  -fdiagnostics-color=always -O3 -DNDEBUG  -Wa,-mbig-obj -Wall -fno-semantic-interposition -mxsave  -O3 -DNDEBUG -Wl,--version-script=/workspace/srcdir/arrow-apache-arrow-9.0.0/cpp/src/arrow/symbols.map -shared -o ../../release/libarrow.dll -Wl,--out-implib,../../release/libarrow.dll.a -Wl,--major-image-version,900,--minor-image-version,0 -Wl,--whole-archive CMakeFiles/arrow_shared.dir/objects.a -Wl,--no-whole-archive @CMakeFiles/arrow_shared.dir/linklibs.rsp

doesn't link utf8proc. Don't ask me how on earth this is supposed to work at all (unless they, wrongly, assume to do static linking everywhere).

@giordano giordano mentioned this pull request Sep 15, 2022
@evetion evetion marked this pull request as ready for review October 30, 2022 14:32
@evetion
Copy link
Contributor Author

evetion commented Oct 30, 2022

Fixed previous problems by just excluding zstd and utf8proc. Also disabled building utilities. Upgraded to Arrow v10.

Comment on lines 65 to 69
# CMake is doubling the suffixes...
if [[ "${target}" == *-mingw32 ]]; then
ln -s ${prefix}/lib/libthrift.dll.a ${prefix}/lib/libthrift.a.dll.a
ln -s ${prefix}/lib/libutf8proc.a ${prefix}/lib/libutf8proc.dll.a.a
fi
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds like something to fix in CMake, rather than creating files with wrong extensions which go into the tarball (at very least you should delete them at the end)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm still of the opinion this should be fixed in CMake, not worked around in this way. If that's of any help, you want to look at to CMAKE_FIND_LIBRARY_SUFFIXES.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed, I will look into it fixing/upstreaming (also the other bugs).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is now removed, turned out to be unnecessary in Arrow v10.

A/Arrow/build_tarballs.jl Outdated Show resolved Hide resolved
A/Arrow/bundled/patches/cxxflags.patch Outdated Show resolved Hide resolved
A/Arrow/bundled/patches/thrift.patch Show resolved Hide resolved
A/Arrow/build_tarballs.jl Outdated Show resolved Hide resolved
A/Arrow/build_tarballs.jl Outdated Show resolved Hide resolved
Comment on lines 65 to 69
# CMake is doubling the suffixes...
if [[ "${target}" == *-mingw32 ]]; then
ln -s ${prefix}/lib/libthrift.dll.a ${prefix}/lib/libthrift.a.dll.a
ln -s ${prefix}/lib/libutf8proc.a ${prefix}/lib/libutf8proc.dll.a.a
fi
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm still of the opinion this should be fixed in CMake, not worked around in this way. If that's of any help, you want to look at to CMAKE_FIND_LIBRARY_SUFFIXES.

A/Arrow/build_tarballs.jl Outdated Show resolved Hide resolved
A/Arrow/build_tarballs.jl Outdated Show resolved Hide resolved
@evetion
Copy link
Contributor Author

evetion commented Nov 6, 2022

Made the following issues:

A/Arrow/build_tarballs.jl Outdated Show resolved Hide resolved
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
long shot 🏹 This is going to be fun
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants