build: Allow simple fixed library names (as used in pkgsrc) #4252

Open
wants to merge 1 commit into
base: develop
drhpc
Contributor

@drhpc drhpc commented Oct 8, 2023

This is the current state of OpenBLAS library naming for the pkgsrc packages (NetBSD and cross-platform).

The idea is to be able to

  1. have a stable library name to expect from the build
  2. install multiple variants next to each other (single-threaded, parallel)

For this, the new switch FIXED_LIBNAME is introduced. The build uses

make FIXED_LIBNAME=1 LIBNAMESUFFIX=openmp

In the case of 64 bit indices, it uses

make FIXED_LIBNAME=1 LIBNAMESUFFIX=openmp INTERFACE64=1 LIBSONAMEBASE=openblas64

This results in what I assumed to be the intention of the name variables to begin with, namely primary predictable library names like

libopenblas_openmp.so
libopenblas64_openmp.so

(As pkgsrc keeps a list of expected file names resulting from a build on the user machine, names with the CPU model in them, plus symlinks, are too funky.)
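To make the naming rule concrete, here is a minimal sketch of how a packaging script could predict the file name from the two name variables. The function `expected_libname` is a hypothetical helper, not part of OpenBLAS or pkgsrc; it only mirrors the two example names above and assumes that an empty suffix yields the plain `lib<base>.so`.

```shell
# Hypothetical helper: predict the shared library name from the name variables.
# It mirrors the example names in this post, not the actual Makefile logic.
expected_libname() {
    base=${1:-openblas}   # LIBSONAMEBASE, defaulting to "openblas"
    suffix=${2:-}         # LIBNAMESUFFIX, may be empty (assumption: no suffix part then)
    if [ -n "$suffix" ]; then
        printf 'lib%s_%s.so\n' "$base" "$suffix"
    else
        printf 'lib%s.so\n' "$base"
    fi
}

expected_libname openblas openmp      # prints libopenblas_openmp.so
expected_libname openblas64 openmp    # prints libopenblas64_openmp.so
```

A file list for a package could then be generated before the build runs, instead of being discovered afterwards.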

I see that the naming scheme of the 64 bit indices lib is still under discussion, along with the symbol suffix. So far, we use the names without renamed symbols. They are all installable at the same time and user software can choose which one to use, also using the pkg-config file (which could be handled more elegantly). Without an established convention, I didn't think we should make the call. It depends on what downstream packages use at some point.

I don't claim that I am not causing more mess than necessary, or that I fully understand the already present mess. Just throwing this suggestion out there. Maybe the FIXED_LIBNAME switch or something along those lines can be included. Or we work out something better that fulfills the demands of a pkgsrc installation.

See the GitHub mirror of pkgsrc for the patches in use.

Also, I hope I didn't break any other configuration.

And, lastly, one idea: If we settle on OpenMP always and the ILP64 symbols with suffix, we could just get away with building one libopenmp.so (with libtool versioning, perhaps, libopenblas.so.0.4.33 and symlinks …). That would also be great and lots of Makefile machinery could go. Until then, please consider my hack on top;-)

https://github.com/NetBSD/pkgsrc/tree/trunk/math/openblas
https://github.com/NetBSD/pkgsrc/tree/trunk/math/openblas64
https://github.com/NetBSD/pkgsrc/tree/trunk/math/openblas_openmp
https://github.com/NetBSD/pkgsrc/tree/trunk/math/openblas64_openmp
https://github.com/NetBSD/pkgsrc/tree/trunk/math/openblas_pthread
https://github.com/NetBSD/pkgsrc/tree/trunk/math/openblas64_pthread

@rgommers
Contributor

Thanks for sharing the pkgsrc needs and details @drhpc, interesting.

I tried this PR with the given instructions above - the build works but the install doesn't; the static library name isn't yet handled correctly it looks like:

$ make FIXED_LIBNAME=1 LIBNAMESUFFIX=openmp
$ make PREFIX=$PWD/install-dir install
make -f Makefile.install install
make[1]: Entering directory '/home/rgommers/code/tmp/OpenBLAS'
Generating openblas_config.h in /home/rgommers/code/tmp/OpenBLAS/install-dir/include
Generating f77blas.h in /home/rgommers/code/tmp/OpenBLAS/install-dir/include
Generating cblas.h in /home/rgommers/code/tmp/OpenBLAS/install-dir/include
Copying LAPACKE header files to /home/rgommers/code/tmp/OpenBLAS/install-dir/include
Copying the static library to /home/rgommers/code/tmp/OpenBLAS/install-dir/lib
install: cannot stat 'libopenblas_skylakexp-r0.3.24.dev.a': No such file or directory
make[1]: *** [Makefile.install:66: install] Error 1
make[1]: Leaving directory '/home/rgommers/code/tmp/OpenBLAS'
make: *** [Makefile:411: install] Error 2

> LIBSOBASENAME=openblas64

This should be LIBSONAMEBASE rather than LIBSOBASENAME it looks like (maybe edit your comment?).

> be able to have a stable library name to expect from the build

You can already achieve this by just renaming the library after it's built, right? From my point of view, having to support a lot of ad-hoc naming conventions that different distros already make (I need to support auto-detection and linking against OpenBLAS in NumPy, SciPy and Meson), a free-form set-name-to-xxx switch is not great. I'd much rather extend the current naming scheme with fixed conventions and a boolean flag to say "give me those".

The current convention is openblas.so for LP64 and openblas64.so for ILP64. And now you want to extend that with:

> install multiple variants next to each other (single-threaded, parallel)

That seems very reasonable. I think any fixed naming works. MKL has a very structured scheme that I think is clear; doing something similar but using pthread instead of MKL's sequential (and -seq in .pc files) makes sense. For OpenMP, one question that comes to mind is if it should distinguish between GNU/LLVM/Intel OpenMP?

> If we settle on OpenMP always and the ILP64 symbols with suffix

OpenMP isn't always an option - for example on PyPI, where it's not possible to distribute an OpenMP runtime as a separate package, and vendoring it is not pretty (so we prefer pthreads instead for PyPI).

@martin-frbg
Collaborator

Actually I'd prefer that distributions rename or softlink the current, limited set of library names according to their needs, rather than add another option. And I assume the current situation must be bad enough for third parties like flexiblas

@drhpc
Contributor Author

drhpc commented Oct 12, 2023

@rgommers About static libs: You missed giving the settings to the install make process as well. These are transient variables; that is why you could write them into Makefile.rule to make them persist.

~/src/OpenBLAS$ make FIXED_LIBNAME=1 LIBNAMESUFFIX=openmp -j6
[…]
 OpenBLAS build complete. (BLAS CBLAS LAPACK LAPACKE)

  OS               ... Linux             
  Architecture     ... x86_64               
  BINARY           ... 64bit                 
  C compiler       ... GCC  (cmd & version : cc (Ubuntu 11.4.0-1ubuntu1~22.04) 11.4.0)
  Fortran compiler ... GFORTRAN  (cmd & version : GNU Fortran (Ubuntu 11.4.0-1ubuntu1~22.04) 11.4.0)
  Library Name     ... libopenblas_openmp.a (Multi-threading; Max num-threads is 12)

To install the library, you can run "make PREFIX=/path/to/your/installation install".

Note that any flags passed to make during build should also be passed to make install
to circumvent any install errors.
~/src/OpenBLAS$ make FIXED_LIBNAME=1 LIBNAMESUFFIX=openmp PREFIX=/dev/shm/openblas install
make -f Makefile.install install
[…]
Copying the static library to /dev/shm/openblas/lib
Copying the shared library to /dev/shm/openblas/lib
Install OK!
[…]
~/src/OpenBLAS$ find /dev/shm/openblas/
/dev/shm/openblas/
/dev/shm/openblas/bin
/dev/shm/openblas/lib
/dev/shm/openblas/lib/libopenblas_openmp.so
/dev/shm/openblas/lib/libopenblas_openmp.so.0
/dev/shm/openblas/lib/libopenblas_openmp.a
/dev/shm/openblas/lib/pkgconfig
/dev/shm/openblas/lib/pkgconfig/openblas_openmp.pc
/dev/shm/openblas/lib/cmake
/dev/shm/openblas/lib/cmake/openblas
/dev/shm/openblas/lib/cmake/openblas/OpenBLASConfigVersion.cmake
/dev/shm/openblas/lib/cmake/openblas/OpenBLASConfig.cmake
/dev/shm/openblas/include
/dev/shm/openblas/include/lapacke_utils.h
/dev/shm/openblas/include/lapacke_mangling.h
/dev/shm/openblas/include/lapacke_config.h
/dev/shm/openblas/include/lapacke.h
/dev/shm/openblas/include/lapack.h
/dev/shm/openblas/include/cblas.h
/dev/shm/openblas/include/f77blas.h
/dev/shm/openblas/include/openblas_config.h

This obviously misses OPENBLAS_INCLUDE_DIR and OPENBLAS_CMAKE_DIR settings, but both libs are there (sans soversioning like usual system libs). I'll reply to other points in another comment.
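One way to avoid the mismatch between build and install is to keep the naming flags in a single variable and reuse it for both steps. The following is a minimal sketch; the variable names `OPENBLAS_FLAGS` and `PREFIX_DIR` are made up for illustration, and the script only prints the two command lines so it can be tried outside an OpenBLAS source tree.

```shell
# Keep the naming flags in one place so build and install cannot drift apart.
OPENBLAS_FLAGS="FIXED_LIBNAME=1 LIBNAMESUFFIX=openmp"
PREFIX_DIR=/dev/shm/openblas   # install prefix used in the transcript above

build_cmd="make $OPENBLAS_FLAGS"
install_cmd="make $OPENBLAS_FLAGS PREFIX=$PREFIX_DIR install"

# Print the commands; inside the OpenBLAS source tree they could be run
# with: sh -c "$build_cmd" && sh -c "$install_cmd"
printf '%s\n' "$build_cmd" "$install_cmd"
```

Writing the settings into Makefile.rule, as mentioned above, achieves the same without a wrapper.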

@drhpc
Contributor Author

drhpc commented Oct 12, 2023

Compiler-specific variants next to each other: Oh, wow. It can be even more complex! @rgommers are you referring only to the OpenMP runtime itself or also to having OpenBLAS built with a differing compiler? In any case, I think that's too far. I don't see other projects branching out installs for differing OpenMP runtimes. It is already a rare nuisance that a library has serial and parallel variants (right now, I can only think of fftw as another example, with similar naming issues … threads, MPI, single, double, quad precision). I'd rather have that space reduced than enlarged.

For a packager, it is weird to re-build a library for a differing API/ABI subset and have use cases for both variants in the same system. I know where it comes from. My own atmospheric model code of course knows a switch to build with single or double precision. That's what scientists play with. But once such code enters mainstream system installs, it's weird. Offering the same API and ABI for differing data types and other conditions is a nightmare for keeping track of who is crashing because of which code.

That's why a symbol suffix is discussed/supported for ILP64, but sadly it was not a standard from the beginning. With the suffix there, one could easily roll the 64 bit interface into the same library binary and just offer libopenblas without confusion. One could imagine adding the openmp and pthread API into the same binary, with the somewhat ugly side effect that a serial or pthread application now links with the openmp runtime for the shared lib.

(PyPI cannot ship an OpenMP runtime!? I did not think about that angle before. I must admit that openmp runtime as separate thing only entered my mind with pkgsrc, which caters for NetBSD installs that indeed miss libgomp. My world starts with a compiler toolchain, MPI library on top … and naturally includes an OpenMP lib. But of course, the world is wonderfully messy and colourful.)

Now, if there is a standard switch for non-conflicting library names that I can predict, I'd use it. It will be annoying to switch to names that differ from what we settled on, but most pkgsrc installs use plain netlib where the user doesn't care, and people dealing with scientific code are used to some pain.

@martin-frbg The problem I had with the current names is that they are not (easily) predictable. If building with DYNAMIC_ARCH=0, I get

  Library Name     ... libopenblas_haswellp-r0.3.24.dev.a (Multi-threading; Max num-threads is 12)
[..]
~/src/OpenBLAS$ cat /proc/cpuinfo 
[…]
model name	: Intel(R) Core(TM) i7-10710U CPU @ 1.10GHz
[…]

I can derive the r$VERSION part, I can even derive the p part (though pthread or openmp?). But how should a non-trivial build script know that a native build on a Core i7-10710 will cause the library to be named after Haswell?
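To illustrate which parts of the default name are predictable, here is a rough sketch that assembles the name from its components. The function `default_libname` is hypothetical and modeled only on the two file names that appear in this thread (`libopenblas_haswellp-r0.3.24.dev.a` and `libopenblas_skylakexp-r0.3.24.dev.a`), not on the actual Makefile logic. The point is the first argument: the core name is detected at build time and cannot be supplied by the build script in advance.

```shell
# Hypothetical reconstruction of the default file name, based on the names
# seen in this thread: lib openblas _ <core> [p] -r <version> . <ext>
default_libname() {
    core=$1      # detected CPU core, e.g. "haswell" (the unpredictable part)
    smp=$2       # 1 for a multi-threaded build, anything else for serial
    version=$3   # release version, e.g. "0.3.24.dev"
    ext=$4       # "a" or "so"
    p=""
    if [ "$smp" = 1 ]; then
        p="p"    # the same "p" is used for pthread and OpenMP builds
    fi
    printf 'libopenblas_%s%s-r%s.%s\n' "$core" "$p" "$version" "$ext"
}

default_libname haswell 1 0.3.24.dev a    # prints libopenblas_haswellp-r0.3.24.dev.a
```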

If there were a fixed scheme that

  1. encodes ilp64 and parallelization differences to unique names,
  2. does not encode CPU variant, and
  3. has or has not library version (release tarball version) in the name,

I could use it for a build in pkgsrc that wants to know the resulting file names in advance. I won't die on the hill of the exact naming scheme. I don't like introducing additional switches, but I saw all that complicated naming machinery and saw how I could bend it to my will. It would have been fine if it were always just libopenblas.so.0 for me to rename.

But thinking about libopenblas.so.0

$ LANG=C readelf -d libopenblas_haswellp-r0.3.24.dev.so|grep SONAME
 0x000000000000000e (SONAME)             Library soname: [libopenblas.so.0]
$ LANG=C readelf -d /dev/shm/openblas/lib/libopenblas_openmp.so | grep SONAME
 0x000000000000000e (SONAME)             Library soname: [libopenblas_openmp.so.0]

If all variants have SONAME libopenblas.so.0 by default, simple renaming and symlinking the library just won't do. There can be only one libopenblas.so.0 that is found at runtime, regardless of which variant the binary was linked to! I'm glad that my hack at least gets that right. An application wanting libopenblas_openmp.so does get the respective library at runtime.
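The check behind this can be scripted. Below is a minimal sketch with two hypothetical helpers: `soname_from_readelf` extracts the SONAME from `readelf -d` output in the format shown above, and `soname_matches_file` tests whether that SONAME is the file name itself or the file name plus a version suffix. The sample lines are copied from the output above so the sketch runs without any library present.

```shell
# Extract the SONAME from `readelf -d` output read on stdin.
soname_from_readelf() {
    sed -n 's/.*Library soname: \[\(.*\)].*/\1/p'
}

# Succeed if the SONAME ($2) equals the library file name ($1),
# or the file name followed by a version suffix such as ".0".
soname_matches_file() {
    file=$(basename "$1")
    case "$2" in
        "$file"|"$file".*) return 0 ;;
        *) return 1 ;;
    esac
}

# Sample lines taken from the readelf output in this comment.
default_line=' 0x000000000000000e (SONAME)             Library soname: [libopenblas.so.0]'
fixed_line=' 0x000000000000000e (SONAME)             Library soname: [libopenblas_openmp.so.0]'

soname=$(printf '%s\n' "$default_line" | soname_from_readelf)
soname_matches_file libopenblas_haswellp-r0.3.24.dev.so "$soname" || echo "mismatch: $soname"

soname=$(printf '%s\n' "$fixed_line" | soname_from_readelf)
soname_matches_file /dev/shm/openblas/lib/libopenblas_openmp.so "$soname" && echo "match: $soname"
```

On a real installation the input would come from `LANG=C readelf -d "$lib" | soname_from_readelf` instead of the sample lines.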

So I add to my list above: The standard build should please have an SONAME that matches the primary library name. Binary distros that offer one choice of BLAS at a time via alternatives symlinks already hack the builds to always provide SONAME=libblas.so, right?

$ for blas in blas blas64; do l=$(readlink $(readlink /usr/lib/x86_64-linux-gnu/lib$blas.so)); echo "$blas: $l $(LANG= readelf -d $l|grep SONAME|awk '{print $5}')"; done
blas: /usr/lib/x86_64-linux-gnu/blas/libblas.so [libblas.so.3]
blas64: /usr/lib/x86_64-linux-gnu/openblas64-openmp/libblas64.so [libblas64.so.3]

Ah, libblas.so.3 it is … to match what current netlib does. That number is a mess in itself, having to ensure that the ABI really matches. But this is not the way we choose in pkgsrc, and more detailed in my installs of it in HPC environments. I focus on a build environment where people can choose which toolchain to use and which library to link into their applications. Each user does that by choosing environments with environment modules, but also via specific BLAS library names. Providing a separate environment for all possible choices wouldn't scale. As long as not everybody is shipping their own binary container, these two approaches to installing BLAS should prevail. One wants to clearly tell variants apart, but offer them at the same time, the other wants to blur the distinction to be able to fling binary symlinks around on a per-user (single-user Linux system) install.

@martin-frbg
Collaborator

Superseded by my #4485, I think, though my approach may be less elegant?
