Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add cmake build system #1049

Merged
merged 8 commits into from
Sep 7, 2023
Merged

Conversation

trws
Copy link
Member

@trws trws commented Jul 12, 2023

This is an experiment to see if we can use a fake configure script to get the benefits of both cmake, and all the CI tooling flux projects rely on.

@trws
Copy link
Member Author

trws commented Jul 13, 2023

Hey @grondo, I'm playing with this a bit as a compromise alternative. There are still some rough edges on it, but what would you think of an approach like this that can still be used as though it had an autoconf-generated configure script? Non-trivial upside is it's already almost fully compatible with the CI scripts without modifying them.

@grondo
Copy link
Contributor

grondo commented Jul 13, 2023

I was just going to try this out and I can't figure out how to get it to configure against a side installed flux-core. The autotools build configures flux-sched using the flux-core corresponding to the first flux found in PATH. Therefore you can do

$ /path/to/flux-core/bin/flux ./configure ...

To configure against a flux-core with prefix /path/to-flux-core/. How can I do the same with cmake?

I tried PKG_CONFIG_PATH=/tmp/grondo/lib/pkgconfig/ cmake ., but get the following error that I can't make heads nor talis of:

$ PKG_CONFIG_PATH=/tmp/grondo/lib/pkgconfig/ cmake .
-- Checking for module 'flux-taskmap'
--   Found flux-taskmap, version 0.52.0-1-ga35e212a0
CMake Error at cmake/FindFluxCore.cmake:19 (add_library):
  add_library cannot create ALIAS target "flux::core" because target
  "PkgConfig::FLUX_CORE" is IMPORTED.
Call Stack (most recent call first):
  CMakeLists.txt:48 (find_package)


CMake Error at cmake/FindFluxCore.cmake:20 (add_library):
  add_library cannot create ALIAS target "flux::hostlist" because target
  "PkgConfig::FLUX_HOSTLIST" is IMPORTED.
Call Stack (most recent call first):
  CMakeLists.txt:48 (find_package)

@grondo
Copy link
Contributor

grondo commented Jul 13, 2023

note that if we go this route another TODO will be to update the RPM spec file in the TOSS build farm.

Also, not sure we can drop the bionic CI builds since we haven't dropped support for that distro yet. Was there something that doesn't work there?

@trws
Copy link
Member Author

trws commented Jul 13, 2023 via email

@trws
Copy link
Member Author

trws commented Jul 13, 2023 via email

@trws
Copy link
Member Author

trws commented Jul 14, 2023

I was just going to try this out and I can't figure out how to get it to configure against a side installed flux-core. The autotools build configures flux-sched using the flux-core corresponding to the first flux found in PATH. Therefore you can do

$ /path/to/flux-core/bin/flux ./configure ...

Is this supposed to work? I just tried it, and sched's autoconf doesn't configure under those conditions, it seems to look for flux-core only in PKG_CONFIG_PATH and the selected PREFIX. I just pushed an update that matches the ax_flux_core macro in the cmake search, it also now checks for the correct cmake version. Given anything 3.12 or higher basic cases like that or with compiler or prefix flags should work. Coverage, sanitizers, and "--with" options would still be TODO.

@grondo
Copy link
Contributor

grondo commented Jul 14, 2023

Yes, it works for me (I tend to use it most of the time to configure against a side-installed flux-core). E.g. for flux-core installed with prefix=/tmp/grondo:

$ /tmp/grondo/bin/flux ./configure
checking for a BSD-compatible install... /usr/bin/install -c
checking whether build environment is sane... yes
checking for a thread-safe mkdir -p... /bin/mkdir -p
[snip]
checking for FLUX_CORE... yes
checking for flux... /tmp/grondo/bin/flux
checking for FLUX_IDSET... yes
checking for FLUX_SCHEDUTIL... yes
checking for FLUX_OPTPARSE... yes
checking for FLUX_HOSTLIST... yes
[snip]
  flux-sched version 0.28.0
  Prefix...........: /tmp/grondo
  Debug Build......: 
  C Compiler.......: gcc
  C++ Compiler.....: g++
  CFLAGS...........: -g -O2
  CPPFLAGS.......... 
  CXXFLAGS.......... -g -O2
  FLUX.............: /tmp/grondo/bin/flux
  FLUX_VERSION.....: 0.52.0-1-ga35e212a0
  FLUX_CORE_CFLAGS.: -I/tmp/grondo/include
  FLUX_CORE_LIBS...: -L/tmp/grondo/lib -lflux-core
  LIBFLUX_VERSION..: 0.52.0-1-ga35e212a0
  FLUX_PREFIX......: /tmp/grondo
  LDFLAGS..........: 
  LIBS.............: 
  Linker...........: /usr/bin/ld -m elf_x86_64

@grondo
Copy link
Contributor

grondo commented Jul 14, 2023

On TOSS4 systems, PYTHON=/usr/bin/python3 ./configure is required to successfully configure flux-core and flux-sched (the TCE python doesn't have the correct python deps). How do we do this with cmake? It seems to ignore PYTHON when run as cmake . or ./configure.

@trws
Copy link
Member Author

trws commented Jul 14, 2023

I'll add support for the env var, the cmake way is to do one of these:

-D Python_EXECUTABLE=<path to exe>
-D Python_ROOT_DIR=<path to install prefix>

@grondo
Copy link
Contributor

grondo commented Jul 14, 2023

Thanks, that worked!

@trws
Copy link
Member Author

trws commented Jul 14, 2023 via email

@grondo
Copy link
Contributor

grondo commented Jul 14, 2023

Ok, I had forgotten how this works. The main support was introduced in

b4bd5f0

which sets the default --prefix to the first flux(1) found in PATH :

commit b4bd5f09a2743600bd96d44f763d01e4d8ced501
Author: Mark A. Grondona <[email protected]>
Date:   Fri Feb 28 10:07:40 2020 -0800

    build: set --prefix to same prefix as flux(1) by default
    
    Use AC_PREFIX_PROGRAM([flux]) to have the default prefix of flux-sched
    set to the same prefix as the flux executable in PATH. For example,
    for a system with flux installed as /usr/bin/flux, prefix will default
    to /usr instead of /usr/local for flux-sched, unless
    
     ./configure --prefix=/usr/local
    
    is used.
    
    This also allows flux-sched to be configured against side-installed
    flux with simply:
    
     /path/to/side/install/bin/flux env ./configure

Then the stuff in ax_flux_core.m4 which prepends to PKG_CONFIG_PATH based on prefix takes over.

So, if you are using --prefix=/somewhere/else then that overrides the default prefix and the convenience hack to configure against side-installed flux-core is overridden.

@trws
Copy link
Member Author

trws commented Jul 14, 2023

Thanks @grondo, that helps a lot! I think I was using a wrapper script that set prefix and expecting it to still work because of the flux in path. I can replicate that in cmake pretty easily. Also tempted to add something to the cmake to find flux in path, then add $(dirname $(dirname $FLUX))/lib/pkgconfig to the PKG_CONFIG_PATH, that way even if they're in a separate prefix we get the "use the flux-core in path" behavior.

@grondo
Copy link
Contributor

grondo commented Jul 14, 2023

Sounds great to me!

@trws
Copy link
Member Author

trws commented Jul 14, 2023

Ok, finding flux-core by finding flux and matching prefix if it isn't explicitly set implemented. This allows flux-core to be found by path even when a different prefix is selected if the prefix for flux-core isn't explicitly specified. The order of priority is: FLUX_CORE_PREFIX > PKG_CONFIG_PATH > flux in path > default locations

@trws trws force-pushed the cmake-revert-checks branch 2 times, most recently from adfadf3 to c8dfa91 Compare July 24, 2023 18:35
@trws trws requested review from grondo and milroy July 25, 2023 20:24
@trws
Copy link
Member Author

trws commented Jul 25, 2023

Ok, this is now all there. I want to add a better C++ test system (Catch2 probably unless there are objections), include-what-you-use for CI, and a few other things, but this is a coherent build system and set of fixes. Given where it is, and the issues with boost I keep running into on some platforms, I would like to remove the autotools setup at some point, but not now. If we have a successful release using the cmake wrappers and everything, that's when I'd like to do it if that works for everyone.

@trws trws mentioned this pull request Jul 26, 2023
trws added a commit to trws/flux-core that referenced this pull request Aug 8, 2023
With gcc12+ the boost version we get in bookworm and jammy pops out a
maybe-uninitialized warning from one of its headers.  This seems to be
an issue with the sched build system, and should be fixed by the cmake
PR, but work around it for flux-sched check here until that's fixed
upstream.  Cross-linking to flux-framework/flux-sched#1049
@trws trws force-pushed the cmake-revert-checks branch 6 times, most recently from 7884632 to 9bbfe18 Compare August 8, 2023 16:52
@trws trws force-pushed the cmake-revert-checks branch 3 times, most recently from a1997a4 to 4e4e234 Compare August 22, 2023 17:33
@milroy
Copy link
Member

milroy commented Sep 5, 2023

I'm in the process of reviewing the PR, but I ran into a small issue in my bookworm Docker build:

milroy1@docker-desktop:/usr/src$ git config --global --add safe.directory /usr/src
milroy1@docker-desktop:/usr/src$ ./autogen.sh 
milroy1@docker-desktop:/usr/src$ ./configure --prefix=/usr --sysconfdir=/etc --localstatedir=/var
-- VER 37a272b7
CMake Error at CMakeLists.txt:11 (project):
  VERSION "37a272b7" format invalid.

It looks like this is caused by the fork not containing the upstream tags. To get the project to build and test successfully, I needed to set the version to something like 0.28.0 in a new file (flux-sched.ver).

You might want to allow a user-specified environment variable that overrides the version returned by git describe --always ....

Copy link
Member

@milroy milroy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a huge improvement to the build system in terms of speed, and I like the maintainability improvements, too. Thanks again for taking the time to walk me through the contributions last week!

I identified what I think is a minor problem with builds on Fluxion forks without tags, as well as a few small issues and nits related to commit messages. In general, the commit messages need to be wrapped, and in the standard Flux Framework Problem: ... format.

From a high-level, it's somewhat difficult to follow the PR because files are updated and lines changes several times along the course of the commits. This PR should be restructured so that the refactor commits are fixups that can be squashed in a rebase. I think that will condense the changes and help make them more understandable.

.github/workflows/main.yml Outdated Show resolved Hide resolved
CMakeLists.txt Show resolved Hide resolved
@@ -22,8 +22,15 @@ file(APPEND ${CONFIG_H} "/* Define _GNU_SOURCE so that we get all necessary prot
file(APPEND ${CONFIG_H} "#define _GNU_SOURCE 1 \n\n ")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: wrap the commit body at 72 or 80 characters (can't remember which is preferred for git).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These comments are ending up on lines of code rather than on commits, not sure if there's anything to do about that but it would be good to find a way to make the link to the commit more obvious.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It has always been annoying that there's no way to comment on a commit message itself. However, going through the commits in the "Commits" tab will only show the comments pertaining to that commit (though it still appears attached to a line of code)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I keep hoping we can do something better for that. Tools like stacked git, jujutsu, and even facebook's sapling with their experimental example stacked PR UI for github look promising, but nothing first party. 😢

Short of going as far as reviewing with a third-party UI or switching to gerritt it feels like we're a bit stuck.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here's the UI I was referring to from facebook: https://reviewstack.dev/ it also works with ghstack, if only

CMakeLists.txt Show resolved Hide resolved
CMakeLists.txt Outdated Show resolved Hide resolved
.configure-custom.sh Show resolved Hide resolved
CMakeLists.txt Show resolved Hide resolved
codecov.yml Show resolved Hide resolved
resource/planner/c/CMakeLists.txt Show resolved Hide resolved
@@ -10,9 +10,10 @@ RUN sudo apt-get -qq install -y --no-install-recommends \
libboost-system-dev \
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

High-level comment on this commit: it's not clear why the bionic, el7, and fedora33 dockerfiles are deleted in this commit.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's an artifact of this not having been rebased on top of the build-system update PR yet. This should evaporate after the rebase I just did. Will post after some cleanup.

@grondo
Copy link
Contributor

grondo commented Sep 6, 2023

One random other suggestion. We typically autogenerate release notes from PR titles (since they are used by default as the merge commit message), so before a merge the PR should possibly be titled something other than "Cmake revert checks" (which admittedly I'm not sure I understand what that means). Maybe just "switch build to cmake"

@trws
Copy link
Member Author

trws commented Sep 6, 2023 via email

@trws trws changed the title Cmake revert checks Add cmake build system Sep 6, 2023
@trws trws force-pushed the cmake-revert-checks branch 2 times, most recently from 2e2b763 to dc8f6c8 Compare September 7, 2023 00:41
@trws
Copy link
Member Author

trws commented Sep 7, 2023

Ok, that was one heck of a rebase... This may be the first time I've actually saved a backup of the origin branch just in case something got munged up in all the rewriting, but there are a lot less commits now and should be less files actually changed in this PR.

@milroy
Copy link
Member

milroy commented Sep 7, 2023

I started taking a look at the updates. The monster rebase definitely cleaned up the commits!

I ran into a problem with the update to CMakeLists.txt:

$ ./configure
-- VER dc8f6c88
CMake Error at CMakeLists.txt:14 (string):
  string sub-command REGEX, mode REPLACE needs at least 6 arguments total to
  command.
CMake Warning at CMakeLists.txt:15 (project):
  VERSION keyword not followed by a value or was followed by a value that
  expanded to nothing.
-- Building flux-sched version dc8f6c88
-- Boost version: 1.74.0
-- Configuring incomplete, errors occurred!
See also "/usr/src/CMakeFiles/CMakeOutput.log".

Which you don't hit unless all three are true: undefined FLUX_SCHED_VER environment variable, no VER_FILE, and no tags present in the fork. Quoting the final REGEX argument fixes the issue:

diff --git a/CMakeLists.txt b/CMakeLists.txt
index 07b32932..6c299b01 100644
--- a/CMakeLists.txt
+++ b/CMakeLists.txt
@@ -11,7 +11,7 @@ else()
   execute_process(COMMAND sh -c [=[git describe --always | awk '/.*/ {sub(/^v/, ""); printf "%s",$1; exit}']=] OUTPUT_VARIABLE FLUX_SCHED_VER)
 endif()
 message(STATUS "VER ${FLUX_SCHED_VER}")
-string(REGEX REPLACE "-.*$" "" FLUX_SCHED_VER_NOGIT ${FLUX_SCHED_VER})
+string(REGEX REPLACE "-.*$" "" FLUX_SCHED_VER_NOGIT "${FLUX_SCHED_VER}")
 project(flux-sched VERSION ${FLUX_SCHED_VER_NOGIT} LANGUAGES CXX C)
 message(STATUS "Building flux-sched version ${FLUX_SCHED_VER}")

@trws
Copy link
Member Author

trws commented Sep 7, 2023

Thanks for that @milroy, patched up and push incoming. I'll set MWP after that. If you get through a review and you're happy, approving should send it through.

@trws trws added the merge-when-passing mergify.io - merge PR automatically once CI passes label Sep 7, 2023
@milroy milroy removed the merge-when-passing mergify.io - merge PR automatically once CI passes label Sep 7, 2023
Copy link
Member

@milroy milroy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The first commit "ensure superceded PR workflows are cancelled" needs to wrap at 72 characters. Beyond that this PR is great and ready to merge. I dropped the MWP label and am approving this PR.

@trws, you can re-add MWP when you fix first commit message.

Add a concurrency specifier so that github actions cancels matching PR
pushes and similar, saves a lot of compute when iterating on commit
style and similar with force pushes.
It's unclear if this was causing problems, but it made the yaml fail
validation so it seems prudent to clean it up.
problem: sched's containers, matrix and check scripts were out of date
compared to core

solution: Pull in the new bookworm and fedora38 containers and multiarch
build setup from core, along with updated check files and fixes. We now
get arm64 containers out the other end.  I tried getting 32-bit also,
but it is very, very broken, and also complicated by a bug in qemu-user
and/or the linux kernel depending on who you ask.  Would be really nice
to have that extra check, but it's got to be a different PR at this
point. Add cmake and ninja to all containers for the new build setup.
While not strictly required, ninja is tiny and makes the builds and
rebuilds faster when used.
lots of noise gets generated by cmake and ctest without some ignores
This is a big one, that was originally developed as a series of commits,
major decisions are summarized below, but the short version is we have a
cmake-based build system that can be configured and built as though it
still had autotools.  To use the autotools build system, manually run
autoreconf to replace the configure script, and then build as usual.

We're adding a fake configure script here so that it has approximately
the same ergonomics as the autotools build system.  The only thing that
causes some breakage is that the autogen.sh script is a no-op, so
autoreconf needs to be run manually to do the autotools side of the
build system.

To mimic the autotools build system:
1. respect PYTHON env var

  This is a QOL improvement to allow specifying the python version the
  autotools way rather than with the cmake defines.

2. find flux-core by finding flux, match prefix if none specified

  @grondo pointed me to a feature I didn't know we had, where if
  flux-core is in path it's automatically found.  This does that in the
  cmake build system both for the way autotools does it, by checking the
  install prefix, and by checking path as well.

Note on yaml-cpp, while they provide an upstream cmake config that I
would prefer to use, not all of our docker containers actually offer
that support, so we're still using pkg-config for now.
This is custom, but replicates exactly what we do in autotools to
generate both manpages and html.
Problem: If no tag is available, the git describe could fail

Solution: Users can now specify either the cmake define FLUX_SCHED_VER
or the environment variable FLUX_SCHED_VERSION to override the version
detection and set it to whatever
problem: when none of the explicit version strings are set and no tags
are available, the regex command used to strip the sha from the end of
the version to make it pass as cmake's native version caused expansion
errors

solution (courtesy of @milroy): quote the last argument to `REGEX
REPLACE`
@trws
Copy link
Member Author

trws commented Sep 7, 2023

wrapped and pushed, resetting MWP

@codecov
Copy link

codecov bot commented Sep 7, 2023

Codecov Report

Merging #1049 (f9efce9) into master (c619fbe) will decrease coverage by 1.9%.
The diff coverage is n/a.

❗ Current head f9efce9 differs from pull request most recent head 5ecbd64. Consider uploading reports for the commit 5ecbd64 to get more accurate results

@@           Coverage Diff            @@
##           master   #1049     +/-   ##
========================================
- Coverage    73.4%   71.6%   -1.9%     
========================================
  Files          98      89      -9     
  Lines       11889   11569    -320     
========================================
- Hits         8737    8289    -448     
- Misses       3152    3280    +128     

see 19 files with indirect coverage changes

@trws trws added the merge-when-passing mergify.io - merge PR automatically once CI passes label Sep 7, 2023
@mergify mergify bot merged commit cc781dd into flux-framework:master Sep 7, 2023
19 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
merge-when-passing mergify.io - merge PR automatically once CI passes
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants