Max Horn edited this page Feb 29, 2016 · 3 revisions

Preface

This document is meant to be a design guideline for the new GAP build system. To this end, it

  • summarizes requirements the new build system should satisfy
  • discusses various off-the-shelf build tools out there
  • ponders how to deal with packages (including those which contain kernel extensions or standalone binaries)
  • ...

The thoughts collected here will then be used to drive the implementation of the actual new build system (and in turn, this development will lead to updates in this document).

Any questions, concerns or other feedback should for now be directed at Max Horn.

General Requirements

The following minimal requirements were assumed for the new build system. We do not present a rationale for them; if you have doubts about any, please bring them up on the mailing list or as an issue.

Portability requirements

  • MUST work flawlessly on major Linux distributions (Ubuntu, Fedora, ...), major *BSD variants, Mac OS X
  • SHOULD work on any POSIX compatible system
  • SHOULD work on Windows with Cygwin
  • MAY work on Windows without Cygwin (we would really like good Windows support overall, but without any experienced Windows developers to help out, this may be difficult; thus non-Cygwin Windows support is not a priority right now)

Tooling requirements

  • MAY require that users and developers have:

    • a working C compiler supporting ISO C99 or newer
    • a working C++ compiler
    • basic POSIX environment, including sh, make, rm, mkdir, ...
    • GNU make (if so, make sure it works with gmake on *BSD)
    • Python
    • ... ?
  • MAY require that developers install certain additional common tools, such as autoconf, ruby, ...

  • SHOULD NOT require users to install additional "exotic" tools.

    • That is, tools which are neither shipped with the system nor easily available via the system's standard package management procedures. Thus,
      • for OS X, tools should ideally be supplied with Xcode;
      • for Ubuntu, they should be installable for the most recent LTS release, which is 14.04;
      • and so on.
    • This means that e.g. cmake is at least problematic, because it requires users to have cmake in order to compile GAP, not just developers
    • However, we could considerably weaken this requirement if we managed to consistently provide up-to-date pre-built GAP binaries to users for at least Linux, Mac OS X and Windows (bonus points for additional Unix variants), as well as pre-made virtual machines and Docker images.

Other requirements

  • Support for out-of-tree builds.

    • So that one can compile multiple configurations from the same GAP sources (e.g. 32 and 64 bit build; with and without readline; builds for different architectures; ...)
  • Support parallel builds (e.g. via make -jN)

    • must be correct (not always the case, e.g. for naive Makefiles)
  • Provide proper dependency tracking

    • between .c and .h files
    • but also if e.g. parts of the build system get changed, the right thing should happen
  • changing build options and then building must work correctly.

    • build options could be e.g. debug mode on/off; HPC-GAP mode; etc.
    • Here, "work correctly" means that the equivalent of typing make needs to rebuild all files that are affected by the change; i.e. the result of make and make clean && make should be fully equivalent.
  • Maintainability, ease-of-use: It should be easy or "obvious" to add a new source file, and to perform other routine changes to the build system. Ideally, more complicated changes shouldn't be too hard either (or should at least be thoroughly documented).

  • It should be easy to build GAP against system-wide installations of GMP, readline, and other future dependencies

  • Reimer also wants (as far as I understood) support for doing make CFLAGS='-some custom flags' which should automatically detect the changed flags, and rebuild everything accordingly.

    • While this would be nice to have, there are alternatives, e.g. having multiple out-of-tree build directories, one for each configuration one wants (typically, I would have one "release" build dir with default settings, and a "debug" build dir, and perhaps also "release32" and "debug32" for 32-bit builds; with HPC-GAP merged, there might also be "hpcgap", "hpcgap-debug", ... )
    • If we want this, then the build system of git does something like this purely based on GNU make.
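The flag-tracking idea from the last two bullets can be sketched in shell. This is only an illustration of the general "stamp file" technique, not a copy of what git's Makefile does: the function name update_flags_stamp and the stamp file name used in the usage note are hypothetical. The stamp file is rewritten only when the flags differ from the recorded ones, so its timestamp changes exactly when the flags change.

```shell
#!/bin/sh
# update_flags_stamp FILE FLAGS...
# Record FLAGS in FILE, but touch FILE only if the recorded flags differ.
# Prints "changed" or "unchanged" so that callers can react.
# (Hypothetical helper; the name and output format are assumptions.)
update_flags_stamp() {
    stamp=$1
    shift
    new_flags=$*
    if [ -f "$stamp" ] && [ "$new_flags" = "$(cat "$stamp")" ]; then
        # Same flags as last time: leave the timestamp alone.
        echo "unchanged"
    else
        # First run or different flags: rewrite the stamp file.
        printf '%s\n' "$new_flags" > "$stamp"
        echo "changed"
    fi
}
```

A Makefile could call something like `update_flags_stamp build-flags.stamp $CFLAGS` from a rule that always runs, and list the stamp file as a prerequisite of every object file, so that a change of flags triggers a full rebuild while an unchanged invocation rebuilds nothing.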

Non-requirements

  • Replicating the current system of "build configurations" exactly

    • In the current system, one can build multiple GAP "configs" inside a single GAP source directory, for multiple architectures, in multiple configurations.
    • Instead, out-of-tree builds (the "industry standard" solution for this problem) will be used.
  • Guessing "perfect" build options, in particular optimizer settings

    • this is a doomed effort. Instead, make it as simple as possible (and in particular: as "standard" as possible) for users to customize these.
    • So there would be no replacement for the GP_CFLAGS macro from cnf/aclocal.m4
  • Support for multiple architectures "in the same directory". This might be annoying for people who use a single GAP source directory on different machines with different architectures, e.g. shared over a network drive (e.g. Laurent seems to be doing that). However, it seems difficult to achieve such a thing with standard tools. As a workaround, affected users (who I'd consider to be "power users") could use different out-of-tree builds for the various architectures, and then e.g. a shell script which invokes the right GAP build based on uname.
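The workaround from the last bullet could look roughly like the wrapper below. It is a sketch under assumptions: the layout BASE/<system>-<machine>/gap (one out-of-tree build directory per uname result, each containing a binary called gap) and the function names are made up for illustration.

```shell
#!/bin/sh
# gap_build_dir BASE
# Print the out-of-tree build directory for the current machine, assuming
# one directory per "system-machine" pair below BASE (hypothetical layout).
gap_build_dir() {
    printf '%s/%s-%s\n' "$1" "$(uname -s)" "$(uname -m)"
}

# run_gap BASE [ARGS...]
# Run the gap binary from the build directory matching this machine.
run_gap() {
    dir=$(gap_build_dir "$1")
    shift
    if [ ! -x "$dir/gap" ]; then
        echo "no GAP build found in $dir" >&2
        return 1
    fi
    "$dir/gap" "$@"
}
```

A user sharing one source tree over a network drive would then call e.g. `run_gap /path/to/builds -q` on every machine and get the build that matches the local architecture.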

Some off-the-shelf-tools to consider

GNU Autotools

This refers to the combination of the following three tools:

  • GNU autoconf - configuration detection
  • GNU automake - generates Makefiles, requires POSIX shell and m4
  • GNU libtool - portable generation of shared libraries and loadable modules

Pros:

  • widely available, most package managers provide these (not bundled on Mac OS X, yet easily installable by hand, or via Fink/Homebrew/MacPorts)
  • for users, the requirements are minimal: basically, only a Bourne shell is needed (possibly also GNU make, but it checks for that)
  • support a great variety of systems, including rather exotic ones
  • has been the "industry standard" for at least two decades

Cons:

  • widely disliked for being "too complicated" (however, this is often based on outdated facts, as e.g. documentation has improved considerably in recent years; see e.g. Autotools Mythbuster)
  • the build system input files (Makefile.am, configure.ac) may be relatively clear and concise, but they are then turned into big convoluted files (Makefile.in, configure) which are hard to understand and debug.

cmake

Pros:

  • Windows support
  • ... TODO

Cons:

  • requires users to install the 'cmake' tool
  • ... TODO

scons

Pros:

  • ... TODO

Cons:

  • ... TODO

Pros:

  • ... TODO

Cons:

  • ... TODO

Make plus autoconf

?

Make plus hand written shell script

?

TODO: take inspiration from other projects, e.g. git

TODO: also look at what I wrote for ScummVM and pentagram?

Others

We considered a great variety of other build system tools. The tools we considered were based on

However, a great number of them were immediately removed from further consideration because they were

  • apparently unmaintained,
  • had an unclear future (rule of thumb: don't expect a 1 year old project to be still maintained in 1 year),
  • were lacking in documentation,
  • were lacking in portability (and worryingly, often even a concern or understanding for it),
  • were too low-level,
  • conflicted with one of the must-have requirements
  • or a combination thereof.

However, if you think there is a build tool that really should be considered, please let me know.

Implementation strategy and requirements

Whatever tools we pick, the new build system should preserve various current features, and also add features that we are badly missing in the current system.

In the following subsections, I try to summarize these concerns. This will of course influence the final choice of tools, but also is a good reference list for whoever implements the build system, to make sure they are not forgetting something by accident.

Packages

Packages without compiled code

  • TODO: discuss how "plain" packages without anything compiled fit in. in particular, how does GAP locate them. Probably exactly the same way as now?

Packages with compiled code

The new GAP build system has implications for GAP packages which build kernel extensions (and, to a lesser degree, packages which build binaries).

  • Packages that provide a kernel extension include:
    1. Browse
    2. cvec
    3. datastructures
    4. digraphs
    5. edim
    6. ferret
    7. float
    8. Gauss
    9. io
    10. json
    11. linboxing
    12. NormalizInterface
    13. orb
    14. profiling
    15. zeromq
  • Packages that provide and use a standalone executable include:
    1. ace
    2. anupq
    3. grape
    4. guava
    5. homology
    6. kbmag
    7. nq
    8. xgap

Thus, we should think about ...

  1. ... providing a compatibility mode which somehow allows existing, unmodified, packages to be used with the new GAP build system;
  2. ... documenting a migration strategy for package authors to upgrade their package's build system to work natively with the new GAP build system, and even take advantage of its improvements.

Compatibility mode

Providing a compatibility mode would allow for an incremental migration to the new system. There are quite a few packages with kernel extensions, and I don't think it is realistic that they all switch to the new system at once. Moreover, without an incremental strategy, when exactly should package authors update their build system? It would have to be done after we switched to the new build system, but before there was an actual release. This would require an extraordinary amount of coordination for all involved.

Hence, I think we really should go for a compatibility mode.

One rough idea would be to use some cleverly created symlinks and text files inside an out-of-tree build directory to fake an "old style" GAP root directory.

  • in the following, $SRCDIR is the path to the GAP source directory, and $BUILDDIR the path to an out-of-tree build directory.
  • first generate a "unique" GAParch (e.g. take config.guess output, then append a random string / a UUID; or perhaps the hash of the build options). This way, one can still use a single package with multiple different GAP builds.
  • create a fake "outer" sysinfo.gap containing something like this:
GAParch=UNIQUE_GAPARCH
GAParch_system=UNIQUE_GAPARCH-t
GAParch_abi=64-bit

Perhaps we can also get rid of GAParch_abi (only linboxing and homology seem to use it, and both seem to need updates anyway in order to work, even with our old build system)

  • create a bin directory $BUILDDIR/bin/UNIQUE_GAPARCH
  • create a fake inner sysinfo.gap (?)
  • add symlinks for lib, src etc. inside $BUILDDIR which point to $SRCDIR/src, $SRCDIR/lib etc.
  • ... TODO: flesh this out.
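The steps listed so far could be sketched as follows. This is a rough illustration, not a worked-out design: the function name make_fake_gaproot is hypothetical, `uname -m` plus a `cksum` of the build options merely stands in for "config.guess output plus a hash", the sysinfo.gap values simply mirror the three placeholder lines above, and the still-open "inner" sysinfo.gap is left out.

```shell
#!/bin/sh
# make_fake_gaproot SRCDIR BUILDDIR [BUILD_OPTIONS...]
# Populate BUILDDIR so that it resembles an "old style" GAP root and print
# the generated GAParch. SRCDIR should be an absolute path, since it is
# used as a symlink target.
make_fake_gaproot() {
    srcdir=$1
    builddir=$2
    shift 2

    # Stand-in for "config.guess output plus a hash of the build options".
    opts_hash=$(printf '%s' "$*" | cksum | cut -d' ' -f1)
    gaparch="$(uname -m)-$opts_hash"

    mkdir -p "$builddir/bin/$gaparch"

    # Fake "outer" sysinfo.gap; the ABI is hard-coded as in the lines above.
    cat > "$builddir/sysinfo.gap" <<EOF
GAParch=$gaparch
GAParch_system=$gaparch-t
GAParch_abi=64-bit
EOF

    # Symlink the source-tree directories into the build directory.
    for d in lib src; do
        [ -e "$builddir/$d" ] || ln -s "$srcdir/$d" "$builddir/$d"
    done

    printf '%s\n' "$gaparch"
}
```

Because the hash depends on the build options, two builds with different options get different GAParch values, which is what allows a single package directory to hold binaries for several GAP builds.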

Package build system migration strategy

TODO: describe what new package build systems could look like

TODO: we need to write a document that explains to package authors how their build system should "hook" into GAP, and also provide a default / "example" package build system doing just that (or possibly two; say one based on a super-simple Makefile plus gac or gap-config; the other based on whatever build system we end up using for GAP itself)

TODO: Rough idea for convenient building of packages: they provide a build.sh script. The user invokes this script, with (optionally) the path of the GAPROOT to use. The build script could do something similar to this (example is based on autoconf)

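#!/bin/sh
# Sketch only: GAPROOT and PKGNAME are assumed to be set before this point.
set -e    # stop at the first failing command
SRCDIR=$(pwd)
BUILDDIR="$GAPROOT/pkg/$PKGNAME"
# TODO: perform test whether dir is valid and writeable
mkdir -p "$BUILDDIR"
cd "$BUILDDIR"
"$SRCDIR/configure"
make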
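
The check left as a TODO in the script above might look like the following. It is only a sketch: treating the presence of sysinfo.gap as the sign of a valid GAP root is an assumption, and the function name check_gaproot is hypothetical.

```shell
#!/bin/sh
# check_gaproot DIR
# Succeed if DIR looks usable as a GAP root for building a package:
# it must contain sysinfo.gap (assumed marker of a configured GAP root)
# and a writable pkg directory.
check_gaproot() {
    if [ ! -f "$1/sysinfo.gap" ]; then
        echo "error: $1 does not look like a GAP root (no sysinfo.gap)" >&2
        return 1
    fi
    if [ ! -d "$1/pkg" ] || [ ! -w "$1/pkg" ]; then
        echo "error: $1/pkg is missing or not writable" >&2
        return 1
    fi
}
```

A build.sh could call `check_gaproot "$GAPROOT" || exit 1` before creating the build directory.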

TODO: can we arrange things so that if you type LoadPackage("io"), but io has not yet been compiled, then GAP somehow figures out how to compile your package, does that for you, and then loads it (and of course it should do that for dependencies, too)? That would be a good step towards a package manager

TODO: random bits that need to be sorted into appropriate places later on

  • extend GAP to be more clever when it comes to finding its library: namely, try some strategies:

    • look for a pkg dir relative to the directory the GAP binary itself is in
      • in the current build system, the natural place is ../../pkg
      • in the new build system, it might be .
    • also hardcode a PATH into the binary
      • in the current build system, it might be the GAPROOT from which the build was started
      • in the new build system, it might be the GAPROOT derived from the prefix (e.g. something like /usr/share/gap; possibly multiple spots, e.g. /usr/share/gap/lib for the GAP library code, vs. /usr/lib/gap/ for kernel extensions)
  • support for installing GAP via (the analogue of) make install; this would install...

    • GAPROOT/bin/gap.sh OR GAPROOT/bin/ARCH/gap into PREFIX/bin
    • dirs not containing compiled stuff (lib, small, trans, perf, doc, ...) into e.g. PREFIX/share/gap
    • the actual binary and / or *.o files into PREFIX/lib/gap or PREFIX/libexec/gap
    • packages would probably still go into PREFIX/share/gap/pkg, but their binaries (if any) would go into a separate dir, say PREFIX/lib/gap/pkg (luckily, we already abstract away the way packages find their binaries, so that shouldn't be too hard)
    • headers could either go into PREFIX/include/gap (more "standard-y"), or into PREFIX/share/gap/src (makes it easier to provide backwards compatibility)
    • ... though if there is something like config.h, this needs special consideration (it cannot go into PREFIX/share/, as that is reserved for files which are architecture independent)
    • to support building packages which expect all of GAP to be in a single directory, there could be a directory which creates a fake "classic" GAPROOT via some symlinks
  • On Mac OS X, there is by default no GNU readline and no GMP. So do we keep bundling GMP, and perhaps also bundle readline? Or do we expect people to install these via Fink / MacPorts / Homebrew? If we do the latter, this might affect once again our choice of build tools...

  • support custom CFLAGS etc. being passed to Make ???

  • provide an alternative for "gac"

    • determine what that exactly means
    • do we want to keep supporting statically linked kernel extensions?
  • when building kernel extensions, one has two options:

    1. point the extension's build system at the GAP source folder: then this GAP path is hardcoded in the extension. This is what GAP developers would use
    2. point it at a system-installed GAP (say in /usr/local or /usr or /opt/GAP/ or ...): then of course that path is hardcoded
  • compile time switch between HPC-GAP and GAP

  • GAP package support:

    • it should be easy to write GAP packages with kernel extensions based on the new build system
    • e.g. keep providing gac or something like it
    • provide a gap-config script and/or a gap.pc file for pkg-config which enable package authors to easily detect the "active" GAP root etc. (this should be mostly interesting for systems where "make install" was used to install GAP globally)
  • collect some thoughts on a package manager here, too?

    • Simply ask every package to provide a special script, say build.sh, in its root directory, which we promise to call with several parameters (e.g. path to GAP installation, possibly also to the gap binary; the arch type; maybe more).
    • use ReadWeb() or SingleHTTPRequest() from IO to download the latest .tar.gz of a package, then unpack it locally, and (using the above) build the package. Finally, tell the user to restart GAP for the update to take effect.
    • Or write a small GAP package that uses libcurl and libarchive to download archives and extract them (even on Windows). Perhaps more cleanly, provide wrapper packages for libcurl and libarchive.
  • It would be super cool if we could support git bisect for revisions before the build system change. Not sure whether that is feasible, though...

    • perhaps the build system should detect if we switched to a commit with the old build system, and in that case, refrain from resetting itself (i.e. no calls to config.status, if we are using autoconf)
    • that would require extensive testing, of course
  • if we end up using config.guess output anywhere, I want a special case for Mac OS X: there, config.guess outputs something like x86_64-apple-darwin15.3.0 for Mac OS X 10.11.3 (the 15.3 corresponds to 11.3, but the mapping is not always that regular: e.g. OS X 10.8.5 is 12.6 and not 12.5). This is quite annoying because it means that e.g. OS X 10.11.2 and 10.11.3 are effectively treated as binary incompatible, which they are not. Solution: strip the trailing ".N.N" from the arch name.

  • if we use autoconf, let's reduce the tests that configure performs to the bare minimum. There is no point in e.g. checking for the presence of stdio.h
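
The Mac OS X special case for config.guess output described in the list above can be sketched with a small sed filter. The function name normalize_gaparch is hypothetical; the pattern only touches names that contain "-darwin" followed by a three-part version and passes everything else through unchanged.

```shell
#!/bin/sh
# normalize_gaparch ARCH
# Strip the trailing ".N.N" from Darwin architecture names as printed by
# config.guess; all other architecture names are printed unchanged.
normalize_gaparch() {
    printf '%s\n' "$1" |
        sed 's/\(-darwin[0-9][0-9]*\)\.[0-9][0-9]*\.[0-9][0-9]*$/\1/'
}
```

With this, `normalize_gaparch x86_64-apple-darwin15.3.0` prints x86_64-apple-darwin15, so builds made on OS X 10.11.2 and 10.11.3 would share one architecture name.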
