diff --git a/r-admin/Installing-R-under-Unix-alikes.html b/r-admin/Installing-R-under-Unix-alikes.html
index e6eaef4..566414b 100644
--- a/r-admin/Installing-R-under-Unix-alikes.html
+++ b/r-admin/Installing-R-under-Unix-alikes.html
@@ -622,8 +622,8 @@
There is support for using link-time optimization (LTO) if the toolchain supports it: configure with flag --enable-lto
. When LTO is enabled it is used for compiled code in add-on packages unless the flag --enable-lto=R
is used12.
12 Then recommended packages installed as part of the R installation do use LTO, but not packages installed later.
13 A complete CRAN installation reduced from 50 to 35GB.
The main benefit seen to date from LTO has been detecting long-standing bugs in the ways packages pass arguments to compiled code and between compilation units. Benchmarking in 2020 with gcc
/gfortran
10 showed gains of a few percent in increased performance and reduction in installed size for builds without debug symbols, but large size reductions for some packages13 with debug symbols. (Performance and size gains are said to be most often seen in complex C++ builds.)
There is support for using link-time optimization (LTO) if the toolchain supports it: configure with flag --enable-lto
. When LTO is enabled it is also used for compiled code in add-on packages unless the flag --enable-lto=R
is used12.
12 Then no add-on packages, including recommended packages, are installed with LTO.
13 A complete CRAN installation reduced from 50 to 35GB.
The main benefit seen to date from LTO has been detecting long-standing bugs in the ways packages pass arguments to compiled code and between compilation units. Benchmarking in 2020 with gcc
/gfortran
10 showed gains of a few percent in increased performance and reduction in installed size for builds without debug symbols, but large size reductions for some packages13 with debug symbols. (Performance and size gains are said to be most often seen in complex C++ builds.)
Whether toolchains support LTO is often unclear: all of the C compiler, the Fortran compiler14 and linker have to support it, and support it by the same mechanism (so mixing compiler families may not work and a non-default linker may be needed). It has been supported by the GCC and LLVM projects for some years with diverging implementations.
14 although there is the possibility to exclude Fortran but that misses some of the benefits.
LTO support was added in 2011 for GCC 4.5 on Linux but was little used before 2019: compiler support has steadily improved over those years and --enable-lto=R
is nowadays used for some routine CRAN checking.
Unfortunately --enable-lto
may be accepted but silently do nothing useful if some of the toolchain does not support LTO: this is less common than it once was.
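As a minimal sketch of the two configure variants described above, the following prints the command for a chosen scope rather than running it. The variable LTO_SCOPE and the helper lto_flag are hypothetical names used only for this illustration; the flags themselves are the ones named in the text.

```shell
#!/bin/sh
# Sketch: choose the LTO configure flag for an R build.
# LTO_SCOPE and lto_flag are hypothetical names for this example only:
#   all -> --enable-lto    (R itself and add-on packages)
#   R   -> --enable-lto=R  (R itself only)
lto_flag() {
    case "$1" in
        all) echo "--enable-lto" ;;
        R)   echo "--enable-lto=R" ;;
        *)   echo "unknown scope: $1" >&2; return 1 ;;
    esac
}

LTO_SCOPE="${LTO_SCOPE:-R}"
flag=$(lto_flag "$LTO_SCOPE") || exit 1
# Print the command instead of running it, so it can be inspected first.
echo "./configure $flag"
```

Since configure may accept the flag without LTO taking effect, it seems prudent to inspect the build output afterwards rather than rely on configure succeeding.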
-We suggest only using these if the problem is encountered (it had not been seen on CRAN with GCC 10–14 at the time of writing).
-Note that R may need to be re-compiled after even a minor update to the compiler (e.g. from 13.1 to 13.2).
+We suggest only using these if the problem is encountered (it had not been seen on CRAN with GCC 10–15 at the time of writing).
+Note that R will usually need to be re-compiled after even a minor update to the compiler (e.g. from 13.1 to 13.2).
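One way to notice such a compiler update is to record the compiler version when R is built and compare it later. The sketch below assumes a hand-maintained stamp file (the STAMP path and the helper versions_differ are hypothetical, not something R provides) and uses gcc -dumpfullversion, which is available in GCC 7 and later.

```shell
#!/bin/sh
# Sketch: detect a compiler update since R was last built.
# STAMP is a hypothetical file written by hand after each build of R;
# it is not something R itself creates.
STAMP="${STAMP:-$HOME/.r-build-gcc-version}"

# versions_differ RECORDED CURRENT: succeeds when the two strings differ.
versions_differ() {
    [ "$1" != "$2" ]
}

if command -v gcc >/dev/null 2>&1 && [ -f "$STAMP" ]; then
    recorded=$(cat "$STAMP")
    current=$(gcc -dumpfullversion)   # full version string, e.g. 13.2.0
    if versions_differ "$recorded" "$current"; then
        echo "gcc changed from $recorded to $current: consider re-compiling R"
    fi
fi
```

Under this assumption the stamp would be refreshed after each build with something like gcc -dumpfullversion > "$STAMP".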
macOS does not come with an installed Java runtime (JRE) and a macOS upgrade may remove one if already installed: it is intended to be installed at first use. Check if a JRE is installed by running java -version
in a Terminal
window: if Java is not installed on an Intel Mac this may prompt you to install it. We recommend you install a version with long-term support, e.g. 17 or 2121 but not 18–20, 22–24 with a 6-month lifetime.
21 The planned next LTS release is 25 in September 2025. Java 8 aka 1.8.0 is still LTS but some packages require 11 or later.
22 which website works with Safari but not some other browsers.
The currently simplest way to install Java is from Adoptium22: this installs into an Apple-standard location and so works with /usr/bin/java
. Other builds are available from https://www.azul.com/downloads/zulu-community/?os=macos&architecture=arm-64-bit&package=jdk and from OpenJDK at https://jdk.java.net/, for which JAVA_HOME
may need to be set both when configuring R and at runtime. Note that Java distribution sites may use unusual designations for macOS CPUs such as AArch64
, x64
or x86 64-bit
.
macOS does not come with an installed Java runtime (JRE) and a macOS upgrade may remove one if already installed: it is intended to be installed at first use. Check if a JRE is installed by running java -version
in a Terminal
window: if Java is not installed this may prompt you to install it from Oracle21 (but see the next paragraph). We recommend you install a version with long-term support, e.g. 17 or 2122 but not 18–20, 22–24 with a 6-month lifetime.
21 Oracle Java has a restrictive licence, unlike distributions based on OpenJDK.
22 The planned next LTS release is 25 in September 2025. Java 8 aka 1.8.0 is still LTS but some packages require 11 or later.
23 which website works with Safari but not some other browsers.
The currently simplest way to install Java is from Adoptium23: this installs into an Apple-standard location and so works with /usr/bin/java
. Other builds of OpenJDK are available from https://www.azul.com/downloads/zulu-community/?os=macos&architecture=arm-64-bit&package=jdk and from OpenJDK at https://jdk.java.net/, for which JAVA_HOME
may need to be set both when configuring R and at runtime. Note that Java distribution sites may use unusual designations for macOS CPUs such as AArch64
, x64
or x86 64-bit
.
Binary distributions of R are built against a specific version (e.g. 11.0.18 or 17.0.1) of Java so
sudo R CMD javareconf
will likely need to be run before using packages that use Java.
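The two commands mentioned in this section can be put together as a short sketch. The DRY_RUN switch and the run helper are hypothetical additions for this example: by default the commands are only printed, since java -version may prompt for an installation and javareconf changes R's configuration.

```shell
#!/bin/sh
# Sketch: check for a Java runtime, then refresh R's Java settings.
# DRY_RUN and run are hypothetical names for this example only;
# DRY_RUN defaults to 1, so commands are printed rather than executed.
run() {
    if [ "${DRY_RUN:-1}" = 1 ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Report the installed Java version, if any.
run java -version
# Reconfigure R for the Java installation that was found.
run sudo R CMD javareconf
```

Setting DRY_RUN=0 in the environment would execute the commands instead of printing them.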
diff --git a/r-admin/search.json b/r-admin/search.json index 0abf5a4..79ffdee 100644 --- a/r-admin/search.json +++ b/r-admin/search.json @@ -114,7 +114,7 @@ "href": "Installing-R-under-Unix-alikes.html#other-options", "title": "2 Installing R under Unix-alikes", "section": "2.7 Other Options", - "text": "2.7 Other Options\nThere are many other installation options, most of which are listed by configure --help. Almost all of those not listed elsewhere in this manual are either standard autoconf options not relevant to R or intended for specialist uses by the R developers.\nOne that may be useful when working on R itself is the option --disable-byte-compiled-packages, which ensures that the base and recommended packages are not byte-compiled. (Alternatively the (make or environment) variable R_NO_BASE_COMPILE can be set to a non-empty value for the duration of the build.)\nOption --with-internal-tzcode makes use of R’s own code and copy of the IANA database for managing timezones. This will be preferred where there are issues with the system implementation, usually involving times after 2037 or before 1916. An alternative time-zone directory7 can be used, pointed to by environment variable TZDIR: this should contain files such as Europe/London. On all tested OSes the system timezone was deduced correctly, but if necessary it can be set as the value of environment variable TZ.\n7 How to prepare such a directory is described in file src/extra/tzone/Notes in the R sources.8 But on Windows problems have been seen with case-changing functions on accented Latin-1 characters.Options --with-internal-iswxxxxx, --with-internal-towlower and --with-internal-wcwidth control the replacement of the system wide-character classification (such as iswprint), case-changing (wctrans) and width (wcwidth and wcswidth) functions by ones contained in the R sources. 
Replacement of the classification functions has been done for many years on macOS and AIX (and Windows): option --with-internal-iswxxxxx allows this to be suppressed on those platforms or used on others. Replacing the case-changing functions is the default on macOS and Windows. Replacement of the width functions has also been done for many years and remains the default. These options will only matter to those working with non-ASCII character data, especially in languages written in a non-Western script8 (which includes ‘symbols’ such as emoji). Note that one of those iswxxxxx is iswprint which is used to decide whether to output a character as a glyph or as a \\U{xxxxxx} escape—for example, try \"\\U1f600\", an emoji. The width functions are of most importance in East Asian locale: their values differ between such locales. (Replacing the system functions provides a degree of platform-independence (including to OS updates) but replaces it with a dependence on the R version.)\n\n2.7.1 Debugging Symbols\nBy default, configure adds a flag (usually -g) to the compilation flags for C, Fortran and C++ sources. This will slow down compilation and increase object sizes of both R and packages, so it may be a good idea to change those flags (set CFLAGS etc in config.site before configuring, or edit files Makeconf and etc/Makeconf between running configure and make).\nHaving debugging symbols available is useful both when running R under a debugger (e.g., R -d gdb) and when using sanitizers and valgrind, all things intended for experts.\nDebugging symbols (and some others) can be ‘stripped’ on installation by using\nmake install-strip\nHow well this is supported depends on the platform: it works best on those using GNU binutils. On x86_64 Linux a typical reduction in overall size was from 92MB to 66MB. 
On macOS debugging symbols are not by default included in .dylib and .so files, so there is negligible difference.\n\n\n2.7.2 OpenMP Support\nBy default configure searches for suitable flags9 for OpenMP support for the C, C++ (default standard) and Fortran compilers.\n9 for example, -fopenmp, -fiopenmp, -xopenmp or -qopenmp. This includes for clang and the Intel and Oracle compilers.Only the C result is currently used for R itself, and only if MAIN_LD/DYLIB_LD were not specified. This can be overridden by specifying\nR_OPENMP_CFLAGS\nUse for packages has similar restrictions (involving SHLIB_LD and similar: note that as Fortran code is by default linked by the C (or C++) compiler, both need to support OpenMP) and can be overridden by specifying some of\nSHLIB_OPENMP_CFLAGS\nSHLIB_OPENMP_CXXFLAGS\nSHLIB_OPENMP_FFLAGS\nSetting these to an empty value will disable OpenMP for that compiler (and configuring with --disable-openmp will disable all detection10 of OpenMP). The configure detection test is to compile and link a standalone OpenMP program, which is not the same as compiling a shared object and loading it into the C program of R’s executable. Note that overridden values are not tested.\n10 This does not necessarily disable use of OpenMP – the configure code allows for platforms where OpenMP is used without a flag. For the flang compiler in late 2017, the Fortran runtime always used OpenMP.\n\n2.7.3 C++ Support\nC++ is not used by R itself, but support is provided for installing packages with C++ code via make macros defined in file etc/Makeconf (and with explanations in file config.site):\nCXX\nCXXFLAGS\nCXXPICFLAGS\nCXXSTD\n\nCXX11\nCXX11STD\nCXX11FLAGS\nCXX11PICFLAGS\n\nCXX14\nCXX14STD\nCXX14FLAGS\nCXX14PICFLAGS\n\nCXX17\nCXX17STD\nCXX17FLAGS\nCXX17PICFLAGS\n\nCXX20\nCXX20STD\nCXX20FLAGS\nCXX20PICFLAGS\n\nCXX23\nCXX23STD\nCXX23FLAGS\nCXX23PICFLAGS\nThe macros CXX etc are those used by default for C++ code. 
configure will attempt to set the rest suitably, choosing for CXXSTD and CXX11STD a suitable flag such as -std=gnu++17 for C++17 support (which is required if C++ is to be supported by default). Inferred values can be overridden in file config.site or on the configure command line: user-supplied values will be tested by compiling some C++11/14/17/20/23 code.\nIt may be that there is no suitable flag for C++14/17/20/23 support with the default compiler, in which case a different compiler could be selected for CXX14/CXX17/CXX20/CXX23 with its corresponding flags.\nIf no suitable compiler/flag is found for the default C++ compiler, one can be set in file config.site via macros CXX and CXXSTD. A user-specified compiler does not need to pass the C++17 tests, so do this at your own risk as some packages may not compile.\nThe -std flag is supported by the GCC, clang++ and Intel compilers. Currently accepted values are (plus some synonyms)\ng++: c++11 gnu+11 c++14 gnu++14 c++17 gnu++17 c++2a gnu++2a (from 8)\n c++20 gnu++20 (from 10) c++23 gnu++23 c++2b gnu++2b (from 11)\nIntel: c++11 gnu+11 c++14 gnu++14 c++17 gnu++17\n c++20 gnu++20 (from 2021.1) c++2b gnu++2b (from 2022.2)\n c++23 gnu++23 (at least from 2024.0)\n(Those for LLVM clang++ are documented at https://clang.llvm.org/cxx_status.html, and follow g++: -std=c++20 is supported from Clang 10, -std=c++2b from Clang 13 and -std=c++23 from Clang 17. Apple Clang supports -std=c++2b from 13.1.6 and -std=c++23 from 16.0.0.)\n‘Standards’ for g++ starting with gnu enable ‘GNU extensions’: what those are is hard to track down.\nFor the use of C++ in R packages, see Writing R Extensions. Prior to R 3.6.0 the default C++ standard was that of the compiler used: currently it is C++17.\nhttps://en.cppreference.com/w/cpp/compiler_support indicates which versions of common compilers support (parts of) which C++ standards. 
GCC introduced C++17 support gradually, but version 7 should suffice.\n\n\n2.7.4 C standards\nCompiling R requires some POSIX features (such as strdup11 and the ssize_t type) not in the C standards. Typically compilers make these available, but not if strict C compliance is specified by for example -std=c17. So if you want to specify a non-default standard use something like -std=gnu23.\n11 this is part of C23, but part of the C library not the compiler.Compiling R requires C99 or later: C11 and C17 are minor updates, but the substantial update ‘C23’ (finally published in October 2024) is also supported for current versions of GCC and clang.\nAs from R 4.3.0 there is support for packages to indicate their preferred C version. Macros CC17, C17FLAGS, CC23 and C23FLAGS can be set in config.site (there are examples there). Those for C17 should support C17 or earlier and not allow C23 additions so for example bool, true and false can be used as identifiers. Those for C23 should support the new types such as bool.\nSome compilers warn enthusiastically about prototypes. For most, omitting -Wstrict-prototypes in C17FLAGS suffices. However, versions 15 and later of LLVM clang and 14.0.3 and later of Apple clang warn by default in all modes if -Wall or -pedantic is used, and may need -Wno-strict-prototypes.\n\n\n2.7.5 Link-Time Optimization\nThere is support for using link-time optimization (LTO) if the toolchain supports it: configure with flag --enable-lto. When LTO is enabled it is used for compiled code in add-on packages unless the flag --enable-lto=R is used12.\n12 Then recommended packages installed as part of the R installation do use LTO, but not packages installed later.13 A complete CRAN installation reduced from 50 to 35GB.The main benefit seen to date from LTO has been detecting long-standing bugs in the ways packages pass arguments to compiled code and between compilation units. 
Benchmarking in 2020 with gcc/gfortran 10 showed gains of a few percent in increased performance and reduction in installed size for builds without debug symbols, but large size reductions for some packages13 with debug symbols. (Performance and size gains are said to be most often seen in complex C++ builds.)\nWhether toolchains support LTO is often unclear: all of the C compiler, the Fortran compiler14 and linker have to support it, and support it by the same mechanism (so mixing compiler families may not work and a non-default linker may be needed). It has been supported by the GCC and LLVM projects for some years with diverging implementations.\n14 although there is the possibility to exclude Fortran but that misses some of the benefits.LTO support was added in 2011 for GCC 4.5 on Linux but was little used before 2019: compiler support has steadily improved over those years and --enable-lto=R is nowadays used for some routine CRAN checking.\nUnfortunately --enable-lto may be accepted but silently do nothing useful if some of the toolchain does not support LTO: this is less common than it once was.\nVarious macros can be set in file config.site to customize how LTO is used. If the Fortran compiler is not of the same family as the C/C++ compilers, set macro LTO_FC (probably to empty). Macro LTO_LD can be used to select an alternative linker should that be needed.\n\n\n2.7.6 LTO with GCC\nThis has been tested on Linux with gcc/gfortran 8 and later: that needed setting (e.g. in config.site)\nAR=gcc-ar\nRANLIB=gcc-ranlib\nFor non-system compilers or if those wrappers have not been installed one may need something like\nAR=\"ar --plugin=/path/to/liblto_plugin.so\"\nRANLIB=\"ranlib --plugin=/path/to/liblto_plugin.so\"\nand NM may be needed to be set analogously. 
(If using an LTO-enabled build to check packages, set environment variable UserNM15 to gcc-nm.)\n15 not NM as we found make overriding that.With GCC 5 and later it is possible to parallelize parts of the LTO linking process: set the make macro LTO to something like LTO=-flto=8 (to use 8 threads), for example in file config.site.\nUnder some circumstances and for a few packages, the PIC flags have needed overriding on Linux with GCC 9: e.g use in config.site:\nCPICFLAGS=-fPIC\nCXXPICFLAGS=-fPIC\nCXX11PICFLAGS=-fPIC\nCXX14PICFLAGS=-fPIC\nCXX17PICFLAGS=-fPIC\nCXX20PICFLAGS=-fPIC\nFPICFLAGS=-fPIC\nWe suggest only using these if the problem is encountered (it had not been seen on CRAN with GCC 10–14 at the time of writing).\nNote that R may need to be re-compiled after even a minor update to the compiler (e.g. from 13.1 to 13.2).\n\n\n2.7.7 LTO with LLVM\nLLVM supports another type of LTO called ‘Thin LTO’ as well as a similar implementation to GCC, sometimes called ‘Full LTO’. (See https://clang.llvm.org/docs/ThinLTO.html.) Currently the LLVM compilers relevant to R are clang and flang-new for which this can be selected by setting macro LTO=-flto=thin. LLVM has\nAR=llvm-ar\nRANLIB=llvm-ranlib\n(but macOS does not, and these are not needed there). Where the linker supports a parallel backend for Thin LTO this can be specified via the macro LTO_LD: see the URL above for per-linker settings and further linking optimizations.)\nFor example, on macOS one might use\nLTO=-flto=thin\nLTO_FC=\nLTO_LD=-Wl,-mllvm,-threads=4\nto use Thin LTO with 4 threads for C/C++ code, but skip LTO for Fortran code compiled with gfortran.\nIt is said to be particularly beneficial to use -O3 for clang in conjunction with LTO.\nIt seems that flang-new may in future support LTO.\nThe 2020s versions of Intel’s C/C++ compilers are based on LLVM and as such support LLVM-style LTO, both ‘full’ and ‘thin’. 
This might use something like\nLTO=-flto=thin -flto-jobs=8\n\n\n2.7.8 LTO for package checking\nLTO effectively compiles all the source code in a package as a single compilation unit and so allows the compiler (with sufficient diagnostic flags such as -Wall) to check consistency between what are normally separate compilation units.\nWith gcc/gfortran 9.x and later16 LTO will flag inconsistencies in calls to Fortran subroutines/functions, both between Fortran source files and between Fortran and C/C++. gfortran 8.4, 9.2 and later can help understanding these by extracting C prototypes from Fortran source files with option -fc-prototypes-external, e.g. that (at the time of writing) Fortran LOGICAL corresponds to int_least32_t * in C.\n16 probably also 8.4 and later.", + "text": "2.7 Other Options\nThere are many other installation options, most of which are listed by configure --help. Almost all of those not listed elsewhere in this manual are either standard autoconf options not relevant to R or intended for specialist uses by the R developers.\nOne that may be useful when working on R itself is the option --disable-byte-compiled-packages, which ensures that the base and recommended packages are not byte-compiled. (Alternatively the (make or environment) variable R_NO_BASE_COMPILE can be set to a non-empty value for the duration of the build.)\nOption --with-internal-tzcode makes use of R’s own code and copy of the IANA database for managing timezones. This will be preferred where there are issues with the system implementation, usually involving times after 2037 or before 1916. An alternative time-zone directory7 can be used, pointed to by environment variable TZDIR: this should contain files such as Europe/London. 
On all tested OSes the system timezone was deduced correctly, but if necessary it can be set as the value of environment variable TZ.\n7 How to prepare such a directory is described in file src/extra/tzone/Notes in the R sources.8 But on Windows problems have been seen with case-changing functions on accented Latin-1 characters.Options --with-internal-iswxxxxx, --with-internal-towlower and --with-internal-wcwidth control the replacement of the system wide-character classification (such as iswprint), case-changing (wctrans) and width (wcwidth and wcswidth) functions by ones contained in the R sources. Replacement of the classification functions has been done for many years on macOS and AIX (and Windows): option --with-internal-iswxxxxx allows this to be suppressed on those platforms or used on others. Replacing the case-changing functions is the default on macOS and Windows. Replacement of the width functions has also been done for many years and remains the default. These options will only matter to those working with non-ASCII character data, especially in languages written in a non-Western script8 (which includes ‘symbols’ such as emoji). Note that one of those iswxxxxx is iswprint which is used to decide whether to output a character as a glyph or as a \\U{xxxxxx} escape—for example, try \"\\U1f600\", an emoji. The width functions are of most importance in East Asian locale: their values differ between such locales. (Replacing the system functions provides a degree of platform-independence (including to OS updates) but replaces it with a dependence on the R version.)\n\n2.7.1 Debugging Symbols\nBy default, configure adds a flag (usually -g) to the compilation flags for C, Fortran and C++ sources. 
This will slow down compilation and increase object sizes of both R and packages, so it may be a good idea to change those flags (set CFLAGS etc in config.site before configuring, or edit files Makeconf and etc/Makeconf between running configure and make).\nHaving debugging symbols available is useful both when running R under a debugger (e.g., R -d gdb) and when using sanitizers and valgrind, all things intended for experts.\nDebugging symbols (and some others) can be ‘stripped’ on installation by using\nmake install-strip\nHow well this is supported depends on the platform: it works best on those using GNU binutils. On x86_64 Linux a typical reduction in overall size was from 92MB to 66MB. On macOS debugging symbols are not by default included in .dylib and .so files, so there is negligible difference.\n\n\n2.7.2 OpenMP Support\nBy default configure searches for suitable flags9 for OpenMP support for the C, C++ (default standard) and Fortran compilers.\n9 for example, -fopenmp, -fiopenmp, -xopenmp or -qopenmp. This includes for clang and the Intel and Oracle compilers.Only the C result is currently used for R itself, and only if MAIN_LD/DYLIB_LD were not specified. This can be overridden by specifying\nR_OPENMP_CFLAGS\nUse for packages has similar restrictions (involving SHLIB_LD and similar: note that as Fortran code is by default linked by the C (or C++) compiler, both need to support OpenMP) and can be overridden by specifying some of\nSHLIB_OPENMP_CFLAGS\nSHLIB_OPENMP_CXXFLAGS\nSHLIB_OPENMP_FFLAGS\nSetting these to an empty value will disable OpenMP for that compiler (and configuring with --disable-openmp will disable all detection10 of OpenMP). The configure detection test is to compile and link a standalone OpenMP program, which is not the same as compiling a shared object and loading it into the C program of R’s executable. 
Note that overridden values are not tested.\n10 This does not necessarily disable use of OpenMP – the configure code allows for platforms where OpenMP is used without a flag. For the flang compiler in late 2017, the Fortran runtime always used OpenMP.\n\n2.7.3 C++ Support\nC++ is not used by R itself, but support is provided for installing packages with C++ code via make macros defined in file etc/Makeconf (and with explanations in file config.site):\nCXX\nCXXFLAGS\nCXXPICFLAGS\nCXXSTD\n\nCXX11\nCXX11STD\nCXX11FLAGS\nCXX11PICFLAGS\n\nCXX14\nCXX14STD\nCXX14FLAGS\nCXX14PICFLAGS\n\nCXX17\nCXX17STD\nCXX17FLAGS\nCXX17PICFLAGS\n\nCXX20\nCXX20STD\nCXX20FLAGS\nCXX20PICFLAGS\n\nCXX23\nCXX23STD\nCXX23FLAGS\nCXX23PICFLAGS\nThe macros CXX etc are those used by default for C++ code. configure will attempt to set the rest suitably, choosing for CXXSTD and CXX11STD a suitable flag such as -std=gnu++17 for C++17 support (which is required if C++ is to be supported by default). Inferred values can be overridden in file config.site or on the configure command line: user-supplied values will be tested by compiling some C++11/14/17/20/23 code.\nIt may be that there is no suitable flag for C++14/17/20/23 support with the default compiler, in which case a different compiler could be selected for CXX14/CXX17/CXX20/CXX23 with its corresponding flags.\nIf no suitable compiler/flag is found for the default C++ compiler, one can be set in file config.site via macros CXX and CXXSTD. A user-specified compiler does not need to pass the C++17 tests, so do this at your own risk as some packages may not compile.\nThe -std flag is supported by the GCC, clang++ and Intel compilers. 
Currently accepted values are (plus some synonyms)\ng++: c++11 gnu+11 c++14 gnu++14 c++17 gnu++17 c++2a gnu++2a (from 8)\n c++20 gnu++20 (from 10) c++23 gnu++23 c++2b gnu++2b (from 11)\nIntel: c++11 gnu+11 c++14 gnu++14 c++17 gnu++17\n c++20 gnu++20 (from 2021.1) c++2b gnu++2b (from 2022.2)\n c++23 gnu++23 (at least from 2024.0)\n(Those for LLVM clang++ are documented at https://clang.llvm.org/cxx_status.html, and follow g++: -std=c++20 is supported from Clang 10, -std=c++2b from Clang 13 and -std=c++23 from Clang 17. Apple Clang supports -std=c++2b from 13.1.6 and -std=c++23 from 16.0.0.)\n‘Standards’ for g++ starting with gnu enable ‘GNU extensions’: what those are is hard to track down.\nFor the use of C++ in R packages, see Writing R Extensions. Prior to R 3.6.0 the default C++ standard was that of the compiler used: currently it is C++17.\nhttps://en.cppreference.com/w/cpp/compiler_support indicates which versions of common compilers support (parts of) which C++ standards. GCC introduced C++17 support gradually, but version 7 should suffice.\n\n\n2.7.4 C standards\nCompiling R requires some POSIX features (such as strdup11 and the ssize_t type) not in the C standards. Typically compilers make these available, but not if strict C compliance is specified by for example -std=c17. So if you want to specify a non-default standard use something like -std=gnu23.\n11 this is part of C23, but part of the C library not the compiler.Compiling R requires C99 or later: C11 and C17 are minor updates, but the substantial update ‘C23’ (finally published in October 2024) is also supported for current versions of GCC and clang.\nAs from R 4.3.0 there is support for packages to indicate their preferred C version. Macros CC17, C17FLAGS, CC23 and C23FLAGS can be set in config.site (there are examples there). Those for C17 should support C17 or earlier and not allow C23 additions so for example bool, true and false can be used as identifiers. 
Those for C23 should support the new types such as bool.\nSome compilers warn enthusiastically about prototypes. For most, omitting -Wstrict-prototypes in C17FLAGS suffices. However, versions 15 and later of LLVM clang and 14.0.3 and later of Apple clang warn by default in all modes if -Wall or -pedantic is used, and may need -Wno-strict-prototypes.\n\n\n2.7.5 Link-Time Optimization\nThere is support for using link-time optimization (LTO) if the toolchain supports it: configure with flag --enable-lto. When LTO is enabled it is also used for compiled code in add-on packages unless the flag --enable-lto=R is used12.\n12 Then all add-on packages, including recommended packages are not installed with LTO.13 A complete CRAN installation reduced from 50 to 35GB.The main benefit seen to date from LTO has been detecting long-standing bugs in the ways packages pass arguments to compiled code and between compilation units. Benchmarking in 2020 with gcc/gfortran 10 showed gains of a few percent in increased performance and reduction in installed size for builds without debug symbols, but large size reductions for some packages13 with debug symbols. (Performance and size gains are said to be most often seen in complex C++ builds.)\nWhether toolchains support LTO is often unclear: all of the C compiler, the Fortran compiler14 and linker have to support it, and support it by the same mechanism (so mixing compiler families may not work and a non-default linker may be needed). 
It has been supported by the GCC and LLVM projects for some years with diverging implementations.\n14 although there is the possibility to exclude Fortran but that misses some of the benefits.LTO support was added in 2011 for GCC 4.5 on Linux but was little used before 2019: compiler support has steadily improved over those years and --enable-lto=R is nowadays used for some routine CRAN checking.\nUnfortunately --enable-lto may be accepted but silently do nothing useful if some of the toolchain does not support LTO: this is less common than it once was.\nVarious macros can be set in file config.site to customize how LTO is used. If the Fortran compiler is not of the same family as the C/C++ compilers, set macro LTO_FC (probably to empty). Macro LTO_LD can be used to select an alternative linker should that be needed.\n\n\n2.7.6 LTO with GCC\nThis has been tested on Linux with gcc/gfortran 8 and later: that needed setting (e.g. in config.site)\nAR=gcc-ar\nRANLIB=gcc-ranlib\nFor non-system compilers or if those wrappers have not been installed one may need something like\nAR=\"ar --plugin=/path/to/liblto_plugin.so\"\nRANLIB=\"ranlib --plugin=/path/to/liblto_plugin.so\"\nand NM may be needed to be set analogously. 
(If using an LTO-enabled build to check packages, set environment variable UserNM15 to gcc-nm.)\n15 not NM as we found make overriding that.With GCC 5 and later it is possible to parallelize parts of the LTO linking process: set the make macro LTO to something like LTO=-flto=8 (to use 8 threads), for example in file config.site.\nUnder some circumstances and for a few packages, the PIC flags have needed overriding on Linux with GCC 9: e.g use in config.site:\nCPICFLAGS=-fPIC\nCXXPICFLAGS=-fPIC\nCXX11PICFLAGS=-fPIC\nCXX14PICFLAGS=-fPIC\nCXX17PICFLAGS=-fPIC\nCXX20PICFLAGS=-fPIC\nFPICFLAGS=-fPIC\nWe suggest only using these if the problem is encountered (it had not been seen on CRAN with GCC 10–15 at the time of writing).\nNote that R will usually need to be re-compiled after even a minor update to the compiler (e.g. from 13.1 to 13.2).\n\n\n2.7.7 LTO with LLVM\nLLVM supports another type of LTO called ‘Thin LTO’ as well as a similar implementation to GCC, sometimes called ‘Full LTO’. (See https://clang.llvm.org/docs/ThinLTO.html.) Currently the LLVM compilers relevant to R are clang and flang-new for which this can be selected by setting macro LTO=-flto=thin. LLVM has\nAR=llvm-ar\nRANLIB=llvm-ranlib\n(but macOS does not, and these are not needed there). Where the linker supports a parallel backend for Thin LTO this can be specified via the macro LTO_LD: see the URL above for per-linker settings and further linking optimizations.)\nFor example, on macOS one might use\nLTO=-flto=thin\nLTO_FC=\nLTO_LD=-Wl,-mllvm,-threads=4\nto use Thin LTO with 4 threads for C/C++ code, but skip LTO for Fortran code compiled with gfortran.\nIt is said to be particularly beneficial to use -O3 for clang in conjunction with LTO.\nIt seems that flang-new may in future support LTO.\nThe 2020s versions of Intel’s C/C++ compilers are based on LLVM and as such support LLVM-style LTO, both ‘full’ and ‘thin’. 
This might use something like\nLTO=-flto=thin -flto-jobs=8\n\n\n2.7.8 LTO for package checking\nLTO effectively compiles all the source code in a package as a single compilation unit and so allows the compiler (with sufficient diagnostic flags such as -Wall) to check consistency between what are normally separate compilation units.\nWith gcc/gfortran 9.x and later16 LTO will flag inconsistencies in calls to Fortran subroutines/functions, both between Fortran source files and between Fortran and C/C++. gfortran 8.4, 9.2 and later can help understanding these by extracting C prototypes from Fortran source files with option -fc-prototypes-external, e.g. that (at the time of writing) Fortran LOGICAL corresponds to int_least32_t * in C.\n16 probably also 8.4 and later.", "crumbs": [ "2 Installing R under Unix-alikes" ] @@ -580,7 +580,7 @@ "href": "Platform-notes.html#macos", "title": "Appendix C — Platform notes", "section": "C.3 macOS", - "text": "C.3 macOS\nThe instructions here are for ‘Apple Silicon’ (arm64) or Intel 64-bit (x86_64) builds on macOS 11 (Big Sur), 12 (Monterey), 13 (Ventura), 14 (Sonoma) and likely later. (They may well work on Intel macOS 10.14 or 10.15, but are untested there.)\n\nC.3.1 Prerequisites\nThe Apple silicon components install into /opt/R/arm64, the Intel ones into /opt/R/x86_64. That may not exist6 so it is simplest to first create the directory and adjust its ownership if desired: for example by\n6 it will if R has been installed from CRAN since R 4.3.0.sudo mkdir -p /opt/R/arm64\nsudo chown -R $USER /opt/R\nAlso, add /opt/R/arm64/bin or /opt/R/x86_64/bin to your path.\nDefine an appropriate variable in your Terminal:\nset LOCAL=/opt/R/arm64 # Apple Silicon\nset LOCAL=/opt/R/x86_64 # Intel\nto use the code snippets here.\nThe following are essential to build R:\n\nApple’s ‘Command Line Tools’: these can be (re-)installed by running xcode-select --install in a terminal.\nIf you have a fresh OS installation, running e.g. 
make in a terminal will offer the installation of the command-line tools. If you have installed Xcode, this provides the command-line tools. The tools may need to be reinstalled when macOS is upgraded, as upgrading may partially or completely remove them.\nThe Command Line Tools provide C and C++ compilers derived from LLVM’s clang but nowadays known as ‘Apple clang’ with different versioning (so Apple clang 15 is unrelated to LLVM clang 15).\nA Fortran compiler. See Fortran compiler.\nBinary components pcre27 and xz (for liblzma) from https://mac.r-project.org/bin/. There is an R script there to help with installing all the needed components. (At the time of writing install.libs(\"r-base-dev\") installed neither readline5 nor those needed to support Pango.)\nIntel users want the darwin20 components: the darwin17 ones are for macOS 10.13–10.15.\nOr this can be done manually, for example by\ncurl -OL https://mac.r-project.org/bin/darwin20/arm64/pcre2-10.42-darwin.20-arm64.tar.xz\nsudo tar -xvzf pcre2-10.42-darwin.20-arm64.tar.xz -C /\ncurl -OL https://mac.r-project.org/bin/darwin20/arm64/xz-5.4.2-darwin.20-arm64.tar.xz\nsudo tar -xvzf xz-5.4.2-darwin.20-arm64.tar.xz -C /\n(sudo is not needed if your account owns /opt/R/arm64 or /opt/R/x86_64 as appropriate.)\nMessages like opt/R/: Can't restore time should be ignored.\n\n7 If compiling it from source on arm64, pcre2 (at least up to version 10.39) needs to be built without JIT support (the default) as the R build segfaults if that is enabled, so do run make check on your build.and desirable\n\nComponent readline5.8 If readline is not present, the emulation in Apple’s version of libedit (aka editline) will be used: if you wish to avoid that, configure with --without-readline.\nComponents jpeg, libpng, pkgconfig, tiff and zlib-system-stub from https://mac.r-project.org/bin// for the full range of bitmapped graphics devices. 
(Some builds of tiff may require libwebp and/or openjpeg.)\nAn X sub-system unless configuring using --without-x: see https://www.xquartz.org/. R’s configure script can be told to look for X11 in XQuartz’s main location of /opt/X11, e.g. by\n--x-includes=/opt/X11/include --x-libraries=/opt/X11/lib\nBe wary of pre-release versions of XQuartz, which may be offered as an update.\nAn Objective-C compiler, as provided by clang in the Command Line Tools: this is needed for the quartz() graphics device.\nUse --without-aqua if you want a standard Unix-alike build: apart from disabling quartz() and the ability to use the build with R.APP, it also changes the default location of the personal library (see ?.libPaths).\nA Tcl/Tk installation, See Tcl/Tk headers and libraries.\nSupport for Cairo-based graphics devices. See Cairo graphics.\nA TeX installation. See Other libraries.\ntexi2any from a Texinfo distribution, which requires perl (currently a default part of macOS but it has been announced that it may not be in future). A version of texi2any has been included in the binary distribution of R and there is a texinfo component at https://mac.r-project.org/bin/.\n\n8 For licence reasons this is version 5.2 of readline: for those who want a more recent version it is straightforward to compile it from its sources.To build R itself from the sources with the C/C++ compilers in the Command Line Tools (or Xcode) and gfortran from the installer mentioned below, use a file config.site containing\nCC=clang\nOBJC=$CC\nFC=\"/opt/gfortran/bin/gfortran -mtune=native\"\nCPPFLAGS='-isystem $LOCAL/include'\nCXX=clang++\nand configure by something like\n./configure -C \\\n--enable-R-shlib --enable-memory-profiling \\\n--x-includes=/opt/X11/include --x-libraries=/opt/X11/lib \\\n--with-tcl-config=$LOCAL/lib/tclConfig.sh \\\n--with-tk-config=$LOCAL/lib/tkConfig.sh \\\nPKG_CONFIG_PATH=$LOCAL/lib/pkgconfig:/usr/lib/pkgconfig\n(See below for other options for Tcl/Tk.) 
For an arm64 build further flags are desirable in config.site:\nCFLAGS=\"-falign-functions=8 -g -O2\"\nis needed to inter-work with gfortran without segfaulting in some packages. Some builds of gfortran have targetted the current version of macOS (unlike clang), causing linker warnings: to avoid these use\nFFLAGS=\"-g -O2 -mmacosx-version-min=11.0\"\nFCFLAGS=\"-g -O2 -mmacosx-version-min=11.0\"\nor perhaps\nFFLAGS=\"-g -O2 -mmacos-version-min=11.0\"\nFCFLAGS=\"-g -O2 -mmacos-version-min=11.0\"\nwhere 11.0 can be replaced by 12.0, 13.0 or 14.0 for macOS 12.x, 13.x and 14.x.\nTo install packages using compiled code one needs the Command Line Tools (or Xcode) and appropriate compilers, e.g. the C/C++ compilers from those tools and/or gfortran. Some packages have further requirements such as component pkgconfig (and to set PKG_CONFIG_PATH= as above).\nA subversion client can be obtained from https://mac.r-project.org/tools/, for example by (Apple Silicon)\ncurl -OL https://mac.r-project.org/tools/subversion-1.14.1-darwin.20-arm64.tar.gz\ntar xf subversion-1.14.1-darwin.20-arm64.tar.gz \nsudo cp subversion-1.14.1-darwin-20-arm64/svn $LOCAL/bin\nor (Intel)\ncurl -OL https://mac.r-project.org/tools/subversion-1.14.0-darwin15.6.tar.gz\ntar xf subversion-1.14.0-darwin15.6.tar.gz\nsudo cp subversion-1.14.0-darwin15.6/svn $LOCAL/bin\nIf building software or installing source packages with cmake (or a non-Apple make) for ‘Apple Silicon’ ensure it contains the arm64 architecture (use file to be sure). Running Apple compilers from an x86_64 executable will generate x86_64 code ….\nUpdating an arm64 build may fail because of the bug described at https://openradar.appspot.com/FB8914243 but ab initio builds work. 
This has been far rarer since macOS 13.\nIf you are using the macOS 13 SDK9, you may need to add something like -mmacos-version-min=12.0 to CFLAGS.\n9 ls -l `xcrun -show-sdk-path` in a terminal will show you which SDK is selected.Linker warnings like\nld: warning: could not create compact unwind for _sort_:\n register 26 saved somewhere other than in frame\nld: warning: ld: warning:\n could not create compact unwind for _arcoef_: registers 23 and 24 not saved contiguously in frame\nld: warning: could not create compact unwind for ___emutls_get_address:\n registers 23 and 24 not saved contiguously in frame\ncan be ignored. These stem from compiled Fortran code, including its run-time libraries.\nThe default security settings can make it difficult to install Apple packages which have not been ‘notarized’10 by Apple. And not just packages, as this has been seen for executables contained in tarballs/zipfiles (for example, for pandoc). Usually one can use Open With (Control/right/two-finger-click in Finder), then select Installer and Open if you get a further warning message.\n10 See https://developer.apple.com/documentation/xcode/notarizing_macos_software_before_distribution.If you run into problems with ‘quarantine’ for tarballs downloaded in a browser, consider using curl -OL to download (as illustrated above) or xattr -c to remove extended attributes.\nconfigure defaults to --with-internal-tzcode on macOS. The native implementation used to be unusable on earlier versions (with a 32-bit time_t and/or timezone tables missing information beyond the 32-bit range). As from macOS 12.6, option --without-internal-tzcode can be used to override this and R contains sufficient workarounds (for example, the native code fails to recognize dates with a negative tm_year, that is dates before 1900) for R to pass its checks. 
However, there are discrepancies, notably in Europe in the 1900s and 1940s, even though the Olson database contains the correct information.\n\n\nC.3.2 Fortran compiler\nThere is a ‘universal’ (arm64 and Intel) build of gfortran 12.2 at https://mac.r-project.org/tools/gfortran-12.2-universal.pkg. This installs into /opt/gfortran.\nThe /opt/gfortran/SDK symlink should point to the desired path to the SDK (defaults to the command line tools SDK). This can be updated by running /opt/gfortran/bin/gfortran-update-sdk or manually. If the symlink is broken, the driver will issue a warning and use xcrun -show-sdk-path to try to find an SDK and use its path. (The SDK path is used when using gfortran to link, so not when building R but when installing a few packages.)\nBuilds of gfortran 13.2 and 14.1 for arm64 macOS 14 are available at https://github.com/fxcoudert/gfortran-for-macOS/releases. These can be built for Intel and older OSes from the sources at https://github.com/iains/gcc-13-branch/ and gcc-14-branch/. To use one of the pre-built compilers with Apple clang needs something like\nLDFLAGS=\"-L/opt/R/arm64/lib -rpath /usr/local/gfortran/lib\"\nin config.site to ensure the Fortran run-time libraries are found.\n\n\nC.3.3 Cairo graphics\nCairo-based graphics devices such as cairo_ps, cairo_pdf, X11(type = \"cairo\") and the Cairo-based types of devices bmp jpeg, png and tiff are not the default on macOS, and much less used than the Quartz-based devices. However, the only SVG device in the R distribution, svg, is based on Cairo.\nSupport for Cairo is optional and can be added in several ways, all of which need pkg-config. configure will add Cairo support if pkg-config finds package cairo unless --without-cairo is used.\nA way to statically link Cairo is by downloading and unpacking components cairo, fontconfig, freetype, pixman and zlib-system-stub (and do not have /opt/X11/lib/pkgconfig in PKG_CONFIG_PATH). 
Some static builds of fontconfig need libxml2 (from component xml2) and others expat, supplied by macOS but needing a file $LOCAL/lib/pkgconfig/expat.pc along the lines of\nName: expat\nVersion: 2.2.8\nDescription: expat XML parser\nURL: http://www.libexpat.org\nLibs: -lexpat\nCflags: \nNote that the list of components is liable to change: running pkg-config cairo --exists --print-errors should tell you if any others are required.\nThe best font experience of Cairo graphics will be to use it in combination with Pango which will match that supported on most other Unix-alikes. configure uses pkg-config to determine if all the external software required by both Pango and Cairo is available: running pkg-config pangocairo --exists --print-errors should show if the installation suffices and if not, what is missing. At the time of writing using pre-built components cairo, fontconfig, freetype, ffi, fribidi, gettext, icu, glib, harfbuzz, pango, pcre, pixman and xml2 sufficed.\n\n\nC.3.4 Other C/C++ compilers\nOther pre-compiled distributions of clang may be available from https://github.com/llvm/llvm-project/releases/ (recently only for arm64 and usually unsigned/not notarized which makes them hard to use). In particular, these include support for OpenMP which Apple clang does not. Some of these have included support for the ASan and UBSan sanitizers.\nSuppose one of these distributions is installed under $LOCAL/llvm. Use a file config.site containing\nSDK=`xcrun -show-sdk-path`\nCC=\"$LOCAL/llvm/bin/clang -isysroot $SDK\"\nCXX=\"$LOCAL/llvm/bin/clang++ -isysroot $SDK\"\nOBJC=$CC\nFC=/opt/gfortran/bin/gfortran\nLDFLAGS=\"-L$LOCAL/llvm/lib -L$LOCAL/lib\"\nR_LD_LIBRARY_PATH=$LOCAL/llvm/lib:$LOCAL/lib\nThe care to specify library paths is to ensure that the OpenMP runtime library, here $LOCAL/llvm/lib/libomp.dylib, is found when needed. If this works, you should see the line\nchecking whether OpenMP SIMD reduction is supported... yes\nin the configure output. 
Also, R_LD_LIBRARY_PATH needs to be set to find the latest version of the C++ run-time libraries rather than the system ones.\nIt is normally possible to build R with GCC (built from the sources, from a gfortran distribution, from Homebrew, …). When last tested it was not possible to use gcc to build the quartz() device, so configure --without-aqua may be required. R was built and tested with the GCC 14.1 compilers in the arm64 gfortran distribution mentioned above using a config.site containing\nCC=/usr/local/gfortran/bin/gcc\nCXX=/usr/local/gfortran/bin/g++\nFC=/usr/local/gfortran/bin/gfortran\nCFLAGS=\"-g -O2 -Wall -pedantic -Wstrict-prototypes\"\nC17FLAGS=\"-g -O2 -Wall -pedantic -Wno-strict-prototypes\"\nC90FLAGS=$C17FLAGS\nC99FLAGS=$C17FLAGS\nCXXFLAGS=\"-g -O2 -Wall -pedantic\"\nCPPFLAGS='-isystem /opt/R/arm64/include'\nLDFLAGS=-L/opt/R/arm64/lib\nIt is usually possible to add some OpenMP support to the Apple clang compilers: see https://mac.r-project.org/openmp/. Note that that approach is somewhat fragile as it needs a libomp.dylib library matching the version of the compiler used—and for example at the time of writing none was offered for the current compilers in Xcode CLT 14.3 nor 15.\n\n\nC.3.5 Other libraries\nPre-compiled versions of many of the Useful libraries and programs are available from https://mac.r-project.org/bin//.\nLooking at the top of /Library/Frameworks/R.framework/Resources/etc/Makeconf will show the compilers and configuration options used for the CRAN binary package for R: at the time of writing the non-default options\n--enable-memory-profiling --enable-R-framework\n--x-libraries=/opt/X11/lib --x-includes=/opt/X11/include\nwere used. 
(--enable-R-framework implies --enable-R-shlib.)\nThe main TeX implementation used by the developers is MacTeX11 (https://www.tug.org/mactex/): the full installation is about 8.5GB, but a much smaller version (‘Basic TeX’) is available at https://www.tug.org/mactex/morepackages.html to which you will need to add some packages to build R, e.g. for the 2022 version we needed to add12 helvetic, inconsolata and texinfo which brought this to about 310MB.13 TeX Live Utility (available via the MacTeX front page) provides a graphical means to manage TeX packages. These contain executables which run natively on both arm64 and x86_64.\n11 An essentially equivalent TeX installation can be obtained by the Unix TeX Live installation scripts.12 E.g. via tlmgr install helvetic inconsolata texinfo .13 Adding all the packages needed to check CRAN increased this to about 600MB.Checking packages thoroughly requires Ghostscript (part of the full MacTeX distribution or separately from https://www.tug.org/mactex/morepackages.html) and qpdf (from https://mac.r-project.org/bin//, a version of which is in the bin directory of a binary installation of R, usually /Library/Frameworks/R.framework/Resources/bin/qpdf).\nR CMD check --as-cran makes use of ‘HTML Tidy’. macOS at the time of writing has a version in /usr/bin/tidy dating from 2006 which is far too old. Up-to-date versions can be installed from http://binaries.html-tidy.org/.\nOne macOS quirk is that the default path has /usr/local/bin after /usr/bin, contrary to common practice on Unix-alikes. This means that if you install tools from the sources they will by default be installed under /usr/local and not supersede the system versions.\nParallel installation of packages will make use of the utility timeout if available. 
A ‘universal’ build can be downloaded from https://www.stats.ox.ac.uk/pub/bdr/timeout: make it executable (chmod 755 timeout) and put it somewhere on your path.\n\n\nC.3.6 Accelerate\nThe Accelerate library14 can be used via the configuration option\n14 https://developer.apple.com/documentation/accelerate.--with-blas=\"-framework Accelerate\"\nto provide potentially higher-performance versions of the BLAS and LAPACK routines.15 This includes a full LAPACK which can be used via --with-lapack: however, the version of LAPACK it contains has often been seriously old (and is not used unless --with-lapack is specified). Some CRAN builds of R can be switched16 to use Accelerate’s BLAS.\n15 It has been reported that for some non-Apple toolchains CPPFLAGS needed to contain -D__ACCELERATE__: not needed for clang from LLVM.16 https://cran.r-project.org/bin/macosx/RMacOSX-FAQ.html#Which-BLAS-is-used-and-how-can-it-be-changed_003f17 Released 2021-04-01.As from macOS 13.3, the BLAS and LAPACK libraries under the Accelerate framework are ‘now inline with reference version 3.9.1’.17 However, this has been done by naming new entry points and so only accessible via their C headers. That version can be used for BLAS calls via configure option --with-newAccelerate: it requires at least macOS 13.3 and SDK 13.3 (from Xcode CLT 14.3). To use it for both BLAS and LAPACK calls, configure with --with-newAccelerate=lapack. These options cannot be used with others such as --with-blas and --with-lapack.\nThreading in Accelerate is controlled by ‘Grand Central Dispatch’18 and is said not to need user control. Test nls.R in package stats has often failed with the Accelerate BLAS on Intel macOS. 
All versions of Accelerate show differences from the reference BLAS (and most others) in the use of NA vs NaN and a substantial number of R packages fail their checks.\n18 E.g., https://en.wikipedia.org/wiki/Grand_Central_Dispatch .\n\nC.3.7 OpenBLAS\nR has been built on arm64 using OpenBLAS 0.3.24 (sources from https://github.com/OpenMathLib/OpenBLAS/releases) by symlinking /opt/OpenBLAS/lib/libopenblas.dylib to lib/libRblas.dylib (see Shared BLAS).\nOn macOS, a default build of OpenBLAS uses pthreads (as macOS does not have OpenMP) with the number of threads controlled by environment variable OPENBLAS_NUM_THREADS. On an M1 Pro this defaulted to 10 threads (there are 8 ‘performance’ cores and 2 ‘efficiency cores’) and we saw a 9x speedup over the reference BLAS on a large SVD (which was slightly faster than Accelerate).\n\n\nC.3.8 Tcl/Tk headers and libraries\nIf you plan to use the tcltk package for R, you will need to install a distribution of Tcl/Tk. There are two alternatives. If you use R.APP you will want to use X11-based Tcl/Tk (as used on other Unix-alikes), which is installed under $LOCAL/lib as part of the CRAN binary for R.19 This may need configure options\n19 Just that component can be selected from the installer for R: at the ‘Installation Type’ screen select ‘Customise’ and then just the ‘Tcl/Tk 8.6.11’ component.--with-tcltk=$LOCAL/lib\nor\n--with-tcl-config=$LOCAL/lib/tclConfig.sh\n--with-tk-config=$LOCAL/lib/tkConfig.sh\nNote that this requires a matching XQuartz installation.\nThere is also a native (‘Aqua’) version of Tcl/Tk which produces widgets in the native macOS style: this will not work with R.APP because of conflicts over the macOS menu, but for those only using command-line R this provides a much more intuitive interface to Tk for experienced Mac users. Earlier versions of macOS came with an Aqua Tcl/Tk distribution but these were often not at all recent versions of Tcl/Tk. 
It is better to install Tcl/Tk 8.6.x from the sources20 or a binary distribution from https://www.activestate.com/products/tcl/. For the latter, configure R with\n20 Configure Tk with --enable-aqua.--with-tcl-config=/Library/Frameworks/Tcl.framework/tclConfig.sh \n--with-tk-config=/Library/Frameworks/Tk.framework/tkConfig.sh\nIf you need to find out which distribution of Tk is in use at run time, use\nlibrary(tcltk)\ntclvalue(.Tcl(\"tk windowingsystem\")) # \"x11\" or \"aqua\"\nNote that some Tcl/Tk extensions only support the X11 interface: this includes Tktable and the CRAN package tkrplot.\n\n\nC.3.9 Java\nmacOS does not come with an installed Java runtime (JRE) and a macOS upgrade may remove one if already installed: it is intended to be installed at first use. Check if a JRE is installed by running java -version in a Terminal window: if Java is not installed on an Intel Mac this may prompt you to install it. We recommend you install a version with long-term support, e.g. 17 or 2121 but not 18–20 or 22–24, which have a 6-month lifetime.\n21 The planned next LTS release is 25 in September 2025. Java 8 aka 1.8.0 is still LTS but some packages require 11 or later.22 which website works with Safari but not some other browsers.The currently simplest way to install Java is from Adoptium22: this installs into an Apple-standard location and so works with /usr/bin/java. Other builds are available from https://www.azul.com/downloads/zulu-community/?os=macos&architecture=arm-64-bit&package=jdk and from OpenJDK at https://jdk.java.net/, for which JAVA_HOME may need to be set both when configuring R and at runtime. Note that Java distribution sites may use unusual designations for macOS CPUs such as AArch64, x64 or x86 64-bit.\nBinary distributions of R are built against a specific version (e.g. 
11.0.18 or 17.0.1) of Java so\nsudo R CMD javareconf\nwill likely be needed to be run before using Java-using packages.\nTo see what compatible versions of Java are currently installed, run the appropriate one of\n/usr/libexec/java_home -V -a arm64\n/usr/libexec/java_home -V -a x86_64\nIf needed, set the environment variable JAVA_HOME to choose between these, both when R is built from the sources and when R CMD javareconf is run.\nConfiguring and building R both looks for a JRE and for support for compiling JNI programs (used to install packages rJava and JavaGD); the latter requires a JDK (Java SDK). Most distributions of Java 11 or later are of a full JDK.\nThe build process tries to fathom out what JRE/JDK to use, but it may need some help, e.g. by setting environment variable JAVA_HOME. To select a build from Adoptium set e.g.\nJAVA_HOME=/Library/Java/JavaVirtualMachines/temurin-21.jdk/Contents/Home\nin config.site. For Java 21 from https://jdk.java.net/ (which might no longer be available), use\nJAVA_HOME=/path/to/jdk-21.jdk/Contents/Home\nNote that it is necessary to set the environment variable NOAWT to 1 to install many of the Java-using packages.\n\n\nC.3.10 Frameworks\nThe CRAN build of R is installed as a framework, which is selected by the option\n./configure --enable-R-framework\n(This is intended to be used with an Apple toolchain: others may not support frameworks correctly but those from LLVM have done so.)\nIt is only needed if you want to build R for use with the R.APP console, and implies --enable-R-shlib to build R as a dynamic library. This option configures R to be built and installed as a framework called R.framework. 
The default installation path for R.framework is /Library/Frameworks but this can be changed at configure time by specifying the flag --enable-R-framework[=DIR] (or --prefix) or at install time via\nmake prefix=/where/you/want/R.framework/to/go install\nNote that installation as a framework is non-standard (especially to a non-standard location) and Unix utilities may not support it (e.g. the pkg-config file libR.pc will be put somewhere unknown to pkg-config).\n\n\nC.3.11 Building R.app\nBuilding the R.APP GUI console is a separate project, using Xcode. Before compiling R.APP make sure the current version of R is installed in /Library/Frameworks/R.framework and is working at the command-line (this can be a binary install).\nThe current sources can be checked out by\nsvn co https://svn.r-project.org/R-packages/trunk/Mac-GUI\nand built by loading the R.xcodeproj project (select the R target and a suitable configuration), or from the command-line by e.g.\nxcodebuild -target R -configuration Release\nSee also the INSTALL file in the checkout or directly at https://svn.r-project.org/R-packages/trunk/Mac-GUI/INSTALL.\nR.APP does not need to be installed in any specific way. Building R.APP results in the R.APP bundle which appears as one R icon. This application bundle can be run from anywhere and it is customary to place it in the /Applications folder.\n\n\nC.3.12 Building binary packages\nCRAN macOS binary packages are distributed as tarballs with suffix .tgz to distinguish them from source tarballs. One can tar an existing installed package, or use R CMD INSTALL --build.\nHowever, there are some important details.\n\nCurrent CRAN macOS distributions are targeted at Big Sur so it is wise to ensure that the compilers generate code that will run on Big Sur or later. 
With the recommended compilers we can use\nCC=\"clang -mmacos-version-min=11.0\"\nCXX=\"clang++ -mmacos-version-min=11.0\"\nFC=\"/opt/gfortran/bin/gfortran -mmacosx-version-min=11.0\"\nor set the environment variable\nexport MACOSX_DEPLOYMENT_TARGET=11.0\nUsing the flag -Werror=partial-availability can help trigger compilation errors on functionality not in Big Sur.\nCheck that any compiled code is not dynamically linked to libraries only on your machine, for example by using otool -L or objdump -macho -dylibs-used. This can include C++ and Fortran run-time libraries under /opt/R/x86_64/lib or /opt/R/arm64/lib: one can use install_name_tool to point these at system versions or those shipped with R, for example\ninstall_name_tool -change /usr/local/llvm/lib/libc++.1.dylib \\\n/usr/lib/libc++.1.dylib \\\npkg.so\n\ninstall_name_tool -change \\\n/opt/gfortran/lib/gcc/aarch64-apple-darwin20.0/12.2.0/libgfortran.5.dylib \\\n/Library/Frameworks/R.framework/Resources/lib/libgfortran.5.dylib \\\npkg.so\n\ninstall_name_tool -change \\\n/opt/gfortran/lib/gcc/aarch64-apple-darwin20.0/12.2.0/libquadmath.0.dylib \\\n/Library/Frameworks/R.framework/Resources/lib/libquadmath.0.dylib \\\npkg.so\n(where the details depend on the compilers and CRAN macOS R release).\nFor C++ code there is the possibility that calls will be generated to entry points not in the system /usr/lib/libc++.1.dylib. The previous step allows this to be tested against the system library on the build OS, but not against earlier ones. It may be possible to circumvent that by static linking to libc++.a and libc++abi.a by something like\nSHLIB_CXXLD = /usr/local/llvm/bin/clang\nPKG_LIBS = /usr/local/llvm/lib/libc++.a /usr/local/llvm/lib/libc++abi.a\nin src/Makevars. 
It would also be possible to static link the Fortran runtime libraries libgfortran.a and libquadmath.a should the Fortran compiler have later versions (but gfortran 8–14 all have version 5).\n\nThe CRAN binary packages are built with the Apple compiler on the oldest supported version of macOS, which avoids the first two and any issues with C++ libraries.\n\n\nC.3.13 Building for Intel on arm64\nShould one want to build R for Intel on an arm64 Big Sur Mac, add the target for the compilers:\nCC=\"clang -arch x86_64\"\nOBJC=$CC\nCXX=\"clang++ -arch x86_64\"\nFC=\"/opt/gfortran/bin/gfortran -arch x86_64 -mtune=native -mmacosx-version-min=11\"\nand install the Fortran compiler and external software described above for Intel builds (and have /opt/R/x86_64/bin before /opt/R/arm64/bin in your path).\nTo set the correct architecture (which will be auto-detected as aarch64), use something like\n/path/to/configure --build=x86_64-apple-darwin20\n\n\nC.3.14 Installer\nThe scripts for the CRAN packaging of R can be found under https://svn.r-project.org/R-dev-web/trunk/QA/Simon/R4/: start with the README file in that directory.", + "text": "C.3 macOS\nThe instructions here are for ‘Apple Silicon’ (arm64) or Intel 64-bit (x86_64) builds on macOS 11 (Big Sur), 12 (Monterey), 13 (Ventura), 14 (Sonoma) and likely later. (They may well work on Intel macOS 10.14 or 10.15, but are untested there.)\n\nC.3.1 Prerequisites\nThe Apple silicon components install into /opt/R/arm64, the Intel ones into /opt/R/x86_64. 
That may not exist6 so it is simplest to first create the directory and adjust its ownership if desired: for example by\n6 it will if R has been installed from CRAN since R 4.3.0.sudo mkdir -p /opt/R/arm64\nsudo chown -R $USER /opt/R\nAlso, add /opt/R/arm64/bin or /opt/R/x86_64/bin to your path.\nDefine an appropriate variable in your Terminal:\nset LOCAL=/opt/R/arm64 # Apple Silicon\nset LOCAL=/opt/R/x86_64 # Intel\nto use the code snippets here.\nThe following are essential to build R:\n\nApple’s ‘Command Line Tools’: these can be (re-)installed by running xcode-select --install in a terminal.\nIf you have a fresh OS installation, running e.g. make in a terminal will offer the installation of the command-line tools. If you have installed Xcode, this provides the command-line tools. The tools may need to be reinstalled when macOS is upgraded, as upgrading may partially or completely remove them.\nThe Command Line Tools provide C and C++ compilers derived from LLVM’s clang but nowadays known as ‘Apple clang’ with different versioning (so Apple clang 15 is unrelated to LLVM clang 15).\nA Fortran compiler. See Fortran compiler.\nBinary components pcre27 and xz (for liblzma) from https://mac.r-project.org/bin/. There is an R script there to help with installing all the needed components. 
(At the time of writing install.libs(\"r-base-dev\") installed neither readline5 nor those needed to support Pango.)\nIntel users want the darwin20 components: the darwin17 ones are for macOS 10.13–10.15.\nOr this can be done manually, for example by\ncurl -OL https://mac.r-project.org/bin/darwin20/arm64/pcre2-10.42-darwin.20-arm64.tar.xz\nsudo tar -xvzf pcre2-10.42-darwin.20-arm64.tar.xz -C /\ncurl -OL https://mac.r-project.org/bin/darwin20/arm64/xz-5.4.2-darwin.20-arm64.tar.xz\nsudo tar -xvzf xz-5.4.2-darwin.20-arm64.tar.xz -C /\n(sudo is not needed if your account owns /opt/R/arm64 or /opt/R/x86_64 as appropriate.)\nMessages like opt/R/: Can't restore time should be ignored.\n\n7 If compiling it from source on arm64, pcre2 (at least up to version 10.39) needs to be built without JIT support (the default) as the R build segfaults if that is enabled, so do run make check on your build.and desirable\n\nComponent readline5.8 If readline is not present, the emulation in Apple’s version of libedit (aka editline) will be used: if you wish to avoid that, configure with --without-readline.\nComponents jpeg, libpng, pkgconfig, tiff and zlib-system-stub from https://mac.r-project.org/bin// for the full range of bitmapped graphics devices. (Some builds of tiff may require libwebp and/or openjpeg.)\nAn X sub-system unless configuring using --without-x: see https://www.xquartz.org/. R’s configure script can be told to look for X11 in XQuartz’s main location of /opt/X11, e.g. 
by\n--x-includes=/opt/X11/include --x-libraries=/opt/X11/lib\nBe wary of pre-release versions of XQuartz, which may be offered as an update.\nAn Objective-C compiler, as provided by clang in the Command Line Tools: this is needed for the quartz() graphics device.\nUse --without-aqua if you want a standard Unix-alike build: apart from disabling quartz() and the ability to use the build with R.APP, it also changes the default location of the personal library (see ?.libPaths).\nA Tcl/Tk installation, See Tcl/Tk headers and libraries.\nSupport for Cairo-based graphics devices. See Cairo graphics.\nA TeX installation. See Other libraries.\ntexi2any from a Texinfo distribution, which requires perl (currently a default part of macOS but it has been announced that it may not be in future). A version of texi2any has been included in the binary distribution of R and there is a texinfo component at https://mac.r-project.org/bin/.\n\n8 For licence reasons this is version 5.2 of readline: for those who want a more recent version it is straightforward to compile it from its sources.To build R itself from the sources with the C/C++ compilers in the Command Line Tools (or Xcode) and gfortran from the installer mentioned below, use a file config.site containing\nCC=clang\nOBJC=$CC\nFC=\"/opt/gfortran/bin/gfortran -mtune=native\"\nCPPFLAGS='-isystem $LOCAL/include'\nCXX=clang++\nand configure by something like\n./configure -C \\\n--enable-R-shlib --enable-memory-profiling \\\n--x-includes=/opt/X11/include --x-libraries=/opt/X11/lib \\\n--with-tcl-config=$LOCAL/lib/tclConfig.sh \\\n--with-tk-config=$LOCAL/lib/tkConfig.sh \\\nPKG_CONFIG_PATH=$LOCAL/lib/pkgconfig:/usr/lib/pkgconfig\n(See below for other options for Tcl/Tk.) For an arm64 build further flags are desirable in config.site:\nCFLAGS=\"-falign-functions=8 -g -O2\"\nis needed to inter-work with gfortran without segfaulting in some packages. 
Some builds of gfortran have targeted the current version of macOS (unlike clang), causing linker warnings: to avoid these use\nFFLAGS=\"-g -O2 -mmacosx-version-min=11.0\"\nFCFLAGS=\"-g -O2 -mmacosx-version-min=11.0\"\nor perhaps\nFFLAGS=\"-g -O2 -mmacos-version-min=11.0\"\nFCFLAGS=\"-g -O2 -mmacos-version-min=11.0\"\nwhere 11.0 can be replaced by 12.0, 13.0 or 14.0 for macOS 12.x, 13.x and 14.x.\nTo install packages using compiled code one needs the Command Line Tools (or Xcode) and appropriate compilers, e.g. the C/C++ compilers from those tools and/or gfortran. Some packages have further requirements such as component pkgconfig (and to set PKG_CONFIG_PATH= as above).\nA subversion client can be obtained from https://mac.r-project.org/tools/, for example by (Apple Silicon)\ncurl -OL https://mac.r-project.org/tools/subversion-1.14.1-darwin.20-arm64.tar.gz\ntar xf subversion-1.14.1-darwin.20-arm64.tar.gz \nsudo cp subversion-1.14.1-darwin-20-arm64/svn $LOCAL/bin\nor (Intel)\ncurl -OL https://mac.r-project.org/tools/subversion-1.14.0-darwin15.6.tar.gz\ntar xf subversion-1.14.0-darwin15.6.tar.gz\nsudo cp subversion-1.14.0-darwin15.6/svn $LOCAL/bin\nIf building software or installing source packages with cmake (or a non-Apple make) for ‘Apple Silicon’ ensure it contains the arm64 architecture (use file to be sure). Running Apple compilers from an x86_64 executable will generate x86_64 code ….\nUpdating an arm64 build may fail because of the bug described at https://openradar.appspot.com/FB8914243 but ab initio builds work. 
This has been far rarer since macOS 13.\nIf you are using the macOS 13 SDK9, you may need to add something like -mmacos-version-min=12.0 to CFLAGS.\n9 ls -l xcrun -show-sdk-path in a terminal will show you which SDK is selected.Linker warnings like\nld: warning: could not create compact unwind for _sort_:\n register 26 saved somewhere other than in frame\nld: warning: ld: warning:\n could not create compact unwind for _arcoef_: registers 23 and 24 not saved contiguously in frame\nld: warning: could not create compact unwind for ___emutls_get_address:\n registers 23 and 24 not saved contiguously in frame\ncan be ignored. These stem from compiled Fortran code, including its run-time libraries.\nThe default security settings can make it difficult to install Apple packages which have not been ‘notarized’10 by Apple. And not just packages, as this has been seen for executables contained in tarballs/zipfiles (for example, for pandoc). Usually one can use Open With (Control/right/two-finger-click in Finder), then select Installer and Open if you get a further warning message.\n10 See https://developer.apple.com/documentation/xcode/notarizing_macos_software_before_distribution.If you run into problems with ‘quarantine’ for tarballs downloaded in a browser, consider using curl -OL to download (as illustrated above) or xattr -c to remove extended attributes.\nconfigure defaults to --with-internal-tzcode on macOS. The native implementation used to be unusable on earlier versions (with a 32-bit time_t and/or timezone tables missing information beyond the 32-bit range). As from macOS 12.6, option --without-internal-tzcode can be used to override this and R contains sufficient workarounds (for example, the native code fails to recognize dates with a negative tm_year, that is dates before 1900) for R to pass its checks. 
However, there are discrepancies, notably in Europe in the 1900s and 1940s, even though the Olson database contains the correct information.\n\n\nC.3.2 Fortran compiler\nThere is a ‘universal’ (arm64 and Intel) build of gfortran 12.2 at https://mac.r-project.org/tools/gfortran-12.2-universal.pkg. This installs into /opt/gfortran.\nThe /opt/gfortran/SDK symlink should point to the desired path to the SDK (defaults to the command line tools SDK). This can be updated by running /opt/gfortran/bin/gfortran-update-sdk or manually. If the symlink is broken, the driver will issue a warning and use xcrun -show-sdk-path to try to find an SDK and use its path. (The SDK path is used when using gfortran to link, so not when building R but when installing a few packages.)\nBuilds of gfortran 13.2 and 14.1 for arm64 macOS 14 are available at https://github.com/fxcoudert/gfortran-for-macOS/releases. These can be built for Intel and older OSes from the sources at https://github.com/iains/gcc-13-branch/ and gcc-14-branch/. To use one of the pre-built compilers with Apple clang needs something like\nLDFLAGS=\"-L/opt/R/arm64/lib -rpath /usr/local/gfortran/lib\"\nin config.site to ensure the Fortran run-time libraries are found.\n\n\nC.3.3 Cairo graphics\nCairo-based graphics devices such as cairo_ps, cairo_pdf, X11(type = \"cairo\") and the Cairo-based types of devices bmp jpeg, png and tiff are not the default on macOS, and much less used than the Quartz-based devices. However, the only SVG device in the R distribution, svg, is based on Cairo.\nSupport for Cairo is optional and can be added in several ways, all of which need pkg-config. configure will add Cairo support if pkg-config finds package cairo unless --without-cairo is used.\nA way to statically link Cairo is by downloading and unpacking components cairo, fontconfig, freetype, pixman and zlib-system-stub (and do not have /opt/X11/lib/pkgconfig in PKG_CONFIG_PATH). 
Some static builds of fontconfig need libxml2 (from component xml2) and others expat, supplied by macOS but needing a file $LOCAL/lib/pkgconfig/expat.pc along the lines of\nName: expat\nVersion: 2.2.8\nDescription: expat XML parser\nURL: http://www.libexpat.org\nLibs: -lexpat\nCflags: \nNote that the list of components is liable to change: running pkg-config cairo --exists --print-errors should tell you if any others are required.\nThe best font experience of Cairo graphics will be to use it in combination with Pango which will match that supported on most other Unix-alikes. configure uses pkg-config to determine if all the external software required by both Pango and Cairo is available: running pkg-config pangocairo --exists --print-errors should show if the installation suffices and if not, what is missing. At the time of writing using pre-built components cairo, fontconfig, freetype, ffi, fribidi, gettext, icu, glib, harfbuzz, pango, pcre, pixman and xml2 sufficed.\n\n\nC.3.4 Other C/C++ compilers\nOther pre-compiled distributions of clang may be available from https://github.com/llvm/llvm-project/releases/ (recently only for arm64 and usually unsigned/not notarized which makes them hard to use). In particular, these include support for OpenMP which Apple clang does not. Some of these have included support for the ASan and UBSan sanitizers.\nSuppose one of these distributions is installed under $LOCAL/llvm. Use a file config.site containing\nSDK=`xcrun -show-sdk-path`\nCC=\"$LOCAL/llvm/bin/clang -isysroot $SDK\"\nCXX=\"$LOCAL/llvm/bin/clang++ -isysroot $SDK\"\nOBJC=$CC\nFC=/opt/gfortran/bin/gfortran\nLDFLAGS=\"-L$LOCAL/llvm/lib -L$LOCAL/lib\"\nR_LD_LIBRARY_PATH=$LOCAL/llvm/lib:$LOCAL/lib\nThe care to specify library paths is to ensure that the OpenMP runtime library, here $LOCAL/llvm/lib/libomp.dylib, is found when needed. If this works, you should see the line\nchecking whether OpenMP SIMD reduction is supported... yes\nin the configure output. 
Also, R_LD_LIBRARY_PATH needs to be set to find the latest version of the C++ run-time libraries rather than the system ones.\nIt is normally possible to build R with GCC (built from the sources, from a gfortran distribution, from Homebrew, …). When last tested it was not possible to use gcc to build the quartz() device, so configure --without-aqua may be required. R was built and tested with the GCC 14.1 compilers in the arm64 gfortran distribution mentioned above using a config.site containing\nCC=/usr/local/gfortran/bin/gcc\nCXX=/usr/local/gfortran/bin/g++\nFC=/usr/local/gfortran/bin/gfortran\nCFLAGS=\"-g -O2 -Wall -pedantic -Wstrict-prototypes\"\nC17FLAGS=\"-g -O2 -Wall -pedantic -Wno-strict-prototypes\"\nC90FLAGS=$C17FLAGS\nC99FLAGS=$C17FLAGS\nCXXFLAGS=\"-g -O2 -Wall -pedantic\"\nCPPFLAGS='-isystem /opt/R/arm64/include'\nLDFLAGS=-L/opt/R/arm64/lib\nIt is usually possible to add some OpenMP support to the Apple clang compilers: see https://mac.r-project.org/openmp/. Note that that approach is somewhat fragile as it needs a libomp.dylib library matching the version of the compiler used—and for example at the time of writing none was offered for the current compilers in Xcode CLT 14.3 nor 15.\n\n\nC.3.5 Other libraries\nPre-compiled versions of many of the Useful libraries and programs are available from https://mac.r-project.org/bin//.\nLooking at the top of /Library/Frameworks/R.framework/Resources/etc/Makeconf will show the compilers and configuration options used for the CRAN binary package for R: at the time of writing the non-default options\n--enable-memory-profiling --enable-R-framework\n--x-libraries=/opt/X11/lib --x-includes=/opt/X11/include\nwere used. 
(--enable-R-framework implies --enable-R-shlib.)\nThe main TeX implementation used by the developers is MacTeX11 (https://www.tug.org/mactex/): the full installation is about 8.5GB, but a much smaller version (‘Basic TeX’) is available at https://www.tug.org/mactex/morepackages.html to which you will need to add some packages to build R, e.g. for the 2022 version we needed to add12 helvetic, inconsolata and texinfo which brought this to about 310MB.13 TeX Live Utility (available via the MacTeX front page) provides a graphical means to manage TeX packages. These contain executables which run natively on both arm64 and x86_64.\n11 An essentially equivalent TeX installation can be obtained by the Unix TeX Live installation scripts.12 E.g. via tlmgr install helvetic inconsolata texinfo .13 Adding all the packages needed to check CRAN increased this to about 600MB.Checking packages thoroughly requires Ghostscript (part of the full MacTeX distribution or separately from https://www.tug.org/mactex/morepackages.html) and qpdf (from https://mac.r-project.org/bin//, a version of which is in the bin directory of a binary installation of R, usually /Library/Frameworks/R.framework/Resources/bin/qpdf).\nR CMD check --as-cran makes use of ‘HTML Tidy’. macOS at the time of writing has a version in /usr/bin/tidy dating from 2006 which is far too old. Up-to-date versions can be installed from http://binaries.html-tidy.org/.\nOne macOS quirk is that the default path has /usr/local/bin after /usr/bin, contrary to common practice on Unix-alikes. This means that if you install tools from the sources they will by default be installed under /usr/local and not supersede the system versions.\nParallel installation of packages will make use of the utility timeout if available. 
A ‘universal’ build can be downloaded from https://www.stats.ox.ac.uk/pub/bdr/timeout: make it executable (chmod 755 timeout) and put it somewhere on your path.\n\n\nC.3.6 Accelerate\nThe Accelerate library14 can be used via the configuration option\n14 https://developer.apple.com/documentation/accelerate.--with-blas=\"-framework Accelerate\"\nto provide potentially higher-performance versions of the BLAS and LAPACK routines.15 This includes a full LAPACK which can be used via --with-lapack: however, the version of LAPACK it contains has often been seriously old (and is not used unless --with-lapack is specified). Some CRAN builds of R can be switched16 to use Accelerate’s BLAS.\n15 It has been reported that for some non-Apple toolchains CPPFLAGS needed to contain -D__ACCELERATE__: not needed for clang from LLVM.16 https://cran.r-project.org/bin/macosx/RMacOSX-FAQ.html#Which-BLAS-is-used-and-how-can-it-be-changed_003f17 Released 2021-04-01.As from macOS 13.3, the BLAS and LAPACK libraries under the Accelerate framework are ‘now inline with reference version 3.9.1’.17 However, this has been done by naming new entry points and so only accessible via their C headers. That version can be used for BLAS calls via configure option --with-newAccelerate: it requires at least macOS 13.3 and SDK 13.3 (from Xcode CLT 14.3). To use it for both BLAS and LAPACK calls, configure with --with-newAccelerate=lapack. These options cannot be used with others such as --with-blas and --with-lapack.\nThreading in Accelerate is controlled by ‘Grand Central Dispatch’18 and is said not to need user control. Test nls.R in package stats has often failed with the Accelerate BLAS on Intel macOS. 
All versions of Accelerate show differences from the reference BLAS (and most others) in the use of NA vs NaN and a substantial number of R packages fail their checks.\n18 E.g., https://en.wikipedia.org/wiki/Grand_Central_Dispatch .\n\nC.3.7 OpenBLAS\nR has been built on arm64 using OpenBLAS 0.3.24 (sources from https://github.com/OpenMathLib/OpenBLAS/releases) by symlinking /opt/OpenBLAS/lib/libopenblas.dylib to lib/libRblas.dylib (see Shared BLAS).\nOn macOS, a default build of OpenBLAS uses pthreads (as macOS does not have OpenMP) with the number of threads controlled by environment variable OPENBLAS_NUM_THREADS. On an M1 Pro this defaulted to 10 threads (there are 8 ‘performance’ cores and 2 ‘efficiency cores’) and we saw a 9x speedup over the reference BLAS on a large SVD (which was slightly faster than Accelerate).\n\n\nC.3.8 Tcl/Tk headers and libraries\nIf you plan to use the tcltk package for R, you will need to install a distribution of Tcl/Tk. There are two alternatives. If you use R.APP you will want to use X11-based Tcl/Tk (as used on other Unix-alikes), which is installed under $LOCAL/lib as part of the CRAN binary for R.19 This may need configure options\n19 Just that component can be selected from the installer for R: at the ‘Installation Type’ screen select ‘Customise’ and then just the ‘Tcl/Tk 8.6.11’ component.--with-tcltk=$LOCAL/lib\nor\n--with-tcl-config=$LOCAL/lib/tclConfig.sh\n--with-tk-config=$LOCAL/lib/tkConfig.sh\nNote that this requires a matching XQuartz installation.\nThere is also a native (‘Aqua’) version of Tcl/Tk which produces widgets in the native macOS style: this will not work with R.APP because of conflicts over the macOS menu, but for those only using command-line R this provides a much more intuitive interface to Tk for experienced Mac users. Earlier versions of macOS came with an Aqua Tcl/Tk distribution but these were often not at all recent versions of Tcl/Tk. 
It is better to install Tcl/Tk 8.6.x from the sources20 or a binary distribution from https://www.activestate.com/products/tcl/. For the latter, configure R with\n20 Configure Tk with --enable-aqua.--with-tcl-config=/Library/Frameworks/Tcl.framework/tclConfig.sh \n--with-tk-config=/Library/Frameworks/Tk.framework/tkConfig.sh\nIf you need to find out which distribution of Tk is in use at run time, use\nlibrary(tcltk)\ntclvalue(.Tcl(\"tk windowingsystem\")) # \"x11\" or \"aqua\"\nNote that some Tcl/Tk extensions only support the X11 interface: this includes Tktable and the CRAN package tkrplot.\n\n\nC.3.9 Java\nmacOS does not come with an installed Java runtime (JRE) and a macOS upgrade may remove one if already installed: it is intended to be installed at first use. Check if a JRE is installed by running java -version in a Terminal window: if Java is not installed this may prompt you to install it from Oracle21 (but see the next paragraph). We recommend you install a version with long-term support, e.g. 17 or 2122 but not 18–20 or 22–24, which have a 6-month lifetime.\n21 Oracle Java has a restrictive licence, unlike distributions based on OpenJDK.22 The planned next LTS release is 25 in September 2025. Java 8 aka 1.8.0 is still LTS but some packages require 11 or later.23 which website works with Safari but not some other browsers.The currently simplest way to install Java is from Adoptium23: this installs into an Apple-standard location and so works with /usr/bin/java. Other builds of OpenJDK are available from https://www.azul.com/downloads/zulu-community/?os=macos&architecture=arm-64-bit&package=jdk and from OpenJDK at https://jdk.java.net/, for which JAVA_HOME may need to be set both when configuring R and at runtime. Note that Java distribution sites may use unusual designations for macOS CPUs such as AArch64, x64 or x86 64-bit.\nBinary distributions of R are built against a specific version (e.g. 
11.0.18 or 17.0.1) of Java so\nsudo R CMD javareconf\nwill likely be needed to be run before using Java-using packages.\nTo see what compatible versions of Java are currently installed, run the appropriate one of\n/usr/libexec/java_home -V -a arm64\n/usr/libexec/java_home -V -a x86_64\nIf needed, set the environment variable JAVA_HOME to choose between these, both when R is built from the sources and when R CMD javareconf is run.\nConfiguring and building R both looks for a JRE and for support for compiling JNI programs (used to install packages rJava and JavaGD); the latter requires a JDK (Java SDK). Most distributions of Java 11 or later are of a full JDK.\nThe build process tries to fathom out what JRE/JDK to use, but it may need some help, e.g. by setting environment variable JAVA_HOME. To select a build from Adoptium set e.g.\nJAVA_HOME=/Library/Java/JavaVirtualMachines/temurin-21.jdk/Contents/Home\nin config.site. For Java 21 from https://jdk.java.net/ (which might no longer be available), use\nJAVA_HOME=/path/to/jdk-21.jdk/Contents/Home\nNote that it is necessary to set the environment variable NOAWT to 1 to install many of the Java-using packages.\n\n\nC.3.10 Frameworks\nThe CRAN build of R is installed as a framework, which is selected by the option\n./configure --enable-R-framework\n(This is intended to be used with an Apple toolchain: others may not support frameworks correctly but those from LLVM have done so.)\nIt is only needed if you want to build R for use with the R.APP console, and implies --enable-R-shlib to build R as a dynamic library. This option configures R to be built and installed as a framework called R.framework. 
The default installation path for R.framework is /Library/Frameworks but this can be changed at configure time by specifying the flag --enable-R-framework[=DIR] (or --prefix) or at install time via\nmake prefix=/where/you/want/R.framework/to/go install\nNote that installation as a framework is non-standard (especially to a non-standard location) and Unix utilities may not support it (e.g. the pkg-config file libR.pc will be put somewhere unknown to pkg-config).\n\n\nC.3.11 Building R.app\nBuilding the R.APP GUI console is a separate project, using Xcode. Before compiling R.APP make sure the current version of R is installed in /Library/Frameworks/R.framework and is working at the command-line (this can be a binary install).\nThe current sources can be checked out by\nsvn co https://svn.r-project.org/R-packages/trunk/Mac-GUI\nand built by loading the R.xcodeproj project (select the R target and a suitable configuration), or from the command-line by e.g.\nxcodebuild -target R -configuration Release\nSee also the INSTALL file in the checkout or directly at https://svn.r-project.org/R-packages/trunk/Mac-GUI/INSTALL.\nR.APP does not need to be installed in any specific way. Building R.APP results in the R.APP bundle which appears as one R icon. This application bundle can be run from anywhere and it is customary to place it in the /Applications folder.\n\n\nC.3.12 Building binary packages\nCRAN macOS binary packages are distributed as tarballs with suffix .tgz to distinguish them from source tarballs. One can tar an existing installed package, or use R CMD INSTALL --build.\nHowever, there are some important details.\n\nCurrent CRAN macOS distributions are targeted at Big Sur so it is wise to ensure that the compilers generate code that will run on Big Sur or later. 
With the recommended compilers we can use\nCC=\"clang -mmacos-version-min=11.0\"\nCXX=\"clang++ -mmacos-version-min=11.0\"\nFC=\"/opt/gfortran/bin/gfortran -mmacosx-version-min=11.0\"\nor set the environment variable\nexport MACOSX_DEPLOYMENT_TARGET=11.0\nUsing the flag -Werror=partial-availability can help trigger compilation errors on functionality not in Big Sur.\nCheck that any compiled code is not dynamically linked to libraries only on your machine, for example by using otool -L or objdump -macho -dylibs-used. This can include C++ and Fortran run-time libraries under /opt/R/x86_64/lib or /opt/R/arm64/lib: one can use install_name_tool to point these at system versions or those shipped with R, for example\ninstall_name_tool -change /usr/local/llvm/lib/libc++.1.dylib \\\n/usr/lib/libc++.1.dylib \\\npkg.so\n\ninstall_name_tool -change \\\n/opt/gfortran/lib/gcc/aarch64-apple-darwin20.0/12.2.0/libgfortran.5.dylib \\\n/Library/Frameworks/R.framework/Resources/lib/libgfortran.5.dylib \\\npkg.so\n\ninstall_name_tool -change \\\n/opt/gfortran/lib/gcc/aarch64-apple-darwin20.0/12.2.0/libquadmath.0.dylib \\\n/Library/Frameworks/R.framework/Resources/lib/libquadmath.0.dylib \\\npkg.so\n(where the details depend on the compilers and CRAN macOS R release).\nFor C++ code there is the possibility that calls will be generated to entry points not in the system /usr/lib/libc++.1.dylib. The previous step allows this to be tested against the system library on the build OS, but not against earlier ones. It may be possible to circumvent that by static linking to libc++.a and libc++abi.a by something like\nSHLIB_CXXLD = /usr/local/llvm/bin/clang\nPKG_LIBS = /usr/local/llvm/lib/libc++.a /usr/local/llvm/lib/libc++abi.a\nin src/Makevars. 
It would also be possible to static link the Fortran runtime libraries libgfortran.a and libquadmath.a should the Fortran compiler have later versions (but gfortran 8–14 all have version 5).\n\nThe CRAN binary packages are built with the Apple compiler on the oldest supported version of macOS, which avoids the first two and any issues with C++ libraries.\n\n\nC.3.13 Building for Intel on arm64\nShould one want to build R for Intel on an arm64 Big Sur Mac, add the target for the compilers:\nCC=\"clang -arch x86_64\"\nOBJC=$CC\nCXX=\"clang++ -arch x86_64\"\nFC=\"/opt/gfortran/bin/gfortran -arch x86_64 -mtune=native -mmacosx-version-min=11\"\nand install the Fortran compiler and external software described above for Intel builds (and have /opt/R/x86_64/bin before /opt/R/arm64/bin in your path).\nTo set the correct architecture (which will be auto-detected as aarch64), use something like\n/path/to/configure --build=x86_64-apple-darwin20\n\n\nC.3.14 Installer\nThe scripts for the CRAN packaging of R can be found under https://svn.r-project.org/R-dev-web/trunk/QA/Simon/R4/: start with the README file in that directory.", "crumbs": [ "Appendices", "C Platform notes" diff --git a/r-exts/Debugging.html b/r-exts/Debugging.html index ffa96c9..1259943 100644 --- a/r-exts/Debugging.html +++ b/r-exts/Debugging.html @@ -606,7 +606,7 @@AddressSanitizer
(‘ASan’) is a tool with similar aims to the memory checker in valgrind
. It is available with suitable builds7 of gcc
and clang
on common Linux and macOS platforms. See https://clang.llvm.org/docs/UsersManual.html#controlling-code-generation, https://clang.llvm.org/docs/AddressSanitizer.html and https://github.com/google/sanitizers.
7 currently on x86_64
/ix86
Linux and FreeBSD, with some support for Intel macOS but not with the toolchain normally used with R. (There is a faster variant, HWASAN, for aarch64
only.) On some platforms the runtime library, libasan, needs to be installed separately, and for checking C++ you may also need libubsan.
More thorough checks of C++ code are done if the C++ library has been ‘annotated’: at the time of writing this applied to std::vector
in libc++
for use with clang
and gives rise to container-overflow
8 reports.
7 currently on x86_64
/ix86
Linux and FreeBSD, with some support for macOS – see https://developer.apple.com/documentation/xcode/diagnosing-memory-thread-and-crash-issues-early. (There is a faster variant, HWASAN, for aarch64
only.) On some platforms the runtime library, libasan, needs to be installed separately, and for checking C++ you may also need libubsan.
More thorough checks of C++ code are done if the C++ library has been ‘annotated’: at the time of writing this applied to std::vector
in libc++
for use with clang
and gives rise to container-overflow
8 reports.
It requires code to have been compiled and linked with -fsanitize=address
and compiling with -fno-omit-frame-pointer
will give more legible reports. It has a runtime penalty of 2–3x, extended compilation times and uses substantially more memory, often 1–2GB, at run time. On 64-bit platforms it reserves (but does not allocate) 16–20TB of virtual memory: restrictive shell settings can cause problems. It can be helpful to increase the stack size, for example to 40MB.
By comparison with valgrind
, ASan can detect misuse of stack and global variables but not the use of uninitialized memory.
Recent versions return symbolic addresses for the location of the error provided llvm-symbolizer
9 is on the path: if it is available but not on the path or has been renamed10, one can use an environment variable, e.g.
‘Undefined behaviour’ is where the language standard does not require particular behaviour from the compiler. Examples include division by zero (where for doubles R requires the ISO/IEC 60559 behaviour but C/C++ do not), use of zero-length arrays, shifts too far for signed types (e.g. int x, y; y = x << 31;
), out-of-range coercion, invalid C++ casts and mis-alignment. Not uncommon examples of out-of-range coercion in R packages are attempts to coerce a NaN
or infinity to type int
or NA_INTEGER
to an unsigned type such as size_t
. Also common is y[x - 1]
forgetting that x
might be NA_INTEGER
.
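The two out-of-range cases just mentioned can be guarded against before the coercion or the subscript is evaluated. What follows is only a minimal sketch in plain C, not code from R or from any package: the names NA_INT_SKETCH, double_to_int and lookup_1based are hypothetical, and NA_INT_SKETCH is a stand-in for NA_INTEGER (which R defines as INT_MIN) so that the example compiles without the R headers.

```c
#include <assert.h>
#include <limits.h>
#include <math.h>
#include <stddef.h>

/* Hypothetical stand-in for R's NA_INTEGER, so that this sketch
   compiles without R.h. */
#define NA_INT_SKETCH INT_MIN

/* Coerce a double to int without undefined behaviour: NaN, infinities
   and values outside the int range (including INT_MIN itself, which is
   reserved for NA) are mapped to NA instead of being cast. */
int double_to_int(double x)
{
    if (isnan(x) || x <= (double) INT_MIN || x > (double) INT_MAX)
        return NA_INT_SKETCH;
    return (int) x;
}

/* Return y[x - 1] for a 1-based index x, or 'fallback' when x is NA or
   lies outside 1..n, so that y is never read out of bounds. */
double lookup_1based(const double *y, size_t n, int x, double fallback)
{
    assert(y != NULL);
    if (x == NA_INT_SKETCH || x < 1 || (size_t) x > n)
        return fallback;
    return y[x - 1];
}
```

In package code the same checks would normally be written against NA_INTEGER and ISNAN/R_FINITE from the R headers; the point of the sketch is only that the test has to come before the cast or the subscript, since the sanitizer reports the operation itself.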
‘UBSanitizer’ is a tool for C/C++ source code selected by -fsanitize=undefined
in suitable builds14 of clang
and GCC. Its (main) runtime library is linked into each package’s DLL, so it less often needs to be included in MAIN_LDFLAGS
. Platforms supported by clang
are listed at https://clang.llvm.org/docs/UndefinedBehaviorSanitizer.html#supported-platforms: CRAN uses it for C/C++ with both GCC and clang
on x86_64
Linux: the two toolchains often highlight different things with more reports from clang
than GCC.
14 On some platforms the runtime library, libubsan, needs to be installed separately.
This sanitizer may be combined with the Address Sanitizer by -fsanitize=undefined,address
(where both are supported, and we have seen library conflicts for clang
17 and later).
14 On some platforms the runtime library, libubsan, needs to be installed separately. For macOS, see https://developer.apple.com/documentation/xcode/diagnosing-memory-thread-and-crash-issues-early.
This sanitizer may be combined with the Address Sanitizer by -fsanitize=undefined,address
(where both are supported, and we have seen library conflicts for clang
17 and later).
Finer control of what is checked can be achieved by other options.
For clang
see https://clang.llvm.org/docs/UndefinedBehaviorSanitizer.html#ubsan-checks. The current set is (on a single line):
-fsanitize=alignment,bool,bounds,builtin,enum,float-cast-overflow,
diff --git a/r-exts/search.json b/r-exts/search.json
index 9dae987..f599d9e 100644
--- a/r-exts/search.json
+++ b/r-exts/search.json
@@ -394,7 +394,7 @@
"href": "Debugging.html#checking-memory-access",
"title": "4 Debugging",
"section": "4.3 Checking memory access",
- "text": "4.3 Checking memory access\nErrors in memory allocation and reading/writing outside arrays are very common causes of crashes (e.g., segfaults) on some machines. Often the crash appears long after the invalid memory access: in particular damage to the structures which R itself has allocated may only become apparent at the next garbage collection (or even at later garbage collections after objects have been deleted).\nNote that memory access errors may be seen with LAPACK, BLAS, OpenMP and Java-using packages: some at least of these seem to be intentional, and some are related to passing characters to Fortran.\nSome of these tools can detect mismatched allocation and deallocation. C++ programmers should note that memory allocated by new [] must be freed by delete [], other uses of new by delete, and memory allocated by malloc, calloc and realloc by free. Some platforms will tolerate mismatches (perhaps with memory leaks) but others will segfault.\n\n4.3.1 Using gctorture\nWe can help to detect memory problems in R objects earlier by running garbage collection as often as possible. This is achieved by gctorture(TRUE), which as described on its help page\n\nProvokes garbage collection on (nearly) every memory allocation. Intended to ferret out memory protection bugs. Also makes R run very slowly, unfortunately.\n\nThe reference to ‘memory protection’ is to missing C-level calls to PROTECT/UNPROTECT (see Handling the effects of garbage collection) which if missing allow R objects to be garbage-collected when they are still in use. But it can also help with other memory-related errors.\nNormally running under gctorture(TRUE) will just produce a crash earlier in the R program, hopefully close to the actual cause. 
See the next section for how to decipher such crashes.\nIt is possible to run all the examples, tests and vignettes covered by R CMD check under gctorture(TRUE) by using the option --use-gct.\nThe function gctorture2 provides more refined control over the GC torture process. Its arguments step, wait and inhibit_release are documented on its help page. Environment variables can also be used at the start of the R session to turn on GC torture: R_GCTORTURE corresponds to the step argument to gctorture2, R_GCTORTURE_WAIT to wait, and R_GCTORTURE_INHIBIT_RELEASE to inhibit_release.\nIf R is configured with --enable-strict-barrier then a variety of tests for the integrity of the write barrier are enabled. In addition tests to help detect protect issues are enabled:\n\nAll GCs are full GCs.\nNew nodes in small node pages are marked as NEWSXP on creation.\nAfter a GC all free nodes that are not of type NEWSXP are marked as type FREESXP and their previous type is recorded.\nMost calls to accessor functions check their SEXP inputs and SEXP outputs and signal an error if a FREESXP is found. The address of the node and the old type are included in the error message.\n\nR CMD check --use-gct can be set to use gctorture2(n) rather than gctorture(TRUE) by setting environment variable _R_CHECK_GCT_N_ to a positive integer value to be used as n.\nUsed with a debugger and with gctorture or gctorture2 this mechanism can be helpful in isolating memory protect problems.\n\n\n4.3.2 Using Valgrind\nIf you have access to Linux on a common CPU type or supported versions of FreeBSD or Solaris2 you can use valgrind (https://valgrind.org/, pronounced to rhyme with ‘tinned’) to check for possible problems. To run some examples under valgrind use something like\n2 The macOS support is for long-obsolete versions.R -d valgrind --vanilla < mypkg-Ex.R\nR -d \"valgrind --tool=memcheck --leak-check=full\" --vanilla < mypkg-Ex.R\nwhere mypkg-Ex.R is a set of examples, e.g. 
the file created in mypkg.Rcheck by R CMD check. Occasionally this reports memory reads of ‘uninitialised values’ that are the result of compiler optimization, so can be worth checking under an unoptimized compile: for maximal information use a build with debugging symbols. We know there will be some small memory leaks from readline and R itself — these are memory areas that are in use right up to the end of the R session. Expect this to run around 20x slower than without valgrind, and in some cases much slower than that. Several versions of valgrind were not happy with some optimized BLAS libraries that use CPU-specific instructions so you may need to build a version of R specifically to use with valgrind.\nOn platforms where valgrind and its headers3 are installed you can build a version of R with extra instrumentation to help valgrind detect errors in the use of memory allocated from the R heap. The configure option is --with-valgrind-instrumentation=level, where level is 0, 1 or 2. Level 0 is the default and does not add anything. Level 1 will detect some uses4 of uninitialised memory and has little impact on speed (compared to level 0). Level 2 will detect many other memory-use bugs5 but make R much slower when running under valgrind. 
Using this in conjunction with gctorture can be even more effective (and even slower).\n3 in some distributions packaged separately, for example as valgrind-devel.4 Those in some numeric, logical, integer, raw, complex vectors and in memory allocated by R_alloc.5 including using the data sections of R vectors after they are freed.An example of valgrind output is\n==12539== Invalid read of size 4\n==12539== at 0x1CDF6CBE: csc_compTr (Mutils.c:273)\n==12539== by 0x1CE07E1E: tsc_transpose (dtCMatrix.c:25)\n==12539== by 0x80A67A7: do_dotcall (dotcode.c:858)\n==12539== by 0x80CACE2: Rf_eval (eval.c:400)\n==12539== by 0x80CB5AF: R_execClosure (eval.c:658)\n==12539== by 0x80CB98E: R_execMethod (eval.c:760)\n==12539== by 0x1B93DEFA: R_standardGeneric (methods_list_dispatch.c:624)\n==12539== by 0x810262E: do_standardGeneric (objects.c:1012)\n==12539== by 0x80CAD23: Rf_eval (eval.c:403)\n==12539== by 0x80CB2F0: Rf_applyClosure (eval.c:573)\n==12539== by 0x80CADCC: Rf_eval (eval.c:414)\n==12539== by 0x80CAA03: Rf_eval (eval.c:362)\n==12539== Address 0x1C0D2EA8 is 280 bytes inside a block of size 1996 alloc'd\n==12539== at 0x1B9008D1: malloc (vg_replace_malloc.c:149)\n==12539== by 0x80F1B34: GetNewPage (memory.c:610)\n==12539== by 0x80F7515: Rf_allocVector (memory.c:1915)\n...\nThis example is from an instrumented version of R, while tracking down a bug in the Matrix package in 2006. The first line indicates that R has tried to read 4 bytes from a memory address that it does not have access to. This is followed by a C stack trace showing where the error occurred. Next is a description of the memory that was accessed. It is inside a block allocated by malloc, called from GetNewPage, that is, in the internal R heap. Since this memory all belongs to R, valgrind would not (and did not) detect the problem in an uninstrumented build of R. 
In this example the stack trace was enough to isolate and fix the bug, which was in tsc_transpose, and in this example running under gctorture() did not provide any additional information.\nvalgrind is good at spotting the use of uninitialized values: use option --track-origins=yes to show where these originated from. What it cannot detect is the misuse of arrays allocated on the stack: this includes C automatic variables and some6 Fortran arrays.\n6 small fixed-size arrays by default in gfortran, for example.It is possible to run all the examples, tests and vignettes covered by R CMD check under valgrind by using the option --use-valgrind. If you do this you will need to select the valgrind options some other way, for example by having a ~/.valgrindrc file containing\n--leak-check=full\n--track-origins=yes\nor setting the environment variable VALGRIND_OPTS. As from R 4.2.0, --use-valgrind also uses valgrind when re-building the vignettes.\nThis section has described the use of memtest, the default (and most useful) of valgrinds tools. There are others described in its documentation: helgrind can be useful for threaded programs.\n\n\n4.3.3 Using the Address Sanitizer\nAddressSanitizer (‘ASan’) is a tool with similar aims to the memory checker in valgrind. It is available with suitable builds7 of gcc and clang on common Linux and macOS platforms. See https://clang.llvm.org/docs/UsersManual.html#controlling-code-generation, https://clang.llvm.org/docs/AddressSanitizer.html and https://github.com/google/sanitizers.\n7 currently on x86_64/ix86 Linux and FreeBSD, with some support for Intel macOS but not with the toolchain normally used with R. (There is a faster variant, HWASAN, for aarch64 only.) 
On some platforms the runtime library, libasan, needs to be installed separately, and for checking C++ you may also need libubsan.8 see https://llvm.org/devmtg/2014-04/PDFs/LightningTalks/EuroLLVM%202014%20--%20container%20overflow.pdf.More thorough checks of C++ code are done if the C++ library has been ‘annotated’: at the time of writing this applied to std::vector in libc++ for use with clang and gives rise to container-overflow8 reports.\nIt requires code to have been compiled and linked with -fsanitize=address and compiling with -fno-omit-frame-pointer will give more legible reports. It has a runtime penalty of 2–3x, extended compilation times and uses substantially more memory, often 1–2GB, at run time. On 64-bit platforms it reserves (but does not allocate) 16–20TB of virtual memory: restrictive shell settings can cause problems. It can be helpful to increase the stack size, for example to 40MB.\nBy comparison with valgrind, ASan can detect misuse of stack and global variables but not the use of uninitialized memory.\nRecent versions return symbolic addresses for the location of the error provided llvm-symbolizer9 is on the path: if it is available but not on the path or has been renamed10, one can use an environment variable, e.g.\n9 part of the LLVM project and distributed in llvm RPMs and .debs on Linux. It is not currently shipped by Apple.10 as Ubuntu has been said to do.ASAN_SYMBOLIZER_PATH=/path/to/llvm-symbolizer\nAn alternative is to pipe the output through asan_symbolize.py11 and perhaps then (for compiled C++ code) c++filt. 
(On macOS, you may need to run dsymutil to get line-number reports.)\n11 installed on some Linux systems as asan_symbolize, and obtainable from https://github.com/llvm/llvm-project/blob/main/compiler-rt/lib/asan/scripts/asan_symbolize.py: it makes use of llvm-symbolizer if available.The simplest way to make use of this is to build a version of R with something like\nCC=\"gcc -std=gnu99 -fsanitize=address\"\nCFLAGS=\"-fno-omit-frame-pointer -g -O2 -Wall -pedantic -mtune=native\"\nwhich will ensure that the libasan run-time library is compiled into the R executable. However this check can be enabled on a per-package basis by using a ~/.R/Makevars file like\nCC = gcc -std=gnu99 -fsanitize=address -fno-omit-frame-pointer\nCXX = g++ -fsanitize=address -fno-omit-frame-pointer\nFC = gfortran -fsanitize=address\n(Note that -fsanitize=address has to be part of the compiler specification to ensure it is used for linking. These settings will not be honoured by packages which ignore ~/.R/Makevars.) It will be necessary to build R with\nMAIN_LDFLAGS = -fsanitize=address\nto link the runtime libraries into the R executable if it was not specified as part of CC when R was built. (For some builds without OpenMP, -pthread is also required.)\nFor options available via the environment variable ASAN_OPTIONS see https://github.com/google/sanitizers/wiki/AddressSanitizerFlags. 
With gcc additional control is available via the --param flag: see its man page.\nFor more detailed information on an error, R can be run under a debugger with a breakpoint set before the address sanitizer report is produced: for gdb or lldb you could use\nbreak __asan_report_error\n(See https://github.com/google/sanitizers/wiki/AddressSanitizerAndDebugger.)\nMore recent versions12 added the flag -fsanitize-address-use-after-scope: see https://github.com/google/sanitizers/wiki/AddressSanitizerUseAfterScope.\n12 including gcc 7.1 and clang 4.0.0: for gcc it is implied by -fsanitize=address.13 for example, X11/GL libraries on Linux, seen when checking package rgl and some others using it—a workaround is to set environment variable RGL_USE_NULL=true.One of the checks done by ASan is that malloc/free and in C++ new/delete and new[]/delete[] are used consistently (rather than say free being used to deallocate memory allocated by new[]). This matters on some systems but not all: unfortunately on some of those where it does not matter, system libraries13 are not consistent. The check can be suppressed by including alloc_dealloc_mismatch=0 in ASAN_OPTIONS.\nASan also checks system calls and sometimes reports can refer to problems in the system software and not the package nor R. A couple of reports have been of ‘heap-use-after-free’ errors in the X11 libraries called from Tcl/Tk.\n\n\n4.3.4 Using the Leak Sanitizer\nFor x86_64 Linux there is a leak sanitizer, ‘LSan’: see https://github.com/google/sanitizers/wiki/AddressSanitizerLeakSanitizer. This is available on recent versions of gcc and clang, and where available is compiled in as part of ASan.\nOne way to invoke this from an ASan-enabled build is by the environment variable\nASAN_OPTIONS='detect_leaks=1'\nHowever, this was made the default as from clang 3.5 and gcc 5.1.0.\nWhen LSan is enabled, leaks give the process a failure error status (by default 23). 
For an R package this means the R process, and as the parser retains some memory to the end of the process, if R itself was built against ASan all runs will have a failure error status (which may include running R as part of building R itself).\nTo disable this, allocation-mismatch checking and some strict C++ checking use\nsetenv ASAN_OPTIONS 'alloc_dealloc_mismatch=0:detect_leaks=0:detect_odr_violation=0'\nLSan also has a ‘stand-alone’ mode where it is compiled in using -fsanitize=leak and avoids the run-time overhead of ASan.\n\n\n4.3.5 Using the Undefined Behaviour Sanitizer\n‘Undefined behaviour’ is where the language standard does not require particular behaviour from the compiler. Examples include division by zero (where for doubles R requires the ISO/IEC 60559 behaviour but C/C++ do not), use of zero-length arrays, shifts too far for signed types (e.g. int x, y; y = x << 31;), out-of-range coercion, invalid C++ casts and mis-alignment. Not uncommon examples of out-of-range coercion in R packages are attempts to coerce a NaN or infinity to type int or NA_INTEGER to an unsigned type such as size_t. Also common is y[x - 1] forgetting that x might be NA_INTEGER.\n‘UBSanitizer’ is a tool for C/C++ source code selected by -fsanitize=undefined in suitable builds14 of clang and GCC. Its (main) runtime library is linked into each package’s DLL, so it is less often needed to be included in MAIN_LDFLAGS. 
Platforms supported by clang are listed at https://clang.llvm.org/docs/UndefinedBehaviorSanitizer.html#supported-platforms: CRAN uses it for C/C++ with both GCC and clang on x86_64 Linux: the two toolchains often highlight different things with more reports from clang than GCC.\n14 On some platforms the runtime library, libubsan, needs to be installed separately.This sanitizer may be combined with the Address Sanitizer by -fsanitize=undefined,address (where both are supported, and we have seen library conflicts for clang 17 and later).\nFiner control of what is checked can be achieved by other options.\nFor clang see https://clang.llvm.org/docs/UndefinedBehaviorSanitizer.html#ubsan-checks. The current set is (on a single line):\n-fsanitize=alignment,bool,bounds,builtin,enum,float-cast-overflow,\nfloat-divide-by-zero,function,implicit-unsigned-integer-truncation,\nimplicit-signed-integer-truncation,implicit-integer-sign-change,\ninteger-divide-by-zero,nonnull-attribute,null,nullability-arg,\nnullability-assign,nullability-return,object-size,\npointer-overflow,return,returns-nonnull-attribute,shift,\nsigned-integer-overflow,unreachable,unsigned-integer-overflow,\nunsigned-shift-base,vla-bound,vptr\n(plus the more specific versions array-bounsds, local-bounds, shift-base and shift-exponent), or use something like\n-fsanitize=undefined -fno-sanitize=float-divide-by-zero\nwhere in recent versions -fno-sanitize=float-divide-by-zero is the default.\nOptions return and vptr apply only to C++: to use vptr its run-time library needs to be linked into the main R executable by building the latter with something like\nMAIN_LD=\"clang++ -fsanitize=undefined\"\nOption float-divide-by-zero is undesirable for use with R which allow such divisions as part of IEC 60559 arithmetic, and in versions of clang since June 2019 it is no longer part of -fsanitize=undefined.\nThere are also groups of options implicit-integer-truncation, mplicit-integer-arithmetic-value-change, 
implicit-conversion, integer and nullability.\nFor GCC see https://gcc.gnu.org/onlinedocs/gcc/Instrumentation-Options.html (or the manual for your version of GCC, installed or via https://gcc.gnu.org/onlinedocs/: look for ‘Program Instrumentation Options’) for the options supported by GCC: versions 13.x supported\n-fsanitize=alignment,bool,bounds,builtin,enum,integer-divide-by-zero,\nnonnull-attribute,null,object-size,pointer-overflow,return,\nreturns-nonnull-attribute,shift,signed-integer-overflow,\nunreachable,vla-bound,vptr\nplus the more specific versions shift-base and shift-exponent and non-default options\nbounds-strict,float-cast-overflow,float-divide-by-zero\nwhere float-divide-by-zero is not desirable for R uses and bounds-strict is an extension of bounds.\nOther useful flags include\n-no-fsanitize-recover\nwhich causes the first report to be fatal (it always is for the unreachable and return suboptions). For more detailed information on where the runtime error occurs, using\nsetenv UBSAN_OPTIONS 'print_stacktrace=1'\nwill include a traceback in the report. Beyond that, R can be run under a debugger with a breakpoint set before the sanitizer report is produced: for gdb or lldb you could use\nbreak __ubsan_handle_float_cast_overflow\nbreak __ubsan_handle_float_cast_overflow_abort\nor similar (there are handlers for each type of undefined behaviour).\nThere are also the compiler flags -fcatch-undefined-behavior and -ftrapv, said to be more reliable in clang than gcc.\nFor more details on the topic see https://blog.regehr.org/archives/213 and https://blog.llvm.org/2011/05/what-every-c-programmer-should-know.html (which has 3 parts).\nIt may or may not be possible to build R itself with -fsanitize=undefined: problems have in the past been seen with OpenMP-using code with gcc but there has been success with clang up to version 16.. However, problems have been seen with clang 17 and later, including missing entry points and R builds hanging. 
What has succeeded is to use UBSAN just for the package under test (and not in combination with ASAN). To do so, check with an unaltered R, using a custom Makevars file something like\nCC = clang -fsanitize=undefined -fno-sanitize=float-divide-by-zero -fno-omit-frame-pointer\nCXX = clang++ -fsanitize=undefined -fno-sanitize=float-divide-by-zero -fno-omit-frame-pointer -frtti\n\nUBSAN_DIR = /path/to/LLVM18/lib/clang/18/lib/x86_64-unknown-linux-gnu\nSAN_LIBS = $(UBSAN_DIR)/libclang_rt.ubsan_standalone.a $(UBSAN_DIR)/libclang_rt.ubsan_standalone_cxx.a\nwhich links the UBSAN libraries statically into the package-under-test’s DSO. It is also possible to use the dynamic library via\nSAN_LIBS = -L$(UBSAN_DIR) -Wl,-rpath,$(UBSAN_DIR) -lclang_rt.ubsan_standalone\nprovided UBSAN_DIR is added to the runtime library path (as shown or using LD_LIBRARY_PATH). N.B.: The details, especially the paths used, have changed several times recently.\n\n\n4.3.6 Other analyses with ‘clang’\nRecent versions of clang on x86_64 Linux have ‘ThreadSanitizer’ (https://github.com/google/sanitizers/wiki#threadsanitizer), a ‘data race detector for C/C++ programs’, and ‘MemorySanitizer’ (https://clang.llvm.org/docs/MemorySanitizer.html, https://github.com/google/sanitizers) for the detection of uninitialized memory. Both are based on and provide similar functionality to tools in valgrind.\nclang has a ‘Static Analyzer’ which can be run on the source files during compilation: see https://clang-analyzer.llvm.org/.\n\n\n4.3.7 Other analyses with ‘gcc’\nGCC 10 introduced a new flag -fanalyzer which does static analysis during compilation, currently for C code. It is regarded as experimental and it may slow down computation considerably when problems are found (and use many GB of resident memory). 
There is some overlap with problems detected by the Undefined Behaviour sanitizer, but some issues are only reported by this tool and as it is a static analysis, it does not rely on code paths being exercised.\nSee https://gcc.gnu.org/onlinedocs/gcc-10.1.0/gcc/Static-Analyzer-Options.html (or the documentation for your version of gcc if later) and https://developers.redhat.com/blog/2020/03/26/static-analysis-in-gcc-10\n\n\n4.3.8 Using ‘Dr. Memory’\n‘Dr. Memory’ from https://drmemory.org/ is a memory checker for (currently) Windows, Linux and macOS with similar aims to valgrind. It works with unmodified executables15 and detects memory access errors, uninitialized reads and memory leaks.\n15 but works better if inlining and frame pointer optimizations are disabled.\n\n4.3.9 Fortran array bounds checking\nMost of the Fortran compilers used with R allow code to be compiled with checking of array bounds: for example gfortran has option -fbounds-check. This will give an error when the upper or lower bound is exceeded, e.g.\nAt line 97 of file .../src/appl/dqrdc2.f\nFortran runtime error: Index '1' of dimension 1 of array 'x' above upper bound of 0\nOne does need to be aware that lazy programmers often specify Fortran dimensions as 1 rather than * or a real bound and these will be reported (as may * dimensions)\nIt is easy to arrange to use this check on just the code in your package: add to ~/.R/Makevars something like (for gfortran)\nFFLAGS = -g -O2 -mtune=native -fbounds-check\nwhen you run R CMD check.\nThis may report errors with the way that Fortran character variables are passed, particularly when Fortran subroutines are called from C code and character lengths are not passed (see Fortran character strings).",
+ "text": "4.3 Checking memory access\nErrors in memory allocation and reading/writing outside arrays are very common causes of crashes (e.g., segfaults) on some machines. Often the crash appears long after the invalid memory access: in particular damage to the structures which R itself has allocated may only become apparent at the next garbage collection (or even at later garbage collections after objects have been deleted).\nNote that memory access errors may be seen with LAPACK, BLAS, OpenMP and Java-using packages: some at least of these seem to be intentional, and some are related to passing characters to Fortran.\nSome of these tools can detect mismatched allocation and deallocation. C++ programmers should note that memory allocated by new [] must be freed by delete [], other uses of new by delete, and memory allocated by malloc, calloc and realloc by free. Some platforms will tolerate mismatches (perhaps with memory leaks) but others will segfault.\n\n4.3.1 Using gctorture\nWe can help to detect memory problems in R objects earlier by running garbage collection as often as possible. This is achieved by gctorture(TRUE), which as described on its help page\n\nProvokes garbage collection on (nearly) every memory allocation. Intended to ferret out memory protection bugs. Also makes R run very slowly, unfortunately.\n\nThe reference to ‘memory protection’ is to missing C-level calls to PROTECT/UNPROTECT (see Handling the effects of garbage collection) which if missing allow R objects to be garbage-collected when they are still in use. But it can also help with other memory-related errors.\nNormally running under gctorture(TRUE) will just produce a crash earlier in the R program, hopefully close to the actual cause. 
See the next section for how to decipher such crashes.\nIt is possible to run all the examples, tests and vignettes covered by R CMD check under gctorture(TRUE) by using the option --use-gct.\nThe function gctorture2 provides more refined control over the GC torture process. Its arguments step, wait and inhibit_release are documented on its help page. Environment variables can also be used at the start of the R session to turn on GC torture: R_GCTORTURE corresponds to the step argument to gctorture2, R_GCTORTURE_WAIT to wait, and R_GCTORTURE_INHIBIT_RELEASE to inhibit_release.\nIf R is configured with --enable-strict-barrier then a variety of tests for the integrity of the write barrier are enabled. In addition tests to help detect protect issues are enabled:\n\nAll GCs are full GCs.\nNew nodes in small node pages are marked as NEWSXP on creation.\nAfter a GC all free nodes that are not of type NEWSXP are marked as type FREESXP and their previous type is recorded.\nMost calls to accessor functions check their SEXP inputs and SEXP outputs and signal an error if a FREESXP is found. The address of the node and the old type are included in the error message.\n\nR CMD check --use-gct can be set to use gctorture2(n) rather than gctorture(TRUE) by setting environment variable _R_CHECK_GCT_N_ to a positive integer value to be used as n.\nUsed with a debugger and with gctorture or gctorture2 this mechanism can be helpful in isolating memory protect problems.\n\n\n4.3.2 Using Valgrind\nIf you have access to Linux on a common CPU type or supported versions of FreeBSD or Solaris2 you can use valgrind (https://valgrind.org/, pronounced to rhyme with ‘tinned’) to check for possible problems. To run some examples under valgrind use something like\n2 The macOS support is for long-obsolete versions.R -d valgrind --vanilla < mypkg-Ex.R\nR -d \"valgrind --tool=memcheck --leak-check=full\" --vanilla < mypkg-Ex.R\nwhere mypkg-Ex.R is a set of examples, e.g. 
the file created in mypkg.Rcheck by R CMD check. Occasionally this reports memory reads of ‘uninitialised values’ that are the result of compiler optimization, so can be worth checking under an unoptimized compile: for maximal information use a build with debugging symbols. We know there will be some small memory leaks from readline and R itself — these are memory areas that are in use right up to the end of the R session. Expect this to run around 20x slower than without valgrind, and in some cases much slower than that. Several versions of valgrind were not happy with some optimized BLAS libraries that use CPU-specific instructions so you may need to build a version of R specifically to use with valgrind.\nOn platforms where valgrind and its headers3 are installed you can build a version of R with extra instrumentation to help valgrind detect errors in the use of memory allocated from the R heap. The configure option is --with-valgrind-instrumentation=level, where level is 0, 1 or 2. Level 0 is the default and does not add anything. Level 1 will detect some uses4 of uninitialised memory and has little impact on speed (compared to level 0). Level 2 will detect many other memory-use bugs5 but make R much slower when running under valgrind. 
Using this in conjunction with gctorture can be even more effective (and even slower).\n3 in some distributions packaged separately, for example as valgrind-devel.4 Those in some numeric, logical, integer, raw, complex vectors and in memory allocated by R_alloc.5 including using the data sections of R vectors after they are freed.An example of valgrind output is\n==12539== Invalid read of size 4\n==12539== at 0x1CDF6CBE: csc_compTr (Mutils.c:273)\n==12539== by 0x1CE07E1E: tsc_transpose (dtCMatrix.c:25)\n==12539== by 0x80A67A7: do_dotcall (dotcode.c:858)\n==12539== by 0x80CACE2: Rf_eval (eval.c:400)\n==12539== by 0x80CB5AF: R_execClosure (eval.c:658)\n==12539== by 0x80CB98E: R_execMethod (eval.c:760)\n==12539== by 0x1B93DEFA: R_standardGeneric (methods_list_dispatch.c:624)\n==12539== by 0x810262E: do_standardGeneric (objects.c:1012)\n==12539== by 0x80CAD23: Rf_eval (eval.c:403)\n==12539== by 0x80CB2F0: Rf_applyClosure (eval.c:573)\n==12539== by 0x80CADCC: Rf_eval (eval.c:414)\n==12539== by 0x80CAA03: Rf_eval (eval.c:362)\n==12539== Address 0x1C0D2EA8 is 280 bytes inside a block of size 1996 alloc'd\n==12539== at 0x1B9008D1: malloc (vg_replace_malloc.c:149)\n==12539== by 0x80F1B34: GetNewPage (memory.c:610)\n==12539== by 0x80F7515: Rf_allocVector (memory.c:1915)\n...\nThis example is from an instrumented version of R, while tracking down a bug in the Matrix package in 2006. The first line indicates that R has tried to read 4 bytes from a memory address that it does not have access to. This is followed by a C stack trace showing where the error occurred. Next is a description of the memory that was accessed. It is inside a block allocated by malloc, called from GetNewPage, that is, in the internal R heap. Since this memory all belongs to R, valgrind would not (and did not) detect the problem in an uninstrumented build of R. 
In this example the stack trace was enough to isolate and fix the bug, which was in tsc_transpose, and running under gctorture() did not provide any additional information.\nvalgrind is good at spotting the use of uninitialized values: use option --track-origins=yes to show where these originated from. What it cannot detect is the misuse of arrays allocated on the stack: this includes C automatic variables and some6 Fortran arrays.\n6 small fixed-size arrays by default in gfortran, for example.\nIt is possible to run all the examples, tests and vignettes covered by R CMD check under valgrind by using the option --use-valgrind. If you do this you will need to select the valgrind options some other way, for example by having a ~/.valgrindrc file containing\n--leak-check=full\n--track-origins=yes\nor setting the environment variable VALGRIND_OPTS. As from R 4.2.0, --use-valgrind also uses valgrind when re-building the vignettes.\nThis section has described the use of memcheck, the default (and most useful) of valgrind's tools. There are others described in its documentation: helgrind can be useful for threaded programs.\n\n\n4.3.3 Using the Address Sanitizer\nAddressSanitizer (‘ASan’) is a tool with similar aims to the memory checker in valgrind. It is available with suitable builds7 of gcc and clang on common Linux and macOS platforms. See https://clang.llvm.org/docs/UsersManual.html#controlling-code-generation, https://clang.llvm.org/docs/AddressSanitizer.html and https://github.com/google/sanitizers.\n7 currently on x86_64/ix86 Linux and FreeBSD, with some support for macOS – see https://developer.apple.com/documentation/xcode/diagnosing-memory-thread-and-crash-issues-early. (There is a faster variant, HWASAN, for aarch64 only.) 
On some platforms the runtime library, libasan, needs to be installed separately, and for checking C++ you may also need libubsan.8 see https://llvm.org/devmtg/2014-04/PDFs/LightningTalks/EuroLLVM%202014%20--%20container%20overflow.pdf.More thorough checks of C++ code are done if the C++ library has been ‘annotated’: at the time of writing this applied to std::vector in libc++ for use with clang and gives rise to container-overflow8 reports.\nIt requires code to have been compiled and linked with -fsanitize=address and compiling with -fno-omit-frame-pointer will give more legible reports. It has a runtime penalty of 2–3x, extended compilation times and uses substantially more memory, often 1–2GB, at run time. On 64-bit platforms it reserves (but does not allocate) 16–20TB of virtual memory: restrictive shell settings can cause problems. It can be helpful to increase the stack size, for example to 40MB.\nBy comparison with valgrind, ASan can detect misuse of stack and global variables but not the use of uninitialized memory.\nRecent versions return symbolic addresses for the location of the error provided llvm-symbolizer9 is on the path: if it is available but not on the path or has been renamed10, one can use an environment variable, e.g.\n9 part of the LLVM project and distributed in llvm RPMs and .debs on Linux. It is not currently shipped by Apple.10 as Ubuntu has been said to do.ASAN_SYMBOLIZER_PATH=/path/to/llvm-symbolizer\nAn alternative is to pipe the output through asan_symbolize.py11 and perhaps then (for compiled C++ code) c++filt. 
(On macOS, you may need to run dsymutil to get line-number reports.)\n11 installed on some Linux systems as asan_symbolize, and obtainable from https://github.com/llvm/llvm-project/blob/main/compiler-rt/lib/asan/scripts/asan_symbolize.py: it makes use of llvm-symbolizer if available.The simplest way to make use of this is to build a version of R with something like\nCC=\"gcc -std=gnu99 -fsanitize=address\"\nCFLAGS=\"-fno-omit-frame-pointer -g -O2 -Wall -pedantic -mtune=native\"\nwhich will ensure that the libasan run-time library is compiled into the R executable. However this check can be enabled on a per-package basis by using a ~/.R/Makevars file like\nCC = gcc -std=gnu99 -fsanitize=address -fno-omit-frame-pointer\nCXX = g++ -fsanitize=address -fno-omit-frame-pointer\nFC = gfortran -fsanitize=address\n(Note that -fsanitize=address has to be part of the compiler specification to ensure it is used for linking. These settings will not be honoured by packages which ignore ~/.R/Makevars.) It will be necessary to build R with\nMAIN_LDFLAGS = -fsanitize=address\nto link the runtime libraries into the R executable if it was not specified as part of CC when R was built. (For some builds without OpenMP, -pthread is also required.)\nFor options available via the environment variable ASAN_OPTIONS see https://github.com/google/sanitizers/wiki/AddressSanitizerFlags. 
With gcc additional control is available via the --param flag: see its man page.\nFor more detailed information on an error, R can be run under a debugger with a breakpoint set before the address sanitizer report is produced: for gdb or lldb you could use\nbreak __asan_report_error\n(See https://github.com/google/sanitizers/wiki/AddressSanitizerAndDebugger.)\nMore recent versions12 added the flag -fsanitize-address-use-after-scope: see https://github.com/google/sanitizers/wiki/AddressSanitizerUseAfterScope.\n12 including gcc 7.1 and clang 4.0.0: for gcc it is implied by -fsanitize=address.13 for example, X11/GL libraries on Linux, seen when checking package rgl and some others using it—a workaround is to set environment variable RGL_USE_NULL=true.One of the checks done by ASan is that malloc/free and in C++ new/delete and new[]/delete[] are used consistently (rather than say free being used to deallocate memory allocated by new[]). This matters on some systems but not all: unfortunately on some of those where it does not matter, system libraries13 are not consistent. The check can be suppressed by including alloc_dealloc_mismatch=0 in ASAN_OPTIONS.\nASan also checks system calls and sometimes reports can refer to problems in the system software and not the package nor R. A couple of reports have been of ‘heap-use-after-free’ errors in the X11 libraries called from Tcl/Tk.\n\n\n4.3.4 Using the Leak Sanitizer\nFor x86_64 Linux there is a leak sanitizer, ‘LSan’: see https://github.com/google/sanitizers/wiki/AddressSanitizerLeakSanitizer. This is available on recent versions of gcc and clang, and where available is compiled in as part of ASan.\nOne way to invoke this from an ASan-enabled build is by the environment variable\nASAN_OPTIONS='detect_leaks=1'\nHowever, this was made the default as from clang 3.5 and gcc 5.1.0.\nWhen LSan is enabled, leaks give the process a failure error status (by default 23). 
For an R package this means the R process, and as the parser retains some memory to the end of the process, if R itself was built against ASan all runs will have a failure error status (which may include running R as part of building R itself).\nTo disable this, allocation-mismatch checking and some strict C++ checking use\nsetenv ASAN_OPTIONS 'alloc_dealloc_mismatch=0:detect_leaks=0:detect_odr_violation=0'\nLSan also has a ‘stand-alone’ mode where it is compiled in using -fsanitize=leak and avoids the run-time overhead of ASan.\n\n\n4.3.5 Using the Undefined Behaviour Sanitizer\n‘Undefined behaviour’ is where the language standard does not require particular behaviour from the compiler. Examples include division by zero (where for doubles R requires the ISO/IEC 60559 behaviour but C/C++ do not), use of zero-length arrays, shifts too far for signed types (e.g. int x, y; y = x << 31;), out-of-range coercion, invalid C++ casts and mis-alignment. Not uncommon examples of out-of-range coercion in R packages are attempts to coerce a NaN or infinity to type int or NA_INTEGER to an unsigned type such as size_t. Also common is y[x - 1] forgetting that x might be NA_INTEGER.\n‘UBSanitizer’ is a tool for C/C++ source code selected by -fsanitize=undefined in suitable builds14 of clang and GCC. Its (main) runtime library is linked into each package’s DLL, so it is less often needed to be included in MAIN_LDFLAGS. Platforms supported by clang are listed at https://clang.llvm.org/docs/UndefinedBehaviorSanitizer.html#supported-platforms: CRAN uses it for C/C++ with both GCC and clang on x86_64 Linux: the two toolchains often highlight different things with more reports from clang than GCC.\n14 On some platforms the runtime library, libubsan, needs to be installed separately. 
For macOS, see https://developer.apple.com/documentation/xcode/diagnosing-memory-thread-and-crash-issues-early.\nThis sanitizer may be combined with the Address Sanitizer by -fsanitize=undefined,address (where both are supported, and we have seen library conflicts for clang 17 and later).\nFiner control of what is checked can be achieved by other options.\nFor clang see https://clang.llvm.org/docs/UndefinedBehaviorSanitizer.html#ubsan-checks. The current set is (on a single line):\n-fsanitize=alignment,bool,bounds,builtin,enum,float-cast-overflow,\nfloat-divide-by-zero,function,implicit-unsigned-integer-truncation,\nimplicit-signed-integer-truncation,implicit-integer-sign-change,\ninteger-divide-by-zero,nonnull-attribute,null,nullability-arg,\nnullability-assign,nullability-return,object-size,\npointer-overflow,return,returns-nonnull-attribute,shift,\nsigned-integer-overflow,unreachable,unsigned-integer-overflow,\nunsigned-shift-base,vla-bound,vptr\n(plus the more specific versions array-bounds, local-bounds, shift-base and shift-exponent), or use something like\n-fsanitize=undefined -fno-sanitize=float-divide-by-zero\nwhere in recent versions -fno-sanitize=float-divide-by-zero is the default.\nOptions return and vptr apply only to C++: to use vptr its run-time library needs to be linked into the main R executable by building the latter with something like\nMAIN_LD=\"clang++ -fsanitize=undefined\"\nOption float-divide-by-zero is undesirable for use with R, which allows such divisions as part of IEC 60559 arithmetic, and in versions of clang since June 2019 it is no longer part of -fsanitize=undefined.\nThere are also groups of options implicit-integer-truncation, implicit-integer-arithmetic-value-change, implicit-conversion, integer and nullability.\nFor GCC see https://gcc.gnu.org/onlinedocs/gcc/Instrumentation-Options.html (or the manual for your version of GCC, installed or via https://gcc.gnu.org/onlinedocs/: look for ‘Program Instrumentation Options’) for the 
options supported by GCC: versions 13.x supported\n-fsanitize=alignment,bool,bounds,builtin,enum,integer-divide-by-zero,\nnonnull-attribute,null,object-size,pointer-overflow,return,\nreturns-nonnull-attribute,shift,signed-integer-overflow,\nunreachable,vla-bound,vptr\nplus the more specific versions shift-base and shift-exponent and non-default options\nbounds-strict,float-cast-overflow,float-divide-by-zero\nwhere float-divide-by-zero is not desirable for R uses and bounds-strict is an extension of bounds.\nOther useful flags include\n-fno-sanitize-recover\nwhich causes the first report to be fatal (it always is for the unreachable and return suboptions). For more detailed information on where the runtime error occurs, using\nsetenv UBSAN_OPTIONS 'print_stacktrace=1'\nwill include a traceback in the report. Beyond that, R can be run under a debugger with a breakpoint set before the sanitizer report is produced: for gdb or lldb you could use\nbreak __ubsan_handle_float_cast_overflow\nbreak __ubsan_handle_float_cast_overflow_abort\nor similar (there are handlers for each type of undefined behaviour).\nThere are also the compiler flags -fcatch-undefined-behavior and -ftrapv, said to be more reliable in clang than gcc.\nFor more details on the topic see https://blog.regehr.org/archives/213 and https://blog.llvm.org/2011/05/what-every-c-programmer-should-know.html (which has 3 parts).\nIt may or may not be possible to build R itself with -fsanitize=undefined: problems have in the past been seen with OpenMP-using code with gcc but there has been success with clang up to version 16. However, problems have been seen with clang 17 and later, including missing entry points and R builds hanging. What has succeeded is to use UBSAN just for the package under test (and not in combination with ASAN). 
To do so, check with an unaltered R, using a custom Makevars file something like\nCC = clang -fsanitize=undefined -fno-sanitize=float-divide-by-zero -fno-omit-frame-pointer\nCXX = clang++ -fsanitize=undefined -fno-sanitize=float-divide-by-zero -fno-omit-frame-pointer -frtti\n\nUBSAN_DIR = /path/to/LLVM18/lib/clang/18/lib/x86_64-unknown-linux-gnu\nSAN_LIBS = $(UBSAN_DIR)/libclang_rt.ubsan_standalone.a $(UBSAN_DIR)/libclang_rt.ubsan_standalone_cxx.a\nwhich links the UBSAN libraries statically into the package-under-test’s DSO. It is also possible to use the dynamic library via\nSAN_LIBS = -L$(UBSAN_DIR) -Wl,-rpath,$(UBSAN_DIR) -lclang_rt.ubsan_standalone\nprovided UBSAN_DIR is added to the runtime library path (as shown or using LD_LIBRARY_PATH). N.B.: The details, especially the paths used, have changed several times recently.\n\n\n4.3.6 Other analyses with ‘clang’\nRecent versions of clang on x86_64 Linux have ‘ThreadSanitizer’ (https://github.com/google/sanitizers/wiki#threadsanitizer), a ‘data race detector for C/C++ programs’, and ‘MemorySanitizer’ (https://clang.llvm.org/docs/MemorySanitizer.html, https://github.com/google/sanitizers) for the detection of uninitialized memory. Both are based on and provide similar functionality to tools in valgrind.\nclang has a ‘Static Analyzer’ which can be run on the source files during compilation: see https://clang-analyzer.llvm.org/.\n\n\n4.3.7 Other analyses with ‘gcc’\nGCC 10 introduced a new flag -fanalyzer which does static analysis during compilation, currently for C code. It is regarded as experimental and it may slow down computation considerably when problems are found (and use many GB of resident memory). 
There is some overlap with problems detected by the Undefined Behaviour sanitizer, but some issues are only reported by this tool and as it is a static analysis, it does not rely on code paths being exercised.\nSee https://gcc.gnu.org/onlinedocs/gcc-10.1.0/gcc/Static-Analyzer-Options.html (or the documentation for your version of gcc if later) and https://developers.redhat.com/blog/2020/03/26/static-analysis-in-gcc-10.\n\n\n4.3.8 Using ‘Dr. Memory’\n‘Dr. Memory’ from https://drmemory.org/ is a memory checker for (currently) Windows, Linux and macOS with similar aims to valgrind. It works with unmodified executables15 and detects memory access errors, uninitialized reads and memory leaks.\n15 but works better if inlining and frame pointer optimizations are disabled.\n\n4.3.9 Fortran array bounds checking\nMost of the Fortran compilers used with R allow code to be compiled with checking of array bounds: for example gfortran has option -fbounds-check. This will give an error when the upper or lower bound is exceeded, e.g.\nAt line 97 of file .../src/appl/dqrdc2.f\nFortran runtime error: Index '1' of dimension 1 of array 'x' above upper bound of 0\nOne does need to be aware that lazy programmers often specify Fortran dimensions as 1 rather than * or a real bound and these will be reported (as may * dimensions).\nIt is easy to arrange to use this check on just the code in your package: add to ~/.R/Makevars something like (for gfortran)\nFFLAGS = -g -O2 -mtune=native -fbounds-check\nwhen you run R CMD check.\nThis may report errors with the way that Fortran character variables are passed, particularly when Fortran subroutines are called from C code and character lengths are not passed (see Fortran character strings).",
"crumbs": [
"4 Debugging"
]