Add a Linux ARM64 builder node #255
Hi @martin-g, Thanks for the PR. After some internal discussion we decided that it would be easier if you configured kunpeng1 as a standalone builder (as opposed to a non-standalone builder that participates in our official daily builds), at least for now. See our Prepare-Ubuntu-20.04-HOWTO.md document for the difference between a standalone builder and a non-standalone builder (a.k.a. secondary builder). The major differences between the standalone and non-standalone setups are:
Note that the standalone setup requires installing and running the Apache server on the machine. In addition, you will need to make the following changes to your current PR:
Once you are ready to run the builds, try to run them manually first before you run them via a crontab. First try to run the
If everything goes as expected, this will produce a long output showing progress. If it completes successfully (this might take an hour or two), then the last lines of output should be something like this:
Next thing to try is the
This might take between 1 and 4 days, or even more, to complete, depending on how powerful your machine is and how you've set the If it completes successfully, then the last lines of output should be something like this:
Last step is to run the
If this completes successfully (should only take a couple of minutes or less), then the last lines of output should be something like this:
The HTML report should be available at http://localhost/BBS/3.17/bioc/report/ Let us know how it goes. Best, |
@hpages Thanks for your detailed guide. @martin-g helped configure and launch BBS on a local aarch64 Linux machine. We finally got the first report, and I have uploaded it here so that others can take a look (just for review):
We took a quick look at the check results. The report mainly contains the following kinds of errors: 1. (1320 occurrences) WARNING about
|
The first step in making a builder is going through the appropriate BiocPkgTools |
You will need ~20GB of disk for that installation? Maybe more, maybe less. Be sure not to fill the disk. Also manage the TMPDIR variable well. |
There is more than 350GB of disk space! I don't quite understand whether I need to do something manually with BiocPkgTools PkgDependency, or whether the first report build has already done it and there will be fewer such issues in the second and following build runs? |
If there are a lot of failures due to unavailable packages that have not been dropped from CRAN or Bioc, I think you can conclude that the builder just wasn't set up with the 4000+ packages assumed to be available for all installations to succeed. So get the list of all packages and use BiocManager to install them all ... it will avoid redundancies and can use multicore via Ncpus= argument. |
Herve will have more definitive information |
I have installed all But I haven't seen anything explaining that I need to install 4000+ R packages from CRAN. Maybe this is the step I missed ?! |
You don't need to install any package manually, that would be wild! 😉 The build system normally takes care of installing all Bioconductor software packages + their deps. That's 4000+ packages! Seems that you actually have them: https://yikun.github.io/bioconductor/report/kunpeng1-R-instpkgs.html (to get to that page, click on the link under Installed pkgs at the top of the main page of the report). About those warnings:
Never seen them before but I suspect they're kind of related to the use of the Anyways, these look like spurious warnings to me. An easy way to get rid of them, if you wanted to (and if my diagnostic is correct), would be to nuke the Best, |
@hpages Ah, thanks for your reply. I remember that I encountered an issue with the new version (4.2.x) of R. We ended up setting R_LIBS_SITE manually to solve it. This might also be related. |
@hpages Looks like it works: https://yikun.github.io/bioconductor-0201/report/long-report.html There are still some errors, but I think we are on the right track! |
I wonder why
|
@Yikun Great! Looks like tximportData failed to install. What do you see in the logs for this? Look at |
@hpages Hmm, looks like it is due to a network issue... |
make sure
or some such setting is in force for the builder |
Yeah Anyways 10 min to download 73.8 MB seems extremely slow to me, especially for a server that runs the builds. Was probably some intermittent network outage. |
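For a sense of scale, the figures quoted above (73.8 MB in 10 minutes) work out to roughly 1 Mbit/s. A quick sketch of that arithmetic in Python, assuming that "MB" means 2^20 bytes (as R's download messages appear to, since 265248 bytes is reported as 259 KB in the R output quoted in this thread) and that the transfer really used the full 10 minutes:

```python
# Figures quoted in the comment above; everything else is derived from them.
size_mb = 73.8        # download size, ASSUMED to be in units of 2**20 bytes
elapsed_s = 10 * 60   # 10 minutes, ASSUMED to be the full transfer time

bytes_per_s = size_mb * 2**20 / elapsed_s
kib_per_s = bytes_per_s / 1024
mbit_per_s = bytes_per_s * 8 / 1e6

print(f"{kib_per_s:.0f} KiB/s, about {mbit_per_s:.2f} Mbit/s")
```

That is far below what a build server would normally see, which is consistent with the suspicion of an intermittent network problem.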
@vjcitn @hpages Thanks! Besides this one, there are still many R package install timeouts [1]. The build machine is in the Singapore region, and I'm not sure whether it is too far from the bioc repo. Anyway, I have set the [1] https://gist.github.com/Yikun/e9101cb48396f2a698374394cdca525c |
I'm not sure why the package install timeout is
but I already set
So just curious where to set / configure / import the |
bioconductor.org is served via Amazon CloudFront so physical distance from the bioc repo should not matter. You can see this with |
BBS has its own timeout limit of 40 min per command (INSTALL, BUILD, CHECK). You can change this via environment variables |
Here are the latest results: https://yikun.github.io/bioconductor-0208/report/long-report.html According to the latest results: 1. 94+ INSTALL ERROR:
2. 240+ BUILD ERROR:
3. CHECK ERROR: some check errors are due to limited network and performance. We are going to upgrade from 8 cores to 32 cores and rerun the job to see whether there is any improvement. |
I'm just going to focus on INSTALL errors for now. Furthermore, I'm just going to focus on CRAN packages that failed to install for now. Unfortunately, the build report doesn't provide the details for these failures. So here is what I did: The online report contains a tarball,
So 31 packages didn't get installed. To get the list:
This produces the following output:
Now some of those packages are CRAN packages (e.g. animation) and others are Bioconductor packages (e.g. ChemmineR). Focusing on CRAN packages for now: CRAN packages can fail to install for different reasons. One common reason is that the package got removed from CRAN. This is the case for example for EntropyExplorer (required by Bioconductor package epihet): https://cran.r-project.org/package=EntropyExplorer Other CRAN packages that got removed are: kmlShape (required by Bioconductor package tscR), mGSZ (required by Bioconductor package ASpediaFI), mppa (required by Bioconductor package NBSplice), propr (required by Bioconductor package timeOmics), ReorderCluster (required by Bioconductor package AneuFinder), spatstat.core (required by Bioconductor package Statial), spp (required by Bioconductor package ChIC), and taRifx (required by Bioconductor package pulsedSilac). Note that those removals from CRAN cause the same breakage on our daily builds: https://bioconductor.org/checkResults/3.17/bioc-LATEST/ Not much we can do, so we're just gonna have to ignore those failures. Other CRAN packages that didn't install are: magick, rsvg, multipanelfigure, and summarytools. The Note that magick and rsvg have system requirements: For multipanelfigure and summarytools I have no idea what happened. Wanna share Thanks, |
@hpages Wow, thanks for your investigation. I will take a closer look when I am back at my desktop. FYI, I also uploaded the install details earlier: So we can see that the multipanelfigure and summarytools failures are due to the unavailable magick package. rsvg seems to fail due to deps: https://github.com/Yikun/yikun.github.com/blob/master/bioconductor-0208/products-in/kunpeng1/install/rsvg.install-out.txt No idea about magick: https://github.com/Yikun/yikun.github.com/blob/master/bioconductor-0208/products-in/kunpeng1/install/magick.install-out.txt |
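The triage discussed above (finding which packages never finished installing) can be sketched in Python. This is only an illustration, not the commands used in the thread: it assumes a local directory of per-package logs named `<pkg>.install-out.txt` (the naming seen in the links above) and assumes that a successful install log contains a `* DONE (<pkg>)` line, as in the R output quoted elsewhere in this conversation. The function name `failed_installs` and the demo log contents are made up.

```python
import re
import tempfile
from pathlib import Path

LOG_SUFFIX = ".install-out.txt"
# ASSUMPTION: a successful install log contains a line "* DONE (<pkg>)".
DONE_RE = re.compile(r"^\* DONE \(([^)]+)\)\s*$", re.MULTILINE)


def failed_installs(install_dir):
    """Return sorted package names whose install log has no matching DONE line."""
    failed = []
    for log in Path(install_dir).glob("*" + LOG_SUFFIX):
        pkg = log.name[: -len(LOG_SUFFIX)]
        text = log.read_text(encoding="utf-8", errors="replace")
        if pkg not in DONE_RE.findall(text):
            failed.append(pkg)
    return sorted(failed)


# Demo on two made-up logs (contents are illustrative, not taken from the report).
with tempfile.TemporaryDirectory() as tmp:
    tmp_path = Path(tmp)
    (tmp_path / ("BiocManager" + LOG_SUFFIX)).write_text(
        "* installing *source* package 'BiocManager' ...\n* DONE (BiocManager)\n"
    )
    (tmp_path / ("rsvg" + LOG_SUFFIX)).write_text(
        "* installing *source* package 'rsvg' ...\nERROR: configuration failed\n"
    )
    demo_failed = failed_installs(tmp_path)

print(demo_failed)
```

Pointing `failed_installs` at an unpacked copy of the real install directory would give a list to compare against the build report.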
BTW, |
Very useful, thanks!
Good to know. That reduces the CRAN installation failures to address to magick and rsvg only.
Easy to fix:
They don't have an error message that is as clear/useful as rsvg, but it looks to me that this is a similar issue. What does
You're right, thanks for the catch. Looks like I didn't miss any CRAN package installation failures though, only a few Bioconductor package installation failures, so I was lucky. But good to know for next time. If you re-run the builds (after installing the missing external libs for magick and rsvg), you should get much cleaner INSTALL results. Next step we'll focus on the remaining INSTALL failures that are ARM64-specific, i.e. that we see on your report but not on nebbiolo1. H. |
I installed them, and will re-run the builds soon:
Looks like we also need to add them to ? |
Ok. Are you able to install magick and rsvg manually? Start R and do
Aren't they here already? BBS/Ubuntu-files/20.04/apt_cran.txt Line 17 in 2df9b05
BBS/Ubuntu-files/20.04/apt_cran.txt Line 4 in 2df9b05
Make sure everything listed in all the files in |
Yes, I can install successfully!
Ah, I will double-check! Also cc @martin-g |
54 packages with an INSTALL failure on the latest report. That's good progress! These failures fall in 3 categories:
The big number of
Yes, the fact that the basic M1 chip has only 8 logical cores is a serious limitation as far as the builds are concerned 😞 H. |
Yes, the failures fall into the categories below, and we have started working with upstream to fix them:
Actually, this is separate work that has already made pretty good progress with AHUG (Arm HPC User Group). From the shared results we can see that the problems with CRAN on aarch64 Linux are nothing too serious.
Thanks, we have already upgraded the node to a more powerful flavor (32 cores, 64G memory). Currently we set
Or could you give some more suggestions? |
Probably a good start. Let's see how it goes and you will be able to adjust if needed based on the results. |
@Yikun Thanks for CCing me, the work for getting CRAN to build is very much active. At the current stage we've offered our dual socket TX2 nodes for build checking and there seems to be some interest from CRAN maintainers. During SC22 we've discussed with AWS the requirements to conduct these checks in a sustainable way as the package count is still growing. The issues you've mentioned pretty much match what we're seeing on CRAN as well (the AHUG presentation talks about these too):
Here's a non-exhaustive list I've compiled of bad CRAN installs on AArch64 with AlmaLinux 8 (we're doing this for HPC use cases, hence RHEL derivative)
|
Here is the latest report: https://yikun.github.io/bioconductor-0301/report/long-report.html 31 packages with an INSTALL failure on the latest report.
In the last month, 8 packages completed the Linux arm64 fix: bgx, Rbowtie2, FLAMES, gmapR, LEA, msa, Rbwa and SAIGEgds. (Many thanks for @hpages's guidance and for the package maintainers' review and help!) There are still 5 fixes that need to be merged upstream:
|
Only 5 Linux ARM64 specific INSTALL failures compared to 13 three weeks ago. That's great progress! Next we'll need to start looking at Linux ARM64 specific BUILD and CHECK failures like basilisk, biobtreeR, decoupleR, DESeq2, etc... Unfortunately there are many of those. In some cases the error seems to be due to slightly different results produced by operations that involve floating point arithmetic. These are not going to be easy to troubleshoot 😟 |
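Small platform-dependent differences of this kind can arise, for example, when a compiler fuses a multiply and an add into a single instruction that rounds once instead of twice. Whether that is what happens in any of these packages is an assumption; the thread does not say. The Python sketch below simulates both evaluation orders with exact rational arithmetic (Python is not claimed to emit fused instructions here) and shows why a tolerance-based comparison is more robust than exact equality:

```python
import math
from fractions import Fraction

# Inputs chosen so that a*b = 1 - 2**-60, which is not representable as a double.
a = 1.0 + 2.0**-30
b = 1.0 - 2.0**-30
c = -1.0

# Two roundings: a*b is rounded to the nearest double (1.0) before c is added.
unfused = a * b + c

# One rounding: a*b + c is computed exactly, then rounded once.
# This SIMULATES what a fused multiply-add would return.
fused = float(Fraction(a) * Fraction(b) + Fraction(c))

print("unfused:", unfused)
print("fused:  ", fused)
print("exactly equal:  ", unfused == fused)
print("within 1e-12:   ", math.isclose(unfused, fused, abs_tol=1e-12))
```

A unit test that compares such results with exact equality can pass on one architecture and fail on another, while a comparison with an explicit tolerance passes on both. The right tolerance depends on the computation and has to be chosen per package.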
Hello, Here is a summary of the first report for Bioconductor 3.18:
|
ca3ee5f to b1a92d4
Could someone please check the new configs for 3.18 in this PR ?
Trying to re-install it still uses 3.17:
Do we need to update something more ? For the last run the installation of |
Sorry, I am having trouble trying to view the repository on GitHub, but when it's not using the right version of Bioconductor from the Spring to Fall cycle, we do `BiocManager::install(version="devel")`.
…On Thu, May 11, 2023 at 9:26 AM Martin Grigorov ***@***.***> wrote:
Could someone please check the new configs for 3.18 in this PR ?
We ask because we have some doubts whether the latest report really uses
Bioc 3.18.
$ R
...
> BiocManager::version()
[1] ‘3.17’
Trying to re-install it still uses 3.17:
> install.packages("BiocManager")
trying URL 'https://cloud.r-project.org/src/contrib/BiocManager_1.30.20.tar.gz'
Content type 'application/x-gzip' length 265248 bytes (259 KB)
==================================================
downloaded 259 KB
* installing *source* package 'BiocManager' ...
** package 'BiocManager' successfully unpacked and MD5 sums checked
** using staged installation
** R
** inst
** byte-compile and prepare package for lazy loading
** help
*** installing help indices
** building package indices
** installing vignettes
** testing if installed package can be loaded from temporary location
** testing if installed package can be loaded from final location
** testing if installed package keeps a record of temporary installation path
* DONE (BiocManager)
The downloaded source packages are in
'/tmp/RtmpXa8oMh/downloaded_packages'
Updating HTML index of packages in '.Library'
Making 'packages.html' ... done
> BiocManager::version()
[1] '3.17'
Do we need to update something more ?
Thanks !
|
Thank you, @jwokaty ! |
I just re-tested all For example:
|
I want to thank all people involved in adding support for Linux ARM64! |
Great work! We are also considering using ARM64 machines for our work! |
Thank you for the nice words, @emiliofernandes & @markjens ! Here is a link to the latest (from 31.05.2023) run of BBS on Ubuntu ARM64 - https://yikun.github.io/latest-bioc/report/long-report.html |
@emiliofernandes @markjens We're working on adding Linux ARM64 to the official BBS runs. See #292 Thanks for your interest! |
6fe9740 to cde01f1
Name: kunpeng1 Signed-off-by: Martin Tzvetanov Grigorov <[email protected]>
The files are copied from nebbiolo1. According to https://github.com/Bioconductor/BBS/blob/master/Doc/Prepare-Ubuntu-20.04-HOWTO.md#25-add-software-builds-to-biocbuilds-crontab those should be added to crontab. But for some reason only nebbiolo1 has them. None of the other builders have these scripts. Signed-off-by: Martin Tzvetanov Grigorov <[email protected]>
For the time being kunpeng1 won't be used as a secondary builder Signed-off-by: Martin Tzvetanov Grigorov <[email protected]>
Signed-off-by: Martin Tzvetanov Grigorov <[email protected]>
Signed-off-by: Martin Tzvetanov Grigorov <[email protected]>
Signed-off-by: Martin Tzvetanov Grigorov <[email protected]>
c35c02c to fab0d17
First Bioconductor daily report with kunpeng2 results: https://bioconductor.org/checkResults/3.18/bioc-LATEST/long-report.html 🎉 @martin-g @Yikun Can we close this PR? It's been superseded by #293 |
Thank you all! |