{2023.06}[2023a,sapphire_rapids] Dependencies of PyTorch 2.1.2 #879


bedroge
Collaborator

@bedroge bedroge commented Jan 23, 2025

I'm looking into the test failures from #875. One of the culprits may be Z3; see easybuilders/easybuild-easyconfigs#20222. This PR adds all dependencies except Z3. I'll do a test build after this one to find out whether the new Z3 version makes a difference.

@bedroge bedroge added 2023.06-software.eessi.io 2023.06 version of software.eessi.io sapphire_rapids labels Jan 23, 2025

eessi-bot bot commented Jan 23, 2025

Instance eessi-bot-mc-aws is configured to build for:

  • architectures: x86_64/generic, x86_64/intel/haswell, x86_64/intel/sapphire_rapids, x86_64/intel/skylake_avx512, x86_64/amd/zen2, x86_64/amd/zen3, aarch64/generic, aarch64/neoverse_n1, aarch64/neoverse_v1
  • repositories: eessi.io-2023.06-software, eessi.io-2023.06-compat


eessi-bot bot commented Jan 23, 2025

Instance eessi-bot-mc-azure is configured to build for:

  • architectures: x86_64/amd/zen4
  • repositories: eessi.io-2023.06-software, eessi.io-2023.06-compat

@bedroge
Collaborator Author

bedroge commented Jan 23, 2025

bot: build repo:eessi.io-2023.06-software arch:x86_64/intel/sapphire_rapids


eessi-bot bot commented Jan 23, 2025

Updates by the bot instance eessi-bot-mc-aws (click for details)
  • received bot command build repo:eessi.io-2023.06-software arch:x86_64/intel/sapphire_rapids from bedroge

    • expanded format: build repository:eessi.io-2023.06-software architecture:x86_64/intel/sapphire_rapids
  • handling command build repository:eessi.io-2023.06-software architecture:x86_64/intel/sapphire_rapids resulted in:


eessi-bot bot commented Jan 23, 2025

Updates by the bot instance eessi-bot-mc-azure (click for details)
  • received bot command build repo:eessi.io-2023.06-software arch:x86_64/intel/sapphire_rapids from bedroge

    • expanded format: build repository:eessi.io-2023.06-software architecture:x86_64/intel/sapphire_rapids
  • handling command build repository:eessi.io-2023.06-software architecture:x86_64/intel/sapphire_rapids resulted in:

    • no jobs were submitted
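
The "expanded format" the bot echoes back above (`repo` becomes `repository`, `arch` becomes `architecture`) can be illustrated with a minimal sketch. The function name `expand_bot_command` and the abbreviation table are assumptions made for illustration, covering only the two abbreviations visible in this thread; this is not the eessi-bot's actual parser.

```python
# Hypothetical sketch: NOT the eessi-bot implementation.
# Only the two abbreviations seen in this thread are covered.
ABBREVIATIONS = {"repo": "repository", "arch": "architecture"}


def expand_bot_command(line: str) -> str:
    """Expand 'bot: build repo:X arch:Y' into the long form shown by the bot."""
    prefix = "bot:"
    if not line.startswith(prefix):
        raise ValueError("not a bot command")
    # First token after the prefix is the action, the rest are key:value filters.
    action, *args = line[len(prefix):].split()
    expanded = []
    for arg in args:
        key, sep, value = arg.partition(":")
        if not sep:
            raise ValueError(f"expected key:value, got {arg!r}")
        # Unknown keys are passed through unchanged.
        expanded.append(f"{ABBREVIATIONS.get(key, key)}:{value}")
    return " ".join([action, *expanded])


if __name__ == "__main__":
    print(expand_bot_command(
        "bot: build repo:eessi.io-2023.06-software arch:x86_64/intel/sapphire_rapids"
    ))
```

Run against the command used in this PR, the sketch reproduces the expanded string quoted in the bot's reply.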


eessi-bot bot commented Jan 23, 2025

New job on instance eessi-bot-mc-aws for CPU micro-architecture x86_64-intel-sapphire_rapids for repository eessi.io-2023.06-software in job dir /project/def-users/SHARED/jobs/2025.01/pr_879/42198

date job status comment
Jan 23 09:03:11 UTC 2025 submitted job id 42198 awaits release by job manager
Jan 23 09:03:18 UTC 2025 released job awaits launch by Slurm scheduler
Jan 23 09:08:20 UTC 2025 running job 42198 is running
Jan 23 09:53:07 UTC 2025 finished
😢 FAILURE (click triangle for details)
Details
✅ job output file slurm-42198.out
✅ no message matching FATAL:
❌ found message matching ERROR:
✅ no message matching FAILED:
✅ no message matching required modules missing:
❌ no message matching No missing installations
✅ found message matching .tar.gz created!
Artefacts
eessi-2023.06-software-linux-x86_64-intel-sapphire_rapids-1737625518.tar.gz
size: 27 MiB (28914118 bytes)
entries: 5163
modules under 2023.06/software/linux/x86_64/intel/sapphire_rapids/modules/all
expecttest/0.1.5-GCCcore-12.3.0.lua
GMP/6.2.1-GCCcore-12.3.0.lua
gmpy2/2.1.5-GCC-12.3.0.lua
libyaml/0.2.5-GCCcore-12.3.0.lua
MPC/1.3.1-GCCcore-12.3.0.lua
MPFR/4.2.0-GCCcore-12.3.0.lua
networkx/3.1-gfbf-2023a.lua
Pillow/10.0.0-GCCcore-12.3.0.lua
PyYAML/6.0-GCCcore-12.3.0.lua
sympy/1.12-gfbf-2023a.lua
software under 2023.06/software/linux/x86_64/intel/sapphire_rapids/software
expecttest/0.1.5-GCCcore-12.3.0
GMP/6.2.1-GCCcore-12.3.0
gmpy2/2.1.5-GCC-12.3.0
libyaml/0.2.5-GCCcore-12.3.0
MPC/1.3.1-GCCcore-12.3.0
MPFR/4.2.0-GCCcore-12.3.0
networkx/3.1-gfbf-2023a
Pillow/10.0.0-GCCcore-12.3.0
PyYAML/6.0-GCCcore-12.3.0
sympy/1.12-gfbf-2023a
other under 2023.06/software/linux/x86_64/intel/sapphire_rapids
no other files in tarball
Jan 23 09:53:07 UTC 2025 test result
😁 SUCCESS (click triangle for details)
ReFrame Summary
[ OK ] (1/8) EESSI_OSU_coll %benchmark_info=mpi.collective.osu_allreduce %module_name=OSU-Micro-Benchmarks/7.2-gompi-2023b %scale=1_node %device_type=cpu /775175bf @BotBuildTests:x86-64-intel-srapids-node+default
P: latency: 1.73 us (r:0, l:None, u:None)
[ OK ] (2/8) EESSI_OSU_coll %benchmark_info=mpi.collective.osu_allreduce %module_name=OSU-Micro-Benchmarks/7.1-1-gompi-2023a %scale=1_node %device_type=cpu /52707c40 @BotBuildTests:x86-64-intel-srapids-node+default
P: latency: 1.74 us (r:0, l:None, u:None)
[ OK ] (3/8) EESSI_OSU_coll %benchmark_info=mpi.collective.osu_alltoall %module_name=OSU-Micro-Benchmarks/7.2-gompi-2023b %scale=1_node %device_type=cpu /b1aacda9 @BotBuildTests:x86-64-intel-srapids-node+default
P: latency: 4.05 us (r:0, l:None, u:None)
[ OK ] (4/8) EESSI_OSU_coll %benchmark_info=mpi.collective.osu_alltoall %module_name=OSU-Micro-Benchmarks/7.1-1-gompi-2023a %scale=1_node %device_type=cpu /c6bad193 @BotBuildTests:x86-64-intel-srapids-node+default
P: latency: 4.02 us (r:0, l:None, u:None)
[ OK ] (5/8) EESSI_OSU_pt2pt_CPU %benchmark_info=mpi.pt2pt.osu_latency %module_name=OSU-Micro-Benchmarks/7.2-gompi-2023b %scale=1_node /15cad6c4 @BotBuildTests:x86-64-intel-srapids-node+default
P: latency: 0.46 us (r:0, l:None, u:None)
[ OK ] (6/8) EESSI_OSU_pt2pt_CPU %benchmark_info=mpi.pt2pt.osu_latency %module_name=OSU-Micro-Benchmarks/7.1-1-gompi-2023a %scale=1_node /6672deda @BotBuildTests:x86-64-intel-srapids-node+default
P: latency: 0.31 us (r:0, l:None, u:None)
[ OK ] (7/8) EESSI_OSU_pt2pt_CPU %benchmark_info=mpi.pt2pt.osu_bw %module_name=OSU-Micro-Benchmarks/7.2-gompi-2023b %scale=1_node /2a9a47b1 @BotBuildTests:x86-64-intel-srapids-node+default
P: bandwidth: 13691.35 MB/s (r:0, l:None, u:None)
[ OK ] (8/8) EESSI_OSU_pt2pt_CPU %benchmark_info=mpi.pt2pt.osu_bw %module_name=OSU-Micro-Benchmarks/7.1-1-gompi-2023a %scale=1_node /1b24ab8e @BotBuildTests:x86-64-intel-srapids-node+default
P: bandwidth: 13895.36 MB/s (r:0, l:None, u:None)
[ PASSED ] Ran 8/8 test case(s) from 8 check(s) (0 failure(s), 0 skipped, 0 aborted)
Details
✅ job output file slurm-42198.out
❌ found message matching ERROR:
✅ no message matching [\s*FAILED\s*].*Ran .* test case
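
The build checklist in the comment above amounts to scanning the job output for a few patterns: some must be absent, others must be present. The sketch below illustrates that logic under stated assumptions. The pattern strings are copied from the checklist, but the function names and the exact matching rules (plain regex search over the whole file) are hypothetical; the thread does not show how the bot actually matches them.

```python
import re

# Patterns copied from the checklist in the bot comment.
# How the bot really matches them is not shown in this thread (assumption).
MUST_BE_ABSENT = [r"FATAL:", r"ERROR:", r"FAILED:", r"required modules missing:"]
MUST_BE_PRESENT = [r"No missing installations", r"\.tar\.gz created!"]


def check_build_output(text: str) -> dict:
    """Return one pass/fail entry per checklist item for the given job output."""
    results = {}
    for pattern in MUST_BE_ABSENT:
        results[f"no message matching {pattern}"] = re.search(pattern, text) is None
    for pattern in MUST_BE_PRESENT:
        results[f"found message matching {pattern}"] = re.search(pattern, text) is not None
    return results


def build_succeeded(text: str) -> bool:
    """A build counts as a success only if every checklist item passes."""
    return all(check_build_output(text).values())


if __name__ == "__main__":
    # Made-up output resembling the failed job: an ERROR line, a tarball,
    # but no 'No missing installations' message.
    sample = "ERROR: something went wrong\nfoo.tar.gz created!\n"
    for item, ok in check_build_output(sample).items():
        print("PASS" if ok else "FAIL", item)
    print("overall:", "SUCCESS" if build_succeeded(sample) else "FAILURE")
```

With this logic a job can be marked as a FAILURE even though a tarball was created, which matches the outcome reported for job 42198.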

@bedroge
Collaborator Author

bedroge commented Jan 23, 2025

bot: build repo:eessi.io-2023.06-software arch:x86_64/intel/sapphire_rapids


eessi-bot bot commented Jan 23, 2025

Updates by the bot instance eessi-bot-mc-aws (click for details)
  • received bot command build repo:eessi.io-2023.06-software arch:x86_64/intel/sapphire_rapids from bedroge

    • expanded format: build repository:eessi.io-2023.06-software architecture:x86_64/intel/sapphire_rapids
  • handling command build repository:eessi.io-2023.06-software architecture:x86_64/intel/sapphire_rapids resulted in:


eessi-bot bot commented Jan 23, 2025

Updates by the bot instance eessi-bot-mc-azure (click for details)
  • received bot command build repo:eessi.io-2023.06-software arch:x86_64/intel/sapphire_rapids from bedroge

    • expanded format: build repository:eessi.io-2023.06-software architecture:x86_64/intel/sapphire_rapids
  • handling command build repository:eessi.io-2023.06-software architecture:x86_64/intel/sapphire_rapids resulted in:

    • no jobs were submitted


eessi-bot bot commented Jan 23, 2025

New job on instance eessi-bot-mc-aws for CPU micro-architecture x86_64-intel-sapphire_rapids for repository eessi.io-2023.06-software in job dir /project/def-users/SHARED/jobs/2025.01/pr_879/42199

date job status comment
Jan 23 10:12:03 UTC 2025 submitted job id 42199 awaits release by job manager
Jan 23 10:12:11 UTC 2025 released job awaits launch by Slurm scheduler
Jan 23 10:13:14 UTC 2025 running job 42199 is running
Jan 23 10:57:00 UTC 2025 finished
😢 FAILURE (click triangle for details)
Details
✅ job output file slurm-42199.out
✅ no message matching FATAL:
❌ found message matching ERROR:
✅ no message matching FAILED:
✅ no message matching required modules missing:
❌ no message matching No missing installations
✅ found message matching .tar.gz created!
Artefacts
eessi-2023.06-software-linux-x86_64-intel-sapphire_rapids-1737629351.tar.gz
size: 27 MiB (29104852 bytes)
entries: 5227
modules under 2023.06/software/linux/x86_64/intel/sapphire_rapids/modules/all
expecttest/0.1.5-GCCcore-12.3.0.lua
GMP/6.2.1-GCCcore-12.3.0.lua
gmpy2/2.1.5-GCC-12.3.0.lua
libyaml/0.2.5-GCCcore-12.3.0.lua
MPC/1.3.1-GCCcore-12.3.0.lua
MPFR/4.2.0-GCCcore-12.3.0.lua
networkx/3.1-gfbf-2023a.lua
Pillow/10.0.0-GCCcore-12.3.0.lua
pytest-flakefinder/1.1.0-GCCcore-12.3.0.lua
pytest-rerunfailures/12.0-GCCcore-12.3.0.lua
PyYAML/6.0-GCCcore-12.3.0.lua
sympy/1.12-gfbf-2023a.lua
software under 2023.06/software/linux/x86_64/intel/sapphire_rapids/software
expecttest/0.1.5-GCCcore-12.3.0
GMP/6.2.1-GCCcore-12.3.0
gmpy2/2.1.5-GCC-12.3.0
libyaml/0.2.5-GCCcore-12.3.0
MPC/1.3.1-GCCcore-12.3.0
MPFR/4.2.0-GCCcore-12.3.0
networkx/3.1-gfbf-2023a
Pillow/10.0.0-GCCcore-12.3.0
pytest-flakefinder/1.1.0-GCCcore-12.3.0
pytest-rerunfailures/12.0-GCCcore-12.3.0
PyYAML/6.0-GCCcore-12.3.0
sympy/1.12-gfbf-2023a
other under 2023.06/software/linux/x86_64/intel/sapphire_rapids
no other files in tarball
Jan 23 10:57:00 UTC 2025 test result
😁 SUCCESS (click triangle for details)
ReFrame Summary
[ OK ] (1/8) EESSI_OSU_coll %benchmark_info=mpi.collective.osu_allreduce %module_name=OSU-Micro-Benchmarks/7.2-gompi-2023b %scale=1_node %device_type=cpu /775175bf @BotBuildTests:x86-64-intel-srapids-node+default
P: latency: 2.42 us (r:0, l:None, u:None)
[ OK ] (2/8) EESSI_OSU_coll %benchmark_info=mpi.collective.osu_allreduce %module_name=OSU-Micro-Benchmarks/7.1-1-gompi-2023a %scale=1_node %device_type=cpu /52707c40 @BotBuildTests:x86-64-intel-srapids-node+default
P: latency: 1.94 us (r:0, l:None, u:None)
[ OK ] (3/8) EESSI_OSU_coll %benchmark_info=mpi.collective.osu_alltoall %module_name=OSU-Micro-Benchmarks/7.2-gompi-2023b %scale=1_node %device_type=cpu /b1aacda9 @BotBuildTests:x86-64-intel-srapids-node+default
P: latency: 3.81 us (r:0, l:None, u:None)
[ OK ] (4/8) EESSI_OSU_coll %benchmark_info=mpi.collective.osu_alltoall %module_name=OSU-Micro-Benchmarks/7.1-1-gompi-2023a %scale=1_node %device_type=cpu /c6bad193 @BotBuildTests:x86-64-intel-srapids-node+default
P: latency: 3.76 us (r:0, l:None, u:None)
[ OK ] (5/8) EESSI_OSU_pt2pt_CPU %benchmark_info=mpi.pt2pt.osu_latency %module_name=OSU-Micro-Benchmarks/7.2-gompi-2023b %scale=1_node /15cad6c4 @BotBuildTests:x86-64-intel-srapids-node+default
P: latency: 0.41 us (r:0, l:None, u:None)
[ OK ] (6/8) EESSI_OSU_pt2pt_CPU %benchmark_info=mpi.pt2pt.osu_latency %module_name=OSU-Micro-Benchmarks/7.1-1-gompi-2023a %scale=1_node /6672deda @BotBuildTests:x86-64-intel-srapids-node+default
P: latency: 0.38 us (r:0, l:None, u:None)
[ OK ] (7/8) EESSI_OSU_pt2pt_CPU %benchmark_info=mpi.pt2pt.osu_bw %module_name=OSU-Micro-Benchmarks/7.2-gompi-2023b %scale=1_node /2a9a47b1 @BotBuildTests:x86-64-intel-srapids-node+default
P: bandwidth: 13840.93 MB/s (r:0, l:None, u:None)
[ OK ] (8/8) EESSI_OSU_pt2pt_CPU %benchmark_info=mpi.pt2pt.osu_bw %module_name=OSU-Micro-Benchmarks/7.1-1-gompi-2023a %scale=1_node /1b24ab8e @BotBuildTests:x86-64-intel-srapids-node+default
P: bandwidth: 13901.91 MB/s (r:0, l:None, u:None)
[ PASSED ] Ran 8/8 test case(s) from 8 check(s) (0 failure(s), 0 skipped, 0 aborted)
Details
✅ job output file slurm-42199.out
❌ found message matching ERROR:
✅ no message matching [\s*FAILED\s*].*Ran .* test case
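
To compare the OSU numbers between runs, the ReFrame summary lines above can be turned into records. The following is a rough sketch that assumes the two-line layout seen in this thread (an `[ OK ]` line followed by a `P:` performance line); the regular expressions and the function name `parse_reframe_summary` are illustrative and not part of ReFrame.

```python
import re

# Assumed layout, taken from the summaries in this thread:
#   [ OK ] (1/8) <test> %benchmark_info=<bench> %module_name=<module> ...
#   P: <metric>: <value> <unit> (r:0, l:None, u:None)
OK_LINE = re.compile(
    r"^\[\s*OK\s*\]\s*\(\d+/\d+\)\s+(\S+).*?%benchmark_info=(\S+).*?%module_name=(\S+)"
)
PERF_LINE = re.compile(r"^P:\s*(\w+):\s*([\d.]+)\s*(\S+)")


def parse_reframe_summary(lines):
    """Pair each '[ OK ]' line with the 'P:' line that follows it."""
    records = []
    current = None
    for raw in lines:
        line = raw.strip()
        ok = OK_LINE.match(line)
        if ok:
            current = {"test": ok.group(1), "benchmark": ok.group(2), "module": ok.group(3)}
            continue
        perf = PERF_LINE.match(line)
        if perf and current is not None:
            records.append({
                **current,
                "metric": perf.group(1),
                "value": float(perf.group(2)),
                "unit": perf.group(3),
            })
            current = None
    return records


if __name__ == "__main__":
    summary = [
        "[ OK ] (1/8) EESSI_OSU_coll %benchmark_info=mpi.collective.osu_allreduce "
        "%module_name=OSU-Micro-Benchmarks/7.2-gompi-2023b %scale=1_node %device_type=cpu "
        "/775175bf @BotBuildTests:x86-64-intel-srapids-node+default",
        "P: latency: 2.42 us (r:0, l:None, u:None)",
    ]
    for record in parse_reframe_summary(summary):
        print(record)
```

Single-node latencies in this thread vary between runs (for example 1.73 us and 2.42 us for the same `osu_allreduce` check), so any comparison built on these records should allow for run-to-run noise.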

@bedroge
Collaborator Author

bedroge commented Jan 23, 2025

bot: build repo:eessi.io-2023.06-software arch:x86_64/intel/sapphire_rapids


eessi-bot bot commented Jan 23, 2025

Updates by the bot instance eessi-bot-mc-aws (click for details)
  • received bot command build repo:eessi.io-2023.06-software arch:x86_64/intel/sapphire_rapids from bedroge

    • expanded format: build repository:eessi.io-2023.06-software architecture:x86_64/intel/sapphire_rapids
  • handling command build repository:eessi.io-2023.06-software architecture:x86_64/intel/sapphire_rapids resulted in:


eessi-bot bot commented Jan 23, 2025

Updates by the bot instance eessi-bot-mc-azure (click for details)
  • received bot command build repo:eessi.io-2023.06-software arch:x86_64/intel/sapphire_rapids from bedroge

    • expanded format: build repository:eessi.io-2023.06-software architecture:x86_64/intel/sapphire_rapids
  • handling command build repository:eessi.io-2023.06-software architecture:x86_64/intel/sapphire_rapids resulted in:

    • no jobs were submitted


eessi-bot bot commented Jan 23, 2025

New job on instance eessi-bot-mc-aws for CPU micro-architecture x86_64-intel-sapphire_rapids for repository eessi.io-2023.06-software in job dir /project/def-users/SHARED/jobs/2025.01/pr_879/42202

date job status comment
Jan 23 12:46:17 UTC 2025 submitted job id 42202 awaits release by job manager
Jan 23 12:47:18 UTC 2025 released job awaits launch by Slurm scheduler
Jan 23 12:53:28 UTC 2025 running job 42202 is running
Jan 23 13:40:27 UTC 2025 finished
😁 SUCCESS (click triangle for details)
Details
✅ job output file slurm-42202.out
✅ no message matching FATAL:
✅ no message matching ERROR:
✅ no message matching FAILED:
✅ no message matching required modules missing:
✅ found message(s) matching No missing installations
✅ found message matching .tar.gz created!
Artefacts
eessi-2023.06-software-linux-x86_64-intel-sapphire_rapids-1737639157.tar.gz
size: 27 MiB (29183855 bytes)
entries: 5261
modules under 2023.06/software/linux/x86_64/intel/sapphire_rapids/modules/all
expecttest/0.1.5-GCCcore-12.3.0.lua
GMP/6.2.1-GCCcore-12.3.0.lua
gmpy2/2.1.5-GCC-12.3.0.lua
libyaml/0.2.5-GCCcore-12.3.0.lua
MPC/1.3.1-GCCcore-12.3.0.lua
MPFR/4.2.0-GCCcore-12.3.0.lua
networkx/3.1-gfbf-2023a.lua
Pillow/10.0.0-GCCcore-12.3.0.lua
pytest-flakefinder/1.1.0-GCCcore-12.3.0.lua
pytest-rerunfailures/12.0-GCCcore-12.3.0.lua
pytest-shard/0.1.2-GCCcore-12.3.0.lua
PyYAML/6.0-GCCcore-12.3.0.lua
sympy/1.12-gfbf-2023a.lua
software under 2023.06/software/linux/x86_64/intel/sapphire_rapids/software
expecttest/0.1.5-GCCcore-12.3.0
GMP/6.2.1-GCCcore-12.3.0
gmpy2/2.1.5-GCC-12.3.0
libyaml/0.2.5-GCCcore-12.3.0
MPC/1.3.1-GCCcore-12.3.0
MPFR/4.2.0-GCCcore-12.3.0
networkx/3.1-gfbf-2023a
Pillow/10.0.0-GCCcore-12.3.0
pytest-flakefinder/1.1.0-GCCcore-12.3.0
pytest-rerunfailures/12.0-GCCcore-12.3.0
pytest-shard/0.1.2-GCCcore-12.3.0
PyYAML/6.0-GCCcore-12.3.0
sympy/1.12-gfbf-2023a
other under 2023.06/software/linux/x86_64/intel/sapphire_rapids
no other files in tarball
Jan 23 13:40:27 UTC 2025 test result
😁 SUCCESS (click triangle for details)
ReFrame Summary
[ OK ] (1/8) EESSI_OSU_coll %benchmark_info=mpi.collective.osu_allreduce %module_name=OSU-Micro-Benchmarks/7.2-gompi-2023b %scale=1_node %device_type=cpu /775175bf @BotBuildTests:x86-64-intel-srapids-node+default
P: latency: 2.04 us (r:0, l:None, u:None)
[ OK ] (2/8) EESSI_OSU_coll %benchmark_info=mpi.collective.osu_allreduce %module_name=OSU-Micro-Benchmarks/7.1-1-gompi-2023a %scale=1_node %device_type=cpu /52707c40 @BotBuildTests:x86-64-intel-srapids-node+default
P: latency: 1.83 us (r:0, l:None, u:None)
[ OK ] (3/8) EESSI_OSU_coll %benchmark_info=mpi.collective.osu_alltoall %module_name=OSU-Micro-Benchmarks/7.2-gompi-2023b %scale=1_node %device_type=cpu /b1aacda9 @BotBuildTests:x86-64-intel-srapids-node+default
P: latency: 4.17 us (r:0, l:None, u:None)
[ OK ] (4/8) EESSI_OSU_coll %benchmark_info=mpi.collective.osu_alltoall %module_name=OSU-Micro-Benchmarks/7.1-1-gompi-2023a %scale=1_node %device_type=cpu /c6bad193 @BotBuildTests:x86-64-intel-srapids-node+default
P: latency: 4.02 us (r:0, l:None, u:None)
[ OK ] (5/8) EESSI_OSU_pt2pt_CPU %benchmark_info=mpi.pt2pt.osu_latency %module_name=OSU-Micro-Benchmarks/7.2-gompi-2023b %scale=1_node /15cad6c4 @BotBuildTests:x86-64-intel-srapids-node+default
P: latency: 0.35 us (r:0, l:None, u:None)
[ OK ] (6/8) EESSI_OSU_pt2pt_CPU %benchmark_info=mpi.pt2pt.osu_latency %module_name=OSU-Micro-Benchmarks/7.1-1-gompi-2023a %scale=1_node /6672deda @BotBuildTests:x86-64-intel-srapids-node+default
P: latency: 0.36 us (r:0, l:None, u:None)
[ OK ] (7/8) EESSI_OSU_pt2pt_CPU %benchmark_info=mpi.pt2pt.osu_bw %module_name=OSU-Micro-Benchmarks/7.2-gompi-2023b %scale=1_node /2a9a47b1 @BotBuildTests:x86-64-intel-srapids-node+default
P: bandwidth: 13226.33 MB/s (r:0, l:None, u:None)
[ OK ] (8/8) EESSI_OSU_pt2pt_CPU %benchmark_info=mpi.pt2pt.osu_bw %module_name=OSU-Micro-Benchmarks/7.1-1-gompi-2023a %scale=1_node /1b24ab8e @BotBuildTests:x86-64-intel-srapids-node+default
P: bandwidth: 13203.49 MB/s (r:0, l:None, u:None)
[ PASSED ] Ran 8/8 test case(s) from 8 check(s) (0 failure(s), 0 skipped, 0 aborted)
Details
✅ job output file slurm-42202.out
✅ no message matching ERROR:
✅ no message matching [\s*FAILED\s*].*Ran .* test case
Jan 23 14:12:51 UTC 2025 uploaded transfer of eessi-2023.06-software-linux-x86_64-intel-sapphire_rapids-1737639157.tar.gz to S3 bucket succeeded
Jan 23 14:13:58 UTC 2025 uploaded transfer of eessi-2023.06-software-linux-x86_64-intel-sapphire_rapids-1737639157.tar.gz to S3 bucket succeeded
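
The three tarballs in this thread list 10, 12, and 13 modules respectively. A small set difference makes the additions per job explicit. The module names below are copied from the artefact listings; the helper `added_modules` is only an illustration.

```python
def added_modules(previous, current):
    """Return the modules present in `current` but not in `previous`, sorted."""
    return sorted(set(current) - set(previous))


# Module lists copied from the artefact overviews of the three jobs.
JOB_42198 = [
    "expecttest/0.1.5-GCCcore-12.3.0",
    "GMP/6.2.1-GCCcore-12.3.0",
    "gmpy2/2.1.5-GCC-12.3.0",
    "libyaml/0.2.5-GCCcore-12.3.0",
    "MPC/1.3.1-GCCcore-12.3.0",
    "MPFR/4.2.0-GCCcore-12.3.0",
    "networkx/3.1-gfbf-2023a",
    "Pillow/10.0.0-GCCcore-12.3.0",
    "PyYAML/6.0-GCCcore-12.3.0",
    "sympy/1.12-gfbf-2023a",
]
JOB_42199 = JOB_42198 + [
    "pytest-flakefinder/1.1.0-GCCcore-12.3.0",
    "pytest-rerunfailures/12.0-GCCcore-12.3.0",
]
JOB_42202 = JOB_42199 + ["pytest-shard/0.1.2-GCCcore-12.3.0"]

if __name__ == "__main__":
    print("42198 -> 42199:", added_modules(JOB_42198, JOB_42199))
    print("42199 -> 42202:", added_modules(JOB_42199, JOB_42202))
```

Job 42199 adds pytest-flakefinder and pytest-rerunfailures, and job 42202 adds pytest-shard on top of that. Job 42202 is also the only one of the three for which the "No missing installations" check passed.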

@bedroge bedroge added bot:deploy Ask bot to deploy missing software installations to EESSI and removed bot:deploy Ask bot to deploy missing software installations to EESSI labels Jan 23, 2025
@boegel boegel merged commit f7c1284 into EESSI:2023.06-software.eessi.io Jan 23, 2025
49 checks passed

eessi-bot bot commented Jan 23, 2025

PR merged! Moved ['/project/def-users/SHARED/jobs/2025.01/pr_879/42198', '/project/def-users/SHARED/jobs/2025.01/pr_879/42199', '/project/def-users/SHARED/jobs/2025.01/pr_879/42202'] to /project/def-users/SHARED/trash_bin/EESSI/software-layer/2025.01.23


eessi-bot bot commented Jan 23, 2025

PR merged! Moved [] to /project/def-users/SHARED/trash_bin/EESSI/software-layer/2025.01.23

@bedroge bedroge deleted the sapphire_rapids_pytorch_212_deps branch January 23, 2025 15:04