RELEASE NOTES FOR SLURM VERSION 21.08
IMPORTANT NOTES:
If using the slurmdbd (Slurm DataBase Daemon) you must update this first.
NOTE: If using a backup DBD you must start the primary first to do any
database conversion; the backup will not start until this has happened.
The 21.08 slurmdbd will work with Slurm daemons of version 20.02 and above.
You will not need to update all clusters at the same time, but it is very
important to update slurmdbd first and have it running before updating
any other clusters making use of it.
Slurm can be upgraded from version 20.02 or 20.11 to version 21.08 without loss
of jobs or other state information. Upgrading directly from an earlier version
of Slurm will result in loss of state information.
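The version rules above can be expressed as a small pre-upgrade check. This is
only an illustrative sketch in Python: the function and constant names are
hypothetical and not part of Slurm, and the only facts encoded are the ones
stated in these notes (20.02 and 20.11 upgrade to 21.08 without state loss; a
21.08 slurmdbd works with daemons of version 20.02 and above).

```python
from typing import Tuple

# Releases that these notes say can be upgraded to 21.08 without losing state.
STATE_SAFE_SOURCES = {(20, 2), (20, 11)}
# Oldest daemon release that a 21.08 slurmdbd will work with, per these notes.
OLDEST_DAEMON_FOR_DBD = (20, 2)


def parse_release(version: str) -> Tuple[int, int]:
    """Turn '20.11.8' or '20.02' into (20, 11) or (20, 2)."""
    parts = version.split(".")
    return int(parts[0]), int(parts[1])


def can_upgrade_without_state_loss(current: str) -> bool:
    """True if a direct upgrade from `current` to 21.08 keeps job state."""
    return parse_release(current) in STATE_SAFE_SOURCES


def works_with_2108_slurmdbd(daemon_version: str) -> bool:
    """True if a daemon of this version can talk to a 21.08 slurmdbd.

    Only the lower bound from the notes is checked; daemons newer than
    21.08 are outside what these notes describe.
    """
    return parse_release(daemon_version) >= OLDEST_DAEMON_FOR_DBD
```

For example, can_upgrade_without_state_loss("19.05.7") is False, which signals
that an intermediate upgrade is needed to preserve state.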
If using SPANK plugins that use the Slurm APIs, they should be recompiled when
upgrading Slurm to a new major release.
HIGHLIGHTS
==========
-- Removed gres/mic plugin used to support Xeon Phi coprocessors.
-- Add LimitFactor to the QOS. A float that is factored into an association's
GrpTRES limits. For example, if the LimitFactor is 2, then an association
with a GrpTRES of 30 CPUs would be allowed to allocate 60 CPUs when
running under this QOS.
-- A job's next_step_id counter now resets to 0 after being requeued.
Previously, the step IDs would continue from the job's last run.
-- API change: Removed slurm_kill_job_msg and modified the function signature
for slurm_kill_job2. slurm_kill_job2 should be used instead of
slurm_kill_job_msg.
-- AccountingStoreFlags=job_script allows you to store the job's batch script.
-- AccountingStoreFlags=job_env allows you to store the job's environment
variables.
-- configure: the --with option handling has been made consistent across the
various optional libraries. Specifying --with-foo=/path/to/foo will only
check that directory for the applicable library (rather than, in some cases,
falling back to the default directories), and will always fail the build
if the library is not found (instead of emitting a mix of error messages and
non-fatal warning messages).
-- configure: replace --with-rmsi_dir option with proper handling for
--with-rsmi=dir.
-- Removed sched/hold plugin.
-- cli_filter/lua, jobcomp/lua, job_submit/lua now load their scripts from the
same directory as the slurm.conf file (and thus now will respect changes
to the SLURM_CONF environment variable).
-- SPANK - call slurm_spank_init if defined without slurm_spank_slurmd_exit in
slurmd context.
-- Add new 'PLANNED' node state to indicate that the backfill scheduler has
planned the node for future use. Previously such a node showed as 'IDLE'.
-- Put node into "INVAL" state upon registering with an invalid node
configuration. Node must register with a valid configuration to continue.
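The LimitFactor arithmetic described in the highlights above can be written out
as a short worked example. This is a sketch only: the function name is
hypothetical, and how Slurm itself rounds a fractional result is not stated in
these notes, so the product is left as a float.

```python
def effective_grp_tres(grp_tres: dict, limit_factor: float) -> dict:
    """Scale each GrpTRES limit of an association by the QOS LimitFactor."""
    return {tres: limit * limit_factor for tres, limit in grp_tres.items()}


# The example from the notes: GrpTRES of 30 CPUs under a QOS with LimitFactor 2.
print(effective_grp_tres({"cpu": 30}, 2.0))  # {'cpu': 60.0}
```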
CONFIGURATION FILE CHANGES (see the appropriate man page for details)
=====================================================================
-- Errors detected in the parser handlers due to invalid configurations are now
propagated and can cause a fatal error in the calling process, which then
exits.
-- Enforce a valid configuration for AccountingStorageEnforce in slurm.conf.
If the configuration is invalid, then an error message will be printed and
the command or daemon (including slurmctld) will not run.
-- Removed AccountingStoreJobComment option. Please update your config to use
AccountingStoreFlags=job_comment instead.
-- Removed DefaultStorage{Host,Loc,Pass,Port,Type,User} options.
-- Removed CacheGroups, CheckpointType, JobCheckpointDir, MemLimitEnforce,
SchedulerPort, SchedulerRootFilter options.
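Before upgrading, it may help to scan an existing slurm.conf for the options
removed above. The following Python sketch is one possible way to do that. The
helper name and data layout are hypothetical, the list of options is taken
from this section only, and the sketch relies on slurm.conf parameter names
being case insensitive.

```python
import re

# Options listed as removed in this section, mapped to the replacement
# given in the notes (None where the notes name no replacement).
REMOVED_OPTIONS = {
    "AccountingStoreJobComment": "AccountingStoreFlags=job_comment",
    "DefaultStorageHost": None,
    "DefaultStorageLoc": None,
    "DefaultStoragePass": None,
    "DefaultStoragePort": None,
    "DefaultStorageType": None,
    "DefaultStorageUser": None,
    "CacheGroups": None,
    "CheckpointType": None,
    "JobCheckpointDir": None,
    "MemLimitEnforce": None,
    "SchedulerPort": None,
    "SchedulerRootFilter": None,
}
_BY_LOWER = {name.lower(): name for name in REMOVED_OPTIONS}


def find_removed_options(conf_text: str) -> list:
    """Return (line number, option name, replacement) for each removed option."""
    hits = []
    for lineno, line in enumerate(conf_text.splitlines(), start=1):
        # Drop comments, then look at every Key= token on the line.
        active = line.split("#", 1)[0].strip()
        for match in re.finditer(r"(\w+)\s*=", active):
            name = _BY_LOWER.get(match.group(1).lower())
            if name is not None:
                hits.append((lineno, name, REMOVED_OPTIONS[name]))
    return hits
```

Each hit reports the line number so the option can be deleted or, in the case
of AccountingStoreJobComment, replaced with the AccountingStoreFlags setting.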
COMMAND CHANGES (see man pages for details)
===========================================
-- Changed the --format handling for negative field widths (left justified)
to apply to the column headers as well as the printed fields.
-- Invalidate multiple partition requests when using partition-based
associations.
-- --cpus-per-task and --threads-per-core now imply --exact.
This fixes issues where steps would be allocated the wrong number of CPUs.
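The --format change above can be illustrated with a small Python sketch of the
justification rule. This is not Slurm's implementation: the function names are
hypothetical, and the right justification of positive widths and the
truncation of over-long text are assumptions made for the illustration. The
point shown is that the header is now padded the same way as the values.

```python
def pad(text: str, width: int) -> str:
    """Pad to abs(width): left justified if width is negative, else right."""
    size = abs(width)
    text = text[:size]  # assumption: over-long text is truncated to the width
    return text.ljust(size) if width < 0 else text.rjust(size)


def render_column(header: str, values: list, width: int) -> list:
    """Apply the same justification to the header and to every value."""
    return [pad(header, width)] + [pad(value, width) for value in values]


for row in render_column("JobID", ["42", "1001"], -8):
    print(repr(row))
```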
API CHANGES
===========
-- jobcomp plugin: change plugin API to jobcomp_p_*().
-- sched plugin: change plugin API to sched_p_*() and remove
slurm_sched_p_initial_priority() call.
-- step_ctx code has been removed from the API.
-- slurm_stepd_get_info()/stepd_get_info() has been removed from the API.