v0.13.0
HyperQueue 0.13.0
New features
Resource management
- Almost complete rewrite of resource management.
CPU and other resources were unified: the most visible change is that you can define "cpus" in the same way as other resources,
and other resources can now be defined in groups (NUMA-like resources).
- Many improvements in the scheduler: improved schedules for multi-resource requests;
better behavior on non-heterogeneous clusters;
better interaction between resources and priorities.
Automatic allocation
- #467 You can now pause (and resume) autoalloc queues using hq alloc pause and hq alloc resume.
Paused queues will not submit new allocations into the selected job manager. They can be resumed later.
When an autoalloc queue hits too many submission or worker execution errors, it will now be paused
instead of being removed.
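
The pause-instead-of-remove behaviour can be pictured as a small state model. The sketch below is illustrative only and is not HyperQueue's implementation: the class AllocQueue, the max_errors threshold, and the counter reset on resume are all assumptions made for the example, since these notes do not state the actual limit or reset policy.

```python
from dataclasses import dataclass
from enum import Enum


class QueueState(Enum):
    RUNNING = "running"
    PAUSED = "paused"


@dataclass
class AllocQueue:
    """Hypothetical model of an autoalloc queue (not the real HQ type)."""

    max_errors: int = 3  # assumed threshold; the real value is not given here
    errors: int = 0
    state: QueueState = QueueState.RUNNING

    def record_error(self) -> None:
        """Count a submission or worker execution error; pause at the limit."""
        self.errors += 1
        if self.errors >= self.max_errors:
            # The queue is kept and paused rather than removed.
            self.state = QueueState.PAUSED

    def can_submit(self) -> bool:
        """Paused queues do not submit new allocations."""
        return self.state is QueueState.RUNNING

    def pause(self) -> None:
        self.state = QueueState.PAUSED

    def resume(self) -> None:
        # Assumption: resuming also clears the error counter.
        self.errors = 0
        self.state = QueueState.RUNNING
```

Because the queue object survives in the paused state, its configuration stays available and it can be resumed once the underlying problem is fixed.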
Tasks
- HQ now lets you limit how many times a task may be in a running state when its worker is lost
(such a task may be a potential source of the worker's crash).
If the limit is reached, the task is marked as failed.
The limit can be configured with --crash-limit in submit.
- Groups of workers are introduced. A multi-node task is now started only on workers from the same group.
By default, workers are grouped by PBS/Slurm allocations, but this can be configured manually.
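
The crash limit described above amounts to a per-task counter. The following is a minimal sketch of that bookkeeping, assuming the task fails as soon as the counter reaches the limit; TaskCrashTracker and its method name are hypothetical and do not come from HyperQueue.

```python
from dataclasses import dataclass


@dataclass
class TaskCrashTracker:
    """Hypothetical per-task counter of workers lost while the task ran."""

    crash_limit: int  # the value that would be given via --crash-limit
    crash_count: int = 0
    failed: bool = False

    def worker_lost_while_running(self) -> bool:
        """Record one lost worker; return True if the task may run again."""
        self.crash_count += 1
        if self.crash_count >= self.crash_limit:
            # Limit reached: the task is marked as failed, not rescheduled.
            self.failed = True
        return not self.failed
```

With a limit of 2, the first lost worker leaves the task eligible for rescheduling and the second marks it as failed, which keeps a task that repeatedly takes down its worker from being retried indefinitely.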
Changes
Resource management
- --cpus=no-ht has been replaced by the flag --no-hyper-threading.
- Explicit list definition of a resource was changed from --resource xxx=list(1,2,3) to --resource xxx=[1,2,3]
(this is the result of unification of CPUs with other resources).
- Python API: Attribute generic in ResourceRequest is renamed to resources.
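
Scripts written for earlier versions need their arguments updated to the new spellings. The helper below is a rough sketch of such a rewrite, not a tool shipped with HyperQueue: migrate_resource and migrate_args are hypothetical names, and only the exact forms quoted above (--cpus=no-ht and xxx=list(...)) are handled.

```python
import re

# Matches the old explicit-list form, e.g. "gpus=list(0,1,2)".
_OLD_LIST = re.compile(r"^(?P<name>[^=]+)=list\((?P<items>[^()]*)\)$")


def migrate_resource(value: str) -> str:
    """Rewrite 'xxx=list(1,2,3)' as 'xxx=[1,2,3]'; leave other values alone."""
    match = _OLD_LIST.match(value)
    if match is None:
        return value
    return f"{match['name']}=[{match['items']}]"


def migrate_args(args: list[str]) -> list[str]:
    """Update a command line from the pre-0.13 spelling to the new one."""
    out: list[str] = []
    it = iter(args)
    for arg in it:
        if arg == "--cpus=no-ht":
            out.append("--no-hyper-threading")
        elif arg == "--resource":
            # The resource definition is the following argument.
            out.append(arg)
            value = next(it, None)
            if value is not None:
                out.append(migrate_resource(value))
        elif arg.startswith("--resource="):
            prefix = "--resource="
            out.append(prefix + migrate_resource(arg[len(prefix):]))
        else:
            out.append(arg)
    return out
```

For example, migrate_args(["hq", "worker", "start", "--cpus=no-ht", "--resource", "gpus=list(0,1)"]) yields the same command with --no-hyper-threading and gpus=[0,1]. Arguments already in the new form pass through unchanged, so the helper can be applied repeatedly.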
Tasks
- #461 When a task is cancelled, times out, or its worker is killed, HyperQueue now tries to make sure that both the task and any processes that
it has spawned are also terminated.
- #480 You can now select multiple tasks in hq task info.
Artifact summary:
- hq-v0.13.0-*: Main HyperQueue build containing the hq binary. Download this archive to
use HyperQueue from the command line.
- hyperqueue-0.13.0-*: Wheel containing the hyperqueue package with HyperQueue Python
bindings.