Releases: It4innovations/hyperqueue
v0.12.0-rc1
HyperQueue 0.12.0-rc1
New features
Automatic allocation
- #457 You can now specify the idle timeout for workers started by the automatic allocator using the `--idle-timeout` flag of the `hq alloc add` command.
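For example, a PBS allocation queue whose workers shut down after ten minutes of inactivity might be added as sketched below; the queue name, project and time values are placeholders, and the exact duration syntax accepted by the flag may differ:

```console
$ hq alloc add pbs --time-limit 1h --idle-timeout 10m -- -q qprod -A PROJECT-ID
```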
Resiliency
- #449 Tasks that were present during multiple worker crashes will now be canceled.
CLI
- #463 You can now wait until `N` workers are connected to the cluster using `hq worker wait N`.
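For instance, to block until at least four workers are available before submitting work (a minimal sketch; the submitted script is a placeholder):

```console
$ hq worker wait 4
$ hq submit --array 1-100 -- ./my-script.sh
```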
Python API
- Improvements to resource requests in the Python API.
Changes
CLI
- #477 Requested resources are now shown when submitting an array job and when viewing information about task `TASK_ID` of job `JOB_ID` using `hq task info JOB_ID TASK_ID` (see the example below).
- #444 The `hq task list` command now hides some details by default to conserve space in terminal output. To show all details, use the `-v` flag to enable verbose output.
- #455 Improved the quality of error messages produced when parsing various CLI parameters, such as resources.
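A quick illustration of the verbose task listing and the per-task detail view; the job and task IDs are placeholders:

```console
$ hq task list 1 -v
$ hq task info 1 3
```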
Automatic allocation
- #448 The automatic allocator will now start workers in multi-node Slurm allocations using `srun --overlap`. This should avoid the started workers taking up Slurm task resources (if possible). If you run into any issues with using `srun` inside HyperQueue tasks, please let us know.
Jobs
- #483 There is no longer a length limit
for job names.
Fixes
Job submission
- #450 Attempts to resubmit a job with zero
tasks will now result in an explicit error, rather than a crash of the client.
Artifact summary:
- hq-v0.12.0-rc1-*: Main HyperQueue build containing the `hq` binary. Download this archive to use HyperQueue from the command line.
- hyperqueue-0.12.0-rc1-*: Wheel containing the `hyperqueue` package with HyperQueue Python bindings.
v0.11.0-ligate1
HyperQueue 0.11.0-ligate1
New features
CLI
- #423 You can now specify the server directory using the `HQ_SERVER_DIR` environment variable.
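A minimal sketch of pointing the server and subsequent commands at a shared server directory (the path is a placeholder):

```console
$ export HQ_SERVER_DIR=/projects/my-project/.hq-server
$ hq server start &
$ hq submit -- hostname
```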
Resource management
- #427 A new specifier has been added for defining indexed pool resources of workers as a set of individual resource indices:

  ```console
  $ hq worker start --resource "gpus=list(1,3,8)"
  ```

- #428 Workers will now attempt to automatically detect available GPU resources from the `CUDA_VISIBLE_DEVICES` environment variable.
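For instance, when the workload manager exports `CUDA_VISIBLE_DEVICES`, starting a worker without an explicit `--resource` flag should pick up the GPUs automatically; the sketch below assumes the detected resource is registered under the `gpus` name used above:

```console
$ CUDA_VISIBLE_DEVICES=0,1 hq worker start
# roughly equivalent to the explicit specification:
$ hq worker start --resource "gpus=list(0,1)"
```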
Stream log
- Basic export of stream log into JSON (`hq log <log_file> export`).
Server
- Improved scheduling of multi-node tasks.
- The server now generates a random unique ID (UID) string every time a new server is started (`hq server start`). It can be used via the `%{SERVER_ID}` placeholder.
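A sketch of using that placeholder when redirecting task output, so that runs under different server instances do not overwrite each other (the placeholder name follows the note above; the path layout and script are illustrative):

```console
$ hq submit --array 1-10 --stdout "output/%{SERVER_ID}/%{TASK_ID}.stdout" -- ./work.sh
```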
Changes
CLI
- #433 (Backwards incompatible change) The CLI command `hq job tasks` has been removed and its functionality has been incorporated into the `hq task list` command instead.
- #420 The shebang (e.g. `#!/bin/bash`) will now be read from the submitted program based on the provided directives mode. If a shebang is found, HQ will execute the program located at the shebang path and pass it the rest of the submitted arguments.

  By default, directives and the shebang will be read from the submitted program only if its filename ends with `.sh`. If you want to explicitly enable reading the shebang, pass `--directives=file` to `hq submit`.

  Another change is that the shebang is now read by the client (i.e. it will be read on the node that submits the job), not on worker nodes as previously. This means that the submitted file has to be accessible on the client node.
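A minimal sketch of the new behaviour for a script whose filename does not end with `.sh`, so that reading of directives and the shebang has to be enabled explicitly (the script contents, including the `#HQ` directive line, are shown only for illustration):

```console
$ cat run-task
#!/bin/bash
#HQ --cpus=2
echo "running on $(hostname)"
$ hq submit --directives=file ./run-task
```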
Resource management
- #427 (Backwards incompatible change) The environment variable `HQ_RESOURCE_INDICES_<resource-name>`, which is passed to tasks with resource requests, has been renamed to `HQ_RESOURCE_VALUES_<resource-name>` (see the sketch after this list).
- #427 (Backwards incompatible change) The specifier for defining indexed pool resources of workers as a range has been renamed from `indices` to `range`:

  ```console
  # before
  $ hq worker start --resource "gpus=indices(1-3)"
  # now
  $ hq worker start --resource "gpus=range(1-3)"
  ```

- #427 The generic resource documentation has been rewritten and improved.
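A sketch of reading the renamed variable inside a task, assuming a generic resource named `gpus` is configured on the workers and requested with the `--resource` submit option described in the generic resource documentation:

```console
$ hq submit --resource gpus=2 -- bash -c 'echo "assigned gpus: $HQ_RESOURCE_VALUES_gpus"'
```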
Artifact summary:
- hq-v0.11.0-ligate1-*: Main HyperQueue build containing the `hq` binary. Download this archive to use HyperQueue from the command line.
- hyperqueue-0.11.0-ligate1-*: Wheel containing the `hyperqueue` package with HyperQueue Python bindings.
v0.11.0
HyperQueue 0.11.0
New features
CLI
- #423 You can now specify the server directory using the `HQ_SERVER_DIR` environment variable.
Resource management
- #427 A new specifier has been added for defining indexed pool resources of workers as a set of individual resource indices:

  ```console
  $ hq worker start --resource "gpus=list(1,3,8)"
  ```

- #428 Workers will now attempt to automatically detect available GPU resources from the `CUDA_VISIBLE_DEVICES` environment variable.
Stream log
- Basic export of stream log into JSON (`hq log <log_file> export`).
Server
- Improved scheduling of multi-node tasks.
- The server now generates a random unique ID (UID) string every time a new server is started (`hq server start`). It can be used via the `%{SERVER_ID}` placeholder.
Changes
CLI
- #433 (Backwards incompatible change) The CLI command `hq job tasks` has been removed and its functionality has been incorporated into the `hq task list` command instead.
- #420 The shebang (e.g. `#!/bin/bash`) will now be read from the submitted program based on the provided directives mode. If a shebang is found, HQ will execute the program located at the shebang path and pass it the rest of the submitted arguments.

  By default, directives and the shebang will be read from the submitted program only if its filename ends with `.sh`. If you want to explicitly enable reading the shebang, pass `--directives=file` to `hq submit`.

  Another change is that the shebang is now read by the client (i.e. it will be read on the node that submits the job), not on worker nodes as previously. This means that the submitted file has to be accessible on the client node.
Resource management
- #427 (Backwards incompatible change) The environment variable `HQ_RESOURCE_INDICES_<resource-name>`, which is passed to tasks with resource requests, has been renamed to `HQ_RESOURCE_VALUES_<resource-name>`.
- #427 (Backwards incompatible change) The specifier for defining indexed pool resources of workers as a range has been renamed from `indices` to `range`:

  ```console
  # before
  $ hq worker start --resource "gpus=indices(1-3)"
  # now
  $ hq worker start --resource "gpus=range(1-3)"
  ```

- #427 The generic resource documentation has been rewritten and improved.
Artifact summary:
- hq-v0.11.0-*: Main HyperQueue build containing the `hq` binary. Download this archive to use HyperQueue from the command line.
- hyperqueue-0.11.0-*: Wheel containing the `hyperqueue` package with HyperQueue Python bindings.
v0.11.0-rc1
HyperQueue 0.11.0-rc1
New features
CLI
- #423 You can now specify the server directory using the `HQ_SERVER_DIR` environment variable.
Resource management
- #427 A new specifier has been added for defining indexed pool resources of workers as a set of individual resource indices:

  ```console
  $ hq worker start --resource "gpus=list(1,3,8)"
  ```

- #428 Workers will now attempt to automatically detect available GPU resources from the `CUDA_VISIBLE_DEVICES` environment variable.
Stream log
- Basic export of stream log into JSON (`hq log <log_file> export`).
Server
- Improved scheduling of multi-node tasks.
- The server now generates a random unique ID (UID) string every time a new server is started (`hq server start`). It can be used via the `%{SERVER_ID}` placeholder.
Changes
CLI
- #433 (Backwards incompatible change) The CLI command `hq job tasks` has been removed and its functionality has been incorporated into the `hq task list` command instead.
- #420 The shebang (e.g. `#!/bin/bash`) will now be read from the submitted program based on the provided directives mode. If a shebang is found, HQ will execute the program located at the shebang path and pass it the rest of the submitted arguments.

  By default, directives and the shebang will be read from the submitted program only if its filename ends with `.sh`. If you want to explicitly enable reading the shebang, pass `--directives=file` to `hq submit`.

  Another change is that the shebang is now read by the client (i.e. it will be read on the node that submits the job), not on worker nodes as previously. This means that the submitted file has to be accessible on the client node.
Resource management
- #427 (Backwards incompatible change) The environment variable `HQ_RESOURCE_INDICES_<resource-name>`, which is passed to tasks with resource requests, has been renamed to `HQ_RESOURCE_VALUES_<resource-name>`.
- #427 (Backwards incompatible change) The specifier for defining indexed pool resources of workers as a range has been renamed from `indices` to `range`:

  ```console
  # before
  $ hq worker start --resource "gpus=indices(1-3)"
  # now
  $ hq worker start --resource "gpus=range(1-3)"
  ```

- #427 The generic resource documentation has been rewritten and improved.
Artifact summary:
- hq-v0.11.0-rc1-*: Main HyperQueue build containing the `hq` binary. Download this archive to use HyperQueue from the command line.
- hyperqueue-0.11.0-rc1-*: Wheel containing the `hyperqueue` package with HyperQueue Python bindings.
v0.10.0
HyperQueue 0.10.0
New features
Running tasks
- HQ will now set the OpenMP `OMP_NUM_THREADS` environment variable for each task. The number of threads will be set according to the number of requested cores. For example, the job submission `hq submit --cpus=4 -- <program>` would pass `OMP_NUM_THREADS=4` to the executed `<program>`.
- A new OpenMP task pinning mode has been added. You can now use `--pin=omp` when submitting jobs. This CPU pin mode will generate the corresponding `OMP_PLACES` and `OMP_PROC_BIND` environment variables to make sure that OpenMP pins its threads to the exact cores allocated by HyperQueue.
- Preview version of multi-node tasks. You may submit a multi-node task with `hq submit --nodes=X ...`
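For example, an OpenMP program requesting four pinned cores, and a preview multi-node task spanning two nodes, might be submitted as sketched below (the program names are placeholders):

```console
# OpenMP pinning with four requested cores
$ hq submit --cpus=4 --pin=omp -- ./omp-program
# preview multi-node task on two nodes
$ hq submit --nodes=2 -- ./multi-node-program
```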
CLI
- Less verbose log output by default. You can use `--debug` to turn on the old behaviour.
Changes
Scheduler
- When there are only a few tasks, the scheduler tries to fit them onto fewer workers. The goal is to enable earlier stopping of workers due to the idle timeout.
CLI
- The `--pin` boolean option for submitting jobs has been changed to take a value. You can get the original behaviour by specifying `--pin=taskset`.
Fixes
Automatic allocation
- PBS/Slurm allocations using multiple workers will now correctly spawn a HyperQueue worker on all
allocated nodes.
Artifact summary:
- hq-v0.10.0-*: Main HyperQueue build containing the `hq` binary. Download this archive to use HyperQueue from the command line.
- hyperqueue-0.10.0-*: Wheel containing the `hyperqueue` package with HyperQueue Python bindings.
v0.10.0-rc1
HyperQueue 0.10.0-rc1
New features
Running tasks
- HQ will now set the OpenMP `OMP_NUM_THREADS` environment variable for each task. The number of threads will be set according to the number of requested cores. For example, the job submission `hq submit --cpus=4 -- <program>` would pass `OMP_NUM_THREADS=4` to the executed `<program>`.
- A new OpenMP task pinning mode has been added. You can now use `--pin=omp` when submitting jobs. This CPU pin mode will generate the corresponding `OMP_PLACES` and `OMP_PROC_BIND` environment variables to make sure that OpenMP pins its threads to the exact cores allocated by HyperQueue.
- Preview version of multi-node tasks. You may submit a multi-node task with `hq submit --nodes=X ...`
CLI
- Less verbose log output by default. You can use `--debug` to turn on the old behaviour.
Changes
Scheduler
- When there are only a few tasks, the scheduler tries to fit them onto fewer workers. The goal is to enable earlier stopping of workers due to the idle timeout.
CLI
- The `--pin` boolean option for submitting jobs has been changed to take a value. You can get the original behaviour by specifying `--pin=taskset`.
Fixes
Automatic allocation
- PBS/Slurm allocations using multiple workers will now correctly spawn a HyperQueue worker on all
allocated nodes.
Artifact summary:
- hq-v0.10.0-rc1-*: Main HyperQueue build containing the `hq` binary. Download this archive to use HyperQueue from the command line.
- hyperqueue-0.10.0-rc1-*: Wheel containing the `hyperqueue` package with HyperQueue Python bindings.
v0.9.0
HyperQueue 0.9.0
New features
Tasks
- A task may be started with a temporary directory that is automatically deleted when the task is finished (flag `--task-dir`).
- A task may provide its own error message by creating a file whose name is passed via the `HQ_ERROR_FILENAME` environment variable.
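A sketch of a task that runs with a temporary task directory and reports a custom error message through the error file; the script body is purely illustrative:

```console
$ hq submit --task-dir -- bash -c 'echo "input file missing" > "$HQ_ERROR_FILENAME"; exit 1'
```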
CLI
- You can now use the `hq task list <job-selector>` command to display a list of tasks across multiple jobs.
- Added a `--filter` flag to `worker list` to allow filtering workers by their status.
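For instance (the job selector and the filter value below are illustrative; consult `hq task list --help` and `hq worker list --help` for the exact accepted forms):

```console
$ hq task list 1-3
$ hq worker list --filter running
```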
Changes
Automatic allocation
- Automatic allocation has been rewritten from scratch. It will no longer query PBS/Slurm allocation statuses periodically; instead, it will try to derive the allocation state from the workers that connect to it from allocations.
- When adding a new allocation queue, HyperQueue will now try to immediately submit a job into the queue to quickly test whether the entered configuration is correct. If you want to avoid this behaviour, you can use the `--no-dry-run` flag for `hq alloc add <pbs/slurm>` (see the sketch after this list).
- If too many submissions (10) or running allocations (3) fail in succession, the corresponding allocation queue will be automatically removed to avoid error loops.
- The `hq alloc events` command has been removed.
- The `--max-kept-directories` parameter for allocation queues has been removed. HyperQueue will now keep the last 20 allocation directories amongst all allocation queues.
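For example, adding a Slurm allocation queue without the initial dry-run submission might look like this (the partition, account and time limit are placeholders):

```console
$ hq alloc add slurm --no-dry-run --time-limit 1h -- --partition=compute --account=PROJECT-ID
```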
Fixes
- HQ will no longer warn that the `stdout`/`stderr` path does not contain the `%{TASK_ID}` placeholder when submitting array jobs, if the placeholder is contained within the working directory path and `stdout`/`stderr` contains the `%{CWD}` placeholder.
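As an illustration of a path combination that no longer triggers the warning (a sketch; the flags and paths are illustrative):

```console
$ hq submit --array 1-10 --cwd "task-%{TASK_ID}" --stdout "%{CWD}/stdout" -- ./work.sh
```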
v0.9.0-rc3
HyperQueue 0.9.0-rc3
New features
Tasks
- A task may be started with a temporary directory that is automatically deleted when the task is finished (flag `--task-dir`).
- A task may provide its own error message by creating a file whose name is passed via the `HQ_ERROR_FILENAME` environment variable.
CLI
- You can now use the `hq task list <job-selector>` command to display a list of tasks across multiple jobs.
- Added a `--filter` flag to `worker list` to allow filtering workers by their status.
Changes
Automatic allocation
- Automatic allocation has been rewritten from scratch. It will no longer query PBS/Slurm allocation statuses periodically; instead, it will try to derive the allocation state from the workers that connect to it from allocations.
- When adding a new allocation queue, HyperQueue will now try to immediately submit a job into the queue to quickly test whether the entered configuration is correct. If you want to avoid this behaviour, you can use the `--no-dry-run` flag for `hq alloc add <pbs/slurm>`.
- If too many submissions (10) or running allocations (3) fail in succession, the corresponding allocation queue will be automatically removed to avoid error loops.
- The `hq alloc events` command has been removed.
- The `--max-kept-directories` parameter for allocation queues has been removed. HyperQueue will now keep the last 20 allocation directories amongst all allocation queues.
Fixes
- HQ will no longer warn that the `stdout`/`stderr` path does not contain the `%{TASK_ID}` placeholder when submitting array jobs, if the placeholder is contained within the working directory path and `stdout`/`stderr` contains the `%{CWD}` placeholder.
v0.9.0-rc2
HyperQueue 0.9.0-rc2
New features
Tasks
- A task may be started with a temporary directory that is automatically deleted when the task is finished (flag `--task-dir`).
CLI
- You can now use the `hq task list <job-selector>` command to display a list of tasks across multiple jobs.
- Added a `--filter` flag to `worker list` to allow filtering workers by their status.
Changes
Automatic allocation
- When adding a new allocation queue, HyperQueue will now try to immediately submit a job into the queue to quickly test whether the entered configuration is correct. If you want to avoid this behaviour, you can use the `--no-dry-run` flag for `hq alloc add <pbs/slurm>`.
- The automatic allocator will now be invoked much less frequently, which should reduce the stress put on the used HPC job manager (e.g. PBS). You might thus see up to 10-minute delays before the HQ allocation list displays updated information or before a new allocation is submitted. We plan to rework the automatic allocator in future versions to allow more frequent updates while avoiding generating too many requests to the HPC job manager.
Fixes
- HQ will no longer warn that the `stdout`/`stderr` path does not contain the `%{TASK_ID}` placeholder when submitting array jobs, if the placeholder is contained within the working directory path and `stdout`/`stderr` contains the `%{CWD}` placeholder.
- The automatic allocator will query PBS allocation statuses less often. It will now ask for the status of all allocations per allocation queue in a single `qstat` call, and it now also contains a backoff that slows down new allocations if there are submission errors. If too many submissions (10) or running allocations (3) fail in succession, the corresponding allocation queue will be automatically removed.
v0.9.0-rc1
HyperQueue 0.9.0-rc1
New features
Tasks
- A task may be started with a temporary directory that is automatically deleted when the task is finished (flag `--task-dir`).
CLI
- You can now use the `hq task list <job-selector>` command to display a list of tasks across multiple jobs.
- Added a `--filter` flag to `worker list` to allow filtering workers by their status.
Changes
Automatic allocation
- When adding a new allocation queue, HyperQueue will now try to immediately submit a job into the queue to quickly test whether the entered configuration is correct. If you want to avoid this behaviour, you can use the `--no-dry-run` flag for `hq alloc add <pbs/slurm>`.
Fixes
- HQ will no longer warn that the `stdout`/`stderr` path does not contain the `%{TASK_ID}` placeholder when submitting array jobs, if the placeholder is contained within the working directory path and `stdout`/`stderr` contains the `%{CWD}` placeholder.
- The automatic allocator will query PBS allocation statuses less often. It will now ask for the status of all allocations per allocation queue in a single `qstat` call, and it now also contains a backoff that slows down new allocations if there are submission errors. If too many submissions (50) or allocations (10) fail in succession, the corresponding allocation queue will be automatically removed.