Releases: It4innovations/hyperqueue

Nightly build 2025-01-11

09 Dec 11:19
Pre-release

HyperQueue dev

Breaking change

  • Pre-built HyperQueue releases available from our GitHub repository are now built with GLIBC 2.28, instead of 2.17. If you need to run HyperQueue on a system with an older GLIBC version, you might need to recompile it from source on your system. If you encounter any issues, please let us know.
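
To see whether a system is affected by the GLIBC change above, one option is a short check with Python's standard library. This is a minimal sketch: the helper names parse_version and glibc_is_sufficient are made up for illustration, and platform.libc_ver() may return empty strings on systems where detection fails.

```python
import platform


def parse_version(text):
    """Turn a dotted version string such as '2.28' into a tuple of ints."""
    return tuple(int(part) for part in text.split("."))


def glibc_is_sufficient(detected, required="2.28"):
    """Return True when the detected glibc version is at least `required`."""
    return parse_version(detected) >= parse_version(required)


if __name__ == "__main__":
    # platform.libc_ver() returns (library name, version); both may be empty
    # strings when detection fails, for example on non-glibc systems.
    lib, version = platform.libc_ver()
    if lib == "glibc" and version:
        verdict = "OK" if glibc_is_sufficient(version) else "too old for the pre-built binaries"
        print(f"glibc {version}: {verdict}")
    else:
        print("Could not detect glibc; check manually, e.g. with `ldd --version`.")
```

Versions are compared as integer tuples because a plain string comparison would rank "2.9" above "2.28".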

Changes

  • The hq event-log command has been renamed to hq journal.
  • hq dashboard has been re-enabled by default.

New features

  • Added hq journal prune for pruning the journal file.
  • Added hq journal flush for forcing the server to flush the journal.

Artifact summary:

  • hq-vdev-*: Main HyperQueue build containing the hq binary. Download this archive to
    use HyperQueue from the command line.
  • hyperqueue-dev-*: Wheel containing the hyperqueue package with HyperQueue Python
    bindings.

v0.20.0

24 Sep 09:34

HyperQueue 0.20.0

New features

  • It is now possible to dynamically submit new tasks into an existing job (we call this concept "Open jobs").
    See Open jobs documentation

  • Worker streaming. Previously, you could stream task stderr/stdout to the server over the network using the --log parameter of hq submit.
    This approach had various issues and did not scale. We have therefore replaced it with worker streaming,
    where the workers themselves stream task output to a set of files on disk.
    The new approach creates more files than the original solution (which always produced one file per job),
    but the number of files stays small and is independent of the number of executed tasks.
    The new architecture also allows parallel I/O writing and storing multiple job streams in one stream handle.
    To use worker streaming, pass the --stream parameter to hq submit. Check out the documentation for more information.

  • Optimization of journal size

  • Tasks' crash counters are not increased when a worker is stopped by hq worker stop or by a time limit.

Removed

  • Because worker streaming fully replaces the original streaming, the original server streaming has been removed.
    In most cases, you can rename --log to --stream and hq log to hq output-log. See the docs for more details.
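
The rename described above can be applied mechanically to existing scripts. The following is a minimal sketch only: migrate_command is a hypothetical helper, the file name out.bin is a placeholder, and the arguments that follow the renamed flag or command may need manual adjustment, so check the docs before relying on the result.

```python
import re


def migrate_command(line):
    """Rewrite a pre-0.20 command line to the worker-streaming names."""
    # `hq log ...` became `hq output-log ...`; the lookahead keeps other
    # subcommands that merely start with `log` untouched.
    line = re.sub(r"\bhq log(?=\s|$)", "hq output-log", line)
    # `--log` became `--stream`; only the exact flag is renamed, whether it is
    # followed by a space, an equals sign, or the end of the line.
    line = re.sub(r"(?<!\S)--log(?=[\s=]|$)", "--stream", line)
    return line


if __name__ == "__main__":
    print(migrate_command("hq submit --log out.bin ./my-program"))
```

The patterns are deliberately strict so that unrelated flags such as a hypothetical --logfile are left alone.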

Fixes

  • HQ should no longer crash while printing job info when a failed task does not have any workers
    attached (#731).

Note

  • The dashboard is still not enabled in this version.

Artifact summary:

  • hq-v0.20.0-*: Main HyperQueue build containing the hq binary. Download this archive to
    use HyperQueue from the command line.
  • hyperqueue-0.20.0-*: Wheel containing the hyperqueue package with HyperQueue Python
    bindings.

v0.20.0-rc2

20 Sep 18:41
Pre-release

HyperQueue 0.20.0-rc2

New features

  • It is now possible to dynamically submit new tasks into an existing job (we call this concept "Open jobs").
    See Open jobs documentation

  • Worker streaming. Previously, you could stream task stderr/stdout to the server over the network using the --log parameter of hq submit.
    This approach had various issues and did not scale. We have therefore replaced it with worker streaming,
    where the workers themselves stream task output to a set of files on disk.
    The new approach creates more files than the original solution (which always produced one file per job),
    but the number of files stays small and is independent of the number of executed tasks.
    The new architecture also allows parallel I/O writing and storing multiple job streams in one stream handle.
    To use worker streaming, pass the --stream parameter to hq submit. Check out the documentation for more information.

  • Optimization of journal size

  • Tasks' crash counters are not increased when a worker is stopped by hq worker stop or by a time limit.

Removed

  • Because worker streaming fully replaces the original streaming, the original server streaming has been removed.
    In most cases, you can rename --log to --stream and hq log to hq output-log. See the docs for more details.

Fixes

  • HQ should no longer crash while printing job info when a failed task does not have any workers
    attached (#731).

Note

  • The dashboard is still not enabled in this version.

Artifact summary:

  • hq-v0.20.0-rc2-*: Main HyperQueue build containing the hq binary. Download this archive to
    use HyperQueue from the command line.
  • hyperqueue-0.20.0-rc2-*: Wheel containing the hyperqueue package with HyperQueue Python
    bindings.

v0.19.0

31 May 11:00

HyperQueue 0.19.0

New features

  • Server resilience. Server state can be loaded back from a journal after the server crashes. This restores the state of submitted jobs and also of autoallocator queues. Find out more here.

  • HQ_NUM_NODES for multi-node tasks introduced. It contains the number of nodes assigned to the task.
    You no longer need to count the lines in HQ_NODE_FILE manually.
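
Inside a multi-node task, the node count can be read directly from the environment. The sketch below assumes that HQ_NODE_FILE holds the path of a file with one node per line, as the note above implies; node_count is a hypothetical helper, and the line-counting fallback is only there for older versions that do not set HQ_NUM_NODES.

```python
import os


def node_count(env=None):
    """Return the number of nodes assigned to the current multi-node task."""
    env = os.environ if env is None else env
    if "HQ_NUM_NODES" in env:
        return int(env["HQ_NUM_NODES"])
    # Fallback for versions before 0.19.0: count non-empty lines in the node file.
    with open(env["HQ_NODE_FILE"]) as handle:
        return sum(1 for line in handle if line.strip())
```

Passing a dictionary as env makes the helper easy to try outside of a running task.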

Changes

  • The dashboard is disabled in this version. We expect to re-enable it in 1-2 release cycles.

  • The node file generated for multi-node tasks now contains only short hostnames
    (e.g. if the hostname is "cn690.karolina.it4i.cz", only "cn690" is written into the node list).
    You can read HQ_HOST_FILE if you need the full hostnames without stripping.
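
The stripping described above amounts to keeping the part of the hostname before the first dot. The snippet below only illustrates that rule with the example from the note; short_hostname is a hypothetical helper and not HyperQueue's own implementation.

```python
def short_hostname(hostname):
    """Keep only the part of a hostname before the first dot."""
    return hostname.split(".", 1)[0]


if __name__ == "__main__":
    print(short_hostname("cn690.karolina.it4i.cz"))
```

A hostname without any dot is returned unchanged.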

Fixes

  • Enable passing of empty stdout/stderr to Python function tasks in the Python
    API (#691).
  • hq alloc add --name <name> will now correctly use the passed <name> to name allocations submitted to Slurm/PBS.

Artifact summary:

  • hq-v0.19.0-*: Main HyperQueue build containing the hq binary. Download this archive to
    use HyperQueue from the command line.
  • hyperqueue-0.19.0-*: Wheel containing the hyperqueue package with HyperQueue Python
    bindings.

v0.19.0-rc1

29 May 09:53
Pre-release

HyperQueue 0.19.0-rc1

New features

  • Server resilience. Server state can be loaded back from a journal when the server crashes.

  • HQ_NUM_NODES for multi-node tasks introduced. It contains the number of nodes assigned to the task.
    You no longer need to count the lines in HQ_NODE_FILE manually.

Changes

  • The dashboard is disabled in this version. We expect to re-enable it in 1-2 release cycles.

  • The node file generated for multi-node tasks now contains only short hostnames
    (e.g. if the hostname is "cn690.karolina.it4i.cz", only "cn690" is written into the node list).
    You can read HQ_HOST_FILE if you need the full hostnames without stripping.

Fixes

  • Enable passing of empty stdout/stderr to Python function tasks in the Python
    API (#691).
  • hq alloc add --name <name> will now correctly use the passed <name> to name allocations submitted to Slurm/PBS.

Artifact summary:

  • hq-v0.19.0-rc1-*: Main HyperQueue build containing the hq binary. Download this archive to
    use HyperQueue from the command line.
  • hyperqueue-0.19.0-rc1-*: Wheel containing the hyperqueue package with HyperQueue Python
    bindings.

v0.18.0

14 Feb 10:12

HyperQueue 0.18.0

Breaking changes

New features

  • Combination of --time-request and --nodes is now allowed

  • Allow setting a time request for a task (min_time resource value) using the Python API.

  • Optimizations related to job submit & long term memory saving

  • The CLI dashboard is now enabled by default. You can try it with the hq dashboard command. Note that it is still
    very experimental and a lot of useful features are missing.

Artifact summary:

  • hq-v0.18.0-*: Main HyperQueue build containing the hq binary. Download this archive to
    use HyperQueue from the command line.
  • hyperqueue-0.18.0-*: Wheel containing the hyperqueue package with HyperQueue Python
    bindings.

v0.18.0-rc1

11 Feb 09:35
Pre-release

HyperQueue 0.18.0-rc1

Breaking change

New features

  • Combination of --time-request and --nodes is now allowed

  • Allow setting a time request for a task (min_time resource value) using the Python API.

  • Optimizations related to job submit & long term memory saving

  • The CLI dashboard is now enabled by default. You can try it with the hq dashboard command. Note that it is still
    very experimental and a lot of useful features are missing.

Artifact summary:

  • hq-v0.18.0-rc1-*: Main HyperQueue build containing the hq binary. Download this archive to
    use HyperQueue from the command line.
  • hyperqueue-0.18.0-rc1-*: Wheel containing the hyperqueue package with HyperQueue Python
    bindings.

v0.17.0-liberec

13 Nov 09:38
0054371
Pre-release

HyperQueue 0.17.0-liberec

Breaking change

Memory resource in megabytes

  • The automatically detected resource "mem", which is the size of a worker's RAM, now uses megabytes as its unit,
    i.e. --resource mem=100 now asks for 100 MiB (previously 100 bytes).
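
Scripts that still pass byte-based values need their mem requests converted. The following sketch assumes, per the example above, that the new unit is MiB (1024 * 1024 bytes); legacy_mem_to_mib and mem_resource_arg are hypothetical helpers, and rounding up is a design choice so that a task never requests less memory than before.

```python
BYTES_PER_MIB = 1024 * 1024  # assumption: the new unit is MiB, as in the example above


def legacy_mem_to_mib(requested_bytes):
    """Convert a byte-based `mem` request into MiB, rounding up."""
    # Integer ceiling division avoids floating-point error for large values.
    return -(-requested_bytes // BYTES_PER_MIB)


def mem_resource_arg(mib):
    """Format the value in the --resource mem=<value> form used above."""
    return f"--resource mem={mib}"


if __name__ == "__main__":
    print(mem_resource_arg(legacy_mem_to_mib(100 * BYTES_PER_MIB)))
```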

New features

Non-integer resource requests

  • You may now request a non-integer amount of a resource, e.g. 0.5 of a GPU.
    This enables resource sharing on the logical level of the HyperQueue scheduler and allows the remaining part of the resource
    to be used by other tasks.
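
As a rough illustration of what logical sharing means, the toy model below counts how many tasks with a given fractional request fit into a fixed amount of a resource. It is not HyperQueue's scheduler; tasks_that_fit is a hypothetical helper, and the only thing taken from the note above is the idea that the unused fraction remains available to other tasks.

```python
from fractions import Fraction


def tasks_that_fit(capacity, request):
    """Count how many tasks with the given request fit into `capacity`."""
    # Fractions built from the decimal text avoid floating-point rounding
    # for requests such as 0.1 or 0.3.
    return int(Fraction(str(capacity)) // Fraction(str(request)))


if __name__ == "__main__":
    # Two tasks that each ask for 0.5 of a GPU can share one GPU.
    print(tasks_that_fit(1, 0.5))
```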

Job submission

  • You can now specify cleanup modes when passing stdout/stderr paths to tasks. Cleanup mode decides what should
    happen with the file once the task has finished executing. Currently, a single cleanup mode is implemented, which removes
    the file if the task has finished successfully:
$ hq submit --stdout=out.txt:rm-if-finished /my-program
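
When hq submit commands are generated from a script, the path:mode form shown above can be assembled by a small helper. This is a sketch under assumptions: output_arg is a hypothetical function, and it assumes that --stderr accepts the same path:mode syntax as the --stdout example, which the note implies but does not show.

```python
def output_arg(stream, path, cleanup=None):
    """Build a --stdout/--stderr argument with an optional cleanup mode."""
    if stream not in ("stdout", "stderr"):
        raise ValueError(f"unknown stream: {stream}")
    # The cleanup mode, if any, is appended after a colon, as in the example above.
    suffix = f":{cleanup}" if cleanup else ""
    return f"--{stream}={path}{suffix}"


if __name__ == "__main__":
    print(output_arg("stdout", "out.txt", "rm-if-finished"))
```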

Fixes

  • Fixed a crash when a task fails during its initialization.

Artifact summary:

  • hq-v0.17.0-liberec-*: Main HyperQueue build containing the hq binary. Download this archive to
    use HyperQueue from the command line.
  • hyperqueue-0.17.0-liberec-*: Wheel containing the hyperqueue package with HyperQueue Python
    bindings.

v0.17.0

01 Nov 10:12

HyperQueue 0.17.0

Breaking change

Memory resource in megabytes

  • The automatically detected resource "mem", which is the size of a worker's RAM, now uses megabytes as its unit,
    i.e. --resource mem=100 now asks for 100 MiB (previously 100 bytes).

New features

Non-integer resource requests

  • You may now request a non-integer amount of a resource, e.g. 0.5 of a GPU.
    This enables resource sharing on the logical level of the HyperQueue scheduler and allows the remaining part of the resource
    to be used by other tasks.

Job submission

  • You can now specify cleanup modes when passing stdout/stderr paths to tasks. Cleanup mode decides what should
    happen with the file once the task has finished executing. Currently, a single cleanup mode is implemented, which removes
    the file if the task has finished successfully:
$ hq submit --stdout=out.txt:rm-if-finished /my-program

Fixes

  • Fixed a crash when a task fails during its initialization.

Artifact summary:

  • hq-v0.17.0-*: Main HyperQueue build containing the hq binary. Download this archive to
    use HyperQueue from the command line.
  • hyperqueue-0.17.0-*: Wheel containing the hyperqueue package with HyperQueue Python
    bindings.

v0.17.0-rc1

25 Oct 11:52
Pre-release

HyperQueue 0.17.0-rc1

Breaking change

Memory resource in megabytes

  • The automatically detected resource "mem", which is the size of a worker's RAM, now uses megabytes as its unit,
    i.e. --resource mem=100 now asks for 100 MiB (previously 100 bytes).

New features

Non-integer resource requests

  • You may now request a non-integer amount of a resource, e.g. 0.5 of a GPU.
    This enables resource sharing on the logical level of the HyperQueue scheduler and allows the remaining part of the resource
    to be used by other tasks.

Job submission

  • You can now specify cleanup modes when passing stdout/stderr paths to tasks. Cleanup mode decides what should
    happen with the file once the task has finished executing. Currently, a single cleanup mode is implemented, which removes
    the file if the task has finished successfully:
$ hq submit --stdout=out.txt:rm-if-finished /my-program

Fixes

  • Fixed a crash when a task fails during its initialization.

Artifact summary:

  • hq-v0.17.0-rc1-*: Main HyperQueue build containing the hq binary. Download this archive to
    use HyperQueue from the command line.
  • hyperqueue-0.17.0-rc1-*: Wheel containing the hyperqueue package with HyperQueue Python
    bindings.