Releases: RyaxTech/ryax-engine
We are proud to announce the release of:
✨ ✨ ✨ ✨ ✨ ✨ ✨ ✨ ✨
Ryax 24.12.0
✨ ✨ ✨ ✨ ✨ ✨ ✨ ✨ ✨
Stability and better UI experience
New features
- GPU support for SSH SLURM with Singularity
- Runtime Class support for the Kubernetes add-on
- Action Builder now uses a persistent cache (Nix Store)
Bug fixes and Improvements
- Use PostgreSQL instead of SQLite as the default database for the Worker
- Fix Dynamic Output edit form UI reloading too often
- Better site type filter in the deploy constraints settings
- Fix CPU per task option in Slurm execution mode
- Fix Worker configuration updates not being taken into account
- Fix missing logs for very short executions
- Avoid workflow error on double deploy
- Avoid actions getting stuck on Building when the Action Builder fails
- Fix RabbitMQ memory leak due to dangling queues
Upgrade to this version
Because the Worker changes from SQLite database to PostgreSQL, you have to reset its state.
To do so, remove the worker with:
helm uninstall ryax-worker -n ryaxns
The broker's internal state also needs to be cleaned:
helm uninstall rabbitmq -n ryaxns
kubectl delete pvc -n ryaxns data-ryax-broker-0
The Runner should also be cleaned:
ryax-adm clean runner
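The reset steps above can be rehearsed before touching the cluster. The following is a minimal sketch, assuming a POSIX shell; the DRY_RUN variable and the run and reset_state helpers are names invented for this example and are not part of Ryax. By default it only prints the commands; set DRY_RUN=0 to actually execute them.

```shell
#!/bin/sh
# Sketch only: replays the reset commands listed above.
# DRY_RUN, run and reset_state are hypothetical names for this example.
DRY_RUN="${DRY_RUN:-1}"

run() {
    # Always print the command; execute it only when DRY_RUN=0.
    echo "+ $*"
    if [ "$DRY_RUN" = "0" ]; then
        "$@"
    fi
}

reset_state() {
    # Stop at the first failing step so later steps do not run blindly.
    run helm uninstall ryax-worker -n ryaxns &&
    run helm uninstall rabbitmq -n ryaxns &&
    run kubectl delete pvc -n ryaxns data-ryax-broker-0 &&
    run ryax-adm clean runner
}

reset_state
```

Printing each command before running it keeps a record of what was done, which helps if one of the steps fails halfway.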
For external workers, be sure that you have the values.yaml file that you used
for the previous installation and then run, for example:
helm uninstall ryax-other-worker -n ryaxns
helm install -n ryaxns --values ryax-other-worker.yaml
Then apply the update as usual.
We are proud to announce the release of:
✨ ✨ ✨ ✨ ✨ ✨ ✨ ✨ ✨
Ryax 24.10.0
✨ ✨ ✨ ✨ ✨ ✨ ✨ ✨ ✨
Multi-site full power!
New features
- A new service called Ryax Worker can now be used to attach any Slurm or Kubernetes cluster resources
- Ryax can now run any action on SLURM and Kubernetes seamlessly
- Actions are now scheduled according to user-defined constraints and objectives
- Add the possibility to pin Ryax services to dedicated resources (nodeSelector)
- Enhance Ryax documentation with updated content (doc)
- New Jupyter Notebook action with GPU support in default actions
- Action builds can now be canceled
- Kubernetes add-on now supports injection of services
Bug fixes and Improvements
- Fix volume permission for NFS based storage volumes (defaults to 1200 now)
- Fail properly when a pip install fails during builds
Upgrade to this version
This is a major release of Ryax, which requires some extra steps for the upgrade.
Update configuration
This release introduces a new service, the Worker. To define the nodes that will be used by your actions, the Worker requires a site configuration. Please add one to your Ryax installation configuration file, following the example below. In this example, the local cluster has a node pool named default, with a label default on each node and 4 CPUs and 8G of memory per node.
name: local
- cpu: 4
memory: 8G
name: default
selector: default
See the Worker configuration documentation for more details.
Update DNS
If you use a public IP with TLS enabled, you will need to create a new DNS entry to support all subdomains of your cluster. This is used, for example, by an external Worker to access the internal container repository.
Please add an entry in your DNS using star notation:
See installation doc for more details.
Add HPC site
The users of HPC actions have to install a Worker dedicated to each cluster following this documentation.
Apply and clean
Once configured, you can apply the configuration with ryax-adm as usual.
The log capture service, Loki, was moved into the ryaxns namespace. After applying, remove the old Loki deployment:
helm uninstall -n ryaxns-monitoring loki
kubectl delete pvc -n ryaxns-monitoring storage-loki-0
The Worker now handles deployments. To avoid dangling actions and failing deployments, you have to clean the Runner state.
Be aware that this will reset the execution history and stop all running workflows.
ryax-adm clean runner worker
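Because this clean resets the execution history and stops running workflows, it can help to guard it behind an explicit confirmation. A minimal sketch, assuming a POSIX shell; confirm is a hypothetical helper written for this example, not a ryax-adm feature:

```shell
#!/bin/sh
# Sketch only: ask for confirmation before a destructive command.
# confirm is a hypothetical helper, not part of ryax-adm.
confirm() {
    # Prompt on stderr so the question is visible even if stdout is captured.
    printf '%s [y/N] ' "$1" >&2
    # Treat end-of-input (no answer) as a refusal.
    read -r answer || return 1
    case "$answer" in
        y|Y|yes|YES) return 0 ;;
        *) return 1 ;;
    esac
}

# Usage against a real installation (left commented out on purpose):
# confirm "Reset execution history and stop all running workflows?" \
#     && ryax-adm clean runner worker
```

Anything other than an explicit yes is treated as a refusal, so an accidental Enter does not trigger the clean.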
We are proud to announce the release of:
✨ ✨ ✨ ✨ ✨ ✨ ✨ ✨ ✨
Ryax 24.06.0
✨ ✨ ✨ ✨ ✨ ✨ ✨ ✨ ✨
Control and stability.
New features
- Add a Kubernetes Addon to customize action deployment (label, nodeSelector, annotations, serviceAccount)
Bug fixes and Improvements
- Fix inability to add dynamic output enum values
- Fix add-on default values from ryax_metadata.yaml not available in the UI
- Better error handling for action deployments
- Fix HPC add-on support for files in custom scripts
- Fix python-cuda build failing in some cases
- Fix UID overlap when using NFS CSI Driver
- Fix OutOfMemory during git scan leading to an inconsistent state
Upgrade to this version
Usual process: update the version in the config file and apply!
We are proud to announce the release of
✨ ✨ ✨ ✨ ✨ ✨ ✨ ✨ ✨
Ryax 24.01.0
✨ ✨ ✨ ✨ ✨ ✨ ✨ ✨ ✨
This release focuses on Reliability and Security 💪
The changelog:
Bugfixes and Improvements
- Fix connection issues on broker restart
- Migrate Helm chart repository to an OCI standard repository
- Fix SSH Slurm execution issue with files
- Do not use root user inside the action builder container
Upgrade to this version
If you have set the chartRegistry (you probably didn't) in your configuration values file, please change the Chart repository URL to url:
We are proud to announce the release of
✨ ✨ ✨ ✨ ✨ ✨ ✨ ✨ ✨
Ryax 24.02.0
✨ ✨ ✨ ✨ ✨ ✨ ✨ ✨ ✨
This release brings better HPC offloading support!
The changelog:
Bug Fixes and Improvements
- Add HPC offloading capability to run custom script on nodes directly for parallel jobs
- Better error handling in HPC offloading deployment and execution
- Fix HPC Offloading log capture
- Runs can now be canceled and deleted from the UI
- Fix dynamic outputs edition and improve display
- Fix action not undeployed in some corner cases
Upgrade to this Version
HPC actions have to be deleted and recreated for the custom script parameters to be available.
We are proud to announce the release of
✨ ✨ ✨ ✨ ✨ ✨ ✨ ✨ ✨
Ryax 23.12.0
✨ ✨ ✨ ✨ ✨ ✨ ✨ ✨ ✨
This release focuses on Scaling and Performance 🚀
The changelog:
New features
- Improve HPC offloading with optimized IO and image build
- 1 to N scaling of actions with better Kubernetes autoscale support
- Show a clear error message on Action failure due to resource limits
- Optional IO for Actions
Bug fixes and Improvements
- Improve database query performance and Runner responsiveness
- Fix actions undeploying during Runner restarts
- Fix monitoring configuration for KubeProxy
- Fix workflow deletion failing under some conditions
- Fix RabbitMQ failure to respond to liveness probe
Upgrade to this version
The RabbitMQ deployment needs to be replaced. To do so, uninstall it before the
update (communication between services will stop during update):
helm uninstall -n ryaxns rabbitmq
Then, proceed with the normal upgrade process.
To avoid connection errors between services, restart them after the upgrade with:
kubectl delete pod -n ryaxns -l
We are proud to announce the release of
✨ ✨ ✨ ✨ ✨ ✨ ✨ ✨ ✨
Ryax 23.10.0
✨ ✨ ✨ ✨ ✨ ✨ ✨ ✨ ✨
The changelog:
New features and improvements
- Trigger with python with CUDA support
- Graceful stop for running workflows
- Auto reload on UI update (PWA support)
Bug fixes
- Better error message when scanning badly formatted action metadata
- Fix add repository modal layout
- Fix refresh error on OpenAPI UI in some cases
Upgrade to this version
No action needed!
We are proud to announce the release of
✨ ✨ ✨ ✨ ✨ ✨ ✨ ✨ ✨
Ryax 23.09.0
✨ ✨ ✨ ✨ ✨ ✨ ✨ ✨ ✨
This release focuses on Observability and Performance, enjoy!
The changelog:
New features
- Instant logs on Triggers
- Better logs display for the Runs
- Update of Prometheus to the latest version
- Performance metrics are now exported and available in a dashboard in Grafana
- Add internal tracing in the Runner with Tempo to query traces in Grafana
- Run details panel rework
Bug fixes
- Improve database query performance and Runner responsiveness
- Fix errors on version change in some cases
- Fix error when stored file size is too big
Upgrade to this version
Admins should take care of the following elements when upgrading to this version.
Instant log
To get instant logs, you have to rebuild the Actions. To do so, just run
"Build All" on the Library on your repository and the next deployment will use
the updated version.
Prometheus update
The update of Prometheus requires the following manual operation, before running the update. This will update the CRD and remove the old version of Prometheus.
kubectl apply --server-side -f --force-conflicts
kubectl apply --server-side -f --force-conflicts
kubectl apply --server-side -f --force-conflicts
kubectl apply --server-side -f --force-conflicts
kubectl apply --server-side -f --force-conflicts
kubectl apply --server-side -f --force-conflicts
kubectl apply --server-side -f --force-conflicts
kubectl apply --server-side -f --force-conflicts
kubectl apply --server-side -f --force-conflicts
kubectl apply --server-side -f --force-conflicts
helm uninstall -n ryaxns-monitoring prometheus
Now you can run the update to reinstall the new Prometheus version with the usual ryax-adm apply.
Grafana's credentials are reset by this update: the user is ryax and the password can be obtained with:
kubectl get secret --namespace ryaxns-monitoring grafana-cedentials -o jsonpath="{.data.admin-password}" | base64 -d
We are proud to announce the release of
✨ ✨ ✨ ✨ ✨ ✨ ✨ ✨ ✨
Ryax 23.07.0
✨ ✨ ✨ ✨ ✨ ✨ ✨ ✨ ✨
The changelog:
New features and improvements
- Allows users to set a specific HTTP status code with its API result
- Actions can send a user-defined error with custom HTTP status code
- Update Loki (logs capture) and Cert Manager (SSL certificate manager) to the latest version
Bug fixes
- Fix OpenAPI page not always in sync with deployed workflows
Upgrade to this version
This update requires uninstalling the old Loki version before installing the
new one.
Before the update, just remove the old Loki version with:
helm uninstall -n ryaxns-monitoring loki
Be aware that some logs might not be captured before the new version is up and running.
More details on Loki upgrade:
We are proud to announce the release of
✨ ✨ ✨ ✨ ✨ ✨ ✨ ✨ ✨
Ryax 23.06.0
✨ ✨ ✨ ✨ ✨ ✨ ✨ ✨ ✨
The changelog:
New features and improvements
- HPC offloading using Singularity with multi user support
- CUDA GPU support with the python3-cuda language
- Resources request support (CPU, Memory, Time, GPU)
- Keep the home .cache directory between runs
- Allow users to rebuild already built actions
- Better internal runs state management
- Use the latest Minio version
Bug fixes
- Show deployment error details when they happen
- Always show a notification when an error happens
- Fix certificate injection for our internal registry with docker daemon
- UI now shows deployment errors, if any
Upgrade to this version
WARNING: This update requires a maintenance period to copy the data from one store to another.
Minio migration (for production)
The upgrade of the internal filestore, Minio, requires migrating the data from the old instance to the new one. For more details, see
Get old filestore credentials
kubectl get secret --namespace "ryaxns" ryax-filestore-secret -o jsonpath="{.data.filestore}" | base64 -d
kubectl get secret --namespace "ryaxns" ryax-filestore-secret -o jsonpath="{.data.filestore-access}" | base64 -d
kubectl get secret --namespace "ryaxns" ryax-filestore-secret -o jsonpath="{.data.filestore-secret}" | base64 -d
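The three commands above differ only in the key they read from the secret. A minimal sketch of a small wrapper, assuming a POSIX shell with kubectl and base64 available; get_filestore_value and the variable names in the usage comment are hypothetical and chosen only for this example:

```shell
#!/bin/sh
# Sketch only: read one key of ryax-filestore-secret and decode it.
# get_filestore_value is a hypothetical helper name.
get_filestore_value() {
    # $1 is the key under .data, for example filestore-access.
    # Kubernetes stores secret values base64-encoded, hence the decode step.
    kubectl get secret --namespace "ryaxns" ryax-filestore-secret \
        -o jsonpath="{.data.$1}" | base64 -d
}

# Usage against a real cluster (variable names are assumptions):
# OLD_ACCESS="$(get_filestore_value filestore-access)"
# OLD_SECRET="$(get_filestore_value filestore-secret)"
```

Keeping the values in shell variables avoids copying them by hand, but they remain visible in the current shell session, so close it once the migration is done.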
Connect to the new Minio pod
MINIO_POD="$(kubectl -n ryaxns get pods --selector -o jsonpath='{.items[0]}')"
kubectl -n ryaxns exec -ti $MINIO_POD -- bash
Now, inside the Minio pod (replace the variables with the values from the previous step):
mc alias set new http://localhost:9000 ryax $MINIO_ROOT_PASSWORD
mc mb new/ryax-filestore
mc mirror --preserve old/ryax-filestore new/ryax-filestore
You can now safely remove the old filestore deployment with:
helm uninstall -n ryaxns ryax-filestore
Clean (for dev)
Clean the internal state of the services to avoid missing-file errors when upgrading Minio:
ryax-adm clean studio runner