Releases: nebuly-ai/nos
0.1.2
0.1.1
Release notes nos
0.1.1
This is a minor release that fixes some bugs and adds small improvements.
Changelog
- 🔨 Fix typo in Helm chart preventing the mount of the scheduler config into the GPU Partitioner (thanks @nickpetrovic!)
- 🔨 Fix bug preventing Dynamic MIG Partitioning from working correctly on multi-GPU nodes (thanks to @likku123 and @WindowsXp-Beta for their help detecting and troubleshooting the issue!)
- ✨ Initialize the GPUs of the MIG-enabled nodes with the largest available MIG profile
- ✨ Include NVIDIA-A100-SXM4-80GB and NVIDIA-A100-SXM4-40GB models in known MIG geometries
- ✨ Update documentation to include Kubernetes and CUDA version constraints
0.1.0
Release Notes nos
This is the first major release of nos, the Nebuly Operating System. It implements two main features:
- ✂️ Dynamic GPU Partitioning: you can think of this as a cluster autoscaler for GPUs. Instead of scaling up the number of nodes and GPUs, it dynamically partitions the existing GPUs so that each workload uses only the GPU resources it actually needs, freeing spare GPU capacity for other workloads. To partition GPUs, nos leverages NVIDIA’s MPS and MIG, finally making them dynamic.
- 🤝 Elastic Resource Quota management: it increases the number of Pods that can run on the cluster by letting teams (namespaces) borrow reserved resource quotas from other teams for as long as those teams are not using them.
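The quota-borrowing idea can be illustrated with a small toy model. This is only a sketch of the concept, not how nos implements it: the names `TeamQuota` and `admit` are hypothetical, resources are reduced to a single integer unit, and reclaiming borrowed capacity from a borrower is not modeled.

```python
from dataclasses import dataclass


@dataclass
class TeamQuota:
    """Toy quota for one team (namespace); hypothetical, not a nos type."""
    reserved: int  # resources reserved for the team, in an arbitrary unit
    used: int = 0  # resources the team currently consumes


def admit(quotas: dict[str, TeamQuota], team: str, request: int) -> bool:
    """Admit a request if it fits in the team's own quota plus idle quota of others."""
    quota = quotas[team]
    own_free = max(quota.reserved - quota.used, 0)
    need_borrow = max(request - own_free, 0)

    if need_borrow > 0:
        # Reserved resources that other teams are not using right now.
        idle = sum(
            max(q.reserved - q.used, 0)
            for name, q in quotas.items()
            if name != team
        )
        # Resources already borrowed by any team (usage above its reservation).
        already_borrowed = sum(max(q.used - q.reserved, 0) for q in quotas.values())
        if need_borrow > idle - already_borrowed:
            return False

    # A team can always use its own reserved quota; reclaiming capacity that
    # was lent to other teams is outside the scope of this sketch.
    quota.used += request
    return True
```

In this model, a team that has exhausted its own reservation can still run more workloads while another team leaves part of its reservation idle, and it is refused once that idle capacity is gone.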