diff --git a/release-notes/runtime/aws-neuronx-dkms/index.rst b/release-notes/runtime/aws-neuronx-dkms/index.rst
index 6ba7b140..41f6b3c5 100644
--- a/release-notes/runtime/aws-neuronx-dkms/index.rst
+++ b/release-notes/runtime/aws-neuronx-dkms/index.rst
@@ -42,6 +42,7 @@ New in this release
 Bug Fixes
 ^^^^^^^^^
 * Fixed compatibility issues for the Linux 6.3 kernel
+* Resolved an issue where the device reset handling code did not properly check the failure metric
 Neuron Driver release [2.16.7.0]
diff --git a/release-notes/runtime/aws-neuronx-runtime-lib/index.rst b/release-notes/runtime/aws-neuronx-runtime-lib/index.rst
index 1f9a0817..1b5e6ac8 100644
--- a/release-notes/runtime/aws-neuronx-runtime-lib/index.rst
+++ b/release-notes/runtime/aws-neuronx-runtime-lib/index.rst
@@ -31,6 +31,19 @@ NEFF Version Runtime Version Range Notes
 2.0          >= 1.6.5.0            Starting support for 2.0 NEFFs
 ============ ===================== ===================================
+Neuron Runtime Library [2.22.14.0]
+----------------------------------
+Date: 09/16/2024
+
+New in this release
+^^^^^^^^^^^^^^^^^^^
+* Improved the inter-node mesh algorithm to scale better for larger numbers of nodes and larger allreduce problem sizes
+
+Bug fixes
+^^^^^^^^^
+* Implemented a fix that differentiates between out-of-memory (OOM) conditions occurring on the host system and those occurring on the device
+* Resolved a performance issue with transpose operations, which was caused by an uneven distribution of work across DMA engines
+
 Neuron Runtime Library [2.21.41.0]
 ---------------------------------
 Date: 07/03/2024
diff --git a/tools/neuron-sys-tools/neuron-sysfs-user-guide.rst b/tools/neuron-sys-tools/neuron-sysfs-user-guide.rst
index 976ae78a..6e0c73da 100644
--- a/tools/neuron-sys-tools/neuron-sysfs-user-guide.rst
+++ b/tools/neuron-sys-tools/neuron-sysfs-user-guide.rst
@@ -161,7 +161,18 @@ Description for Each Field
   * ``device_mem/``: The amount of memory that Neuron Runtime uses for weights, instructions and DMA rings.
-    * This device memory per NeuronCore is further categorized into five types: ``constants/``, ``model_code/``, ``model_shared_scratchpad/``, ``runtime_memory/``, and ``tensors/``. Definitions for these categories can be found in the :ref:`Device Used Memory ` section. Each of these categories has total, present, and peak.
+    * This device memory per NeuronCore is further categorized into eleven types: ``collectives/``, ``constants/``, ``dma_rings/``, ``driver_memory/``, ``model_code/``, ``model_shared_scratchpad/``, ``nonshared_scratchpad/``, ``notifications/``, ``runtime_memory/``, ``tensors/``, and ``uncategorized/``. Each of these categories has total, present, and peak counters.
+      * ``collectives`` - amount of device memory used for collective communication between workers
+      * ``constants`` - amount of device memory used for constants (for applications running training) or weights (for applications running inference)
+      * ``dma_rings`` - amount of device memory used for storing model executable code used for data movement
+      * ``driver_memory`` - amount of device memory used by the Neuron Driver
+      * ``model_code`` - amount of device memory used for storing model executable code
+      * ``model_shared_scratchpad`` - amount of device memory used for the shared model scratchpad, a buffer shared between models on the same NeuronCore and used for internal model variables and other auxiliary buffers
+      * ``nonshared_scratchpad`` - amount of device memory used for the non-shared model scratchpad, a buffer used by a single model for internal model variables and other auxiliary buffers
+      * ``notifications`` - amount of device memory used to store instruction-level trace information used to profile workloads run on the device
+      * ``runtime_memory`` - amount of device memory used by the Neuron Runtime (outside of the previous categories)
+      * ``tensors`` - amount of device memory used for tensors
+      * ``uncategorized`` - amount of device memory that does not belong in any other category in this list
   * ``host_mem/``: The amount of memory that Neuron Runtime uses for input and output tensors.
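
Reviewer note: the per-category layout described in the ``device_mem/`` hunk can be read with a few lines of Python. The following is a minimal sketch, not part of the change itself. It assumes that each category directory holds plain-text files named ``total``, ``present``, and ``peak``, each containing a single integer; the file names and the integer format are assumptions drawn from the guide text, and ``read_device_mem`` is a hypothetical helper name.

```python
from pathlib import Path

# Category names as listed in the updated guide text.
CATEGORIES = (
    "collectives", "constants", "dma_rings", "driver_memory",
    "model_code", "model_shared_scratchpad", "nonshared_scratchpad",
    "notifications", "runtime_memory", "tensors", "uncategorized",
)
# Assumed counter file names inside each category directory.
COUNTERS = ("total", "present", "peak")


def read_device_mem(device_mem_dir):
    """Return {category: {counter: int}} for categories found under device_mem_dir."""
    root = Path(device_mem_dir)
    usage = {}
    for category in CATEGORIES:
        category_dir = root / category
        if not category_dir.is_dir():
            continue  # skip categories this directory does not contain
        counters = {}
        for name in COUNTERS:
            counter_file = category_dir / name
            if counter_file.is_file():
                # Assumption: the file holds one integer value.
                counters[name] = int(counter_file.read_text().strip())
        usage[category] = counters
    return usage
```

To try it, pass the ``device_mem`` directory of one NeuronCore, for example ``read_device_mem("/sys/devices/virtual/neuron_device/neuron0/neuron_core0/stats/memory_usage/device_mem")``. That path is an assumption about where the driver mounts its sysfs tree and should be checked against the installed driver. Directories not named in ``CATEGORIES`` are ignored, so the helper keeps working if the driver exposes additional entries.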