Neuron Runtime Release Notes for SDK 2.20 #985

Open
wants to merge 1 commit into
base: master
1 change: 1 addition & 0 deletions release-notes/runtime/aws-neuronx-dkms/index.rst
@@ -42,6 +42,7 @@ New in this release
Bug Fixes
^^^^^^^^^
* Fixed compatibility issues for the Linux 6.3 kernel
* Resolved an issue where the device reset handling code did not properly check the failure metric


Neuron Driver release [2.16.7.0]
13 changes: 13 additions & 0 deletions release-notes/runtime/aws-neuronx-runtime-lib/index.rst
@@ -31,6 +31,19 @@ NEFF Version Runtime Version Range Notes
2.0          >= 1.6.5.0            Starting support for 2.0 NEFFs
============ ===================== ===================================
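
The compatibility rule in the table row above (NEFF 2.0 requires runtime version >= 1.6.5.0) can be expressed as a dotted-version comparison. This is a minimal sketch, not part of the Neuron tooling: ``parse_version``, ``MIN_RUNTIME_FOR_NEFF`` and ``runtime_supports_neff`` are hypothetical names, and only the single row visible in this hunk is encoded.

```python
def parse_version(text):
    """Turn a dotted version string such as '1.6.5.0' into a tuple of ints."""
    return tuple(int(part) for part in text.split("."))


# Hypothetical lookup table: only the row visible in this hunk is encoded.
# The other rows of the real table are not reproduced here.
MIN_RUNTIME_FOR_NEFF = {"2.0": "1.6.5.0"}


def runtime_supports_neff(runtime_version, neff_version):
    """Return True if the runtime meets the minimum version for the NEFF."""
    minimum = MIN_RUNTIME_FOR_NEFF.get(neff_version)
    if minimum is None:
        raise KeyError(f"no compatibility entry for NEFF {neff_version}")
    # Tuples compare element by element, so 1.10.0.0 sorts after 1.6.5.0.
    return parse_version(runtime_version) >= parse_version(minimum)
```

Comparing integer tuples rather than raw strings avoids the lexicographic trap where ``"1.10"`` sorts before ``"1.6"``.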

Neuron Runtime Library [2.22.14.0]
---------------------------------
Date: 09/16/2024

New in this release
^^^^^^^^^^^^^^^^^^^
* Improved the inter-node mesh algorithm to scale better for larger numbers of nodes and larger allreduce problem sizes

Bug fixes
^^^^^^^^^
* Implemented a fix that differentiates between out-of-memory (OOM) conditions occurring on the host system and those occurring on the device
* Resolved a performance issue with transpose operations, which was caused by an uneven distribution of work across DMA engines

Neuron Runtime Library [2.21.41.0]
---------------------------------
Date: 07/03/2024
13 changes: 12 additions & 1 deletion tools/neuron-sys-tools/neuron-sysfs-user-guide.rst
@@ -161,7 +161,18 @@ Description for Each Field

* ``device_mem/``: The amount of memory that Neuron Runtime uses for weights, instructions and DMA rings.

* This device memory per NeuronCore is further categorized into five types: ``constants/``, ``model_code/``, ``model_shared_scratchpad/``, ``runtime_memory/``, and ``tensors/``. Definitions for these categories can be found in the :ref:`Device Used Memory <neuron_top_device_mem_usage>` section. Each of these categories has total, present, and peak.
* This device memory per NeuronCore is further categorized into eleven types: ``collectives/``, ``constants/``, ``dma_rings/``, ``driver_memory/``, ``model_code/``, ``model_shared_scratchpad/``, ``nonshared_scratchpad/``, ``notifications/``, ``runtime_memory/``, ``tensors/``, and ``uncategorized/``. Each of these categories has total, present, and peak values.
* ``collectives`` - amount of device memory used for collective communication between workers
* ``constants`` - amount of device memory used for constants (for applications running training) or weights (for applications running inference)
* ``dma_rings`` - amount of device memory used for storing model executable code for data movement
* ``driver_memory`` - amount of device memory used by the Neuron Driver
* ``model_code`` - amount of device memory used for storing model executable code
* ``model_shared_scratchpad`` - amount of device memory used for the shared model scratchpad, a buffer shared between models on the same NeuronCore and used for internal model variables and other auxiliary buffers
* ``nonshared_scratchpad`` - amount of device memory used for the non-shared model scratchpad, a buffer used by a single model for internal model variables and other auxiliary buffers
* ``notifications`` - amount of device memory used to store instruction-level trace information used to profile workloads run on the device
* ``runtime_memory`` - amount of device memory used by the Neuron Runtime (outside of the previous categories)
* ``tensors`` - amount of device memory used for tensors
* ``uncategorized`` - amount of device memory that does not belong to any other category in this list

* ``host_mem/``: The amount of memory that Neuron Runtime uses for input and output tensors.
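
The per-category layout described for ``device_mem/`` lends itself to a small reader script. The following is a minimal sketch under stated assumptions: each category is a directory containing ``total``, ``present`` and ``peak`` files, and each file holds a single integer. ``read_device_mem`` is a hypothetical helper, not part of the Neuron tools, and the caller supplies the path to one NeuronCore's ``device_mem/`` directory.

```python
from pathlib import Path

# Category names are taken from the list above.
CATEGORIES = [
    "collectives", "constants", "dma_rings", "driver_memory", "model_code",
    "model_shared_scratchpad", "nonshared_scratchpad", "notifications",
    "runtime_memory", "tensors", "uncategorized",
]
# Assumption: each category directory exposes one file per counter.
COUNTERS = ("total", "present", "peak")


def read_device_mem(device_mem_dir):
    """Return {category: {counter: int}} for the categories that exist."""
    root = Path(device_mem_dir)
    usage = {}
    for category in CATEGORIES:
        category_dir = root / category
        if not category_dir.is_dir():
            continue  # skip categories this driver version does not expose
        counters = {}
        for name in COUNTERS:
            counter_file = category_dir / name
            if counter_file.is_file():
                # Assumption: the file holds a single integer value.
                counters[name] = int(counter_file.read_text().strip())
        usage[category] = counters
    return usage
```

Skipping missing directories keeps the reader usable on driver versions that expose only the earlier five categories.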
