AWS Neuron Samples

This repository contains samples for AWS Neuron, the software development kit (SDK) that enables machine learning (ML) inference and training workloads on the AWS ML accelerator chips Inferentia and Trainium.

The samples in this repository provide an indication of the types of deep learning models that can be used with Trainium and Inferentia, but do not represent an exhaustive list of supported models. If you have additional model samples that you would like to contribute to this repository, please submit a pull request following the repository's contribution guidelines.

Samples are organized by use case (training, inference) and deep learning framework (PyTorch, TensorFlow) below:

Training

| Framework | Description | Instance Type |
| --- | --- | --- |
| PyTorch NeuronX (torch-neuronx) | Sample training scripts for training various PyTorch models on AWS Trainium (a minimal training-loop sketch follows this table) | Trn1, Trn1n & Inf2 |
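These training samples follow the standard PyTorch/XLA pattern used by torch-neuronx: the model is moved to an XLA device backed by NeuronCores, and xm.optimizer_step triggers graph compilation and execution on the accelerator. The sketch below is a minimal, hypothetical illustration of that pattern; the model, data, and hyperparameters are placeholders and are not taken from any sample in this repository.

```python
# Minimal (hypothetical) torch-neuronx training loop on a Trn1 instance.
# Assumes torch-neuronx and torch-xla are installed; the model and data
# below are placeholders, not one of the repository samples.
import torch
import torch.nn as nn
import torch_xla.core.xla_model as xm

device = xm.xla_device()  # NeuronCores are exposed through the XLA device

model = nn.Linear(784, 10).to(device)
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
loss_fn = nn.CrossEntropyLoss()

for step in range(10):
    # Placeholder batch; the real samples load datasets such as MNIST or GLUE.
    inputs = torch.randn(32, 784).to(device)
    labels = torch.randint(0, 10, (32,)).to(device)

    optimizer.zero_grad()
    loss = loss_fn(model(inputs), labels)
    loss.backward()
    xm.optimizer_step(optimizer)  # marks the XLA step so the graph is compiled and run
```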
| Usage | Description | Instance Type |
| --- | --- | --- |
| Nemo Megatron for Neuron | A library, adapted from Nemo Megatron, that enables large-scale distributed training of language models such as Llama. | Trn1, Trn1n |
| AWS Neuron samples for ParallelCluster | Shows how to use AWS ParallelCluster to build an HPC compute cluster with trn1 compute nodes for running distributed ML training jobs. | Trn1, Trn1n |
| AWS Neuron samples for EKS | Demonstrates patterns for delivering inference and distributed training on EKS using Inferentia and Trainium. | Trn1, Trn1n |
| AWS Neuron samples for SageMaker | SageMaker samples that use ml.trn1 instances for machine learning (ML) training workloads on the AWS Trainium accelerator. | Trn1, Trn1n |

Inference

| Framework | Description | Instance Type |
| --- | --- | --- |
| PyTorch NeuronX (torch-neuronx) | Sample Jupyter notebooks demonstrating model compilation and inference for various PyTorch models on AWS Inferentia2 and Trainium (a minimal compile-and-infer sketch follows this table) | Inf2 & Trn1 |
| PyTorch NeuronX (transformers-neuronx) | Sample Jupyter notebooks demonstrating tensor parallel inference for various PyTorch large language models (LLMs) on AWS Inferentia2 and Trainium | Inf2 & Trn1 |
| PyTorch Neuron (torch-neuron) | Sample Jupyter notebooks demonstrating model compilation and inference for various PyTorch models on AWS Inferentia | Inf1 |
| TensorFlow Neuron (tensorflow-neuron) | Sample Jupyter notebooks demonstrating model compilation and inference for various TensorFlow models on AWS Inferentia | Inf1 |
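The torch-neuronx inference notebooks share a common ahead-of-time flow: a model is compiled with torch_neuronx.trace, saved as a TorchScript artifact, and then called like any other PyTorch module. The sketch below is a minimal, hypothetical illustration of that flow; the ResNet-50 model and output file name are illustrative and not copied from a specific notebook.

```python
# Minimal (hypothetical) torch-neuronx compile-and-infer flow on Inf2/Trn1.
# Assumes torch-neuronx and torchvision are installed; ResNet-50 and the
# file name are illustrative choices, not a specific repository sample.
import torch
import torch_neuronx
from torchvision import models

model = models.resnet50(weights=None).eval()
example_input = torch.rand(1, 3, 224, 224)

# Ahead-of-time compile the model for NeuronCores
neuron_model = torch_neuronx.trace(model, example_input)

# The compiled model is a TorchScript module and can be saved and reloaded
torch.jit.save(neuron_model, "resnet50_neuron.pt")
neuron_model = torch.jit.load("resnet50_neuron.pt")

with torch.no_grad():
    output = neuron_model(example_input)
print(output.shape)
```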
| Usage | Description | Instance Type |
| --- | --- | --- |
| AWS Neuron samples for SageMaker | SageMaker samples that use ml.inf2 and ml.trn1 instances for machine learning (ML) inference workloads on the AWS Inferentia2 and Trainium accelerators. | Inf2 & Trn1 |

Getting Help

If you encounter issues with any of the samples in this repository, please open an issue via the GitHub Issues feature.

Contributing

Please refer to the CONTRIBUTING document for details on contributing additional samples to this repository.

Release Notes

Please refer to the Change Log.

Known Issues

| Model | Framework | Training/Inference | Instance Type | Status |
| --- | --- | --- | --- | --- |
| Fairseq | PyTorch | Inference | Inf1 | RuntimeError: No operations were successfully partitioned and compiled to neuron for this model - aborting trace! |
| Yolof | PyTorch | Inference | Inf1 | RuntimeError: No operations were successfully partitioned and compiled to neuron for this model - aborting trace! |