
v0.1.0: AMD Instinct GPUs, Ryzen AI preliminary support

@fxmarty fxmarty released this 06 Dec 10:20
· 58 commits to main since this release

We are glad to release the first version of Optimum-AMD, extending Hugging Face library support to AMD ROCm GPUs and Ryzen AI laptops. More to come in the coming weeks!

RyzenAIModelForImageClassification for Ryzen AI NPU

Optimum-AMD lets you leverage the Ryzen AI NPU (Neural Processing Unit) for faster local image classification through the RyzenAIModelForImageClassification class. Check out the documentation for more details!
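A minimal sketch of what inference looks like, assuming a Ryzen AI laptop with the Vitis AI drivers installed; the model ID and config file name below are illustrative placeholders, not real checkpoints:

```python
# Sketch: image classification on the Ryzen AI NPU via Optimum-AMD.
# Assumes an NPU-compatible ONNX checkpoint and a Vitis AI runtime
# configuration file; both names here are hypothetical.
from PIL import Image
from transformers import AutoFeatureExtractor
from optimum.amd.ryzenai import RyzenAIModelForImageClassification

model_id = "some-org/resnet50-ryzenai"  # hypothetical checkpoint
model = RyzenAIModelForImageClassification.from_pretrained(
    model_id,
    vaip_config="vaip_config.json",  # NPU runtime configuration file
)
feature_extractor = AutoFeatureExtractor.from_pretrained(model_id)

image = Image.open("cat.png")
inputs = feature_extractor(images=image, return_tensors="pt")
outputs = model(**inputs)
predicted_label = outputs.logits.argmax(-1).item()
```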

amdrun, a torchrun wrapper that dispatches jobs to the optimal GPUs

When using multiple GPUs that need to communicate (tensor parallel, data parallel, etc.), the choice of devices is crucial for optimal performance. The amdrun command line tool, shipped with Optimum-AMD, automatically dispatches a torchrun job on a single node to the optimal devices:

amdrun --ngpus <num_gpus> <script> <script_args>

ONNX Runtime ROCMExecutionProvider support

The Optimum ONNX Runtime integration supports ROCm natively: https://huggingface.co/docs/optimum/onnxruntime/usage_guides/amdgpu
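A short sketch of ONNX Runtime inference on a ROCm GPU, assuming the ROCm build of ONNX Runtime is installed; the model ID is only an example of an ONNX-exportable checkpoint:

```python
# Sketch: ONNX Runtime inference on a ROCm GPU via Optimum.
# Assumes onnxruntime built with ROCm support is installed.
from optimum.onnxruntime import ORTModelForSequenceClassification
from transformers import AutoTokenizer

model_id = "distilbert-base-uncased-finetuned-sst-2-english"
# export=True converts the PyTorch checkpoint to ONNX on the fly;
# provider selects the ROCm execution provider instead of CPU/CUDA.
model = ORTModelForSequenceClassification.from_pretrained(
    model_id, export=True, provider="ROCMExecutionProvider"
)
tokenizer = AutoTokenizer.from_pretrained(model_id)

inputs = tokenizer("ROCm inference works!", return_tensors="np")
logits = model(**inputs).logits
```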

Text Generation Inference library for LLM inference supports ROCm

Text Generation Inference supports ROCm natively: https://huggingface.co/docs/text-generation-inference/quicktour
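One way to try it is through the ROCm-enabled TGI Docker image; this launch command is a sketch, and the image tag and model ID are illustrative:

```shell
# Sketch: serving an LLM with Text Generation Inference on ROCm.
# /dev/kfd and /dev/dri expose the AMD GPUs to the container;
# the image tag and model ID below are illustrative.
docker run --device /dev/kfd --device /dev/dri \
  -v $PWD/data:/data -p 8080:80 \
  ghcr.io/huggingface/text-generation-inference:latest-rocm \
  --model-id mistralai/Mistral-7B-v0.1
```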

GPTQ quantization support

The AutoGPTQ library supports ROCm natively: https://github.com/PanQiWei/AutoGPTQ#quick-installation
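A sketch of loading a GPTQ-quantized checkpoint on a ROCm GPU through Transformers, which dispatches to AutoGPTQ kernels under the hood; the model ID is one example of a GPTQ checkpoint on the Hub:

```python
# Sketch: GPTQ-quantized inference on a ROCm GPU.
# Assumes auto-gptq built for ROCm is installed; the quantization
# config is read from the checkpoint automatically.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "TheBloke/Llama-2-7B-GPTQ"  # example GPTQ checkpoint
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")
tokenizer = AutoTokenizer.from_pretrained(model_id)

inputs = tokenizer("Hello", return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=20)
print(tokenizer.decode(output[0]))
```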

Flash Attention 2 support for ROCm

Transformers natively supports Flash Attention 2 on ROCm: https://huggingface.co/docs/transformers/main/en/perf_infer_gpu_one#flashattention-2
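Enabling it is a one-line change at load time; this sketch assumes a flash-attn build for ROCm is installed and uses a gated checkpoint purely as an example:

```python
# Sketch: enabling Flash Attention 2 on a ROCm GPU in Transformers.
# Flash Attention 2 requires half precision (fp16 or bf16).
import torch
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-2-7b-hf",  # illustrative (gated) checkpoint
    torch_dtype=torch.float16,
    attn_implementation="flash_attention_2",
    device_map="auto",
)
```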

Full Changelog: https://github.com/huggingface/optimum-amd/commits/v0.1.0