Mamba SSM operations not able to compile for Inferentia #1111

sophieker · 2025-02-12T18:28:50Z

Model we are trying to compile for Inferentia:
https://github.com/MCG-NJU/VFIMamba

We are running into compilation issues when the model uses mamba_ssm, specifically this import (line 5 of model/feature_extractor.py):
from mamba_ssm.ops.selective_scan_interface import selective_scan_fn, selective_scan_ref

selective_scan_fn relies on CUDA. selective_scan_ref performs the same operation without CUDA but is significantly slower, so we would prefer to compile with selective_scan_fn. Additionally, selective_scan_ref contains conditional logic that conflicts with Inferentia compilation.
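For context, here is a minimal sketch of the recurrence that both functions compute, as we understand it from the reference implementation, written for a single channel in dependency-free Python. The name selective_scan_1d and its argument layout are illustrative assumptions and not part of mamba_ssm. Optional features (delta bias, softplus on delta, the z gate) are left out so that the loop body has no data-dependent branches.

```python
import math

def selective_scan_1d(u, delta, A, B, C, D=0.0):
    """Illustrative single-channel selective scan (hypothetical helper, not mamba_ssm API).

    u, delta: lists of L floats (input and step size per time step)
    A:        list of N floats (diagonal state matrix)
    B, C:     lists of L lists of N floats (input-dependent projections)
    D:        float skip-connection weight
    """
    n = len(A)
    h = [0.0] * n  # hidden state, one entry per state dimension
    ys = []
    for t in range(len(u)):
        # h_t = exp(delta_t * A) * h_{t-1} + delta_t * B_t * u_t
        h = [math.exp(delta[t] * A[i]) * h[i] + delta[t] * B[t][i] * u[t]
             for i in range(n)]
        # y_t = <h_t, C_t> + D * u_t
        ys.append(sum(h[i] * C[t][i] for i in range(n)) + D * u[t])
    return ys
```

With A = [0.0] and delta, B, C all equal to 1, the state does not decay and the output reduces to a running sum of u, which is a quick sanity check on the recurrence.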

We tried to follow this tutorial on creating a custom NKI kernel for Mamba, but we were not able to adapt it to this model:
https://awsdocs-neuron.readthedocs-hosted.com/en/latest/general/nki/tutorials/fused_mamba.html
