Here are my ideas:
optimum.onnxruntime.ORTTrainer
Already implemented.
Flash Attention - RMSNorm
Install the fused kernel:
pip install 'git+https://github.com/Dao-AILab/flash-attention.git#egg=dropout_layer_norm&subdirectory=csrc/layer_norm'
Then set flash_attn_rms_norm: true in the config. Speedup: 15-20%.
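As a minimal sketch of where the flag would go, assuming a YAML training config; the other keys here (base_model, flash_attention) are placeholders I've assumed for context, not part of the suggestion itself:

# hypothetical config excerpt; only flash_attn_rms_norm is the new option
base_model: meta-llama/Llama-2-7b-hf   # placeholder model
flash_attention: true                  # assumed existing flag enabling flash attention
flash_attn_rms_norm: true              # use the fused RMSNorm kernel from csrc/layer_norm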
Flash Attention - Cross Entropy
Install the fused loss kernel:
pip install 'git+https://github.com/Dao-AILab/flash-attention.git#egg=xentropy_cuda_lib&subdirectory=csrc/xentropy'
Then set flash_attn_cross_entropy: true in the config. Speedup: 0%.
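Similarly, a sketch for the cross-entropy flag; again the surrounding key is assumed rather than taken from the original issue:

# hypothetical config excerpt; only flash_attn_cross_entropy is the new option
flash_attention: true                  # assumed existing flag enabling flash attention
flash_attn_cross_entropy: true         # use the fused cross-entropy kernel from csrc/xentropy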