
Ideas for faster training #742

Open · casper-hansen opened this issue Oct 18, 2023 · 0 comments
Labels: enhancement (New feature or request)

casper-hansen (Collaborator) commented Oct 18, 2023

Here are my ideas:

  1. Fuse layers
  2. ONNX Runtime (ORT), full fine-tuning only, via optimum.onnxruntime.ORTTrainer; see the sketch below
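
For idea 2, a minimal sketch of what the switch could look like, assuming an ordinary transformers fine-tuning setup: optimum's ORTTrainer is designed as a drop-in replacement for transformers.Trainer. The `model` and `train_dataset` variables are placeholders, not defined here, and the argument values are illustrative.

```python
# Hedged sketch: optimum's ORTTrainer as a drop-in for transformers.Trainer,
# with ONNX Runtime executing the training graph.
# `model` and `train_dataset` are placeholders from a normal fine-tuning setup.
from optimum.onnxruntime import ORTTrainer, ORTTrainingArguments

training_args = ORTTrainingArguments(
    output_dir="./ort-output",          # illustrative values
    per_device_train_batch_size=8,
    num_train_epochs=3,
)

trainer = ORTTrainer(
    model=model,
    args=training_args,
    train_dataset=train_dataset,
)
trainer.train()
```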

Already implemented

Flash Attention - RMSNorm

Set flash_attn_rms_norm: true in your config after installing the kernel below. Speedup: 15-20%.

```bash
pip install 'git+https://github.com/Dao-AILab/flash-attention.git#egg=dropout_layer_norm&subdirectory=csrc/layer_norm'
```
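
For reference, a hedged sketch of what the fused kernel provides, assuming the layer_norm extension exposes flash_attn.ops.rms_norm.RMSNorm (axolotl normally patches this in for you when the config flag is set). The hidden size of 4096 is just an illustrative value.

```python
# Hedged sketch: flash-attn's fused RMSNorm as a drop-in norm layer.
# Assumption: the layer_norm extension above backs flash_attn.ops.rms_norm;
# with flash_attn_rms_norm: true, axolotl applies this patch automatically.
import torch
from flash_attn.ops.rms_norm import RMSNorm

norm = RMSNorm(4096, eps=1e-5).to("cuda", dtype=torch.float16)
x = torch.randn(2, 128, 4096, device="cuda", dtype=torch.float16)
y = norm(x)  # same shape as x, normalized by the fused CUDA kernel
```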

Flash Attention - Cross Entropy

Set flash_attn_cross_entropy: true in your config after installing the kernel below. Speedup: 0% (no measurable gain).

```bash
pip install 'git+https://github.com/Dao-AILab/flash-attention.git#egg=xentropy_cuda_lib&subdirectory=csrc/xentropy'
```
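
Likewise, a hedged sketch of the fused loss, assuming the xentropy extension backs flash_attn.losses.cross_entropy.CrossEntropyLoss (again, axolotl wires this in when the flag is set). The vocabulary size of 32000 is illustrative.

```python
# Hedged sketch: flash-attn's fused cross entropy as a drop-in for
# torch.nn.CrossEntropyLoss. Assumption: the xentropy extension above
# backs flash_attn.losses.cross_entropy.CrossEntropyLoss.
import torch
from flash_attn.losses.cross_entropy import CrossEntropyLoss

logits = torch.randn(8, 32000, device="cuda", dtype=torch.float16)
labels = torch.randint(0, 32000, (8,), device="cuda")

loss = CrossEntropyLoss()(logits, labels)
```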