Enable CK Attention for Navi31 #285

Open
hyoon1 wants to merge 6 commits into base: develop

hyoon1 commented Nov 18, 2024

- Enables CK Attention for Navi31.
- Requires this branch of Flash Attention: https://github.com/ROCm/flash-attention/tree/howiejay/navi_support

hyoon1 requested review from maleksan85 and gshtras November 18, 2024 23:27

gshtras (Collaborator) left a comment

Is this similar to #281? Can the two efforts be combined?

Please also try to simplify the conditions. With this level of nested elifs, the logic is hard to follow.
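
As an illustration of the kind of simplification being asked for, here is a minimal sketch that replaces nested branches with guard clauses. It is not the code in this PR: the function name `select_attention_backend`, the `is_navi` and `use_triton` parameters, and the returned backend labels are all hypothetical; only `flash_attn_available` is taken from the diff.

```python
def select_attention_backend(
    use_triton: bool,
    flash_attn_available: bool,
    is_navi: bool,
) -> str:
    """Pick a backend label with one flat check per case (hypothetical labels)."""
    # Each guard returns immediately, so no branch has to be nested in another.
    if use_triton:
        return "triton"
    if not flash_attn_available:
        return "naive"
    if is_navi:
        return "ck_navi"
    return "ck"
```

Every case sits at the same indentation level, so adding or removing a device-specific path touches a single guard instead of a branch buried several levels deep.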

from flash_attn import flash_attn_varlen_func # noqa: F401
self.attn_func = flash_attn_varlen_func
if flash_attn_available:
if current_platform.has_device_capability(110):
Collaborator

This check isn't equivalent to is_navi.
On CUDA, device_capability is meant to increase with each new architecture and to differentiate devices by the features they support, such as FP8.
On ROCm, it is the first digit of the gfx name * 10 + the second digit, which doesn't mean much, especially for future architectures.
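
The difference can be sketched as follows. This is an illustration, not vLLM code: the helper names are hypothetical, the at-least comparison is an assumption about how a capability check such as `has_device_capability(110)` behaves, and the `major * 10 + minor` encoding is assumed from the fact that gfx1100 (Navi31) corresponds to 110.

```python
def capability_to_int(major: int, minor: int) -> int:
    """Fold a (major, minor) capability pair into one integer, e.g. (11, 0) -> 110."""
    return major * 10 + minor


def passes_capability_check(major: int, minor: int, threshold: int = 110) -> bool:
    """Assumed semantics of a capability check: at-least comparison with a threshold."""
    return capability_to_int(major, minor) >= threshold


def is_navi31(gcn_arch_name: str) -> bool:
    """Match the architecture name itself; Navi31 reports as gfx1100."""
    # Arch strings can carry feature suffixes such as ":sramecc+:xnack-",
    # so only the base name before the first colon is compared.
    return gcn_arch_name.split(":")[0] == "gfx1100"
```

Under these assumptions a later architecture such as gfx1200 would pass the threshold check without being Navi31, while an explicit name match rejects it. On a ROCm build of PyTorch the architecture string is typically read from `torch.cuda.get_device_properties(0).gcnArchName`; treat that as an assumption to verify against the installed version.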

Author

updated

hyoon1 (Author) commented Dec 17, 2024

> Is this similar to #281
> Can the 2 efforts be combined?
>
> Please also try to simplify the conditions, with this level of nested elseifs it's hard to follow the logic.

PR #281 is specific to one particular vision model and does not touch the code path used by general LLM models, so combining the two seems difficult. I have simplified the condition so that its nesting depth matches the existing code.

hyoon1 requested a review from gshtras December 19, 2024 17:22