feat: support fused_bias_residual_activation for medusa #1199

Closed
wants to merge 2 commits into from

zhyncs
Collaborator

@zhyncs zhyncs commented Feb 27, 2024

Motivation

To support the ResBlock in FasterDecoding/Medusa (https://github.com/FasterDecoding/Medusa/blob/700ff848f4cbfc66dfc2da30485130d64904241f/medusa/model/medusa_model.py#L43-L72), @b4b4o and I implemented fused_bias_residual_activation by porting the implementation from FasterTransformer (https://github.com/NVIDIA/FasterTransformer/blob/df4a7534860137e060e18d2ebf019906120ea204/src/fastertransformer/kernels/activation_kernels.cu#L166-L284). Because this kernel is relatively independent, we are submitting it first as a separate pull request.

Hi all @lvhan028 @lzhangzz @grimoire @irexyc, we have verified the accuracy of the entire ResBlock unit in a unit test. That code will be included in an upcoming PR, which is still being refined. Do you have any feedback or suggestions for this PR? Thanks.

Modification

Add fused_bias_residual_activation to activation_kernels in src/turbomind/kernels.
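
For review purposes, the element-wise behaviour we expect from the kernel can be sketched as a plain-Python reference. This is only a sketch under stated assumptions, not the code in this PR: it assumes the activation is SiLU (the linked Medusa ResBlock computes `x + act(linear(x))`), that the bias is broadcast per hidden column, and that inputs are lists of rows. The name `fused_bias_residual_activation_ref` is hypothetical and exists only for illustration.

```python
import math


def silu(v: float) -> float:
    """SiLU activation, v * sigmoid(v), written to avoid overflow in exp()."""
    if v >= 0.0:
        return v / (1.0 + math.exp(-v))
    e = math.exp(v)  # v < 0, so e is in (0, 1) and cannot overflow
    return v * e / (1.0 + e)


def fused_bias_residual_activation_ref(out, bias, residual):
    """Hypothetical CPU reference (assumed semantics, not the PR's kernel).

    result[i][j] = residual[i][j] + silu(out[i][j] + bias[j])

    out, residual: lists of rows, each of length len(bias).
    bias: one value per hidden column.
    """
    hidden = len(bias)
    result = []
    for out_row, res_row in zip(out, residual):
        if len(out_row) != hidden or len(res_row) != hidden:
            raise ValueError("row length must match len(bias)")
        result.append(
            [r + silu(o + b) for o, b, r in zip(out_row, bias, res_row)]
        )
    return result
```

A reference like this could be compared against the GPU output within a tolerance suited to the data type (for example a looser bound for half precision than for float).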

Checklist

  1. Pre-commit or other linting tools are used to fix the potential lint issues.
  2. The modification is covered by complete unit tests. If not, please add more unit tests to ensure the correctness.
  3. If the modification has a dependency on downstream projects of a newer version, this PR should be tested with all supported versions of downstream projects.
  4. The documentation has been modified accordingly, like docstring or example tutorials.

@zhyncs
Collaborator Author

zhyncs commented Feb 27, 2024

Hi @lvhan028, could you help re-trigger the unit-test workflow?

2024-02-27T06:41:57.4645820Z W: Failed to fetch http://archive.ubuntu.com/ubuntu/dists/focal/universe/binary-amd64/Packages  502  Bad Gateway [IP: 10.1.8.50 33128]
2024-02-27T06:41:57.4647771Z W: Failed to fetch http://archive.ubuntu.com/ubuntu/dists/focal/main/binary-amd64/Packages  502  Bad Gateway [IP: 10.1.8.50 33128]
2024-02-27T06:41:57.4649433Z W: Failed to fetch http://archive.ubuntu.com/ubuntu/dists/focal-updates/universe/binary-amd64/Packages  502  Bad Gateway [IP: 10.1.8.50 33128]
2024-02-27T06:41:57.4651166Z W: Failed to fetch http://archive.ubuntu.com/ubuntu/dists/focal-updates/restricted/binary-amd64/Packages  502  Bad Gateway [IP: 10.1.8.50 33128]
2024-02-27T06:41:57.4652848Z W: Failed to fetch http://archive.ubuntu.com/ubuntu/dists/focal-updates/main/binary-amd64/Packages  502  Bad Gateway [IP: 10.1.8.50 33128]
2024-02-27T06:41:57.4654305Z W: Some index files failed to download. They have been ignored, or old ones used instead.
2024-02-27T06:41:57.7085901Z Reading package lists...
2024-02-27T06:41:57.7462058Z Building dependency tree...
2024-02-27T06:41:57.7464031Z Reading state information...
2024-02-27T06:41:57.7528857Z E: Unable to locate package rapidjson-dev
2024-02-27T06:41:57.7529450Z E: Unable to locate package libgoogle-glog-dev
2024-02-27T06:41:57.7594874Z ##[error]Process completed with exit code 100.

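Based on the error log, the failure was not caused by the code in this PR: the package mirror returned 502 Bad Gateway, so apt could not locate rapidjson-dev and libgoogle-glog-dev.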

@zhyncs
Collaborator Author

zhyncs commented Feb 28, 2024

Closing this PR; please refer to #1213 instead.

@zhyncs zhyncs closed this Feb 28, 2024
@zhyncs zhyncs deleted the medusa-kernel branch February 28, 2024 16:48