
Refactor/cubecl fusion #2815

Merged: 19 commits into main from refactor/cubecl-fusion, Feb 17, 2025
Conversation

@nathanielsimard (Member) commented Feb 14, 2025

Improve Burn's compilation time significantly :)

  • Moved backend re-exports to the root burn crate, so we don't need to recompile burn-core when working on a backend (better caching).
  • Created a new crate, burn-cubecl-fusion, that exports all optimizations used by burn-cubecl when fusion is activated.
  • Refactored all fusion optimizations to use a dynamic element type, avoiding Rust generics.

Overall, the fusion part of burn-cubecl went from 41s of compilation time down to 6.5s! burn-cubecl itself now takes around 10s and compiles in parallel with burn-cubecl-fusion.
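As a rough sketch of the first bullet (hypothetical; the exact feature names and re-export paths in the burn crate may differ), moving backend re-exports to the root crate means the backends depend on burn-core rather than the other way around, so touching a backend no longer invalidates burn-core's build cache:

    // crates/burn/src/lib.rs (hypothetical sketch)
    // Each backend is re-exported behind a feature flag. The backend
    // crates depend on burn-core, so editing one of them recompiles
    // only that backend and the thin root crate, not burn-core.
    #[cfg(feature = "wgpu")]
    pub use burn_wgpu as wgpu;

    #[cfg(feature = "ndarray")]
    pub use burn_ndarray as ndarray;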

@laggui (Member) left a comment:

Apart from what was discussed offline, just some minor comments.

Approving in advance.

Review comment (Member):

Any reason why this is alone in a new file? Setting the table for incoming stuff? 👀

We can probably remove this file otherwise

Comment on lines 156 to 182
// #[derive(CubeLaunch, Default)]
// /// Global arguments that are used for fusing [element wise operations](ElemwiseOp).
// pub struct GlobalArgs {
//     pub t_f32: Sequence<Tensor<Line<f32>>>,
//     pub t_f16: Sequence<Tensor<Line<f16>>>,
//     pub t_bf16: Sequence<Tensor<Line<bf16>>>,
//     pub t_i64: Sequence<Tensor<Line<i64>>>,
//     pub t_i32: Sequence<Tensor<Line<i32>>>,
//     pub t_i16: Sequence<Tensor<Line<i16>>>,
//     pub t_i8: Sequence<Tensor<Line<i8>>>,
//     pub t_u64: Sequence<Tensor<Line<u64>>>,
//     pub t_u32: Sequence<Tensor<Line<u32>>>,
//     pub t_u16: Sequence<Tensor<Line<u16>>>,
//     pub t_u8: Sequence<Tensor<Line<u8>>>,
//     pub s_f32: Sequence<f32>,
//     pub s_f16: Sequence<f16>,
//     pub s_bf16: Sequence<bf16>,
//     pub s_i64: Sequence<i64>,
//     pub s_i32: Sequence<i32>,
//     pub s_i16: Sequence<i16>,
//     pub s_i8: Sequence<i8>,
//     pub s_u64: Sequence<u64>,
//     pub s_u32: Sequence<u32>,
//     pub s_u16: Sequence<u16>,
//     pub s_u8: Sequence<u8>,
// }

Review comment (Member):

Dead code

Comment on lines 201 to 228
// impl<R: Runtime> Default for GlobalArgsLaunch<'_, R> {
//     fn default() -> Self {
//         Self::new(
//             Default::default(),
//             Default::default(),
//             Default::default(),
//             Default::default(),
//             Default::default(),
//             Default::default(),
//             Default::default(),
//             Default::default(),
//             Default::default(),
//             Default::default(),
//             Default::default(),
//             Default::default(),
//             Default::default(),
//             Default::default(),
//             Default::default(),
//             Default::default(),
//             Default::default(),
//             Default::default(),
//             Default::default(),
//             Default::default(),
//             Default::default(),
//             Default::default(),
//         )
//     }
// }

Review comment (Member):

Dead code

Comment on lines +186 to +187
pub tensors: Sequence<GlobalTensor>,
pub scalars: Sequence<GlobalScalar>,

Review comment (Member):

That's clean 👌
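
These two fields are where the dynamic element type refactor shows up: a single sequence of type-erased tensors and scalars replaces the per-type sequences of the old GlobalArgs quoted above. A minimal sketch of what such a type-erased scalar could look like (the enum layout is an assumption; the actual GlobalScalar in burn-cubecl-fusion may differ):

    // Hypothetical sketch: one enum covers every element type, so
    // GlobalArgs needs a single Sequence<GlobalScalar> instead of the
    // eleven s_* fields in the dead code flagged above, and adding an
    // element type no longer grows the struct or the kernel signature.
    use half::{bf16, f16};

    pub enum GlobalScalar {
        F32(f32),
        F16(f16),
        BF16(bf16),
        I64(i64),
        I32(i32),
        I16(i16),
        I8(i8),
        U64(u64),
        U32(u32),
        U16(u16),
        U8(u8),
    }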


codecov bot commented Feb 17, 2025

Codecov Report

Attention: Patch coverage is 47.34940% with 874 lines in your changes missing coverage. Please review.

Project coverage is 82.23%. Comparing base (136eeb6) to head (bbcb438).
Report is 1 commit behind head on main.

Files with missing lines                                 Patch %   Missing lines
crates/burn-cubecl-fusion/src/on_write/kernel.rs          14.68%             430
crates/burn-cubecl-fusion/src/on_write/io.rs              54.22%             157
crates/burn-cubecl-fusion/src/on_write/tensor.rs          49.37%              81
.../burn-cubecl-fusion/src/on_write/trace/executor.rs     66.66%              60
crates/burn-cubecl-fusion/src/matmul/args.rs               0.00%              39
...es/burn-cubecl-fusion/src/elemwise/optimization.rs     60.52%              30
crates/burn-cubecl-fusion/src/base.rs                     69.14%              29
crates/burn-cubecl-fusion/src/on_write/ir.rs              78.78%              14
...ates/burn-cubecl-fusion/src/matmul/optimization.rs     71.42%              10
crates/burn-cubecl/src/fusion.rs                          85.48%               9
... and 5 more
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #2815      +/-   ##
==========================================
+ Coverage   81.70%   82.23%   +0.52%     
==========================================
  Files         852      853       +1     
  Lines      114337   113454     -883     
==========================================
- Hits        93424    93296     -128     
+ Misses      20913    20158     -755     


@nathanielsimard merged commit f3dfd05 into main on Feb 17, 2025. 11 checks passed.
@nathanielsimard deleted the refactor/cubecl-fusion branch on February 17, 2025 at 22:39.