Refactor/cubecl fusion #2815
Conversation
Apart from what was discussed offline, just some minor comments.
Approving in advance.
Any reason why this is alone in a new file? Setting the table for incoming stuff? 👀
We can probably remove this file otherwise
// #[derive(CubeLaunch, Default)]
// /// Global arguments that are used for fusing [element wise operations](ElemwiseOp).
// pub struct GlobalArgs {
//     pub t_f32: Sequence<Tensor<Line<f32>>>,
//     pub t_f16: Sequence<Tensor<Line<f16>>>,
//     pub t_bf16: Sequence<Tensor<Line<bf16>>>,
//     pub t_i64: Sequence<Tensor<Line<i64>>>,
//     pub t_i32: Sequence<Tensor<Line<i32>>>,
//     pub t_i16: Sequence<Tensor<Line<i16>>>,
//     pub t_i8: Sequence<Tensor<Line<i8>>>,
//     pub t_u64: Sequence<Tensor<Line<u64>>>,
//     pub t_u32: Sequence<Tensor<Line<u32>>>,
//     pub t_u16: Sequence<Tensor<Line<u16>>>,
//     pub t_u8: Sequence<Tensor<Line<u8>>>,
//     pub s_f32: Sequence<f32>,
//     pub s_f16: Sequence<f16>,
//     pub s_bf16: Sequence<bf16>,
//     pub s_i64: Sequence<i64>,
//     pub s_i32: Sequence<i32>,
//     pub s_i16: Sequence<i16>,
//     pub s_i8: Sequence<i8>,
//     pub s_u64: Sequence<u64>,
//     pub s_u32: Sequence<u32>,
//     pub s_u16: Sequence<u16>,
//     pub s_u8: Sequence<u8>,
// }
Dead code
// impl<R: Runtime> Default for GlobalArgsLaunch<'_, R> {
//     fn default() -> Self {
//         Self::new(
//             Default::default(),
//             Default::default(),
//             Default::default(),
//             Default::default(),
//             Default::default(),
//             Default::default(),
//             Default::default(),
//             Default::default(),
//             Default::default(),
//             Default::default(),
//             Default::default(),
//             Default::default(),
//             Default::default(),
//             Default::default(),
//             Default::default(),
//             Default::default(),
//             Default::default(),
//             Default::default(),
//             Default::default(),
//             Default::default(),
//             Default::default(),
//             Default::default(),
//         )
//     }
// }
Dead code
pub tensors: Sequence<GlobalTensor>,
pub scalars: Sequence<GlobalScalar>,
That's clean 👌
Improve Burn's compilation time significantly :)

- Backends are now imported in the `burn` crate, so we don't need to recompile `burn-core` when working on a backend (better caching).
- New crate `burn-cubecl-fusion`, that exports all optimizations used by `burn-cubecl` when fusion is activated.

Overall, the fusion part of `burn-cubecl` went from 41s compilation time to 6.5s! `burn-cubecl` now takes around 10s and compiles in parallel with `burn-cubecl-fusion`.