Drafting polymorphic value support for dtensors. #3940
Conversation
I think it'll help if I do https://github.com/NVIDIA/Fuser/blob/main/csrc/python_frontend/fusion_definition.h#L210. I can do that today. I didn't do that because I like to bundle up the mesh, shardings, and tensor for clarity. Can you remind me why returning PValue all the way to the Python frontend helps latency? Thunder talks in at::Tensor, so at some point we'll have to extract the at::Tensor out of the PValue.
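For what it's worth, here's a minimal sketch of that boundary extraction. All names here (`FakeTensor`, `PValue`, `extractTensor`) are hypothetical stand-ins — the real `PolymorphicValue` wraps `at::Tensor` and lives in the nvFuser codebase — but the shape of the problem is the same: the tensor payload has to be pulled out of the variant before handing it to a caller that only speaks tensors:

```cpp
#include <cassert>
#include <cstdint>
#include <variant>
#include <vector>

// FakeTensor stands in for at::Tensor so this sketch has no ATen dependency.
struct FakeTensor {
  std::vector<int64_t> sizes;
};

// Minimal model of a PolymorphicValue-style tagged union.
using PValue = std::variant<std::monostate, int64_t, double, FakeTensor>;

// Boundary helper: a caller like Thunder only consumes tensors, so the
// tensor payload is extracted from the PValue at the frontend boundary.
inline const FakeTensor& extractTensor(const PValue& v) {
  return std::get<FakeTensor>(v);
}
```

The extraction cost is a variant access, so the latency question is really about where in the stack that unwrap happens, not whether it happens.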
Was giving this a quick shot but ran into some difficulty. Just posting as a heads-up for @wujingyue @syed-ahmed.
I would prefer not to modify polymorphic_value.h so heavily (it looks like I also removed some functions that are needed). To avoid modifying PolymorphicValue so much, we'd need a simpler DistributedTensor implementation that doesn't depend on DeviceMesh and ParallelType.
Otherwise we just end up with circular dependencies with type.h.
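One common way to break that kind of header cycle (a sketch with made-up names, not the actual nvFuser headers) is to keep only a forward declaration and an owning pointer in polymorphic_value.h, so the full DeviceMesh / ParallelType definitions from type.h are needed only in the .cpp:

```cpp
#include <cassert>
#include <cstdint>
#include <memory>
#include <vector>

// --- what polymorphic_value.h would see: a forward declaration only ---
struct DistributedTensorData;  // full definition not required here

struct DistributedTensor {
  // Opaque handle: polymorphic_value.h never needs DeviceMesh/ParallelType.
  std::shared_ptr<DistributedTensorData> data;
};

// --- what would live next to type.h (or in the .cpp) ---
struct DeviceMeshSketch {
  std::vector<int64_t> shape;  // hypothetical stand-in for DeviceMesh
};

struct DistributedTensorData {
  DeviceMeshSketch mesh;
  // ... local tensor, per-axis parallel types, etc. would go here
};

inline DistributedTensor makeDistributedTensor(DeviceMeshSketch mesh) {
  return DistributedTensor{std::make_shared<DistributedTensorData>(
      DistributedTensorData{std::move(mesh)})};
}
```

`std::shared_ptr` works with an incomplete type at the declaration site, which is what lets the header stay ignorant of the payload's definition.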
A bit more context:
I intend to change all return types of nvFuser to KernelArgumentHolder, which holds a vector of PolymorphicValue. So if we want to return distributed tensors, they have to work with PolymorphicValue.
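A rough sketch of that shape (simplified, hypothetical names; the real KernelArgumentHolder and PolymorphicValue live in the nvFuser codebase): results travel as a flat vector of variant values, so supporting distributed tensors reduces to adding one more alternative to the variant, which every API returning the holder then picks up automatically:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <variant>
#include <vector>

// Hypothetical simplified payloads.
struct TensorArg {
  std::vector<int64_t> sizes;
};
struct DistributedTensorArg {
  TensorArg local;              // the per-rank local tensor
  std::vector<int64_t> mesh;    // stand-in for the device mesh shape
};

// PolymorphicValue-like variant: distributed tensors are one more alternative.
using PValue = std::variant<std::monostate, int64_t, double, TensorArg,
                            DistributedTensorArg>;

// KernelArgumentHolder-like container: a flat list of PValues.
struct ArgHolder {
  std::vector<PValue> args;
  void push(PValue v) { args.push_back(std::move(v)); }
  std::size_t size() const { return args.size(); }
};
```

Under this scheme, scalar, tensor, and distributed-tensor results all flow through the same return path, and only the boundary code that unwraps the variant needs to know about the new alternative.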