Alternative approach to support torch.compile #1006

Merged 25 commits on Sep 23, 2024

@Giuseppe5 (Collaborator) commented Aug 23, 2024

This approach assumes that most of the quantization process has already taken place, so QuantTensors no longer need to be propagated.

Typical usage:

with inference_mode(model):
    model(input)
    compile_model = torch.compile(model, fullgraph=True)
    # The rest of the computation under compile goes here.
    # Once out of the context manager, the original model must be used.
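
To make the pattern concrete, here is a toy sketch in plain Python. It is not Brevitas code: `ToyQuantLayer` and `toy_inference_mode` are hypothetical names invented for illustration. It only shows the general idea of swapping in a path that returns plain values inside a context manager and restoring the original behaviour on exit. It does not model the initial forward pass or `torch.compile` from the usage above.

```python
from contextlib import contextmanager


class ToyQuantLayer:
    """Hypothetical stand-in for a quantized layer (not a Brevitas class)."""

    def __init__(self, scale, bit_width=8):
        self.scale = scale
        self.bit_width = bit_width
        self.fast_path = None  # only set while inside toy_inference_mode

    def slow_forward(self, x):
        # Default path: return the value together with its quantization
        # metadata, loosely mimicking a QuantTensor.
        q = round(x / self.scale)
        return {"value": q * self.scale, "scale": self.scale, "bit_width": self.bit_width}

    def __call__(self, x):
        if self.fast_path is not None:
            return self.fast_path(x)
        return self.slow_forward(x)


@contextmanager
def toy_inference_mode(layer):
    # Freeze the scale once and expose a path that returns a plain number,
    # so no metadata has to be propagated between layers.
    scale = layer.scale
    layer.fast_path = lambda x: round(x / scale) * scale
    try:
        yield layer
    finally:
        # Restore the original behaviour when leaving the context.
        layer.fast_path = None
```

Inside `with toy_inference_mode(layer):` a call such as `layer(1.3)` returns a plain float; outside the block the same call returns the metadata-carrying dictionary again.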

@Giuseppe5 force-pushed the alternative_compile branch from d9a29b1 to 73dc0cf on August 25, 2024, 12:53

@nickfraser (Collaborator) left a comment

This needs tests!

src/brevitas/core/function_wrapper/clamp.py (resolved)

        self.max_clamp = max_int(module.is_signed, module.is_narrow_range, self.bit_width)

    def quantize(self, x):
        return torch.clamp(

@nickfraser (Collaborator):

It looks like these won't work with Groupwise quantization, correct? So inference_mode + MX won't work?

@Giuseppe5 (Collaborator, Author):

You're right, I forgot to add the export handler for MX INT and MX Float

@Giuseppe5 (Collaborator, Author):

Postponed to another update
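
One plausible reading of the groupwise concern in this thread, sketched in plain Python: a per-tensor `quantize` divides every element by a single scale before clamping, whereas groupwise schemes such as MX carry one scale per group of elements, so the same code path cannot be reused unchanged. Everything below is an illustrative assumption, not the handler from this PR: the `max_int` and `min_int` helpers follow the usual integer-range convention and the function names are hypothetical.

```python
def max_int(signed, narrow_range, bit_width):
    # Usual convention (assumed here): largest representable integer.
    if signed:
        return 2 ** (bit_width - 1) - 1
    if narrow_range:
        return 2 ** bit_width - 2
    return 2 ** bit_width - 1


def min_int(signed, narrow_range, bit_width):
    # Usual convention (assumed here): smallest representable integer.
    if not signed:
        return 0
    if narrow_range:
        return -(2 ** (bit_width - 1)) + 1
    return -(2 ** (bit_width - 1))


def clamp(value, lo, hi):
    return max(lo, min(hi, value))


def quantize_per_tensor(xs, scale, zero_point, lo, hi):
    # One scale and zero point shared by every element.
    return [clamp(round(x / scale + zero_point), lo, hi) for x in xs]


def quantize_groupwise(xs, scales, group_size, lo, hi):
    # One scale per group of `group_size` consecutive elements.
    return [clamp(round(x / scales[i // group_size]), lo, hi) for i, x in enumerate(xs)]
```

With a signed 4-bit range and the input `[0.25, 0.75, 4.0, 8.0]`, the per-tensor version with scale `1.0` saturates the last element at the upper bound, while the groupwise version with scales `[0.25, 2.0]` and a group size of 2 keeps all four values in range.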

src/brevitas/proxy/float_runtime_quant.py (resolved)
tests/brevitas_end_to_end/test_torchvision_models.py (outdated, resolved)
@Giuseppe5 requested review from nickfraser and removed request for nickfraser on September 17, 2024, 01:38

@nickfraser (Collaborator):

LGTM!

@Giuseppe5 merged commit b28ac0f into Xilinx:dev on Sep 23, 2024
373 of 374 checks passed
2 participants