Skip to content

Conversation

@cehongwang
Copy link
Collaborator

Description

Please include a summary of the change and which issue is fixed. Please also include relevant motivation and context. List any dependencies that are required for this change.

Fixes # (issue)

Type of change

Please delete options that are not relevant and/or add your own.

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • This change requires a documentation update

Checklist:

  • My code follows the style guidelines of this project (You can use the linters)
  • I have performed a self-review of my own code
  • I have commented my code, particularly in hard-to-understand areas and hacks
  • I have made corresponding changes to the documentation
  • I have added tests to verify my fix or my feature
  • New and existing unit tests pass locally with my changes
  • I have added the relevant labels to my PR in so that relevant reviewers are notified

@meta-cla meta-cla bot added the cla signed label Oct 16, 2025
@github-actions github-actions bot added component: tests Issues re: Tests component: lowering Issues re: The lowering / preprocessing passes component: conversion Issues re: Conversion stage component: api [Python] Issues re: Python API component: dynamo Issues relating to the `torch.compile` or `torch._dynamo.export` paths labels Oct 16, 2025
@github-actions github-actions bot requested a review from peri044 October 16, 2025 22:23
@github-actions github-actions bot added the documentation Improvements or additions to documentation label Oct 29, 2025
@cehongwang cehongwang force-pushed the cpu-memory-optimization-rebase-main branch from b9b6aeb to 51f64f0 Compare October 29, 2025 17:30

@needs_refit # type: ignore[misc]
def _insert_engine_to_cache(self, hash_val: str, serialized_engine: bytes) -> None:
def _insert_engine_to_cache(self, hash_val: str, engine: trt.ICudaEngine) -> None:
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@zewenli98 when do these calls run? will this conflict with the goal of keeping mem usage under 3x?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should we do caching in a post processing step?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Like we can give the cache entry as one of the Interpreter Result fields

settings: CompilationSettings = CompilationSettings(),
arg_inputs: Optional[Sequence[Input]] = None,
kwarg_inputs: Optional[dict[str, Any]] = None,
engine_cache: Optional[BaseEngineCache] = None,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We need to move the cache insert here

@cehongwang cehongwang force-pushed the cpu-memory-optimization-rebase-main branch from 51f64f0 to f77df5a Compare October 30, 2025 20:30
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

cla signed component: api [Python] Issues re: Python API component: conversion Issues re: Conversion stage component: dynamo Issues relating to the `torch.compile` or `torch._dynamo.export` paths component: lowering Issues re: The lowering / preprocessing passes component: tests Issues re: Tests documentation Improvements or additions to documentation

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants