Commit
Model: Add torch.inference_mode() to generator function
Speeds up the model's forward pass by disabling autograd tracking during generation.

Signed-off-by: kingbri <[email protected]>
bdashore3 committed Mar 30, 2024
1 parent e8b6a02 commit b11aac5
Showing 1 changed file with 1 addition and 0 deletions.
1 change: 1 addition & 0 deletions backends/exllamav2/model.py
@@ -648,6 +648,7 @@ async def generate_gen(
         async for value in iterate_in_threadpool(sync_generator):
             yield value

+    @torch.inference_mode()
     def generate_gen_sync(
         self, prompt: str, abort_event: Optional[threading.Event] = None, **kwargs
     ):
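
For readers unfamiliar with the decorator, here is a minimal sketch of what `@torch.inference_mode()` does when applied to a generator function such as `generate_gen_sync`. The function `forward_chunks` and the `torch.nn.Linear` stand-in model are hypothetical and are not part of this repository; the sketch assumes a PyTorch release in which the decorator supports generator functions (this is the case in current releases).

```python
import torch


@torch.inference_mode()
def forward_chunks(model, inputs):
    """Hypothetical generator: yields one model output per input tensor."""
    for x in inputs:
        # Runs with inference mode enabled, so no autograd graph is recorded.
        yield model(x)


# Stand-in model and inputs (assumptions for illustration, CPU only).
model = torch.nn.Linear(4, 2)
inputs = [torch.randn(1, 4) for _ in range(3)]

outputs = list(forward_chunks(model, inputs))

# Tensors produced inside the generator are inference tensors.
inside_flags = [out.is_inference() for out in outputs]
# They do not require grad, even though the model parameters do.
requires_grad_flags = [out.requires_grad for out in outputs]
# Inference mode is not left enabled in the caller once the generator is done.
mode_after = torch.is_inference_mode_enabled()
```

Because the decorator only enables inference mode while the generator body is executing, code that consumes the yielded values between iterations runs with the caller's own grad settings.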