Support a dynamic number of samples in continuous batching.
Prefill runs once per request and is followed by multiple inserts, one per sample.

Uses an aux field to store finished samples, and performs postprocessing once all samples for a request have been received.
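The buffering described above can be sketched roughly as follows. This is only an illustration of the idea, not the code in this commit: `RequestState`, `on_sample_done`, and the `postprocess` callback are hypothetical names, and the aux field is modelled as a plain list.

```python
from dataclasses import dataclass, field
from typing import Callable, Optional


@dataclass
class RequestState:
    """Hypothetical per-request record; not the repository's actual type."""
    num_samples: int  # how many samples this request expects
    aux: list = field(default_factory=list)  # finished samples so far


def on_sample_done(
    state: RequestState,
    sample: str,
    postprocess: Callable[[list], list],
) -> Optional[list]:
    """Store one finished sample; postprocess only when all have arrived."""
    state.aux.append(sample)
    if len(state.aux) < state.num_samples:
        return None  # still waiting on the remaining samples
    return postprocess(state.aux)
```

Under these assumptions, the caller gets `None` for every sample except the last one, at which point the postprocessed list is returned in a single step.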

Releases batching resources after each sample is done. Typically, num_live_batches should be set to num_slots // prefill_batch_size or larger.
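As a worked example of the sizing rule, the helper below computes the suggested lower bound. `min_live_batches` is a hypothetical name used only for illustration, and the input validation is an added assumption; only the `num_slots // prefill_batch_size` expression comes from the message.

```python
def min_live_batches(num_slots: int, prefill_batch_size: int) -> int:
    """Suggested lower bound for num_live_batches (hypothetical helper)."""
    if num_slots <= 0 or prefill_batch_size <= 0:
        raise ValueError("num_slots and prefill_batch_size must be positive")
    # Floor division, exactly as written in the commit message.
    return num_slots // prefill_batch_size


# With 64 slots and a prefill batch size of 4, the bound is 64 // 4.
print(min_live_batches(64, 4))  # 16
```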

Since prefill produces the first token, we require the method to provide a new function that resamples the initial tokens.
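A minimal sketch of the prefill-once, insert-many flow, assuming prefill yields a list of first-token logits and that each sample is placed in its own decode slot. The names `resample_initial_token` and `prefill_then_insert` are hypothetical, and softmax sampling stands in for whatever sampling the method actually provides.

```python
import math
import random


def resample_initial_token(logits, rng):
    """Draw one token id from softmax(logits) (assumed sampling rule)."""
    peak = max(logits)  # subtract the max for numerical stability
    weights = [math.exp(x - peak) for x in logits]
    return rng.choices(range(len(logits)), weights=weights, k=1)[0]


def prefill_then_insert(logits, num_samples, free_slots, seed=0):
    """Reuse one prefill result for num_samples decode slots."""
    rng = random.Random(seed)
    assignments = {}
    for _ in range(num_samples):
        slot = free_slots.pop()  # take a free decode slot
        # Each insert gets its own freshly sampled first token.
        assignments[slot] = resample_initial_token(logits, rng)
    return assignments
```

In this sketch the prefill output is computed once and shared, while every insert draws its own first token, so the samples for one request are not forced to begin with the same token.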

PiperOrigin-RevId: 671886877
Change-Id: Id4ddec1f99e8e13d755bbeb5343aab8ca12f688e
ukoxyz authored and copybara-github committed Sep 6, 2024
1 parent 138be5f commit 049e123
Showing 3 changed files with 187 additions and 53 deletions.