Run-time structured generation benchmarks #549

Open
lapp0 opened this issue Jan 17, 2024 · 1 comment · May be fixed by #925
@lapp0
Collaborator

lapp0 commented Jan 17, 2024

Initialization benchmarks are introduced in #542

We should extend these benchmarks to measure the performance of inference.

Goal

Outlines shouldn't be a bottleneck for most inference. A reasonable goal can be set based on

Benchmarks will help us achieve and maintain that goal.

What must be benchmarked

Proposed method

It's annoying to need a GPU to run tests. We shouldn't do actual inference in performance benchmarks.

    1. Create a mock inference engine
    2. Add a simple benchmark to ensure the unguided mock inference engine takes a negligible amount of time
    3. Add guided benchmarks that show the true throughput of Outlines
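
The three steps above could be sketched roughly as follows. This is only a minimal illustration in plain Python, not Outlines code: `MockEngine`, `generate`, `time_it`, and the `allowed_tokens` hook are all hypothetical names, and the "guide" here is a trivial even-token filter standing in for a real regex or CFG guide.

```python
import random
import time


class MockEngine:
    """Hypothetical stand-in for a model: returns random logits, no GPU needed."""

    def __init__(self, vocab_size, seed=0):
        self.vocab_size = vocab_size
        self.rng = random.Random(seed)

    def logits(self):
        # One random score per token in the vocabulary.
        return [self.rng.random() for _ in range(self.vocab_size)]


def generate(engine, n_tokens, allowed_tokens=None):
    """Greedy decoding; `allowed_tokens(step)` is a hypothetical guide hook."""
    out = []
    for step in range(n_tokens):
        scores = engine.logits()
        if allowed_tokens is None:
            candidates = range(engine.vocab_size)  # unguided: every token is allowed
        else:
            candidates = allowed_tokens(step)      # guided: only the permitted tokens
        out.append(max(candidates, key=scores.__getitem__))
    return out


def time_it(fn, repeat=5):
    """Return the best wall-clock time of `repeat` runs of `fn`."""
    best = float("inf")
    for _ in range(repeat):
        start = time.perf_counter()
        fn()
        best = min(best, time.perf_counter() - start)
    return best


engine = MockEngine(vocab_size=1000)
even_only = lambda step: range(0, 1000, 2)  # toy guide: only even token ids

unguided = time_it(lambda: generate(engine, 50))
guided = time_it(lambda: generate(engine, 50, even_only))
print(f"unguided: {unguided:.6f}s, guided: {guided:.6f}s")
```

The unguided timing gives the baseline cost of the mock engine itself, so the difference between the two numbers approximates the overhead added by the guide.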
@rlouf
Member

rlouf commented Jan 18, 2024

At this point I think that it would only make sense to benchmark the CFG-guided generation. Regex-guided generation is only a dictionary call at each step, so there really isn't anything we could do that would move the needle.
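
To illustrate why the per-step cost is so small, here is a toy sketch of what a "dictionary call at each step" can look like. It is an assumption-laden illustration and not Outlines' actual data structure: `VOCAB`, `INDEX`, `allowed_tokens`, and `next_state` are hypothetical names, and the index is written by hand for the regex `ab*c` rather than compiled.

```python
# Hypothetical toy vocabulary: token id -> token string.
VOCAB = {0: "a", 1: "b", 2: "c", 3: "bb"}

# Hypothetical precomputed index for the regex `ab*c`:
# state -> {allowed token id: next state}.
INDEX = {
    0: {0: 1},              # start: only "a" is allowed
    1: {1: 1, 3: 1, 2: 2},  # after "a": "b" or "bb" loops, "c" finishes
    2: {},                  # accepting state: nothing more is allowed
}


def allowed_tokens(state):
    """Token ids permitted in `state`: a single dictionary lookup."""
    return list(INDEX[state])


def next_state(state, token_id):
    """State reached after emitting `token_id`: another dictionary lookup."""
    return INDEX[state][token_id]


# Walk the tokens "a", "bb", "c" through the index.
state = 0
for token_id in (0, 3, 2):
    assert token_id in allowed_tokens(state)
    state = next_state(state, token_id)
```

All of the expensive work (building the index) happens at initialization, which is what the existing benchmarks already cover.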

@brandonwillard changed the title from "Performance Test Inference" to "Run-time structured generation benchmarks" on May 27, 2024