
Give the possibility to obtain the full response when calling the vLLM generate function #1199

Open
alonsosilvaallende opened this issue Oct 8, 2024 · 4 comments · May be fixed by #1212

@alonsosilvaallende (Contributor) commented Oct 8, 2024

I'm using InspectAI to evaluate language models. In particular, I'm evaluating the benefits of structured text generation with Outlines. I would like to obtain the full response when calling the vLLM generate function, since InspectAI expects the full response. Would it be possible to give the user the option to get the full response? The default should remain what it is now, i.e. a filtered response.

@rlouf (Member) commented Oct 8, 2024

What do you mean by full response?

@alonsosilvaallende (Contributor, Author) commented Oct 8, 2024

> What do you mean by full response?

InspectAI needs the full `results` variable returned by the `model.generate` call of the vLLM API; see line 131 here:

```python
results = self.model.generate(
```

Currently, only the texts are returned in a list, starting at line 137:

```python
results = [[sample.text for sample in batch.outputs] for batch in results]
batch_size = len(results)
sample_size = len(results[0])
if batch_size == 1 and sample_size == 1:
    return results[0][0]
elif batch_size == 1:
    return results[0]
elif sample_size == 1:
    return [batch[0] for batch in results]
return results
```
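For context, here is a minimal sketch of what the unfiltered `results` carry before this filtering step (the model name and sampling parameters below are just illustrative):

```python
from vllm import LLM, SamplingParams

llm = LLM(model="microsoft/Phi-3-mini-4k-instruct")  # illustrative model choice
results = llm.generate(["What is 1+1?"], SamplingParams(max_tokens=16, logprobs=1))

# `results` is a list of RequestOutput objects; each one holds a list of
# CompletionOutput samples that carry more than just the generated text.
for batch in results:
    for sample in batch.outputs:
        print(sample.text, sample.finish_reason, sample.logprobs, sample.token_ids)
```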

@rlouf (Member) commented Oct 8, 2024

Couldn't you just implement a [custom solver](https://inspect.ai-safety-institute.org.uk/solvers.html) for InspectAI?
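A rough sketch of what such a solver could look like, assuming InspectAI's documented `@solver` / `TaskState` / `Generate` interface; the `outlines_solver` name, the `generator` argument, and the metadata key are all hypothetical:

```python
from inspect_ai.solver import Generate, TaskState, solver


@solver
def outlines_solver(generator):
    """Hypothetical solver that calls an Outlines generator directly."""

    async def solve(state: TaskState, generate: Generate):
        # `generator` is assumed to be an Outlines callable built elsewhere;
        # stash its completion where the rest of the eval can read it.
        state.metadata["outlines_completion"] = generator(state.user_prompt.text)
        return state

    return solve
```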

@LouSalaun commented Oct 9, 2024

Hi @rlouf, I'm also interested in this. Custom solvers and custom models are indeed the way to go. However, there is still the issue that we lose information when using Outlines' generate function. For example, with Outlines' wrapper of vLLM, we don't have the `stop_reason`, `logprobs`, and `output_tokens` fields of the LLM's output.

Would it make sense to you to add an optional argument that returns the raw vLLM results directly? The default behavior would remain the same. I can send a pull request.
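To make the idea concrete, here is a minimal sketch written as a standalone helper around `vllm.LLM` rather than a patch to Outlines itself; the `return_raw_outputs` argument name is purely illustrative:

```python
from vllm import LLM, SamplingParams


def generate(llm: LLM, prompts: list[str], sampling_params: SamplingParams,
             return_raw_outputs: bool = False):
    """Generate with vLLM, optionally returning the raw RequestOutput objects."""
    results = llm.generate(prompts, sampling_params)
    if return_raw_outputs:
        # Full vLLM objects: text, token_ids, logprobs, finish_reason, stop_reason, ...
        return results
    # Default: the same text-only filtering Outlines applies today.
    texts = [[sample.text for sample in batch.outputs] for batch in results]
    batch_size, sample_size = len(texts), len(texts[0])
    if batch_size == 1 and sample_size == 1:
        return texts[0][0]
    if batch_size == 1:
        return texts[0]
    if sample_size == 1:
        return [batch[0] for batch in texts]
    return texts
```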
