Add probability distribution to choices #479
Comments
Great idea. Here are some quick thoughts on how we might implement this, although I'm not completely sure it would work. On line 75 of serve.py we have:
https://github.com/outlines-dev/outlines/blob/main/outlines/serve/serve.py Request_output.output should be a vLLM CompletionOutput?, which has log probs as an argument. If that is the case, you could just add an option to return that as well. https://github.com/vllm-project/vllm/blob/main/vllm/outputs.py |
There's a small subtlety here. There may be several combinations of tokens that lead to either of the choices. In this case do we need to return the logprob that corresponds to all possible paths or only the path that was sampled? If we go with "all possible paths", the basic idea is to find all paths in the FSM that lead to either choice and pass the corresponding token ids + prompt through the model to get the corresponding logprobs. |
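To make the "all possible paths" option concrete, here is a minimal sketch, not Outlines code. It assumes a toy vocabulary and a hypothetical `token_logprob(prefix_tokens, token)` callable standing in for a model forward pass; it enumerates every tokenization that spells a choice and combines the path probabilities with a log-sum-exp.

```python
import math
from typing import Callable, List, Sequence


def tokenizations(text: str, vocab: Sequence[str]) -> List[List[str]]:
    """Return every way to split `text` into tokens drawn from `vocab`."""
    if not text:
        return [[]]
    paths = []
    for tok in vocab:
        if tok and text.startswith(tok):
            for rest in tokenizations(text[len(tok):], vocab):
                paths.append([tok] + rest)
    return paths


def choice_logprob(
    text: str,
    vocab: Sequence[str],
    token_logprob: Callable[[List[str], str], float],
) -> float:
    """Log of the summed probability of all token paths that spell `text`."""
    path_scores = []
    for path in tokenizations(text, vocab):
        # Chain rule: add the conditional log-probability of each token.
        score = sum(token_logprob(path[:i], tok) for i, tok in enumerate(path))
        path_scores.append(score)
    if not path_scores:
        return float("-inf")
    # Log-sum-exp over paths, shifted by the max for numerical stability.
    top = max(path_scores)
    return top + math.log(sum(math.exp(s - top) for s in path_scores))


# Toy setup (assumption): six tokens, and a "model" that is uniform over them.
VOCAB = ["y", "e", "s", "ye", "es", "yes"]
uniform = lambda prefix, tok: -math.log(len(VOCAB))

yes_paths = tokenizations("yes", VOCAB)
yes_prob = math.exp(choice_logprob("yes", VOCAB, uniform))
```

With a real model, `token_logprob` would need one forward pass per distinct prefix, which is where the cost of this option comes from.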
Maybe off-topic: what about doing this for the OpenAI part? For that we should give (optional) access to the logprobs values (https://cookbook.openai.com/examples/using_logprobs). Of course, there is the subtlety that a generation may be the output of "n" API calls, so there may be some decisions to be made on how to return the aggregates. A precondition for this may be using the pickleserializer for persistence, so that we can save the whole API response. But that change would also affect the "transformers" part of the module. (Should this go to a separate issue?) |
I am still not sure what the API would look like, especially since we still want |
Can we have a new method, "probabilities"? Also, can you point out where in the codebase the actual decision on which class to select is made? |
We might consider returning a
Note that if choices have multiple tokens, we aren't guaranteed we know the probabilities. However, with a beam search sampler we can guarantee we know the @dnhkng you might run into issues implementing this before beam search is available. The actual decision on which class to select is determined by the language model, not based on post-processing. https://github.com/outlines-dev/outlines/blob/main/outlines/generate/choice.py |
Ahh, ok. I thought the selection was done by post-processing the probabilities. Otherwise, you might select categories with high initial token probability, but with a beam search you would find the overall most likely category. I have an interesting use case that would require the probabilities. |
With beam search, you are guaranteed to explore all legal paths, provided that the number of beams is equal to the number of legal paths. This is why I suggest beam search. Although, reconsidering - there may be multiple legal paths for each choice, e.g. I agree that this is a valuable and interesting use case. Here are a few steps that would need to be done to accomplish this:
|
This is overkill
You can walk the FSM created when calling |
Sum of the average probability per token of each combination? Some care needs to be taken with the target categories. Imagine a character level LLM, and we want the probabilities of 'yes' or 'no' for some prompt question. Not only are there more letters in 'yes', but there are also many more words that start with 'no', biasing the selection. In this case, although we want just 'yes' or 'no' we should use something like 'yes.' or 'yes ', as the probability on the ' ' or '.' will compensate the letters when we average over all characters. |
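The length bias described above can be seen with a small sketch. The per-token probabilities below are made up for illustration, and the per-token average of log-probabilities is only one possible normalization (an assumption here, not a scheme the thread agreed on). Note that a length-normalized score is no longer a probability.

```python
import math
from typing import Dict, List, Tuple


def sequence_scores(token_probs: List[float]) -> Tuple[float, float]:
    """Return (total log-probability, mean log-probability per token)."""
    logps = [math.log(p) for p in token_probs]
    total = sum(logps)
    return total, total / len(logps)


# Made-up character-level probabilities for two choices (illustrative only).
candidates: Dict[str, List[float]] = {
    "yes": [0.6, 0.9, 0.9],  # three tokens
    "no": [0.5, 1.0],        # two tokens
}

scores = {name: sequence_scores(probs) for name, probs in candidates.items()}

best_by_total = max(scores, key=lambda name: scores[name][0])
best_by_mean = max(scores, key=lambda name: scores[name][1])
```

Here the raw product favors the shorter choice while the per-token average favors the longer one, so the normalization choice can flip the ranking.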
I'm concerned about the number of combinations of tokens, it would have exploding growth. Is there something I'm missing here?
I don't think we can explore all tokenization paths for a given choice. It seems the best we can do is calculate the probability of the best path for each choice (via greedy for now, beam later) and compare, OR strictly limit the size of probabilistic choices. |
Although the number of combinations feels n^2, I think the paths overlap, and it resolves to n. Feels like a dynamic programming coding interview question 😅 Break down the input string into subchunks recursively, and then do a batch on an LLM to get the logits, and fill in the graph. Finally, calculate all the paths based on the probabilities, calculate the average probability per token per path, and sum them? |
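As a rough illustration of the two views above, counting the tokenizations of a string is a simple dynamic program over end positions, but the count itself can still grow exponentially with the string length. The vocabularies below are toy assumptions. One caveat worth stating: a language model conditions on the whole token prefix, so summing path probabilities does not necessarily collapse the way the count does.

```python
from typing import Iterable


def count_tokenizations(text: str, vocab: Iterable[str]) -> int:
    """Count the ways to split `text` into tokens from `vocab` (dynamic programming)."""
    tokens = {tok for tok in vocab if tok}  # drop duplicates and empty strings
    # ways[i] = number of tokenizations of text[:i]
    ways = [0] * (len(text) + 1)
    ways[0] = 1
    for end in range(1, len(text) + 1):
        for tok in tokens:
            start = end - len(tok)
            if start >= 0 and text[start:end] == tok:
                ways[end] += ways[start]
    return ways[len(text)]


def all_substrings(text: str) -> set:
    """Worst-case toy vocabulary: every substring of `text` is a token."""
    return {text[i:j] for i in range(len(text)) for j in range(i + 1, len(text) + 1)}


small = count_tokenizations("yes", ["y", "e", "s", "ye", "es", "yes"])
worst_case = count_tokenizations("positive", all_substrings("positive"))
```

In the worst case every gap between characters is an independent cut point, giving 2^(n-1) paths for a string of length n.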
The simplest here would still be approximate by taking multiple samples once #533 is merged. SMC on the roadmap should give better results. |
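A minimal sketch of the sampling approximation: draw many samples and use the empirical frequencies as probability estimates. The `sample_choice` callable is hypothetical; with Outlines it might wrap a call such as `generator(prompt)`, but here a seeded stub with known weights stands in so the example runs on its own.

```python
import random
from collections import Counter
from typing import Callable, Dict, Sequence


def estimate_choice_probs(
    sample_choice: Callable[[], str],
    choices: Sequence[str],
    n_samples: int = 1000,
) -> Dict[str, float]:
    """Estimate P(choice) as the fraction of samples that returned it."""
    counts = Counter(sample_choice() for _ in range(n_samples))
    return {choice: counts[choice] / n_samples for choice in choices}


# Stub sampler (assumption): stands in for one constrained generation call.
rng = random.Random(0)
stub_sampler = lambda: rng.choices(["yes", "no"], weights=[0.7, 0.3])[0]

estimates = estimate_choice_probs(stub_sampler, ["yes", "no"], n_samples=2000)
```

The standard error shrinks as 1/sqrt(n_samples), so tight estimates for rare choices need many generation calls.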
Yes, Monte Carlo might be fine ;) BTW, can someone tell me what FSM stands for? Finite state machine maybe? |
Finite State Machine indeed. |
Is anyone still actively working on this? @dnhkng? If not, I can give it a try myself, I also need it. |
No, not working on this feature. |
+1 for this feature - this would be very useful! |
For By default, Pros:
Cons:
I implemented this in a PR I just submitted (#895) |
This is part of a wider set of requests for being able to view the logprobs of each token (#614). I think the API here should probably come in two parts. The simplest API is just to view the next-token distribution, rather than the combination of all tokens that map to a given choice, particularly since obtaining log probabilities of entire sequences is quite difficult (as noted). One API I could imagine using without too much fuss is a
generator = outlines.generate.choice(model, ["a", "b"])
result = generator.logprobs("Pick a or b.")
which would return an object (
For an arbitrarily sampled sequence, we'd have a list of This is pretty easy to work with if you use This way you'd get a rough approximation of sequence probabilities you can work with, and run all the math yourself. |
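"Running the math yourself" could look like the sketch below, assuming you already have the per-token log-probabilities of one sampled token sequence per choice (the input format is an assumption, not an existing Outlines return type). It sums the token logprobs of each sequence and renormalizes across the choices, which gives only a rough approximation because other tokenizations of the same choice are ignored.

```python
import math
from typing import Dict, List


def normalize_choice_logprobs(
    choice_token_logprobs: Dict[str, List[float]],
) -> Dict[str, float]:
    """Turn per-token logprobs of one path per choice into a distribution over choices."""
    # Sequence log-probability = sum of its token log-probabilities.
    totals = {choice: sum(lps) for choice, lps in choice_token_logprobs.items()}
    # Softmax over the totals, using log-sum-exp for numerical stability.
    top = max(totals.values())
    log_z = top + math.log(sum(math.exp(t - top) for t in totals.values()))
    return {choice: math.exp(t - log_z) for choice, t in totals.items()}


# Illustrative numbers only: "a" is one token, "b" was sampled as two tokens.
example = {
    "a": [math.log(0.6)],
    "b": [math.log(0.2), math.log(0.5)],
}
probs = normalize_choice_logprobs(example)
```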
I also think this would be a good feature. There is another repo that provides this functionality, but I'm not proficient enough to tell if this can be implemented in outlines as well. Maybe someone can take a look? |
No, it currently isn't.
Happy to provide guidance to anyone seeking to tackle this issue. The easiest approach is likely beam search since Outlines has interfaces for generation, but not for The cleanest approach is to implement |
Presentation of the new feature
It would be very helpful to include the probability distribution of the different options (both log probabilities and real probabilities) present in outlines.generate.choice(). This is useful for evaluating the certainty of the model for any given classification. We use it as a pre-filter step for deciding if we should generate more expensive reasoning (for example, CoT) to arrive at a more certain classification.
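The pre-filter use case might be sketched as follows, assuming a distribution over choices is already available as a plain dict. The function name and the 0.8 threshold are hypothetical and chosen only for illustration.

```python
from typing import Dict


def needs_reasoning(choice_probs: Dict[str, float], threshold: float = 0.8) -> bool:
    """Return True when the top choice is not confident enough to accept directly."""
    return max(choice_probs.values()) < threshold


# Illustrative distributions (made up): one uncertain, one confident.
uncertain = {"spam": 0.55, "ham": 0.45}
confident = {"spam": 0.97, "ham": 0.03}

escalate_uncertain = needs_reasoning(uncertain)  # would trigger the expensive CoT step
escalate_confident = needs_reasoning(confident)  # would accept the cheap classification
```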
Two areas of complexity that I'm aware of:
Are you willing to open a PR?
Yes, though I would need pointers on where to start.