
Allow to rewrite LLM result in an OutputGuardrail #1021

Merged (1 commit) on Nov 3, 2024

Conversation

@mariofusco (Contributor) commented Oct 29, 2024

@mariofusco mariofusco requested a review from a team as a code owner October 29, 2024 16:26
@geoand geoand requested a review from cescoffier October 30, 2024 06:06
@cescoffier (Collaborator)

I looked at my notes, as I remember looking into this during the first implementation. There is a philosophical question that I was not able to crack.

Let's imagine 3 output guardrails: A, B, C.

  • A passes - all good
  • B passes but modifies the output value
  • C fails and reprompts

Now the problem is that we are going to reprompt based on a modified value, not the original value. So the LLM may answer the same or, worse, not understand the reprompt instruction and start hallucinating.

We could imagine modifying the LLM response in the context to update the value to the modified one, but again, the LLM may see this and freak out.

One possibility is that only the last guardrail can change the value.
Or we can say that once the value is changed, retry and reprompt are no longer allowed.

@lordofthejars (Contributor)

Not allowed, or ignored so that no exception is thrown; we just log why they were not executed. What I see here is that we can assume that most of the time there will be a single output guardrail (or you can implement one and put all the logic there), so I guess that ignoring might be the best option.

@mariofusco (Contributor, Author)

One possibility is that only the last guardrail can change the value.

I don't like this solution, mostly because I can totally see 2 guardrails changing the result in sequence one after the other (like in the test case that I added).

Or we can say that once the value is changed, retry and reprompt are no longer allowed.

That is already a better option. Another possibility is to keep both the original and the modified value in the response object and allow a guardrail to inspect both before deciding on a retry or a reprompt, but this probably overcomplicates things for users to no benefit.

I would avoid overthinking this (using more than one guardrail in a row is already an edge case, IMO) and simply ignore the problem, or at most prevent retry/reprompt on a modified value, as you suggested.

@cescoffier please let me know which option you prefer.
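To illustrate the "keep both values" alternative mentioned above (not what the PR ends up doing), a result carrying both values could look like the sketch below; all names here are invented for the example:

```java
// Hypothetical sketch of the "keep both values" option: a later guardrail could
// inspect both the original and the rewritten output before deciding whether a
// retry or a reprompt still makes sense.
record RewritableOutput(String originalValue, String rewrittenValue) {

    boolean isRewritten() {
        return rewrittenValue != null && !rewrittenValue.equals(originalValue);
    }

    // The value the next guardrail in the chain should validate.
    String effectiveValue() {
        return isRewritten() ? rewrittenValue : originalValue;
    }
}
```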

@lordofthejars (Contributor)

I totally agree with you @mariofusco, if my vote counts 😆

@cescoffier (Collaborator)

I may be biased, but I often use chains of guardrails.

But yes, maybe we could just add an option enabling/disabling retry and reprompt after a change of value.

The thing is that it is likely to work with smart LLMs and totally break with less smart ones.

@mariofusco (Contributor, Author)

But yes, maybe we could just add an option enabling/disabling retry and reprompt after a change of value.

Ok, I will try to implement this.
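Roughly, the behavior being agreed on here; isRewrittenResult() and blockRetry() are the names that appear in the diff further down, while the composition method itself is just a sketch, not the actual implementation in this PR:

```java
// Sketch only: once any guardrail has rewritten the output, a later result asking
// for a retry/reprompt is downgraded, so the LLM is never reprompted against text
// it did not actually produce.
final class GuardrailComposition {

    OutputGuardrailResult compose(OutputGuardrailResult accumulated, OutputGuardrailResult next) {
        return accumulated.isRewrittenResult() ? next.blockRetry() : next;
    }
}
```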

@mariofusco force-pushed the out_guard_with_result branch from 76201c2 to 7ffeb74 on October 30, 2024 at 10:51
@mariofusco (Contributor, Author)

@cescoffier Done, please give it a second look.

@cescoffier (Collaborator)

@cescoffier Done, please give it a second look.

First look... first look :-)

@cescoffier (Collaborator) left a comment

It looks great.

I would add a few tests with streamed responses, as, unfortunately, streams require a different approach for output guardrails.

}

@Test
@ActivateRequestContext
Collaborator

Question for @geoand - do you know if we can use @ActivateRequestContext on the class itself?

return result;
return accumulatedResults.isRewrittenResult() ? (GR) result.blockRetry() : result;
}
if (result.isRewrittenResult()) {
Collaborator

I can't remember if this method is invoked when using streamed responses. Streams make things slightly more convoluted.

Contributor (Author)

Yes, this method is used only for streamed responses. I'm keeping this rewriting here regardless, but now, if I find that the rewriting happened while streaming, I throw an exception, as discussed.

@mariofusco (Contributor, Author)

I would add a few tests with streamed responses, as, unfortunately, streams require a different approach for output guardrails.

Sure, I will add tests for streamed responses. I guess that in that case rewriting the output is never allowed, correct?

@mariofusco (Contributor, Author)

@cescoffier I'm adding some tests to cover the streamed scenarios, but I'm no longer sure how this is supposed to work. The problem is that in this case the output guardrail is invoked for each and every chunk. Now my rewriting implementation changes all of them one after the other, which probably makes the problem even worse, but even without this I don't see what kind of meaningful validation an output guardrail could perform on a single chunk.

In essence I believe that in general an output guardrail is not compatible with a streamed LLM output and the 2 features should be used in a mutually exclusive way. Can you please clarify your point of view on this?

/cc @geoand

@geoand (Collaborator) commented Oct 30, 2024

In essence I believe that in general an output guardrail is not compatible with a streamed LLM output and the 2 features should be used in a mutually exclusive way. Can you please clarify your point of view on this?

/cc @geoand

Tools, for example, work with streamed responses. I was actually surprised to find this, but it does work :)

@cescoffier (Collaborator)

@cescoffier I'm adding some tests to cover the streamed scenarios, but I'm no longer sure how this should work. The problem is that in this case the output guardrail is invoked for each and every chunk. Now my rewriting implementation changes all of them one after the other, which probably makes the problem even worse, but even without this I don't see which kind of meaningful validation an output guardrail could perform on a single chunk.

No, that's not the case; it depends on the "accumulator" strategy. It can accumulate the full response (which defeats a bit the purpose of streaming), or each token (which makes things very hard to validate), or anything in between (sentence, paragraph, JSON object...).

You can have plenty of validation with the right accumulation strategy. But, yes, the guardrails are called for every accumulated item.

In essence I believe that in general an output guardrail is not compatible with a streamed LLM output and the 2 features should be used in a mutually exclusive way. Can you please clarify your point of view on this?

It is not exclusive. As I wrote, it depends on the accumulation strategy. I am implementing a chatbot that is streamed because otherwise the experience is pretty bad, and I still check for hallucinations, off-topic content...
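As a concrete illustration of an accumulation strategy between those two extremes, a sentence-level accumulator might buffer tokens like the sketch below; the class and method names are made up for the example and are not the extension's accumulator API:

```java
import java.util.Optional;

// Buffers streamed tokens and releases a chunk only at a sentence boundary, so an
// output guardrail can validate whole sentences instead of individual tokens.
class SentenceAccumulator {

    private final StringBuilder buffer = new StringBuilder();

    Optional<String> accept(String token) {
        buffer.append(token);
        int end = buffer.lastIndexOf(".");
        if (end < 0) {
            return Optional.empty(); // no complete sentence yet, keep buffering
        }
        String sentence = buffer.substring(0, end + 1);
        buffer.delete(0, end + 1);
        return Optional.of(sentence.trim());
    }
}
```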

@mariofusco (Contributor, Author)

It is not exclusive. As I wrote, it depends on the accumulation strategy. I am implementing a chatbot that is streamed because otherwise the experience is pretty bad, and I still check for hallucinations, off-topic content...

Ok, I see your point. Let me rephrase this and restrict it to this pull request. As far as these rewriting output guardrails are concerned, I don't see how this feature could be usable and useful on a partial response. In other words, I don't see a scenario where you would want to rewrite each and every chunk of a streamed response. And yes, this also depends on the "accumulator" strategy, but even then, unless you accumulate the whole response (which, as you said, totally defeats the purpose of using a streaming API), I don't think that rewriting different subparts of a response would be of any use.

If you think that my point of view is wrong or limited and still believe that a rewriting guardrail could also be useful on a partial response, then I can confirm that everything works as expected, add the streamed tests that I just wrote to this pull request, and call my work finished.

Conversely, if you agree with me, maybe we should add a mechanism that prevents response rewriting (by throwing an exception?) when using the streamed version.

What do you think?

@lordofthejars (Contributor)

A good solution could be: when using a stream, if a guardrail tries to modify the output, throw an exception explaining the situation. I think that is understandable, and it also keeps us from overthinking this; in the end, the limitation only applies if you use a stream and want to modify the output, and in any case such a modification would affect the other output guards.

@cescoffier (Collaborator)

I took a few minutes to think about it, and I tend to agree that for streaming modifying the value does not make a lot of sense. It's technically possible (the guard would modify the emitted item and it would be emitted downstream).

We can revisit if there is a use case emerging.

@mariofusco (Contributor, Author)

I took a few minutes to think about it, and I tend to agree that for streaming modifying the value does not make a lot of sense. It's technically possible (the guard would modify the emitted item and it would be emitted downstream).

We can revisit if there is a use case emerging.

OK, I will change my pull request to throw an exception in case an output guardrail attempts to rewrite the LLM response while streaming.
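In other words, something along these lines; the surrounding method and the exception type are placeholders, and only isRewrittenResult() comes from the diff shown earlier:

```java
// Sketch: when the response is streamed, a guardrail that rewrote the value is
// reported as an error instead of silently emitting altered chunks downstream.
void checkRewriteAllowed(boolean streaming, OutputGuardrailResult result) {
    if (streaming && result.isRewrittenResult()) {
        throw new IllegalStateException(
                "Rewriting the LLM output is not supported when the response is streamed");
    }
}
```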

@mariofusco force-pushed the out_guard_with_result branch from 7ffeb74 to 4e13caa on October 31, 2024 at 09:12
@mariofusco (Contributor, Author)

I implemented all the discussed changes; @cescoffier, please review again.

@geoand geoand requested a review from cescoffier November 1, 2024 13:24
@cescoffier (Collaborator)

@mariofusco can you rebase?

@mariofusco force-pushed the out_guard_with_result branch from 4e13caa to 4ac0106 on November 3, 2024 at 18:13
@mariofusco (Contributor, Author)

@cescoffier Done.

@cescoffier cescoffier merged commit ed12693 into quarkiverse:main Nov 3, 2024
59 checks passed

Successfully merging this pull request may close these issues.

OutputGuardrails not permitting override generated output
4 participants