Trajectory replay: Fix a few corner cases #6380
base: main
if isinstance(event, MessageAction) and event.wait_for_response:
    # For any message waiting for response that is not the last
    # event, we override wait_for_response to False, as a response
    # would have been included in the next event, and we don't
a response would have been included in the next event
The next event is a MessageAction with source='user', which is the response. Is this the event you mean, or do you mean the next source='agent' event?
I ask because I'm curious about something: I feel like if the replay process ends, then we close the controller, it will be saved in a new trajectory, and that should reflect perfectly what happened, just like the initial trajectory: so IMHO it should contain... 🤔
- the agent actions, including this MessageAction
- the correct user actions, including task, including... the response to this MessageAction? The response is a MessageAction with source='user'
(I mean enough events should be retrieved so that an agent with this history can continue normally, with all information it had in the past. Or do you see a reason why that won't work?)
The next event is a MessageAction with source='user', which is the response. Is this the event you mean
Yes, literally the next event.
I mean enough events should be retrieved so that an agent with this history can continue normally, with all information it had in the past
I think that's what I've been trying to achieve? Do you see any place that would break this assumption? The response from the user is indeed included in the trajectory. For example, in demo2.json, step 16 contains the user response.
The logic here is to NOT pause the control flow. The controller "replays" the "recorded" user response from the trajectory, rather than a new user response from the actual user.
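A minimal sketch of the approach described above, for illustration only. The `MessageAction` dataclass here is a hypothetical stand-in that models just the fields mentioned in this thread, and `prepare_for_replay` and `replayable` are made-up helper names, not functions from the PR.

```python
from dataclasses import dataclass


@dataclass
class MessageAction:
    # Hypothetical stand-in: only the fields discussed in this thread.
    content: str
    source: str  # 'agent' or 'user'
    wait_for_response: bool = False


def prepare_for_replay(events):
    """Clear wait_for_response on every message except the last event."""
    last = len(events) - 1
    for i, event in enumerate(events):
        if i != last and isinstance(event, MessageAction) and event.wait_for_response:
            # The recorded reply is the next event, so the replay must not pause here.
            event.wait_for_response = False
    return events


def replayable(events):
    """Events a replay would re-issue; user-sourced events are skipped."""
    return [e for e in events if e.source != 'user']
```

Under these assumptions, a question from the agent in the middle of the trajectory no longer blocks the replay, while a trailing question still waits for a live user.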
I see and I agree, thank you, we're on the same page on the goal. I have some small hesitation though, but I have yet to look in detail at the .json files, so please take it with a grain of salt, and feel free to ignore it atm (I'll look closer at it tonight):
- I don't see clearly how the controller can replay the "recorded" user response, since this code says that all actions with source='user' are not replayable. What am I missing?
- idk, it also seems to me that we're getting an extra MessageAction that wasn't in history before? The null/null message is new in demo2. Unless I'm hallucinating worse than my Opus. 😅
I wonder if there's an alternative: during replay, interpret wait_for_response as "don't wait, read next message". But it might be more complex.
I don't see clearly how the controller can replay the "recorded" user response, since this code says that all actions with source='user' are not replayable. What am I missing?
You are right, I was cheating! It's not being "replayed" because there's nothing to replay. It's skipped from the replay manager's perspective.
it also seems to me that we're getting an extra MessageAction that wasn't in history before? The null/null message is new in demo2
Yeah that might be a side-effect of setting wait_for_response = False. Let me think about your alternative.
I wonder if there's an alternative: during replay, interpret wait_for_response as "don't wait, read next message". But it might be more complex.
This sounds like the right way to do stuff, but... it means more coupling between agent controller and replay manager 💭
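A rough sketch of what that alternative could look like, assuming a simplified event model. `Message` and `replay` are hypothetical names used only for illustration; the real controller and replay manager are not modeled here. The idea is that the flag is left untouched and the recorded reply is read instead of pausing.

```python
from dataclasses import dataclass


@dataclass
class Message:
    # Hypothetical stand-in for a recorded message event.
    content: str
    source: str  # 'agent' or 'user'
    wait_for_response: bool = False


def replay(events):
    """Yield (agent_message, recorded_reply_or_None) pairs without mutating flags."""
    i = 0
    while i < len(events):
        event = events[i]
        reply = None
        if event.source == 'agent' and event.wait_for_response:
            nxt = events[i + 1] if i + 1 < len(events) else None
            if nxt is not None and nxt.source == 'user':
                # "Don't wait, read next message": take the recorded reply.
                reply = nxt
                i += 1  # the reply is consumed together with the question
        if event.source == 'agent':
            yield event, reply
        i += 1
```

In this sketch wait_for_response stays true in the replayed history, at the cost of the replay loop needing to know about user replies, which is the extra coupling mentioned above.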
Looking at
A MessageAction with null content, null image, and wait_for_response = true? Aaahh I think I see how that happened, you literally said it, you added something. The previous is a MessageAction with content where the agent is asking the user a question, but its wait_for_response = false... because this PR is setting it false, right?
That's correct!
Aside, I do realize this would become a bug farm... and I'll make sure to add some E2E tests before checking in the user-facing replay functionality in #6348
End-user friendly description of the problem this fixes or functionality that this introduces
Fix the handling of two corner cases in the trajectory replay feature.
Give a summary of what the PR does, explaining any non-trivial design decisions
Two corner cases were missing in the previous PR #6215:
On a wait_for_response message, replay gets stuck waiting for the user's response, which doesn't make sense in the middle of a replay. This is demonstrated in demo2.json and demo3.json.
demo1.json - GUI mode: downloaded from web GUI
demo2.json - Headless mode: after demo1 replay, add a user message, and finish
demo3.json - Headless mode: a replay of demo2. Note: demo2.json and demo3.json only differ in step id, timestamp, and hostname.
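One way to check the claim that two trajectories differ only in volatile fields is to drop those fields and compare what is left. This is a sketch under assumptions: the key names in `IGNORED_KEYS` are guesses at how step id, timestamp, and hostname appear in the JSON, and `strip_volatile`, `same_trajectory`, and `load` are hypothetical helpers, not part of the PR.

```python
import json

# Assumed key names for the volatile fields; adjust to the real file layout.
IGNORED_KEYS = frozenset({"id", "timestamp", "hostname"})


def strip_volatile(node, ignored=IGNORED_KEYS):
    """Recursively drop ignored keys from nested dicts and lists."""
    if isinstance(node, dict):
        return {k: strip_volatile(v, ignored) for k, v in node.items() if k not in ignored}
    if isinstance(node, list):
        return [strip_volatile(v, ignored) for v in node]
    return node


def same_trajectory(a, b, ignored=IGNORED_KEYS):
    """True if two parsed trajectories match once volatile keys are removed."""
    return strip_volatile(a, ignored) == strip_volatile(b, ignored)


def load(path):
    """Parse a trajectory JSON file."""
    with open(path) as f:
        return json.load(f)
```

With the demo files at hand, `same_trajectory(load("demo2.json"), load("demo3.json"))` would be the check. If the hostname is embedded inside string values rather than stored under its own key, this sketch would report a difference.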
Link of any specific issues this addresses
Part of #6049
To run this PR locally, use the following command: