-
Notifications
You must be signed in to change notification settings - Fork 136
Agents SDK: startActivity spans should finish when the activity has been sent to the server #1172
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Open
tconley1428
wants to merge
6
commits into
main
Choose a base branch
from
openai/context_detach
base: main
Could not load branches
Branch not found: {{ refName }}
Loading
Could not load tags
Nothing to show
Loading
Are you sure you want to change the base?
Some commits from the old base branch may be removed from the timeline,
and old review comments may become outdated.
+30
−41
Open
Changes from 3 commits
Commits
Show all changes
6 commits
Select commit
Hold shift + click to select a range
d01159f
Don't set startActivity as the parent context
tconley1428 e1120d9
Merge branch 'main' into openai/context_detach
tconley1428 f7545d8
Merge branch 'main' into openai/context_detach
tconley1428 0a8fab6
Leave parent spans in place
tconley1428 1373a25
Remove prints
tconley1428 2b9b5da
Fix comment
tconley1428 File filter
Filter by extension
Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
There are no files selected for viewing
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
Uh oh!
There was an error while loading. Please reload this page.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see this code changed here and in a few places. IIRC, it was intentional that we did
set_header_from_context
inside thewith custom_span
so the header has that current span serialized (and therefore spans that occur inside workflow will have this as a parent).There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It was, but I think it wasn't actually correct to do. Since the signal span would more or less immediately end, I think it is misleading to parent the child workflow execution to the signal span. Rather, the child workflow is parented to the parent workflow's span (or whatever user defined span is active at that point in the workflow).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's the same change I am making for startActivity essentially, the activity execution isn't parented to the startActivity, it's parented to the workflow, because really the start has already completed.
Uh oh!
There was an error while loading. Please reload this page.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's not parenting a child workflow execution, it's parenting the child signal handler in this case (though in some cases it does). Not sure I agree here necessarily. At least in OTel at
TracingWorkflowInboundInterceptor.handle_signal
, we have chosen to model the signal span as "linked", not the parent, but in the absence of a concept of linking, I think the hierarchy should represent the most targeted span out there regardless of span duration.It doesn't make sense to me that you have a bunch of orphan outbound spans just on a common workflow span and orphan inbound spans just on a common workflow span that don't relate to each other when they actually do relate to each other. It is totally reasonable in tracing contexts to have a starting span be the parent of the thing it starts even if the start was quick.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would argue that execution just isn't a child of the start in the normal sense. If we wanted to make a new span which covered the entire lifetime of the activity and was the parent of both start and execute, we could, though I don't really see the benefit that would provide. Doing so will also get us back in the detach handling game.
Uh oh!
There was an error while loading. Please reload this page.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is a child in the normal sense, a child when you start workflow (top-level from client or child workflow from workflow), a child when you start activity (top-level on standalone activities or regular activity from workflow), a child when you start Nexus operation, etc, etc. This is how OTel tracing is done.
I think in this case, the consistency of matching SDK behavior elsewhere is valuable.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Okay. Would you agree with the other portion of the change that even if
startActivity
is the parent ofexecuteActivity
, it should terminate when the scheduling is completed?There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In OTel, since we can't have an in-workflow duration, it's completed even before scheduling completed. There is no concept of "start completed", so IMO it makes sense for
startActivity
to span the entire run of the activity including all attempts, andexecuteActivity
is actually per attempt (you can have 0 or manyexecuteActivity
spans in astartActivity
).So it's hard to say since OpenAI tracing is unique in that it lets us represent durations accurately. May be worth checking what TypeScript does here in OTel since that's the only other SDK I know of that lets us track span durations accurately. IMO, spans for starting activity, child, nexus op, signaling, etc should last their entire duration if they can, even though it is called "start" in the name to match our OTel span naming conventions. But it's not a strong opinion.