Opencensus proto Spans should optionally contain Node/Process information. #169
Comments
+1, it's the approach we took in Jaeger: each Span in a Batch can have its own Process entry, otherwise we use the shared Batch.Process.
Should there be a containing object like CC: @mtwo
From what I am seeing, Node is misused here. Node was added as an internal message used along the library->agent->collector path to identify who generated the Span (I know that some of the data in Node can be determined from the RPC connection, but that is not possible in the collector, for example, because the library talks to the agent, not to the collector directly). What you probably need is the "Resource", whose main purpose is to "logically" identify the task in the system (pod, container, task number, service name, etc.). I think having the Resource in the Span is fine. Maybe I don't understand what you are trying to achieve here, but at least this is my mental model. @yurishkuro What do you have in the "Process" entry?
@bogdandrutu if I try to restate what you meant: Node represents the "sender" of the last batch of spans (more for bookkeeping of the internal collection infrastructure; it could be an agent or a collector, and it keeps changing for the same span), as opposed to the "originator/creator task" of the spans (which always stays the same for the span), which you are calling Resource here. In that case, the Node concept is definitely being overloaded in exporters and in multiple places in the project, causing many bugs. We should clearly define the Resource concept and include the Resource in the span proto. As I understand it, Jaeger has no separate concept of Node, only Resource (known as Process in Jaeger). The Process identifies the "resource" concept above. Further, there is an on-the-wire optimization: when the same client task sends a batch of several spans to an agent/collector, a single Process instance is sent and associated with all the spans in the batch.
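To make the Jaeger-style behaviour described above concrete, here is a minimal Python sketch of the fallback rule: a span uses its own Process when one is set, and otherwise inherits the Process shared by the batch. The `Process`, `Span`, and `Batch` classes and the `effective_process` helper are hypothetical stand-ins written for this illustration; they are not the Jaeger or OpenCensus types.

```python
from dataclasses import dataclass, field
from typing import List, Optional


# Hypothetical stand-ins for illustration only; not the real Jaeger/OpenCensus types.
@dataclass(frozen=True)
class Process:
    service_name: str


@dataclass
class Span:
    name: str
    # Set only when this span's originator differs from the batch-level one.
    process: Optional[Process] = None


@dataclass
class Batch:
    process: Optional[Process]
    spans: List[Span] = field(default_factory=list)


def effective_process(span: Span, batch: Batch) -> Optional[Process]:
    """Prefer the span's own process; otherwise fall back to the shared one."""
    return span.process if span.process is not None else batch.process


batch = Batch(
    process=Process("frontend"),
    spans=[Span("GET /"), Span("SELECT", process=Process("db-proxy"))],
)
owners = [effective_process(s, batch).service_name for s in batch.spans]
print(owners)
```

The first span carries no Process and resolves to the batch-level one, while the second keeps its own, so a single batch can mix originators without repeating the shared entry on every span.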
@bogdandrutu I did indeed mean
@bogdandrutu how does this mesh with some of the resource definition work that you've been involved in?
Discussed offline. Path forward is to add Resource to Span, since Resource is what I had thought Node was originally.
@mtwo @bogdandrutu should there be wrapper for other types of contextual properties that may be introduced in future? This may help to align with whatever we will do with
Problem
Opencensus-proto Spans are the building block for transmitting OpenCensus span data. With the current data model, the only way to transmit Node information is through the agent's ExportTraceServiceRequest, which does not allow spans from different nodes to be batched together. This batching strategy is suboptimal for collector or agent processes that accept spans from many nodes. It also limits the potential benefit of sending compressed batches of spans, since each batch can only contain spans from a single node.
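As a rough illustration of the constraint, the sketch below models a collector that has received spans from several nodes and must forward them with one node per request. The `(node, span)` tuples, the dict layout, and the `requests_per_node` helper are assumptions made for this example; they only mimic the shape of a request that carries a single Node plus its spans.

```python
from collections import defaultdict

# Hypothetical (node, span) pairs, in arrival order, as a collector might hold them.
received = [
    ("node-a", "span-1"),
    ("node-b", "span-2"),
    ("node-a", "span-3"),
    ("node-c", "span-4"),
]


def requests_per_node(pairs):
    """Group spans so that every outgoing request carries exactly one node."""
    by_node = defaultdict(list)
    for node, span in pairs:
        by_node[node].append(span)
    # One request per distinct node, regardless of how few spans each one has.
    return [{"node": node, "spans": spans} for node, spans in by_node.items()]


requests = requests_per_node(received)
for request in requests:
    print(request)
```

Four spans from three nodes become three separate requests here, so the number of requests grows with the number of nodes rather than with the volume of spans.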
Proposed Solution
In order to allow for arbitrary batching of Spans, I propose adding optional Node/Process information as a field on the span proto structure. To do this the following changes would be needed:
This would allow for collector or agent services to batch the sending of spans from an arbitrary number of services.
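A minimal sketch of what arbitrary batching could look like once each span carries its own originator information. The `SpanWithNode` class, its `node` field, and the `mixed_batches` helper are hypothetical names chosen for this example; the actual field name and type on the span proto are not settled by this text.

```python
from dataclasses import dataclass
from typing import List


# Hypothetical span shape: the originator travels with the span itself.
@dataclass(frozen=True)
class SpanWithNode:
    name: str
    node: str  # assumed per-span field; stands in for Node/Process information


def mixed_batches(spans: List[SpanWithNode], max_batch_size: int) -> List[List[SpanWithNode]]:
    """Chunk spans purely by size; the originating node no longer matters."""
    return [spans[i:i + max_batch_size] for i in range(0, len(spans), max_batch_size)]


spans = [
    SpanWithNode("span-1", "node-a"),
    SpanWithNode("span-2", "node-b"),
    SpanWithNode("span-3", "node-a"),
    SpanWithNode("span-4", "node-c"),
]
batches = mixed_batches(spans, max_batch_size=4)
print(len(batches), sorted({s.node for s in batches[0]}))
```

With the originator on each span, the batch boundary is chosen by size alone, so spans from three nodes fit into one batch that can also be compressed as a unit.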
Impact on current agent and client libraries:
Since this field is optional, it would not prevent current optimizations (such as the pre-sending of Node data) from working as-is.