Change recording to create a span of type "record_root" #1703

sfc-gh-gtokernliang · 2024-12-19T17:35:27Z

Description

Updated the app context manager so that it can create a span of type "record_root".

Other details good to know for developers

For reference, this is the current set of spans as produced in the notebook:

{
    "name": "nested2",
    "context": {
        "trace_id": "0xac43c20656ef1ed158f02f4e62c97a04",
        "span_id": "0x424b5fe0f1f1a867",
        "trace_state": "[]"
    },
    "kind": "SpanKind.INTERNAL",
    "parent_id": "0xe197ec362bed5183",
    "start_time": "2024-12-19T17:17:48.214895Z",
    "end_time": "2024-12-19T17:17:48.828808Z",
    "status": {
        "status_code": "UNSET"
    },
    "attributes": {
        "name": "nested2",
        "kind": "SPAN_KIND_TRULENS",
        "parent_span_id": 4777017249493002343,
        "trulens.record_id": "bbd1b4dc-75fb-4d93-993d-e989e161330a",
        "nested2_ret": "nested2: test",
        "nested2_args[0]": "test",
        "status": "STATUS_CODE_UNSET"
    },
    "events": [],
    "links": [],
    "resource": {
        "attributes": {
            "telemetry.sdk.language": "python",
            "telemetry.sdk.name": "opentelemetry",
            "telemetry.sdk.version": "1.25.0",
            "service.name": "trulens"
        },
        "schema_url": ""
    }
}
{
    "name": "nested",
    "context": {
        "trace_id": "0xac43c20656ef1ed158f02f4e62c97a04",
        "span_id": "0xe197ec362bed5183",
        "trace_state": "[]"
    },
    "kind": "SpanKind.INTERNAL",
    "parent_id": "0x90ebf55e6bebc620",
    "start_time": "2024-12-19T17:17:47.576714Z",
    "end_time": "2024-12-19T17:17:49.875737Z",
    "status": {
        "status_code": "UNSET"
    },
    "attributes": {
        "name": "nested",
        "kind": "SPAN_KIND_TRULENS",
        "parent_span_id": 16255721097426456963,
        "trulens.record_id": "bbd1b4dc-75fb-4d93-993d-e989e161330a",
        "nested_attr1": "value1",
        "status": "STATUS_CODE_UNSET"
    },
    "events": [],
    "links": [],
    "resource": {
        "attributes": {
            "telemetry.sdk.language": "python",
            "telemetry.sdk.name": "opentelemetry",
            "telemetry.sdk.version": "1.25.0",
            "service.name": "trulens"
        },
        "schema_url": ""
    }
}
{
    "name": "respond_to_query",
    "context": {
        "trace_id": "0xac43c20656ef1ed158f02f4e62c97a04",
        "span_id": "0x90ebf55e6bebc620",
        "trace_state": "[]"
    },
    "kind": "SpanKind.INTERNAL",
    "parent_id": "0x47c29031b69b4971",
    "start_time": "2024-12-19T17:17:46.523158Z",
    "end_time": "2024-12-19T17:17:50.869185Z",
    "status": {
        "status_code": "UNSET"
    },
    "attributes": {
        "name": "respond_to_query",
        "kind": "SPAN_KIND_TRULENS",
        "parent_span_id": 10442709946874971680,
        "trulens.record_id": "bbd1b4dc-75fb-4d93-993d-e989e161330a",
        "status": "STATUS_CODE_UNSET"
    },
    "events": [],
    "links": [],
    "resource": {
        "attributes": {
            "telemetry.sdk.language": "python",
            "telemetry.sdk.name": "opentelemetry",
            "telemetry.sdk.version": "1.25.0",
            "service.name": "trulens"
        },
        "schema_url": ""
    }
}
{
    "name": "root",
    "context": {
        "trace_id": "0xac43c20656ef1ed158f02f4e62c97a04",
        "span_id": "0x47c29031b69b4971",
        "trace_state": "[]"
    },
    "kind": "SpanKind.INTERNAL",
    "parent_id": null,
    "start_time": "2024-12-19T17:17:42.002877Z",
    "end_time": "2024-12-19T17:17:51.852615Z",
    "status": {
        "status_code": "UNSET"
    },
    "attributes": {
        "kind": "SPAN_KIND_TRULENS",
        "name": "root",
        "trulens.span_type": "record_root",
        "trulens.record_root.app_name": "default_app",
        "trulens.record_root.app_version": "base",
        "trulens.record_root.app_id": "app_hash_baf7b2cb6402e84fa3b0b3a028d4bf65",
        "trulens.record_root.record_id": "bbd1b4dc-75fb-4d93-993d-e989e161330a"
    },
    "events": [],
    "links": [],
    "resource": {
        "attributes": {
            "telemetry.sdk.language": "python",
            "telemetry.sdk.name": "opentelemetry",
            "telemetry.sdk.version": "1.25.0",
            "service.name": "trulens"
        },
        "schema_url": ""
    }
}
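
The parent/child chain in the dump above can be recovered from span_id and parent_id alone. A minimal sketch, assuming the exported spans have been loaded into plain Python dicts (only the three fields needed here are kept; `find_root` and `ancestry` are hypothetical helper names, not part of TruLens or OpenTelemetry):

```python
from typing import Dict, List, Optional

# Trimmed copies of the four spans above: name, span_id, parent_id.
SPANS: List[Dict[str, Optional[str]]] = [
    {"name": "nested2", "span_id": "0x424b5fe0f1f1a867", "parent_id": "0xe197ec362bed5183"},
    {"name": "nested", "span_id": "0xe197ec362bed5183", "parent_id": "0x90ebf55e6bebc620"},
    {"name": "respond_to_query", "span_id": "0x90ebf55e6bebc620", "parent_id": "0x47c29031b69b4971"},
    {"name": "root", "span_id": "0x47c29031b69b4971", "parent_id": None},
]


def find_root(spans: List[Dict[str, Optional[str]]]) -> Dict[str, Optional[str]]:
    """Return the single span that has no parent."""
    roots = [s for s in spans if s["parent_id"] is None]
    if len(roots) != 1:
        raise ValueError(f"expected exactly one root span, found {len(roots)}")
    return roots[0]


def ancestry(spans: List[Dict[str, Optional[str]]], name: str) -> List[str]:
    """Return span names from the root down to the span called `name`."""
    by_id = {s["span_id"]: s for s in spans}
    current = next(s for s in spans if s["name"] == name)
    chain = [current["name"]]
    # Walk upwards through parent_id until the parentless span is reached.
    while current["parent_id"] is not None:
        current = by_id[current["parent_id"]]
        chain.append(current["name"])
    return list(reversed(chain))


print(find_root(SPANS)["name"])
print(" -> ".join(ancestry(SPANS, "nested2")))
```

With the spans listed here, the parentless span is the one carrying "trulens.span_type": "record_root", and the other three hang off it in a single chain.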

Type of change

- Bug fix (non-breaking change which fixes an issue)
- New feature (non-breaking change which adds functionality)
- Breaking change (fix or feature that would cause existing functionality to not work as expected)
- New Tests
- This change includes re-generated golden test results
- This change requires a documentation update

Important

Add record_root span type to app context manager for enhanced tracing with OpenTelemetry.

Behavior:
- Add span type record_root to app context manager in app.py.
- Update App class in instrument.py to manage record_root spans with OpenTelemetry.
Attributes:
- Add SPAN_TYPE and RECORD_ID attributes in trace.py for span identification.
Imports:
- Change import paths from trulens.experimental.otel_tracing.core.app to trulens.experimental.otel_tracing.core.instrument in app.py.

This description was created by the ellipsis-dev bot for 14869e1. It will automatically update as commits are pushed.

review-notebook-app · 2024-12-19T17:35:32Z

Check out this pull request on ReviewNB.

See visual diffs & provide feedback on Jupyter Notebooks.

Powered by ReviewNB

examples/experimental/otel_exporter.ipynb

src/otel/semconv/trulens/otel/semconv/trace.py

sfc-gh-dkurokawa · 2024-12-20T22:00:31Z

src/core/trulens/experimental/otel_tracing/core/instrument.py

+        tracer = trace.get_tracer_provider().get_tracer(TRULENS_SERVICE_NAME)
+
+        # Calling set_baggage does not actually add the baggage to the current context, but returns a new one
+        # To avoid issues with remembering to add/remove the baggage, we attach it to the runtime context.


I also don't really get this token business, can you enlighten me?

From what I can tell, to set the baggage you need to:

- Call set_baggage, which returns a new, updated context.
- Attach that context as the current one with context_api.attach. This returns a token that can later be used to remove it.

Once the record is over, we want to remove the baggage, and the way to do that via the OTEL context API is to hand the token back with context_api.detach(token), which restores the context that was current before the attach.
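
The set/attach/detach sequence described above can be modelled with nothing but the standard library. This is a stand-in sketch, not the OTEL API: the functions below are local, hypothetical helpers that merely reuse the names set_baggage, attach and detach, and a contextvars.ContextVar plays the role of the runtime context.

```python
import contextvars
from types import MappingProxyType
from typing import Mapping, Optional

# Stand-in for the runtime context: one ContextVar holding an immutable mapping.
_CURRENT: contextvars.ContextVar = contextvars.ContextVar(
    "current_context", default=MappingProxyType({})
)


def set_baggage(name: str, value: str, ctx: Optional[Mapping] = None) -> Mapping:
    """Return a NEW context with the entry added; the current one is untouched."""
    base = _CURRENT.get() if ctx is None else ctx
    return MappingProxyType({**base, name: value})


def get_baggage(name: str) -> Optional[str]:
    """Look the entry up in whatever context is current right now."""
    return _CURRENT.get().get(name)


def attach(ctx: Mapping) -> object:
    """Make `ctx` current and return a token that remembers the previous one."""
    return _CURRENT.set(ctx)


def detach(token: object) -> None:
    """Restore the context that was current before the matching attach()."""
    _CURRENT.reset(token)


new_ctx = set_baggage("record_id", "bbd1b4dc")
before_attach = get_baggage("record_id")  # set_baggage alone changes nothing
token = attach(new_ctx)
during = get_baggage("record_id")         # visible only while attached
detach(token)
after_detach = get_baggage("record_id")   # back to the previous context
print(before_attach, during, after_detach)
```

The token is what makes the undo safe: detach does not need to know which keys were added, it simply puts back whatever was current when the token was issued.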

sfc-gh-dkurokawa · 2024-12-20T22:07:11Z

src/core/trulens/experimental/otel_tracing/core/instrument.py

+        root_span.set_attribute(SpanAttributes.RECORD_ROOT.APP_ID, self.app_id)
+        root_span.set_attribute(
+            SpanAttributes.RECORD_ROOT.RECORD_ID, otel_record_id
+        )


The semantic convention that Piotr laid out also has MAIN_INPUT/MAIN_OUTPUT/MAIN_ERROR, but I don't see how we can get those even during __exit__, so I'm inclined to say we can remove them. That does raise the question of how the UI will know which input and output to display. It's too close to the finish line of 2024 for me to want to investigate how this is currently determined, but I can next year haha.

There's also TOTAL_COST but I'm not as worried about that.

Yeah, we should chat about this. One idea I had while typing this up was to omit the record_root span type, and instead:

- Attributes that were initially proposed to be stored in record_root should instead be stored in the baggage (name/version/id/record_id).
- Every span in the record should track all of the attributes above.
- The root of the record is essentially the span with no parent.

We still need the root span type to denote when we start a record, though: if you create an OTEL trace before you start calling your app a bunch of times, the calls will all be part of the same trace, but we want them to be separate records.

But more importantly, I'm confused, how does this help us determine the MAIN_INPUT/MAIN_OUTPUT/MAIN_ERROR?

if you create a otel trace before you start calling your app a bunch of times they'll all be part of the same trace but we want them to be separate records.

I know we discussed this before, but my memory's failing me - what does the pseudocode for that look like? is it:

with tru_app as recording:
    tru_app.respond_to_query(query)
    tru_app.do_something_else(query)

just so I can experiment with it a little more :)

But more importantly, I'm confused, how does this help us determine the MAIN_INPUT/MAIN_OUTPUT/MAIN_ERROR?

We aren't doing this in code yet, but I'm thinking that we should track the input/output/error for every span, so the way we determine the main input/output/error semantically would be:

- Find the span with no parent.
- Use its input/output/error as the main input/output/error for the record.
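
The two steps above could look roughly like the sketch below. Everything here is an assumption for illustration: spans are plain dicts shaped like the exported JSON, "input", "output" and "error" are hypothetical attribute keys (nothing in this PR records them yet), and the sample record is made up.

```python
from typing import Any, Dict, List, Optional

Span = Dict[str, Any]


def main_io(spans: List[Span]) -> Dict[str, Optional[str]]:
    """Pick the record's main input/output/error from its parentless span."""
    roots = [s for s in spans if s.get("parent_id") is None]
    if len(roots) != 1:
        raise ValueError(f"expected exactly one parentless span, found {len(roots)}")
    attrs = roots[0].get("attributes", {})
    # "input", "output" and "error" are hypothetical attribute keys.
    return {key: attrs.get(key) for key in ("input", "output", "error")}


# Made-up record with two spans; only the first has no parent.
record = [
    {
        "name": "respond_to_query",
        "parent_id": None,
        "attributes": {"input": "test", "output": "answer: test"},
    },
    {
        "name": "nested",
        "parent_id": "0x90ebf55e6bebc620",
        "attributes": {"input": "test", "output": "nested: test"},
    },
]
result = main_io(record)
print(result)
```

Raising when there is not exactly one parentless span is a design choice in this sketch; it surfaces the case discussed earlier in the thread, where several app calls end up under one pre-existing trace.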

sfc-gh-dkurokawa · 2024-12-20T22:11:36Z

src/core/trulens/experimental/otel_tracing/core/instrument.py

@@ -51,6 +58,10 @@ def wrapper(*args, **kwargs):
                span.set_attribute(
                    "parent_span_id", parent_span.get_span_context().span_id
                )
+                span.set_attribute(


we want app/run id in the baggage as well I guess, but given that it's not totally clear yet no point in doing it now I suppose.

I'll put the app id in baggage for now I guess since that seems pretty good to me. For run_id it's a little more ambiguous what the shape of the API will look like so I'll leave that out for now.

sfc-gh-dkurokawa · 2024-12-24T03:58:11Z

src/core/trulens/core/app.py

@@ -421,6 +422,18 @@ def tru(self) -> core_connector.DBConnector:
        pydantic.PrivateAttr(default_factory=dict)
    )

+    tokens: list[object] = []


I think list[object] is worse than List[object], since the former doesn't work in Python 3.8, which we do support.

also, is the type of the token varying or something? Why object?

thanks! didn't know about the difference between the two before.

object was chosen because that's the return type in the signature of the attach function in the OTEL context API.
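
For reference, a small sketch of the difference. Subscripting the builtin list at runtime only works from Python 3.9 onwards (PEP 585), whereas typing.List works on 3.8 as well; `builtin_generic_supported` is a hypothetical helper added only to make the check visible.

```python
import sys
from typing import List

# typing.List can be subscripted on every supported version, including 3.8.
tokens: List[object] = []


def builtin_generic_supported() -> bool:
    """Return True when `list[object]` can be evaluated at runtime."""
    try:
        list[object]  # raises TypeError on Python 3.8
    except TypeError:
        return False
    return True


print(sys.version_info[:2], builtin_generic_supported())
```

This matters for a pydantic model field in particular, because the annotation has to be evaluated at runtime to build the field.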

sfc-gh-dkurokawa · 2024-12-24T03:59:31Z

src/core/trulens/experimental/otel_tracing/core/instrument.py

@@ -84,6 +90,20 @@ def wrapper(*args, **kwargs):
                    # It's on the user to deal with None as a return value.
                    func_exception = e

+                span.set_attribute("name", func.__name__)


is this something that the event table expects or something?

yeah I'm trying to follow the schema where possible - see https://docs.snowflake.com/en/developer-guide/logging-tracing/event-table-columns#for-span-record-type

sfc-gh-dkurokawa · 2024-12-24T04:03:47Z

src/core/trulens/experimental/otel_tracing/core/instrument.py

+        root_span.set_attribute(SpanAttributes.RECORD_ROOT.APP_ID, self.app_id)
+        root_span.set_attribute(
+            SpanAttributes.RECORD_ROOT.RECORD_ID, otel_record_id
+        )


We still need the root span type to denote when we start a record, though: if you create an OTEL trace before you start calling your app a bunch of times, the calls will all be part of the same trace, but we want them to be separate records.

But more importantly, I'm confused, how does this help us determine the MAIN_INPUT/MAIN_OUTPUT/MAIN_ERROR?

sfc-gh-dkurokawa · 2024-12-24T04:09:49Z

src/core/trulens/experimental/otel_tracing/core/instrument.py

+            # See https://github.com/open-telemetry/opentelemetry-python/issues/2432#issuecomment-1593458684
+            context_api.detach(self.tokens.pop())
+
+        if self.span_context:


would this ever not be true?

I doubt it, but I didn't think it would hurt to include it; if nothing else, it helps the type checking haha
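
A sketch of why keeping the tokens in a list and calling pop() on exit handles nested recordings. It is a standard-library model under stated assumptions: `Recording` is a hypothetical class, and ContextVar.set/reset stand in for the OTEL attach/detach pair.

```python
import contextvars
from typing import List

_RECORD_ID: contextvars.ContextVar = contextvars.ContextVar("record_id", default=None)


class Recording:
    """Hypothetical context manager; keeps one token per active `with` block."""

    def __init__(self) -> None:
        self.tokens: List[object] = []
        self._count = 0

    def __enter__(self) -> "Recording":
        self._count += 1
        # set() plays the role of attach(): it returns a token for the old value.
        self.tokens.append(_RECORD_ID.set(f"record-{self._count}"))
        return self

    def __exit__(self, exc_type, exc, tb) -> None:
        # reset() plays the role of detach(); pop() undoes the newest entry first.
        _RECORD_ID.reset(self.tokens.pop())


app = Recording()
seen = []
with app:
    seen.append(_RECORD_ID.get())
    with app:
        seen.append(_RECORD_ID.get())
    seen.append(_RECORD_ID.get())
seen.append(_RECORD_ID.get())
print(seen)
```

Because tokens are popped in last-in, first-out order, leaving the inner block restores the outer record id rather than clearing it.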

sfc-gh-gtokernliang added 10 commits December 17, 2024 12:29

draft

2b4a3a9

update

49d79dc

prefix

7207252

updates

f433019

touchups

2b03079

add event_id

a8e4955

add type schema

a6c0b63

add event_id

0ca7ebc

minor updates

1c6cbc2

pr feedback

3059d79

sfc-gh-gtokernliang requested a review from sfc-gh-dkurokawa December 19, 2024 17:35

dosubot bot added the size:L This PR changes 100-499 lines, ignoring generated files. label Dec 19, 2024

update

021a43c

sfc-gh-gtokernliang force-pushed the garett/SNOW-1854278 branch from 9ab5db6 to 049b404 Compare December 19, 2024 20:24

sfc-gh-gtokernliang force-pushed the garett/SNOW-1854070 branch from 712f97b to c9db9a1 Compare December 19, 2024 20:33

sfc-gh-dkurokawa reviewed Dec 20, 2024

sfc-gh-gtokernliang added 13 commits December 20, 2024 18:05

ORM update

42f0575

Merge branch 'main' of github.com:truera/trulens into garett/SNOW-185…

fb7cf3b

…4054

save

9c313b4

update exporter

20bb651

update

d0b4398

nits

99dcb44

fix parent

834d6e7

save

eb61153

save

7fd7de2

update typing

aad3ffa

update

9e27b0f

update

8004a55

update exporter to have the attributes in attributes

f35f2b0

sfc-gh-dkurokawa and others added 16 commits December 21, 2024 12:56

Add framework for test.

4528258

Add in better test.

a1c461c

Clean up some issues and fix import issues.

1c34cd8

Use Dict instead of dict for types.

7814807

Allow otel test to run with experimental flag.

c390f94

Don't compare sdk versions.

fe12d1b

Handle all rows that need to have the sdk version ignored.

a7e3f53

Fix handling None issue for ignore_locators arg.

5dbc83a

save

3f1607c

draft

2a6e699

update

41b37b0

save

2067909

add back debugger

c61dda6

update notebook

402c0e5

fix

364f6aa

update semcov

17619e6

Base automatically changed from garett/SNOW-1854278 to main December 23, 2024 19:31

dosubot bot added size:XL This PR changes 500-999 lines, ignoring generated files. and removed size:L This PR changes 100-499 lines, ignoring generated files. labels Dec 23, 2024

remove artifacts

d300f67

sfc-gh-gtokernliang force-pushed the garett/SNOW-1854070 branch from 6b063f5 to d300f67 Compare December 24, 2024 02:09

ellipsis-dev bot reviewed Dec 24, 2024

Merge branch 'main' of github.com:truera/trulens into garett/SNOW-185…

659f82c

…4070

dosubot bot added size:L This PR changes 100-499 lines, ignoring generated files. and removed size:XL This PR changes 500-999 lines, ignoring generated files. labels Dec 24, 2024

sfc-gh-gtokernliang added 3 commits December 23, 2024 21:17

remove span_types from SpanAttributes

bc26f34

modified it to accept multiple tokens

86850df

fix bug with multiple func calls

57b6809

sfc-gh-dkurokawa reviewed Dec 24, 2024

PR feedback

14869e1
