Change recording to create a span of type "record_root" #1703

Open
wants to merge 62 commits into base: main

62 commits
2b4a3a9
draft
sfc-gh-gtokernliang Dec 11, 2024
49d79dc
update
sfc-gh-gtokernliang Dec 12, 2024
7207252
prefix
sfc-gh-gtokernliang Dec 17, 2024
f433019
updates
sfc-gh-gtokernliang Dec 18, 2024
2b03079
touchups
sfc-gh-gtokernliang Dec 18, 2024
a8e4955
add event_id
sfc-gh-gtokernliang Dec 18, 2024
a6c0b63
add type schema
sfc-gh-gtokernliang Dec 18, 2024
0ca7ebc
add event_id
sfc-gh-gtokernliang Dec 18, 2024
1c6cbc2
minor updates
sfc-gh-gtokernliang Dec 18, 2024
3059d79
pr feedback
sfc-gh-gtokernliang Dec 19, 2024
021a43c
update
sfc-gh-gtokernliang Dec 19, 2024
42f0575
ORM update
sfc-gh-gtokernliang Dec 20, 2024
fb7cf3b
Merge branch 'main' of github.com:truera/trulens into garett/SNOW-185…
sfc-gh-gtokernliang Dec 20, 2024
9c313b4
save
sfc-gh-gtokernliang Dec 11, 2024
20bb651
update exporter
sfc-gh-gtokernliang Dec 12, 2024
d0b4398
update
sfc-gh-gtokernliang Dec 12, 2024
99dcb44
nits
sfc-gh-gtokernliang Dec 12, 2024
834d6e7
fix parent
sfc-gh-gtokernliang Dec 16, 2024
eb61153
save
sfc-gh-gtokernliang Dec 18, 2024
7fd7de2
save
sfc-gh-gtokernliang Dec 18, 2024
aad3ffa
update typing
sfc-gh-gtokernliang Dec 18, 2024
9e27b0f
update
sfc-gh-gtokernliang Dec 18, 2024
8004a55
update
sfc-gh-gtokernliang Dec 19, 2024
f35f2b0
update exporter to have the attributes in attributes
sfc-gh-gtokernliang Dec 20, 2024
4ed4546
save
sfc-gh-gtokernliang Dec 12, 2024
c1516c7
save
sfc-gh-gtokernliang Dec 12, 2024
85f2e62
update
sfc-gh-gtokernliang Dec 12, 2024
47672f2
add try-catch
sfc-gh-gtokernliang Dec 12, 2024
aafae55
updatE
sfc-gh-gtokernliang Dec 18, 2024
3d23e0b
PR feedback
sfc-gh-gtokernliang Dec 18, 2024
fbc6e60
updates
sfc-gh-gtokernliang Dec 18, 2024
efa8c6e
update a bit
sfc-gh-gtokernliang Dec 18, 2024
cf97904
remove redundant print
sfc-gh-gtokernliang Dec 18, 2024
dc98c62
remove snowflake
sfc-gh-gtokernliang Dec 19, 2024
7dae6ff
remove instrument
sfc-gh-gtokernliang Dec 19, 2024
e1b685c
prepend namespace
sfc-gh-gtokernliang Dec 21, 2024
bf80a11
update semcov
sfc-gh-gtokernliang Dec 21, 2024
e64f8ee
Merge remote-tracking branch 'origin/main' into garett/SNOW-1854278
sfc-gh-dkurokawa Dec 21, 2024
79b157f
Fix api tests.
sfc-gh-dkurokawa Dec 21, 2024
0e3564b
Incorporate my own comments in the review except for the adding of a …
sfc-gh-dkurokawa Dec 21, 2024
4528258
Add framework for test.
sfc-gh-dkurokawa Dec 21, 2024
a1c461c
Add in better test.
sfc-gh-dkurokawa Dec 22, 2024
1c34cd8
Clean up some issues and fix import issues.
sfc-gh-dkurokawa Dec 22, 2024
7814807
Use `Dict` instead of `dict` for types.
sfc-gh-dkurokawa Dec 22, 2024
c390f94
Allow otel test to run with experimental flag.
sfc-gh-dkurokawa Dec 22, 2024
fe12d1b
Don't compare sdk versions.
sfc-gh-dkurokawa Dec 22, 2024
a7e3f53
Handle all rows that need to have the sdk version ignored.
sfc-gh-dkurokawa Dec 23, 2024
5dbc83a
Fix handling `None` issue for `ignore_locators` arg.
sfc-gh-dkurokawa Dec 23, 2024
3f1607c
save
sfc-gh-gtokernliang Dec 16, 2024
2a6e699
draft
sfc-gh-gtokernliang Dec 19, 2024
41b37b0
update
sfc-gh-gtokernliang Dec 19, 2024
2067909
save
sfc-gh-gtokernliang Dec 20, 2024
c61dda6
add back debugger
sfc-gh-gtokernliang Dec 20, 2024
402c0e5
update notebook
sfc-gh-gtokernliang Dec 20, 2024
364f6aa
fix
sfc-gh-gtokernliang Dec 21, 2024
17619e6
update semcov
sfc-gh-gtokernliang Dec 21, 2024
d300f67
remove artifacts
sfc-gh-gtokernliang Dec 24, 2024
659f82c
Merge branch 'main' of github.com:truera/trulens into garett/SNOW-185…
sfc-gh-gtokernliang Dec 24, 2024
bc26f34
remove span_types from SpanAttributes
sfc-gh-gtokernliang Dec 24, 2024
86850df
modified it to accept multiple tokens
sfc-gh-gtokernliang Dec 24, 2024
57b6809
fix bug with multiple func calls
sfc-gh-gtokernliang Dec 24, 2024
14869e1
PR feedback
sfc-gh-gtokernliang Dec 24, 2024
1 change: 1 addition & 0 deletions examples/experimental/otel_exporter.ipynb
@@ -109,6 +109,7 @@
"session = TruSession()\n",
"session.experimental_enable_feature(\"otel_tracing\")\n",
"session.reset_database()\n",
"\n",
"init(session, debug=True)"
]
},
39 changes: 29 additions & 10 deletions src/core/trulens/core/app.py
@@ -3,6 +3,7 @@
from abc import ABC
from abc import ABCMeta
from abc import abstractmethod
import contextlib
import contextvars
import datetime
import inspect
@@ -421,6 +422,18 @@ def tru(self) -> core_connector.DBConnector:
pydantic.PrivateAttr(default_factory=dict)
)

tokens: List[object] = []
"""
OTEL context tokens for the current context manager. These tokens are how the OTEL
context api keeps track of what is changed in the context, and used to undo the changes.
"""

span_context: Optional[contextlib.AbstractContextManager] = None
"""
Span context manager. Required to help keep track of the appropriate span context
to enter/exit.
"""

def __init__(
self,
connector: Optional[core_connector.DBConnector] = None,
@@ -1048,27 +1061,29 @@ def __enter__(self):
if self.session.experimental_feature(
core_experimental.Feature.OTEL_TRACING
):
-from trulens.experimental.otel_tracing.core.app import _App
+from trulens.experimental.otel_tracing.core.instrument import (
+    App as OTELApp,
+)

-return _App.__enter__(self)
+return OTELApp.__enter__(self)

ctx = core_instruments._RecordingContext(app=self)

token = self.recording_contexts.set(ctx)
ctx.token = token

-# self._set_context_vars()

return ctx

# For use as a context manager.
def __exit__(self, exc_type, exc_value, exc_tb):
if self.session.experimental_feature(
core_experimental.Feature.OTEL_TRACING
):
-from trulens.experimental.otel_tracing.core.app import _App
+from trulens.experimental.otel_tracing.core.instrument import (
+    App as OTELApp,
+)

-return _App.__exit__(self, exc_type, exc_value, exc_tb)
+return OTELApp.__exit__(self, exc_type, exc_value, exc_tb)

ctx = self.recording_contexts.get()
self.recording_contexts.reset(ctx.token)
@@ -1085,9 +1100,11 @@ async def __aenter__(self):
if self.session.experimental_feature(
core_experimental.Feature.OTEL_TRACING
):
-from trulens.experimental.otel_tracing.core.app import _App
+from trulens.experimental.otel_tracing.core.instrument import (
+    App as OTELApp,
+)

-return await _App.__aenter__(self)
+return OTELApp.__enter__(self)

ctx = core_instruments._RecordingContext(app=self)

@@ -1103,9 +1120,11 @@ async def __aexit__(self, exc_type, exc_value, exc_tb):
if self.session.experimental_feature(
core_experimental.Feature.OTEL_TRACING
):
-from trulens.experimental.otel_tracing.core.app import _App
+from trulens.experimental.otel_tracing.core.instrument import (
+    App as OTELApp,
+)

-return await _App.__aexit__(self, exc_type, exc_value, exc_tb)
+return OTELApp.__exit__(self, exc_type, exc_value, exc_tb)

ctx = self.recording_contexts.get()
self.recording_contexts.reset(ctx.token)
86 changes: 86 additions & 0 deletions src/core/trulens/experimental/otel_tracing/core/instrument.py
@@ -1,8 +1,14 @@
from functools import wraps
import logging
from typing import Any, Callable, Dict, Optional, Union
import uuid

from opentelemetry import trace
from opentelemetry.baggage import get_baggage
from opentelemetry.baggage import remove_baggage
from opentelemetry.baggage import set_baggage
import opentelemetry.context as context_api
from trulens.core import app as core_app
from trulens.experimental.otel_tracing.core.init import TRULENS_SERVICE_NAME
from trulens.otel.semconv.trace import SpanAttributes

@@ -84,6 +90,20 @@ def wrapper(*args, **kwargs):
# It's on the user to deal with None as a return value.
func_exception = e

span.set_attribute("name", func.__name__)
Contributor

Is this something that the event table expects?

span.set_attribute("kind", "SPAN_KIND_TRULENS")
span.set_attribute(
"parent_span_id", span.get_span_context().span_id
)
span.set_attribute(
Contributor

We want the app/run id in the baggage as well, I guess, but since that isn't fully settled yet there's no point in doing it now, I suppose.

Contributor Author

I'll put the app id in baggage for now, since that seems pretty reasonable to me. For run_id, it's more ambiguous what the shape of the API will look like, so I'll leave that out for now.

SpanAttributes.RECORD_ID,
str(get_baggage(SpanAttributes.RECORD_ID)),
)
span.set_attribute(
SpanAttributes.APP_ID,
str(get_baggage(SpanAttributes.APP_ID)),
)

try:
attributes_to_add = {}

@@ -137,3 +157,69 @@ def wrapper(*args, **kwargs):
return wrapper

return inner_decorator


class App(core_app.App):
# For use as a context manager.
def __enter__(self):
logger.debug("Entering the OTEL app context.")

# Note: This is not the same as the record_id in the core app since the OTEL
# tracing is currently separate from the old records behavior
otel_record_id = str(uuid.uuid4())

tracer = trace.get_tracer_provider().get_tracer(TRULENS_SERVICE_NAME)

# Calling set_baggage does not actually add the baggage to the current context, but returns a new one
# To avoid issues with remembering to add/remove the baggage, we attach it to the runtime context.
Contributor

s/To/to

Contributor

I also don't really get this token business, can you enlighten me?

Contributor Author

From what I can tell, to set the baggage you need to:

  1. Call set_baggage, which returns a new, updated context.
  2. Attach that context as the current one with context_api.attach. This returns a token that can later be used to undo the attach.
  3. Once the record is over, we want to remove the baggage; the way to do that through the OTEL context is to call context_api.detach(token) with the token from step 2.

self.tokens.append(
context_api.attach(
set_baggage(SpanAttributes.RECORD_ID, otel_record_id)
)
)
self.tokens.append(
context_api.attach(set_baggage(SpanAttributes.APP_ID, self.app_id))
)

# Use start_as_current_span as a context manager
self.span_context = tracer.start_as_current_span("root")
root_span = self.span_context.__enter__()

logger.debug(str(get_baggage(SpanAttributes.RECORD_ID)))

# Set general span attributes
root_span.set_attribute("kind", "SPAN_KIND_TRULENS")
root_span.set_attribute("name", "root")
root_span.set_attribute(
SpanAttributes.SPAN_TYPE, SpanAttributes.SpanType.RECORD_ROOT
)
root_span.set_attribute(SpanAttributes.APP_ID, self.app_id)
root_span.set_attribute(SpanAttributes.RECORD_ID, otel_record_id)

# Set record root specific attributes
root_span.set_attribute(
SpanAttributes.RECORD_ROOT.APP_NAME, self.app_name
)
root_span.set_attribute(
SpanAttributes.RECORD_ROOT.APP_VERSION, self.app_version
)
root_span.set_attribute(SpanAttributes.RECORD_ROOT.APP_ID, self.app_id)
root_span.set_attribute(
SpanAttributes.RECORD_ROOT.RECORD_ID, otel_record_id
)
Contributor

The semantic convention that Piotr laid out also has MAIN_INPUT/MAIN_OUTPUT/MAIN_ERROR, but I don't see how we can get those even during __exit__, so I'm inclined to say we can remove them. That does raise the question of how the UI will know which input and output to display. We're too close to the end of 2024 for me to investigate how this is currently determined, but I can next year haha.

There's also TOTAL_COST, but I'm not as worried about that.

Contributor Author
sfc-gh-gtokernliang, Dec 23, 2024

Yeah, we should chat about this. One idea I had while typing this up was to omit the record_root span type, and instead:

  1. Attributes that were initially proposed to be stored in record_root would now be stored in the baggage (name/version/id/record_id).
  2. Every trace in the record should track all of the attributes above.
  3. The root of the record is essentially the span with no parent.

Contributor

We still need the root span type to denote when a record starts, though: if you create an OTEL trace before you start calling your app a bunch of times, those calls will all be part of the same trace, but we want them to be separate records.

But more importantly, I'm confused: how does this help us determine the MAIN_INPUT/MAIN_OUTPUT/MAIN_ERROR?
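
The concern about several records sharing one trace can be sketched as follows. This is a minimal illustration under assumptions: the `Span` dataclass and `group_by_record_root` helper are hypothetical and not part of TruLens or OpenTelemetry; they only show why a `record_root` marker lets spans under one outer trace be split into separate records.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Span:
    # Hypothetical, simplified span shape for illustration only.
    span_id: str
    parent_id: Optional[str]
    trace_id: str
    span_type: str  # e.g. "record_root" or "unknown"


def group_by_record_root(spans: List[Span]) -> Dict[str, List[str]]:
    """Map each record_root span id to the ids of the spans at or beneath it."""
    by_id = {span.span_id: span for span in spans}
    records: Dict[str, List[str]] = {}
    for span in spans:
        # Walk up the parent chain until a record_root span is found.
        current: Optional[Span] = span
        while current is not None and current.span_type != "record_root":
            current = by_id.get(current.parent_id) if current.parent_id else None
        if current is not None:
            records.setdefault(current.span_id, []).append(span.span_id)
    return records


# One outer trace ("t1") wrapping two separate app invocations.
spans = [
    Span("outer", None, "t1", "unknown"),
    Span("root_a", "outer", "t1", "record_root"),
    Span("call_a", "root_a", "t1", "unknown"),
    Span("root_b", "outer", "t1", "record_root"),
    Span("call_b", "root_b", "t1", "unknown"),
]
records = group_by_record_root(spans)
```

All five spans share a trace id, yet the two invocations end up as separate records; the outer span belongs to neither. Grouping by "the span with no parent" alone would instead lump everything under `outer`.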

Contributor Author

"if you create a otel trace before you start calling your app a bunch of times they'll all be part of the same trace but we want them to be separate records."

I know we discussed this before, but my memory's failing me. What does the pseudocode for that look like? Is it:

with tru_app as recording:
  tru_app.respond_to_query(query)
  tru_app.do_something_else(query)

Just so I can experiment with it a little more :)

"But more importantly, I'm confused, how does this help us determine the MAIN_INPUT/MAIN_OUTPUT/MAIN_ERROR?"

We aren't doing this in code yet, but I'm thinking that we should track the input/output/error for every span, so the way we determine the main input/output/error semantically would be:

  1. Find the span with no parent.
  2. Use its input/output/error as the main input/output/error for the record.
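
The two-step rule above could look roughly like this. It is a sketch of the proposal only, not existing TruLens behavior: the span dictionaries and the keys `parent_span_id`, `input`, `output`, and `error` are assumed shapes chosen for illustration, and the helper assumes a record contains exactly one parentless span.

```python
from typing import Any, Dict, List, Optional


def main_io(spans: List[Dict[str, Any]]) -> Optional[Dict[str, Any]]:
    """Derive record-level main input/output/error from the parentless span."""
    roots = [span for span in spans if span.get("parent_span_id") is None]
    if len(roots) != 1:
        # Ambiguous or empty record: the rule does not apply.
        return None
    root = roots[0]
    return {
        "main_input": root.get("input"),
        "main_output": root.get("output"),
        "main_error": root.get("error"),
    }


record_spans = [
    {"span_id": "a", "parent_span_id": None, "input": "query", "output": "answer"},
    {"span_id": "b", "parent_span_id": "a", "input": "query", "output": "docs"},
]
result = main_io(record_spans)
```

Returning `None` when there is not exactly one parentless span is a design choice of this sketch; it makes the case raised earlier in the thread, where an outer trace already exists, fail loudly instead of picking the wrong span.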


return root_span

def __exit__(self, exc_type, exc_value, exc_tb):
remove_baggage(SpanAttributes.RECORD_ID)
remove_baggage(SpanAttributes.APP_ID)

logger.debug("Exiting the OTEL app context.")

while self.tokens:
# Clearing the context once we're done with this root span.
# See https://github.com/open-telemetry/opentelemetry-python/issues/2432#issuecomment-1593458684
context_api.detach(self.tokens.pop())

if self.span_context:
Contributor

Would this ever not be true?

Contributor Author

I doubt it, but I didn't think it would hurt to include it, if nothing else at least for the type checking haha
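
The type-checking point can be shown with a small standard-library sketch. The `Recorder` class and `fake_span` context manager below are hypothetical stand-ins, not the PR's `App` class or an OTEL tracer; they only illustrate storing a manually entered context manager in an `Optional` attribute and narrowing it before calling `__exit__`.

```python
import contextlib
from typing import List, Optional

events: List[str] = []


@contextlib.contextmanager
def fake_span(name: str):
    # Stand-in for tracer.start_as_current_span(name).
    events.append(f"start {name}")
    try:
        yield name
    finally:
        events.append(f"end {name}")


class Recorder:
    def __init__(self) -> None:
        # None until __enter__ runs, so the declared type must be Optional.
        self.span_context: Optional[contextlib.AbstractContextManager] = None

    def __enter__(self):
        self.span_context = fake_span("root")
        return self.span_context.__enter__()

    def __exit__(self, exc_type, exc_value, exc_tb):
        # The guard narrows Optional[...] for a type checker; at runtime it
        # only matters if __exit__ is ever called without a prior __enter__.
        if self.span_context is not None:
            result = self.span_context.__exit__(exc_type, exc_value, exc_tb)
            self.span_context = None
            return result
        return None


with Recorder() as span_name:
    events.append(f"inside {span_name}")
```

Resetting `span_context` to `None` after exiting is an extra choice in this sketch, so that a stale context manager is never exited twice.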

self.span_context.__exit__(exc_type, exc_value, exc_tb)
6 changes: 6 additions & 0 deletions src/otel/semconv/trulens/otel/semconv/trace.py
@@ -56,6 +56,12 @@ class SpanAttributes:
User-defined selector name for the current span.
"""

RECORD_ID = BASE + "record_id"
"""ID of the record that the span belongs to."""

APP_ID = BASE + "app_id"
"""ID of the app that the span belongs to."""

class SpanType(str, Enum):
"""Span type attribute values.
