[Sandbox] OpenLLMetry #67
This project provides good functionality. I would like to see the long-term roadmap, but the roadmap link does not work. I see development is ongoing. Could you provide information on how adoption is going, as well as any plan for community growth beyond one company's contributions? I do not see a governance document. Could you add that? Have you presented to TAG Observability as stated in the "Project presentations" section? If so, what was the TAG's feedback on your presentation?
Thanks @cathyhongzhang. Changed the roadmap privacy setting and added a governance doc. I just presented to TAG Observability a few hours ago, and will update here with the presentation and comments.
Can you explain why OpenLLMetry should be an autonomous CNCF project, rather than a subproject of OpenTelemetry?
Sure @jberkus. While our initial focus is only tracing and observability for LLM apps, we do want to work on standardizing other LLM-related protocols in the near future, which we think closely relate to observability for LLMs. For example, prompt and model configuration. These may not necessarily be tied to OpenTelemetry.
As mentioned above, the project was presented to TAG Observability at our 2023-12-05 meeting. There was some discussion and feedback from TAG members, both in the meeting and in subsequent conversations. I'll cover some of this feedback below. I've also taken some time to review the presentation, materials, this proposal, and the GitHub repositories, and have feedback as well. The project's goals, addressing the emerging concern of how to observe LLM-based applications, are clear. I've grouped my feedback into three sections:
Feedback re: (this) Sandbox Application
If this project is an extension to open-telemetry, then success as defined would obviate the need for a large portion of the project upon its integration with open-telemetry (either via one of the project's *-contrib repositories, open-telemetry/semantic-conventions, or elsewhere).
Entering the CNCF Sandbox is not requisite for evolution and/or integration with open-telemetry, nor is that laudable goal itself a reason to join.
Vendor and End User community adoption will require engaging with those communities and welcoming them to participate in the project. Presently the bulk of the contributions have come from two TraceLoop employees (CEO, CTO). Moreover, there's an implication that Vendors (today) do not find it easy to adopt the set of conventions because the project isn't in the Sandbox. One would not expect them to adopt because of Sandbox membership; they might adopt the proposed open-telemetry Semantic Conventions if the project worked with open-telemetry to land them.
What's referred to as "inherent limitations" is an open PR (open-telemetry/oteps#234) that has seen substantive and constructive feedback, which the project has in some cases argued against (without resolution) or has yet to address.
The CNCF's mission isn't (specifically) to provide reliable open standards, although that is an outcome with projects like open-telemetry. The APIs and Semantic Conventions provided by OpenTelemetry are widely and broadly adopted because they are the result of a consensus-based process that has actively solicited feedback from, and over time consistently engaged with, Vendors and End Users: the former because they comprise a large portion of the engineering contributions, and the latter by being modelled as "the customer," with active community engagement, responsive feedback, and investment in open, community-curated and community-driven documentation.
TAG Observability and TAG Runtime are the hosting/supporting TAGs for the proposed AI working group, as this emerging area is within scope for each TAG's charter and the Working Group's focus bridges the two core domains.
OpenLLMetry is effectively carrying unmerged, unaccepted patches of open-telemetry's semantic conventions, making it not "fully compatible" with open-telemetry. It's unclear if it's a project goal to work with the open-telemetry community to land them, or to integrate as a plugin or extension, or to simply act as a de facto perpetual fork. I would encourage the OpenLLMetry project to engage with open-telemetry. The referenced pull request is a conversation in that dialog and is a great start!
This isn't consistent with the presentation to TAG Observability (see slide)
The design decision(s) mentioned above might best be discussed with the community. In the TAG meeting there were a few suggestions for other ways to achieve the goals. I also would like to understand some of the rationale behind the statement above, which was reiterated in the TAG meeting. It was suggested that because the number of spans present in these applications is so much lower than is typical for open-telemetry, the requirements it imposes around payload sizes for trace spans "don't apply." If the TraceLoop SDK (the OpenLLMetry OSS SDK is named after TraceLoop, the company, not the project) were used to instrument cloud native LLM applications, as are in-scope for the AI working group, why would the scale be so much lower? Given the rapid growth of LLM and AI cloud native applications and services, one might expect the traffic volumes (and number of spans generated) to be correspondingly large.
If I'm understanding what was presented on 12/5 at TAG Observability correctly, this need is driven by the approach the project is taking to carrying patches for not just open-telemetry but also for the other targeted (growing) list of integrations. I would like to understand the nature of the engagement and current state of the discussion(s) for the projects being instrumented by the TraceLoop SDK. Is it collaborative? Are they aware? Do they support the project and its approach?
Feedback from TAG Observability Presentation on 12/5
Additional Feedback and Concern(s)
Conclusion
I think that this project is forward-looking and aims to address a meaningful emerging function: the observation of LLM application workloads. I would encourage the project's maintainers to continue to engage with open-telemetry and the other libraries that OpenLLMetry is presently carrying patches for. I would like to see the project return next cycle after addressing the feedback above. I do think the project is valuable and has a bright future! My comments above are intended as constructive feedback that might help to realize that sentiment.
I can certainly speak towards this:
I have four thoughts here:
As it stands, I think the statements OpenLLMetry made still hold up here, and I can confirm in my own use of the project that it's compatible with at least one OTLP backend that I use.
@halcyondude thanks for this extremely helpful feedback! I'd love to address some of the issues you mentioned and some of your concerns, in addition to the comments @cartermp wrote above.
We are actively engaging with major vendors (like @cartermp from Honeycomb, as well as Dynatrace, SigNoz, New Relic, and others) to make sure that OpenLLMetry is compatible and stays compatible. We also maintain a list of supported vendors and actively test that support.
These were decisions that we made earlier in the development, and will of course change when and if CNCF decides this fits under its umbrella. The docs are under our domain for convenience reasons only (small startup, etc.). There's a clear separation between OpenLLMetry docs and Traceloop docs. Traceloop is indeed a closed-source platform.
Definitely. We are working with LLM app frameworks, vector DBs, and foundation models like LlamaIndex, LiteLLM, Chroma DB, and others to build, maintain, promote, and get feedback on our instrumentations.
That is inaccurate (I'll work on clarifying the docs on our end). You can choose to disable or enable content tracing on the SDK, and you can also choose to enable it selectively using the SDK. We also provide a convenient way to control that through the platform, but it is in addition to the aforementioned capabilities.
This is clearly stated across our repo README, our docs, and the main website.
Excellent! The open-telemetry project has a lot of experience working with other projects as well as commercial solutions to provide instrumentation. Another example of the open source observability community engaging with Vendors can be found in Prometheus's curation of a set of metrics exporters. Both of these CNCF project communities have experience navigating some of the technical and cross-project (and indeed cross-product) conversations and solution finding. I would encourage the project to engage; TAG Observability is a great place to do this, as it's where End Users, Vendors, and Project maintainers come together. The TAG and its WGs (e.g., the referenced AI working group) are great places to collaboratively engage with the community in ideation, solution finding, and coordination of efforts.
One more question: given that a lot of LLMs are not open source, have you checked that their SDKs are sufficiently open source that they don't create licensing problems for your project? |
@nirga I think that it would help those learning about the project if there were a few tables in the documentation in the repo (.md)
If these summarized each integration and its type (with links to issues in openllmetry or in other projects), it would help new contributors and others evaluating the project to engage both with OpenLLMetry and with the broader, rapidly growing community around LLMs.
@halcyondude thanks, I agree. Will do that |
We would like the project to reapply in 6 months and complete the following:
Closing, project can reapply for June (or later) review |
Some progress here: there's now an open PR to address this as an expansion of the GenAI Semantics SIG in otel (open-telemetry/community#2326) 🥳
Application contact emails
[email protected], [email protected]
Project Summary
Open-protocol extension for OpenTelemetry for observability and evaluation of GenAI applications
Project Description
OpenLLMetry started as we were looking for an easy way to send metrics and traces of executions of LLM applications. Many common patterns like RAG applications closely resemble micro-services architecture, so it seemed natural to rely on OpenTelemetry. Moreover, LLMs and GenAI systems often exist as a component within a larger system. Understanding both how these systems influence an LLM’s output, and how the LLM’s output influences the rest of a system, is essential for people building applications using this technology. OpenTelemetry provides the standard and toolset to enable this kind of understanding.
However, the standard set by other LLM observability applications like LangSmith required us to send additional information (like prompts and completions) on spans, which is subject to an ongoing debate in the OpenTelemetry community.
Thus, we decided to extend OpenTelemetry semantic conventions, and build a way to monitor and trace prompts, completions, calls to vector DBs, monitoring token usage and more; all while staying compliant with the OTel standard. This allowed us to be fully compatible with any platform that supports OpenTelemetry, while offering the same level of features provided by LLM-specific observability applications.
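The extension approach described above can be illustrated with a minimal, library-free sketch: an LLM call is recorded as an ordinary trace span whose attributes carry the prompt, completion, and token counts alongside the standard fields. The `Span` class here is a toy stand-in for an OpenTelemetry span, and the `gen_ai.*` attribute names are illustrative of the proposed conventions, not normative.

```python
from dataclasses import dataclass, field

@dataclass
class Span:
    """Toy stand-in for an OpenTelemetry span: a name plus key/value attributes."""
    name: str
    attributes: dict = field(default_factory=dict)

    def set_attribute(self, key: str, value):
        self.attributes[key] = value

def record_llm_call(model: str, prompt: str, completion: str, tokens: int) -> Span:
    # A standard span, extended with LLM-specific attributes.
    # Attribute names echo the proposed gen_ai.* conventions (illustrative only).
    span = Span(name="llm.completion")
    span.set_attribute("gen_ai.request.model", model)
    span.set_attribute("gen_ai.prompt", prompt)
    span.set_attribute("gen_ai.completion", completion)
    span.set_attribute("gen_ai.usage.total_tokens", tokens)
    return span

span = record_llm_call("gpt-4", "What is OTLP?", "OTLP is the OpenTelemetry protocol.", 42)
print(span.attributes["gen_ai.request.model"])  # prints: gpt-4
```

Because the LLM-specific data rides on ordinary span attributes, any OTLP-compatible backend can ingest it without protocol changes; backends that understand the conventions can additionally surface prompt/completion views.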
See also a blog post we published about this project.
Org repo URL (provide if all repos under the org are in scope of the application)
N/A
Project repo URL in scope of application
https://github.com/traceloop/openllmetry
Additional repos in scope of the application
https://github.com/traceloop/openllmetry-js
Website URL
https://www.traceloop.com/openllmetry
Roadmap
https://github.com/orgs/traceloop/projects/1
Roadmap context
No response
Contributing Guide
https://www.traceloop.com/docs/openllmetry/contributing/overview
Code of Conduct (CoC)
https://github.com/traceloop/openllmetry/blob/main/CODE_OF_CONDUCT.md
Adopters
No response
Contributing or Sponsoring Org
https://www.traceloop.com/
Maintainers file
https://github.com/traceloop/openllmetry/blob/main/MAINTAINERS.md
IP Policy
Trademark and accounts
Why CNCF?
We see the project as a natural extension of OpenTelemetry, and our hope is that with its maturity it may even fully integrate into OpenTelemetry. By moving this project under the CNCF umbrella, we allow this project to continue to evolve in synergy with OpenTelemetry.
Moreover, seeing how OpenTelemetry has changed the cloud observability landscape, providing much-needed freedom and flexibility to users, we seek to do the same in the LLM observability domain, which is rapidly evolving but still in its early stages. Our hope is that under the CNCF umbrella it will become easier for other vendors to adopt this as a standard, instead of opting for a proprietary protocol as many do today.
Benefit to the Landscape
OpenLLMetry extends the current CNCF observability landscape into the rapidly evolving gen AI domain. It provides a novel approach for tracing and monitoring LLM components like foundation models and vector databases which couldn’t be done with existing CNCF tools due to inherent limitations. And by basing these capabilities on OpenTelemetry, OpenLLMetry stays true to the CNCF’s mission of providing open standards that any developer can rely upon.
Cloud Native 'Fit'
OpenLLMetry is built with cloud-native technologies like OpenTelemetry and fits in the Observability and Analysis area. Additionally, it’s compatible with the nascent AI TAG forming - and within that TAG, Observability is also seen as an important area for AI.
Cloud Native 'Integration'
OpenLLMetry depends on OpenTelemetry as it extends it and is fully compatible with it. By emitting data as OTLP, OpenLLMetry is thus compatible with a wide array of tools, both open source (e.g., Jaeger) and proprietary.
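Because the data is standard OTLP, pointing an instrumented application at a different backend is ordinary OpenTelemetry exporter configuration. A minimal sketch using the standard OTel environment variables (the endpoint and header values are placeholders, not project defaults):

```shell
# Standard OpenTelemetry exporter configuration; works for any
# OTLP-compatible backend (Jaeger, an OTel Collector, or a vendor).
# Values below are placeholders; substitute your own endpoint and credentials.
export OTEL_EXPORTER_OTLP_ENDPOINT="http://localhost:4318"   # OTLP/HTTP port
export OTEL_EXPORTER_OTLP_HEADERS="x-api-key=YOUR_KEY"       # only if the backend needs auth
```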
Cloud Native Overlap
While there is an apparent overlap with OpenTelemetry, as noted before, we see this more as an extension of and complement to OpenTelemetry. There were some extensions that were mandatory for OpenLLMetry to work properly which couldn't be implemented directly in OpenTelemetry given its scale and usage. For example, adding full prompts to spans makes sense for OpenLLMetry given that the scale and number of traces per minute is much lower than for a microservice application. Additionally, the AI space is moving at an incredibly fast pace today, necessitating broad and sweeping changes in the OpenLLMetry project if/when things change in upstream components. Because the OpenTelemetry project is seeking greater stability across all its components, this need to make rapid and broad changes may conflict with the current focus of many OpenTelemetry projects.
Similar projects
LangSmith - LangChain's proprietary platform for LLM observability. Natively integrates with the open-source LangChain framework for building LLM applications, but otherwise requires manually logging and tracing.
LangFuse - Open source platform for LLM observability. Uses a proprietary protocol, and does not support auto-instrumentation.
Arize Phoenix - More of an MLOps project, but does use OpenTelemetry code repurposed to fit a proprietary protocol today.
Landscape
Yes
Business Product or Service to Project separation
Traceloop, the product we’re building, is a destination for the OpenLLMetry SDK. It uses traces and metrics to provide its users with tools to evaluate the quality of model outputs and iterate on changes they’re making to their applications. Similar to other destinations that natively integrate with OpenTelemetry like Honeycomb, Lightstep, Splunk, Sentry, and others.
Project presentations
The project was presented to TAG Observability on 2023-12-05 (cc @halcyondude)
https://youtu.be/ksmPWR_ZybE?si=Z10pzfNf2QIDyZ3J
Project champions
Chris Aniszczyk
Additional information
No response