TruLens Eval

Piotr Mardziel edited this page Jun 28, 2024 · 4 revisions

TruLens-Eval

Known Issues

  • Awaitable and generator method inputs/outputs not properly recorded.
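
The sketch below illustrates one general way this class of problem can arise. It is not TruLens code and is not necessarily the cause of the issue here: `naive_record`, `calls`, `stream_tokens`, and `fetch_answer` are hypothetical names made up for illustration. A wrapper that logs the immediate return value of a call sees only a generator or coroutine object, not the values it eventually produces.

```python
import asyncio
import functools

calls = []  # hypothetical in-memory log of recorded calls


def naive_record(func):
    """Hypothetical wrapper: log whatever the call returns immediately."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        ret = func(*args, **kwargs)
        calls.append({"name": func.__name__, "ret": ret})
        return ret
    return wrapper


@naive_record
def stream_tokens():
    # Generator method: calling it returns a generator object, no tokens yet.
    yield "hello"
    yield "world"


@naive_record
async def fetch_answer():
    # Async method: calling it returns a coroutine object, no answer yet.
    return "42"


gen = stream_tokens()
coro = fetch_answer()

# What the wrapper logged: the lazy objects, not their eventual contents.
recorded_kinds = [type(c["ret"]).__name__ for c in calls]

# The actual values only exist after the caller consumes them.
tokens = list(gen)
answer = asyncio.run(coro)
```

Recording the real outputs would require the wrapper to hook into iteration or awaiting, for example by wrapping the returned generator or coroutine, rather than logging at call time.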

Work Streams

Features

  • Conversation/session tracking, display, feedbacks.

Feedback analysis interfaces

  • Correlations between feedback results.

  • Feedback clustering (e.g., f1 and f2 behave similarly; f3 and f4 behave similarly).

  • Derived metrics/feedbacks: e.g. "Does high context relevance imply low abstain?"

  • Statistical validity computation and presentation.
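
As a rough sketch of what the correlation and derived-metric items above could compute, the example below uses plain Python on made-up scores. The feedback names, the data, the 0.5 threshold, and the helpers `pearson` and `abstain_rate` are all assumptions for illustration and do not reflect an existing TruLens interface.

```python
from math import sqrt

# Hypothetical per-record feedback scores in [0, 1]; not real results.
scores = {
    "context_relevance": [0.9, 0.8, 0.2, 0.1, 0.7, 0.3],
    "groundedness":      [0.8, 0.9, 0.3, 0.2, 0.6, 0.2],
    "abstain":           [0.0, 0.0, 1.0, 1.0, 0.0, 1.0],
}


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


# Pairwise correlation matrix between feedback results.
names = list(scores)
corr = {(a, b): pearson(scores[a], scores[b]) for a in names for b in names}


def abstain_rate(relevance_is_high, threshold=0.5):
    """Mean abstain score over records on one side of the relevance threshold."""
    picked = [
        ab
        for cr, ab in zip(scores["context_relevance"], scores["abstain"])
        if (cr >= threshold) == relevance_is_high
    ]
    return sum(picked) / len(picked)


# Derived metric: "Does high context relevance imply low abstain?"
rate_when_high = abstain_rate(True)
rate_when_low = abstain_rate(False)
```

Feedbacks whose pairwise correlation is strongly positive are natural candidates for the same cluster. With only a handful of records, such numbers are not statistically meaningful, which is what the statistical validity item is meant to address.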

Additional feedback functions

  • Safety with LlamaGuard

OpenTelemetry interfaces

  • Goal: Export records to OpenTelemetry (OTel).

  • Goal: Record via the OTel interface.

  • Goal: Record and view other tools' instrumented spans alongside/within ours.