
Add SpanHandler abstraction to enable custom LLM tracing in Langfuse #1310

Closed
vblagoje opened this issue Jan 21, 2025 · 2 comments · Fixed by #1313
Labels: feature request (Ideas to improve an integration), integration:langfuse, P2


Problem

Currently, the LangfuseConnector keeps hardcoded lists of supported LLM generators (_SUPPORTED_GENERATORS and _SUPPORTED_CHAT_GENERATORS), together with the tracing logic for them (roughly of the shape sketched after this list). This makes it difficult for users to:

  1. Add support for their custom LLM generators
  2. Customize how generator spans are processed and what metadata is captured
  3. Extend tracing functionality without modifying the core code
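
For reference, the hardcoded lists are plain module-level collections of component class names, roughly of this shape (a sketch; the list names are real, but the concrete entries shown here are assumed examples, not the exact contents of the tracer module):

# Illustrative sketch of the current hardcoded approach in the tracer module.
_SUPPORTED_GENERATORS = ["OpenAIGenerator", "AzureOpenAIGenerator"]
_SUPPORTED_CHAT_GENERATORS = ["OpenAIChatGenerator", "AzureOpenAIChatGenerator"]

# A custom component such as "MyCustomChatGenerator" never matches either list,
# so its spans get no usage/model metadata attached.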

Proposal

Add a SpanHandler abstraction that allows users to customize how spans are processed after they are yielded. This follows the Open/Closed principle and provides a clean interface for extending Langfuse tracing capabilities.

Code Changes

from abc import ABC, abstractmethod
from typing import Optional

# LangfuseSpan is the span wrapper already defined alongside LangfuseTracer in the
# integration's tracer module.


class SpanHandler(ABC):
    """
    Abstract base class for handling spans in LangfuseTracer.
    Implement this class to customize how spans are processed after they are yielded.
    """

    @abstractmethod
    def handle(self, span: LangfuseSpan, component_type: Optional[str]) -> None:
        """
        Handle a span after it has been yielded.

        :param span: The LangfuseSpan that was yielded
        :param component_type: The type of the component that created this span
        """

The default implementation preserves current behavior:

# datetime, logger, _SUPPORTED_GENERATORS, _SUPPORTED_CHAT_GENERATORS, and
# _COMPONENT_OUTPUT_KEY are the names already defined at module level in the tracer module.
class DefaultSpanHandler(SpanHandler):
    """Default implementation of SpanHandler that provides the original Langfuse tracing behavior."""

    def handle(self, span: LangfuseSpan, component_type: Optional[str]) -> None:
        if component_type in _SUPPORTED_GENERATORS:
            # Non-chat generators expose model/usage via the "meta" list in the component output
            meta = span._data.get(_COMPONENT_OUTPUT_KEY, {}).get("meta")
            if meta:
                m = meta[0]
                span._span.update(usage=m.get("usage") or None, model=m.get("model"))
        elif component_type in _SUPPORTED_CHAT_GENERATORS:
            # Chat generators carry per-reply metadata on the ChatMessage objects in "replies"
            replies = span._data.get(_COMPONENT_OUTPUT_KEY, {}).get("replies")
            if replies:
                meta = replies[0].meta
                completion_start_time = meta.get("completion_start_time")
                if completion_start_time:
                    try:
                        completion_start_time = datetime.fromisoformat(completion_start_time)
                    except ValueError:
                        logger.error(f"Failed to parse completion_start_time: {completion_start_time}")
                        completion_start_time = None
                span._span.update(
                    usage=meta.get("usage") or None,
                    model=meta.get("model"),
                    completion_start_time=completion_start_time,
                )

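For orientation, this is a minimal sketch of the data shape the default handler reads for a chat generator. The field names mirror the handler code above; the concrete values, and the use of SimpleNamespace as a stand-in for a Haystack ChatMessage, are illustrative assumptions:

from types import SimpleNamespace

# Illustrative only: what the "replies" entry in the component output roughly looks like.
# DefaultSpanHandler reads replies[0].meta and forwards model/usage/completion_start_time
# to the Langfuse span via span._span.update(...).
example_reply = SimpleNamespace(
    meta={
        "model": "gpt-4o-mini",  # made-up value
        "usage": {"prompt_tokens": 12, "completion_tokens": 34, "total_tokens": 46},  # made-up values
        "completion_start_time": "2025-01-21T10:15:30.123456",  # ISO 8601 string, parsed via datetime.fromisoformat
    }
)
example_component_output = {"replies": [example_reply]}
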
Usage Example

Users can implement their own handlers for custom tracing:

class CustomSpanHandler(SpanHandler):
    def handle(self, span: LangfuseSpan, component_type: Optional[str]) -> None:
        # Custom logic for handling spans
        if component_type == "MyCustomChatGenerator":
            output = span._data.get("haystack.component.output", {})
            span._span.update(
                usage=output.get("usage"),
                model="my-custom-model",
                custom_metadata=output.get("extra_info"),
                # Add any other custom metadata
            )

# Use in pipeline
connector = LangfuseConnector(
    name="My Pipeline",
    span_handler=CustomSpanHandler()
)
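
A handler also does not have to re-implement the built-in behavior: subclassing DefaultSpanHandler and falling back to super().handle() keeps the stock treatment of the supported generators while special-casing custom components. A sketch (the class and component names below are illustrative):

class CustomChatGeneratorSpanHandler(DefaultSpanHandler):
    """Handles a hypothetical custom chat generator, delegating everything else to the default."""

    def handle(self, span: LangfuseSpan, component_type: Optional[str]) -> None:
        if component_type == "MyCustomChatGenerator":
            output = span._data.get("haystack.component.output", {})
            span._span.update(model="my-custom-model", usage=output.get("usage"))
        else:
            # Built-in generators keep the original Langfuse tracing behavior
            super().handle(span, component_type)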

Benefits

  1. Extensibility: Users can add support for new LLM generators without modifying core code
  2. Flexibility: Complete control over span processing and metadata capture
  3. Clean Interface: Simple abstraction that follows Python's ABC pattern
  4. Backward Compatible: Default behavior preserved through DefaultSpanHandler
  5. Maintainable: Separates span handling logic from tracing infrastructure
  6. Future-Proof: Easy to add support for new LLM providers and metadata formats

Alternative Approaches Considered

List Extension: simply allowing users to extend the lists of supported generators. This is too rigid; it does not allow customizing how span metadata is captured.

Obsoletes (to some extent) the following issues:

vblagoje added the feature request (Ideas to improve an integration) label on Jan 21, 2025
vblagoje (Member Author) commented on Jan 21, 2025

vblagoje (Member Author) commented:

Fixed and included in https://pypi.org/project/langfuse-haystack/0.8.0/
