Release v0.0.54 · pipecat-ai/pipecat

To create tasks in Pipecat frame processors, it is now recommended to use FrameProcessor.create_task() (which uses the new utils.asyncio.create_task()). It takes care of uncaught exceptions, task cancellation handling and task management. To cancel or wait for a task, use FrameProcessor.cancel_task() and FrameProcessor.wait_for_task(). All Pipecat processors have been updated accordingly. Also, when a pipeline runner finishes, a warning about dangling tasks might appear, indicating that one or more of the created tasks were never cancelled or awaited (using these new functions).
It is now possible to specify the period of the PipelineTask heartbeat frames with heartbeats_period_secs.
Added DailyMeetingTokenProperties and DailyMeetingTokenParams Pydantic models for meeting token creation in the get_token method of DailyRESTHelper.
Added enable_recording and geo parameters to DailyRoomProperties.
Added RecordingsBucketConfig to DailyRoomProperties to upload recordings to a custom AWS bucket.
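
The task handling described for FrameProcessor.create_task(), cancel_task() and wait_for_task() can be illustrated with plain asyncio. The following is a minimal sketch only, not Pipecat's implementation: the TaskTracker class, its dangling_tasks() method and the demo coroutines are hypothetical names introduced here, and only the three method names mirror the ones mentioned above.

```python
import asyncio
import logging
from typing import Any, Coroutine, List, Set

logger = logging.getLogger("task_tracker")


class TaskTracker:
    """Hypothetical helper for illustration; not Pipecat's FrameProcessor."""

    def __init__(self) -> None:
        self._tasks: Set[asyncio.Task] = set()

    def create_task(self, coroutine: Coroutine[Any, Any, Any], name: str) -> asyncio.Task:
        async def run() -> None:
            try:
                await coroutine
            except asyncio.CancelledError:
                # Cancellation is expected; re-raise so the task ends as cancelled.
                raise
            except Exception:
                # Log the error instead of letting it go unnoticed.
                logger.exception("%s: uncaught exception", name)

        task = asyncio.get_running_loop().create_task(run(), name=name)
        self._tasks.add(task)
        return task

    async def cancel_task(self, task: asyncio.Task) -> None:
        task.cancel()
        try:
            await task
        except asyncio.CancelledError:
            pass
        finally:
            self._tasks.discard(task)

    async def wait_for_task(self, task: asyncio.Task) -> None:
        try:
            await task
        finally:
            self._tasks.discard(task)

    def dangling_tasks(self) -> List[str]:
        # Names of tasks that were created but never cancelled or awaited.
        return sorted(task.get_name() for task in self._tasks)


async def demo() -> List[str]:
    tracker = TaskTracker()

    async def forever() -> None:
        await asyncio.sleep(3600)

    async def quick() -> None:
        await asyncio.sleep(0)

    async def failing() -> None:
        raise ValueError("boom")

    cancelled = tracker.create_task(forever(), "cancelled")
    awaited = tracker.create_task(quick(), "awaited")
    failed = tracker.create_task(failing(), "failed")
    forgotten = tracker.create_task(forever(), "forgotten")
    await asyncio.sleep(0)  # give every task a chance to start

    await tracker.cancel_task(cancelled)
    await tracker.wait_for_task(awaited)
    await tracker.wait_for_task(failed)  # the exception was logged, not raised

    dangling = tracker.dangling_tasks()
    await tracker.cancel_task(forgotten)  # clean up before the loop closes
    return dangling


if __name__ == "__main__":
    print(asyncio.run(demo()))
```

In this sketch a task counts as dangling until it passes through cancel_task() or wait_for_task(), which is the condition the dangling-task warning above is described as reporting.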

Enhanced UserIdleProcessor with retry functionality and control over idle monitoring via a new callback signature (processor, retry_count) -> bool. Updated the 17-detect-user-idle.py example to show how to use retry_count.
Added defensive error handling for OpenAIRealtimeBetaLLMService's audio truncation. Audio truncation errors during interruptions now log a warning and allow the session to continue instead of throwing an exception.
Modified TranscriptProcessor to use TTS text frames for more accurate assistant transcripts. Assistant messages are now aggregated based on bot speaking boundaries rather than LLM context, providing better handling of interruptions and partial utterances.
Updated foundational examples 28a-transcription-processor-openai.py, 28b-transcript-processor-anthropic.py, and 28c-transcription-processor-gemini.py to use the updated TranscriptProcessor.
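
The (processor, retry_count) -> bool callback shape mentioned for UserIdleProcessor can be sketched with plain asyncio. This is a hedged illustration, not Pipecat's code: IdleMonitor, notify_activity() and on_idle() are hypothetical names, and the sketch assumes that returning True keeps monitoring while returning False stops it.

```python
import asyncio
from typing import Awaitable, Callable, List


class IdleMonitor:
    """Hypothetical stand-in for illustration; not Pipecat's UserIdleProcessor."""

    def __init__(
        self,
        callback: Callable[["IdleMonitor", int], Awaitable[bool]],
        timeout: float,
    ) -> None:
        self._callback = callback
        self._timeout = timeout
        self._activity = asyncio.Event()
        self._retry_count = 0

    def notify_activity(self) -> None:
        # User activity resets the retry counter and restarts the idle timer.
        self._retry_count = 0
        self._activity.set()

    async def run(self) -> None:
        while True:
            try:
                await asyncio.wait_for(self._activity.wait(), self._timeout)
                self._activity.clear()
            except asyncio.TimeoutError:
                self._retry_count += 1
                # Assumption: True means "keep monitoring", False means "stop".
                keep_monitoring = await self._callback(self, self._retry_count)
                if not keep_monitoring:
                    return


async def demo() -> List[int]:
    seen: List[int] = []

    async def on_idle(monitor: IdleMonitor, retry_count: int) -> bool:
        seen.append(retry_count)
        # Nudge the user twice, then give up on the third idle period.
        return retry_count < 3

    monitor = IdleMonitor(on_idle, timeout=0.01)
    await monitor.run()
    return seen


if __name__ == "__main__":
    print(asyncio.run(demo()))
```

Passing the retry count to the callback lets it escalate, for example by prompting the user on the first attempts and ending the session on a later one.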

Fixed a GeminiMultimodalLiveLLMService issue that prevented users from pushing initial LLM assistant messages (using LLMMessagesAppendFrame).
Added missing FrameProcessor.cleanup() calls to Pipeline, ParallelPipeline and UserIdleProcessor.
Fixed a type error when using voice_settings in ElevenLabsHttpTTSService.
Fixed an issue where OpenAIRealtimeBetaLLMService function calling resulted in an error.
Fixed an issue in AudioBufferProcessor where the last audio buffer was not being processed, in cases where the _user_audio_buffer was smaller than the buffer size.
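
The AudioBufferProcessor fix concerns a final partial buffer that never reached the configured buffer size. The general pattern can be sketched as follows. This is not Pipecat's code: ChunkedAudioBuffer, add() and flush() are hypothetical names used only to show why an explicit flush step is needed.

```python
from typing import List


class ChunkedAudioBuffer:
    """Hypothetical buffer for illustration; not Pipecat's AudioBufferProcessor."""

    def __init__(self, buffer_size: int) -> None:
        if buffer_size <= 0:
            raise ValueError("buffer_size must be positive")
        self._buffer_size = buffer_size
        self._buffer = bytearray()

    def add(self, audio: bytes) -> List[bytes]:
        """Append audio and return every full chunk that is now available."""
        self._buffer.extend(audio)
        chunks: List[bytes] = []
        while len(self._buffer) >= self._buffer_size:
            chunks.append(bytes(self._buffer[: self._buffer_size]))
            del self._buffer[: self._buffer_size]
        return chunks

    def flush(self) -> List[bytes]:
        """Return the remaining partial chunk, if any, and empty the buffer."""
        if not self._buffer:
            return []
        tail = bytes(self._buffer)
        self._buffer.clear()
        return [tail]


if __name__ == "__main__":
    buffer = ChunkedAudioBuffer(buffer_size=4)
    print(buffer.add(b"abcdefghij"))  # two full chunks of four bytes
    print(buffer.flush())             # the two leftover bytes
```

Without the flush() call at the end of a stream, the trailing bytes would stay in the buffer and never be processed.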

Replaced audio resampling library resampy with soxr. Resampling a 2:21s audio file from 24 kHz to 16 kHz took 1.41s with resampy and 0.031s with soxr (roughly 45 times faster), with similar audio quality.
