Realtime: enable a playback tracker #1242

rm-openai · 2025-07-25T01:29:59Z

So far, we've been assuming that audio is played:

- immediately (i.e. with zero delay/latency)
- at realtime speed

This causes issues with our interrupt tracking: the model needs to know how much audio the user has actually heard. In a phone-call agent, for example, the assumption breaks down because there is a delay of a few hundred ms between the model sending audio and the user hearing it. This PR allows you to pass a playback tracker.
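To illustrate the idea, here is a minimal sketch of what a playback tracker could look like. The class and method names (`SimplePlaybackTracker`, `on_play_ms`, `heard_ms`) are hypothetical and are not taken from this PR; the assumption is only that the application reports playback progress as audio actually reaches the listener, so the heard duration can lag behind what the model has sent.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class PlaybackState:
    """How much of one audio item the listener has actually heard."""

    item_id: str
    content_index: int
    played_ms: float = 0.0


class SimplePlaybackTracker:
    """Hypothetical tracker: the app reports progress as audio is played out."""

    def __init__(self) -> None:
        self._state: Optional[PlaybackState] = None

    def on_play_ms(self, item_id: str, content_index: int, ms: float) -> None:
        """Record that `ms` more milliseconds of the given item were played."""
        key = (item_id, content_index)
        if self._state is None or (self._state.item_id, self._state.content_index) != key:
            # A new audio item started playing, so begin counting from zero.
            self._state = PlaybackState(item_id, content_index)
        self._state.played_ms += ms

    def heard_ms(self) -> float:
        """Milliseconds of the current item the listener has heard so far."""
        return self._state.played_ms if self._state else 0.0


# The model may already have sent a full second of audio, but if only 300 ms
# has left the speaker when the user interrupts, 300 ms is the value to report.
tracker = SimplePlaybackTracker()
tracker.on_play_ms("item_1", 0, 200.0)
tracker.on_play_ms("item_1", 0, 100.0)
print(tracker.heard_ms())
```

With something like this in place, interrupt handling could truncate at the heard position rather than at the amount of audio received from the model.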

Will need this for a followup.

Sapling stack:
* #1243
* #1242
* -> #1235

seratch · 2025-07-25T01:45:02Z

It seems the file conflicts with the main branch need to be resolved.

seratch

Minor suggestion on naming local variables; since they are local variables, it's okay to keep the same name, though.

seratch · 2025-07-29T02:37:24Z

src/agents/realtime/_default_tracker.py

+    def on_audio_delta(self, item_id: str, item_content_index: int, bytes: bytes) -> None:
+        """Called when an audio delta is received from the model."""
+        ms = calculate_audio_length_ms(self._format, bytes)
+        new_key = (item_id, item_content_index)
+
+        self._last_audio_item = new_key
+        if new_key not in self._states:
+            self._states[new_key] = ModelAudioState(datetime.now(), ms)
+        else:
+            self._states[new_key].audio_length_ms += ms


nit: In general, using built-in/reserved names like bytes for variables should be avoided. If audio does not sound great, data, delta, audio_data etc. should be fine too.
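A small illustration of why the rename helps, not taken from the PR: once a parameter is named `bytes`, the built-in type is unreachable by that name inside the function body. The two helper functions below are hypothetical.

```python
def pad_shadowed(bytes: bytes, n: int) -> bytes:
    # Inside this body, `bytes` is the argument, so bytes(n) tries to call
    # a bytes object and raises TypeError.
    return bytes + bytes(n)


def pad_renamed(audio: bytes, n: int) -> bytes:
    # With a different parameter name, the built-in is still available.
    return audio + bytes(n)


print(pad_renamed(b"ab", 2))

try:
    pad_shadowed(b"ab", 2)
except TypeError as exc:
    print("shadowed version failed:", exc)
```

The code in this PR never calls the built-in inside these functions, so nothing breaks today; the rename only guards against a later edit tripping over the shadowed name.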

Suggested change

-    def on_audio_delta(self, item_id: str, item_content_index: int, bytes: bytes) -> None:
-        """Called when an audio delta is received from the model."""
-        ms = calculate_audio_length_ms(self._format, bytes)
-        new_key = (item_id, item_content_index)
-
-        self._last_audio_item = new_key
-        if new_key not in self._states:
-            self._states[new_key] = ModelAudioState(datetime.now(), ms)
-        else:
-            self._states[new_key].audio_length_ms += ms
+    def on_audio_delta(self, item_id: str, item_content_index: int, audio: bytes) -> None:
+        """Called when an audio delta is received from the model."""
+        ms = calculate_audio_length_ms(self._format, audio)
+        new_key = (item_id, item_content_index)
+
+        self._last_audio_item = new_key
+        if new_key not in self._states:
+            self._states[new_key] = ModelAudioState(datetime.now(), ms)
+        else:
+            self._states[new_key].audio_length_ms += ms

seratch · 2025-07-29T02:38:02Z

src/agents/realtime/_util.py

+def calculate_audio_length_ms(format: RealtimeAudioFormat | None, bytes: bytes) -> float:
+    if format and format.startswith("g711"):
+        return (len(bytes) / 8000) * 1000
+    return (len(bytes) / 24 / 2) * 1000


Same as above: avoid `bytes` as a parameter name.

Suggested change

-def calculate_audio_length_ms(format: RealtimeAudioFormat | None, bytes: bytes) -> float:
-    if format and format.startswith("g711"):
-        return (len(bytes) / 8000) * 1000
-    return (len(bytes) / 24 / 2) * 1000
+def calculate_audio_length_ms(format: RealtimeAudioFormat | None, audio: bytes) -> float:
+    if format and format.startswith("g711"):
+        return (len(audio) / 8000) * 1000
+    return (len(audio) / 24 / 2) * 1000
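As a cross-check of the arithmetic, here is a standalone sketch of the duration calculation. It assumes that G.711 audio is 8 kHz at one byte per sample, and that the default format is 16-bit mono PCM at 24 kHz; the second assumption is inferred from the `24` and `2` in the code and is not stated in the PR. The function and constant names are hypothetical.

```python
from typing import Optional

G711_BYTES_PER_SECOND = 8_000        # 8 kHz, 1 byte per sample
PCM16_BYTES_PER_SECOND = 24_000 * 2  # assumed: 24 kHz mono, 2 bytes per sample


def audio_length_ms(audio_format: Optional[str], audio: bytes) -> float:
    """Return the duration of `audio` in milliseconds under the assumptions above."""
    if audio_format and audio_format.startswith("g711"):
        return len(audio) / G711_BYTES_PER_SECOND * 1000
    return len(audio) / PCM16_BYTES_PER_SECOND * 1000


one_second_g711 = bytes(8_000)
one_second_pcm16 = bytes(48_000)
print(audio_length_ms("g711_ulaw", one_second_g711))  # 1000.0
print(audio_length_ms("pcm16", one_second_pcm16))     # 1000.0
```

Under these assumptions, one second of PCM16 is 48,000 bytes, i.e. 48 bytes per millisecond. The quoted expression `(len(audio) / 24 / 2) * 1000` would evaluate to 1,000,000 for that same buffer, so the trailing `* 1000` in the PCM16 branch may be worth double-checking if the result is meant to be in milliseconds.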

src/agents/realtime/model.py

rm-openai changed the base branch from main to rm/pr1235 July 25, 2025 01:30

This was referenced Jul 25, 2025

Realtime: only cancel response if necessary #1243

Merged

Realtime: send audio item/content index #1235

Merged

rm-openai added a commit that referenced this pull request Jul 25, 2025

Realtime: send audio item/content index (#1235)

7b84678

Will need this for a followup.

Sapling stack:
* #1243
* #1242
* -> #1235

Base automatically changed from rm/pr1235 to main July 25, 2025 01:30

rm-openai requested a review from seratch July 25, 2025 01:31

rm-openai assigned pakrym-oai Jul 25, 2025

rm-openai force-pushed the rm/pr1242 branch from 9e1f699 to 5f09130 Compare July 25, 2025 01:35

This was referenced Jul 25, 2025

Realtime: Twilio example #1216

Open

Tool Call Results are not Appearing in Realtime API and Tool Calls are Not Showing Up in Tracing #1156

Open

seratch added the feature:realtime label Jul 25, 2025

rm-openai force-pushed the rm/pr1242 branch 2 times, most recently from b77e33c to 0dee7f9 Compare July 25, 2025 17:45

rm-openai mentioned this pull request Jul 25, 2025

Realtime: forward all raw model events #1252

Open

seratch approved these changes Jul 29, 2025


Realtime: enable a playback tracker

a720f63

rm-openai force-pushed the rm/pr1242 branch from 0dee7f9 to a720f63 Compare July 29, 2025 18:11

rm-openai merged commit b459cc4 into main Jul 29, 2025
10 checks passed

rm-openai deleted the rm/pr1242 branch July 29, 2025 18:13
