Realtime: enable a playback tracker #1242
It seems the file conflicts with the main branch need to be resolved.
b77e33c to 0dee7f9
Minor suggestion on naming local variables; they are local, so it's okay to reuse the same name though.
def on_audio_delta(self, item_id: str, item_content_index: int, bytes: bytes) -> None:
    """Called when an audio delta is received from the model."""
    ms = calculate_audio_length_ms(self._format, bytes)
    new_key = (item_id, item_content_index)

    self._last_audio_item = new_key
    if new_key not in self._states:
        self._states[new_key] = ModelAudioState(datetime.now(), ms)
    else:
        self._states[new_key].audio_length_ms += ms
nit: In general, using built-in/reserved names like `bytes` for variables should be avoided. If `audio` does not sound great, `data`, `delta`, `audio_data`, etc. should be fine too.
Suggested change:
- def on_audio_delta(self, item_id: str, item_content_index: int, bytes: bytes) -> None:
-     """Called when an audio delta is received from the model."""
-     ms = calculate_audio_length_ms(self._format, bytes)
-     new_key = (item_id, item_content_index)
-     self._last_audio_item = new_key
-     if new_key not in self._states:
-         self._states[new_key] = ModelAudioState(datetime.now(), ms)
-     else:
-         self._states[new_key].audio_length_ms += ms
+ def on_audio_delta(self, item_id: str, item_content_index: int, audio: bytes) -> None:
+     """Called when an audio delta is received from the model."""
+     ms = calculate_audio_length_ms(self._format, audio)
+     new_key = (item_id, item_content_index)
+     self._last_audio_item = new_key
+     if new_key not in self._states:
+         self._states[new_key] = ModelAudioState(datetime.now(), ms)
+     else:
+         self._states[new_key].audio_length_ms += ms
src/agents/realtime/_util.py
Outdated
def calculate_audio_length_ms(format: RealtimeAudioFormat | None, bytes: bytes) -> float:
    if format and format.startswith("g711"):
        return (len(bytes) / 8000) * 1000
    return (len(bytes) / 24 / 2) * 1000
same as above
Suggested change:
- def calculate_audio_length_ms(format: RealtimeAudioFormat | None, bytes: bytes) -> float:
-     if format and format.startswith("g711"):
-         return (len(bytes) / 8000) * 1000
-     return (len(bytes) / 24 / 2) * 1000
+ def calculate_audio_length_ms(format: RealtimeAudioFormat | None, audio: bytes) -> float:
+     if format and format.startswith("g711"):
+         return (len(audio) / 8000) * 1000
+     return (len(audio) / 24 / 2) * 1000
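For reference, the byte-to-milliseconds arithmetic can be written with explicit units. This is only a sketch under stated assumptions, not the PR's implementation: it assumes mono audio, 8 kHz with 1 byte per sample for `g711_*` formats, and 24 kHz 16-bit PCM (2 bytes per sample) otherwise. The name `audio_length_ms` and the constants are hypothetical.

```python
# Hypothetical helper for illustration; not the PR's code.
# Assumptions: mono audio; g711_* is 8 kHz at 1 byte per sample;
# everything else is 24 kHz PCM at 2 bytes per sample.
from __future__ import annotations

G711_SAMPLE_RATE_HZ = 8_000
PCM16_SAMPLE_RATE_HZ = 24_000


def audio_length_ms(audio_format: str | None, audio: bytes) -> float:
    """Return the playback duration of `audio` in milliseconds."""
    if audio_format and audio_format.startswith("g711"):
        bytes_per_second = G711_SAMPLE_RATE_HZ * 1
    else:
        bytes_per_second = PCM16_SAMPLE_RATE_HZ * 2
    # bytes / (bytes per second) gives seconds; scale to milliseconds.
    return len(audio) / bytes_per_second * 1000
```

Under these assumptions, 48,000 bytes of PCM16 is 1000 ms. For the same input, the expression `(len(audio) / 24 / 2) * 1000` evaluates to 1,000,000, so the units of the default branch may be worth double-checking.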
So far, we've been assuming that audio is played:
This causes issues with our interrupt tracking. The model wants to know how much audio the user has actually heard. For example, in a phone call agent this assumption wouldn't hold, because there's a delay of a few hundred ms between the model sending audio and the user hearing it. This PR allows you to pass a playback tracker.
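To make the idea concrete, here is a minimal sketch of what a playback tracker could look like. It is not the SDK's API: the class `PlaybackTrackerSketch`, its methods, and `PlaybackState` are hypothetical names chosen for illustration. The sketch keeps received audio and played audio separate, so an interrupt can report what the user heard rather than what the model sent.

```python
# Hypothetical sketch; class and method names are illustrative, not the SDK's API.
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class PlaybackState:
    received_ms: float = 0.0  # audio the model has sent for this item
    played_ms: float = 0.0    # audio the playback layer reports as heard


class PlaybackTrackerSketch:
    """Tracks sent vs. heard audio per (item_id, content_index)."""

    def __init__(self) -> None:
        self._states: dict[tuple[str, int], PlaybackState] = {}

    def on_audio_delta(self, item_id: str, content_index: int, ms: float) -> None:
        """Record audio received from the model."""
        key = (item_id, content_index)
        self._states.setdefault(key, PlaybackState()).received_ms += ms

    def on_play_ms(self, item_id: str, content_index: int, ms: float) -> None:
        """Record audio the application has actually played to the user."""
        key = (item_id, content_index)
        state = self._states.setdefault(key, PlaybackState())
        # Never report more playback than was received.
        state.played_ms = min(state.received_ms, state.played_ms + ms)

    def heard_ms(self, item_id: str, content_index: int) -> float:
        """Return how much of the item the user has heard so far."""
        state = self._states.get((item_id, content_index))
        return state.played_ms if state else 0.0


tracker = PlaybackTrackerSketch()
tracker.on_audio_delta("item_1", 0, 500.0)  # model sent 500 ms
tracker.on_play_ms("item_1", 0, 200.0)      # phone line has played 200 ms
heard = tracker.heard_ms("item_1", 0)       # 200.0, not 500.0
```

In a phone call agent, the telephony layer would call something like `on_play_ms` as audio leaves the line, and the interrupt logic would truncate at `heard_ms` instead of at the received length.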