
Audio passthrough #400

Merged
j0sh merged 13 commits into livepeer:main from ja/audio-passthrough on Jan 31, 2025

Conversation

Contributor

@j0sh j0sh commented Jan 24, 2025

Passes through any audio in the input to the output.

Notes:

  • Had to switch to PyAV rather than ffmpeg for better access to timestamps. This resulted in a bunch of cleanup for supporting code that is no longer needed (but also a lot of new additions)
  • We also had to bump up the PyAV version in the Dockerfiles since it was relying on an ancient version of libavcodec/libavformat; hopefully this is not a problem for whatever other dependencies are using PyAV. We need a recent version to be able to support Opus inputs.
  • If no audio is present in the input, no audio will be in the output.
  • Likewise, if no video is present in the input, then no video will be in the output ... we will just pass through the audio 🤷
  • AAC is the output codec. For WebRTC delivery, the audio will need to be transcoded to something like Opus or G711.
  • This downmixes to mono regardless of the input layout / channels (since I suspect that most inference will output mono) but it is easy to change that
  • There is no audio pipeline wired in - not even a noop pipeline. For any audio frames we pop from the input queue, we push directly into the output queue.
  • The output queue is currently unbounded. This is a pre-existing issue and not a new one in this PR; we can fix it later, however we might want to prioritize audio rather than video when dropping.
  • This PR does not affect event metrics (or at least, it tries not to). The existing event metrics are video-only. We probably should define and track some other metrics for audio, but that can be done later.
  • Only tested with trickle protocol / go-livepeer e2e
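The mono downmix mentioned in the notes can be illustrated with a minimal sketch. This is not the PR's actual code (which presumably uses a PyAV resampler); it just shows the channel-averaging idea on planar samples, with numpy standing in for the decoded audio buffer:

```python
import numpy as np

def downmix_to_mono(samples: np.ndarray) -> np.ndarray:
    """Average planar audio of shape (channels, nsamples) down to one channel."""
    if samples.ndim == 1:
        return samples  # already mono
    return samples.mean(axis=0)

stereo = np.array([[1.0, 0.0, 0.5],   # left channel
                   [0.0, 1.0, 0.5]])  # right channel
mono = downmix_to_mono(stereo)  # -> [0.5, 0.5, 0.5]
```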

@j0sh j0sh force-pushed the ja/audio-passthrough branch from 8a0f3fd to 05b7ab3 on January 28, 2025 15:20
    output_image = pipeline.process_frame(input_frame.image)
    output_frame = VideoOutput(input_frame.replace_image(output_image))
    self.output_queue.put(output_frame)
elif isinstance(input_frame, AudioFrame):
Member
neat
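Given the notes above (no audio pipeline, frames popped from the input queue are pushed directly to the output queue), the audio branch in this hunk can be sketched as a self-contained toy. `AudioFrame` and `AudioOutput` here are stand-ins for the PR's classes, not the real definitions:

```python
import queue
from dataclasses import dataclass
from typing import List

@dataclass
class AudioFrame:          # stand-in for the PR's AudioFrame
    samples: bytes

@dataclass
class AudioOutput:         # stand-in for the PR's AudioOutput
    frames: List[AudioFrame]

def route_frame(input_frame, output_queue: queue.Queue) -> None:
    # Passthrough: audio bypasses the pipeline entirely and is
    # forwarded straight from the input queue to the output queue.
    if isinstance(input_frame, AudioFrame):
        output_queue.put(AudioOutput([input_frame]))

q = queue.Queue()
route_frame(AudioFrame(b"pcm"), q)
out = q.get_nowait()  # AudioOutput wrapping the original frame
```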

@j0sh j0sh marked this pull request as ready for review January 30, 2025 20:41
return self.frame.time_base

class AudioOutput(OutputFrame):
frames: List[AudioFrame]
Collaborator

@varshith15 varshith15 commented Jan 31, 2025
Why is this a list, any specific reason? comfystream does the list handling, so this might not be needed?

Contributor Author
Yes - and Comfy should output the list when it's done processing, to minimize IPC and pressure on the queue. Since we know exactly what needs to be sent, send it all at once.
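The batching rationale can be sketched as follows. The class and queue names echo the snippets in this PR, but the decode loop is illustrative, not the actual implementation:

```python
import queue
from dataclasses import dataclass
from typing import List

@dataclass
class AudioFrame:   # stand-in for the PR's AudioFrame
    pts: int

@dataclass
class AudioOutput:  # stand-in for the PR's AudioOutput
    frames: List[AudioFrame]

output_queue: "queue.Queue[AudioOutput]" = queue.Queue()

def emit_batch(decoded: List[AudioFrame]) -> None:
    # One queue item per decode burst instead of one item per frame:
    # fewer hops across the IPC boundary and less pressure on the queue.
    if decoded:
        output_queue.put(AudioOutput(list(decoded)))

emit_batch([AudioFrame(0), AudioFrame(1024)])
batch = output_queue.get_nowait()  # a single item carrying both frames
```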

Collaborator
but comfystream has a list already, you call a function to get the next frame https://github.com/varshith15/comfystream/blob/af132bedd665027fa02c20601b98e0c229e54f51/server/pipeline.py#L74

we want it to be frame by frame in audio as well, otherwise we have to maintain all the frame instances that were sent into the pipeline to assign processed data back into the frames (ref line 150 in runner/app/live/streamer/process.py)

it's just simpler to have it as a single frame because a list is maintained inside comfystream already

Contributor
@varshith15 maybe you can handle this when you do the ai-runner integration and we leave as-is for now since first goal is to get passthru working?

Collaborator
cool, makes sense, can pick this up later when integrating the comfystream package


class TrickleProtocol(StreamProtocol):
    def __init__(self, subscribe_url: str, publish_url: str, control_url: Optional[str] = None, events_url: Optional[str] = None):
        self.subscribe_url = subscribe_url
        self.publish_url = publish_url
        self.control_url = control_url
        self.events_url = events_url
        self.subscribe_queue = queue.Queue[bytearray]()
        self.publish_queue = queue.Queue[bytearray]()
        self.subscribe_queue = queue.Queue[InputFrame]()
Collaborator

@varshith15 varshith15 commented Jan 31, 2025
Contributor Author

Not sure; original code was not asyncio and this PR doesn't change anything there

--extra-cflags="-I/ffmpeg_build/include -I/usr/local/cuda/include" \
--extra-ldflags="-L/ffmpeg_build/lib -L/usr/local/cuda/lib64" \
--extra-libs="-lpthread -lm" \
Contributor

is removing libpthread expected here? I'm assuming various parts of ffmpeg rely on it to run efficiently?

Contributor Author

IIRC it was there to support static linking; I think there was a bug with the pkg-config declaration since that should also be pulling in pthreads.

It is still pulled in dynamically -

$ docker run -it --entrypoint /bin/bash livepeer/ai-runner:live-app-noop
root@733b39a972a0:/app# ldd /usr/local/bin/ffmpeg 
        linux-vdso.so.1 (0x00007ffd429f7000)
        libavdevice.so.61 => /usr/local/lib/libavdevice.so.61 (0x00007ff488af1000)
        libavfilter.so.10 => /usr/local/lib/libavfilter.so.10 (0x00007ff488660000)
        libavformat.so.61 => /usr/local/lib/libavformat.so.61 (0x00007ff4883cf000)
        libavcodec.so.61 => /usr/local/lib/libavcodec.so.61 (0x00007ff487058000)
        libpostproc.so.58 => /usr/local/lib/libpostproc.so.58 (0x00007ff487045000)
        libswresample.so.5 => /usr/local/lib/libswresample.so.5 (0x00007ff487024000)
        libswscale.so.8 => /usr/local/lib/libswscale.so.8 (0x00007ff486f8e000)
        libavutil.so.59 => /usr/local/lib/libavutil.so.59 (0x00007ff485e89000)
        libm.so.6 => /lib/x86_64-linux-gnu/libm.so.6 (0x00007ff485da2000)
        libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007ff485b7a000)
        libxcb.so.1 => /lib/x86_64-linux-gnu/libxcb.so.1 (0x00007ff485b50000)
        libnppig.so.12 => /usr/local/cuda/targets/x86_64-linux/lib/libnppig.so.12 (0x00007ff483400000)
        libnppicc.so.12 => /usr/local/cuda/targets/x86_64-linux/lib/libnppicc.so.12 (0x00007ff482a00000)
        libnppidei.so.12 => /usr/local/cuda/targets/x86_64-linux/lib/libnppidei.so.12 (0x00007ff481c00000)
        libnppif.so.12 => /usr/local/cuda/targets/x86_64-linux/lib/libnppif.so.12 (0x00007ff47be00000)
        libz.so.1 => /lib/x86_64-linux-gnu/libz.so.1 (0x00007ff485b32000)
        libbz2.so.1.0 => /lib/x86_64-linux-gnu/libbz2.so.1.0 (0x00007ff485b1f000)
        liblzma.so.5 => /lib/x86_64-linux-gnu/liblzma.so.5 (0x00007ff4833d5000)
        libfdk-aac.so.2 => /lib/x86_64-linux-gnu/libfdk-aac.so.2 (0x00007ff4828cf000)
        libopus.so.0 => /lib/x86_64-linux-gnu/libopus.so.0 (0x00007ff483377000)
        libx264.so.163 => /lib/x86_64-linux-gnu/libx264.so.163 (0x00007ff47bb3f000)
        /lib64/ld-linux-x86-64.so.2 (0x00007ff488b6b000)
        libXau.so.6 => /lib/x86_64-linux-gnu/libXau.so.6 (0x00007ff483371000)
        libXdmcp.so.6 => /lib/x86_64-linux-gnu/libXdmcp.so.6 (0x00007ff483369000)
        libpthread.so.0 => /lib/x86_64-linux-gnu/libpthread.so.0 (0x00007ff483364000)
        libnppc.so.12 => /usr/local/cuda/targets/x86_64-linux/lib/libnppc.so.12 (0x00007ff47b600000)
        librt.so.1 => /lib/x86_64-linux-gnu/librt.so.1 (0x00007ff48335f000)
        libdl.so.2 => /lib/x86_64-linux-gnu/libdl.so.2 (0x00007ff48335a000)
        libgcc_s.so.1 => /lib/x86_64-linux-gnu/libgcc_s.so.1 (0x00007ff48333a000)
        libbsd.so.0 => /lib/x86_64-linux-gnu/libbsd.so.0 (0x00007ff483322000)
        libmd.so.0 => /lib/x86_64-linux-gnu/libmd.so.0 (0x00007ff483315000)
root@733b39a972a0:/app# 

if audio_stream and packet.stream == audio_stream:
    # Decode audio frames
    for aframe in packet.decode():
        if aframe.pts is None:
Contributor

Under what conditions would a decoded audio packet have no PTS set?
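One common defensive pattern for this situation (an assumption about intent, not what this PR necessarily does) is to synthesize a PTS from a running sample counter rather than drop the frame. A minimal sketch, with `DecodedAudio` standing in for `av.AudioFrame`:

```python
from dataclasses import dataclass
from typing import List, Optional

@dataclass
class DecodedAudio:       # stand-in for av.AudioFrame
    pts: Optional[int]
    samples: int          # number of samples in this frame

def assign_missing_pts(frames: List[DecodedAudio], start: int = 0) -> List[DecodedAudio]:
    """Fill in pts for frames that lack one, counting in sample units."""
    pos = start
    for f in frames:
        if f.pts is None:
            f.pts = pos
        pos = f.pts + f.samples
    return frames

frames = assign_missing_pts([DecodedAudio(None, 1024), DecodedAudio(None, 1024)])
# first frame gets pts 0, second gets pts 1024
```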


elif video_stream and packet.stream == video_stream:
    # Decode video frames
    for frame in packet.decode():
Contributor

Does this do sw or hw decode behind the scenes now that ffmpeg is not in this flow?

Contributor Author

Software only for now. The old ffmpeg code was also software-only (there was logic for GPU but we never enabled it)

We can enable GPU here eventually but want to keep this like-for-like to avoid introducing new issues, IIRC we've seen problems with transcoding + AI workloads

loop.call_soon_threadsafe(do_schedule)

# hold tasks since `loop.create_task` is a weak reference that gets GC'd
# TODO use asyncio.TaskGroup once all pipelines are on Python 3.11+
Contributor

Can we not switch to Python 3.11 right now? Is there a limitation? The logic here looks alright, but using TaskGroup would simplify it a bit by avoiding manual tracking of tasks. We can merge like this for now and re-test, but perhaps we should consider revisiting before launch if we see issues? wdyt?

cc @victorges as well for comments on this since I'm still relearning Pythonic concurrency.

Contributor Author

I believe there are dependency limitations in some pipelines that preclude us from using newer Python, although I am not sure what the minimum version would be; @victorges might know better. Didn't want to introduce the risk of a Python upgrade and pipeline validation alongside this PR, which is already invasive enough as-is.
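The manual task tracking being discussed (the "hold tasks since `loop.create_task` is a weak reference" comment) can be sketched as follows; the `schedule` helper and `background_tasks` names are illustrative, not from the PR:

```python
import asyncio

# Hold strong references to in-flight tasks: the event loop keeps only a
# weak reference, so an otherwise un-referenced Task can be garbage-collected
# before it finishes running.
background_tasks = set()

def schedule(coro, loop: asyncio.AbstractEventLoop) -> None:
    task = loop.create_task(coro)
    background_tasks.add(task)
    # Drop our reference once the task completes.
    task.add_done_callback(background_tasks.discard)

async def main():
    results = []
    async def work(n):
        results.append(n)
    schedule(work(1), asyncio.get_running_loop())
    await asyncio.sleep(0.01)  # let the scheduled task run
    return results

print(asyncio.run(main()))  # [1]
```

On Python 3.11+, `asyncio.TaskGroup` replaces this bookkeeping entirely, which is what the TODO in the PR points at.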

@emranemran
Contributor

Left a few generic comments on things we may need follow-up on - but I think it looks good to merge to staging and get some testing going before spending too much time with local testing and iteration. wdyt?

@j0sh j0sh merged commit 35fca9c into livepeer:main Jan 31, 2025
6 of 7 checks passed
@j0sh j0sh deleted the ja/audio-passthrough branch January 31, 2025 22:25
4 participants