Hardware decoding and encoding on amd64

Mark Van den Borre edited this page Nov 24, 2019 · 17 revisions

We can use cheap Intel (Atom-class) hardware to do decoding and encoding in hardware instead of in software.

Prerequisites

Install some required packages:

apt install gstreamer1.0-vaapi i965-va-driver vainfo

Capabilities

Let's see what hardware decoding and encoding capabilities our hardware has:

vainfo
libva info: VA-API version 1.4.0
libva info: va_getDriverName() returns 0
libva info: Trying to open /usr/lib/x86_64-linux-gnu/dri/i965_drv_video.so
libva info: Found init function __vaDriverInit_1_4
libva info: va_openDriver() returns 0
vainfo: VA-API version: 1.4 (libva 2.4.0)
vainfo: Driver version: Intel i965 driver for Intel(R) Sandybridge Mobile - 2.3.0
vainfo: Supported profile and entrypoints
      VAProfileMPEG2Simple            :	VAEntrypointVLD
      VAProfileMPEG2Main              :	VAEntrypointVLD
      VAProfileH264ConstrainedBaseline:	VAEntrypointVLD
      VAProfileH264ConstrainedBaseline:	VAEntrypointEncSlice
      VAProfileH264Main               :	VAEntrypointVLD
      VAProfileH264Main               :	VAEntrypointEncSlice
      VAProfileH264High               :	VAEntrypointVLD
      VAProfileH264High               :	VAEntrypointEncSlice
      VAProfileH264StereoHigh         :	VAEntrypointVLD
      VAProfileVC1Simple              :	VAEntrypointVLD
      VAProfileVC1Main                :	VAEntrypointVLD
      VAProfileVC1Advanced            :	VAEntrypointVLD
      VAProfileNone                   :	VAEntrypointVideoProc
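To script against this listing, grepping vainfo's output for a profile/entrypoint pair is enough. A minimal sketch (the helper name is ours, not part of libva):

```shell
# vaapi_supports PROFILE ENTRYPOINT — succeed if the vainfo listing on
# stdin contains the pair (helper name is ours, not part of libva).
vaapi_supports() {
    grep -q "$1.*$2"
}

# Example against a saved listing; on a real box pipe `vainfo` in instead.
printf '%s\n' \
    'VAProfileH264High               : VAEntrypointVLD' \
    'VAProfileH264High               : VAEntrypointEncSlice' |
vaapi_supports VAProfileH264High VAEntrypointEncSlice && echo supported
```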

Our hardware (Sandy Bridge) does not support plain h264 Baseline decoding or encoding; only Constrained Baseline is listed. We have confirmed this by testing.

Luckily, our boxes' h264 streams are High profile:

ffprobe tcp://185.175.218.128:8898/?timeout=2000000
Input #0, mpegts, from 'tcp://185.175.218.128:8898/?timeout=2000000':
  Duration: N/A, start: 13814.131344, bitrate: N/A
  Program 1 
    Metadata:
      service_name    : Service01
      service_provider: FFmpeg
    Stream #0:0[0x100]: Video: h264 (High) ([27][0][0][0] / 0x001B), yuv420p(progressive), 1920x1080 [SAR 1:1 DAR 16:9], 59.94 fps, 59.94 tbr, 90k tbn, 119.88 tbc
    Stream #0:1[0x101]: Audio: aac (LC) ([15][0][0][0] / 0x000F), 48000 Hz, stereo, fltp, 234 kb/s
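If you only need the profile, ffprobe can print it directly instead of dumping the whole stream listing (sketch; swap in your own URL or file):

```shell
# Print only the codec and profile of the first video stream.
ffprobe -v error -select_streams v:0 \
    -show_entries stream=codec_name,profile -of default=noprint_wrappers=1 \
    /tmp/test.mp4
```

Requires a real media file or stream to probe, so no canned output is shown here.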

FFmpeg

Decoding and reencoding

Prerequisites:

apt install i965-va-driver-shaders

ffmpeg -hwaccel vaapi -hwaccel_output_format vaapi -hwaccel_device /dev/dri/renderD128 -i vaapi_test/h264high_audio.mp4 -c:v h264_vaapi /tmp/output.mp4

https://trac.ffmpeg.org/wiki/Hardware/VAAPI http://www.ffmpeg.org/ffmpeg-codecs.html#VAAPI-encoders

Minimal example:

ffmpeg -hwaccel vaapi -hwaccel_output_format vaapi -hwaccel_device /dev/dri/renderD128 -i tcp://185.175.218.128:8898 -c:v h264_vaapi /tmp/output.mp4

Working source script. Note that you need hwdownload to transfer the frames from the GPU back to system memory, and it helps to do the scaling on the card before handing the data to the rest of the pipeline. Not sure whether FPS and SAR/DAR can also be handled on the card.

#!/bin/sh

confdir="$(dirname "$0")/../config/"
. ${confdir}/defaults.sh
. ${confdir}/config.sh

ffmpeg -y -nostdin -hwaccel vaapi -hwaccel_output_format vaapi -hwaccel_device /dev/dri/renderD128 \
	-i "${SOURCE_CAM}" \
	-ac 2 \
	-filter_complex "
		[0:v] scale_vaapi=w=$WIDTH:h=$HEIGHT  [v2]; [v2] hwdownload,format=nv12,format=yuv420p [v1]; [v1] scale=$WIDTH:$HEIGHT,fps=$FRAMERATE,setdar=16/9,setsar=1 [v] ;
		[0:a] aresample=$AUDIORATE [a]
	" \
	-map "[v]" -map "[a]" \
	-pix_fmt yuv420p \
	-c:v rawvideo \
	-c:a pcm_s16le \
	-f matroska \
	tcp://localhost:10000
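To test the script, something has to be listening on port 10000. A minimal receiver sketch (assuming ffplay is available; any Matroska-capable listener works):

```shell
# Listen on port 10000 and play the rawvideo/PCM Matroska stream the
# script above sends (start this before the source script).
ffplay -f matroska 'tcp://localhost:10000?listen'
```

This needs a display and a live sender, so it is only a sketch of the wiring, not something verifiable offline.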

Hardware

Legacy crap test using GStreamer

Decoding test

We're running wayland, so let's start a weston session:

apt install weston
LANG=C weston
weston-terminal
time gst-launch-1.0 -v filesrc location=/tmp/test_video_only.mp4 ! qtdemux ! vaapidecodebin ! vaapisink fullscreen=true
real=2m21.558s
user=0m2.238s
sys=0m1.516s

Let's compare to non-accelerated playback. Wall-clock time is just the clip length in both cases; the CPU (user) time is what shows the difference:

time mpv /tmp/test.mp4 
Playing: /tmp/test.mp4
 (+) Video --vid=1 (*) (h264 1920x1080 25.000fps)
VO: [gpu] 1920x1080 yuv420p
V: 00:02:21 / 00:02:21 (99%)

Exiting... (End of file)

real	2m21,818s
user	0m57,474s
sys	0m1,475s

Two steps at once: hardware decoding and re-encoding

time gst-launch-1.0 -v filesrc location=/tmp/test.mp4 ! qtdemux ! vaapidecodebin ! vaapih264enc rate-control=cbr tune=high-compression ! qtmux ! filesink location=/tmp/testdecode_reencode.mp4

real	0m17.031s
user	0m1.738s
sys	0m1.386s

Audio and video decoding

This works, but it's not using vaapi h264 decoding:

gst-launch-1.0 filesrc location=/tmp/test.mp4 ! qtdemux name=demux demux.audio_0 ! queue ! decodebin ! audioconvert ! audioresample ! autoaudiosink demux.video_0 ! queue ! decodebin ! autovideosink 
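An untested sketch of the same pipeline with the video branch forced through VA-API (vaapidecodebin and vaapisink come from the gstreamer1.0-vaapi package installed earlier; needs VA-API capable hardware to run):

```shell
# Same demux layout, but decode video with vaapidecodebin instead of
# the generic decodebin.
gst-launch-1.0 filesrc location=/tmp/test.mp4 ! qtdemux name=demux \
    demux.audio_0 ! queue ! decodebin ! audioconvert ! audioresample ! autoaudiosink \
    demux.video_0 ! queue ! vaapidecodebin ! vaapisink
```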

Setting gst debug levels might be helpful to discover some pipeline elements:

gst-launch-1.0  --gst-debug-level=2 filesrc location=/tmp/test.mp4 ! qtdemux name=demux demux.audio_0 ! queue ! decodebin ! audioconvert ! audioresample ! autoaudiosink demux.video_0 ! queue ! decodebin ! autovideosink 

One can visually examine the pipeline playbin generates:

cd /tmp
export GST_DEBUG_DUMP_DOT_DIR='/tmp/'
gst-launch-1.0 -v playbin uri=file:///tmp/test.mp4
ls -1 *.dot | xargs -I{} dot -Tpng {} -o{}.png
eog *.png