
ms2131_stability_tests

Mark Van den Borre edited this page Oct 29, 2023 · 35 revisions

Stability testing

Preliminary conclusions:

  • The interplay of the uniaccessories hub and the x270's usb-c uplink causes instability.

Test log

| device | fw | hub | ms2131 case | case | vid src | location | uptime | qty | stack | polling | remarks |
|---|---|---|---|---|---|---|---|---|---|---|---|
| hagibis | stock | n | n | n | tester | office | >12h | 4 | stock Debian | n | stopped test |
| hagibis | stock | y | n | y | camera hv30 | initlab | 10m | 1 | standard | y (standard) | crashed, boxbg2 |
| hagibis | stock | y | n | y | camera hv30 | initlab | 10m | 1 | standard | y (0x0000) | crashed, boxbg2 |
| hagibis | stock | y | n | y | camera hv30 | initlab | 10m | 1 | standard | y (0x1fd0) | crashed, boxbg2 |
| hagibis | stock | y | n | y | camera hv30 | initlab | 10m | 1 | standard | n | crashed, boxbg2 |
| hagibis | stock | n | n | y | camera hv30 | initlab | >30m | 1 | standard | n | survived, boxbg2 |
| hagibis | stock | n | n | y | camera hv30 | initlab | >30m | 1 | standard | y (0x1fd0) | survived, boxbg2 |
| hagibis | fosdem | n | y | n | laptop | gerry | 10h | 1 | desktop w/Arch | n | stopped test |
| hagibis | fosdem | n | y | n | laptop | gerry | 3h30m | 1 | desktop w/Arch | y (0x0000 1ms) | stopped test |
| hagibis | stock | y | n | n | tester | office | 01:23:18 | 1 | stock Debian | n | hub in metal case |
| hagibis | stock | y | n | n | tester | office 1 | 01:17:08 | 1 | stock Debian | n | hub in metal case |
| hagibis | stock | y | y | n | tester | office 3 | 02:29 | 1 | stock Debian | n | hub in metal case, loopout connected |
| hagibis | stock | y | y | n | tester | office 2 | 01:08:51 | 1 | stock Debian | n | hub in metal case, no loopout connected |
| hagibis | stock | y | y | n | tester | office 2 | 01:46 | 1 | stock Debian | n | hub in metal case, no loopout connected, after soft reboot |
| hagibis | stock | y | n | n | tester | office 1 | 01:07 | 1 | stock Debian | n | hub in metal case, after cold reboot w/power unplugged |
| hagibis | stock | y | n | n | tester | office 0 | 47:41 | 1 | stock Debian | n | hub in metal case, after cold reboot w/power unplugged |
| hagibis | stock | n | n | y | camera hv30 | initlab | >10h | 1 | standard | y (v1.2.0) | survived, boxbg2 |
| hagibis | stock | y | n | y | camera hv30 | initlab | >30m | 1 | standard | y (v1.2.0, 1ms poll) | survived, boxbg2, usb-c-to-usb-a |
| hagibis | stock | y | n | y | camera hv30 | initlab | 5m | 1 | standard | y (v1.0.0, 1ms poll) | crash, boxbg2, usb-c-to-usb-a, old poll range |
| hagibis | stock | y | n | y | x270 hdmi | initlab | >24h | 1 | standard | y (1.2.0) | survived, boxbg1 |
| hagibis | stock | n | n | y | camera hv30 | initlab | 30m | 1 | standard | y (v1.2.0, 1ms poll) | survived, Vasil's t14 |

  • test n (Gerry):

    • hagibis, stock firmware, metal case
    • usb-c hub, metal case, manufacturer-supplied usb-c cable plus a usb-a adapter, connected to x270 usb-a port
    • ms2131 polling, 1ms frequency
    • running for > 05:30:00 and counting
  • test o (office):

    • x270 1
    • Anker PowerExpand 8-in-1 usb-c 10 Gbps Data Hub, connected to usb-c
    • ms2131 connected to usb-c hub
    • no usb ethernet hub connected
    • stable for 26 hours, manually stopped

  • test p (office):

    • x270 3
    • Anker 7-in-1 usb-c 1H2C2AC-SIM, connected to usb-c
    • ms2131 connected to usb-c hub
    • no usb ethernet hub connected
    • barfed out after approximately 03:20:00
  • test q (office):

    • x270 2
    • uniaccessories usb-c hub
    • usb-c-to-usb-a adaptor, connected to x270 usb-a left
    • ms2131 connected to usb-c hub
    • no usb ethernet hub connected
    • stable for 26 hours, manually stopped
  • test r (office):

    • x270 3
    • cheap Benfei usb3 switch, connected to x270 usb-a left
    • ms2131 connected to usb hub
    • ethernet connected to usb hub, ip .246
  • test s (office):

    • x270 2
    • cheap Benfei usb3 switch, connected to x270 usb-a left
    • ms2131 connected to usb hub
    • ethernet connected to usb hub, ip .247
  • test t (office):

    • x270 1
    • capture box
    • ms2131 connected directly to x270 left
    • ethernet connected directly to x270 right, ip .154
  • test u (office):

    • x270 0
    • capture box
    • ms2131 connected directly to x270 left
    • ethernet connected directly to x270 right, ip .158

20231022 test scenario

  • have video-status, video-receiver services running;
  • have a loop-out (helps see this faster);
  • have a ffmpeg with blackdetect running for the tcp stream (ffmpeg -i tcp://185.175.218.154:8899/ -vf "blackdetect=d=0.1" -an -f null -)
  • in parallel to the above, run either service 'video-statuskeeper' or 'ms213x-status status --json --loop 1000 --filename /tmp/ms213x-status' for the normal situation
  • for the faster trigger, run
    ms213x-status status --loop 1

The faster trigger should blank the loop-out within a few minutes. The Yuncun cheap seems to hold out the longest in this case. The demo board dies the fastest.
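
To timestamp when the capture goes black during a run, the blackdetect output of the ffmpeg command above can be parsed by a small script. This is a minimal sketch, not part of the existing tooling: the function names are made up for illustration, and it assumes ffmpeg reports each black interval on stderr as a `black_start:… black_end:… black_duration:…` line.

```python
import re
import subprocess

# Shape of the lines ffmpeg's blackdetect filter prints to stderr, e.g.
# [blackdetect @ 0x55d0c0] black_start:12.4 black_end:13.1 black_duration:0.7
BLACK_RE = re.compile(
    r"black_start:(?P<start>[\d.]+)\s+"
    r"black_end:(?P<end>[\d.]+)\s+"
    r"black_duration:(?P<duration>[\d.]+)"
)


def parse_blackdetect_line(line):
    """Return (start, end, duration) in seconds, or None for unrelated lines."""
    match = BLACK_RE.search(line)
    if match is None:
        return None
    return (
        float(match["start"]),
        float(match["end"]),
        float(match["duration"]),
    )


def watch_stream(url):
    """Run ffmpeg with blackdetect on `url` and yield each reported black interval."""
    # Same options as the command in the test scenario.
    cmd = ["ffmpeg", "-i", url, "-vf", "blackdetect=d=0.1", "-an", "-f", "null", "-"]
    proc = subprocess.Popen(cmd, stderr=subprocess.PIPE, text=True)
    try:
        for line in proc.stderr:
            interval = parse_blackdetect_line(line)
            if interval is not None:
                yield interval
    finally:
        # Stop ffmpeg when the caller stops iterating.
        proc.terminate()
```

Iterating over `watch_stream("tcp://185.175.218.154:8899/")` yields one tuple per black interval, which can be logged next to the start time of the faster trigger to get a rough time-to-blank per device.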

observations and remarks

  • Running with 720p30 is less unstable than 1080p.
  • We are reading 16 bytes of ms2131 device RAM to check for resolution and signal. This triggers instability:
    • Both loopout screen and capture start blinking.
    • The MS2131 device sometimes starts reporting bogus resolution and sometimes bogus signal.
    • The MS2131 device sometimes shows just black capture output with proper output on the hdmi loopout.
    • Finally, the MS2131 device sometimes stops working properly altogether.
  • RAM polling instability. We can reproduce the issue within a minute by polling continuously.
    • Lowering the polling frequency
    • Not optimal. We are triggering dual reads of the ms2131 device RAM area with our current setup.
  • The usb hub is running quite hot. Heat sinks could help here.
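
To catch the moment the device starts reporting bogus resolution or signal, the status file written by `ms213x-status status --json --loop 1000 --filename /tmp/ms213x-status` can be watched for changes. The sketch below is illustrative only. It assumes the file holds a single JSON object that is overwritten on every loop, and it makes no assumption about which keys that object contains; the key names in the usage note are hypothetical.

```python
import json
import time


def read_status(path):
    """Return the parsed JSON at `path`, or None if it is missing or half-written."""
    try:
        with open(path, encoding="utf-8") as handle:
            return json.load(handle)
    except (OSError, ValueError):
        # json.JSONDecodeError is a ValueError; a partial write lands here.
        return None


def changed_keys(previous, current):
    """Return the sorted keys whose values differ between two status dicts."""
    keys = set(previous) | set(current)
    return sorted(key for key in keys if previous.get(key) != current.get(key))


def log_changes(path, interval_s=1.0):
    """Poll the status file forever and print every changed value with a timestamp."""
    previous = None
    while True:
        current = read_status(path)
        if isinstance(current, dict):
            if previous is not None:
                for key in changed_keys(previous, current):
                    stamp = time.strftime("%H:%M:%S")
                    print(f"{stamp} {key}: {previous.get(key)!r} -> {current.get(key)!r}")
            previous = current
        time.sleep(interval_s)
```

For example, `changed_keys({"width": 1920, "signal": True}, {"width": 0, "signal": True})` returns `["width"]`. The watcher only reads the file, so it should add no extra RAM reads on the device as long as `ms213x-status` is the only process talking to it.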

Wonky resolution and/or signal polling: ideas to improve stability or get rid of it entirely (harebrained and other)

  • two ms21xx devices: one to poll for resolution, one to capture video
  • something with black detection
  • compare time until it falls over on different ms2131 devices
  • minimise resolution and/or signal polling frequency
  • minimise the stability impact of a single resolution and/or signal polling event
    • avoid reading RAM twice (TODO)
  • avoid polling altogether
    • get both 4/3 and 16/9 regions with overlays
    • detect source resolution through hdmi splitter's serial port
  • poll different memory regions
  • continuing to push the same picture for a few seconds when the ms2131 freezes is not too bad a fsckup
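
Two of the ideas above (minimising the polling frequency and avoiding reading RAM twice) can be combined by putting a small cache in front of the device read, so that the resolution check and the signal check share one 16-byte read. This is a sketch under assumptions, not the current implementation: `CachedRamReader` is a hypothetical name, and `read_ram` stands for whatever callable actually performs the device read.

```python
import time


class CachedRamReader:
    """Serve all consumers from one cached RAM read, refreshed at most once per interval."""

    def __init__(self, read_ram, min_interval_s=1.0, clock=time.monotonic):
        self._read_ram = read_ram              # hypothetical callable returning raw bytes
        self._min_interval_s = min_interval_s
        self._clock = clock                    # injectable so the cache can be tested
        self._cached = None
        self._read_at = None
        self.device_reads = 0                  # counts real device accesses

    def get(self):
        """Return the cached bytes, touching the device only when the cache is stale."""
        now = self._clock()
        stale = self._read_at is None or now - self._read_at >= self._min_interval_s
        if stale:
            self._cached = self._read_ram()
            self._read_at = now
            self.device_reads += 1
        return self._cached
```

Both the resolution check and the signal check would call `get()` and slice the bytes they need out of the shared buffer, so one poll interval costs at most one device read. The `device_reads` counter makes it easy to confirm that the dual reads are gone.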

Individual test results

Yuncun cheap

Yuncun less cheap

Lin-ker

  • On engineering sample A, polling for signal by
    time sudo ./cli read RAM 0xf6e9 1 --loop=1
    lasts a very long time.

Hagibis

In Sofia
  • 5 separate tests with hub included, different components stopped (the last were with all possible), video source an old camera, hangs within 10-15 minutes. Also disabling the usb_suspend thing was tested and had no effect.