Figure out why the grey frame after joining the call is shown for so long #131

daniel-abramov · 2023-02-13T13:44:10Z

It takes on average 2 seconds to disappear, but sometimes it takes a bit more. We have to check if it's some sort of a bug in the SFU or something different.

daniel-abramov · 2023-03-24T14:18:06Z

@SimonBrandner, @robintown do you by chance know which state the grey frame corresponds to? (in EC)

AFAIK, we never display grey frames in the full-mesh mode, I would like to know when we do this in the SFU mode. Is it when the track negotiation has finished, but no RTPs arrived yet?

According to webrtcforthecurious, the ontrack is called when the first RTPs arrive, which means that if we show the video tile after getting a track, we potentially should be able to display something at the moment (unless the timestamps are somehow off?)

But according to the MDN, it's a bit different.

My current hypothesis is that the EC does get the RTPs, that's why the callback is called and we display the tile, but since the first packet is not necessarily a keyframe (see #100), nothing is said until the key frame arrives. In this case, this bug seems to be an extension (or maybe even a duplicate) of #100, but I would need a bit more input to get a broader perspective and to understand if it's worth addressing #100 first.

daniel-abramov · 2023-03-29T18:07:12Z

UPD. After running it with OpenTelemetry and also trying to match it with the logs, there are several interesting findings I've done. Essentially, the grey frame is shown when there are no RTP packets received on a track. If we don't receive any RTP packets at the beginning of the call (i.e. us joining the call and trying to display other participants' videos or a new participant joining the call), then that's where we see a grey frame. If the frames were received on a certain simulcast layer but stopped being sent by the client after a certain time, it will be observed as a frozen frame.

Essentially it happens because although we do get informed about new tracks/transceivers, we don't know upfront when and how often the peer's browser will send RTP packets on a certain simulcast layer. The browser adjusts this behavior based on many factors and we can't assume that we always get it.

The solution is to track the state of each publisher and automatically switch between layers by the SFU. This would solve the issues caused by long grey frames for the participants and frozen participants' videos.

It's important to note that short-term grey frames (1-2 seconds) that happen right after joining the call happen for a different reason. I've noticed that each time we join, there are a couple of logs that we always get on the SFU side:

INFO[2023-03-27T23:10:51+02:00] New subscription                              conf_id=1674726523924lGe9hGSoDBvHPAfL device_id=LTWKPUMJQI layer= subscriber="@copper-fluffy-constrictor:call.ems.host/VIHCKRDRXL" track=9950e423-c103-472f-b81e-06de083d30ec user_id="@teal-electoral-gorilla:call.ems.host"
INFO[2023-03-27T23:10:51+02:00] New subscription                              conf_id=1674726523924lGe9hGSoDBvHPAfL device_id=LTWKPUMJQI layer=low subscriber="@copper-fluffy-constrictor:call.ems.host/VIHCKRDRXL" track=278e9fbf-6a0d-4178-80df-b3724de6370e user_id="@teal-electoral-gorilla:call.ems.host"
INFO[2023-03-27T23:10:51+02:00] Renegotiating, sending SDP offer (3 streams)  conf_id=1674726523924lGe9hGSoDBvHPAfL device_id=VIHCKRDRXL user_id="@copper-fluffy-constrictor:call.ems.host"
INFO[2023-03-27T23:10:51+02:00] New subscription                              conf_id=1674726523924lGe9hGSoDBvHPAfL device_id=LTWKPUMJQI layer= subscriber="@green-accurate-caribou:call.ems.host/QAKNDEILIZ" track=9950e423-c103-472f-b81e-06de083d30ec user_id="@teal-electoral-gorilla:call.ems.host"
INFO[2023-03-27T23:10:51+02:00] New subscription                              conf_id=1674726523924lGe9hGSoDBvHPAfL device_id=LTWKPUMJQI layer=low subscriber="@green-accurate-caribou:call.ems.host/QAKNDEILIZ" track=278e9fbf-6a0d-4178-80df-b3724de6370e user_id="@teal-electoral-gorilla:call.ems.host"
INFO[2023-03-27T23:10:51+02:00] Renegotiating, sending SDP offer (3 streams)  conf_id=1674726523924lGe9hGSoDBvHPAfL device_id=QAKNDEILIZ user_id="@green-accurate-caribou:call.ems.host"
INFO[2023-03-27T23:10:51+02:00] New subscription                              conf_id=1674726523924lGe9hGSoDBvHPAfL device_id=LTWKPUMJQI layer= subscriber="@pink-middle-panther:call.ems.host/KBSMMCHYQX" track=9950e423-c103-472f-b81e-06de083d30ec user_id="@teal-electoral-gorilla:call.ems.host"
INFO[2023-03-27T23:10:51+02:00] New subscription                              conf_id=1674726523924lGe9hGSoDBvHPAfL device_id=LTWKPUMJQI layer=low subscriber="@pink-middle-panther:call.ems.host/KBSMMCHYQX" track=278e9fbf-6a0d-4178-80df-b3724de6370e user_id="@teal-electoral-gorilla:call.ems.host"
INFO[2023-03-27T23:10:51+02:00] Renegotiating, sending SDP offer (3 streams)  conf_id=1674726523924lGe9hGSoDBvHPAfL device_id=KBSMMCHYQX user_id="@pink-middle-panther:call.ems.host"
INFO[2023-03-27T23:10:51+02:00] Renegotiation answer received                 conf_id=1674726523924lGe9hGSoDBvHPAfL device_id=QAKNDEILIZ user_id="@green-accurate-caribou:call.ems.host"
INFO[2023-03-27T23:10:51+02:00] Renegotiating, sending SDP offer (3 streams)  conf_id=1674726523924lGe9hGSoDBvHPAfL device_id=QAKNDEILIZ user_id="@green-accurate-caribou:call.ems.host"
INFO[2023-03-27T23:10:51+02:00] Renegotiation answer received                 conf_id=1674726523924lGe9hGSoDBvHPAfL device_id=VIHCKRDRXL user_id="@copper-fluffy-constrictor:call.ems.host"
INFO[2023-03-27T23:10:51+02:00] Renegotiating, sending SDP offer (3 streams)  conf_id=1674726523924lGe9hGSoDBvHPAfL device_id=VIHCKRDRXL user_id="@copper-fluffy-constrictor:call.ems.host"
INFO[2023-03-27T23:10:51+02:00] Renegotiation answer received                 conf_id=1674726523924lGe9hGSoDBvHPAfL device_id=KBSMMCHYQX user_id="@pink-middle-panther:call.ems.host"
INFO[2023-03-27T23:10:51+02:00] Renegotiation answer received                 conf_id=1674726523924lGe9hGSoDBvHPAfL device_id=QAKNDEILIZ user_id="@green-accurate-caribou:call.ems.host"
INFO[2023-03-27T23:10:51+02:00] Renegotiation answer received                 conf_id=1674726523924lGe9hGSoDBvHPAfL device_id=VIHCKRDRXL user_id="@copper-fluffy-constrictor:call.ems.host"

Basically, each time the subscription gets added, we add a track to the peer connection, this causes a renegotiation process to start. Until the renegotiation completes, the grey frame is seen in the Element Call. We could probably work around it by not displaying a grey frame, but instead displaying a more useful status. As for preventing the renegotiation, the only thing we could do is to allocate a set of transceivers upfront that would allow us to skip the renegotiation stage.

daniel-abramov · 2023-03-29T18:08:10Z

Interesting note, not sure that it's related, but it happened every time where the browser on the sender's side did not send RTPs on a certain layer, we've observed the following errors printed by Pion in the stdout/stderr:

pc ERROR: 2023/03/27 21:56:20 Incoming unhandled RTP ssrc(2954160688), OnTrack will not be fired. incoming SSRC failed Simulcast probing

So that we can detect when the publisher is stalled. Relates to #131.

Relates to #131.

So that when the layer on a published track stalled, we can change subscriptions to expect receiving packets from a different track layer (different simulcast layer) within the same published track. Solves #131.

daniel-abramov added T-Defect Something isn't working T-Task X-Needs-Investigation and removed T-Task labels Feb 13, 2023

daniel-abramov self-assigned this Mar 22, 2023

daniel-abramov mentioned this issue Mar 29, 2023

Find out why we have so many warnings on missing RTPs on certain active subscriptions #121

Closed

daniel-abramov removed the X-Needs-Investigation label Mar 29, 2023

daniel-abramov added a commit that referenced this issue Apr 11, 2023

publisher: add packet observer

e4bafcc

So that we can detect when the publisher is stalled. Relates to #131.

daniel-abramov added a commit that referenced this issue Apr 11, 2023

track: detect stalled publishers and switch layer

7e401a5

Relates to #131.

daniel-abramov closed this as completed Apr 19, 2023

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Figure out why the grey frame after joining the call is shown for so long #131

Figure out why the grey frame after joining the call is shown for so long #131

daniel-abramov commented Feb 13, 2023

daniel-abramov commented Mar 24, 2023 •

edited

Loading

daniel-abramov commented Mar 29, 2023

daniel-abramov commented Mar 29, 2023

Figure out why the grey frame after joining the call is shown for so long #131

Figure out why the grey frame after joining the call is shown for so long #131

Comments

daniel-abramov commented Feb 13, 2023

daniel-abramov commented Mar 24, 2023 • edited Loading

daniel-abramov commented Mar 29, 2023

daniel-abramov commented Mar 29, 2023

daniel-abramov commented Mar 24, 2023 •

edited

Loading