jbuf frame completeness #963

Closed
wants to merge 15 commits into from

Collaborator

@cspiel1 cspiel1 commented Sep 27, 2023

jbuf: replace adaptive mode by frame completeness check

  • min/max specifies number of frames to keep in buffer
  • jbuf_get() does not deliver incomplete frames or out-of-order packets as long as max is not reached
  • jbuf_put() keeps track of the complete sequence of packets (see field end)

end always points to the end of the complete packet sequence. Thus between the head and end no packet is missing.

The implementation works for audio and video.

Keeping track of end, the number of frames nf and the number of complete frames ncf does not increase the O(.)-complexity of jbuf_put() or jbuf_get().
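
A minimal sketch of how such an `end` marker could be maintained on insert, for illustration only. The `struct pkt`, `struct seqbuf` and `seqbuf_put()` names are hypothetical and are not the types or functions in `src/jbuf/jbuf.c`; frame counting (nf, ncf) and duplicate handling are left out. The point it shows is that `end` only ever moves forward, so each packet is passed by the marker at most once.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified packet node; not the struct used in re. */
struct pkt {
	uint16_t seq;      /* RTP sequence number */
	struct pkt *next;  /* next packet in ascending sequence order */
};

/* Hypothetical buffer state: sorted list plus the `end` marker. */
struct seqbuf {
	struct pkt *head;  /* oldest packet */
	struct pkt *end;   /* last packet of the gap-free run from head */
};

/* True if b directly follows a, including 16-bit wrap-around. */
static bool is_next(uint16_t a, uint16_t b)
{
	return (uint16_t)(a + 1) == b;
}

/* Advance `end` while the following packet continues the sequence. */
static void advance_end(struct seqbuf *sb)
{
	if (!sb->end)
		sb->end = sb->head;

	while (sb->end && sb->end->next &&
	       is_next(sb->end->seq, sb->end->next->seq))
		sb->end = sb->end->next;
}

/* Insert p in sequence order, then update `end`. */
static void seqbuf_put(struct seqbuf *sb, struct pkt *p)
{
	struct pkt **pp = &sb->head;

	assert(sb && p);

	/* Signed 16-bit difference orders sequence numbers across wrap */
	while (*pp && (int16_t)(uint16_t)(p->seq - (*pp)->seq) > 0)
		pp = &(*pp)->next;

	p->next = *pp;
	*pp = p;

	/* New oldest packet: the gap-free run starts at it */
	if (sb->head == p)
		sb->end = p;

	advance_end(sb);
}
```

With packets 1, 3 and 5 in the buffer, `end` stays at 1; once packet 2 arrives, `end` moves to 3 and stops in front of the gap at 4.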

TODOs:

  • handle marker bit
  • plots to visualize algorithm with/without jitter simulation
  • unit tests

@cspiel1 cspiel1 force-pushed the jbuf_frame_completeness branch from 6b484d8 to 08ab8f9 Compare September 27, 2023 10:06
@cspiel1
Collaborator Author

cspiel1 commented Sep 27, 2023

The plots will heavily depend on this PR. So currently I am not sure if we can/should separate it.

@sreimers
Member

The plots will heavily depend on this PR. So currently I am not sure if we can/should separate it.

We should have at least a before and after comparison. We can simply use this one for main: f63865f

@cspiel1 cspiel1 marked this pull request as draft September 27, 2023 10:20
@cspiel1
Collaborator Author

cspiel1 commented Sep 27, 2023

We should have at least a before and after comparison. We can simply use this one for main: f63865f

Okay. Do you want to create the PR for your branch?
Then we also need a tools PR for baresip: baresip/baresip#2733

Edit: I could create both PRs (re and baresip) for the plots.

@cspiel1
Collaborator Author

cspiel1 commented Sep 27, 2023

Should we enable -DUSE_TRACE=ON -DCMAKE_C_FLAGS="-DRE_JBUF_TRACE" in github actions?

@cspiel1 cspiel1 force-pushed the jbuf_frame_completeness branch from 08ab8f9 to 9c3d5e6 Compare September 27, 2023 13:24
@cspiel1
Collaborator Author

cspiel1 commented Sep 27, 2023

Rebased, but it won't build so far. The plot data needs to be updated. I guess that github actions does not build the plot code.

@cspiel1 cspiel1 force-pushed the jbuf_frame_completeness branch from ea40cf9 to c9eb069 Compare September 27, 2023 13:43
@sreimers
Member

the builds are failing because of this test line:

test jbuf: TEST_ERR: /home/runner/work/re/re/test/jbuf.c:56: (No such file or directory [2])

Should we enable -DUSE_TRACE=ON -DCMAKE_C_FLAGS="-DRE_JBUF_TRACE" in github actions?

I think we should leave these for manual debug test builds only; they should not be needed in GitHub Actions (like modules that are not built with higher debug info).

@cspiel1
Collaborator Author

cspiel1 commented Sep 27, 2023

This commit af375ad is meant to start a discussion about the trace module. It supports only one trace. If jbuf is enabled for audio and video, the trace fails and we get memory leaks. This commit is a minimal change to fix this.

It would be better to extend the trace module to support multiple traces. This could be used to generate

  • jbuf-audio.json
  • jbuf-video.json

@sreimers
Member

sreimers commented Sep 27, 2023

I think trace should be handled differently.

There should be only one single re_trace.json, handled by libre_init and libre_close and controlled by USE_TRACE. Different modules or applications can use different categories.

#965

@cspiel1
Collaborator Author

cspiel1 commented Sep 28, 2023

A first plot for video with

audio_jitter_buffer_type off 
video_jitter_buffer_type fixed
video_jitter_buffer_delay 2-10

jbuf

We can observe that there is one lost packet. The jbuf fills up until 10 frames are reached, then the incomplete frame passes.
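
The behaviour seen in the plot can be summarised as a small decision function. This is only a sketch under stated assumptions: the `get_decision()` function and the `enum get_result` values are hypothetical, and the exact threshold (`ncf > min`) is an assumption rather than the condition used in the PR.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical outcome of a get attempt; names are illustrative only. */
enum get_result {
	GET_WAIT,       /* keep buffering, deliver nothing yet */
	GET_COMPLETE,   /* deliver a frame that has no missing packet */
	GET_INCOMPLETE  /* buffer is full: deliver despite the gap */
};

/*
 * nf      - number of frames currently buffered
 * ncf     - number of complete frames (no packet missing up to `end`)
 * min/max - configured bounds, e.g. 2 and 10 for a delay of "2-10"
 */
static enum get_result get_decision(uint32_t nf, uint32_t ncf,
				    uint32_t min, uint32_t max)
{
	assert(min <= max && ncf <= nf);

	/* Assumption: complete frames flow once more than `min` are held */
	if (ncf > min)
		return GET_COMPLETE;

	/* Upper bound reached: stop waiting for the lost packet */
	if (nf >= max)
		return GET_INCOMPLETE;

	return GET_WAIT;
}
```

For the lost-packet case above (no complete frame at the head, delay 2-10), the function returns `GET_WAIT` while fewer than 10 frames are buffered and `GET_INCOMPLETE` once the tenth frame arrives.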

@cspiel1 cspiel1 force-pushed the jbuf_frame_completeness branch from f4b8069 to a15575e Compare September 28, 2023 05:49
@sreimers
Member

We can observe that there is one lost packet. The jbuf fills up until 10 frames are reached, then the incomplete frame passes.

It looks like the max frames are decreased/consumed very quickly by the decoder after this. I think this should be smoothed, since it causes:

  • a high cpu spike by decoder
  • visible video speedup

@cspiel1
Collaborator Author

cspiel1 commented Sep 28, 2023

Currently we have

  • audio
rtp -> jbuf -> decoder -> aubuf -> auplay
| main thread               | play thread  |
  • video
rtp -> jbuf -> decoder -> display
| main thread                        |

A vrx_thread could be an improvement. jbuf should be thread safe already. I'll try a timer instead.

We should also think about the audio pipeline. There are jbuf and aubuf: The latter does jitter smoothing currently.

@cspiel1
Collaborator Author

cspiel1 commented Sep 28, 2023

  • one packet loss
    jbuf-1loss

  • with jitter and packet losses
    jbuf-jitter-and-loss

src/jbuf/jbuf.c Outdated

if (!down && tmr_isrunning(&jb->tmr))
tmr_cancel(&jb->tmr);
tmr_start(&jb->tmr, 250, reset_wait, jb);
Member

Does this fixed 250ms value work well with different fps and min/max jbuf combinations?

Collaborator Author

@cspiel1 cspiel1 Sep 28, 2023

Not sure. Anyway, while analyzing audio streams with jitter simulation I saw that we need to compute the moving average rdiff to reliably decide whether the buffer may be reduced again and by how much.

I'll replace the last two commits.

Collaborator Author

rdiff didn't solve the problem for audio. It was only 2 packets, while the number of complete frames needs to be higher to avoid underruns in the following aubuf.
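
For readers unfamiliar with the term, a moving average of this kind is commonly kept as an exponential moving average. The sketch below is generic and is not the rdiff computation in re; `ema_update()` and the smoothing factor `alpha` are hypothetical names, and what counts as a sample (for example the reorder distance of a packet, in packets) is an assumption.

```c
#include <assert.h>

/*
 * Generic exponential moving average step.
 * avg    - previous average
 * sample - new measurement (assumed here: reorder distance in packets)
 * alpha  - weight of the new sample, in (0, 1]
 */
static double ema_update(double avg, double sample, double alpha)
{
	assert(alpha > 0.0 && alpha <= 1.0);

	return avg + alpha * (sample - avg);
}
```

With a steady sample of 2 packets the average converges to 2, which matches the observation above that such an average stays small even when more complete frames are needed downstream.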

@cspiel1
Collaborator Author

cspiel1 commented Sep 28, 2023

Shrink the buffer after the last out-of-order packet. Here are plots for video:

jbuf2
jbuf3

@cspiel1
Collaborator Author

cspiel1 commented Sep 29, 2023

  1. buffer rise if packets are missing
    @sreimers as you can see, if a packet is lost the buffer immediately jumps to the configured max (here 10 frames). This leads to a short break with a frozen frame. E.g. 20fps and max=10 result in a 500ms break with a frozen frame. If I understood you correctly, this is the wanted behavior.

Alternative: Besides the overall frame max, set another boundary for the frame freezing time, e.g. 2 frames. Then pass the next frame, which is still incomplete, in the hope that the decoder is able to decode it with some artifacts in the display.

  2. shrinking buffer
    The current solution with tmr and the wait and again flags works and reduces the buffer slowly, starting 1 second after the last out-of-order event occurred. After the first lost packet the buffer stays at max frames as long as out-of-order packets are detected, which is not ideal. It would be better to also reduce the buffer slowly after a lost frame, even while out-of-order packets still occur.
    This leads to the question: Why only after a lost frame? What if the buffer rises only to max-1?

Next try: Periodically, e.g. every second, the number of complete frames ncf is reduced by passing two frames to the decoder, until ncf==0 is reached. So no other condition like "lost frame" or "out-of-order packet" has to be met.
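
The "next try" can be sketched as follows. This is an illustration under assumptions, not the code of the PR: `struct shrink`, `frames_to_deliver()` and `SHRINK_INTERVAL_MS` are hypothetical names, the caller is assumed to supply a monotonic millisecond clock, and the case where max is reached with an incomplete frame is not covered.

```c
#include <assert.h>
#include <stdint.h>

enum { SHRINK_INTERVAL_MS = 1000 };  /* "every second" from the comment */

/* Hypothetical shrink state; the field name is illustrative. */
struct shrink {
	uint64_t last_ms;  /* time of the last extra delivery */
};

/*
 * Number of complete frames to hand to the decoder for one get call.
 * Normally one; once per interval two, so that the buffer drains by
 * one frame per interval without any "lost" or "out-of-order" trigger.
 */
static unsigned frames_to_deliver(struct shrink *s, uint64_t now_ms,
				  uint32_t ncf)
{
	assert(s);

	if (ncf == 0)
		return 0;  /* nothing complete: caller keeps waiting */

	if (ncf >= 2 && now_ms - s->last_ms >= SHRINK_INTERVAL_MS) {
		s->last_ms = now_ms;
		return 2;  /* one regular frame plus one to shrink */
	}

	return 1;
}
```

Because the decision depends only on elapsed time and ncf, the shrink rate is bounded to one frame per interval regardless of how the buffer grew.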

@cspiel1
Collaborator Author

cspiel1 commented Sep 29, 2023

Last mentioned method works better. Have to add one commit.
jbuf4

x means reset_wait() was called.
y means jb->again was true in line 617.

@cspiel1
Collaborator Author

cspiel1 commented Sep 29, 2023

Of course I already tried yesterday without the line removed here: 9a6c8c5. But I got bad results because I had both jbufs active (for audio and video), so the data was mixed, which confused me.

@cspiel1
Collaborator Author

cspiel1 commented Sep 29, 2023

Analysing results for audio stream

audio_jitter_buffer_type fixed
audio_jitter_buffer_delay 1-10
video_jitter_buffer_type off

audio_buffer        	20-300     	# ms
audio_buffer_mode	adaptive

jbuf
ajb

  • jbuf plot: "waiting" events may lead to real aubuf underruns
  • aubuf ajb plot: There are too many underruns. We can see that the jitter is still detected, but not as fast and as high as we would expect. After the jitter simulation has been active for ~4 seconds, the underruns in aubuf seem to correlate with the "waiting" events.

*
* @return number of frames
*/
uint32_t jbuf_frames(const struct jbuf *jb)
Member

Why is this dropped? It's useful for unit testing and ensures that frame counting works correctly.

Collaborator Author

okay, I'll add it.

@cspiel1
Collaborator Author

cspiel1 commented Sep 29, 2023

The underruns also correlate with the "lost" packets.
Here are the plots without packet loss (except one at the end):
jbuf
ajb

Edit:
The above was with no delay and 100ms jitter:

tc qdisc add dev ifb1 root netem delay 0ms 100ms

Now with 100ms delay and 50ms jitter:

tc qdisc add dev ifb1 root netem delay 100ms 50ms

jbuf
ajb

Edit:
The many underruns are not acceptable. This has to be studied further.

@cspiel1
Collaborator Author

cspiel1 commented Sep 29, 2023

Here #963 (comment) is an open question (1.) about "lost packets".

Edit:
Except for the "lost packets" question mentioned, I think the jbuf algorithm now works as expected. The underruns in aubuf can't be solved here. Maybe in another PR we could move the ajb back into jbuf and avoid a third "decoding thread" (we already tried a decoding thread) by means of a timer running in the main thread (later in the RTP rx thread).

I'll set this PR to "ready for review" and fix the unit tests.

@cspiel1 cspiel1 marked this pull request as ready for review September 29, 2023 08:39
@cspiel1
Collaborator Author

cspiel1 commented Oct 2, 2023

Should we move jbuf to baresip like @alfredh suggests?
#962 (comment)

I was thinking again about the current solution in this PR. The result of jbuf_get() depends on timing. The shrink timing should maybe be moved to baresip/the application. Also, the unit tests now look strange because of the "waiting" state.
--> Put back to draft now to prevent a merge.

I'll also check with our fixed rdiff implementation whether the audio underruns are similar.

@cspiel1 cspiel1 marked this pull request as draft October 2, 2023 08:46
@cspiel1
Collaborator Author

cspiel1 commented Oct 2, 2023

Compared with these results: #963 (comment)

tc qdisc add dev ifb1 root netem delay 0ms 100ms
audio_jitter_buffer_type adaptive
audio_jitter_buffer_delay 1-10
video_jitter_buffer_type off

audio_buffer        	20-300     	# ms
audio_buffer_mode	adaptive

ajb

These are fewer underruns.

@sreimers
Member

sreimers commented Oct 2, 2023

I will prepare some pcap-based tests with SIPp (maybe within an extra tools repository), so it's easier to reproduce issues and see improvements, and we can add real jitter examples (like WiFi/LTE). With editcap we can manipulate order/timing/loss and simulate edge cases. Maybe I'll add another experiment which is clockrate/ptime based instead of frame counting, so we can compare the solutions we currently have.

  1. buffer rise if packets are missing
    @sreimers as you can see, if a packet is lost the buffer immediately jumps to the configured max (here 10 frames). This leads to a short break with a frozen frame. E.g. 20fps and max=10 result in a 500ms break with a frozen frame. If I understood you correctly, this is the wanted behavior.

Unsure, I will think about this. Is this a real packet loss or a late loss? We should keep the delay as small as possible (based on network conditions).

Should we move jbuf to baresip like @alfredh suggests?
#962 (comment)

Yes it would make testing easier for now.

@cspiel1
Collaborator Author

cspiel1 commented Oct 2, 2023

Great ideas for testing!

Is this a real packet loss or a late loss?

This makes no difference. The question is whether it makes sense not to jump to max in one step. Instead:
Besides the overall frame max, set another boundary for the frame freezing time, e.g. 2 frames. Then pass the next frame, which is still incomplete, in the hope that the decoder is able to decode it with some artifacts in the display.
This would mean that the latency increases by only two frames for a lost or late packet. If more lost or late packets follow, the latency increases again.
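
To put numbers on the two options, the freeze time is simply the number of buffered frames divided by the frame rate. The helper below is a hypothetical illustration (`frames_to_ms()` is not part of re or baresip) and assumes a constant frame rate.

```c
#include <assert.h>
#include <stdint.h>

/* Time in milliseconds that `frames` video frames span at `fps`. */
static uint32_t frames_to_ms(uint32_t frames, uint32_t fps)
{
	assert(fps > 0);

	return frames * 1000u / fps;
}
```

At 20fps, jumping to max=10 corresponds to `frames_to_ms(10, 20)`, i.e. the 500ms break mentioned earlier, while a boundary of 2 frames corresponds to `frames_to_ms(2, 20)`, i.e. 100ms per lost or late packet.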

Okay, then I'll prepare a PR for moving jbuf from re to baresip. This could be merged with low risk to the current release.

@cspiel1 cspiel1 force-pushed the jbuf_frame_completeness branch from c599c9a to 09383fc Compare October 2, 2023 12:45
@cspiel1
Collaborator Author

cspiel1 commented Oct 2, 2023

@cspiel1
Collaborator Author

cspiel1 commented Oct 2, 2023

Applied the changes to a baresip branch based on baresip#2743
https://github.com/cspiel1/baresip/tree/jbuf_frame_completeness2

@cspiel1 cspiel1 closed this Oct 2, 2023
@cspiel1 cspiel1 deleted the jbuf_frame_completeness branch October 6, 2023 06:06