
fix: streams awaiting capacity lockout #730

Merged 4 commits into hyperium:master from dswij:issue-3338 on Jan 8, 2024

Conversation

@dswij (Member) commented Jan 1, 2024

Fixes hyperium/hyper#3338

This PR changes the assign-capacity queue to prioritize streams that are send-ready.

This is necessary to prevent a lockout when streams aren't able to proceed while waiting for connection capacity, but there is none.
@seanmonstar (Member) commented

Thank you so so much! I'm wondering, would it be complicated to make a unit test that exercises this? Or maybe one of the reporters would like to try the patch and confirm the problem is gone.

@dswij (Member, Author) commented Jan 1, 2024

I tested it on the minimal repro sample provided in the issue, and it seems to be working fine. It also fixes this issue:

> for less than 300 concurrent, say 250: a few initial requests are done quickly (5-8, similar to above). Then it pauses for a few seconds, then it catches up and finishes all remaining.

hyperium/hyper#3338 (comment)


> would it be complicated to make a unit test that exercises this?

I'm currently trying to see if we can make this happen, but haven't had the time yet; hence the draft status.

Also, is there any benchmark for hyper at the moment? @seanmonstar

@seanmonstar (Member) commented

That's fantastic! For benchmarks, the end_to_end ones are likely best. You can use a `[patch]` and then run `cargo bench --features full --bench end_to_end http2` to see all the HTTP/2 ones.
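For anyone following along, a `[patch]` override in hyper's Cargo.toml would look roughly like this; the fork URL and branch below are assumptions for illustration (point them at whatever branch carries the fix):

```toml
# Hypothetical override: swap the crates-io h2 for the PR branch.
# The git URL and branch name here are illustrative assumptions.
[patch.crates-io]
h2 = { git = "https://github.com/dswij/h2", branch = "issue-3338" }
```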

@jfourie1 commented Jan 5, 2024

Can this change result in a situation where new streams that are not send-ready never get a chance to become established, even if connection capacity is available? If there are send-ready streams that continuously consume all the send flow capacity, they will always be moved to the front of the pending-capacity queue. I guess this is a tradeoff between per-stream throughput and the number of open streams.

Update: Looking at this in more detail, I don't think the scenario above can happen. It is still not clear to me what the root cause of the problem is. Do we ever attempt to assign capacity to streams that are not open yet?
I'm curious whether adding a check for `is_send_ready()` before assigning capacity in `assign_connection_capacity()` would prevent the original problem from happening.

@dswij dswij marked this pull request as ready for review January 6, 2024 10:51
@dswij (Member, Author) commented Jan 6, 2024

> It is still not clear to me what the root cause of the problem is.

What's happening is that the prioritization model in h2 allows users to request streams over the max limit, and h2 only opens them when it's possible. Currently, however, the extra streams are put into the streams-awaiting-capacity queue.

So here's an example:

  1. The user asks h2 to use 100 streams, but the server only accepts 1.
  2. h2 assigns all connection capacity to `StreamId(1)`; the other 99 streams are put into the awaiting-capacity queue.
  3. After getting some data from `StreamId(1)`, the server sends multiple WINDOW_UPDATE frames, but in small increments. We assign capacity to `StreamId(3)`, then push it back to the end of the queue.
  4. Repeat until all the other streams get some capacity, but they're all waiting for the lower IDs to open.
  5. Connection capacity is assigned badly.
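The fix itself lives in h2's internals, but the idea can be sketched in isolation. Below is a minimal illustration (the `Stream` type, the `VecDeque` queue, and this `is_send_ready` are simplified stand-ins invented for this example, not h2's actual code): when handing out freed connection capacity, visit send-ready streams first, so capacity is never parked on streams that cannot use it yet.

```rust
use std::collections::VecDeque;

// Simplified stand-in for h2's internal stream state (illustrative only).
struct Stream {
    id: u32,
    requested: usize, // capacity the stream is waiting for
    assigned: usize,  // capacity granted so far
    open: bool,       // whether the peer has allowed this stream to open
}

impl Stream {
    // "Send-ready" means the stream can make progress once capacity arrives;
    // streams still blocked on the concurrency limit are not.
    fn is_send_ready(&self) -> bool {
        self.open
    }
}

// Hand out available connection capacity, send-ready streams first.
fn assign_connection_capacity(queue: &mut VecDeque<Stream>, mut available: usize) {
    // Stable sort: send-ready streams (key `false`) move to the front,
    // preserving FIFO order within each group.
    queue.make_contiguous().sort_by_key(|s| !s.is_send_ready());

    for stream in queue.iter_mut() {
        if available == 0 {
            break;
        }
        let grant = (stream.requested - stream.assigned).min(available);
        stream.assigned += grant;
        available -= grant;
    }
}

fn main() {
    // Five pending streams (odd client-side IDs); only stream 1 is open.
    let mut queue: VecDeque<Stream> = (1u32..=5)
        .map(|i| Stream { id: 2 * i - 1, requested: 100, assigned: 0, open: i == 1 })
        .collect();

    // With plain FIFO order, 150 bytes of window could be smeared across
    // streams that cannot send yet; here stream 1 is served fully first.
    assign_connection_capacity(&mut queue, 150);

    for s in &queue {
        println!("stream {}: assigned {}", s.id, s.assigned);
    }
}
```

Only the visiting order changes; the grant loop is untouched, which is enough to stop small WINDOW_UPDATE increments from being spread across streams that are still waiting to open.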

I've added a test to demonstrate an extreme case of this: e13adae

While 7dec3c2 does prevent a lockout, f1f99e0 is a more complete fix @seanmonstar.

Haven't had the chance to formally benchmark it (the hyper benchmark is somehow broken on my machine), but against the Go server it seems fine.

@jfourie1 commented Jan 8, 2024

Thanks for the explanation. I think the changes in f1f99e0 are good.

@seanmonstar (Member) commented

I can run the benches. I realized the http2 chunked benchmarks could be re-enabled; once that's done, I'll compare them.

@seanmonstar (Member) commented

Alright, here are the benchmarks, run 3 times each:

master

test http2_consecutive_x1_empty                             ... bench:      19,466 ns/iter (+/- 751)
test http2_consecutive_x1_req_100kb                         ... bench:      61,085 ns/iter (+/- 7,731) = 1676 MB/s
test http2_consecutive_x1_req_10b                           ... bench:      23,145 ns/iter (+/- 2,280)
test http2_parallel_x10_empty                               ... bench:      80,085 ns/iter (+/- 1,178)
test http2_parallel_x10_req_10kb_100_chunks                 ... bench:  21,561,698 ns/iter (+/- 33,079,216) = 474 MB/s
test http2_parallel_x10_req_10kb_100_chunks_adaptive_window ... bench:  29,334,189 ns/iter (+/- 17,471,509) = 349 MB/s
test http2_parallel_x10_req_10kb_100_chunks_max_window      ... bench:   4,536,371 ns/iter (+/- 442,932) = 2257 MB/s
test http2_parallel_x10_req_10mb                            ... bench:  28,362,346 ns/iter (+/- 1,293,771) = 3697 MB/s
test http2_parallel_x10_res_10mb                            ... bench:  63,976,528 ns/iter (+/- 10,896,099) = 1639 MB/s
test http2_parallel_x10_res_1mb                             ... bench:   2,977,858 ns/iter (+/- 175,547) = 3521 MB/s
test http2_consecutive_x1_empty                             ... bench:      19,250 ns/iter (+/- 835)
test http2_consecutive_x1_req_100kb                         ... bench:      65,900 ns/iter (+/- 7,830) = 1553 MB/s
test http2_consecutive_x1_req_10b                           ... bench:      23,125 ns/iter (+/- 1,029)
test http2_parallel_x10_empty                               ... bench:      80,587 ns/iter (+/- 2,271)
test http2_parallel_x10_req_10kb_100_chunks                 ... bench:  21,242,009 ns/iter (+/- 33,134,735) = 482 MB/s
test http2_parallel_x10_req_10kb_100_chunks_adaptive_window ... bench:   4,728,061 ns/iter (+/- 739,811) = 2165 MB/s
test http2_parallel_x10_req_10kb_100_chunks_max_window      ... bench:   4,818,920 ns/iter (+/- 312,953) = 2124 MB/s
test http2_parallel_x10_req_10mb                            ... bench:  28,621,065 ns/iter (+/- 1,045,695) = 3663 MB/s
test http2_parallel_x10_res_10mb                            ... bench:  65,335,503 ns/iter (+/- 13,042,825) = 1604 MB/s
test http2_parallel_x10_res_1mb                             ... bench:   2,986,464 ns/iter (+/- 182,535) = 3511 MB/s
test http2_consecutive_x1_empty                             ... bench:      19,958 ns/iter (+/- 615)
test http2_consecutive_x1_req_100kb                         ... bench:      58,819 ns/iter (+/- 3,983) = 1740 MB/s
test http2_consecutive_x1_req_10b                           ... bench:      23,026 ns/iter (+/- 3,322)
test http2_parallel_x10_empty                               ... bench:      80,913 ns/iter (+/- 4,447)
test http2_parallel_x10_req_10kb_100_chunks                 ... bench:  21,632,048 ns/iter (+/- 32,801,356) = 473 MB/s
test http2_parallel_x10_req_10kb_100_chunks_adaptive_window ... bench:  21,337,511 ns/iter (+/- 16,579,399) = 479 MB/s
test http2_parallel_x10_req_10kb_100_chunks_max_window      ... bench:   4,460,987 ns/iter (+/- 107,547) = 2295 MB/s
test http2_parallel_x10_req_10mb                            ... bench:  28,389,317 ns/iter (+/- 1,629,586) = 3693 MB/s
test http2_parallel_x10_res_10mb                            ... bench:  64,092,171 ns/iter (+/- 10,006,727) = 1636 MB/s
test http2_parallel_x10_res_1mb                             ... bench:   2,887,548 ns/iter (+/- 284,703) = 3631 MB/s

patch

test http2_consecutive_x1_empty                             ... bench:      19,344 ns/iter (+/- 1,090)
test http2_consecutive_x1_req_100kb                         ... bench:      62,291 ns/iter (+/- 4,330) = 1643 MB/s
test http2_consecutive_x1_req_10b                           ... bench:      22,989 ns/iter (+/- 1,305)
test http2_parallel_x10_empty                               ... bench:      79,106 ns/iter (+/- 1,078)
test http2_parallel_x10_req_10kb_100_chunks                 ... bench:  22,254,082 ns/iter (+/- 33,339,072) = 460 MB/s
test http2_parallel_x10_req_10kb_100_chunks_adaptive_window ... bench:  13,325,490 ns/iter (+/- 16,811,166) = 768 MB/s
test http2_parallel_x10_req_10kb_100_chunks_max_window      ... bench:   4,439,135 ns/iter (+/- 137,301) = 2306 MB/s
test http2_parallel_x10_req_10mb                            ... bench:  28,679,929 ns/iter (+/- 1,500,517) = 3656 MB/s
test http2_parallel_x10_res_10mb                            ... bench:  65,187,943 ns/iter (+/- 10,425,661) = 1608 MB/s
test http2_parallel_x10_res_1mb                             ... bench:   3,064,030 ns/iter (+/- 562,964) = 3422 MB/s
test http2_consecutive_x1_empty                             ... bench:      19,080 ns/iter (+/- 2,813)
test http2_consecutive_x1_req_100kb                         ... bench:      64,433 ns/iter (+/- 8,307) = 1589 MB/s
test http2_consecutive_x1_req_10b                           ... bench:      22,021 ns/iter (+/- 1,723)
test http2_parallel_x10_empty                               ... bench:      79,386 ns/iter (+/- 9,277)
test http2_parallel_x10_req_10kb_100_chunks                 ... bench:  21,791,421 ns/iter (+/- 33,479,730) = 469 MB/s
test http2_parallel_x10_req_10kb_100_chunks_adaptive_window ... bench:   4,766,794 ns/iter (+/- 339,291) = 2148 MB/s
test http2_parallel_x10_req_10kb_100_chunks_max_window      ... bench:   4,768,157 ns/iter (+/- 315,048) = 2147 MB/s
test http2_parallel_x10_req_10mb                            ... bench:  29,218,333 ns/iter (+/- 2,002,628) = 3588 MB/s
test http2_parallel_x10_res_10mb                            ... bench:  64,073,564 ns/iter (+/- 10,601,740) = 1636 MB/s
test http2_parallel_x10_res_1mb                             ... bench:   2,940,898 ns/iter (+/- 148,619) = 3565 MB/s
test http2_consecutive_x1_empty                             ... bench:      22,374 ns/iter (+/- 1,478)
test http2_consecutive_x1_req_100kb                         ... bench:      81,660 ns/iter (+/- 17,238) = 1253 MB/s
test http2_consecutive_x1_req_10b                           ... bench:      26,591 ns/iter (+/- 761)
test http2_parallel_x10_empty                               ... bench:      81,684 ns/iter (+/- 4,365)
test http2_parallel_x10_req_10kb_100_chunks                 ... bench:  22,456,080 ns/iter (+/- 32,524,006) = 456 MB/s
test http2_parallel_x10_req_10kb_100_chunks_adaptive_window ... bench:  21,630,419 ns/iter (+/- 17,674,624) = 473 MB/s
test http2_parallel_x10_req_10kb_100_chunks_max_window      ... bench:   4,471,232 ns/iter (+/- 248,682) = 2290 MB/s
test http2_parallel_x10_req_10mb                            ... bench:  28,413,500 ns/iter (+/- 2,303,337) = 3690 MB/s
test http2_parallel_x10_res_10mb                            ... bench:  67,332,789 ns/iter (+/- 14,450,978) = 1557 MB/s
test http2_parallel_x10_res_1mb                             ... bench:   3,005,929 ns/iter (+/- 521,638) = 3488 MB/s

@seanmonstar (Member) commented

Looking through the results, the differences look like just noise. Maybe there's a workload that matters that the benchmarks don't catch, but 🤷. If you concur, we can merge this!

@jeromegn commented Jan 8, 2024

I didn't work on this, but it does look like noise to me. Some of them are highly variable in both master and the patch (e.g. http2_parallel_x10_req_10kb_100_chunks_adaptive_window).

@seanmonstar (Member) commented

> adaptive_window

I noticed that too. I think the issue is that the benchmarks are going over localhost, which means the adaptive-window logic, which tries to estimate bandwidth, will have a horrible time. A cool project to improve the adaptive window option would be to set up some tests using turmoil, where we can simulate latency.
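A rough sketch of what such a turmoil test could look like, assuming turmoil's `Builder` exposes `min_message_latency`/`max_message_latency` knobs and tokio-style net types (method names here are from memory; verify against the current turmoil API before relying on this):

```rust
use std::time::Duration;
use tokio::io::{AsyncReadExt, AsyncWriteExt};

fn main() {
    // Simulate 10-50 ms of per-message latency between hosts, exactly the
    // condition that localhost benchmarks can never produce.
    let mut sim = turmoil::Builder::new()
        .min_message_latency(Duration::from_millis(10))
        .max_message_latency(Duration::from_millis(50))
        .build();

    // A trivial echo server standing in for an h2 peer.
    sim.host("server", || async {
        let listener = turmoil::net::TcpListener::bind("0.0.0.0:8080").await?;
        loop {
            let (mut stream, _) = listener.accept().await?;
            let mut buf = [0u8; 16];
            let n = stream.read(&mut buf).await?;
            stream.write_all(&buf[..n]).await?;
        }
    });

    sim.client("client", async {
        let mut stream = turmoil::net::TcpStream::connect("server:8080").await?;
        stream.write_all(b"ping").await?;
        let mut buf = [0u8; 4];
        stream.read_exact(&mut buf).await?;
        Ok(())
    });

    sim.run().unwrap();
}
```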

@seanmonstar seanmonstar merged commit 756345e into hyperium:master Jan 8, 2024
7 checks passed
@seanmonstar (Member) commented

Thank you so much @dswij, excellent debugging work here ❤️

@seanmonstar (Member) commented

Shipped as v0.4.1 :)
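(For downstream users, picking up the fix is just a dependency bump; a minimal Cargo.toml line, for illustration:)

```toml
[dependencies]
h2 = "0.4.1" # includes the streams-awaiting-capacity lockout fix (#730)
```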

@dswij dswij deleted the issue-3338 branch January 9, 2024 14:05
seanmonstar pushed a commit that referenced this pull request Jan 10, 2024
This PR changes the assign-capacity queue to prioritize streams that
are send-ready.

This is necessary to prevent a lockout when streams aren't able to proceed while
waiting for connection capacity, but there is none.

Closes hyperium/hyper#3338
seanmonstar added a commit that referenced this pull request Jan 10, 2024
This PR changes the assign-capacity queue to prioritize streams that
are send-ready.

This is necessary to prevent a lockout when streams aren't able to proceed while
waiting for connection capacity, but there is none.

Closes hyperium/hyper#3338

Co-authored-by: dswij <[email protected]>
cxw620 pushed 3 commits to cxw620/h2 that referenced this pull request Jan 20, 2025
Successfully merging this pull request may close these issues:

HTTP/2: Hyper communication is stuck if too many requests are spawned.