Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix(networking): Rework auto concurrency backpressure logic #3804

Merged
merged 3 commits into from
Sep 10, 2020

Conversation

bruceg
Copy link
Member

@bruceg bruceg commented Sep 10, 2020

Previously, all responses were marked as not being backpressure. This
changes that logic to:

  1. Treat RetryAction::Retry responses as backpressure

  2. Only use the RTT values from RetryAction::Successful responses

Signed-off-by: Bruce Guenter [email protected]

Closes #3424

Previously, all responses were marked as not being backpressure. This
changes that logic to:

1. Treat RetryAction::Retry responses as backpressure

2. Only use the RTT values from RetryAction::Successful responses

Signed-off-by: Bruce Guenter <[email protected]>
@bruceg bruceg added type: bug A code related bug. domain: networking Anything related to Vector's networking domain: performance Anything related to Vector's performance labels Sep 10, 2020
@bruceg bruceg requested a review from jszwedko September 10, 2020 15:15
@bruceg bruceg self-assigned this Sep 10, 2020
Copy link
Member

@jszwedko jszwedko left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks good! Are there tests for this behavior?

I reran the concurrency tests and it seems to be handling 429s and 503s well now.

The "drop connections over 10 concurrent" test seems to achieve optimal behavior, but I still would have, naively, expected it to more closely match the 429/503 graphs.

all

// It would be better to avoid generating the string in Retry(_)
// just to throw it away here, but it's probably not worth the
// effort.
let response = response
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this variable be called something like retry_status? It doesn't seem to represent the response anymore after being mapped which confused me initially.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, I'll rename.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks good! Are there tests for this behavior?

Not yet. I'll see about adding that to #3802

I reran the concurrency tests and it seems to be handling 429s and 503s well now.

Yay! One step closer. I will say the throughput graphs look a bit ugly, but eyeballing the average looks like it's about right. It would be good to get some numbers on the average for those two, as they still have some surprisingly deep dips.

Bruce Guenter added 2 commits September 10, 2020 13:54
@bruceg bruceg merged commit 74b43b7 into master Sep 10, 2020
@bruceg bruceg deleted the auto-concurrency-backpressure branch September 10, 2020 20:05
@jszwedko
Copy link
Member

Yay! One step closer. I will say the throughput graphs look a bit ugly, but eyeballing the average looks like it's about right. It would be good to get some numbers on the average for those two, as they still have some surprisingly deep dips.

Agreed, I opened vectordotdev/http_test_server#5 to see about adding that.

@jamtur01 @binarylogic Is it ok to put those issues over there? Or should I open them in the vector repo.

@jamtur01
Copy link
Contributor

Over there seems fine.

JeanMertz added a commit that referenced this pull request Sep 14, 2020
commit 2b08b6d
Author: Nell Shamrell-Harrington <[email protected]>
Date:   Sun Sep 13 23:10:18 2020 -0700

    docs: Updates reference to gitter to point to the Discord server (#3858)

    Signed-off-by: Nell Shamrell <[email protected]>

commit 27ba8b9
Author: James Turnbull <[email protected]>
Date:   Sat Sep 12 17:26:13 2020 -0400

    chore: Removing CI references to Nix (#3844)

    Signed-off-by: James Turnbull <[email protected]>

commit 7c95aa9
Author: James Turnbull <[email protected]>
Date:   Sat Sep 12 17:25:54 2020 -0400

    chore: Fix make prepare-environment (#3841)

commit cc00cf0
Author: James Turnbull <[email protected]>
Date:   Sat Sep 12 13:38:39 2020 -0400

    chore: Added workflow_dispatch for the env workflow (#3842)

commit b7f656e
Author: James Turnbull <[email protected]>
Date:   Sat Sep 12 12:14:21 2020 -0400

    chore: Fix Debian post-install process to reload systemd daemons (#3832)

commit c2078b7
Author: binarylogic <[email protected]>
Date:   Sat Sep 12 10:28:41 2020 -0400

    chore: Remove check-code.sh since it is no longer used

    Signed-off-by: binarylogic <[email protected]>

commit 7887d4e
Author: James Turnbull <[email protected]>
Date:   Fri Sep 11 23:50:56 2020 -0400

    chore: Added cross installation to fix nightly builds (#3831)

commit 648e82c
Author: Ana Hobden <[email protected]>
Date:   Fri Sep 11 17:33:29 2020 -0700

    chore(platforms): Migrate  x86 gnu target to cross, add new aarch64 gnu target (no sasl) (#3657)

    * Add initial cross adoption

    Signed-off-by: Ana Hobden <[email protected]>

    * Add archiving

    Signed-off-by: Ana Hobden <[email protected]>

    * Clean up the makefile a bit and get tarballs working

    Signed-off-by: Ana Hobden <[email protected]>

    * checkpoint

    Signed-off-by: Ana Hobden <[email protected]>

    * wip

    Signed-off-by: Ana Hobden <[email protected]>

    * Add to CI

    Signed-off-by: Ana Hobden <[email protected]>

    * Add autoinstall to CI

    Signed-off-by: Ana Hobden <[email protected]>

    * Remove cruft

    Signed-off-by: Ana Hobden <[email protected]>

    * Can't test yet

    Signed-off-by: Ana Hobden <[email protected]>

    * Autoinstall in e2e

    Signed-off-by: Ana Hobden <[email protected]>

    * Readd docker image

    Signed-off-by: Ana Hobden <[email protected]>

    * More comments

    Signed-off-by: Ana Hobden <[email protected]>

    * Set right CI task

    Signed-off-by: Ana Hobden <[email protected]>

    * Add note about TZ issue

    Signed-off-by: Ana Hobden <[email protected]>

commit b28eb18
Author: Bruce Guenter <[email protected]>
Date:   Fri Sep 11 16:22:58 2020 -0600

    chore(networking): Rework internal auto concurrency tests (#3802)

    * Move auto concurrency test case data into TOML files

    * Use tokio manual time tracking to speed up tests

    By pausing and then manually advancing the clock, this provides for both
    very fast running tests and nearly perfect time measurements for highly
    repeatable test results.

    * Collect all the failures for each test

    This has the collateral benefit of being able to drop the large
    `assert_within` macro that was only used here.

    * The tests should now work predictably on MacOS

    Signed-off-by: Bruce Guenter <[email protected]>

commit b37dea7
Author: Ana Hobden <[email protected]>
Date:   Fri Sep 11 14:53:14 2020 -0700

    chore: Add some additional Event impls (#3621)

    * Move log_event to new file

    Signed-off-by: Ana Hobden <[email protected]>

    * Move value to new file

    Signed-off-by: Ana Hobden <[email protected]>

    * Move log_schema to config

    Signed-off-by: Ana Hobden <[email protected]>

    * chore: Add some event impls

    Signed-off-by: Ana Hobden <[email protected]>

    * Do some namespace squashing

    Signed-off-by: Ana Hobden <[email protected]>

    * Fixup lints

    Signed-off-by: Ana Hobden <[email protected]>

    * Remove From<Event> for Vec<u8>

    Signed-off-by: Ana Hobden <[email protected]>

    * Fixup leveldb

    Signed-off-by: Ana Hobden <[email protected]>

    * fmt

    Signed-off-by: Ana Hobden <[email protected]>

    * Jean's feedback

    Signed-off-by: Ana Hobden <[email protected]>

commit f03898e
Author: Luke Steensen <[email protected]>
Date:   Fri Sep 11 15:53:51 2020 -0500

    enhancement(tokenizer transform): add internal events (#3807)

    Signed-off-by: Luke Steensen <[email protected]>

commit ff555aa
Author: Luke Steensen <[email protected]>
Date:   Fri Sep 11 15:53:27 2020 -0500

    enhancement(dedupe transform): add internal events (#3809)

    Closes #3393

    Signed-off-by: Luke Steensen <[email protected]>
    Co-authored-by: Jean Mertz <[email protected]>

commit 54a4d8e
Author: MOZGIII <[email protected]>
Date:   Fri Sep 11 17:09:57 2020 +0300

    chore: Generate Kubernetes distribution YAMLs from Helm Chart (#3614)

    * Add a script to manage the Kubernetes YAML configs

    Signed-off-by: MOZGIII <[email protected]>

    * Update Helm Chart to generate config more like YAML files

    Signed-off-by: MOZGIII <[email protected]>

    * Switch to generated YAML

    Signed-off-by: MOZGIII <[email protected]>

    * Define check-all target in multiple lines

    Signed-off-by: MOZGIII <[email protected]>

    * Add Kubernetes YAML tasks to Makefile

    Signed-off-by: MOZGIII <[email protected]>

    * Add check-kubernetes-yaml to CI

    Signed-off-by: MOZGIII <[email protected]>

    * Add kubernetes-yaml.sh to .github/CODEOWNERS

    Signed-off-by: MOZGIII <[email protected]>

    * Add /scripts/kubernetes-yaml/ to .github/CODEOWNERS

    Signed-off-by: MOZGIII <[email protected]>

    * Add a TODO for passing the app-version

    Signed-off-by: MOZGIII <[email protected]>

commit 68a5c43
Author: Do Duy <[email protected]>
Date:   Fri Sep 11 10:23:03 2020 +0700

    docs: Add docs for retry_backoff_secs option (#3819)

    Signed-off-by: Duy Do <[email protected]>

commit 1a1352b
Author: James Turnbull <[email protected]>
Date:   Thu Sep 10 17:34:16 2020 -0400

    chore: Remove docker-compose from integration tests (#3622)

commit 94def0f
Author: Luke Steensen <[email protected]>
Date:   Thu Sep 10 16:17:03 2020 -0500

    enhancement(reduce transform): add internal events (#3812)

    Closes #3401

    Signed-off-by: Luke Steensen <[email protected]>

commit 74b43b7
Author: Bruce Guenter <[email protected]>
Date:   Thu Sep 10 14:05:52 2020 -0600

    fix(networking): Rework auto concurrency backpressure logic (#3804)

    Previously, all responses were marked as not being backpressure. This
    changes that logic to:

    1. Treat RetryAction::Retry responses as backpressure

    2. Only use the RTT values from RetryAction::Successful responses

    Signed-off-by: Bruce Guenter <[email protected]>

commit eb09b92
Author: Ana Hobden <[email protected]>
Date:   Thu Sep 10 13:01:23 2020 -0700

    chore: Split up event (#3619)

    * Move log_event to new file

    Signed-off-by: Ana Hobden <[email protected]>

    * Move value to new file

    Signed-off-by: Ana Hobden <[email protected]>

    * Move log_schema to config

    Signed-off-by: Ana Hobden <[email protected]>

    * Do some namespace squashing

    Signed-off-by: Ana Hobden <[email protected]>

    * Fixup lints

    Signed-off-by: Ana Hobden <[email protected]>

    * Fixup leveldb

    Signed-off-by: Ana Hobden <[email protected]>

    * fmt

    Signed-off-by: Ana Hobden <[email protected]>

commit 2fd9ccf
Author: Jesse Szwedko <[email protected]>
Date:   Thu Sep 10 15:34:52 2020 -0400

    chore: Add expected review counts to CONTRIBUTING.md (#3666)

    * chore: Add expected review counts to CONTRIBUTING.md

    Signed-off-by: Jesse Szwedko <[email protected]>
mengesb pushed a commit to jacobbraaten/vector that referenced this pull request Dec 9, 2020
…tdev#3804)

Previously, all responses were marked as not being backpressure. This
changes that logic to:

1. Treat RetryAction::Retry responses as backpressure

2. Only use the RTT values from RetryAction::Successful responses

Signed-off-by: Bruce Guenter <[email protected]>
Signed-off-by: Brian Menges <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
domain: networking Anything related to Vector's networking domain: performance Anything related to Vector's performance type: bug A code related bug.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Handle explicit backpressure in auto-concurrency layer
3 participants