Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix Blackhole implementation for e2e tests #17938

Conversation

henrybear327
Copy link
Contributor

@henrybear327 henrybear327 commented May 5, 2024

Based on Fu Wei's idea discussed in the issue [1], we employ the network traffic blocking on L7, using a forward proxy, without the need to use external tools.

[Background]

A peer will
(a) receive traffic from its peers
(b) initiate connections to its peers (via stream and pipeline).

Thus, the current mechanism of only blocking peer traffic via the peer's existing reverse proxy is insufficient, since only scenario (a) is handled, and scenario (b) is not blocked at all.

[Proposed solution]

We introduce a forward proxy for each peer, which will be proxying all the connections initiated from a peer to its peers.

We will remove the current use of the reverse proxy, as the forward proxy holds the information of the destination, we can block all in and out traffic that is initiated from a peer to others, without having to resort to external tools, such as iptables.

The modified architecture will look something like this:

A --- A's forward proxy ----- B
   ^ newly introduced

It's verified that the blocking of traffic is complete, compared to previous solutions attempted in PRs [2][3].

[Implementation]

The main subtasks are

  • redesigned as an L7 forward proxy
  • Unix socket support is dropped: e2e test supports unix sockets for peer communication, but only several e2e test cases use Unix sockets as majority of e2e test cases use HTTP/HTTPS
  • introduce a new environment variable E2E_TEST_FORWARD_PROXY_IP
  • implement L7 forward proxy by drastically reducing the existing proxy server code and design to use blocking traffic

Known limitations are

  • Doesn't support unix socket (L7 HTTP transport proxy only supports HTTP/HTTPS/and socks5)
  • It's L7 so we need to send a perfectly crafted HTTP request
    -Doesn’t support reordering, dropping, etc. packets on-the-fly

[Testing]

  • make gofail-enable && make build && make gofail-disable && go test -timeout 60s -run ^TestBlackholeByMockingPartitionLeader$ go.etcd.io/etcd/tests/v3/e2e -v -count=1
  • make gofail-enable && make build && make gofail-disable && go test -timeout 60s -run ^TestBlackholeByMockingPartitionFollower$ go.etcd.io/etcd/tests/v3/e2e -v -count=1
  • go test -timeout 30s -run ^TestServer_ go.etcd.io/etcd/pkg/v3/proxy -v -failfast

[References]
[1] issue #17737
[2] PR (V1) https://github.com/henrybear327/etcd/tree/fix/e2e_blackhole
[3] PR (V2) #17891
[4] #17938 (comment)
[5] #17985 (comment)

@k8s-ci-robot
Copy link

Hi @henrybear327. Thanks for your PR.

I'm waiting for a etcd-io member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@henrybear327 henrybear327 force-pushed the fix/e2e_blackhole_proxy_review_forward_proxy_review branch from 9887b13 to 3e15f96 Compare May 5, 2024 15:12
@henrybear327
Copy link
Contributor Author

/cc @siyuanfoundation @fuweid

@henrybear327 henrybear327 force-pushed the fix/e2e_blackhole_proxy_review_forward_proxy_review branch 6 times, most recently from e94e96d to 48d1936 Compare May 5, 2024 15:44
@ivanvc
Copy link
Member

ivanvc commented May 5, 2024

/retitle Fix Blackhole implementation for e2e tests

@k8s-ci-robot k8s-ci-robot changed the title Fix Blockhole implementation for e2e tests Fix Blackhole implementation for e2e tests May 5, 2024
pkg/proxy/server.go Outdated Show resolved Hide resolved
pkg/proxy/server.go Outdated Show resolved Hide resolved
@henrybear327 henrybear327 force-pushed the fix/e2e_blackhole_proxy_review_forward_proxy_review branch from 48d1936 to 6158507 Compare May 6, 2024 05:16
pkg/proxy/server.go Outdated Show resolved Hide resolved
@serathius
Copy link
Member

serathius commented May 6, 2024

Not a networking expert, would like to get some help.

cc @aojea
Can you help review this?

@henrybear327 henrybear327 force-pushed the fix/e2e_blackhole_proxy_review_forward_proxy_review branch 3 times, most recently from 763b0d3 to 465b00c Compare May 6, 2024 16:09
henrybear327 added a commit to henrybear327/etcd that referenced this pull request May 6, 2024
Thanks to Fu Wei for the input regarding etcd-io#17938.

[Problem]

A peer will
(a) receive traffic from its peers
(b) initiate connections to its peers (via stream and pipeline).

Thus, the current mechanism of only blocking peer traffic via the peer's existing proxy is insufficient, since only scenario (a) is handled, and scenario (b) is not blocked at all.

[Main idea]

We introduce 1 shared "HTTP proxy" for all peers. All peers will be proxying all the connections through it.

The modified architecture will look something like this:
```
A -- shared HTTP proxy ----- B
     ^ newly introduced
```

By adding this HTTP proxy, we can block all in and out traffic that is initiated from a peer to others, without having to resort to external tools, such as iptables. It's verified that the blocking of traffic is complete, compared to previous solutions [2][3].

[Implementation]

The main subtasks are
- set up an environment variable `FORWARD_PROXY`, because go will not parse HTTP_PROXY and HTTPS_PROXY that is using localhost or 127.0.0.1, regardless if the port is present or not
- implement the shared HTTP proxy by extending the existing proxy server code
- remove existing proxy setup (the per-peer proxy)
- implement enable/disable of the HTTP proxy in the e2e test

[Testing]

- `make gofail-enable && make build && make gofail-disable && \
go test -timeout 60s -run ^TestBlackholeByMockingPartitionLeader$ go.etcd.io/etcd/tests/v3/e2e -v -count=1`
- `make gofail-enable && make build && make gofail-disable && \
go test -timeout 60s -run ^TestBlackholeByMockingPartitionFollower$ go.etcd.io/etcd/tests/v3/e2e -v -count=1`

[References]
[1] Tracking issue etcd-io#17737
[2] PR (V1) https://github.com/henrybear327/etcd/tree/fix/e2e_blackhole
[3] PR (V2) etcd-io#17891
[4] PR (V3) etcd-io#17938

Signed-off-by: Chun-Hung Tseng <[email protected]>
henrybear327 added a commit to henrybear327/etcd that referenced this pull request May 6, 2024
Thanks to Fu Wei for the input regarding etcd-io#17938.

[Problem]

A peer will
(a) receive traffic from its peers
(b) initiate connections to its peers (via stream and pipeline).

Thus, the current mechanism of only blocking peer traffic via the peer's existing proxy is insufficient, since only scenario (a) is handled, and scenario (b) is not blocked at all.

[Main idea]

We introduce 1 shared HTTP proxy for all peers. All peers will be proxying all the connections through it.

The modified architecture will look something like this:
```
A -- shared HTTP proxy ----- B
     ^ newly introduced
```

By adding this HTTP proxy, we can block all in and out traffic that is initiated from a peer to others, without having to resort to external tools, such as iptables. It's verified that the blocking of traffic is complete, compared to previous solutions [2][3].

[Implementation]

The main subtasks are
- set up an environment variable `FORWARD_PROXY`, because go will not parse HTTP_PROXY and HTTPS_PROXY that is using localhost or 127.0.0.1, regardless if the port is present or not
- implement the shared HTTP proxy by extending the existing proxy server code (we need to be able to identify the source sender)
- remove existing proxy setup (the per-peer proxy)
- implement enable/disable of the HTTP proxy in the e2e test

[Testing]

- `make gofail-enable && make build && make gofail-disable && \
go test -timeout 60s -run ^TestBlackholeByMockingPartitionLeader$ go.etcd.io/etcd/tests/v3/e2e -v -count=1`
- `make gofail-enable && make build && make gofail-disable && \
go test -timeout 60s -run ^TestBlackholeByMockingPartitionFollower$ go.etcd.io/etcd/tests/v3/e2e -v -count=1`

[References]
[1] Tracking issue etcd-io#17737
[2] PR (V1) https://github.com/henrybear327/etcd/tree/fix/e2e_blackhole
[3] PR (V2) etcd-io#17891
[4] PR (V3) etcd-io#17938

Signed-off-by: Chun-Hung Tseng <[email protected]>
pkg/proxy/server.go Outdated Show resolved Hide resolved
@serathius
Copy link
Member

Talked with @aojea, proposed L7 forward proxy is an improvement over the previous faulty approach, however it might not be necessarily what we want. We want to simulate a realistic network disruptions, and there is no way to do that using a L7 proxy. This is currently impossible with current setup where etcd processes communicate via localhost, to modify packets we need them to go through kernel network stack, and that's not possible if they share the same network stack.

Unfortunately setting dedicated network stack this is pretty complicated, @aojea suggest using docker which sets it up for each container, however I'm worried about how much using docker will complicate etcd e2e testing. I would prefer to set up network ourselves, however this makes it less portable. Main benefit of using docker is portability across platforms.

As we won't be able to build such solution anytime soon, I'm ok with still using L7 http proxy in the meantime.

Signed-off-by: Chun-Hung Tseng <[email protected]>
@henrybear327 henrybear327 force-pushed the fix/e2e_blackhole_proxy_review_forward_proxy_review branch from 90c71a5 to b97050b Compare September 25, 2024 14:25
@henrybear327
Copy link
Contributor Author

/retest

@henrybear327 henrybear327 marked this pull request as ready for review September 25, 2024 14:37
@henrybear327
Copy link
Contributor Author

henrybear327 commented Sep 25, 2024

Hey @fuweid @serathius @ahrtr @ivanvc, sorry for the long wait... but this PR is finally in shape for another pass! :)

Sorry that it's a bit hard to break it down into smaller chunks, but if it's absolutely a must, I will think hard about it. :(

Thank you!

@henrybear327
Copy link
Contributor Author

/retest

@serathius
Copy link
Member

This PR is too big to be reviewed in the current state, I see you have already nicely separate the commits, maybe we could take those commits and merge them independently. For example:

  • Remove pose
  • Remove packet modification
  • Improve naming
  • Remove latency accept

Removal of unused L4 proxy features seems ok to be done even before we fix the blackhole.

Please feel free to create a PR for each commit and assign them for me to review. It should be very quick to merge.

@henrybear327
Copy link
Contributor Author

This PR is too big to be reviewed in the current state, I see you have already nicely separate the commits, maybe we could take those commits and merge them independently. For example:

  • Remove pose
  • Remove packet modification
  • Improve naming
  • Remove latency accept

Removal of unused L4 proxy features seems ok to be done even before we fix the blackhole.

Please feel free to create a PR for each commit and assign them for me to review. It should be very quick to merge.

Thanks Marek for the ideas! :)

I will do that tonight!

henrybear327 added a commit to henrybear327/etcd that referenced this pull request Sep 25, 2024
Part of the patches to fix etcd-io#17737

During the development of etcd-io#17938,
we agreed that during the transition to L7 forward proxy, unused
features and features targeting L4 reverse proxy will be dropped.
henrybear327 added a commit to henrybear327/etcd that referenced this pull request Sep 25, 2024
Part of the patches to fix etcd-io#17737

During the development of etcd-io#17938,
we agreed that during the transition to L7 forward proxy, unused
features and features targeting L4 reverse proxy will be dropped.

Signed-off-by: Chun-Hung Tseng <[email protected]>
henrybear327 added a commit to henrybear327/etcd that referenced this pull request Sep 25, 2024
Part of the patches to fix etcd-io#17737

During the development of etcd-io#17938,
we agreed that during the transition to L7 forward proxy, unused
features and features targeting L4 reverse proxy will be dropped.

This feature falls under the unused feature.

Signed-off-by: Chun-Hung Tseng <[email protected]>
henrybear327 added a commit to henrybear327/etcd that referenced this pull request Sep 25, 2024
Part of the patches to fix etcd-io#17737

During the development of etcd-io#17938,
we agreed that during the transition to L7 forward proxy, unused
features and features targeting L4 reverse proxy will be dropped.

This feature falls under the unused feature.

Signed-off-by: Chun-Hung Tseng <[email protected]>
henrybear327 added a commit to henrybear327/etcd that referenced this pull request Sep 25, 2024
Part of the patches to fix etcd-io#17737

During the development of etcd-io#17938,
we agreed that during the transition to L7 forward proxy, unused
features and features targeting L4 reverse proxy will be dropped.

This feature falls under the unused feature.

Signed-off-by: Chun-Hung Tseng <[email protected]>
henrybear327 added a commit to henrybear327/etcd that referenced this pull request Sep 25, 2024
Part of the patches to fix etcd-io#17737

During the development of etcd-io#17938,
we agreed that during the transition to L7 forward proxy, unused
features and features targeting L4 reverse proxy will be dropped.

This feature falls under the unused feature. Also, the initial
implementation has a bug: if connections are not created continuously,
the latency accept will not work. Consider the following case:
a) set latency accept
b) put latency accept into effect
c) latency accept will start idling the goroutine
d) block-wait at accept() - waiting for new connections
e) new connection comes in - establish it
f) go to c -> as we can see, if the request come every x seconds, where
x is larger than the latency accept time we set, we can see that the
latency accept has no effect.

Signed-off-by: Chun-Hung Tseng <[email protected]>
henrybear327 added a commit to henrybear327/etcd that referenced this pull request Sep 25, 2024
Part of the patches to fix etcd-io#17737

During the development of etcd-io#17938,
we agreed that during the transition to L7 forward proxy, unused
features and features targeting L4 reverse proxy will be dropped.

This feature falls under the features targeting L4 reverse proxy.

Signed-off-by: Chun-Hung Tseng <[email protected]>
henrybear327 added a commit to henrybear327/etcd that referenced this pull request Sep 25, 2024
Part of the patches to fix etcd-io#17737

During the development of etcd-io#17938,
we agreed that during the transition to L7 forward proxy, unused
features and features targeting L4 reverse proxy will be dropped.

This feature falls under the unused feature.

Signed-off-by: Chun-Hung Tseng <[email protected]>
henrybear327 added a commit to henrybear327/etcd that referenced this pull request Sep 25, 2024
Based on the ideas discussed in the issues [1] and PRs [2][3][6],
we switch from using a L4 reverse proxy to a L7 forward proxy
to properly block peer network traffic, without the need to use
external tools.

The design aims to implement only the minimal required features that
satisfies blocking incoming and outgoing peer traffic. Complicated
features such as packet reordering, packet delivery delay, etc. to
a future container-based solution.

[Background]

A peer will
(a) receive traffic from its peers
(b) initiate connections to its peers (via stream and pipeline).

Thus, the current mechanism of only blocking peer traffic via the peer's
existing reverse proxy is insufficient, since only scenario (a) is
handled, and network traffic in scenario (b) is not blocked at all.

[Proposed solution]

We introduce a L7 forward proxy for each peer, which will be proxying
all the connections initiated from a peer to its peers.

We will remove the current use of the L4 reverse proxy, as the L7
forward proxy holds the information of the destination, we can block
all incoming and outgoing traffic that is initiated from a peer to
others, without having to resort to external tools, such as iptables.

The modified architecture will look something like this:
```
A --- A's forward proxy --- B
   ^ newly introduced
```

[Implementation]

The main subtasks are
- redesigned as an L7 forward proxy
- introduce a new environment variable `E2E_TEST_FORWARD_PROXY_IP` to
bypass the limitation of with http.ProxyFromEnvironment
- implement a L7 forward proxy

Known limitations are
- Doesn't support unix socket, as L7 HTTP transport proxy only support
HTTP/HTTPS/and socks5 -> although e2e test supports unix sockets for
peer communication, but only few of the e2e test cases use unix sockets
as majority of e2e test cases use HTTP/HTTPS. It's been discussed and
decided that without the unix socket support is ok for now.
- it's L7 so we need to send a perfectly crafted HTTP request
- doesn’t support reordering, dropping, manipulating packets on-the-fly

[Testing]
- `make gofail-enable && make build && make gofail-disable && go test -timeout 60s -run ^TestBlackholeByMockingPartitionLeader$ go.etcd.io/etcd/tests/v3/e2e -v -count=1`
- `make gofail-enable && make build && make gofail-disable && go test -timeout 60s -run ^TestBlackholeByMockingPartitionFollower$ go.etcd.io/etcd/tests/v3/e2e -v -count=1`
- `go test -timeout 30s -run ^TestServer_ go.etcd.io/etcd/pkg/v3/proxy -v -failfast`

[References]
[1] issue etcd-io#17737
[2] PR (V1) https://github.com/henrybear327/etcd/tree/fix/e2e_blackhole
[3] PR (V2) etcd-io#17891
[4] etcd-io#17938 (comment)
[5] etcd-io#17985 (comment)
[6] etcd-io#17938

Signed-off-by: Siyuan Zhang <[email protected]>
Signed-off-by: Iván Valdés Castillo <[email protected]>
Signed-off-by: Chun-Hung Tseng <[email protected]>
henrybear327 added a commit to henrybear327/etcd that referenced this pull request Sep 25, 2024
Based on the ideas discussed in the issues [1] and PRs [2][3][6], we
switch from using an L4 reverse proxy to an L7 forward proxy to properly
block peer network traffic, without the need to use external tools.

The design aims to implement only the minimal required features that
satisfy blocking incoming and outgoing peer traffic. Complicated features
such as packet reordering, packet delivery delay, etc. to a future
container-based solution.

A peer will
(a) receive traffic from its peers
(b) initiate connections to its peers (via stream and pipeline).

Thus, the current mechanism of only blocking peer traffic via the peer's
existing reverse proxy is insufficient, since only scenario (a) is
handled, and network traffic in scenario (b) is not blocked at all.

We introduce an L7 forward proxy for each peer, which will be proxying
all the connections initiated from a peer to its peers.

We will remove the current use of the L4 reverse proxy, as the L7
forward proxy holds the information of the destination, we can block all
incoming and outgoing traffic that is initiated from a peer to others,
without having to resort to external tools, such as iptables.

The modified architecture will look something like this:
```
A --- A's forward proxy --- B
   ^ newly introduced
```

The main subtasks are
- redesigned as an L7 forward proxy
- introduce a new environment variable `E2E_TEST_FORWARD_PROXY_IP` to
bypass the limitation of `http.ProxyFromEnvironment`
- implement an L7 forward proxy

Known limitations are
- Doesn't support unix socket, as L7 HTTP transport proxy only supports
HTTP/HTTPS/and socks5 -> Although the e2e test supports unix sockets for
peer communication, only a few of the e2e test cases use unix sockets as
the majority of e2e test cases use HTTP/HTTPS. It's been discussed and
decided that it is ok for now without the unix socket support.
- it's L7 so we need to send a perfectly crafted HTTP request
- doesn’t support reordering, dropping, or manipulating packets on the
fly

- `make gofail-enable && make build && make gofail-disable && go test -timeout 60s -run ^TestBlackholeByMockingPartitionLeader$ go.etcd.io/etcd/tests/v3/e2e -v -count=1`
- `make gofail-enable && make build && make gofail-disable && go test -timeout 60s -run ^TestBlackholeByMockingPartitionFollower$ go.etcd.io/etcd/tests/v3/e2e -v -count=1`
- `go test -timeout 30s -run ^TestServer_ go.etcd.io/etcd/pkg/v3/proxy -v -failfast`

[1] issue etcd-io#17737
[2] PR (V1) https://github.com/henrybear327/etcd/tree/fix/e2e_blackhole
[3] PR (V2) etcd-io#17891
[4] etcd-io#17938 (comment)
[5] etcd-io#17985 (comment)
[6] etcd-io#17938

Please read https://github.com/etcd-io/etcd/blob/main/CONTRIBUTING.md#contribution-flow.

Signed-off-by: Siyuan Zhang <[email protected]>
Signed-off-by: Iván Valdés Castillo <[email protected]>
Signed-off-by: Chun-Hung Tseng <[email protected]>
henrybear327 added a commit to henrybear327/etcd that referenced this pull request Sep 25, 2024
Based on the ideas discussed in the issues [1] and PRs [2][3][6], it's
been decided to switch from using an L4 reverse proxy to an L7 forward
proxy to properly block peer network traffic, without the need to use
external tools.

The design aims to implement only the minimal required features that
satisfy blocking incoming and outgoing peer traffic. Complicated features
such as packet reordering, packet delivery delay, etc. to a future
container-based solution.

A peer will
(a) receive traffic from its peers
(b) initiate connections to its peers (via stream and pipeline).

Thus, the current mechanism of only blocking peer traffic via the peer's
existing reverse proxy is insufficient, since only scenario (a) is
handled, and network traffic in scenario (b) is not blocked at all.

We introduce an L7 forward proxy for each peer, which will be proxying
all the connections initiated from a peer to its peers.

We will remove the current use of the L4 reverse proxy, as the L7
forward proxy holds the information of the destination, we can block all
incoming and outgoing traffic that is initiated from a peer to others,
without having to resort to external tools, such as iptables.

The modified architecture will look something like this:
```
A --- A's forward proxy --- B
   ^ newly introduced
```

The main subtasks are
- redesigned as an L7 forward proxy
- introduce a new environment variable `E2E_TEST_FORWARD_PROXY_IP` to
bypass the limitation of `http.ProxyFromEnvironment`
- implement an L7 forward proxy

Known limitations are
- Doesn't support unix socket, as L7 HTTP transport proxy only supports
HTTP/HTTPS/and socks5 -> Although the e2e test supports unix sockets for
peer communication, only a few of the e2e test cases use unix sockets as
the majority of e2e test cases use HTTP/HTTPS. It's been discussed and
decided that it is ok for now without the unix socket support.
- it's L7 so we need to send a perfectly crafted HTTP request
- doesn’t support reordering, dropping, or manipulating packets on the
fly

- `make gofail-enable && make build && make gofail-disable && go test -timeout 60s -run ^TestBlackholeByMockingPartitionLeader$ go.etcd.io/etcd/tests/v3/e2e -v -count=1`
- `make gofail-enable && make build && make gofail-disable && go test -timeout 60s -run ^TestBlackholeByMockingPartitionFollower$ go.etcd.io/etcd/tests/v3/e2e -v -count=1`
- `go test -timeout 30s -run ^TestServer_ go.etcd.io/etcd/pkg/v3/proxy -v -failfast`

[1] issue etcd-io#17737
[2] PR (V1) https://github.com/henrybear327/etcd/tree/fix/e2e_blackhole
[3] PR (V2) etcd-io#17891
[4] etcd-io#17938 (comment)
[5] etcd-io#17985 (comment)
[6] etcd-io#17938

Please read https://github.com/etcd-io/etcd/blob/main/CONTRIBUTING.md#contribution-flow.

Signed-off-by: Siyuan Zhang <[email protected]>
Signed-off-by: Iván Valdés Castillo <[email protected]>
Signed-off-by: Chun-Hung Tseng <[email protected]>
henrybear327 added a commit to henrybear327/etcd that referenced this pull request Sep 26, 2024
Part of the patches to fix etcd-io#17737

During the development of etcd-io#17938,
we agreed that during the transition to L7 forward proxy, unused
features and features targeting L4 reverse proxy will be dropped.

This feature falls under the unused feature.

Signed-off-by: Chun-Hung Tseng <[email protected]>
henrybear327 added a commit to henrybear327/etcd that referenced this pull request Sep 26, 2024
Part of the patches to fix etcd-io#17737

During the development of etcd-io#17938,
we agreed that during the transition to L7 forward proxy, unused
features and features targeting L4 reverse proxy will be dropped.

This feature falls under the unused feature.

Signed-off-by: Chun-Hung Tseng <[email protected]>
henrybear327 added a commit to henrybear327/etcd that referenced this pull request Sep 26, 2024
Part of the patches to fix etcd-io#17737

During the development of etcd-io#17938,
we agreed that during the transition to L7 forward proxy, unused
features and features targeting L4 reverse proxy will be dropped.

This feature falls under the unused feature. Also, the initial
implementation has a bug: if connections are not created continuously,
the latency accept will not work. Consider the following case:
a) set latency accept
b) put latency accept into effect
c) latency accept will start idling the goroutine
d) block-wait at accept() - waiting for new connections
e) new connection comes in - establish it
f) go to c -> as we can see, if the request come every x seconds, where
x is larger than the latency accept time we set, we can see that the
latency accept has no effect.

Signed-off-by: Chun-Hung Tseng <[email protected]>
henrybear327 added a commit to henrybear327/etcd that referenced this pull request Sep 26, 2024
Part of the patches to fix etcd-io#17737

During the development of etcd-io#17938,
we agreed that during the transition to L7 forward proxy, unused
features and features targeting L4 reverse proxy will be dropped.

This feature falls under the unused feature.

Signed-off-by: Chun-Hung Tseng <[email protected]>
henrybear327 added a commit to henrybear327/etcd that referenced this pull request Sep 26, 2024
Based on the ideas discussed in the issues [1] and PRs [2][3][6], it's
been decided to switch from using an L4 reverse proxy to an L7 forward
proxy to properly block peer network traffic, without the need to use
external tools.

The design aims to implement only the minimal required features that
satisfy blocking incoming and outgoing peer traffic. Complicated features
such as packet reordering, packet delivery delay, etc. to a future
container-based solution.

A peer will
(a) receive traffic from its peers
(b) initiate connections to its peers (via stream and pipeline).

Thus, the current mechanism of only blocking peer traffic via the peer's
existing reverse proxy is insufficient, since only scenario (a) is
handled, and network traffic in scenario (b) is not blocked at all.

We introduce an L7 forward proxy for each peer, which will be proxying
all the connections initiated from a peer to its peers.

We will remove the current use of the L4 reverse proxy, as the L7
forward proxy holds the information of the destination, we can block all
incoming and outgoing traffic that is initiated from a peer to others,
without having to resort to external tools, such as iptables.

The modified architecture will look something like this:
```
A --- A's forward proxy --- B
   ^ newly introduced
```

The main subtasks are
- redesigned as an L7 forward proxy
- introduce a new environment variable `E2E_TEST_FORWARD_PROXY_IP` to
bypass the limitation of `http.ProxyFromEnvironment`
- implement an L7 forward proxy

Known limitations are
- Doesn't support unix socket, as L7 HTTP transport proxy only supports
HTTP/HTTPS/and socks5 -> Although the e2e test supports unix sockets for
peer communication, only a few of the e2e test cases use unix sockets as
the majority of e2e test cases use HTTP/HTTPS. It's been discussed and
decided that it is ok for now without the unix socket support.
- it's L7 so we need to send a perfectly crafted HTTP request
- doesn’t support reordering, dropping, or manipulating packets on the
fly

- `make gofail-enable && make build && make gofail-disable && go test -timeout 60s -run ^TestBlackholeByMockingPartitionLeader$ go.etcd.io/etcd/tests/v3/e2e -v -count=1`
- `make gofail-enable && make build && make gofail-disable && go test -timeout 60s -run ^TestBlackholeByMockingPartitionFollower$ go.etcd.io/etcd/tests/v3/e2e -v -count=1`
- `go test -timeout 30s -run ^TestServer_ go.etcd.io/etcd/pkg/v3/proxy -v -failfast`

[1] issue etcd-io#17737
[2] PR (V1) https://github.com/henrybear327/etcd/tree/fix/e2e_blackhole
[3] PR (V2) etcd-io#17891
[4] etcd-io#17938 (comment)
[5] etcd-io#17985 (comment)
[6] etcd-io#17938

Please read https://github.com/etcd-io/etcd/blob/main/CONTRIBUTING.md#contribution-flow.

Signed-off-by: Siyuan Zhang <[email protected]>
Signed-off-by: Iván Valdés Castillo <[email protected]>
Signed-off-by: Chun-Hung Tseng <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Development

Successfully merging this pull request may close these issues.

8 participants