-
Notifications
You must be signed in to change notification settings - Fork 4
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix: block incoming endpoints from call-me-maybe #55
base: main
Are you sure you want to change the base?
Conversation
Previously, (*magicsock.Conn).SetBlockEndpoints(true) would only prevent local endpoints from being sent to other peers. This setting is primarily used to prevent any direct connection from forming, regardless of which side initiated it, but any connections initiated locally to endpoints received from the other peer would still work. If endpoints are blocked, we will now drop any endpoints we already know of (via call-me-maybe) as well as block any future endpoints received via call-me-maybe. Endpoints received via coordination are not impacted (and should be blocked using a different mechanism).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
I tested this on coder by setting |
ep.publicKey.ShortString(), derpStr(src.String()), | ||
len(dm.MyNumber), | ||
) | ||
dm.MyNumber = []netip.AddrPort{} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why are we even delivering the CallMeMaybe? We clear out any previously learned endpoints when we set blockEndpoints
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is for incoming call-me-maybe packets. If only one side has "--disable-direct-connections" then the other side will still send call-me-maybe packets populated with endpoints to the other peer, which is why they still connect directly if they are on the same LAN.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Clearing the endpoints once won't prevent them from being readded to the peer map when the peer sends us another call-me-maybe packet.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I realize this is for incoming call-me-maybe. I'm asking why we strip out the endpoints and then process the empty call-me-maybe, rather than just drop the call-me-maybe packets on the floor without further processing. What benefit do we get from sending an empty call-me-maybe thru handleCallMeMaybe()
?
wgengine/magicsock/magicsock.go
Outdated
if block { | ||
c.peerMap.forEachEndpoint(func(ep *endpoint) { | ||
// Simulate receiving a call-me-maybe disco packet with no | ||
// endpoints, which will cause all endpoints to be removed. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This doesn't actually remove all endpoints, just the ones previously discovered by CallMeMaybe. Is there a reason we don't want to just blow them all away?
I feel like we've got a lot of piecemeal tweaks that block endpoints from different sources. Wouldn't it be cleaner just to block them all at the level of de.endpointState
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this is fine. Magicsock's "block endpoints" just means "don't send local endpoints to peers via disco and don't accept endpoints from peers via disco". The comment on the option is written this way.
The only other way peers can arrive is if they're written via coordination, which is also written in the comment on the option (and already blocked in coder's tailnet package).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually, I might look into just wiping out all endpoints rather than only the call-me-maybe ones. It would be nice to get rid of all the blockEndpoints plumbing in the coder tailnet package in favor of a simple implementation.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
When: time.Now(), | ||
What: "clearEndpoints-" + why, | ||
}) | ||
de.endpointState = map[netip.AddrPort]*endpointState{} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I like this, but I think we should pair it with a lower-level implementation of blockEndpoints
Right now, we check this field on the magicsock Conn
before calling into methods on endpoint
that could add an endpoint. There are 3 of these (updateFromNode
, handleCallMeMaybe
and addCandidateEndpoint
), but you've only directly blocked the first two.
So, there is still a narrow race condition where you can set block endpoints, clear them all out, and a Disco Ping could come in and add an endpoint back in.
What do you think about extending endpoint
to also have a blockEndpoints
field that we sync with Conn
, that way we can always directly check whether we are blocking endpoints at the moment we are about to add one?
Previously, (*magicsock.Conn).SetBlockEndpoints(true) would only prevent local endpoints from being sent to other peers.
This setting is primarily used to prevent any direct connection from forming, regardless of which side initiated it, but any connections initiated locally to endpoints received from the other peer would still work.
If endpoints are blocked, we will now drop any endpoints we already know of (via call-me-maybe) as well as block any future endpoints received via call-me-maybe.
Endpoints received via coordination are not impacted (and should be blocked using a different mechanism).
Note; it was difficult to write a test that would fail with the previous behavior but pass with the current behavior because of the test utilities available in Tailscale, but I did get it to work. Notably, there's a single
time.Sleep(time.Second)
, but after trying various other ways to accomplish the same thing I couldn't get anything to work.