
Mirrord connection reconnect #3000

Open · wants to merge 30 commits into main

Conversation

@DmitryDodzin (Member) commented Jan 2, 2025:

Add reconnect option for operator to fix #2901

@Razz4780 Razz4780 self-requested a review January 21, 2025 14:20
@DmitryDodzin DmitryDodzin marked this pull request as ready for review January 23, 2025 14:49
@DmitryDodzin DmitryDodzin changed the title Update mirrord to agent port-forward connection wrapper Mirrord connection reconnect Jan 23, 2025
@Razz4780 Razz4780 requested a review from gememma January 27, 2025 10:37

@gememma (Member) left a comment:


Mostly questions for my own understanding, and a couple of bits and pieces :)

Resolved review threads on:
  - changelog.d/+update-single-portforward.changed.md
  - mirrord/intproxy/src/background_tasks.rs (3 threads)
  - mirrord/intproxy/src/lib.rs
  - mirrord/intproxy/src/main_tasks.rs
  - mirrord/kube/src/api/kubernetes/portforwarder.rs (2 threads)

@gememma (Member) left a comment:


noice 🦅


@Razz4780 (Contributor) left a comment:


Beyond what's in the comments:

  1. I think we need to check whether the operator supports reconnecting (a new operator feature). Otherwise things will get wild.
  2. FilesProxy is never informed about the reconnect, and it is very stateful (it queues expected responses and caches dir entries and file contents). This scares me 👀 The same applies to the ping pong task.
  3. Not that I want to force my idea, but I think we can do something simpler:
     1. Let the AgentConn task fail in peace.
     2. When IntProxy picks it up, it enters a resetting state. It notifies all relevant tasks with a new message variant (e.g. Reset). Then it makes a new agent connection and starts a new AgentConn task. For each task that was sent a Reset, IntProxy remembers that it expects one more ResetAck from it.
     3. When a task gets the Reset command, it immediately responds with ResetAck. Then it does all the necessary housekeeping: cleaning state, sending error responses to the layer (for requests in progress), resending port subscriptions to the agent, dropping local connections, and resetting ping pong state.
     4. IntProxy ignores all messages from a resettable task until it receives the expected count of ResetAcks.
     5. Because the layer stores remote file descriptors, we'd have to create an independent mapping in FilesProxy and make sure not to let through any requests for unknown remote fds.
     6. If it doesn't do so already, BackgroundTasks should remove the task's receiver from the stream map when the task exits (in order to drop all previous messages).

Let me know what you think about this.
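The Reset/ResetAck handshake proposed above could be sketched roughly as follows. This is a minimal synchronous sketch for illustration only: `ProxyToTask`, `TaskToProxy`, `ResettableTask`, and `ProxyState` are stand-in names, not the actual mirrord intproxy types, and the real implementation would run over the async message bus.

```rust
// Minimal sketch of the proposed Reset/ResetAck handshake. All names here
// are illustrative stand-ins, not the real mirrord intproxy types.

#[derive(Debug, PartialEq)]
enum ProxyToTask {
    Reset,
    Data(u32),
}

#[derive(Debug, PartialEq)]
enum TaskToProxy {
    ResetAck,
    Response(u32),
}

/// A background task that can be told to reset its state.
struct ResettableTask {
    pending: Vec<u32>,
}

impl ResettableTask {
    fn handle(&mut self, msg: ProxyToTask) -> Vec<TaskToProxy> {
        match msg {
            ProxyToTask::Reset => {
                // Housekeeping: drop queued state so no stale responses
                // survive the reconnect, then acknowledge.
                self.pending.clear();
                vec![TaskToProxy::ResetAck]
            }
            ProxyToTask::Data(v) => {
                self.pending.push(v);
                vec![TaskToProxy::Response(v)]
            }
        }
    }
}

/// The proxy side: while awaiting acks, every message from resettable
/// tasks is dropped (point 4 of the proposal above).
struct ProxyState {
    awaiting_acks: usize,
}

impl ProxyState {
    fn begin_reset(&mut self, task_count: usize) {
        self.awaiting_acks = task_count;
    }

    /// Returns true if the message should be delivered to normal handling.
    fn filter(&mut self, msg: &TaskToProxy) -> bool {
        if self.awaiting_acks == 0 {
            return true;
        }
        if *msg == TaskToProxy::ResetAck {
            self.awaiting_acks -= 1;
        }
        false
    }
}

fn main() {
    let mut task = ResettableTask { pending: vec![] };
    let mut proxy = ProxyState { awaiting_acks: 0 };

    task.handle(ProxyToTask::Data(1));
    proxy.begin_reset(1);

    let acks = task.handle(ProxyToTask::Reset);
    assert_eq!(acks, vec![TaskToProxy::ResetAck]);
    assert!(task.pending.is_empty());

    // The ack itself is consumed by the filter; later messages go through.
    assert!(!proxy.filter(&TaskToProxy::ResetAck));
    assert!(proxy.filter(&TaskToProxy::Response(2)));
    println!("reset handshake ok");
}
```

In the real codebase the filtering would presumably live where IntProxy drains the background tasks' receivers, which is also where point 6 (removing a dead task's receiver from the stream map) applies.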

Resolved review threads on:
  - mirrord/kube/src/api/kubernetes/portforwarder.rs
  - mirrord/intproxy/src/proxies/simple.rs
  - mirrord/intproxy/src/proxies/incoming/subscriptions.rs
  - mirrord/intproxy/src/proxies/incoming.rs
Comment on lines 545 to 551:

```rust
for subscription in self.subscriptions.iter_mut() {
    tracing::debug!(?subscription, "resubscribing");

    for message in subscription.resubscribe() {
        message_bus.send(ProxyMessage::ToAgent(message)).await
    }
}
```
A contributor commented:

As far as IncomingProxy is concerned, these subscriptions were already confirmed. This looks kind of sus; I'd check out the handling of DaemonTcp::PortSubscribe.

@DmitryDodzin (Member, Author) replied:

Well, this assumes that all agent-side connections are dead, so someone needs to ask the agent to perform the subscription again so that the user's app isn't required to re-listen on the incoming socket (intproxy will just recreate the incoming tasks with new listen results).
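The resubscription flow described here can be sketched as below. This is an illustration only, assuming a stand-in `PortSubscription` type with a `resubscribe` method shaped like the one in the quoted snippet; the real mirrord types differ.

```rust
// Illustrative sketch: after a reconnect, replay each stored port
// subscription so the agent re-establishes mirroring without the user's
// app having to re-listen. Stand-in types, not mirrord's actual ones.

#[derive(Debug, PartialEq)]
enum ClientMessage {
    PortSubscribe(u16),
}

struct PortSubscription {
    port: u16,
}

impl PortSubscription {
    /// Regenerate the subscribe message(s) for a fresh agent connection.
    fn resubscribe(&self) -> Vec<ClientMessage> {
        vec![ClientMessage::PortSubscribe(self.port)]
    }
}

/// What the quoted loop does, minus the message bus: collect every
/// message that must be re-sent over the new agent connection.
fn replay(subscriptions: &[PortSubscription]) -> Vec<ClientMessage> {
    subscriptions.iter().flat_map(|s| s.resubscribe()).collect()
}

fn main() {
    let subs = [
        PortSubscription { port: 80 },
        PortSubscription { port: 443 },
    ];
    let replayed = replay(&subs);
    assert_eq!(
        replayed,
        vec![
            ClientMessage::PortSubscribe(80),
            ClientMessage::PortSubscribe(443),
        ]
    );
    println!("replayed {} subscriptions", replayed.len());
}
```

The point of contention above is that the agent confirms each subscription, so replaying ones IncomingProxy already considers confirmed requires care around the DaemonTcp::PortSubscribe handling.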

Resolved review thread on mirrord/intproxy/src/background_tasks.rs.
Successfully merging this pull request may close issue #2901: "Proxy randomly closes during connection".
3 participants