Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Question: MPTCP subflows are reconnected every now and then #526

Closed
kangsenlu opened this issue Oct 21, 2024 · 22 comments
Closed

Question: MPTCP subflows are reconnected every now and then #526

kangsenlu opened this issue Oct 21, 2024 · 22 comments
Labels

Comments

@kangsenlu
Copy link

kangsenlu commented Oct 21, 2024

I have setup mptcp upstream on two hosts. I have enabled mptcp feature, configured both the client host and the server host to use two channels for mptcp subflow purposes. I have seen the client applications openvpn and openssh-client connected to the openvpn and sshd in the server with mptcp connections, and each mptcp connection consists of two subflows of their own.
However, when I run "netstat -net" to monitor all the tcp connections, I noticed all 4 subflows are constantly being established and released somehow. I ran "ip mptcp monitor", the output confirmed what I saw. I have captured the ip mptcp endpoint show output and the in the ip mptcp monitor output here.


On the server host

root@ip-10-0-0-60:/var/log# ip mptcp endpoint show
root@ip-10-0-0-60:/var/log#

Ip mptcp monitor

[SF_ESTABLISHED] token=0c0359a2 remid=1 locid=0 saddr4=198.18.140.1 daddr4=198.18.0.6 sport=9000 dport=54291 backup=0
[SF_ESTABLISHED] token=87d9775f remid=1 locid=0 saddr4=198.18.140.1 daddr4=198.18.0.6 sport=31336 dport=41203 backup=0
[   	REMOVED] token=0c0359a2 remid=2
[ 	SF_CLOSED] token=0c0359a2 remid=2 locid=0 saddr4=198.18.140.1 daddr4=198.18.16.6 sport=9000 dport=33781 backup=1
[   	REMOVED] token=87d9775f remid=2
[ 	SF_CLOSED] token=87d9775f remid=2 locid=0 saddr4=198.18.140.1 daddr4=198.18.16.6 sport=31336 dport=55799 backup=1
[SF_ESTABLISHED] token=0c0359a2 remid=2 locid=0 saddr4=198.18.140.1 daddr4=198.18.16.6 sport=9000 dport=35563 backup=1
[SF_ESTABLISHED] token=87d9775f remid=2 locid=0 saddr4=198.18.140.1 daddr4=198.18.16.6 sport=31336 dport=48043 backup=1
[   	REMOVED] token=0c0359a2 remid=1
[ 	SF_CLOSED] token=0c0359a2 remid=1 locid=0 saddr4=198.18.140.1 daddr4=198.18.0.6 sport=9000 dport=54291 backup=0
[   	REMOVED] token=87d9775f remid=1
[ 	SF_CLOSED] token=87d9775f remid=1 locid=0 saddr4=198.18.140.1 daddr4=198.18.0.6 sport=31336 dport=41203 backup=0
[SF_ESTABLISHED] token=0c0359a2 remid=1 locid=0 saddr4=198.18.140.1 daddr4=198.18.0.6 sport=9000 dport=41839 backup=0
[SF_ESTABLISHED] token=87d9775f remid=1 locid=0 saddr4=198.18.140.1 daddr4=198.18.0.6 sport=31336 dport=40523 backup=0
[    	CLOSED] token=ba6a3cad
[    	CLOSED] token=8265af0f
[   	REMOVED] token=0c0359a2 remid=1
[ 	SF_CLOSED] token=0c0359a2 remid=1 locid=0 saddr4=198.18.140.1 daddr4=198.18.0.6 sport=9000 dport=41839 backup=0
[   	REMOVED] token=87d9775f remid=1
[ 	SF_CLOSED] token=87d9775f remid=1 locid=0 saddr4=198.18.140.1 daddr4=198.18.0.6 sport=31336 dport=40523 backup=0

On the client host

root@10053:~/# ip mptcp e
198.18.0.6 id 1 subflow dev tunUDP1 
198.18.16.6 id 2 subflow backup dev tunUDP2 
@matttbe
Copy link
Member

matttbe commented Oct 22, 2024

Hello,

Which kernel version are you using on both sides? Can you reproduce the issue with the latest stable release? (or the last LTS one?).

(I guess you are using the in-kernel path-manager, right? → Did you change sysctl net.mptcp.pm_type?)

How are your MPTCP endpoints being managed?
The in-kernel PM will close (or re-open) subflows during a connection only when the corresponding endpoint is removed (or re-added).

Note that subflows can also be removed during the connection due to some network issues, but in this case, and for the moment (#440), the in-kernel will not try to re-establish these subflows.

@kangsenlu
Copy link
Author

kangsenlu commented Oct 22, 2024 via email

@matttbe
Copy link
Member

matttbe commented Oct 22, 2024

The client is running openwrt kernel 6.1.71

In your case, this behaviour is likely on the client side, because it is usually the client that initiates the creation of the new subflows.

This kernel is quite old (from January 2024), can you eventually upgrade to a more recent version from the v6.1 series, or 6.6.y or newer?

Quite a lot of fixes have been addressed between the v6.1.71 and the v6.1.113: 79 commits and 11 files changed, 842 insertions(+), 390 deletions(-)

I am just wondering if there is an easy way to have the kernel explicitly or implicitly show some debugging message about what triggered the subflow recycling?

The trigger should come from the userspace (except a bug), via Netlink. You should be able to add an nlmon interface and use tcpdump. Or use perf to debug the kernel, or bpftrace, or show the debug message from net/mptcp/pm_netlink.c, something like:

echo "file net/mptcp/pm_netlink.c +fmp" > /sys/kernel/debug/dynamic_debug/control

@kangsenlu
Copy link
Author

kangsenlu commented Oct 22, 2024 via email

@kangsenlu
Copy link
Author

kangsenlu commented Oct 22, 2024 via email

@matttbe
Copy link
Member

matttbe commented Oct 22, 2024

I ran tcpdump on the server and it seems the subflow establishment is initiated by the client, but subflow release is initiated by the server.

Are you saying that because the 'FIN' is initiated by the server? That's not necessarily true: when an MPTCP endpoint is removed on the client, the client will first notify the other peer with a REMOVE_ADDRESS message (in a TCP ACK). When the server receives this notification, it will remove the corresponding subflow(s).

The creation part is easy to explain, because I put "subflow" option in the "ip mptcp endpoint add" command in the client.

Do you do that manually?

I turned on the pm_netlink.c debugging.

On the server side I guess, right? Here, we can see that a REMOVE_ADDRESS notification is received. So the removal is triggered by the other peer (client I suppose).

@kangsenlu
Copy link
Author

kangsenlu commented Oct 22, 2024 via email

@matttbe
Copy link
Member

matttbe commented Oct 22, 2024

so somehow the client sent the "rem-addr" mptcp option to the server. The question is why, and how to figure out the reason.

If I'm not mistaken, this RM_ADDR is only sent by the in-kernel PM when an endpoint is removed (or flushed).

To come back to a previous question: how are your MPTCP endpoints managed? Manually or via an automated tool (Network-Manager? mptcpd?) or script?

@kangsenlu
Copy link
Author

kangsenlu commented Oct 22, 2024 via email

@kangsenlu
Copy link
Author

I am puzzled by the client side "ip mptcp monitor" output. There are only two endpoints, but the monitir output seems to suggest that there are 3 locid, 0,1,2. Where does that locid=0 come from?

// ip mptcp monitor at the client:

[ SF_CLOSED] token=1f1c87d0 remid=0 locid=0 saddr4=198.18.0.6 daddr4=198.18.140.1 sport=60178 dport=9000 backup=0
[ SF_CLOSED] token=1f1c87d0 remid=0 locid=1 saddr4=198.18.0.6 daddr4=198.18.140.1 sport=34521 dport=9000 backup=0 ifindex=14
[ SF_ESTABLISHED] token=1f1c87d0 remid=0 locid=1 saddr4=198.18.0.6 daddr4=198.18.140.1 sport=32003 dport=9000 backup=0 ifindex=14
[ SF_CLOSED] token=1f1c87d0 remid=0 locid=2 saddr4=198.18.16.6 daddr4=198.18.140.1 sport=42045 dport=9000 backup=1 ifindex=1
6
[ SF_ESTABLISHED] token=1f1c87d0 remid=0 locid=2 saddr4=198.18.16.6 daddr4=198.18.140.1 sport=45411 dport=9000 backup=1 ifindex=1
6
[ SF_CLOSED] token=1f1c87d0 remid=0 locid=0 saddr4=198.18.0.6 daddr4=198.18.140.1 sport=60178 dport=9000 backup=0
[ SF_CLOSED] token=1f1c87d0 remid=0 locid=1 saddr4=198.18.0.6 daddr4=198.18.140.1 sport=32003 dport=9000 backup=0 ifindex=14
[ SF_ESTABLISHED] token=1f1c87d0 remid=0 locid=1 saddr4=198.18.0.6 daddr4=198.18.140.1 sport=27077 dport=9000 backup=0 ifindex=14
[ SF_CLOSED] token=1f1c87d0 remid=0 locid=2 saddr4=198.18.16.6 daddr4=198.18.140.1 sport=45411 dport=9000 backup=1 ifindex=1
6
[ SF_ESTABLISHED] token=1f1c87d0 remid=0 locid=2 saddr4=198.18.16.6 daddr4=198.18.140.1 sport=29427 dport=9000 backup=1 ifindex=1
6
[ SF_CLOSED] token=1f1c87d0 remid=0 locid=2 saddr4=198.18.16.6 daddr4=198.18.140.1 sport=29427 dport=9000 backup=1 ifindex=1
6
[ SF_ESTABLISHED] token=1f1c87d0 remid=0 locid=2 saddr4=198.18.16.6 daddr4=198.18.140.1 sport=49413 dport=9000 backup=1 ifindex=1
6
[ SF_CLOSED] token=1f1c87d0 remid=0 locid=0 saddr4=198.18.0.6 daddr4=198.18.140.1 sport=60178 dport=9000 backup=0
[ SF_CLOSED] token=1f1c87d0 remid=0 locid=1 saddr4=198.18.0.6 daddr4=198.18.140.1 sport=27077 dport=9000 backup=0 ifindex=14
[ SF_ESTABLISHED] token=1f1c87d0 remid=0 locid=1 saddr4=198.18.0.6 daddr4=198.18.140.1 sport=25739 dport=9000 backup=0 ifindex=14

—---------------------------------------

root@10053:~/# ip mptcp e
198.18.0.6 id 1 subflow dev tunUDP1
198.18.16.6 id 2 subflow backup dev tunUDP2

—-----------------------------------------------------------

@matttbe
Copy link
Member

matttbe commented Oct 23, 2024

But how come right after the old subflow is removed, almost immediately the same subflow is created again?

I don't know. Except a bug, (or something I missed), this should only be triggered by the userspace changing the MPTCP endpoints. Are you sure there is nothing else that would modify the MPTCP endpoints? (mptcpd, network-manager, scripts, etc.)

If you cannot find anything doing that (tcpdump with NLMon might help), it would really help if you can update the kernel, at least to a more recent 6.1.y version.

There are only two endpoints, but the monitir output seems to suggest that there are 3 locid, 0,1,2. Where does that locid=0 come from?

The addresses of the initial subflow have a special ID: 0. That's in MPTCP specs.
The kernel should re-use the ID 0 instead of switching to 1 because it is the same address: that's not a big deal, but that's something that has been fixed a few months ago, and should be in the last v6.1.y release.

@kangsenlu
Copy link
Author

kangsenlu commented Oct 29, 2024 via email

@kangsenlu
Copy link
Author

kangsenlu commented Oct 29, 2024

I just build the openwrt image with nlmon kernel module. Then I issued the following commands at the openwrt shell prompt:

# ip link add nlmon0 type nlmon
# ip link set nlmon0 up
# tcpdump -i nlmon0 -w nlmsg.pcap

Then I ran wireshark to examine the nlmon.pcap file. I am not sure how to find readable information. Can you help, Matthieu?

@kangsenlu
Copy link
Author

I have installed nldecap.py on a ubuntu host, but it does not work now. My host runs python3.9.

@matttbe
Copy link
Member

matttbe commented Oct 29, 2024

I did find something odd, I have two mptcp connections, one runs between openvpn, and the other runs between ssh. Each mptcp connection consists of two subflows. If I kill the openvpn connection, then the subflows between ssh are stable. In any other combinations, the subflows are released and reestablished.

Mmh, in your setup, tunUDP1 and tunUDP2 are UDP tunnels, not TCP, right?

In ip mptcp monitor logs, we can see ifindex=1 or =14. You can see which interface they correspond with by looking at the output of ip link. Can you confirm the right interfaces are being used?

Then I ran wireshark to examine the nlmon.pcap file. I am not sure how to find readable information.

Sorry, I don't remember what you are supposed to see exactly. But the idea is to check if MPTCP endpoints are being changed by the userspace. When you do ip mptcp endpoint add (or del) you should see Netlink message with tcpdump (or wireshark): can you see them later (same type) when you see that subflows are being re-established and when you don't manually type these ip mptcp endpoint commands?

@matttbe
Copy link
Member

matttbe commented Oct 29, 2024

Also, why do you need OpenVPN? Typically, MPTCP is used with a proxy, e.g. a transparent proxy intercepting the TCP connections created by a client connected to the router, then communicating using MPTCP with a proxy server, before reaching the end server. What is described on OpenMPTCPRouter website for example.

In this case, you don't need to use MPTCP through the OpenVPN tunnel.

@kangsenlu
Copy link
Author

I did a test of running tcpdump to capture the nlmon0 interface. Then I did "ip mptcp endpoint delete" to delete one of the two mptcp endpoints. After I examined the pcap file with wireshark, the wireshark does not show anything noticeable. I tried to run nldecap.py on ubuntu 24.04, and it does not run properly.
I am looking at the ip mptcp monitor output, and I only see locid 0 and 2, not 1. Could that be the cause of the subflows being relesaed and re-etablished? Even running kernel 6.1.109.

@matttbe
Copy link
Member

matttbe commented Nov 4, 2024

I did a test of running tcpdump to capture the nlmon0 interface. Then I did "ip mptcp endpoint delete" to delete one of the two mptcp endpoints. After I examined the pcap file with wireshark, the wireshark does not show anything noticeable. I tried to run nldecap.py on ubuntu 24.04, and it does not run properly.

Can you eventually share (a zipped version of) this file here please?

Do you have the issue without OpenVPN tunnels (see my previous comment: #526 (comment)).

I am looking at the ip mptcp monitor output, and I only see locid 0 and 2, not 1. Could that be the cause of the subflows being relesaed and re-etablished?

No, that's what you should see when using an up-to-date kernel: the addresses of the initial subflows should have this special local/remote ID 0. See #526 (comment)

@kangsenlu
Copy link
Author

Matttbe:

Thanks for the help.
I am attaching three files for your viewing: (1) ip link show; (2) ip mptcp show; (3) compressed nlmon packets. They are all in the debug.tar.gz.

debug.tar.gz

@matttbe
Copy link
Member

matttbe commented Nov 7, 2024

@kangsenlu : with Paolo's help, we looked at the nlmon traces:

  • there are a lot of events there in one minutes, that's strange
  • it looks like there are a lot of "route" events, and many "Create network interface" type: do you mind checking if it is normal please? Maybe the interfaces are reloaded in a loop?
  • when we use the genl.family_id == 0x15 filter on Wireshark, we can see MPTCP PM commands initiated by the userspace (or events?):
    • every 10 seconds, we can see this sequence of commands (doesn't look like events): get endpoints (+ reply), delete endpoint, destroy two subflows (?), add endpoint, add two subflows (?)
    • if what we interpreted is correct, it looks like something on your system is periodically removing and re-adding endpoints. Any idea what's causing that? (I guess NetworkManager is not running on OpenWRT, maybe mptcpd daemon? A test script? Something else?)

Just to be sure this has been interpreted correctly, it might help to:

  • make sure no ip mptcp monitor (nor ip monitor) is running → maybe even better if you can stop a maximum of services, e.g. no OpenVPN, no monitor services, etc.
  • create the nlmon interface, and start the capture with tcpdump
  • quickly run the command to add a "dummy" endpoint manually, e.g. ip mptcp endpoint add 1.2.3.4 dev eth0 and stop tcpdump directly after.
  • then share the trace here.
    Hopefully with this, there should not be many events and commands, and we can see the new dummy endpoint being created with this filter on Wireshark: (genl.family_id == 0x15) && (genl.cmd == 1)

@matttbe matttbe added the feedback-needed Waiting for a reply label Nov 7, 2024
@kangsenlu
Copy link
Author

Hi, Matthieu:
Thanks for your helping effort. Finally we have been able to locate the root cause of the subflow being released and re-established about every 11 seconds. It is our own script that monitors the subflow status. But by accident, out script would delete endpoints and add endpoints back. Once we know the script that caused the problem, it is not too hard to "fix" it.
My only regret is that this script that deletes and adds mptcp endpoints is so evasive to locate. If thru nlmon.pcap, we can find the PID of this shell script, that would save us a lot of time to debug. So my question is that from the nlmon.pcap, can we find the PID of the process that issued the command "ip mptcp endpoint delete" to the kernel?
Thanks a lot for your helping effort, again.

@matttbe
Copy link
Member

matttbe commented Nov 8, 2024

It is our own script that monitors the subflow status. But by accident, out script would delete endpoints and add endpoints back.

That makes sense, the only thing that could cause the behaviour you had.

So my question is that from the nlmon.pcap, can we find the PID of the process that issued the command "ip mptcp endpoint delete" to the kernel?

Not that I think of no, but I rarely use nlmon. But I guess you will only make this mistake once ;)

I guess we can close this issue then.

@matttbe matttbe closed this as completed Nov 8, 2024
@matttbe matttbe removed the feedback-needed Waiting for a reply label Nov 8, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

2 participants