add autonat v2 spec #538

Merged
merged 26 commits into from
Oct 31, 2024
Changes from 6 commits
d663611
add autonat v2 spec
sukunrt Apr 12, 2023
1db8613
use priority ordered list in requests for autonat-v2
sukunrt Apr 15, 2023
0ff8ac6
only send index of the dialed address
sukunrt Apr 21, 2023
f2a431c
accept a priority ordered list of addresses for dial requests
sukunrt Apr 21, 2023
62123df
Improve naming for messages
sukunrt Apr 25, 2023
0771bab
add interaction diagram
sukunrt Apr 25, 2023
3e57202
address review comments
sukunrt Apr 27, 2023
f6def9a
allow autonat v2 to dial all ips (#542)
sukunrt May 3, 2023
b769d79
use oneof for messages
sukunrt May 3, 2023
e4efaae
use a single nonce in DialRequest
sukunrt May 15, 2023
f28511c
drop terms node and peer in favour of client and server
sukunrt Jul 11, 2023
d4da279
client should not reuse listen port while making dial request
sukunrt Jul 11, 2023
5b8d37d
explicitly disallow probing for private addresses
sukunrt Jul 11, 2023
05a0de2
add explanation for amplification attack prevention mechanism
sukunrt Jul 12, 2023
8b52643
add recommendation for 30k - 100k bytes
sukunrt Jul 16, 2023
dd2750c
move DialStatus proto out of Response
sukunrt Aug 12, 2023
4e6ecaa
wrap data sent for amplification attack prevention in a protobuf
sukunrt Aug 12, 2023
2af3309
send only a single dial status
sukunrt Aug 16, 2023
6b1604b
add comment regarding nonce
sukunrt Aug 18, 2023
f979fac
rename attempt to dial-back
sukunrt Aug 20, 2023
209b215
fix ResponseStatus_OK
sukunrt Aug 21, 2023
094089b
IPv4 only servers should refuse requests for IPv6 addresses
sukunrt Sep 6, 2023
b4a856b
fix dial-request protocol name
sukunrt Oct 30, 2023
1c76613
add a response to the dialback stream
sukunrt Feb 5, 2024
03718ef
allow the client to send slightly more dial data
sukunrt Jun 20, 2024
0195203
add note that server should not dial any private address
sukunrt Oct 31, 2024
7 changes: 7 additions & 0 deletions autonat/README.md
@@ -0,0 +1,7 @@
# NAT Discovery <!-- omit in toc -->
> How we detect if we're behind a NAT.


Specifications:
- [autonat v1](autonat-v1.md)
- [autonat v2](autonat-v2.md)
3 changes: 0 additions & 3 deletions autonat/autonat-v1.md
@@ -1,6 +1,3 @@
# NAT Discovery <!-- omit in toc -->
> How we detect if we're behind a NAT.

| Lifecycle Stage | Maturity | Status | Latest Revision |
|-----------------|----------------|--------|-----------------|
| 3A | Recommendation | Active | r1, 2023-02-16 |
207 changes: 207 additions & 0 deletions autonat/autonat-v2.md
@@ -0,0 +1,207 @@
# AutonatV2: spec


| Lifecycle Stage | Maturity | Status | Latest Revision |
|-----------------|--------------------------|--------|-----------------|
| 1A | Working Draft | Active | r2, 2023-04-15 |

Authors: [@sukunrt]

Interest Group: [@marten-seemann], [@marcopolo], [@mxinden]

[@sukunrt]: https://github.com/sukunrt
[@marten-seemann]: https://github.com/marten-seemann
[@mxinden]: https://github.com/mxinden
[@marcopolo]: https://github.com/marcopolo


## Overview

A priori, a node cannot know if it is behind a NAT / firewall or if it is
publicly reachable. Knowing its NAT status is essential for the node to be
well-behaved in the network: A node that's behind a NAT / firewall doesn't need
to advertise its (undialable) addresses to the rest of the network, preventing
superfluous dials from other peers. Furthermore, it might actively seek to
improve its connectivity by finding a relay server, which would allow other
peers to establish a relayed connection.

In `autonat v2` the client sends a priority-ordered list of addresses. On
receiving this list, the server dials the first address on the list that it is
capable of dialing. `autonat v2` allows nodes to determine reachability for
individual addresses. Using `autonat v2`, nodes can build an address pipeline in
which they test individual addresses discovered from different sources, such as
identify, UPnP mappings, and circuit addresses, for reachability. Having a
priority-ordered list of addresses provides the ability to verify low-priority
addresses: implementations can generate low-priority address guesses and add
them to requests for high-priority addresses as a nice-to-have. This is
especially helpful when introducing a new transport. Initially, such a transport
will not be widely supported in the network. Requests for verifying such
addresses can be reused to get information about other addresses.

Compared to `autonat v1` there are two major differences:
1. `autonat v1` allowed testing reachability for the node. `autonat v2` allows
testing reachability for an individual address.
2. `autonat v2` provides a mechanism for nodes to verify whether the peer
actually successfully dialled an address.


## AutoNAT V2 Protocol


![Autonat V2 Interaction](autonat-v2.svg)


A node wishing to determine reachability of its addresses sends a `DialRequest`
message to a peer on a stream with protocol ID
`/libp2p/autonat/2.0.0/dial`.

This `DialRequest` message has a list of `Candidate`s. Each item in
this list contains an address and a fixed64 nonce. The list is ordered in
descending order of priority for verification.

Upon receiving this message the peer attempts to dial the first candidate from
the list of candidates that it is capable of dialing. It dials the candidate
address, opens a stream with Protocol ID `/libp2p/autonat/2.0.0/attempt` and
sends a `DialAttempt` message with the candidate nonce. The peer MUST NOT dial
any candidate other than the first candidate in the list that it is capable of
dialing.
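
The selection rule above can be sketched as follows. This is a minimal illustration, not the reference implementation: `select_candidate`, the `can_dial` callback, and the `tcp_only` capability check are hypothetical names, and addresses are shown as textual multiaddrs for readability even though the wire format carries them as bytes.

```python
from typing import Callable, Optional, Sequence, Tuple

# A candidate is modelled as (address, nonce). Illustrative only.
Candidate = Tuple[str, int]


def select_candidate(
    candidates: Sequence[Candidate],
    can_dial: Callable[[str], bool],
) -> Optional[int]:
    """Return the 0-based index of the first candidate the server can dial.

    Returns None when no candidate is dialable; the server would then
    answer with E_TRANSPORT_NOT_SUPPORTED instead of dialing anything.
    """
    for idx, (addr, _nonce) in enumerate(candidates):
        if can_dial(addr):
            # Stop at the first match: no other candidate may be dialed.
            return idx
    return None


def tcp_only(addr: str) -> bool:
    # Hypothetical capability check: this server only speaks TCP.
    return "/tcp/" in addr


request = [
    ("/ip4/203.0.113.7/udp/4001/quic-v1", 0x1111),
    ("/ip4/203.0.113.7/tcp/4001", 0x2222),
]
chosen = select_candidate(request, tcp_only)
```

Here the server skips the QUIC candidate it cannot dial and picks index 1, which is also the index it would report back in `DialResponse`.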

Upon completion of the dial attempt, the peer sends a `DialResponse` message to
the initiator node on the `/libp2p/autonat/2.0.0/dial` stream with the
0-based index of the candidate that it attempted to dial and the appropriate
`ResponseStatus`. See [Requirements for
ResponseStatus](#requirements-for-responsestatus).

The initiator MUST check that the nonce received in the `DialAttempt` is the
same as the nonce the initiator sent in the `Candidate` for the candidate
index received in `DialResponse`. If the nonce is different, the initiator MUST
discard this response.
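
A minimal sketch of the initiator-side check, assuming the initiator keeps the candidates it sent in request order. The names `new_nonce` and `accept_dial_response` are hypothetical, and treating an out-of-range index as a reason to discard is an assumption of this sketch rather than something the text above states.

```python
import secrets
from typing import Sequence, Tuple

# (addr bytes, fixed64 nonce), in the order they were sent. Illustrative only.
Candidate = Tuple[bytes, int]


def new_nonce() -> int:
    """Draw a random value that fits in a fixed64 field."""
    return secrets.randbits(64)


def accept_dial_response(
    sent: Sequence[Candidate],
    addr_idx: int,
    attempt_nonce: int,
) -> bool:
    """Return True only if the DialAttempt nonce matches the indexed candidate."""
    if not 0 <= addr_idx < len(sent):
        # Assumption: an index outside the request is treated as invalid.
        return False
    _addr, expected_nonce = sent[addr_idx]
    return attempt_nonce == expected_nonce


# Fixed nonces keep the example deterministic; real requests would use new_nonce().
sent = [(b"addr-a", 0xA1), (b"addr-b", 0xB2)]
matching = accept_dial_response(sent, 1, 0xB2)
mismatched = accept_dial_response(sent, 0, 0xB2)
```

The second call fails because the nonce received on the attempt stream belongs to candidate 1 while the response claims candidate 0, so the initiator discards that response.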


### Requirements for ResponseStatus

On receiving a `DialRequest` the peer selects the first candidate on the list it
is capable of dialing. This candidate address is referred to as _addr_. The
`ResponseStatus` sent by the peer in the `DialResponse` message MUST be set
according to the following requirements:

`OK`: the peer was able to dial _addr_ successfully.

`E_DIAL_ERROR`: the peer attempted to dial _addr_ and was unable to connect.

`E_DIAL_REFUSED`: the peer didn't attempt a dial because of rate limiting,
resource limit reached or blacklisting.

`E_TRANSPORT_NOT_SUPPORTED`: the peer didn't have the ability to dial any of the
requested addresses.

`E_BAD_REQUEST`: the peer didn't attempt a dial because it was unable to decode
the message.

`E_INTERNAL_ERROR`: an error not classified within the above error codes occurred
on the peer that prevented it from completing the request.

Implementations MUST discard responses with status codes they do not understand.
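
One way to handle unknown status codes is sketched below. The numeric values mirror the `ResponseStatus` enum in the RPC Messages section of this revision; `parse_status` is a hypothetical helper, and returning `None` is just this sketch's way of signalling "discard the response".

```python
from enum import IntEnum
from typing import Optional


class ResponseStatus(IntEnum):
    OK = 0
    E_DIAL_ERROR = 100
    E_DIAL_REFUSED = 101
    E_TRANSPORT_NOT_SUPPORTED = 102
    E_BAD_REQUEST = 200
    E_INTERNAL_ERROR = 300


def parse_status(raw: int) -> Optional[ResponseStatus]:
    """Map a wire value to a known status, or None if it is not understood."""
    try:
        return ResponseStatus(raw)
    except ValueError:
        # Unknown code: the caller must discard the whole response.
        return None


known = parse_status(101)
unknown = parse_status(999)
```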

### Consideration for DDOS Prevention

In order to prevent attacks like the one described in [RFC 3489, Section
12.1.1](https://www.rfc-editor.org/rfc/rfc3489#section-12.1.1) (see excerpt
below), implementations MUST NOT dial any multiaddress unless it is based on the
IP address the requesting node is observed as. This restriction also implies
that implementations MUST NOT accept dial requests via relayed connections, as
one cannot validate the IP address of the requesting node.

> RFC 3489 12.1.1 Attack I: DDOS Against a Target
>
> In this case, the attacker provides a large number of clients with the same
> faked MAPPED-ADDRESS that points to the intended target. This will trick all
> the STUN clients into thinking that their addresses are equal to that of the
> target. The clients then hand out that address in order to receive traffic on
> it (for example, in SIP or H.323 messages). However, all of that traffic
> becomes focused at the intended target. The attack can provide substantial
> amplification, especially when used with clients that are using STUN to enable
> multimedia applications.
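
The dial restriction in this section can be sketched as a server-side check. This is an illustration under stated assumptions: `multiaddr_ip` and `may_dial` are hypothetical helpers, multiaddrs are handled in their textual form, and only addresses that start with an `/ip4/` or `/ip6/` component are considered.

```python
import ipaddress
from typing import Optional, Union

IPAddress = Union[ipaddress.IPv4Address, ipaddress.IPv6Address]


def multiaddr_ip(addr: str) -> Optional[IPAddress]:
    """Extract the leading IP from a textual multiaddr such as /ip4/1.2.3.4/tcp/1."""
    parts = addr.split("/")
    # A textual multiaddr starts with "/", so parts[0] is the empty string.
    if len(parts) < 3 or parts[0] != "" or parts[1] not in ("ip4", "ip6"):
        return None
    try:
        ip = ipaddress.ip_address(parts[2])
    except ValueError:
        return None
    expected_version = 4 if parts[1] == "ip4" else 6
    return ip if ip.version == expected_version else None


def may_dial(addr: str, observed_ip: str, via_relay: bool) -> bool:
    """Allow a dial only to the IP the requester was observed on."""
    if via_relay:
        # The requester's real IP cannot be validated over a relayed connection.
        return False
    ip = multiaddr_ip(addr)
    return ip is not None and ip == ipaddress.ip_address(observed_ip)


same_host = may_dial("/ip4/203.0.113.7/tcp/4001", "203.0.113.7", via_relay=False)
other_host = may_dial("/ip4/198.51.100.9/tcp/4001", "203.0.113.7", via_relay=False)
```

A request naming `198.51.100.9` from a client observed at `203.0.113.7` is rejected, which is what keeps the server from being used to direct dials at a third party.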


## Implementation Suggestions

For any given address, implementations SHOULD do the following:
- periodically recheck reachability status
- query multiple peers to determine reachability

The suggested heuristic for implementations is to consider an address reachable
if more than 3 peers report a successful dial and to consider an address
unreachable if more than 3 peers report unsuccessful dials.

Implementations are free to use different heuristics than this one.
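
The suggested heuristic can be written down in a few lines. `classify` and its string results are hypothetical names; reporting "unknown" when both counts exceed the threshold is an assumption of this sketch, since the heuristic above does not say how to resolve conflicting reports.

```python
def classify(successes: int, failures: int, threshold: int = 3) -> str:
    """Classify an address from per-peer dial reports.

    An address counts as reachable when more than `threshold` peers report a
    successful dial, and unreachable when more than `threshold` peers report
    a failed dial.
    """
    reachable = successes > threshold
    unreachable = failures > threshold
    if reachable and not unreachable:
        return "reachable"
    if unreachable and not reachable:
        return "unreachable"
    # Too few reports, or conflicting reports (assumption: stay undecided).
    return "unknown"


status = classify(successes=4, failures=1)
```

Note that exactly 3 successful reports is not enough under this reading, because the heuristic asks for more than 3.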


## RPC Messages

All RPC messages sent over a stream are prefixed with the message length in
bytes, encoded as an unsigned variable length integer as defined by the
[multiformats unsigned-varint spec][uvarint-spec].
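
The length-prefix framing can be sketched as below. The helper names (`encode_uvarint`, `decode_uvarint`, `frame`, `unframe`) are hypothetical, and the sketch works on in-memory byte strings rather than a real stream; it also omits any upper bound on the varint length that an implementation may want to enforce.

```python
from typing import Tuple


def encode_uvarint(n: int) -> bytes:
    """Encode a non-negative integer as an unsigned varint (7 bits per byte)."""
    if n < 0:
        raise ValueError("uvarint cannot encode negative numbers")
    out = bytearray()
    while True:
        byte = n & 0x7F
        n >>= 7
        if n:
            out.append(byte | 0x80)  # more bytes follow
        else:
            out.append(byte)
            return bytes(out)


def decode_uvarint(buf: bytes) -> Tuple[int, int]:
    """Return (value, number of bytes consumed)."""
    value = 0
    for i, byte in enumerate(buf):
        value |= (byte & 0x7F) << (7 * i)
        if not byte & 0x80:
            return value, i + 1
    raise ValueError("truncated uvarint")


def frame(payload: bytes) -> bytes:
    """Prefix a serialized message with its length in bytes."""
    return encode_uvarint(len(payload)) + payload


def unframe(buf: bytes) -> Tuple[bytes, bytes]:
    """Split one framed message off the front of buf; return (message, rest)."""
    length, used = decode_uvarint(buf)
    end = used + length
    if end > len(buf):
        raise ValueError("incomplete message")
    return buf[used:end], buf[end:]


message, rest = unframe(frame(b"hello") + b"rest")
```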

All RPC messages on stream `/libp2p/autonat/2.0.0/dial` are of type
`DialMessage`. A `DialRequest` message is sent as a `DialMessage` with the `dialRequest`
field set and the `type` field set to `DIAL_REQUEST`. `DialResponse` is handled
similarly.

On stream `/libp2p/autonat/2.0.0/attempt`, there is a single message type,
`AttemptMessage`.

```proto
syntax = "proto3";

message DialMessage {
    enum Type {
        DIAL_REQUEST = 0;
        DIAL_RESPONSE = 1;
    }

    Type type = 1;
    DialRequest dialRequest = 2;
    DialResponse dialResponse = 3;
}

message Candidate {
    bytes addr = 1;
    fixed64 nonce = 2;
}

message DialRequest {
    repeated Candidate candidates = 1;
}

message DialResponse {
    enum ResponseStatus {
        OK = 0;
        E_DIAL_ERROR = 100;
        E_DIAL_REFUSED = 101;
        E_TRANSPORT_NOT_SUPPORTED = 102;
        E_BAD_REQUEST = 200;
        E_INTERNAL_ERROR = 300;
    }

    ResponseStatus status = 1;
    string statusText = 2;
    int32 addrIdx = 3;
}

message AttemptMessage {
    enum Type {
        DIAL_ATTEMPT = 0;
    }

    Type type = 1;
    DialAttempt dialAttempt = 2;
}

message DialAttempt {
    fixed64 nonce = 1;
}
```

Review thread on `string statusText = 2;`:

> **Contributor:** What is the benefit of the status text if we have an enum?
>
> **Member Author:** This will help in debugging. For a status code the node doesn't understand, logs of this text might be helpful.
>
> **Contributor:** I am not convinced. If I am not mistaken, HTTP > 1.1 also no longer sends the text variant of the status code because it is redundant and only consumes bytes.
>
> Sending it at the off-chance that we first add a new status code and then save someone some time in debugging when they could also look at the latest spec doesn't seem worth it.
>
> **Member Author:** I've removed it. People can check the updated spec to understand the code.

[uvarint-spec]: https://github.com/multiformats/unsigned-varint

19 changes: 19 additions & 0 deletions autonat/autonat-v2.plantuml
@@ -0,0 +1,19 @@
@startuml
participant Cli
participant Srv

skinparam backgroundColor white
skinparam sequenceMessageAlign center

== Dial Request Success==

Cli -> Srv: [dial] DialRequest: (Addr1, Token1), (Addr2, Token2)
Srv -> Cli: [attempt] DialAttempt: Token2
Srv -> Cli: [dial] DialResponse: STATUS: OK, CandidateIdx: 1

== Dial Request Failure==

Cli -> Srv: [dial] DialRequest: (Addr1, Token1), (Addr2, Token2)
Srv ->x Cli: [attempt] DialAttempt: Token2
Srv -> Cli: [dial] DialResponse: STATUS: E_DIAL_ERROR, CandidateIdx: 1
@enduml
1 change: 1 addition & 0 deletions autonat/autonat-v2.svg