-
-
Notifications
You must be signed in to change notification settings - Fork 31
2. Spanning Tree
The spanning tree provides network participants with name-dependent identifier routing which is used before nodes have fully bootstrapped. It also acts as a synchronisation primitive as root announcements are periodically flooded through the network.
As with all spanning trees, there is a single root node at any given time for a given network. It is the responsibility of the root node to send out root announcements into the network at a specific interval.
Since Pinecone networks are typically dynamic in nature, root selection must take place automatically and without user intervention, so as to avoid deliberate points of centralisation. In order to do this, a form of network-wide election takes place, where the node with the numerically highest ed25519 public key will win.
If the current root node disappears from the network or otherwise stops sending root announcements (due to dropped peerings), another election will take place, eventually settling on the next highest key.
A node considers itself to be a root node if it does not have a chosen parent.
It is necessary for the functioning of a Pinecone network for all nodes to hear an announcement from the same root node so that all nodes can agree on the same tree-space coordinate system for routing bootstrap and setup messages.
A root announcement contains the following fields:
Root public key | |
Root sequence | |
Signature 1 | Signing public key |
Destination port number | |
Signature covering all fields upto this signature | |
Signature 2 | Signing public key |
Destination port number | |
Signature covering all fields upto this signature | |
Signature …n | Signing public key |
Destination port number | |
Signature covering all fields upto this signature |
The sequence number is a monotonically increasing unsigned integer, starting at 0. Since root announcements are flooded through the network, it is the goal for all nodes to eventually agree on the same Root public key and Sequence number following a root update.
The sequence number must be increased when the root node wishes to send out a new root announcement by reaching the announcement time interval.
The sequence number must not be increased when repeating the previous update to peers for any other reason, and must not be increased by any node that is not the root node before repeating the updates to their peers.
Before sending the root announcement to a peer, it should sign the update, appending the signature to the end of the announcement. Doing so allows any node receiving an root announcement to ensure that none of the ancestors have forged earlier signatures within the update, and to more reliably detect loops within the tree (if a signature incorrectly appears within a given update more than once).
Note that the Destination port number field must include the port number of the peer that the update is being sent to. Therefore if you are repeating the update to 6 peers, you will need to create 6 separate signed updates.
It is the responsibility of the root node to send a new root announcement on a regular interval (typically every 30 minutes). To do this, it should create a new empty announcement, populating the Root public key field with the node’s own public key and the Sequence number.
Each node should maintain their own local sequence number, which is used only when sending out root announcements with their own key as the Root public key, ensuring that the sequence number increments only when a new announcement is generated (and not when repeating an update due to changes in the tree).
It is the responsibility of other non-root nodes to sign and repeat good root announcements received from their chosen parent to all directly peered nodes (including repeating the update to the chosen parent) at the time that it is received, although they must not attempt to modify the update in any other way (for example, by trying to modify the Root public key, Root sequence or any of the existing Signature fields).
Nodes must not repeat bad root announcements or any root announcements arriving from other peers that are not the chosen parent.
When a node receives a root announcement from a direct peer, it should do the following:
- Perform sanity checks on the update;
- Store the received update against the peer in memory;
- Decide whether to re-parent;
If a root announcement is received from any peer during the 1 second reparent wait interval, it should be sanity-checked (step 1) and stored (step 2) but should not be processed any further. This is to ensure that many updates arriving quickly from direct peers will not result in a flood of root announcements being sent from the node in rapid succession, which overall reduces the load on the network and prevents the tree rebuilding excessively.
The following sanity checks must be performed on all incoming root announcement updates:
- The update must be well-formed, with no missing fields or unexpected values;
- There must be at least one signature;
- The first Signing public key must match the Root public key;
- The last Signing public key must match the public key of your direct peer;
- None of the signature entries should have a Destination port number of 0;
- The update must not contain any loops, that is that the update must not contain the same public key in the signatures more than once;
- If a root update has been received from this peer before, and the Root public key is equal to the last update, the Root sequence must be greater than or equal to the last update.
If any of these conditions fail, the update is considered to be invalid, the update should be dropped and the peer that sent us this announcement should be disconnected as a result of the error.
Once the sanity checks have passed, if the update came from the currently selected parent, perform the following checks in order:
- If the reparent wait timer is active, do nothing and stop processing;
- If any of the Signatures entries in the update contain the node’s own public key, implying that our parent has suddenly decided to choose us as their parent instead:
- Become a root node, sending an announcement to all of your direct peers notifying them of this fact;
- Start a 1 second reparent wait timer, after which the parent selection algorithm will run;
- If the Root public key is lower than the last root announcement from our chosen parent, implying that either the previous root node has disappeared or the parent is notifying us of bad news:
- Become a root node, sending an announcement to all of your direct peers notifying them of this fact;
- Start a 1 second reparent wait timer, after which the parent selection algorithm will run;
- If the Root public key and the Root sequence fields are both equal to those in the last root announcement from our chosen parent, implying that the parent is notifying us of bad news:
- Become a root node, sending an announcement to all of your direct peers notifying them of this fact;
- Start a 1 second reparent wait timer, after which the parent selection algorithm will run;
- If the Root public key is higher than the last root announcement from our chosen parent, implying that our parent has learned about a stronger root:
- Send a tree announcement with the new update to all directly connected peers;
- If the Root public key is equal to the last root announcement but the Root sequence is higher than the last root announcement:
- Send a tree announcement with the new update to all directly connected peers;
Otherwise, if the update arrived from another peer that is not our chosen parent, perform the following checks in order:
- If the reparent wait timer is active, do nothing and stop processing;
- If any of the Signatures entries in the update contain the node’s own public key, implying that the peer has chosen us a parent, do nothing and stop processing;
- If the Root public key is higher than the last root announcement from our chosen parent:
- Set the node’s chosen parent to this specific peer;
- Send a tree announcement with the new update to all directly connected peers;
- If the Root public key is lower than the last root announcement from our chosen parent:
- Send a tree announcement back to this peer only with the last root announcement from our chosen parent;
- In any other case not matched by the above:
- Run the parent selection algorithm.
With the exception of the root node, which cannot by definition have a parent, each node on the network must choose one of its peered nodes to be its “parent”. The chosen parent node will influence the node’s coordinates.
In the case that a non-root node has only a single peering, that peered node will always be the chosen parent. In the case there are multiple peerings, the parent selection algorithm will determine which the most suitable node is.
The parent selection algorithm runs as follows:
- Start with the current Root public key and Root sequence as the best key and best sequence, and the best candidate as an empty field;
- Iterate through all directly connected peers, checking all of the following conditions in order:
- Skip the peer and move onto the next if the last root announcement received from this peer exceeds the announcement timeout interval, which is typically 45 minutes;
- Skip the peer and move onto the next if the last root announcement received from this peer includes our own public key in the Signing public key field of any of the signatures, as this implies that we are receiving an update again that we have already seen and that a routing loop has taken place;
- If the Root public key of the last peer update is higher than the best key, update the best key and mark this peer as the best candidate;
- If the Root public key of the last peer update is lower than the best key, skip the peer and move onto the next;
- If the Root sequence of the last peer update is higher than the best sequence, update the best sequence and mark this peer as the best candidate;
- If the Root sequence of the last peer update is lower than the best sequence, skip the peer and move onto the next;
- As a last resort tie-breaker, mark this peer as the best candidate and move onto the next if the peer's Root public key and Root sequence match the best key and best sequence, and either:
- This peer's update arrived before the update from the best candidate did, or;
- No other peer has been selected as the best candidate so far.
If the best candidate is not empty at the end of the loop and the chosen candidate is different to the current chosen parent, mark the best candidate as our chosen parent and then send root announcement updates to all directly connected peers.
If the best candidate is still empty at the end of the algorithm, none of the connected peers were suitable candidates and therefore the node should become a root node instead, emptying the chosen parent field and then sending a root announcement to all directly connected peers with its own public key as the Root public key.
Directly connected peers are likely to handle the sudden update to a weaker root key as bad news, therefore any nodes that were previously children of this node will respond accordingly.
A node’s coordinates describe the location of the node on the spanning tree. In effect, the coordinates are the path from the root node down to a given node, with each number representing the downward port number at each hop.
For example, coordinates [1 4 2 4]
describe the following path: starting via the root node’s port 1, followed by the next node’s port 4, followed by the next node’s port 2 and then, finally, the destination node’s parent’s port 4.
The coordinates of the node are calculated by taking the chosen parent’s last root announcement and concatenating all of the Destination port number values in order.
With that in mind, a root node, which has no chosen parent, will have a zero-length set of coordinates []
and a node’s coordinates will change each time it either selects a new parent, or receives a root announcement update from the parent with a different path described in the signatures.
When a node first comes online, it has no prior knowledge about the state of the network and therefore does not have a chosen parent. Therefore each node that comes online and joins the network will start off considering itself to be a root node.
When the first peer connection is made, the node should send its root announcement to the newly connected peer and wait for the remote peer to send their announcement back. One of two things will happen:
- Our key will be stronger than the remote side’s root key (in which case we will remain as a root node, they will have received good news about a new stronger root and will re-run the parent selection algorithm accordingly), or:
- The root announcement that they send us will contain a stronger root key than our own public key (which we will consider to be good news) which will cause us to run parent selection.
In a network which is entirely starting up from cold, or in the case of the previous root node disappearing/going offline, the process of agreeing on a new root is iterative. Nodes will run the parent selection algorithm, finding the strongest root key, notifying their peers, which may cause them to run the parent selection algorithm, announcing their strongest chosen root to their peers, and so on, until the network eventually settles on the strongest key.
When using tree routing to route towards a certain set of coordinates, a simple distance formula is used. To calculate the distance between coordinates A and B:
- Calculate the length of the common prefix of both sets of coordinates;
- Add the length of coordinates A to the length of coordinates B;
- Subtract the length of the common prefix multiplied by 2.
As an illustrative example, where A is [1 3 5 3 4]
and B is [1 3 5 7 6 1]
:
- Both A and B start with
[1 3 5]
, therefore the common prefix length is 3; - A has a length of 5, B has a length of 6;
- (5 + 6) - (3 ✕ 2) results in a total distance of 5 between A and B.
The next-hop selection for tree routing is as follows:
- Start with the distance from our own coordinates to the destination coordinates as the best distance, and the best candidate as an empty field;
- If the best distance is already 0 then the frame has arrived at its intended destination coordinates. Pass the frame to the local router and stop here, otherwise;
- Iterate through all directly connected peers, checking the following conditions in order:
- Skip the peer and move onto the next if:
- The peer has not sent us a root announcement yet;
- The peer is the peer that sent us the frame that we are trying to forward (we will never route the frame back to where it came from as this would create a routing loop);
- The Root public key of the peer’s last root announcement does not match that of our chosen parent’s last announcement;
- The Root sequence of the peer’s last root announcement does not match that of our chosen parent’s last announcement;
- Calculate the distance from the peer’s coordinates to the destination coordinates;
- If the calculated distance is less than the best distance, update the best distance and mark the peer as the best candidate;
- Skip the peer and move onto the next if the calculated distance is greater than the best distance;
- If the best candidate is not empty and the peer’s last root announcement was received sooner than that of the best candidate, mark the peer as the best candidate.
- Skip the peer and move onto the next if:
If the best candidate is still empty at the end of the algorithm, there is no suitable next-hop candidate that will take the frame closer to its destination. A node must never route a tree-routed packet to a peer that will take it further away from the destination.