
feat: republishing of user pkarr records #78

Open
SHAcollision opened this issue Feb 23, 2025 · 2 comments · May be fixed by #79
Labels
enhancement New feature or request pkarr Pkarr related issues

Comments

@SHAcollision
Contributor

Background: DHT Eviction and Need for Republishing

We have observed that the DHT evicts pkarr records after a period of inactivity, causing homeservers to become unresolvable unless their records are periodically republished. While the exact eviction timing is unclear, anecdotal evidence suggests that eviction typically occurs after around 4 days. This 4-day threshold, though somewhat arbitrary, forms the basis of our current republishing strategy. Of course, records will likely remain cached by relays long after being evicted, so republishing may matter less once there are several robust relays in the network.

Potential Approaches: Homeserver vs. Client-Side Republishing

1. Homeserver-Driven Republishing (Alternative Approach)

A natural approach would be for the homeserver itself to handle republishing. Since the homeserver is always online, it could maintain a copy of each user's pkarr record and ensure that these records are kept alive on the DHT. However, this approach faces significant challenges:

  • Risk of Rate Limiting/Blocking by DHT Peers:
    A homeserver hosting many users would need to republish frequently, potentially triggering rate limits or blocks from DHT peers.

  • Need for Activity Tracking:
    To avoid excessive republishing, a homeserver would need to implement smart logic, for example:

    • Republishing only for users who have been active recently.
    • Enforcing a minimum interval between republishes (e.g., at least 4 days) to avoid spamming the DHT.
  • Complexity and Overhead:
    Tracking user activity and implementing these smart republishing policies introduces additional complexity and state management on the homeserver side.

  • Redundancy with Client-Initiated Flows:
    If the publishing flow is already triggered by the user during signin, it raises the question of whether the homeserver should handle it at all.

While a homeserver should certainly republish its own pkarr record to remain discoverable, the above challenges make it less ideal for handling user record republishing.

2. Client-Side Republishing (Proposed Approach)

We propose shifting the responsibility for user record republishing to the client. This approach offers several advantages:

  • Distributed Publishing Load:
    Since republishing occurs on the client side during signin, the load is naturally distributed, avoiding the risk of a single homeserver being rate-limited.

  • Seamless User Experience:
    Republishing can occur transparently in the background when the user signs in, requiring no additional user intervention.

  • Minimized DHT Traffic:
    The client will use a conditional republishing strategy (IfOlderThan), ensuring republishing only occurs if:

    • The record is missing, or
    • The record is older than 4 days.
  • Simplicity for the Homeserver:
    The homeserver no longer needs to track user pkarr records or manage republishing logic, reducing operational complexity.
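As a minimal sketch, the IfOlderThan decision boils down to a simple check. The constant and function names below are assumptions for illustration, not the actual client API:

```rust
use std::time::{Duration, SystemTime, UNIX_EPOCH};

// The 4-day heuristic from this issue (an assumption, not a DHT guarantee).
const REPUBLISH_THRESHOLD: Duration = Duration::from_secs(4 * 24 * 60 * 60);

/// `record_timestamp` is the Unix timestamp (seconds) of the resolved pkarr
/// record, or `None` if no record could be resolved from the DHT.
fn should_republish(record_timestamp: Option<u64>, now: u64) -> bool {
    match record_timestamp {
        None => true, // record missing: always (re)publish
        Some(ts) => now.saturating_sub(ts) > REPUBLISH_THRESHOLD.as_secs(),
    }
}

fn main() {
    let now = SystemTime::now()
        .duration_since(UNIX_EPOCH)
        .unwrap()
        .as_secs();
    assert!(should_republish(None, now)); // missing record
    assert!(should_republish(Some(now - 5 * 24 * 60 * 60), now)); // 5 days old
    assert!(!should_republish(Some(now - 60 * 60), now)); // 1 hour old
    println!("ok");
}
```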

Proposed Implementation Details

  1. On Signup:

    • The signup() method publishes a new pkarr record immediately, ensuring instant discoverability for new users. It disregards any existing record, except for compare-and-swap (CAS) checks to guard against race conditions.
  2. On Signin (Background Republishing):

    • After successful signin (i.e., token recovery), the client spawns a background task that:
      • Resolves the most recent pkarr record.
      • Extracts the homeserver host via extract_host_from_record(...).
      • Calls publish_homeserver using the IfOlderThan strategy, only republishing if the record is missing or older than 4 days.
  3. Public Method for Explicit Republishing:

    • A public method republish_homeserver(keypair, host) is provided for key management applications.
    • This allows republishing without requiring a full signup, particularly useful if signin fails due to an unresolvable homeserver.
    • The same IfOlderThan strategy ensures that unnecessary DHT spam is avoided.
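The signin-time background flow described above can be sketched roughly as follows. Only `extract_host_from_record` is named in this issue; every other name, signature, and stub body here is a hypothetical stand-in, not the real client code:

```rust
use std::thread;

// Stub: resolve the most recent pkarr record for this key (canned value here).
fn resolve_most_recent_record(_pubkey: &str) -> Option<String> {
    Some("_pubky.example TXT host=homeserver.example".to_string())
}

// Stub: pull the homeserver host out of the record text.
fn extract_host_from_record(record: &str) -> Option<String> {
    record.split("host=").nth(1).map(|h| h.to_string())
}

// Stub: the real call would publish to the DHT only if the record is missing
// or older than `_max_age_secs` (the IfOlderThan strategy); here we just
// report what would happen.
fn publish_homeserver_if_older_than(_pubkey: &str, host: &str, _max_age_secs: u64) -> String {
    format!("republished for host {host}")
}

// Spawned after successful signin so it never blocks the signin flow.
fn background_republish(pubkey: String) -> thread::JoinHandle<Option<String>> {
    thread::spawn(move || {
        let record = resolve_most_recent_record(&pubkey)?; // 1. resolve
        let host = extract_host_from_record(&record)?;     // 2. extract host
        // 3. conditional publish (4-day threshold)
        Some(publish_homeserver_if_older_than(&pubkey, &host, 4 * 24 * 60 * 60))
    })
}

fn main() {
    let result = background_republish("pk:example".to_string()).join().unwrap();
    assert_eq!(result.as_deref(), Some("republished for host homeserver.example"));
    println!("{}", result.unwrap());
}
```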

Rationale for the Proposed Approach

  • Efficiency:
    By aligning republishing with the signin process, we avoid introducing additional network operations, leveraging a flow that users trigger naturally.

  • Reduced Risk of DHT Rate Limiting:
    Client-driven republishing avoids a central point of republishing, making it less likely for DHT peers to block or limit requests.

  • Simplicity and Maintainability:
    Homeservers remain focused on their primary responsibilities, without needing to track user activity or implement rate-limiting workarounds.

Cons of the Proposed Approach

While client-side republishing during signin offers several advantages, it also comes with certain drawbacks:

  • Potential Eviction of Long-Inactive Records:
    If a user does not sign in for an extended period, their pkarr record may be evicted from the DHT and become difficult for others to resolve—especially if no pkarr relay has cached it.

  • Impact on Low-Activity Use Cases:
    This limitation may not significantly affect applications where user activity is crucial anyway (e.g., private messaging, social networks). However, it could be problematic for use cases where users are expected to sign in once, publish content, and rarely return (e.g., publishing boards or archival applications). In such scenarios, content could become hard to find unless users actively refresh their records.

  • Need for Future Discussion:
    Addressing the discoverability of long-inactive records may require additional strategies. For example, scheduled republishing mechanisms or homeserver-assisted approaches might need to be revisited for these specific use cases.

Client-side republishing on signin, combined with an explicit republish command, by no means fully resolves the republishing challenge. However, it represents a significant step forward, making it easier for client users to keep their records active and discoverable on the DHT.

@SHAcollision SHAcollision added enhancement New feature or request pkarr Pkarr related issues labels Feb 23, 2025
@SHAcollision SHAcollision linked a pull request Feb 23, 2025 that will close this issue
@SeverinAlexB
Collaborator

SeverinAlexB commented Feb 24, 2025

I agree that we need to keep the load on the DHT low but let me do some math to show you the "load".

What Is Load?

What Is The Actual Rate Limit?

Each node in Mainline implements its own rate limiting. I previously looked at libtorrent's code and remember seeing an 8 kB/s read limit per IP. I am not 100% sure about the write limit, but for simplicity I assume it is the same. Considering that a pkarr packet is only ~1 kB, we could easily publish the packet every second.

Additionally, Mainline has around 10M nodes. In theory, we can read/write 8 kB/s to every node. So yes, there is a rate limit, and yes, we should keep the load low, but the actual limit is quite generous if the load is distributed evenly.

Let's Do Some Math

Mainline has ~10M nodes. Each public key is stored on 20 nodes. If the pubkeys are equally distributed, we can publish 10M/20 = 500k public keys every second from one IP address without running into any rate limiting and without overloading even a single node. This is obviously an idealized scenario, but it still shows what is possible.
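That 500k/s figure can be checked directly (idealized numbers from above, assumptions rather than measurements):

```rust
fn main() {
    let nodes: u64 = 10_000_000; // ~10M Mainline nodes
    let replicas: u64 = 20;      // each public key is stored on 20 nodes
    // With keys spread evenly, each publish round touches `replicas` distinct
    // nodes, so one IP can publish this many keys per second without hitting
    // any single node more than once per second:
    let keys_per_second = nodes / replicas;
    assert_eq!(keys_per_second, 500_000);
    println!("{keys_per_second}"); // prints 500000
}
```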

Assuming we republish each user's packet once per day, a single IP address could publish a lot of packets:

```rust
let seconds_per_day = 60 * 60 * 24; // 86'400
let possible_users = 500_000 * seconds_per_day; // 43'200'000'000
```

In this idealized scenario, we can support roughly 43 billion users with a single IP address. That's a lot.

Part 1 of 2

@SeverinAlexB
Collaborator

Proposal Response

Client-Side Republishing

I don't like the client-side republishing because of two reasons:

  • Clients are not always online and therefore can't reliably republish packets. Even when I don't log in to Twitter, I would still like others to resolve my username @SeverinAlexB correctly after 7 days. The same goes for my public key. This approach is therefore not reliable enough.
  • Web clients don't support UDP and therefore go through relays, so they don't actually publish packets; the relays do. That just moves the responsibility from the client to the relay, which creates the same IP rate-limiting bottleneck as on the homeserver.

Homeserver-Driven Republishing

You are right that this includes some complexity. I propose the following:

  • Track each user's activity as a last_seen timestamp. Keep republishing as long as last_seen is less than ~3 months old.
  • Republish the packet at a certain interval: download the most recent packet from the DHT and check whether the user still uses this homeserver. If not, skip; if yes, republish.

With this method, we only need to store the last_seen timestamp per user. We don't need to store any packets.

Risk: In case the packet somehow gets lost on the DHT, the homeserver might not be able to republish it. We should accept this risk initially to keep things simple. Additionally, the risk is mitigated by the fact that relays have a huge cache, so the chance that a packet goes missing is very small.

Part 2 of 2
