adding ipfs keydensity plot
guillaumemichel committed Aug 30, 2024
1 parent dc16549 commit c24719b
Showing 2 changed files with 23 additions and 20 deletions.
20 changes: 0 additions & 20 deletions content.en/avail/dht/2024-32-report.md
@@ -72,26 +72,6 @@ This plot illustrates the evolving count of nodes supporting each of the listed

{{< plotly json="../../../plots/2024/08/05/avail-crawl-errors-single.json" height="600px" >}}

## Keyspace Density Monitoring

{{< hint info >}}
💡 The latest keyspace density data is available [here](../../keydensity/).
{{< /hint >}}

In Kademlia, every object indexed by the DHT requires a binary identifier. In the libp2p DHT implementation, peers are identified by the digest of `sha256(peer_id)` and CIDs are identified by the digest of `sha256(cid)`. This Kademlia identifier determines the location of an object within the Kademlia XOR keyspace.

The following plots examine the peer distribution within the keyspace, aiding in the identification of potential [Sybil](https://en.wikipedia.org/wiki/Sybil_attack) and eclipse attacks.

### Keyspace population distribution

**Description:** The plot shows, for a given network crawl, the number of peers whose Kademlia identifier matches each prefix of a given size. Since the density of keyspace regions follows a [Poisson](https://en.wikipedia.org/wiki/Poisson_distribution) distribution, some regions are expected to be significantly more populated than others.

**How to read the plot:** The selected `depth` indicates the prefix size. There are `2^i` distinct prefixes at depth `i`. The slider at the bottom of the plot enables visualization of the population evolution over time across multiple crawls.

**What to look out for:** The red dashed line represents the expected density per region, that is, the average number of peers per prefix (the total number of peers in the crawl divided by the `2^i` prefixes at the selected depth). A bar more than twice as high as the expected density suggests that the corresponding region of the keyspace might be under an eclipse attack.

{{< plotly json="../../../plots/2024/08/05/avail-regions-population.json" height="600px" >}}

## Stale Node Records

### All Peers
23 changes: 23 additions & 0 deletions content.en/ipfs/amino/_index.md
@@ -134,5 +134,28 @@ In the following we show the change in distribution of the nine most recent rele

{{< plotly json="../../plots/latest/recent-kubo-versions-over-time.json" height="600px" >}}

## DHT Key Density Monitoring

In Kademlia, every object indexed by the DHT requires a binary identifier. In the libp2p DHT implementation, peers are identified by the digest of `sha256(peer_id)` and CIDs are identified by the digest of `sha256(cid)`. This Kademlia identifier determines the location of an object within the Kademlia XOR keyspace.
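As a minimal sketch of the identifier scheme described above, the following Python snippet derives a Kademlia identifier with SHA-256, computes the XOR distance between two identifiers, and extracts a bit prefix. The function names and the placeholder byte strings are assumptions made for illustration; they are not real peer IDs and are not taken from the libp2p codebase.

```python
import hashlib


def kad_id(raw: bytes) -> bytes:
    """Return the 256-bit Kademlia identifier: the SHA-256 digest of the raw bytes."""
    return hashlib.sha256(raw).digest()


def xor_distance(a: bytes, b: bytes) -> int:
    """XOR distance between two identifiers, interpreted as big-endian integers."""
    return int.from_bytes(a, "big") ^ int.from_bytes(b, "big")


def prefix_bits(key: bytes, depth: int) -> str:
    """Return the first `depth` bits of an identifier as a string of '0' and '1'."""
    bits = bin(int.from_bytes(key, "big"))[2:].zfill(len(key) * 8)
    return bits[:depth]


if __name__ == "__main__":
    # Placeholder inputs (hypothetical): real inputs would be the binary
    # encodings of a peer ID and a CID.
    peer_key = kad_id(b"example-peer-id")
    cid_key = kad_id(b"example-cid")
    print("peer prefix at depth 8:", prefix_bits(peer_key, 8))
    print("distance:", hex(xor_distance(peer_key, cid_key)))
```

Two identifiers that share a long common prefix have a small XOR distance, which is why counting peers per prefix is a natural way to measure keyspace density.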

The following plots examine the peer distribution within the keyspace, aiding in the identification of potential [Sybil](https://en.wikipedia.org/wiki/Sybil_attack) and eclipse attacks.

### Keyspace population distribution

**Description:** The plot shows, for a given network crawl, the number of peers whose Kademlia identifier matches each prefix of a given size. Since the density of keyspace regions follows a [Poisson](https://en.wikipedia.org/wiki/Poisson_distribution) distribution, some regions are expected to be significantly more populated than others.

**How to read the plot:** The selected `depth` indicates the prefix size. There are `2^i` distinct prefixes at depth `i`. The slider at the bottom of the plot enables visualization of the population evolution over time across multiple crawls.

**What to look out for:** The red dashed line represents the expected density per region, that is, the average number of peers per prefix (the total number of peers in the crawl divided by the `2^i` prefixes at the selected depth). A bar more than twice as high as the expected density suggests that the corresponding region of the keyspace might be under an eclipse attack.

{{< plotly json="../../plots/latest/ipfs-regions-population.json" height="600px" >}}
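The population count behind this kind of plot can be sketched in a few lines of Python. This is an illustrative sketch only, not the code that produces the plot: the function names, the synthetic peer set, and the default factor of 2 (taken from the "more than twice" rule of thumb in the text) are assumptions.

```python
import hashlib
from collections import Counter


def region_populations(kad_ids, depth):
    """Count identifiers per prefix; returns a list of length 2**depth."""
    counts = Counter()
    for key in kad_ids:
        value = int.from_bytes(key, "big")
        # Keep only the top `depth` bits of the identifier.
        prefix = value >> (len(key) * 8 - depth)
        counts[prefix] += 1
    return [counts.get(p, 0) for p in range(2 ** depth)]


def expected_density(total_peers, depth):
    """Average number of peers per prefix if identifiers are spread uniformly."""
    return total_peers / 2 ** depth


def dense_regions(populations, threshold_factor=2.0):
    """Indices of regions whose population exceeds threshold_factor times the mean."""
    expected = sum(populations) / len(populations)
    return [p for p, n in enumerate(populations) if n > threshold_factor * expected]


if __name__ == "__main__":
    # Synthetic identifiers standing in for the peers seen in one crawl.
    peers = [hashlib.sha256(f"peer-{i}".encode()).digest() for i in range(1000)]
    depth = 4
    populations = region_populations(peers, depth)
    print("expected per region:", expected_density(len(peers), depth))
    print("regions above 2x expected:", dense_regions(populations))
```

Each entry of `populations` corresponds to one bar in the plot, and `expected_density` corresponds to the red dashed line.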

### Keyspace density distribution

**Description:** As previously mentioned, the keyspace population follows a [Poisson](https://en.wikipedia.org/wiki/Poisson_distribution) distribution, which can make identifying outliers challenging. The plot below counts the number of regions for each population size and facilitates the identification and analysis of outliers. While it is normal for some regions to have populations above the average, the plot enables us to quantify these deviations.

**How to read the plot:** The red dashed line represents the expected number of regions for each population size. Note that the Poisson distribution is more evident at greater depths (longer prefixes), while lower depths provide limited insight. It is recommended to read the plot for depths between 9 and 13.

**What to look out for:** If a bar significantly exceeds its expected value on the right side of the plot, or if an isolated bar appears on the far right, it may indicate a potential eclipse attack, warranting further investigation.

{{< plotly json="../../plots/latest/ipfs-density-distributions.json" height="600px" >}}
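The comparison between observed and expected region counts can be sketched as follows, using the Poisson model the text describes with the mean population per region as its rate. This is a hedged illustration, not the code behind the plot; the function names are assumptions, and the input is the list of per-region populations at a chosen depth.

```python
import math
from collections import Counter


def poisson_pmf(k, lam):
    """Poisson probability of observing k events at rate lam (log-space for stability)."""
    if lam == 0:
        return 1.0 if k == 0 else 0.0
    return math.exp(k * math.log(lam) - lam - math.lgamma(k + 1))


def regions_per_population(populations):
    """Map each population size to the number of regions that have exactly that size."""
    return Counter(populations)


def expected_regions(populations):
    """Expected number of regions per population size under the Poisson model."""
    n_regions = len(populations)
    lam = sum(populations) / n_regions  # mean peers per region
    return {k: n_regions * poisson_pmf(k, lam) for k in range(max(populations) + 1)}


if __name__ == "__main__":
    # Toy per-region populations (hypothetical), with one unusually dense region.
    populations = [2, 3, 1, 4, 2, 3, 2, 12]
    observed = regions_per_population(populations)
    expected = expected_regions(populations)
    for size in sorted(observed):
        print(size, observed[size], round(expected[size], 3))
```

In the toy input, the region with population 12 yields an isolated observed bar on the far right where the expected count is close to zero, which is the pattern the text suggests investigating.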
