Plausible Deniability in the BitTorrent protocol #121

Open
x9p0 opened this issue Jun 4, 2021 · 10 comments


x9p0 commented Jun 4, 2021

Goal: add plausible deniability to the BitTorrent protocol, shielding users from prosecution.

Proposal:

Add PD_BIT (a bit) and PD_NUMBER (an integer giving the number of random torrents).

Client side can set the "PD_BIT" bit to use the "Plausible Deniability feature", which consumes more data but adds safety.

If this bit is set, the client interacts with a number of other random torrents from the DHT but never saves content from these random interactions to disk.

Optionally, the client can set the minimum / maximum number of interactions on its side (PD_NUMBER); the user chooses depending on available bandwidth.

Contributor

the8472 commented Jun 4, 2021

Ignoring the merit of the proposal, this sounds like a feature request that should be pitched to client developers (or you could implement it yourself); it doesn't need a BitTorrent protocol extension.

Author

x9p0 commented Jun 4, 2021

> Ignoring the merit of the proposal, this sounds like a feature request that should be pitched to client developers (or you could implement it yourself); it doesn't need a BitTorrent protocol extension.

Thank you for replying.
The reason for proposing it as a BEP is to make the whole BitTorrent ecosystem immune to legal snooping and lawsuits.
Clients not supporting the PD extension would still get PD traffic based on the default settings of trackers / clients, making legal action against users of the protocol practically impossible, or easy to overturn in court.

Contributor

the8472 commented Jun 4, 2021

Nevertheless, it would make sense to prototype this in a client first, as it does not require any protocol changes. An informational BEP could be published later.

Contributor

gubatron commented Jun 5, 2021

@x9p0 have you run any simulations/numbers to measure the effect something like this would have on overall DHT performance?

What would be a sweet/healthy spot for a global PD_NUMBER if all clients were to adopt this?

This seems like something I would not let the user decide on. I can imagine some sort of attack where you set this number really high on lots of clients and DDoS the DHT.

This certainly seems like something we'd implement without the need for an extension, and we'd have mechanisms to tune our client's PD_NUMBER via a signed remote config message.

Author

x9p0 commented Jun 5, 2021

@the8472 I agree with the idea of a prototype.

@gubatron I have not. Will dig into some headers and source code to look for a sweet spot for PD_BIT / PD_NUMBER adoption.

Agreed, the feature can be abused if users can set PD_NUMBER too high. The protocol definition can impose a maximum value for PD_NUMBER. I would start with a low PD_NUMBER, measure the effect on the DHT, and raise it slowly after tests / adoption.
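
A minimal sketch of such a cap, assuming the client clamps whatever the user requests before acting on it. `PD_NUMBER_MAX` and `effective_pd_number` are hypothetical names, and the value 8 is a placeholder; nothing in this thread fixes an actual limit.

```python
# Hypothetical protocol-wide cap; the real value would come from measurements.
PD_NUMBER_MAX = 8


def effective_pd_number(pd_bit: bool, requested: int, cap: int = PD_NUMBER_MAX) -> int:
    """Return how many random torrents the client may interact with."""
    if not pd_bit:
        # Feature disabled: no decoy interactions at all.
        return 0
    # Clamp the user's wish into the range [0, cap].
    return max(0, min(requested, cap))
```

With this shape, a user asking for 1000 interactions still ends up with at most `cap`, which limits how much load a single misconfigured client can put on the DHT.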

Agreed: if there is a good place to put the feature, it can be done without an extension.

I will look at some code over the next few days and report back here.

Contributor

gubatron commented Jun 6, 2021

@x9p0 does this proposal include keeping copies of the .torrents (just the metadata, not content) so that these clients can also help magnet queries? On one hand this would probably help magnet queries greatly, however, I believe it's a legal gray area if you're found hosting an illegal .torrent (just the metadata, piece hashes, etc, not the content), say, you're all of a sudden hosting metadata for child porn.

Author

x9p0 commented Jun 8, 2021

@gubatron Indeed it is a gray area. That's the idea of plausible deniability. Nothing is ever saved on disk except what the user is willfully downloading, so forensic examiners will have nothing to find on the hardware (if seized).

The BitTorrent ecosystem is already well established, so after adoption legal teams will have so many false positives, and so much trouble suing and losing, that it will be better for them to give up.

First idea, maintaining backward compatibility and requiring minimal changes:

1 - Client queries the DHT with an all-zero info_hash

{"t":"bb",
"y":"q",
"q":"get_peers",
"a":{"id":"abcdefghij0123456789",
"info_hash":"000000000000 .. 20 bytes ALL ZERO torrent info hash"}
}

2 - DHT server answers with a random info_hash

3 - Client fetches only one piece of this random info_hash, held in memory only and limited to 500Kb

4 - Client announces this info_hash into the DHT for 30 minutes max

5 - Client serves the fetched piece

6 - Client gets second piece of random info_hash, starts serving

7 - Client discards first piece from memory

8 - Client rotates pieces every 30 minutes
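
Steps 3 to 8 could be sketched roughly as below. This is only an illustration of the in-memory rotation, not client code: `DecoyPieceCache` is a hypothetical name, "500Kb" is read here as 500 kilobytes (an assumption), and the DHT lookup and the actual piece transfer are left out.

```python
import time
from dataclasses import dataclass
from typing import Callable, Optional

MAX_PIECE_BYTES = 500 * 1000   # assumption: "500Kb" means 500 kilobytes
ROTATE_SECONDS = 30 * 60       # 30-minute window from steps 4 and 8


@dataclass
class DecoyPieceCache:
    """Holds at most one decoy piece, in memory only."""
    clock: Callable[[], float] = time.monotonic
    info_hash: Optional[bytes] = None
    piece: Optional[bytes] = None
    loaded_at: float = 0.0

    def store(self, info_hash: bytes, piece: bytes) -> None:
        if len(info_hash) != 20:
            raise ValueError("info_hash must be 20 bytes")
        if len(piece) > MAX_PIECE_BYTES:
            raise ValueError("piece exceeds the in-memory limit")
        # Overwriting the reference drops the previous piece (step 7).
        self.info_hash, self.piece = info_hash, piece
        self.loaded_at = self.clock()

    def needs_rotation(self) -> bool:
        # True when nothing is cached or the 30-minute window has elapsed.
        return self.piece is None or self.clock() - self.loaded_at >= ROTATE_SECONDS

    def serve(self) -> Optional[bytes]:
        # Serve only while the current piece is still inside its window.
        return None if self.needs_rotation() else self.piece
```

Nothing here touches the disk; the piece lives in a single attribute and disappears when it is replaced or the process exits.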

Sounds nice? Or is another approach better?

Contributor

gubatron commented Jun 8, 2021

Here's a similar approach:

  1. Clients implementing PD announce themselves at a canonical hash address: sha1("BTPD")
  2. They perform a get_peers lookup for that BTPD hash
  3. Clients that now have a list of BTPD-compatible peers can send them requests every M minutes.
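
As a rough sketch of steps 1 and 2, the canonical key is just the SHA-1 digest of the ASCII string "BTPD", used as an ordinary 20-byte info_hash in a standard `get_peers` query. The small `bencode` helper and the `get_peers_query` function are illustrative names written for this example, not part of any client.

```python
import hashlib

# Canonical rendezvous key: 20-byte SHA-1 digest of the ASCII string "BTPD".
BTPD_KEY = hashlib.sha1(b"BTPD").digest()


def bencode(value) -> bytes:
    """Minimal bencoder for ints, bytes/str, lists and dicts."""
    if isinstance(value, int):
        return b"i%de" % value
    if isinstance(value, str):
        value = value.encode("utf-8")
    if isinstance(value, bytes):
        return b"%d:%s" % (len(value), value)
    if isinstance(value, list):
        return b"l" + b"".join(bencode(v) for v in value) + b"e"
    if isinstance(value, dict):
        # Bencoded dictionaries must have their keys sorted as raw bytes.
        items = sorted(
            (k.encode("utf-8") if isinstance(k, str) else k, v)
            for k, v in value.items()
        )
        return b"d" + b"".join(bencode(k) + bencode(v) for k, v in items) + b"e"
    raise TypeError(f"cannot bencode {type(value).__name__}")


def get_peers_query(node_id: bytes, txn_id: bytes = b"aa") -> bytes:
    """Build a DHT get_peers query that looks up the BTPD key."""
    return bencode({
        "t": txn_id,
        "y": "q",
        "q": "get_peers",
        "a": {"id": node_id, "info_hash": BTPD_KEY},
    })
```

Announcing at the same key would use the regular `announce_peer` query with the token returned by `get_peers`, so no new message types are needed for discovery.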

The goal of the request is to obtain a list of N random .torrent info_hashes, which they can then request from the DHT.
I believe such a request should include the following:
-> A bloom filter that keeps track of which info_hashes the requester already knows, so the responder can avoid sending back an info_hash they both know.
-> The max size the requester is willing to accept for an info_hash.
-> N, the number of info_hashes to receive in the response
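
A minimal sketch of how the bloom filter and the two limits might fit together on the responder's side. The filter parameters (8192 bits, 4 hash functions), the class, and `select_infohashes` are all assumptions made for illustration; no wire format is defined in this thread.

```python
import hashlib
from typing import Iterable, Iterator, List, Tuple


class BloomFilter:
    """Small bloom filter over byte strings (parameters are illustrative)."""

    def __init__(self, size_bits: int = 8192, num_hashes: int = 4) -> None:
        self.size_bits = size_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((size_bits + 7) // 8)

    def _positions(self, item: bytes) -> Iterator[int]:
        # Derive num_hashes bit positions by salting SHA-1 with an index byte.
        for i in range(self.num_hashes):
            digest = hashlib.sha1(bytes([i]) + item).digest()
            yield int.from_bytes(digest[:8], "big") % self.size_bits

    def add(self, item: bytes) -> None:
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, item: bytes) -> bool:
        return all(self.bits[pos // 8] & (1 << (pos % 8)) for pos in self._positions(item))


def select_infohashes(
    candidates: Iterable[Tuple[bytes, int]],  # (info_hash, total size in bytes)
    known: BloomFilter,
    max_size: int,
    n: int,
) -> List[bytes]:
    """Responder side: pick up to n info_hashes the requester probably lacks."""
    picked: List[bytes] = []
    for info_hash, size in candidates:
        if len(picked) >= n:
            break
        if info_hash not in known and size <= max_size:
            picked.append(info_hash)
    return picked
```

A bloom filter can report false positives but never false negatives, so the responder may occasionally skip an info_hash the requester does not have, but it will not resend one the requester added to the filter.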

The protocol should maybe have a setting for the max size of the response; an error message could be sent back if the requester asks for more than the responder is willing to send.

This way, if the answering peer sends back an info_hash that a) I already knew or b) is bigger than I requested, I can blacklist the peer that responded and not ever talk to it again with respect to BTPD random infohash requests.
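
The requester-side check could look something like the sketch below. It assumes the response carries a size alongside each info_hash (in practice the size may only be known after fetching the metadata), and `accept_response` is a hypothetical helper.

```python
from typing import Iterable, List, Set, Tuple


def accept_response(
    peer: str,
    offered: Iterable[Tuple[bytes, int]],  # (info_hash, size in bytes)
    known: Set[bytes],
    max_size: int,
    blacklist: Set[str],
) -> List[bytes]:
    """Return the offered info_hashes, or blacklist the peer and return []."""
    offered = list(offered)
    for info_hash, size in offered:
        # Rule (a): already known to us. Rule (b): larger than we asked for.
        if info_hash in known or size > max_size:
            blacklist.add(peer)
            return []
    return [info_hash for info_hash, _ in offered]
```

Because the requester checks against its exact set of known info_hashes rather than the lossy filter, a well-behaved responder that honored the filter is never penalized.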

Infohashes are downloaded, verified and kept in memory to further help the network.

Since all clients announce themselves on a canonical key, perhaps there is room for future optimizations to avoid spam and blacklist bad actors.

I like this idea because it's sort of a prototype for creating a distributed search index. If the BTPD-compatible peers keep about 10Mb worth of info_hashes each, and you have tens of thousands of these, you could then potentially implement a very powerful distributed search and start doing without torrent search websites.

Author

x9p0 commented Jun 9, 2021

@gubatron Agree with the specification. BTPD is also a nice name for the feature. I never thought of it as a distributed search index, but indeed, this implementation can become one, depending on how widespread adoption is.

Standards to agree on before writing final code:

For the bloom filter, I think CRC-40 could do the trick and still be very conservative with bytes used. CRC-32 would be too small.

N random info_hashes can depend on the memory available in the client, with an optional minimum (a client without the minimum available memory does not participate in BTPD) and a maximum threshold to avoid attacks.

The M minutes between requests to compatible BTPD peers can be a fixed base plus a random offset to avoid spikes in the network. I think base_minutes + rand(seed, max) would do the trick.
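
A small sketch of that interval, assuming a uniform random offset between 0 and a maximum. `next_request_delay` and its parameters are illustrative names; seeding the generator is shown only to make the example reproducible.

```python
import random


def next_request_delay(base_minutes: float, max_jitter_minutes: float,
                       rng: random.Random) -> float:
    """Minutes to wait before the next BTPD request: fixed base plus jitter."""
    return base_minutes + rng.uniform(0.0, max_jitter_minutes)


rng = random.Random(42)               # per-client seed; any seed works
delay = next_request_delay(10, 5, rng)  # somewhere between 10 and 15 minutes
```

Each client drawing its own offset spreads requests over the jitter window instead of having them all fire at the same multiple of M.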

The blacklist comes in handy, since that function is already implemented. Nice.

10Mb is way more than I originally had in mind for the feature, but the scope has been extended. By making it dependent on the memory available on each peer, we stay flexible and allow everyone to participate, even the smallest VPS / containers.

Contributor

the8472 commented Jun 10, 2021

If you want random infohashes you can already use BEP 51.
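
For reference, BEP 51 defines a `sample_infohashes` DHT query that takes `id` and `target` arguments and returns, among other fields, a `samples` string of concatenated 20-byte info_hashes. A minimal sketch of building the query and splitting the samples follows; the function names are made up for this example, and sending the packet and decoding the rest of the reply are left out.

```python
from typing import List


def sample_infohashes_query(node_id: bytes, target: bytes, txn_id: bytes = b"aa") -> bytes:
    """Hand-assembled bencoded KRPC query for BEP 51 sample_infohashes."""
    if len(node_id) != 20 or len(target) != 20:
        raise ValueError("node_id and target must be 20 bytes each")
    return (
        b"d1:ad2:id20:" + node_id
        + b"6:target20:" + target
        + b"e1:q17:sample_infohashes1:t%d:" % len(txn_id) + txn_id
        + b"1:y1:qe"
    )


def split_samples(samples: bytes) -> List[bytes]:
    """Split the 'samples' field of a reply into individual info_hashes."""
    if len(samples) % 20 != 0:
        raise ValueError("samples length must be a multiple of 20")
    return [samples[i:i + 20] for i in range(0, len(samples), 20)]
```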
