Skip to content
This repository has been archived by the owner on Sep 19, 2019. It is now read-only.

Add Chord implementation #4

Open
makkes opened this issue Oct 10, 2013 · 16 comments
Open

Add Chord implementation #4

makkes opened this issue Oct 10, 2013 · 16 comments
Assignees
Milestone

Comments

@makkes
Copy link

makkes commented Oct 10, 2013

Currently we only have a full mesh of all peers. We should implement a router that implements Chord.

makkes pushed a commit that referenced this issue Mar 18, 2014
* Added scaffolding code
chris-- added a commit that referenced this issue Mar 18, 2014
refs #4 Add Chord implementation
@jure
Copy link
Contributor

jure commented Jun 10, 2014

Have you seen: https://github.com/derekchiang/chord.js? Maybe it can help. And this too: https://github.com/optimizely/chord

@makkes
Copy link
Author

makkes commented Jun 11, 2014

Thanks, @jure for the hints!

@piranna
Copy link

piranna commented Jun 11, 2014

I have been reading the paper and Chord seems really interesting, I'm thinking about using it on my own P2P (http://webp2p.io), but don't know if the usage of PeerJS or UTP are mandatory, there should be a transport agnostid implementation... Anyone is interested to join forces on this task? :-)

@jure
Copy link
Contributor

jure commented Jun 11, 2014

I love your work (read all papers from past 2 years) and have been trying since yesterday to get Chord working in my fork https://github.com/jure/core/tree/chord

I've only had time to fix up some issues with 'require' and hashes, and today I'll try to implement put and get based on https://github.com/derekchiang/chord.js

There's a few things that aren't clear to me:

  1. When you join the network, do you get ownership of some keys and are the values that belong to those keys then supposed to automatically be transferred to you?
  2. Derek Chiang's Chord.js has replication built in, is this something you would want as well?
  3. Say there are 6 nodes, 0, 1, 2, 3, 4, 5. Ideally:

0 is connected to 1, 2, 4
1 is connected to 2, 3, 5
2 is connected to 3, 4, 6
3 is connected to 4, 5, 1
4 is connected to 5, 6, 2
5 is connected to 6, 1, 3

Is the above correct?

So if I'm node 0, and I find out that node 5 is in charge of the key-value I'm looking for, how do I get to it in the context of WebRTC? Do I send a message to 4, which sends a message to 5, which sends a message back to 4, which sends a message back to me (0)? Or does 5 establish a DataChannel with 0 directly via the "bootstrapping/signaling server"?

Does 5 need to talk to the bootstrapping/signaling server to connect to 0, or can SDP offer/answer happen through existing connections in the Chord.

Sorry for the many questions, turns out distributed systems are hard to think about intuitively.

@jure
Copy link
Contributor

jure commented Jun 11, 2014

I like the idea of joining forces, @piranna. Also, I like the approach taken in https://github.com/boplish/core/blob/master/js/application.js where all the networking/overlay/WebRTC stuff is abstracted away and you only have a nice simple API where you register callbacks for receiving data and have a simple way to send it where you want. I guess Chord will cause some changes in the sending part still, though.

Ideally, for my case, I'd like an API where I get or put a key in a DHT network, and the rest is done magically somewhere else (connections, WebRTC, replication, Bloom filters, etc.)

@piranna
Copy link

piranna commented Jun 11, 2014

I think we have the same point of view regarding to APIs design :-) Seems BOP-Core is similar to WebP2P, I'll take a look.

So if I'm node 0, and I find out that node 5 is in charge of the key-value I'm looking for, how do I get to it in the context of WebRTC? Do I send a message to 4, which sends a message to 5, which sends a message back to 4, which sends a message back to me (0)? Or does 5 establish a DataChannel with 0 directly via the "bootstrapping/signaling server"?

I think here the correct way is 4 working as relay (no signaling server), working as an intermediary peer sending the messages back and forth. A better approach is that it work as an intermediary signaling channel to help to create a direct DataChannel between 0 and 5 (this is how WebP2P works, by the way :-) ), and later remember this connection as a shortcut, but don't know if Chord is intended to work this way. Maybe would this be considered an extension for the Chord protocol?

@makkes
Copy link
Author

makkes commented Jun 11, 2014

We are currently in the middle of implementing DHT functionality (using Chord, since it's a rather simple protocol). Our impl. should be ready by end of this month (we try to update the 'chord' branch as often as possible).

@jure, to answer your questions:

  1. The idea stated in the original Chord paper is that the ID of every host is generated by hashing IP address and port. Currently we just generate it with Math.random. And yes, the values have to be transferred to my host during the joining procedure.
  2. Yes, replication is kind of a must have, esp. with high churn.
  3. All our communication takes place under the paradigm of 'server-less communication'. We want to be independent of a single entity as much as possible. So as soon as you have joined the DHT, the connection management is conducted completely via the DHT layer without a single server brokering connections. This is the reason why our implementation will be much more complicated than others' (that just ask a server to broker the connection).

@jure
Copy link
Contributor

jure commented Jun 11, 2014

Thanks for the answers, @makkes!

Before I start working on this this afternoon, do I understand correctly that the code in the chord branch is not updated right now and there's more code locally that you haven't pushed yet?

My use case is distributed keyword search, so each node would be in charge of a certain number of keywords. So when a new node joins, it would take ownership of closest keywords and replicate them locally, correct? As in, it would actually go to the node which had ownership before and copy over the data from previous owner node to itself.

I like the fact that you involve the server as little as possible, because what I have in mind will likely have no funding and the only potentially expensive variable right now is the server.

Anyway, this is very interesting stuff. Keep up the good work and let me know if I can help! I honestly believe that with WebRTC and a smart overlay network, things can get super crazy.

@makkes
Copy link
Author

makkes commented Jun 11, 2014

Before I start working on this this afternoon, do I understand correctly that the code in the chord branch is not updated right now and there's more code locally that you haven't pushed yet?

Yes, I'll push an update ASAP.

My use case is distributed keyword search, so each node would be in charge of a certain number of keywords. So when a new node joins, it would take ownership of closest keywords and replicate them locally, correct? As in, it would actually go to the node which had ownership before and copy over the data from previous owner node to itself.

Correct. That procedure is precisely described in the original Chord paper: http://pdos.csail.mit.edu/papers/chord:sigcomm01/chord_sigcomm.pdf

Anyway, this is very interesting stuff. Keep up the good work and let me know if I can help! I honestly believe that with WebRTC and a smart overlay network, things can get super crazy.

Totally! If you would like to have a look at some use cases we have in mind take a look at our most recent paper (http://inet.cpt.haw-hamburg.de/publications/wvs-lobsb-14.html) that we're presenting next month in Madrid.

@piranna
Copy link

piranna commented Jun 11, 2014

Totally! If you would like to have a look at some use cases we have in mind take a look at our most recent paper (http://inet.cpt.haw-hamburg.de/publications/wvs-lobsb-14.html) that we're presenting next month in Madrid.

Madrid? Spain?? I'm from here! :-D When do you come? :-)

@chris--
Copy link
Member

chris-- commented Jun 11, 2014

Madrid? Spain?? I'm from here! :-D Where do you come? :-)

We are headed to the ICDCS'14 from June 30th to July 5th. Maybe we can meet up somewhere if you are around ;)

@jure
Copy link
Contributor

jure commented Jun 11, 2014

I've actually already read this paper. A lot of your stuff comes up on Google Scholar, with regards to WebRTC, DHT, P2P, etc. :)

@piranna
Copy link

piranna commented Jun 11, 2014

Madrid? Spain?? I'm from here! :-D Where do you come? :-)

We are headed to the ICDCS'14 http://lsd.ls.fi.upm.es/icdcs2014 from
June 30th to July 5th. Maybe we can meet up somewhere if you are around ;)

D'oh! Exams and work just that week :-/ If so, it could only be just some
beers in the afternoon in the second half of the week :-)


Reply to this email directly or view it on GitHub
#4 (comment).

"Si quieres viajar alrededor del mundo y ser invitado a hablar en un monton
de sitios diferentes, simplemente escribe un sistema operativo Unix."
– Linus Tordvals, creador del sistema operativo Linux

@jure
Copy link
Contributor

jure commented Jun 11, 2014

A better approach is that it work as an intermediary signaling channel to help to create a direct DataChannel between 0 and 5 (this is how WebP2P works, by the way :-) ), and later remember this connection as a shortcut, but don't know if Chord is intended to work this way. Maybe would this be considered an extension for the Chord protocol?

I don't think this scales very well. For every node that you contact, you would have to maintain a WebRTC connection to it. Given some hard connection limits (i.e. Chrome 256, Firefox can't find right now but remember reading 51 somewhere), it makes sense to try and maintain the minimum possible.

I can see the potential performance benefits by cutting out as many as log N middlemen, where N is the number of nodes, but it's possible these benefits would drown in the sea of general slowdowns that occur as the network is trying to maintain many many more connections.

@jure
Copy link
Contributor

jure commented Jun 11, 2014

Before I start working on this this afternoon, do I understand correctly that the code in the chord branch is not updated right now and there's more code locally that you haven't pushed yet?

Yes, I'll push an update ASAP.

That would be awesome! I'm trying to come up with an MVP for my project, so I really don't mind if it's not perfect. I.e. things like replication, serverless WebRTC connection brokering, etc, are not really crucial. It would just be great to not start working on a month old branch, if there are updates :)

@piranna
Copy link

piranna commented Jun 11, 2014

A better approach is that it work as an intermediary signaling channel to help to create a direct DataChannel between 0 and 5 (this is how WebP2P works, by the way :-) ), and later remember this connection as a shortcut, but don't know if Chord is intended to work this way. Maybe would this be considered an extension for the Chord protocol?

I don't think this scales very well. For every node that you contact, you would have to maintain a WebRTC connection to it. Given some hard connection limits (i.e. Chrome 256, Firefox can't find right now but remember reading 51 somewhere), it makes sense to try and maintain the minimum possible.

It's not about having a connection with all the peers you have seen in
the past, it could be limited by using a least-recently-used cache
removing the old connections. Also, this cache could be independent of
the list of peers of Chord, so it doesn't get affected.

I can see the potential performance benefits by cutting out as many as log N middlemen, where N is the number of nodes, but it's possible these benefits would drown in the sea of general slowdowns that occur as the network is trying to maintain many many more connections.

Knowing what the limits are, I think we could take them in account.
For 1000000 peers, it would means a list of 1000 connections per peer.
Maybe with the limits of Chrome and Firefox this would not be
feasable, but also band-width is not wasted at the cost of more leaps
(O > log n, but still O < n). I don't think it's absolutely bad...

"Si quieres viajar alrededor del mundo y ser invitado a hablar en un
monton de sitios diferentes, simplemente escribe un sistema operativo
Unix."
– Linus Tordvals, creador del sistema operativo Linux

makkes pushed a commit that referenced this issue Sep 12, 2014
* First working implementation
* Currently uses only successor list (So complexity is O(n))
@makkes makkes self-assigned this Sep 12, 2014
makkes pushed a commit that referenced this issue Sep 15, 2014
* Started integrating Chord into the main application flow of BOPlishClient
* The fallback WebSocket channel is now passed to the Chord instance
* The 'local node' instance transparently uses the WebSocket channel for
  communication
* Added 'route' method to Chord

Other fixes:

* Fixed 'setRemoteDescription' calls which at least in Firefox need a callback
  function as parameter to succeed.
* Reactivated tests
makkes pushed a commit that referenced this issue Sep 15, 2014
* Fixed message routing
* Added error message type
* Removed unused 'to' parameter from Chord messages
makkes pushed a commit that referenced this issue Sep 16, 2014
* Implemented 'registerDeliveryCallback' and protocol type handling
makkes pushed a commit that referenced this issue Sep 16, 2014
* Fixed ChordNode#put so that you can just pass a JSON object now
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

No branches or pull requests

4 participants