Weeb-3 - A Swarm client for browsers

This project is a work-in-progress Swarm client implementation that relies solely on browser-side technologies. It uses wasm-pack to build the project for use in the browser.

Building the code

Ensure you have wasm-pack, protoc, and clang installed.

Build the client library:

wasm-pack build --target web --out-dir static

Start the local server to serve html, js and wasm files:

cargo run

Note that this server uses an insecure self-signed certificate to provide HTTPS, which is not sufficient to enable service workers in Chrome and other browsers. This is enough to display single files from swarm, but displaying websites requires a service worker, which in turn requires a certificate the browser trusts. You can get a trusted certificate from, for example, GitHub Pages: fork the repository, set GitHub Pages to serve from the 'docs' folder, and copy your latest version of the files from the static folder to the docs folder.

Open the URL (https://localhost:8080, or https://lat-murmeldjur.github.io/weeb-3 for the GitHub Pages hosted version).

[Notes]

How it works (architectural overview)

The weeb-3 client consists of several logical components:

The web interface (implemented by src/interface.rs and static/index.html), which communicates with the shared web worker and loads the service worker
The libp2p/swarm client running in a shared web worker (with main entry points in src/lib.rs & static/worker.js)
A service worker that enables hot-loading assets on relative paths for websites loaded from swarm (found in static/service.js)

Below is a piece-by-piece overview of the current logic of the components.

The interface

As of "commit number 114" the instantiator of all components is the index.html that starts the interface (by calling the function "interweeb" from src/interface.rs).

The interweeb function has the following roles (in order of appearing in the code):

Starting the libp2p/swarm client in a shared worker (if not already running)
Gaining a handle for the message port of the shared worker, to be able to make requests towards it
Setting a listener on the navigation text input field, triggering requests to the shared worker on content change of the navigation text input field
Listening to resources sent back from the shared worker, and displaying them if possible, otherwise displaying service worker unavailability warnings or content not found results

The shared worker

Shared web workers have no access to the window object and some associated functions, as they exist as a common resource for multiple tabs of the same origin. This design choice allows one swarm client to serve multiple open tabs, but at the cost of not being able to open WebRTC connections. Shared web workers can still make WebSocket connections, however, so the client rides on the websocket transport layer of libp2p.

The high level architecture of the client resides in src/lib.rs, which (in order of appearing in the code) implements the following functions:

Defining and importing the swarm protocols generated by the protoc compiler
Defining the client (as the class Sekirei) and its in-memory registry of peers and their accounting (as the struct Wings)
Defining the 3 main functions of the client, namely
1. Instantiation (the "new" function), that starts the libp2p client
2. A function that continues running the client, asynchronously maintains/establishes connections, serves requests from the interface, and engages in protocols with the swarm (the "run" function)
3. A function that enables using the running client as a receiver to make requests towards it (the function "acquire")

The actual shared web worker is the javascript file that uses these functions (static/worker.js), which starts up the client, calls its maintained run function, and listens to the shared worker message port for requests from the interface, triggering "acquire" calls on the running client and relaying their respective results back to the interface.

In slightly more detail, the new function does the following (in order of appearing in the code):

Randomises a new secret keypair
Starts a libp2p client with a number of libp2p protocols enabled (autonat, dcutr, identify, ping, stream) using websocket transport
Creates a registry of peers (connected_peers, overlay_peers) and peer accounting (accounting_peers, ongoing_refreshments)
Creates a message port (to be listened to by the client and to be used by the acquire function). This message port can receive the writing end of a channel of bytes along with an address, so that the results of looking up the address can be written back to the channel received.
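
The request port described in the last step can be sketched as follows. This is a minimal illustration only, not the code in src/lib.rs: the names Request, Client and serve_pending are hypothetical, and std::sync::mpsc is used purely to keep the example self-contained, whereas the real client runs in wasm and presumably uses async channels.

```rust
use std::sync::mpsc::{channel, Receiver, Sender};

/// Hypothetical request type: an address plus the writing end of a byte
/// channel on which the lookup result should be sent back.
pub struct Request {
    pub address: String,
    pub reply: Sender<Vec<u8>>,
}

/// Hypothetical stand-in for the client: it owns the receiving end of the
/// request port and keeps the sending end for use by `acquire`.
pub struct Client {
    port_tx: Sender<Request>,
    port_rx: Receiver<Request>,
}

impl Client {
    pub fn new() -> Self {
        let (port_tx, port_rx) = channel();
        Client { port_tx, port_rx }
    }

    /// Requester side: send the address together with a fresh reply
    /// channel, and return the reading end to wait for the result on.
    pub fn acquire(&self, address: &str) -> Receiver<Vec<u8>> {
        let (reply, result) = channel();
        self.port_tx
            .send(Request { address: address.to_string(), reply })
            .expect("request port closed");
        result
    }

    /// Client side: serve every request currently queued on the port.
    /// `lookup` stands in for the actual retrieval from the network.
    /// Returns the number of requests served.
    pub fn serve_pending(&self, lookup: impl Fn(&str) -> Vec<u8>) -> usize {
        let mut served = 0;
        while let Ok(request) = self.port_rx.try_recv() {
            // Ignore send errors: the requester may have gone away.
            let _ = request.reply.send(lookup(&request.address));
            served += 1;
        }
        served
    }
}
```

Passing the writing end of the reply channel inside the request means the client needs no bookkeeping to match results to requesters: each answer goes straight back on the channel it arrived with.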

The run function of the client implements an asynchronous architecture that performs the following functions (in order of appearance in the code):

Creating channels for swarm specific functions
1. Receiving new peers to connect from the gossip protocol (peers_instructions_chan, connections_instructions_chan)
2. Accounting related functions (accounting_peer_chan, pricing_chan, refreshment_instructions_chan, refreshment_chan)
3. A channel to enable concurrent data retrieval from subcomponents (data_retrieve_chan)
Setting up listening to gossip protocol messages (information about existing peers) and pricing protocol messages (for receiving connected peers' payment threshold updates)
Connecting to a (currently hardcoded) bootnode address
An async routine to continuously establish new libp2p-connections (dial) and consume libp2p-swarm events (swarm_event_handle)
An async routine that wraps a number of further async routines for the following functions (event_handle):
1. Completing handshakes with successfully dialed peers (k0)
2. Setting up accounting for newly established peer connections (k1)
3. Setting payment thresholds for peers after successfully receiving payment threshold updates in the pricing protocol (k2)
4. Initiating refreshments/pseudosettle protocol for peers when triggered by accounting actions (k3)
5. Registering the results of successful refreshments towards peers (k4)
An async routine that listens to the shared worker message port for incoming requests from the interface (retrieve_handle)
An async routine that enables multiple subcomponents to simultaneously trigger retrieving chunks or joined data (retrieve_data_handle)

Currently, because of the mixed blocking and non-blocking nature of the async framework, and to avoid a waiting routine hogging the single execution thread, the aforementioned routines intermittently try to progress every 600 ms, with non-CPU-intensive async sleeps in between.
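
The interval-based progress described above can be sketched as a loop that drains whatever is ready without blocking and then sleeps. This is only an illustration under assumptions: poll_once and poll_until_closed are hypothetical names, and std::thread::sleep stands in for the async sleep, since a blocking sleep would freeze the worker in the browser.

```rust
use std::sync::mpsc::{Receiver, TryRecvError};
use std::thread::sleep;
use std::time::Duration;

/// Drain everything that is ready on `rx` without blocking.
/// Returns false once the sending side has been dropped.
pub fn poll_once<T>(rx: &Receiver<T>, handle: &mut impl FnMut(T)) -> bool {
    loop {
        match rx.try_recv() {
            Ok(item) => handle(item),
            Err(TryRecvError::Empty) => return true,
            Err(TryRecvError::Disconnected) => return false,
        }
    }
}

/// Try to make progress once per `interval` until the channel closes.
/// Returns the number of polling rounds that were needed.
pub fn poll_until_closed<T>(
    rx: &Receiver<T>,
    interval: Duration,
    mut handle: impl FnMut(T),
) -> usize {
    let mut rounds = 0;
    loop {
        rounds += 1;
        if !poll_once(rx, &mut handle) {
            return rounds;
        }
        // Stand-in for the async sleep between attempts.
        sleep(interval);
    }
}
```

With an interval of Duration::from_millis(600) this mirrors the cadence mentioned in the text. The trade-off is added latency of up to one interval per step in exchange for never holding the single execution thread while waiting.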

The Swarm Client Subcomponents

The aforementioned architecture further depends on the following code modules:

The protocol handlers for handshake, hive, pricing, pseudosettle and retrieval (src/handlers.rs)
The accounting functions, such as calculating chunk prices, reserving, crediting, refreshing (src/accounting.rs)
The retrieval logic such as selecting peers to retrieve chunks from, joining files or triggering manifest interpretations (src/retrieval.rs)
The manifest interpretation logic (src/manifest.rs)
Common methods and struct declarations including DOM manipulation, calculating proximity orders, validating content addressed and single owner chunks, calculating feed addresses, and encoding/decoding resource groups to communicate through byte channels e.g. towards the interface (src/conventions.rs)
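
As an illustration of one of the helpers listed above, a proximity order is commonly computed as the number of leading bits two addresses share. The sketch below is not taken from src/conventions.rs: the function name, the MAX_PO cap of 31 and the treatment of short inputs are assumptions made for the example.

```rust
/// Upper bound on the returned order (an assumed value for illustration).
pub const MAX_PO: u8 = 31;

/// Number of leading bits that `a` and `b` have in common, capped at MAX_PO.
/// Only the bytes present in both slices are compared.
pub fn proximity(a: &[u8], b: &[u8]) -> u8 {
    let mut order: u32 = 0;
    for (x, y) in a.iter().zip(b.iter()) {
        let diff = x ^ y;
        if diff != 0 {
            // The first differing byte contributes its matching high bits.
            order += diff.leading_zeros();
            break;
        }
        // The whole byte matches.
        order += 8;
        if order >= MAX_PO as u32 {
            break;
        }
    }
    order.min(MAX_PO as u32) as u8
}
```

For example, two addresses whose first bytes are 0x00 and 0x80 differ in the very first bit and have proximity order 0, while addresses that agree on the first byte and differ at the fourth bit of the second byte have proximity order 11.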

The Service Worker

Quoting from the MDN documentation, "Service workers essentially act as proxy servers that sit between web applications, the browser, and the network (when available). They are intended, among other things, to enable the creation of effective offline experiences, intercept network requests, and take appropriate action based on whether the network is available, and update assets residing on the server. They will also allow access to push notifications and background sync APIs."

The weeb-3 interface can display single files (for example pictures and documents) without relying on the service worker, by dynamically creating blobs with associated MIME types from the data retrieved from swarm and making them available on virtual URLs. These resources are displayed in embed tags prepended to the content of the resultField html tag.

However, the createObjectURL method inserts random strings into the virtual URLs assigned to individual resources, which makes it infeasible for rendering complete websites, as the relative paths of assets embedded in the site would be broken by such random strings. This necessitates the use of a service worker, which can intercept the HTTP requests aimed at the de facto server (for example GitHub Pages) and create objects with deterministic URL paths to be served in response to these requests, making it possible to serve relative assets.

To enable this, upon detecting a website manifest, the interface sends each retrieved resource complete with relative path and mime type to the service worker (the message event listener in static/service.js), which injects it into a named cache ('default0'), before prepending the website index document as an iframe to the resultField html tag of the weeb-3 browser tab.
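
Why deterministic paths matter can be shown with a small model of a path-keyed cache. This is a sketch in Rust of the lookup idea only, not the code in static/service.js: PathCache, Resource, resolve and normalize are hypothetical names, and the path resolution handles just "." and ".." segments, whereas a browser performs full URL resolution itself.

```rust
use std::collections::HashMap;

/// A cached resource: MIME type plus body bytes.
#[derive(Clone, Debug, PartialEq)]
pub struct Resource {
    pub mime: String,
    pub body: Vec<u8>,
}

/// Hypothetical stand-in for the named cache, keyed by deterministic paths.
pub struct PathCache {
    entries: HashMap<String, Resource>,
}

impl PathCache {
    pub fn new() -> Self {
        PathCache { entries: HashMap::new() }
    }

    /// Store a resource under its normalized path.
    pub fn put(&mut self, path: &str, mime: &str, body: &[u8]) {
        let resource = Resource { mime: mime.to_string(), body: body.to_vec() };
        self.entries.insert(normalize(path), resource);
    }

    /// Answer a request for `reference` made from the document at `page`.
    pub fn fetch(&self, page: &str, reference: &str) -> Option<&Resource> {
        self.entries.get(&resolve(page, reference))
    }
}

/// Resolve `reference` against the directory of `page`.
pub fn resolve(page: &str, reference: &str) -> String {
    if reference.starts_with('/') {
        return normalize(reference);
    }
    let dir = match page.rfind('/') {
        Some(i) => &page[..=i],
        None => "/",
    };
    normalize(&format!("{}{}", dir, reference))
}

/// Collapse empty, "." and ".." segments into an absolute path.
fn normalize(path: &str) -> String {
    let mut parts: Vec<&str> = Vec::new();
    for segment in path.split('/') {
        match segment {
            "" | "." => {}
            ".." => {
                parts.pop();
            }
            other => parts.push(other),
        }
    }
    format!("/{}", parts.join("/"))
}
```

Because every resource is stored under the path the site itself expects, a reference such as css/style.css from /site/index.html resolves to a key that exists in the cache. With randomly named blob URLs there is no such key to resolve to.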

The browser only enables this service worker if it detects that the site is served over a secure HTTPS connection based on a trusted certificate, as this functionality is clearly security sensitive: it can be used to intercept HTTP requests and inject arbitrary resources. Its use also creates new attack surfaces, such as malicious websites that load a malevolent service worker at runtime to enable further malicious injections. Preventing such attacks is part of the security topics listed in the planned development section.

[Planned development]

Adding text input fields to the interface for selecting bootnode address, RPC node address, and swarm network id, to enable switching between networks, e.g. mainnet, testnet or other custom networks.
Adding functionality to the service worker to enable triggering requests towards the shared worker, to retrieve resources when swarm references are present in a website. Alternatively, achieving the same by overwriting the navigation bar contents when an onclick event is detected to target a swarm reference
Adding the swarm manifest encryption feature
Simultaneous manifest fork lookups
Adding the swarm ACT feature
ENS lookups
Wallet related functionality such as cheques and buying storage space on swarm to enable uploads, engaging in pushsync
Browser-side caching of retrieved data through IndexedDB or other forms of local storage
Developing secure origin separation between loaded swarm websites
Hardening against service worker replacement and other injection types of attacks
