docs: add spec for status protocol stack, deprecate waku-usage spec #105

Merged
merged 10 commits into from
Oct 25, 2024

jm-clius
Contributor

Scope

This adds a new raw specification covering the Status protocol stack, including:

  • common app-level features (content topics, functional scopes, ephemerality)
  • e2e reliability layer
  • encryption layer (TBD)
  • Waku transport layer

It adds directives on:

  • how to assign pubsub topics for sharding
  • how to design content topics
  • what strategies for publishing/subscribing to use for self-addressed messages
  • what interim strategies for publishing/subscribing to use for large messages (e.g. Community Description)

It does not define any of the app-level functions itself. These most properly belong in the 1:1 Chat and Community Specifications.

Since this specification also addresses the Waku transport layer, it deprecates the old WAKU2-USAGE spec.

What major items are still missing from this spec?

  • the entire section on encryption is TBD
  • the content topic section can probably be expanded with more precise directives (cc @chaitanyaprem)
  • we need a spec and proper name for the "new" e2e reliability protocol which we can link to from this spec (cc @shash256)

What are the next steps?

1. Continue revising Status specs

This spec is envisioned to be the "landing page" spec that establishes the high-level strategy and concepts for all Status protocols (similar to the role 10/WAKU2 played for Waku protocols). The next step would be to revise the 1:1 Chat and Community specs according to the concepts established in this specification, e.g. being explicit about content topic and functional scope for each app-level message. In the process, we can probably merge or deprecate a whole bunch of other Status specs.

2. Continue implementing newly specified strategies

The main purpose of writing the Status specs is to carefully establish reasonable strategies to save bandwidth, establish e2e reliability, etc. We should continue the work to bring the application implementation in line with the specifications.

Contributor

@fryorcraken left a comment

LGTM

The specific set of Waku protocols used depends on desired functionality and resource usage profile for the specific client.
Resources can be restricted in terms of bandwidth and computing resources.

Waku protocols that are more appropriate for resource-restricted environments are often termed "light protocols".
Contributor

I have started to write an explanation article about this. Will share draft when ready and we can start to move away from non-relay/light protocol terminology.

Contributor Author

Yes. I'm happy to also adapt to whatever terminology we deem best. In case it helps, reasoning behind my sticking with "light" and "full":

  • people seemed to always revert to this, even while we pushed for "resource-restricted" and "adaptive"
  • it fits current Status client terminology
  • relay vs non-relay wouldn't be accurate as each client will (or should) use a combination of filter/lightpush/relay etc. depending on resources.

each using the combination of "full" and "light" protocols most appropriate to match its environment and motivations.

To simplify interaction with the selection of "full" and "light" protocols,
Status clients MUST define a "full mode" and "light mode"
Contributor

Suggested change
Status clients MUST define a "full mode" and "light mode"
Status clients MUST define a "full mode" and "light mode"

Same, I want to potentially propose alternative terminology. Will work on it and share draft.

to retrieve the full contents of historical messages that the client may have missed during offline periods,
or to populate the local message database when the client starts up for the first time.

**Store queries for reliability**
Contributor

I think we should clean up reliability specs (my bad, I committed to do it and never got around to it) so we can refer to them here instead of restating logic.

Contributor Author

That will be great.

in order to provide light subscription and publishing services to other clients
for each pubsub topic to which they have a relay subscription.

Status clients MAY mount the store query protocol as service node (see [WAKU2-STORE](https://github.com/waku-org/specs/blob/8fea97c36c7bbdb8ddc284fa32aee8d00a2b4467/standards/core/store.md))
Contributor

Are we there yet?

Contributor Author

No, but I think the spec should provide for this possibility. It will remain underspecified until we have a proper design for decentralised store.

also known as _self-addressed_ messages,
MUST be published to a distinct pubsub topic or a distinct _set_ of pubsub topics
used exclusively for messages with local scope (see [Pubsub topics and sharding](#pubsub-topics-and-sharding)).
Status clients (full or light) MUST use lightpush protocol to publish self-addressed messages (see [Publishing](#publishing)).
Contributor

Shall we set expectations for the service node (eg usage of store sync over relay)?

Contributor Author

I think this is underdefined for now.
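
For illustration, the publishing directive quoted in this thread could be sketched as follows. This is a minimal sketch, not the Status implementation: the `Mode` and `Scope` enums and the `publish_protocol` function are hypothetical names, and the choice of relay for full-mode publishing outside the local scope is an assumption that the quoted text does not state.

```python
from enum import Enum


class Mode(Enum):
    FULL = "full"
    LIGHT = "light"


class Scope(Enum):
    LOCAL = "local"          # self-addressed messages
    GLOBAL = "global"
    COMMUNITY = "community"


def publish_protocol(mode: Mode, scope: Scope) -> str:
    """Return the name of the protocol used to publish a message."""
    # Quoted directive: full or light clients use lightpush
    # for self-addressed (local scope) messages.
    if scope is Scope.LOCAL:
        return "lightpush"
    # Assumption (not in the quoted text): for other scopes, full mode
    # publishes via relay and light mode via lightpush.
    return "relay" if mode is Mode.FULL else "lightpush"
```

The expectations for the service node that receives the lightpush request are left out, since the thread describes them as underdefined.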


Status clients MAY provide service-side protocols to other clients.

Full clients SHOULD mount
Contributor

we may want to mention peerExchange and discv5 as well here

Contributor Author

Absolutely. The whole topic of discovery is still (purposely) missing, while I was gathering some thoughts. Will add in a subsequent commit.

into smaller segments for individual Waku transport.
The definition of a large message is up to the application.
However, the maximum size for a [14/WAKU2-MESSAGE](../../waku/standards/core/14/message.md) payload is 150KB.
Status application payloads that exceed this size MUST be chunked into smaller pieces
Contributor

Isn't this statement contradictory in a way?
It states that large payloads must be chunked and also treated as large messages, which means they use a dedicated large-messages pubsub topic.

I am wondering: if pubsub topics/shards are going to be dedicated to large messages, do we need to keep the 150KB limit, or could we increase it, since clients are not going to subscribe to these topics anyway?

Contributor

> It states that large payloads must be chunked and also treated as large messages, which means they use a dedicated large-messages pubsub topic.

I expect large messages to be chunked, in more chunks.

> I am wondering: if pubsub topics/shards are going to be dedicated to large messages, do we need to keep the 150KB limit, or could we increase it, since clients are not going to subscribe to these topics anyway?

Fair q.

Contributor Author

> I am wondering: if pubsub topics/shards are going to be dedicated to large messages, do we need to keep the 150KB limit, or could we increase it, since clients are not going to subscribe to these topics anyway?

Indeed a fair question to consider in future, but I don't think it needs to be defined right now. The point is that the total data transfer for large messages (whether in chunks or not) needs a designed mechanism to (1) prevent them from being broadcast to everyone and (2) allow users to download them ad hoc.
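
To make the chunking directive in this thread concrete, here is a minimal sketch of splitting an application payload into pieces that each fit the quoted 150KB limit. The names `MAX_PAYLOAD_BYTES` and `chunk_payload` are hypothetical, and reading "150KB" as 150 * 1024 bytes is an assumption. A real segmentation scheme would also need per-chunk metadata (such as an index and a total count) for reassembly, which this sketch leaves out.

```python
from typing import List

# Assumption: "150KB" is interpreted as 150 * 1024 bytes.
MAX_PAYLOAD_BYTES = 150 * 1024


def chunk_payload(payload: bytes, max_size: int = MAX_PAYLOAD_BYTES) -> List[bytes]:
    """Split a payload into consecutive chunks of at most max_size bytes."""
    if max_size <= 0:
        raise ValueError("max_size must be positive")
    # Payloads that already fit are sent as a single piece.
    if len(payload) <= max_size:
        return [payload]
    return [payload[i:i + max_size] for i in range(0, len(payload), max_size)]
```

Concatenating the chunks in order restores the original payload. The sketch says nothing about which pubsub topic the chunks are published to.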


The application SHOULD define at least one separate pubsub topic for each separate community's community control and community content messages.
The application MAY define a set of more than one pubsub topic per community to allow traffic sharding for scalability.
It is RECOMMENDED that separate pubsub topics be used for global control messages and global content messages.
Contributor

Suggested change
It is RECOMMENDED that separate pubsub topics be used for global control messages and global content messages.
It is RECOMMENDED that separate pubsub topics be used for community control messages and community content messages.

Contributor Author

Thanks! Addressed in 9ce3bc5
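
As an illustration of the community pubsub topic directives discussed above, the sketch below assigns one pubsub topic to a community's control messages and one or more to its content messages. It assumes the Waku static sharding topic format `/waku/2/rs/<cluster_id>/<shard>`. The function names, the cluster ID, and the shard numbers are hypothetical and do not reflect actual Status configuration.

```python
from typing import Dict, List, Union


def pubsub_topic(cluster_id: int, shard: int) -> str:
    """Format a static-sharding pubsub topic name."""
    return f"/waku/2/rs/{cluster_id}/{shard}"


def community_pubsub_topics(
    cluster_id: int, control_shard: int, content_shards: List[int]
) -> Dict[str, Union[str, List[str]]]:
    """Assign separate pubsub topics to one community's control and content messages."""
    if not content_shards:
        raise ValueError("at least one content shard is required")
    # Keep control and content traffic apart, as the directive recommends.
    if control_shard in content_shards:
        raise ValueError("control and content messages should use separate shards")
    return {
        "control": pubsub_topic(cluster_id, control_shard),
        # More than one content shard allows traffic sharding for scalability.
        "content": [pubsub_topic(cluster_id, s) for s in content_shards],
    }
```

For example, `community_pubsub_topics(99, 1, [2, 3])` yields one control topic and two content topics for a single community.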

To simplify interaction with the selection of "full" and "light" protocols,
Status clients MUST define a "full mode" and "light mode"
to allow users to select whether their client would prefer "full protocols" or "light protocols" by default.
Status Desktop clients are assumed to have more resources available and SHOULD use full mode by default.
Contributor

I think for user facing devices/apps it should be light by default (desktop could have a switch to enable full mode). In most users' experience, the same app in the different devices should mostly behave the same, there may be more features in a desktop environment, but by default we should make it consistent between devices.
Without this, users will likely be surprised when they discover the differences, and every app will need to educate users about them.

I understand that a lot of work is needed to make this a reality, but for a specification, content that holds in the long term makes more sense to me.

Contributor

> I think for user facing devices/apps it should be light by default (desktop could have a switch to enable full mode). In most users' experience, the same app in the different devices should mostly behave the same, there may be more features in a desktop environment, but by default we should make it consistent between devices.

Ultimately the question is for the Status product team. But considering the desired properties, it is fair to assume that a desktop app may consume more resources, in the same way that it's fair to assume that using Tor Browser is slower, or that using BitTorrent rather than FTP consumes more upload bandwidth.

If the Status product team decides that Desktop and Mobile must behave the same, and both must use light mode by default, then the question is "where is the infrastructure"?

This can be compensated by either:

  1. encouraging users to run companion "nodes": a second piece of software on a laptop, Raspberry Pi, or cloud host, or enabling relay mode
  2. or having incentivization in place, where users or community owners pay for the resources of the network, with a large enough number of nodes to support this relay network

In this context, it also means that by "default" privacy and censorship resistance are less than what they could be.

(1) is unlikely to be enough to allow the application to be reliable, as regularly stated.

(2) We are not ready yet; work is in progress, and I expect to aggressively ramp up this stream in 2025 H1.

Once (2) is ready, the Status product will have a choice between:
A. Free product to use, but users share resources by default
B. Paying product for users, but they don't share resources by default.

(A) was clearly the selected direction so far by Status.

Contributor

@chaitanyaprem left a comment

LGTM apart from a few minor comments.


**Community messages**

The application SHOULD define at least one separate pubsub topic for each separate community's community control and community content messages.
Member

Maybe like this to avoid repeating community thrice?

Suggested change
The application SHOULD define at least one separate pubsub topic for each separate community's community control and community content messages.
The application SHOULD define at least one separate pubsub topic for each separate community's control and content messages.

Contributor Author

Yes. Addressed in 9ce3bc5

@jm-clius requested a review from osmaczko October 15, 2024 15:03
@igor-sirotin self-requested a review October 15, 2024 15:04
Collaborator

@jimstir left a comment

Added a few suggestions, also markdown-linting still needs to be addressed. Besides that LGTM! Great spec 👍

Comment on lines +283 to +284
Status clients SHOULD use the store query protocol, as specified in [WAKU2-STORE](https://github.com/waku-org/specs/blob/8fea97c36c7bbdb8ddc284fa32aee8d00a2b4467/standards/core/store.md), to retrieve historical messages relevant to the client from store service nodes in the network.

Collaborator

Suggested change
Status clients SHOULD use the store query protocol, as specified in [WAKU2-STORE](https://github.com/waku-org/specs/blob/8fea97c36c7bbdb8ddc284fa32aee8d00a2b4467/standards/core/store.md), to retrieve historical messages relevant to the client from store service nodes in the network.
Status clients SHOULD use the store query protocol, to retrieve historical messages relevant to the client from store service nodes in the network. For more information, see [WAKU2-STORE](https://github.com/waku-org/specs/blob/8fea97c36c7bbdb8ddc284fa32aee8d00a2b4467/standards/core/store.md).

@jm-clius merged commit 37b3edf into main Oct 25, 2024
0 of 2 checks passed
@jm-clius deleted the docs/add-status-protocol-stack branch October 25, 2024 16:33
@osmaczko left a comment

Sorry for being late to the party. Great specs! From my understanding, this outlines the protocol’s ideal state, rather than how it currently operates, correct?


#### Community scope

3. _Community control_: messages enabling the basic functioning of the app to control features _only relevant to members of a specific community_. Examples include Community Membership Updates, community Status Updates, etc.

Currently, that's not entirely true. Users subscribe to updates for communities they may not be members of. An example is the Discover Communities section.

Contributor Author

Thanks! I imagine this is based on some community description messages that are published "to everyone" - i.e. at least on shards that everyone is subscribed to, and are not encrypted by the community key? In this case, these would still be modelled as "global" messages as they affect operations that everyone should be able to participate in. "Community control" refers to those messages that can be distributed only within a single community.

Yes, that information is derived from CommunityDescription. Please note that CommunityDescription itself is never fully encrypted; only specific sub-elements, like members and channels, are encrypted depending on the context. Even so, these cards will render correctly, with the exception of member counts and active member counts, which remain inaccessible to users without the necessary key.

As for shards, CommunityDescription can reside within its own community shard. The protocol is designed to ensure that everyone can locate the shard a given community uses, enabling them to subscribe to that shard to access the description. Essentially, the community owner publishes CommunityDescription on the community shard and shares shard details on the default shard.
Rationale: status-im/status-go#3961 (comment)
Issue: status-im/status-go#4230
Implementation: status-im/status-go#4499

Contributor Author

Thanks! I think the issue here might be that eventually we'd want Communities to be able to provide their own infrastructure - their own bootstrap nodes, Store nodes, etc. In this case we may not want everyone to be able to retrieve the Community Description (which would require discovering and using that Community's internal service infrastructure). Of course, it will be easy to define a global shard only used for "discoverable" Community descriptions, so I don't think the spec prohibits either design.

| Status application layer |
| End-to-end reliability layer |
| Encryption layer |
| Transport layer (Waku) |

There is also a Segmentation layer just before Transport layer.

Contributor Author

Thanks! This is very helpful for my understanding. I included a section on "chunking large messages", but this should indeed be a separate layer with separate paragraph.

In other words, the number of content topics defined in the app SHOULD match the number of filter use cases.
For the sake of illustration, consider the following common content topic and filter use cases:

- if all messages belonging to the same 1:1 chat are always filtered together, they SHOULD use the same content topic (see [55/STATUS-1TO1-CHAT](../55/1to1-chat.md))

Just a note: 1:1 chats may use different content topics depending on the state of topic negotiation between the two parties.
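
The content topic directive quoted above (one content topic per filter use case) can be sketched as follows. The sketch assumes the Waku content topic naming format `/{application-name}/{version}/{content-topic-name}/{encoding}`. The application name `status-example`, the `chat-` prefix, and both function names are hypothetical, and the sketch ignores the topic negotiation state mentioned in the note above.

```python
def content_topic(app: str, version: int, name: str, encoding: str = "proto") -> str:
    """Format a content topic as /{application-name}/{version}/{content-topic-name}/{encoding}."""
    return f"/{app}/{version}/{name}/{encoding}"


def one_to_one_chat_topic(chat_id: str) -> str:
    """Return the single content topic shared by all messages of one 1:1 chat.

    Assumption: the chat identifier alone selects the filter use case.
    """
    return content_topic("status-example", 1, f"chat-{chat_id}")
```

Because every message in the same chat maps to the same content topic, a client can retrieve the whole chat with one filter criterion, while a different chat maps to a different topic.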

and consequently SHOULD also make use of MVDS to achieve reliable data synchronisation between all parties involved in the communication.
Non-ephemeral 1:1 and private group chat messages MAY make use of [scalable distributed log reliability](https://forum.vac.dev/t/end-to-end-reliability-for-scalable-distributed-logs/293/16) in future.
Since MVDS does not scale for a large number of participants in the communication,
non-ephemeral community messages MUST use scalable distributed log reliability as defined in this [original forum post announcement](https://forum.vac.dev/t/end-to-end-reliability-for-scalable-distributed-logs/293/16).

In the context of communities, an app-level reliability mechanism called peersyncing was introduced some time ago. It is currently disabled.

Contributor Author

Indeed! I chose to exclude it (for now) as its future is uncertain.
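
The reliability directives quoted in this thread amount to a small decision rule, sketched below. The function name and the string labels are hypothetical. Returning `None` for ephemeral messages is an assumption, since the quoted lines only cover non-ephemeral messages. Peersyncing is left out, in line with the reply above.

```python
from typing import Optional


def reliability_mechanism(chat_kind: str, ephemeral: bool) -> Optional[str]:
    """Pick an end-to-end reliability mechanism for a message.

    chat_kind is one of "one_to_one", "private_group" or "community".
    """
    if chat_kind not in ("one_to_one", "private_group", "community"):
        raise ValueError(f"unknown chat kind: {chat_kind}")
    # Assumption: ephemeral messages carry no end-to-end reliability.
    if ephemeral:
        return None
    # Quoted directive: MVDS does not scale to many participants, so
    # community messages use scalable distributed log reliability.
    if chat_kind == "community":
        return "scalable-distributed-log"
    # Quoted directive: non-ephemeral 1:1 and private group chats use MVDS.
    return "mvds"
```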


## Encryption layer

The encryption layer wraps the Status App and Reliability layers in an encrypted payload.

The payload is not always encrypted. Both public and private messages are wrapped in an encryption layer; however, only private messages are actually encrypted but both carry x3dh bundle information.

The Status application MAY use a chunking mechanism to break down large payloads
into smaller segments for individual Waku transport.
The definition of a large message is up to the application.
However, the maximum size for a [14/WAKU2-MESSAGE](../../waku/standards/core/14/message.md) payload is 150KB.

That's not the case atm: status-im/status-go#4955

@osmaczko requested a review from Samyoul October 29, 2024 14:07
@jm-clius
Contributor Author

Thanks for your insights, @osmaczko. Really clarified a couple of things for me.

> From my understanding, this outlines the protocol’s ideal state, rather than how it currently operates, correct?

Well, I wouldn't call the specified protocol the "ideal state" yet, but it certainly aims to specify beyond the current state. I wanted to capture the short and medium-term improvements that we have made and plan to make for the Status protocols to reach a viable state. The next steps would be to ensure that we do all the implementation work to bring the implementation to parity with the spec. Most of the remaining work relates to the introduction of e2e reliability, better content topic usage (advanced stage of implementation done) and a different sharding strategy.
