First draft for apps data protocol #1

Open
119 changes: 119 additions & 0 deletions SWIPs/swip-draft_app_data_protocols.md
@@ -0,0 +1,119 @@
---
SWIP: <to be assigned>
title: App data protocols
author: Paul Le Cam (@PaulLeCam)
discussions-to: https://beehive.ethswarm.org/swarm/channels/dappprotocols
status: Draft
type: Standards Track (Core, Networking, Interface)
category: Interface
created: 2019-08-06
---

<!--You can leave these HTML comments in your merged SWIP and delete the visible duplicate text guides, they will not appear and may be helpful to refer to if you edit it again. This is the suggested template for new SWIPs. Note that a SWIP number will be assigned by an editor. When opening a pull request to submit your SWIP, please use an abbreviated title in the filename, `SWIP-draft_title_abbrev.md`. The title should be 44 characters or less.-->

## Simple Summary

<!--"If you can't explain it simply, you don't understand it well enough." Provide a simplified and layman-accessible explanation of the SWIP.-->

App developers shouldn't have to reinvent the wheel when building on top of Swarm.

## Abstract

<!--A short (~200 word) description of the technical issue being addressed.-->

Apps built on top of Swarm share similar end-user needs, such as establishing contact with other users, sending messages, and discovering files.

This SWIP aims to define core data structures and protocols that can be implemented by any app using Swarm to support these needs.

## Motivation

<!--The motivation is critical for SWIPs that want to change the Swarm protocol. It should clearly explain why the existing protocol specification is inadequate to address the problem that the SWIP solves. SWIP submissions without sufficient motivation may be rejected outright.-->

Technology companies currently exploit user data with little respect for privacy or security, let alone transparency about how they use it.
Even though users generate this data, it is usually owned not by them but by service providers, which have an incentive to use it to lock in their users.

Swarm can provide an opportunity to shift this relationship between data ownership and services using this data by allowing users to store their data directly in Swarm, and grant access to the apps and services they choose.
The problem is then to define data formats that can be shared by apps and services with different purposes, and protocols that ensure compatibility and security between all the parties interacting with this data.

Beyond the end-user benefit of data ownership, shared formats and protocols can help application and service developers get started faster on top of Swarm, and possibly gain access to existing user data.

## Specification

<!--The technical specification should describe the syntax and semantics of any new feature. The specification should be detailed enough to allow competing, interoperable implementations for the current Swarm platform and future client implementations.-->

### Scope (WIP)

- Should only rely on existing features of Swarm, but might evolve as new possibilities are added
- Should be usable via an HTTP gateway

What does this mean exactly? No PSS like functionality can be used?

Author


I think the core protocols should work with the "minimum service" provided by Swarm, but additional functionalities could use additional Swarm capabilities as they are available.
For example we could have a communication channel between 2 contacts that uses feeds by default, but users could opt-in to send messages over PSS as well for faster interactions.
What do you think?


Ah ok, I understand what you mean and this absolutely makes sense.

- Should protect the user's security and privacy as much as possible, and document necessary trade-offs
- Should define core data structures and validation methods
- Should define custom extensions to core data structures
- Should define protocols for data authoring and discovery across multiple apps/devices/services and between different users
- Should support versioning

### Terminology (WIP)

- Actor: human entity (person, group...)
- Agent: code acting on behalf of an Actor (app, device, service...)
- Resource: data accessible on Swarm
  - File (any binary data)
  - Entity (JSON data defined in spec)
  - Source (single Entity or Entity feed)
  - Publication (list of Sources)
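
To make the relationships between these terms easier to discuss, here is a minimal sketch of how they could be modelled. This is not part of the spec: every field name (`name`, `label`, `swarm_hash`, `type`, `data`, `is_feed`) is an assumption made up for illustration.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List


@dataclass
class Actor:
    """Human entity (person, group...)."""
    name: str  # hypothetical field


@dataclass
class Agent:
    """Code acting on behalf of an Actor (app, device, service...)."""
    actor: Actor
    label: str  # hypothetical field


@dataclass
class File:
    """Resource holding any binary data, referenced here by a Swarm hash."""
    swarm_hash: str  # hypothetical field


@dataclass
class Entity:
    """Resource holding JSON data defined in the spec."""
    type: str
    data: Dict[str, Any]


@dataclass
class Source:
    """Either a single Entity or a feed of Entities."""
    entities: List[Entity]
    is_feed: bool = False


@dataclass
class Publication:
    """A list of Sources."""
    sources: List[Source] = field(default_factory=list)


# Usage: one Actor, one Agent, and a Publication with a single-Entity Source.
alice = Actor(name="Alice")
phone = Agent(actor=alice, label="phone app")
profile = Entity(type="profile", data={"displayName": "Alice"})
publication = Publication(sources=[Source(entities=[profile])])
```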

### Research areas (WIP)

#### Data structure format

- Should we use JSON or a binary format such as protocol buffers?

@agazso agazso Aug 8, 2019


Both have advantages and disadvantages. With JSON:

  • it's easy to use from any programming language
  • it can be self descriptive
  • it's usually not well-defined or standardised, so things like schemas or signatures are more complicated

With binary:

  • A schema can be used and enforced
  • Needs more tooling to use and debug, which can complicate developing cross-platform apps
  • Better performance, especially when dealing with more data

I think at the early stage of a protocol it's better if adoption is easier and in that case JSON seems a better choice.

Author


👍 Let's start with JSON and we can reevaluate if performance or schema enforcement becomes a problem.

- What about validation?
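
As a starting point for the validation and versioning questions, a JSON Entity could be wrapped in a small envelope and checked on read. The sketch below is only an illustration: the envelope fields (`type`, `version`, `data`), the version string and the function names are all assumptions, not anything defined by this SWIP. Sorting keys gives a stable encoding from this particular encoder, which helps when hashing or signing, but it is not a full JSON canonicalisation scheme.

```python
import json
from typing import Any, Dict

# Hypothetical set of protocol versions a client understands.
SUPPORTED_VERSIONS = {"1.0.0"}


def encode_entity(entity_type: str, data: Dict[str, Any], version: str = "1.0.0") -> str:
    """Serialise an Entity envelope with sorted keys and compact separators."""
    envelope = {"type": entity_type, "version": version, "data": data}
    return json.dumps(envelope, sort_keys=True, separators=(",", ":"))


def decode_entity(payload: str) -> Dict[str, Any]:
    """Parse an envelope and reject it if required fields are missing or invalid."""
    envelope = json.loads(payload)  # raises ValueError on malformed JSON
    if not isinstance(envelope, dict):
        raise ValueError("entity must be a JSON object")
    for key, expected in (("type", str), ("version", str), ("data", dict)):
        if not isinstance(envelope.get(key), expected):
            raise ValueError(f"missing or invalid field: {key}")
    if envelope["version"] not in SUPPORTED_VERSIONS:
        raise ValueError(f"unsupported version: {envelope['version']}")
    return envelope


payload = encode_entity("contact", {"name": "Bob"})
entity = decode_entity(payload)
```

A client that reads an envelope with an unsupported version rejects it here; whether it should instead try a best-effort read is one of the open questions.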

#### Entities extensibility

- Can we restrict the spec to a limited number of core Entities while allowing for more complex Entities to be added as extensions?
- Can we provide fallback types for complex Entities that would not be supported by a given client?
- What about validation of unknown (non-core or from supported extensions) Entities?
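
One possible answer to the fallback question is to let an extension Entity carry a simpler representation that any client can render. The following sketch assumes a hypothetical `fallback` field, a hypothetical core `text` type and a made-up `ext/poll` extension type; none of these names come from the spec.

```python
from typing import Any, Callable, Dict


def is_text(data: Any) -> bool:
    """Validation rule for the hypothetical core 'text' Entity."""
    return isinstance(data, dict) and isinstance(data.get("content"), str)


# Hypothetical registry of Entity types this client supports.
CORE_VALIDATORS: Dict[str, Callable[[Any], bool]] = {"text": is_text}


def resolve_entity(entity: Any, max_depth: int = 4) -> Dict[str, Any]:
    """Return the first supported representation, following 'fallback' links."""
    current = entity
    for _ in range(max_depth):  # bound the chain so a malicious Entity cannot loop
        if not isinstance(current, dict):
            break
        entity_type = current.get("type")
        validator = CORE_VALIDATORS.get(entity_type) if isinstance(entity_type, str) else None
        if validator is not None:
            if not validator(current.get("data")):
                raise ValueError(f"invalid data for type: {entity_type}")
            return current
        current = current.get("fallback")
    raise LookupError("no supported representation found")


poll = {
    "type": "ext/poll",
    "data": {"question": "Tea or coffee?", "options": ["tea", "coffee"]},
    "fallback": {"type": "text", "data": {"content": "Poll: tea or coffee?"}},
}
rendered = resolve_entity(poll)
```

In this sketch unknown Entities are never validated themselves; only the supported fallback is. Whether that is acceptable is exactly the third question above.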

#### Core entities and protocols

- Define what entities should be part of core vs extensions.
- Define validation rules for these core entities.
- Define discover and exchange rules for these entities.

What do you mean by discover and exchange rules?

Author


I mean if Alice wants to contact Bob, the protocol should define "where" (basically the feed parameters) Alice should write so that Bob can discover it, and what encryption keys they need to use.
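
To illustrate the "where" part, here is a rough sketch of one way the feed parameters could be derived so that both sides compute the same location. It assumes a feed is identified by the writer's address plus a 32-byte topic; the `contact-request:` prefix, the function name and the use of SHA-256 are assumptions for illustration only (Python's standard library has no Keccak-256, so a real implementation would likely pick a different hash). Encryption keys are left out entirely.

```python
import hashlib
from typing import NamedTuple


class FeedParams(NamedTuple):
    user: str   # address of the feed owner, i.e. the writer
    topic: str  # 32-byte topic, hex encoded


def contact_feed(writer_address: str, reader_address: str) -> FeedParams:
    """Derive the feed a writer uses to reach a given reader (hypothetical scheme)."""
    material = b"contact-request:" + reader_address.lower().encode("ascii")
    topic = hashlib.sha256(material).hexdigest()
    return FeedParams(user=writer_address.lower(), topic=topic)


# Placeholder addresses, not real accounts.
alice = "0x" + "a1" * 20
bob = "0x" + "b2" * 20

# Alice writes here; Bob computes the same parameters once he knows Alice's address.
alice_to_bob = contact_feed(alice, bob)
```

This still leaves open how Bob learns Alice's address in the first place, which is part of the discovery rules to be defined.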


#### Key management

- How could an Actor add and remove (revoke) Agents and Resources at will?
- How could an Agent add and remove (if own) a Source from a Publication?
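
As a way to frame the first question, an Actor could publish a record listing the Agents allowed to act on its behalf. The sketch below is purely hypothetical: the `AgentRegistry` name, its methods and the idea of keeping revoked keys in the record are assumptions, and it ignores how the record itself would be signed and stored on Swarm.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class AgentRegistry:
    """Hypothetical record an Actor could publish to list its Agents."""
    # Maps an Agent public key to True (active) or False (revoked).
    agents: Dict[str, bool] = field(default_factory=dict)

    def add(self, public_key: str) -> None:
        self.agents[public_key] = True

    def revoke(self, public_key: str) -> None:
        if public_key not in self.agents:
            raise KeyError(f"unknown agent: {public_key}")
        # Keep the entry so readers can tell "revoked" apart from "never known".
        self.agents[public_key] = False

    def is_authorised(self, public_key: str) -> bool:
        return self.agents.get(public_key, False)


registry = AgentRegistry()
registry.add("phone-key")
registry.add("laptop-key")
registry.revoke("phone-key")
```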

#### Data encryption

- Should a single algorithm or multiple ones be supported/recommended?
- Should Swarm built-in encryption and/or ACT be used?

@agazso agazso Aug 8, 2019


I would prefer to use light nodes everywhere and use Swarm's built-in encryption.

However, light nodes are not going to be production-ready for a while, and there may always be use cases where they are not feasible (think of embedded devices). I can also imagine hybrid products and solutions that are not completely decentralized but use Swarm as a storage layer, and therefore access data through gateways.

For those cases it would be good to have a protocol-level, standardised encryption layer to provide end-to-end encryption. I was thinking of reusing Swarm's encryption code for that, but encryption in Swarm is tightly coupled with how files are stored, so I assume it would bring a big dependency on other components (think of the chunker, hashing, etc.).

Author


I agree light nodes are the best option but I think it would be good to start working on these protocols based on the current version of Swarm so we are not limited by Swarm's development timeline, though obviously as new features are released we should adapt accordingly.

I think the question is also if we want to support Web apps hosted on Swarm and accessed using a Web browser via a Swarm gateway?
Personally I'd rather not support this use case because I think there are too many potential security and privacy risks, but at the same time if this is a primary use-case for Swarm I think these protocols should support it in order to be relevant.


Well, again it's a question of adoption. Assuming that having to install something makes people less willing to try new things, I think most people will use their first dapps from the browser. And until the majority of browsers support Swarm, the only option is a gateway.

I also understand the risks you mentioned, but if we already have implementations in JavaScript I don't see a reason why browsers should not be supported. On the protocol side this doesn't require much effort from us: for encryption, for example, the only requirement is to choose a standard that has a JavaScript implementation that also works in the browser.

What do you think?

Author


Yes, I think it's OK to be careful not to define requirements that would prevent usage in Web browsers. I guess I'm mostly concerned about developers hosting their apps using a gateway, as that would completely undermine the security and privacy measures the protocols would otherwise define.


## Rationale

<!--The rationale fleshes out the specification by describing what motivated the design and why particular design decisions were made. It should describe alternate designs that were considered and related work, e.g. how the feature is supported in other languages. The rationale may also provide evidence of consensus within the community, and should discuss important objections or concerns raised during discussion.-->

TODO

## Backwards Compatibility

<!--All SWIPs that introduce backwards incompatibilities must include a section describing these incompatibilities and their severity. The SWIP must explain how the author proposes to deal with these incompatibilities. SWIP submissions without a sufficient backwards compatibility treatise may be rejected outright.-->

N/A

## Test Cases

<!--Test cases for an implementation are mandatory for SWIPs that are affecting changes to data and message formats. Other SWIPs can choose to include links to test cases if applicable.-->

TODO

## Implementation

<!--The implementations must be completed before any SWIP is given status "Final", but it need not be completed before the SWIP is accepted. While there is merit to the approach of reaching consensus on the specification and rationale before writing code, the principle of "rough consensus and running code" is still useful when it comes to resolving many discussions of API details.-->

TODO

## Copyright

Copyright and related rights waived via [CC0](https://creativecommons.org/publicdomain/zero/1.0/).