Commit bf6e904

committed
Problem: MLS member leaving/joining interaction with council nodes not documented (WIP crypto-com#141 fixes crypto-com#142)
Solution: sketched out a doc containing more details beyond the original implementation plan (it still lacks some parts, so still WIP) + some parts -- e.g. new transaction types -- can be moved to other modules when it's agreed on and detailed/polished later -- also, as the MLS protocol draft changed since the original implementation plan, key rotation procedure was simplified to make use of MLS exporter functionality
1 parent 5e3b666 commit bf6e904

File tree

1 file changed

+247
-0
lines changed

docs/modules/tdbe.md

Lines changed: 247 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,247 @@
1+
# Transaction Data Bootstrapping Enclave details
2+
3+
Transaction Validation Enclave (TVE):
4+
- responsible for validating Enclave tx types;
5+
- holds current tx obfuscation key;
6+
- reacts to *immediate* requests (chain-abci, i.e. potentially uncommitted or invalid transaction; encryption requests from tx-query enclave)
7+
- chain-abci directly talks to it during (consensus) state-machine execution
8+
9+
Transaction Data Bootstrapping Enclave (TDBE):
10+
- responsible for generating MLSHandshake tx types;
11+
- fetches old transaction data payloads;
12+
- prerequisite for a node administrator to submit a node join tx request or construct genesis.json
13+
- reacts to *block-committed* requests:
14+
- needs to run a Tendermint light client verification (TODO: trusted anchor injected as compile-time data?) and be connected to its Tendermint node RPC
15+
- chain-abci does not talk to it (it may invoke it during startup / for fetching old tx data, but can't talk to it during state-machine execution)
16+
- "knows" when it should generate MLSHandshake tx: 1) when in mid-time of keypackage expiration; 2) being "leftmost" node
17+
- based on valid block-committed requests, it should update its internal states, derive secrets and push tx obfuscation key to TVE
18+
19+
20+
## new/rejoining node
21+
(for non-genesis specified ones)
22+
- submit council or community node join request TX
23+
- if valid, wait until CommitChangeTx is committed with Welcome payload
24+
- mark the block where this CommitChangeTx is committed as the "fetched up to" block
25+
and obtain old transaction data
26+
27+
TODO: fetch keypackages/leaf nodes? https://github.com/mlswg/mls-protocol/issues/344
28+
29+
### Obtaining old transaction data
30+
When a node is started up (before tendermint consensus syncing etc.), it should be provided
31+
"genesis-skip" (in the case it's one of nodes starting at genesis)
32+
or one or more connection strings
33+
that would proxy to some public full node RPC.
34+
35+
Via the Tendermint light-client-verifying connection, it should obtain the list of
36+
needed transaction IDs (of withdrawal and transfer transactions):
37+
38+
* full node / historical querying: all IDs between its last processed block and the last block on the remote node
39+
* validator: new IDs in the "UTXO set diff" between its last processed block and the last block on the remote node
40+
(TODO: https://github.com/crypto-com/chain/issues/794 ; initial implementation here can start with just "all IDs" for both)
41+
42+
From remote node RPC, it should also learn its TDBE connection details
43+
and then initiate enclave-to-enclave mutually attested TLS: TODO https://github.com/crypto-com/chain/issues/1549
44+
and fetch the needed transaction payloads and seal them for its host.
45+
46+
TODO: handling in abci -- options (probably start with option 1):
47+
1. record the "fetched up to" block;
48+
report to TM still the last processed block
49+
and run normal TM block-by-block syncing and skip tx-validation-enclave until the "fetched up to" block
50+
2. TM 0.34+ -- run state-sync for jellyfish merkle / staking state https://docs.tendermint.com/master/spec/abci/apps.html#state-sync
51+
and resume from the "fetched up to" block?
52+
53+
## Sealing / recovering
54+
on genesis or when the (add/update) request is committed,
55+
TDBE should seal its init key + credential + "trusted anchor" parts (app hash components, validator identities... block number)
56+
with app hash (or some state fingerprint?) as AAD.
57+
58+
Restarted TDBE (that wasn't kicked out) starts off this state via the TM light client
59+
60+
## Security upgrades
61+
TDBE/chain-abci should keep track of the highest `isv_svn` observed
62+
(in a valid committed tx) and reject Add/Update proposals with a lower number.
63+
for (old) data bootstrapping:
64+
65+
- TDBE (server): reject all TLS connections with lower `isv_svn` than the highest one
66+
- TDBE (client): keep track of keypackage `isv_svn` proportions (based on MLSHandshake messages);
67+
- if (highest `isv_svn` count / lower `isv_svn` count >= 2/3), reject all TLS connections with lower `isv_svn` than the highest one
68+
- otherwise, tolerate fetching data from servers where `isv_svn` == `the highest one - 1`
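
The client-side rule above can be sketched as a small pure function. This is only an illustration: the function name and parameters are hypothetical, the ratio is read literally as (highest `isv_svn` count) / (lower `isv_svn` count) >= 2/3, and servers at or above the highest observed `isv_svn` are assumed to be always acceptable.

```rust
/// Hypothetical helper (not from the codebase): decides whether a bootstrapping
/// client should accept a server based on its `isv_svn`.
///
/// * `highest_svn`   -- highest `isv_svn` observed in valid committed txs
/// * `highest_count` -- number of keypackages at `highest_svn`
/// * `lower_count`   -- number of keypackages below `highest_svn`
pub fn accept_server_svn(
    server_svn: u16,
    highest_svn: u16,
    highest_count: u64,
    lower_count: u64,
) -> bool {
    // Assumption: servers at (or above) the highest observed SVN are always fine.
    if server_svn >= highest_svn {
        return true;
    }
    // highest_count / lower_count >= 2/3, cross-multiplied to avoid floats
    // and division by zero (lower_count == 0 yields the strict policy).
    let strict = u128::from(highest_count) * 3 >= u128::from(lower_count) * 2;
    if strict {
        // Enough of the network has upgraded: reject anything lower.
        false
    } else {
        // Otherwise tolerate servers exactly one version behind.
        highest_svn - server_svn == 1
    }
}
```

Cross-multiplying keeps the comparison in integer arithmetic, which matters if the same check ever has to be evaluated deterministically on several nodes.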
69+
70+
### SVN verification + compilation order
71+
Besides verifying that `MRSIGNER`, ProductId, etc. are the same as in `Report::for_self()`, one should also verify:
72+
73+
- TDBE: if (other party's `isv_svn`) == (my `isv_svn` / in the report), check MRENCLAVE is the same
74+
- TVE: (incoming local connection's `isv_svn`) == (my `isv_svn`) and MRENCLAVE corresponds to the compile-time encoded value of TQE or TDBE (depending on the expected connection type)
75+
76+
NOTE: as TVE only expects local communication, it should expect `isv_svn` to be exactly the same as its own
77+
(the node operator would update all binaries at once on its node).
78+
79+
This mandates the following compilation / signing order:
80+
81+
1. TDBE
82+
2. TQE
83+
3. TVE: (provided TDBE's and TQE's MRENCLAVE values at compile-time)
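
A minimal sketch of the TDBE and TVE checks described in this section, assuming the relevant report fields have already been extracted and the attestation itself has been verified. `ReportFields` and both function names are hypothetical and do not correspond to an SGX SDK type; the `isv_svn` floor from the previous section is assumed to be enforced separately.

```rust
/// Hypothetical subset of the attestation report fields used below.
#[derive(Clone, Debug, PartialEq, Eq)]
pub struct ReportFields {
    pub mrsigner: [u8; 32],
    pub mrenclave: [u8; 32],
    pub isv_prod_id: u16,
    pub isv_svn: u16,
}

/// `MRSIGNER` and ProductId must match the local report.
fn same_vendor(me: &ReportFields, peer: &ReportFields) -> bool {
    me.mrsigner == peer.mrsigner && me.isv_prod_id == peer.isv_prod_id
}

/// TDBE: a peer at the same `isv_svn` must be the very same enclave build.
/// Peers at a different `isv_svn` cannot be compared by MRENCLAVE here.
pub fn tdbe_accepts_peer(me: &ReportFields, peer: &ReportFields) -> bool {
    if !same_vendor(me, peer) {
        return false;
    }
    if peer.isv_svn == me.isv_svn {
        peer.mrenclave == me.mrenclave
    } else {
        true
    }
}

/// TVE: local peers must have exactly the same `isv_svn`, and their MRENCLAVE
/// must equal the value baked in at compile time for TQE or TDBE.
pub fn tve_accepts_local_peer(
    me: &ReportFields,
    peer: &ReportFields,
    expected_mrenclave: &[u8; 32],
) -> bool {
    same_vendor(me, peer)
        && peer.isv_svn == me.isv_svn
        && &peer.mrenclave == expected_mrenclave
}
```

Because `expected_mrenclave` has to be known when TVE is built, TDBE and TQE must be compiled and signed first, which is what the ordering above reflects.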
84+
85+
## Obfuscation key rotation via Message Layer Security handshakes
86+
### data type modifications
87+
```rust
88+
struct StakedState {
89+
address: StakedStateAddress,
90+
nonce: u64,
91+
bonded: Coin,
92+
unbonded: Coin,
93+
unbonded_from: Timespec,
94+
node_details: Option<NodeDetails>,
95+
}
96+
97+
enum NodeDetails {
98+
Council(Validator),
99+
Community(FullNode)
100+
}
101+
102+
struct FullNode {
103+
community_node: CommunityNode,
104+
jailed_until: Option<Timespec>, // TODO: should there be jailing? for invalid handshake tx submission?
105+
inactive_time: Option<Timespec>, // TODO: when missing submitting handshake tx? should validator have 2 "inactive times"?
106+
}
107+
108+
pub struct CommunityNode {
109+
pub name: Name, // name / moniker (just for reference / human use)
110+
pub security_contact: Contact, // optional security@... email address
111+
pub confidential_init: ConfidentialInit, // contains the keypackage
112+
}
113+
```
114+
### extra network params
115+
- community node minimal required stake
116+
- mls handshake commit timeout
117+
- slash rate for invalid commits?
118+
- keypackage expiration time (as there are consensus-related rules --
119+
e.g. when to remove nodes with expired keypackages whose TDBE failed to submit update in time;
120+
when/how often update can be submitted -- it probably should be a consensus parameter)
121+
122+
### genesis.json creation
123+
The genesis generation ceremony needs to happen in several steps:
124+
125+
1. Keypackage-independent parameters are exchanged/specified as compile-time parameters (for the light client trusted anchor basis)
126+
127+
2. enclave code is compiled with these parameters and signed on by the production key
128+
129+
3. administrators of nodes participating from genesis obtain and verify the signed binaries and use them to generate their keypackages
130+
131+
4. they exchange keypackages
132+
133+
5. leftmost / first validator's node administrator is responsible for generating the first MLSHandshake CommitChange that's included in genesis.json
134+
and generating and distributing genesis.json (that's verified by other parties)
135+
136+
### extra TX types
137+
```
138+
Enclave
139+
...
140+
Public
141+
...
142+
NodeJoin -> CouncilNodeJoin (just rename)
143+
CommunityNodeJoin
144+
MLSHandshake
145+
CommitChange
146+
SelfUpdateProposal
147+
```
148+
149+
CommunityNodeJoin is similar to CouncilNodeJoin, but has fewer rules, as there's no consensus key.
150+
TODO: extra rules for CouncilNode -> CommunityNode (needs to unbond / become inactive first?)
151+
152+
#### Handshake transactions
153+
internal Vec<u8> payloads are expected to be encoded in the TLS standard binary encodings
154+
(TODO:
155+
[the draft MLS architecture doc](https://github.com/mlswg/mls-architecture/blob/master/draft-ietf-mls-architecture.md):
156+
> In addition, it does not specify a complete wire encoding, but rather a set of abstract data structures which can then be mapped onto a variety of concrete encodings, such as TLS {{?RFC8446}}, CBOR {{?RFC7049}}, and JSON {{?RFC7159}}.
157+
158+
Besides X.509 (to reuse TLS RA stuff) identities in keypackages, we may potentially switch to SCALE to keep it simpler;
159+
TLS wire format is mainly interesting for test-vectors / unit tests, but the protocol draft is too in flux at the moment.
160+
).
161+
```rust
162+
pub struct CommitChangeTx {
163+
messages: Vec<Vec<u8>>, // MLSPlaintext -- any proposals (Add or Remove), the last one is assumed to be Commit
164+
welcome: Option<Vec<u8>>, // Welcome -- if there are any Add proposals, there should be a welcome with encrypted paths/epochs for new joiners
165+
}
166+
```
167+
```rust
168+
pub struct SelfUpdateProposal {
169+
proposal: Vec<u8>, // MLSPlaintext -- Update
170+
commit: Vec<u8>, // MLSPlaintext -- Commit
171+
}
172+
```
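
The structural rules stated in the `CommitChangeTx` comments (proposals are Add or Remove, the last message is the Commit, and a Welcome accompanies any Add) could be checked roughly as follows. This sketch assumes the opaque `Vec<u8>` payloads have already been decoded into a message kind; `MessageKind`, `ShapeError` and the function name are hypothetical, and whether a Welcome without any Add should be rejected is left open.

```rust
/// Hypothetical decoded kind of each MLSPlaintext in `CommitChangeTx::messages`.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum MessageKind {
    Add,
    Remove,
    Update,
    Commit,
}

#[derive(Debug, PartialEq, Eq)]
pub enum ShapeError {
    Empty,
    LastNotCommit,
    CommitNotLast,
    UnexpectedUpdate,
    MissingWelcome,
}

/// Checks only the shape of a `CommitChangeTx`, not its cryptographic validity.
pub fn check_commit_change_shape(
    kinds: &[MessageKind],
    has_welcome: bool,
) -> Result<(), ShapeError> {
    // The last message is assumed to be the Commit.
    let (last, proposals) = kinds.split_last().ok_or(ShapeError::Empty)?;
    if *last != MessageKind::Commit {
        return Err(ShapeError::LastNotCommit);
    }
    // Everything before it must be an Add or Remove proposal.
    for kind in proposals {
        match kind {
            MessageKind::Commit => return Err(ShapeError::CommitNotLast),
            MessageKind::Update => return Err(ShapeError::UnexpectedUpdate),
            MessageKind::Add | MessageKind::Remove => {}
        }
    }
    // Any Add proposal requires a Welcome for the new joiners.
    if proposals.contains(&MessageKind::Add) && !has_welcome {
        return Err(ShapeError::MissingWelcome);
    }
    Ok(())
}
```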
173+
174+
##### TDBE handling / generation
175+
TDBE runs as a standalone enclave which includes TM light client and reacts to information received from it.
176+
It opens mutually attested TLS connection to tx-validation enclave (TVE) for pushing exported obfuscation keys.
177+
178+
###### Commits
179+
When a block is committed with nodejoin, unbond or punishment events (liveness/byzantine issues, but also including expired keypackages),
180+
Node's TDBE corresponding to the leftmost non-empty leaf should generate and broadcast `CommitChangeTx` with proposals reflecting the triggering block's change (Add/Remove).
181+
182+
[[ TODO: instead of leftmost non-empty leaf, start with `triggering block time ``mod`` leaf_count` closest non-empty to the right circulating to 0 ? ]]
183+
184+
(It should use the triggering block's number as AAD.)
185+
If there are only Add proposals (post-genesis/group creation), populating the path can be omitted.
186+
187+
If CommitChangeTx is not received in a block with time less than the triggering block's time + timeout,
188+
OR received CommitChangeTx was invalid;
189+
190+
node's TDBE corresponding to the next leftmost non-empty leaf should generate and broadcast `CommitChangeTx` -- it should include
191+
a Remove proposal for the previous node(s) (that failed to submit in time or submitted invalid `CommitChangeTx`;
192+
it should be indicated in timeout-committed block
193+
);
194+
and include any additional proposals (if any) since the original triggering block;
195+
(it should take the latest "timeout" block's number as AAD.)
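
The committer selection and timeout rule above can be sketched as two small functions. The names are hypothetical, times are assumed to be seconds as `u64`, and a leaf that failed to commit is assumed to stay occupied until its Remove proposal is actually committed, so that the n-th failure selects the (n+1)-th non-empty leaf from the left.

```rust
/// Index of the leaf whose TDBE is expected to broadcast `CommitChangeTx`,
/// after `failed_attempts` earlier committers timed out or submitted an
/// invalid commit. `leaf_occupied[i]` is true for non-empty leaves.
pub fn expected_committer(leaf_occupied: &[bool], failed_attempts: usize) -> Option<usize> {
    leaf_occupied
        .iter()
        .enumerate()
        .filter(|(_, occupied)| **occupied)
        .map(|(index, _)| index)
        .nth(failed_attempts)
}

/// True if a `CommitChangeTx` in a block with time `block_time` still meets the
/// deadline: block time strictly less than triggering block time + timeout.
pub fn commit_in_time(block_time: u64, trigger_time: u64, timeout: u64) -> bool {
    block_time < trigger_time.saturating_add(timeout)
}
```

For example, with leaves `[empty, A, B, empty, C]`, node A is expected to commit first, then B after one failure, then C.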
196+
197+
###### Updates
198+
After 1/3 of keypackage's lifetime is over, TDBE is allowed to generate and broadcast `SelfUpdateProposal`
199+
200+
-- it may happen that there's another committed `SelfUpdateProposal` (from a different sender) or `CommitChangeTx`,
201+
which makes TDBE's original `SelfUpdateProposal` invalid -- in which case, it should re-generate and retry.
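
As a rough illustration of the "1/3 of lifetime" rule, assuming the keypackage lifetime is given as a `not_before` / `not_after` pair of timestamps in seconds (hypothetical parameter names) and that expiry is handled elsewhere:

```rust
/// True once at least one third of the keypackage lifetime has elapsed.
/// Does not check whether the keypackage has already expired.
pub fn self_update_allowed(not_before: u64, not_after: u64, now: u64) -> bool {
    // Reject malformed lifetimes and timestamps before the validity window.
    if not_after <= not_before || now < not_before {
        return false;
    }
    let lifetime = not_after - not_before;
    let elapsed = now - not_before;
    // elapsed >= lifetime / 3, cross-multiplied to avoid rounding.
    u128::from(elapsed) * 3 >= u128::from(lifetime)
}
```

chain-abci would need the same check to decide whether a received `SelfUpdateProposal` is valid, which is one reason the expiration time is listed as a candidate network parameter.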
202+
203+
###### new obfuscation key
204+
Once the Commit is applied (or state is reconstructed from Welcome), TDBE should generate a new obfuscation key as:
205+
```
206+
new_key = MLS-Exporter(
207+
Label="Crypto.com Chain tx validation " + block number where CommitChangeTx is included,
208+
Context=(that updated group's context),
209+
length=(AES_128_GCM_SIV key length)
210+
)
211+
```
212+
and it should push it over mutually attested TLS to TVE.
213+
TVE should delete the old key when abci signals a valid `CommitChangeTx`.
214+
[[ TODO: signalling to TVE may need to be delayed by some time to allow for "NACK" ]]
215+
216+
As with https://github.com/mlswg/mls-protocol/blob/master/draft-ietf-mls-protocol.md#deletion-schedule
217+
218+
TDBE should delete the secrets + exported key after it's been pushed to TVE.
219+
220+
##### abci processing
221+
In MLS Architecture terminology:
222+
- Tendermint serves as Delivery Service (DS)
223+
- IAS/DCAP/... + Tendermint (staking states) serve as Authentication Service (AS)
224+
225+
As such, abci app (chain-abci) needs to have some understanding of the logic of `MLSHandshake` transaction types.
226+
227+
Notably, it needs:
228+
- to keep track of (MLS) Leaf<->staking address mapping in order to execute corresponding updates on stake
229+
- to keep track of keypackage lifetimes -- to limit frequency / know when `SelfUpdateProposal` are valid; for validators whose corresponding keypackage expires,
230+
they should be removed from the validator set (similar to liveness fault handling)
231+
- to keep track of whether a valid `CommitChangeTx` was received in time -- invalid ones receive similar treatment to byzantine faults (removal from the validator set if a validator + slash)
232+
- to signal to TVE valid `CommitChangeTx` -- TVE will erase its current obfuscation key in memory,
233+
and block/reject requests until it's pushed a new key from TDBE
234+
- after CommitChangeTx -- after block commit / in the next block (or two blocks?), enquire TVE if it was pushed a new key;
235+
if not, there are two possibilities
236+
- local node problem (e.g. no running TDBE)
237+
- invalid Commit (e.g. DirectPath)
238+
In any case, it should punish the proposer/submitter of that `CommitChangeTx`
239+
(chain-abci should know the identity of its own staking address / TDBE and shut itself down if the `CommitChangeTx` originated from its "node")
240+
and expect to receive `CommitChangeTx` from the next leaf.
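
The TVE side of this interaction (erase the key when a valid `CommitChangeTx` is signalled, reject requests until TDBE pushes a new key, and let abci enquire whether a key has arrived) can be sketched as a two-state slot. `KeySlot` and its methods are hypothetical names; the 16-byte key assumes the AES_128_GCM_SIV key length mentioned above, and a real enclave would also zeroize the old key material rather than simply dropping it.

```rust
/// Hypothetical in-memory state of the obfuscation key inside TVE.
#[derive(Debug, PartialEq, Eq)]
pub enum KeySlot {
    /// A key is available and requests can be served.
    Active([u8; 16]),
    /// The old key was erased; waiting for TDBE to push the next one.
    AwaitingKey,
}

impl KeySlot {
    /// abci signalled a valid `CommitChangeTx`: drop the current key.
    pub fn on_commit_change(&mut self) {
        *self = KeySlot::AwaitingKey;
    }

    /// TDBE pushed a freshly exported key over mutually attested TLS.
    pub fn on_key_pushed(&mut self, key: [u8; 16]) {
        *self = KeySlot::Active(key);
    }

    /// Used by abci after the commit to check whether a new key arrived.
    pub fn has_key(&self) -> bool {
        matches!(self, KeySlot::Active(_))
    }

    /// Returns the key for a request, or `None` while requests must be rejected.
    pub fn key_for_request(&self) -> Option<&[u8; 16]> {
        match self {
            KeySlot::Active(key) => Some(key),
            KeySlot::AwaitingKey => None,
        }
    }
}
```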
241+
242+
[[ OPEN ISSUE:
243+
https://github.com/mlswg/mls-protocol/issues/21
244+
for invalid Commits, the punishment of the proposer/submitter
245+
may need to be delayed until one receives a valid "NACK"
246+
(revealing that the proposer submitted bogus) from affected TDBE.
247+
]]
