Trustchain proposal

Decentralised ID - Trustchain

Tim Hobson, Feb 2022

Purpose

The aim of this document is to:

further our discussions on the subject of decentralised identity by setting out some concrete assertions (which I attempt to justify) and one key hypothesis,
put forward a rough proposal from which a prototype decentralised identity system might be developed.

Assertions

A decentralised Public Key Infrastructure (PKI) is a necessary ingredient for a decentralised ID system and is sufficient for many interesting use cases.
A public blockchain in which consensus is achieved via proof of work (PoW) can provide verifiable and universally accessible timestamping.
The trustworthiness of a timestamp in a PoW blockchain increases in proportion to the total network hash rate and the length of the interval since the timestamp was generated.
A blockchain (or DLT) cannot provide any trust assurances beyond verifiable timestamping.
It is possible to verify timestamps without downloading the entire blockchain.

Justification

Assertion 1. Verifiable credentials enabling selective disclosure are (or should be) central to any ID system. In a centralised system both individual users and relying parties can be assumed to be in possession of public keys belonging to the central ID service provider, with whom they interact via a dedicated app or portal. By contrast, in a decentralised setting the set of legal entities that may act as credential issuers is potentially large and will change over time. In order to verify those credentials, relying parties need to know the public keys associated with each issuer. Users also need to be able to reliably associate public keys with credential issuers and service providers, so that they can selectively share information over private communication channels. This is the necessity argument.

Some use cases are given at the end of this document to support the claim that decentralised PKI is also sufficient for certain applications of interest.

Assertion 2. A transaction in a PoW blockchain cannot be modified or deleted unless all of the (computational) work done since the transaction was included in a block is re-done. This work must also be performed at a rate exceeding that at which the original chain is being worked (or "mined").

The computations required to work the chain are of a specific nature (i.e. repeated executions of a cryptographic hash function) and specialist hardware exists for this purpose with well-known performance characteristics. It is also possible to accurately estimate the amount of computational effort being expended by the network at any time, by examining the proofs of work inserted in each block. With this information, together with approximate prices for electrical power, it is possible to estimate the cost of modifying any transaction in the chain as a function of the time since it was inserted.
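As a rough illustration of how network effort can be read off the chain: in Bitcoin, finding a block at difficulty D takes approximately D x 2^32 hash evaluations on average, and the protocol targets one block every 600 seconds. The sketch below turns a difficulty value into an implied hash rate. The difficulty used in the example is an illustrative assumption of about the right magnitude for early 2022, not a figure taken from this document.

```python
# Approximate expected number of hash evaluations per block at difficulty 1.
HASHES_PER_DIFFICULTY_UNIT = 2 ** 32
# Bitcoin's target interval between blocks, in seconds.
TARGET_BLOCK_INTERVAL_S = 600


def estimated_hash_rate(difficulty: float,
                        block_interval_s: float = TARGET_BLOCK_INTERVAL_S) -> float:
    """Estimate the network hash rate (H/s) implied by a block difficulty."""
    return difficulty * HASHES_PER_DIFFICULTY_UNIT / block_interval_s


# Illustrative difficulty only (an assumption, not a value from the proposal).
example_difficulty = 2.7e13
example_rate = estimated_hash_rate(example_difficulty)
print(f"{example_rate:.2e} H/s")
```

Multiplying such a rate by an assumed energy cost per hash would give a cost per unit time, which is the quantity the cost estimate depends on.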

In this way the timestamp on a transaction can be verified in a quantitative manner by computing how expensive it would be to execute an attack to produce a deceptive timestamp. Very little computational effort is required to perform the verification.

The blockchain data structure contains all of the information needed to verify the timestamp so no trust in any third party is required.

Assertion 3. Because the verification mechanism involves computing the economic cost of an attack that retrospectively reorganises the blockchain (a "51% attack"), it follows that the trustworthiness of timestamps in a PoW blockchain increases in proportion to the network hash rate (the total amount of computation being performed per unit time by all nodes on the blockchain network), which itself increases linearly with the rate at which electrical energy is expended.

As of Feb 2022, the total hash rate of the Bitcoin network is ~2x10^20 H/s and the approximate cost (in USD) to modify the history of the chain, as a function of the temporal depth of the attack, is as follows:

Attack depth	0	1 hour	1 day	1 week	1 month
Cost (USD)	<$1	$1,725,383	$41.4m	$290m	$1.26b

It is very cheap to create honest timestamps but very expensive to create deceptive ones.
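The cost figures in the table above scale linearly with the depth of the attack. A minimal sketch that reproduces them, assuming the one-hour figure can be treated as a constant hourly rate and that a month is taken as 730 hours (both are assumptions made here for illustration):

```python
# One-hour figure from the table, treated as a constant hourly rate (assumption).
COST_PER_HOUR_USD = 1_725_383

# Attack depths in hours; the 730-hour month (365 * 24 / 12) is an assumption.
DEPTH_HOURS = {"1 hour": 1, "1 day": 24, "1 week": 168, "1 month": 730}


def attack_cost_usd(depth_hours: float,
                    cost_per_hour: float = COST_PER_HOUR_USD) -> float:
    """Approximate cost of redoing `depth_hours` worth of proof of work."""
    return depth_hours * cost_per_hour


for label, hours in DEPTH_HOURS.items():
    print(f"{label:>8}: ${attack_cost_usd(hours):,.0f}")
```

The same function can be used to estimate the cost of attacking a timestamp of any age, for example one that is six months deep.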

Assertion 4. This one is contentious and cannot be argued briefly, but the main point is as follows. While PoW timestamps can be independently and quantitatively verified, trust assurances attributed to other mechanisms (which may appear superficially similar) are in fact derived from preceding or alternative trust relationships based on digital signatures (in the case of proof of stake) and/or governance models (in the case of permissioned blockchains such as Hyperledger) which cannot be independently verified.

In other words, what is actually being trusted in those cases is not the blockchain or ledger, but the entity or entities responsible for maintaining it. This is not to say that no other decentralised or semi-centralised solutions can be deemed trustworthy, only that their use of a blockchain does not contribute to their degree of trustworthiness.

A corollary is that the act of recording any "real world" data (e.g. an institution's public key) on a blockchain does not make it more trustworthy. It is possible to verify that the data has been there for a certain period of time, but there is no way to verify its correctness or that it relates to a particular legal entity in the real world.

Assertion 5. Transaction IDs in a Bitcoin block are arranged in a Merkle tree data structure and the Merkle root is included in the block header. This makes the block header a commitment to all of the transactions contained in the block, and also provides an efficient way to verify that a given transaction is included. The method, called Simplified Payment Verification (SPV), is described in Section 8 of Nakamoto's Bitcoin whitepaper. An SPV proof (that a transaction exists in a given block) requires only ~350 bytes of data, provided the 80-byte block header is also known. The proof consists of a Merkle path from the given transaction to the Merkle root. The header containing that root can in turn be verified by computing its hash (Bitcoin applies SHA-256 twice) and comparing it with the known block hash.
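A minimal sketch of the Merkle path check at the heart of an SPV proof follows. It uses Bitcoin's conventions (SHA-256 applied twice, and the last node of an odd-length level duplicated). The transaction hashes are placeholders generated for the example, and the function names are illustrative rather than taken from any library.

```python
import hashlib


def dsha256(data: bytes) -> bytes:
    """Bitcoin's double SHA-256."""
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()


def merkle_root(leaves: list[bytes]) -> bytes:
    """Merkle root over transaction hashes, duplicating the last node of odd levels."""
    level = list(leaves)
    while len(level) > 1:
        if len(level) % 2:
            level.append(level[-1])
        level = [dsha256(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
    return level[0]


def merkle_path(leaves: list[bytes], index: int) -> list[bytes]:
    """Sibling hashes from leaf `index` up to (but not including) the root."""
    path, level = [], list(leaves)
    while len(level) > 1:
        if len(level) % 2:
            level.append(level[-1])
        path.append(level[index ^ 1])  # sibling sits at the paired position
        level = [dsha256(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
        index //= 2
    return path


def verify_spv(tx_hash: bytes, index: int, path: list[bytes], root: bytes) -> bool:
    """Recompute the root from a leaf and its Merkle path and compare."""
    node = tx_hash
    for sibling in path:
        # An odd index means the node is a right child, so the sibling goes first.
        node = dsha256(sibling + node) if index & 1 else dsha256(node + sibling)
        index //= 2
    return node == root


# Hypothetical block of five transactions (placeholder hashes, not real data).
tx_hashes = [dsha256(f"tx-{i}".encode()) for i in range(5)]
root = merkle_root(tx_hashes)
proof = merkle_path(tx_hashes, 4)
print(verify_spv(tx_hashes[4], 4, proof, root))
```

The proof grows with the logarithm of the number of transactions in the block: each level of the tree contributes one 32-byte sibling hash.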

The important consequence of this is that mobile devices (or other light clients), which cannot download the entire blockchain (currently ~320 GB in size), can still be used to verify the presence of particular transactions (and therefore to perform timestamp verification) using only the chain of block headers (currently ~58 MB in size) together with (minimal) Merkle proof data which can be requested from other nodes on the network.
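The header chain a light client stores can be sketched as follows. Each 80-byte header contains the hash of its predecessor (bytes 4 to 36) and the Merkle root of its block (bytes 36 to 68). The headers below are synthetic, with placeholder field values, and the sketch checks only the linkage between headers. A real client must also check each header's proof of work against its difficulty target, which is omitted here. The chain height used for the size estimate is an assumption.

```python
import hashlib
import struct

HEADER_SIZE = 80  # bytes per Bitcoin block header


def dsha256(data: bytes) -> bytes:
    """Bitcoin's double SHA-256."""
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()


def make_header(prev_hash: bytes, merkle_root: bytes, timestamp: int) -> bytes:
    """Pack a synthetic header: version, previous hash, Merkle root, time, bits, nonce."""
    return struct.pack("<I32s32sIII", 1, prev_hash, merkle_root, timestamp, 0x1D00FFFF, 0)


def headers_link(headers: list[bytes]) -> bool:
    """Check that every header is 80 bytes and commits to the hash of its predecessor."""
    if any(len(h) != HEADER_SIZE for h in headers):
        return False
    return all(cur[4:36] == dsha256(prev) for prev, cur in zip(headers, headers[1:]))


def merkle_root_of(header: bytes) -> bytes:
    """Extract the Merkle root field against which SPV proofs are checked."""
    return header[36:68]


# Three synthetic headers with placeholder Merkle roots and timestamps.
chain = [make_header(b"\x00" * 32, b"\x11" * 32, 1_644_000_000)]
for i, root_byte in enumerate((b"\x22", b"\x33"), start=1):
    chain.append(make_header(dsha256(chain[-1]), root_byte * 32, 1_644_000_000 + 600 * i))

print(headers_link(chain))

# Storage estimate, assuming a chain height of about 725,000 blocks.
print(f"{HEADER_SIZE * 725_000 / 1e6:.0f} MB")
```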

This makes the proposed system accessible using mobile devices.

[Note: in Bitcoin the use of SPV proofs is discouraged for privacy reasons, since the act of requesting an SPV proof from a full node leaks information about an individual's transactions. In the context of timestamp verification there is no such drawback because the transactions of interest do not transfer monetary value. Instead they form part of an open messaging protocol. SPV proofs are therefore ideal for this application.]

Consequences

Assertions 1 & 2 raise the question: is a universally accessible and practically immutable timestamping message board a useful building block for creating a decentralised public key infrastructure?

The proposal (that follows) is an attempt to answer this question in the affirmative.

It considers the use of ION, which is a messaging system that leverages the Bitcoin blockchain and IPFS to share decentralised identifiers (DIDs), as a building block within a decentralised ID system. Given Assertion 3, it makes sense to exploit the Bitcoin blockchain (as opposed to any other DLT) because its network is openly accessible and has by far the greatest total hash rate.

But if we also accept Assertion 4 then we need some external source of trust for any real world data placed on the blockchain. This is an unfortunate but fundamental truth that we cannot avoid. (In the context of smart contracts it's referred to as the Oracle Problem.)

It follows that we need some sort of trusted setup.

[Note that this point appears not to be accepted by the authors of the 2015 paper Decentralized Public Key Infrastructure, but neither do they explain or justify the idea that simply registering a real world identifier (such as a web domain) with an on-chain identifier somehow makes either of those identifiers more trustworthy.]

The good news is that although we need an external source of trust, we only need it once. That is, provided we can embed (using ION) a single transaction in the (Bitcoin) blockchain that can be trusted, then that transaction can form the root of trust on which a decentralised PKI can be built.

Key attributes of ION

ION is a second-layer protocol that writes data into Bitcoin transactions.
The data written on-chain consists only of SHA-256 hash digests (256-bit strings) which act as content identifiers on the IPFS distributed file system.
The content identifier enables files (including DID documents) to be retrieved from IPFS nodes, and the content of each retrieved file can be verified by recomputing its SHA-256 hash.
This guarantees that the file content stored on IPFS is precisely that committed to in the (verifiably timestamped) Bitcoin transaction. As such, all DID content is itself verifiably timestamped.
DID documents typically contain identifying information about legal entities including public keys, top level domains and service (API) endpoints.
Each on-chain transaction can embed a batch of ION processes corresponding to many new or updated DIDs, enabling scalability.
The majority of ION code is developed as a blockchain-agnostic protocol (called "Sidetree").
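
The content check described in the list above can be sketched as follows. This is a simplification: it compares a raw hexadecimal SHA-256 digest, whereas real IPFS content identifiers wrap the digest in a self-describing encoding and large files are split into chunks. The DID document shown is a made-up placeholder.

```python
import hashlib


def matches_commitment(content: bytes, committed_digest_hex: str) -> bool:
    """Return True if `content` hashes to the digest committed to on-chain."""
    return hashlib.sha256(content).hexdigest() == committed_digest_hex.lower()


# Hypothetical DID document and the digest that would be written on-chain.
did_document = b'{"id": "did:example:123", "publicKey": "placeholder"}'
on_chain_digest = hashlib.sha256(did_document).hexdigest()

print(matches_commitment(did_document, on_chain_digest))
print(matches_commitment(did_document + b" ", on_chain_digest))
```

Any change to the retrieved file, however small, produces a different digest, so a file that passes the check is the one that was timestamped.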

Proposal & example use cases

Hypothesis:

Verifiable and universally accessible timestamping can be used to establish a trustworthy decentralised PKI.

The basic idea is that a chain of trust can be built from a single trusted transaction, and a one-off well publicised event in which prominent institutions take part could be used to make that particular transaction so generally known and recognised as to make fraud impossible.

Prominent institutions could include, e.g., IETF, W3C, ISO, NIST, World Bank, IMF, EU Commission, etc., but should include at least one entity trusted by each nation that wants to participate in the ID system.

At this event, a Bitcoin transaction is constructed that is invalid unless it is signed by all of the participating institutions. The transaction contains an ION reference to a DID document that itself contains a list of trusted public keys. Each key is associated (in the DID document) with one of the institutions and by signing the transaction they confirm that the public keys are genuine (i.e. that one institution securely holds each of the corresponding private keys).

Unless all of the institutions agree on the correctness of the information in the DID document, they will not sign the transaction and it can never be mined into the blockchain.

There is some circularity here in that the signing of the transaction itself implies the association of a public key with each of the participating institutions. This is why the event must be well publicised, with prominent members from each institution taking part. The act of signing the transaction must be tied publicly to the recognised (and trusted) institution. The event itself might be costly, but it would be a one-off.

After the root transaction has been mined into the blockchain, nothing is done for some extended period of time, say six months. After that time it is sufficiently deep to make modification practically impossible, and exposure in the case of fraud practically inevitable.

Once this root transaction has been established, the trusted entities (whose public keys are referenced in the DID document) can delegate trust to other entities by publishing new transactions in the blockchain, each referring to a new DID document containing another set of trusted public keys. We refer to these as downstream decentralised identifiers (dDIDs).

[Figure: Downstream DID]

For instance, a central bank whose keys are in the root transaction might subsequently publish a DID containing the public keys of all commercial banks to which it has granted a banking license. The central bank is then responsible for making sure that the keys in this new DID are correct, and it signs the new DID to demonstrate its confidence in the trustworthiness of the information it contains.

Any user wanting to interact with a commercial bank (either to do business with them or to verify a credential issued by them) can look on the blockchain to find their public key in a DID. There may be many (fraudulent) DIDs on the chain that refer to this same bank, all with different public keys. But the user is able to distinguish which is the valid DID by checking its signature. Unless the DID is signed by the central bank (whose public key is available in the root DID), it will not be trusted.

Chains of trusted DIDs can be established in this way. Anybody can verify that a particular public key is trustworthy by checking the chain of signatures which must lead ultimately to the single root transaction created in the trusted setup event. If it does, then there is a chain of trust from an entity in the initial setup down to the entity whose public key is contained in the DID, so that public key can be assumed trustworthy.
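
The chain-of-signatures check can be sketched as a walk from a DID up to the root. The model below is a simplification made for illustration: each DID carries a single public key and names the one upstream DID that signed it, whereas the proposal allows a DID document to contain many keys. All names are hypothetical. Signature verification is injected as a function, and `toy_sign`/`toy_verify` are insecure stand-ins (anyone who knows the public key could forge them) that exist only so the example runs with the standard library. A real system would use a proper signature scheme such as ECDSA or Ed25519.

```python
import hashlib
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass(frozen=True)
class Did:
    name: str               # hypothetical identifier
    public_key: bytes       # key of the entity this DID describes
    issuer: Optional[str]   # upstream DID that signed this one (None if unsigned)
    signature: bytes = b""


# (public_key, message, signature) -> bool
Verifier = Callable[[bytes, bytes, bytes], bool]


def payload(did: Did) -> bytes:
    """Bytes covered by the issuer's signature."""
    return did.name.encode() + b"|" + did.public_key


def chain_is_trusted(name: str, registry: dict[str, Did],
                     root_name: str, verify: Verifier) -> bool:
    """Follow issuer links upwards, checking each signature, until the root is reached."""
    seen = set()
    while name != root_name:
        did = registry.get(name)
        if did is None or did.issuer is None or name in seen:
            return False
        seen.add(name)
        issuer = registry.get(did.issuer)
        if issuer is None or not verify(issuer.public_key, payload(did), did.signature):
            return False
        name = did.issuer
    return root_name in registry


def toy_sign(public_key: bytes, message: bytes) -> bytes:
    """INSECURE placeholder for a signature; for demonstration only."""
    return hashlib.sha256(public_key + message).digest()


def toy_verify(public_key: bytes, message: bytes, signature: bytes) -> bool:
    return toy_sign(public_key, message) == signature


def issue(name: str, key: bytes, issuer: Did) -> Did:
    """Create a DID signed (with the toy scheme) by `issuer`."""
    unsigned = Did(name, key, issuer.name)
    return Did(name, key, issuer.name, toy_sign(issuer.public_key, payload(unsigned)))


root = Did("root", b"root-key", None)
central_bank = issue("central_bank", b"cb-key", root)
bank_a = issue("bank_a", b"bank-a-key", central_bank)
mallory = Did("mallory", b"mallory-key", None)        # not connected to the root
fake_bank_a = issue("fake_bank_a", b"fake-key", mallory)
registry = {d.name: d for d in (root, central_bank, bank_a, mallory, fake_bank_a)}

print(chain_is_trusted("bank_a", registry, "root", toy_verify))
print(chain_is_trusted("fake_bank_a", registry, "root", toy_verify))
```

The fraudulent DID is rejected not because its own signature is invalid, but because following its issuers never arrives at the root.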

Lists of revoked public keys can also be published, and should be deemed trustworthy provided they are signed by an entity higher up in the trust chain than the DID being revoked.

Revocation transactions might be missed by clients if they are unable to connect to the network. In that case they could be tricked into accepting a public key that has in fact been revoked (e.g. because the private key was compromised). However, the public nature of the Bitcoin blockchain makes this unlikely. Even an SPV client would be able to identify revoked public keys as soon as it can connect to any honest full node (by requesting revocation transactions affecting a given DID). Revocation could also be achieved by giving DIDs an expiry date and republishing them periodically.
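
A minimal sketch of the two revocation mechanisms mentioned above: a revocation counts only if it comes from an entity higher up the chain of trust than the DID being revoked, and a DID with an expiry date stops being usable once that date has passed. The names and dates are hypothetical, and the sketch assumes the signatures on revocations have already been verified.

```python
from datetime import datetime, timezone
from typing import Optional


def is_upstream(candidate: str, target: str, issuer_of: dict[str, str]) -> bool:
    """True if `candidate` lies strictly above `target` on the path to the root."""
    seen = set()
    current: Optional[str] = issuer_of.get(target)
    while current is not None and current not in seen:
        if current == candidate:
            return True
        seen.add(current)
        current = issuer_of.get(current)
    return False


def is_usable(did: str, issuer_of: dict[str, str],
              revocations: list[tuple[str, str]],
              expires_at: dict[str, datetime], now: datetime) -> bool:
    """A DID is usable if it has not expired and no upstream entity has revoked it."""
    expiry = expires_at.get(did)
    if expiry is not None and now >= expiry:
        return False
    return not any(target == did and is_upstream(signer, did, issuer_of)
                   for signer, target in revocations)


# Hypothetical trust tree: root -> central_bank -> {bank_a, bank_b}.
issuer_of = {"central_bank": "root", "bank_a": "central_bank", "bank_b": "central_bank"}
# (signer, revoked DID) pairs; the second is ignored because bank_a is not upstream of bank_b.
revocations = [("central_bank", "bank_a"), ("bank_a", "bank_b")]
expires_at = {"bank_b": datetime(2023, 1, 1, tzinfo=timezone.utc)}
now = datetime(2022, 6, 1, tzinfo=timezone.utc)

print(is_usable("bank_a", issuer_of, revocations, expires_at, now))
print(is_usable("bank_b", issuer_of, revocations, expires_at, now))
```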

Downstream DIDs could be used to bestow different levels of "trust permission" on the legal entities whose public keys they contain. The most basic would be an attestation granting the right to use their stated public key within the system. This might be afforded to everyday businesses. A higher level of "trust permission" would grant an entity the right to create their own downstream DIDs, thereby delegating responsibility further along a branch in the chain of trust. Even higher permission would be the right to create downstream DIDs that can themselves be used to delegate trust. Delegation of higher levels of trust permission could require a greater number of signatures from upstream entities, or a signature from a particular upstream entity.
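
One way to read the three levels of "trust permission" is as an ordered scale in which an issuer may grant only levels strictly below its own. That rule, the level names and the signature thresholds in the sketch below are all assumptions made for illustration. The proposal does not fix them.

```python
from enum import IntEnum


class TrustPermission(IntEnum):
    USE_KEY = 0          # may use its stated public key within the system
    ISSUE = 1            # may also create downstream DIDs
    ISSUE_DELEGABLE = 2  # may also create downstream DIDs that can themselves delegate


# Hypothetical policy: higher levels need more upstream signatures.
REQUIRED_SIGNATURES = {
    TrustPermission.USE_KEY: 1,
    TrustPermission.ISSUE: 2,
    TrustPermission.ISSUE_DELEGABLE: 3,
}


def may_issue(issuer: TrustPermission, granted: TrustPermission) -> bool:
    """Assumed rule: an issuer may grant only a level strictly below its own."""
    return granted < issuer


def delegation_allowed(issuer: TrustPermission, granted: TrustPermission,
                       upstream_signatures: int) -> bool:
    """Combine the level rule with the hypothetical signature threshold."""
    return may_issue(issuer, granted) and upstream_signatures >= REQUIRED_SIGNATURES[granted]


print(delegation_allowed(TrustPermission.ISSUE, TrustPermission.USE_KEY, 1))
print(delegation_allowed(TrustPermission.ISSUE, TrustPermission.ISSUE, 2))
```

Under this reading an everyday business holding only the basic level cannot issue any DIDs, because there is no level below its own.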

[Figure: Trustchain]

This public key infrastructure is decentralised in the sense that delegation of responsibility fans out along branches to create a "tree of trust". Ultimately trust can be traced back to the entities in the root transaction, but there are many of these entities, they represent institutions from around the world, and they can delegate trust to smaller entities.

The public event need not be unique. There could be many of them and participants can decide which root transactions to trust, based on their own level of knowledge about the event in which they were generated and published. The basic requirements would be:

a trusted entity publicly declaring their participation in the generation of the root transaction
a long-ish period (say 6 months) since the transaction was mined, making fraud practically impossible (as it would be easily exposed by the trusted entity).

In this scenario of multiple root transactions there is a trade-off between decentralisation (which is enhanced) and interoperability (which is impaired).

For instance, suppose the UK DfE participated in a root transaction. It could then create a downstream DID containing a public key for each UK university. Then universities could issue verifiable credentials (VCs) to their alumni as digital degree certificates. These could be used by employers or universities in foreign countries to confirm applicants' qualifications. However, the foreign institution would need to recognise the root transaction from the DfE. If that transaction were instead a downstream DID transaction signed by a higher-level UK government entity then the foreign institution would be more likely to recognise it.

This decentralised PKI could also play a role in enabling relatively loosely-coupled federations of existing ID systems, provided some common standards exist for operations such as user authentication.

Suppose for instance that two national ID systems are compliant with some such standard and are willing to recognise each other's credentials. Each ID system would need to be registered in the decentralised PKI system, thereby sharing their own public keys. They could then jointly sign a new downstream DID which communicates both their compliance with the relevant standard(s) and their willingness to interoperate. As soon as this new DID is published, it is downloaded and verified by users of both systems (automatically by their devices).

When a user of one system wants to authenticate against the other (e.g. when travelling), they provide the authentication data required by their home system, together with a reference to the "interoperability" DID. The user's data is encrypted with the public key belonging to their home ID system and delivered to the foreign system as an authentication request. The relying party operating within the foreign ID system retrieves the DID document specified by the user and validates the chain of signatures back to a recognised root transaction (it may also check the DID's permission level). It extracts from the DID document both the public key of the user's home ID system and the API endpoint (URI) for a proxy authentication request. It then signs the user's encrypted data and sends it to the API endpoint together with its own DID details and a randomly chosen nonce to identify the request.

The home system receives the request, verifies that the DID of the foreign entity has its own valid chain of trust back to a root transaction and extracts that entity's public key from the corresponding DID document. It can then verify the signature on the request. If the foreign entity is recognised and the request is valid, the user's home system decrypts the user's data, performs the authentication checks as if it were a local request, signs the result of the operation and returns it to the foreign entity together with the nonce from the request. Finally, the foreign system checks the signature on the response and, if valid, accepts the result.
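
The messages exchanged in this flow, and the way the nonce ties a response to its request, can be sketched as follows. The field names are hypothetical. Encryption, signing and DID chain validation are deliberately left out: the sketch takes the outcome of the signature check as an input and shows only the request/response correlation.

```python
import secrets
from dataclasses import dataclass


@dataclass(frozen=True)
class AuthRequest:
    encrypted_user_data: bytes  # encrypted for the home system (not done in this sketch)
    interop_did: str            # reference to the "interoperability" DID
    requester_did: str          # DID of the foreign relying party
    nonce: str                  # random value identifying this request
    signature: bytes = b""      # foreign party's signature (not computed here)


@dataclass(frozen=True)
class AuthResponse:
    authenticated: bool
    nonce: str                  # must echo the nonce of the request
    signature: bytes = b""      # home system's signature (not computed here)


def new_request(encrypted_user_data: bytes, interop_did: str, requester_did: str) -> AuthRequest:
    """Build a proxy authentication request with a fresh random nonce."""
    return AuthRequest(encrypted_user_data, interop_did, requester_did, secrets.token_hex(16))


def accept_response(request: AuthRequest, response: AuthResponse, signature_ok: bool) -> bool:
    """Accept only a validly signed, successful response that carries the request's nonce."""
    return (signature_ok
            and secrets.compare_digest(request.nonce, response.nonce)
            and response.authenticated)


request = new_request(b"<ciphertext>", "did:example:interop", "did:example:foreign-rp")
good = AuthResponse(authenticated=True, nonce=request.nonce)
replayed = AuthResponse(authenticated=True, nonce="0" * 32)

print(accept_response(request, good, signature_ok=True))
print(accept_response(request, replayed, signature_ok=True))
```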

Without the decentralised PKI system a similar arrangement would involve many bilateral relationships between identity providers who would then need to securely share these agreements with all users and relying parties, updating public keys on their client devices. In that scenario device updates would become an attack vector and the shared state across different participants in the system could easily become unsynchronised.
