From ba7a02ef311d9301440fda810db8f5eca7d9f699 Mon Sep 17 00:00:00 2001 From: Brian Lawrence Date: Mon, 10 Jun 2024 14:28:07 -0700 Subject: [PATCH] Edit intro stuff into the file --- easy/main.typ | 3 + easy/src/fhe0.typ | 15 +++- easy/src/fhe2.typ | 33 ++++++--- easy/src/fhe3.typ | 81 ++++++++++++-------- easy/src/h-zksnark.typ | 29 ++++++++ easy/src/intro.typ | 163 +++++++++++++++++++++++++++++++++++++++++ easy/src/lwe.typ | 10 +-- easy/src/plonk.typ | 2 +- easy/src/zkintro.typ | 17 +++++ 9 files changed, 305 insertions(+), 48 deletions(-) create mode 100644 easy/src/h-zksnark.typ create mode 100644 easy/src/intro.typ create mode 100644 easy/src/zkintro.typ diff --git a/easy/main.typ b/easy/main.typ index 95c8cd4..c21054b 100644 --- a/easy/main.typ +++ b/easy/main.typ @@ -1,4 +1,5 @@ #import "src/preamble.typ":* + #let chapter(filename) = { include filename pagebreak(weak: true) @@ -23,6 +24,8 @@ #toc #pagebreak() +#chapter("src/intro.typ") + #part[Oblivious transfer, garbled circuits, and multiparty computation] #chapter("src/mpc.typ") #chapter("src/ot.typ") diff --git a/easy/src/fhe0.typ b/easy/src/fhe0.typ index 0651da6..5090a2f 100644 --- a/easy/src/fhe0.typ +++ b/easy/src/fhe0.typ @@ -3,9 +3,18 @@ = Introduction to fully homomorphic encryption -Fully homomorphic encryption (FHE) lets you encrypt a message, and then -other people can perform arbitrary operations on the encrypted message -without being able to read the message. +Alice has a secret $x$, and Bob has a function $f$. +They want to compute $f(x)$. +Actually, Alice wants Bob to compute $f(x)$ -- but +she doesn't want to tell him $x$. + +Alice wants to encrypt $x$ and send Bob $Enc (x)$. +Then Bob is going to "apply $f$ to the cyphertext", +to turn $Enc (x)$ into $Enc (f(x))$. +Finally, Bob sends $Enc (f(x))$ back, +and Alice decrypts it to learn $f(x)$. + +This is fully homomorphic encryption (FHE). Levelled FHE is a sort of weaker version of FHE. 
Like FHE, levelled FHE lets you perform operations on encrypted data. But unlike FHE, there diff --git a/easy/src/fhe2.typ b/easy/src/fhe2.typ index 2bdd41c..ae9d5aa 100644 --- a/easy/src/fhe2.typ +++ b/easy/src/fhe2.typ @@ -6,7 +6,7 @@ The learning with errors problem (@lwe) is one of those "hard problems that you can build cryptography on." The problem is to solve for constants -$ a_1, dots, a_n in ZZ / (q ZZ), $ given a bunch of +$ a_1, dots, a_n in ZZ \/ q ZZ, $ given a bunch of #emph[approximate] equations of the form $ a_1 x_1 + dots.h + a_n x_n = y + epsilon.alt , $ where each $epsilon.alt$ is a "small" error (in the linked example, $epsilon.alt$ @@ -111,12 +111,28 @@ Let’s say you randomly choose the first 4 rows: , kind: table ) -Now you add them up to get the following. | $upright(bold(x)) : y_0$ | | -\- | | (7, 5, 1, 6) : 6 | +Now you add them up to get the following. +#figure( + align(center)[#table( + columns: 1, + align: (auto,), + table.header([$upright(bold(x)) : y_0$],), + [(7, 5, 1, 6) : 6], +)], + kind: table +) Finally, let’s say your message is $m = 5$. So you set -$y = y_0 - m = 6 - 5 = 1$, and send the cyphertext: | -$upright(bold(x)) : y_0$ | | - | | (7, 5, 1, 6) : 1. | +$y = y_0 - m = 6 - 5 = 1$, and send the cyphertext: +#figure( + align(center)[#table( + columns: 1, + align: (auto,), + table.header([$upright(bold(x)) : y_0$],), + [(7, 5, 1, 6) : 1], +)], + kind: table +) == Decryption @@ -166,7 +182,6 @@ rather than just a single bit. When we do FHE, we’re going to apply many operations to a cyphertext, and each is going to cause the error to grow. We’re going to have to put -some effort into keeping the error under control – and then, when the -error inevitably grows beyond the permissible bound, we’ll need a -special technique ("bootstrapping") to refresh the cyphertext and start -anew. 
+some effort into keeping the error under control – +and the size of $q\/ r$ will determine how many operations +we can do before the error grows too big. diff --git a/easy/src/fhe3.typ b/easy/src/fhe3.typ index 96515f0..c62bb98 100644 --- a/easy/src/fhe3.typ +++ b/easy/src/fhe3.typ @@ -2,14 +2,20 @@ == The main idea: Approximate eigenvalues - -If you haven’t already, this might be a good time to go back and read -about the -#link("https://notes.0xparc.org/notes/learning-with-errors-exercise")[learning with errors] -problem and how you can use it to do -#link("https://hackmd.io/mQB8_nWPTm-Kyua7QgNLNw")[public-key cryptography];. - -You should at least understand the vague idea: We’re going pick some + +Now we want to turn the public-key encryption from @lwe-crypto +into a levelled FHE scheme. +In other words: +We want to be able to encrypt bits (0s and 1s) +and operate on them with AND and NOT gates. + +It might help you to imagine that, instead of AND and NOT, +the operations we want to perform are addition and multiplication. +If $x$ and $y$ are bits, then +NOT $x$ is just $1 - x$, and $x$ AND $y$ is just $x y$. +But it's easier to do algebra with $+$ and $*$. + +Recall the setup from @lwe-crypto: We’re going to pick some large integer $q$ (in practice $q$ could be anywhere from a few thousand to $2^1000$), and do "approximate linear algebra" modulo $q$. In other words, we’ll do linear algebra, where all our calculations are done @@ -17,11 +23,12 @@ modulo $q$ – but we’ll also allow the calculations to have a small "error" $epsilon.alt$, which will typically be much, much smaller than $q$. -Here’s the setup. Our #emph[secret key] will be a vector -\$\\mathbf{v} = (v\_1, \\ldots, v\_n) \\in (\\ZZ / q \\ZZ)^n\$ – a +Here’s the new idea. +Our #emph[secret key] will be a vector +$ upright(bold(v)) = (v_1, dots, v_n) in (ZZ \/ q ZZ)^n $ – a vector of length $n$, where the entries are integers modulo $q$. 
Suppose we want to encode a message $mu$ that’s just a single bit, let’s say -$mu in { 0 , 1 }$. Our cyphertext will be a square $n$-by$n$ matrix $C$ +$mu in { 0 , 1 }$. Our cyphertext will be a square $n$-by-$n$ matrix $C$ such that $ C upright(bold(v)) approx mu upright(bold(v)) . $ Now if we assume that $upright(bold(v))$ has at least one "big" entry (say $v_i$), then decryption is easy: Just compute the $i$-th entry of @@ -29,20 +36,21 @@ $C upright(bold(v))$, and determine whether it is closer to $0$ or to $v_i$. With a bit of effort, it’s possible to make this into a public-key -cryptosystem. The main idea is to release a -#link("https://hackmd.io/mQB8_nWPTm-Kyua7QgNLNw")[table] of vectors +cryptosystem. Just like in @lwe-crypto, +the main idea is to release a +table of vectors $upright(bold(x))$ such that -$upright(bold(x)) dot.op upright(bold(v)) approx 0$, and use that as a +$ upright(bold(x)) dot.op upright(bold(v)) approx 0, $ and use that as a public key. Given $mu$ and the public key, you can find a matrix $C_0$ -such that $C_0 upright(bold(v)) approx 0$ – then take -$C = C_0 + mu \* upright(I d)$, where $upright(I d)$ is the identity +such that $ C_0 upright(bold(v)) approx 0 $ – then take +$ C = C_0 + mu upright(I d) $, where $upright(I d)$ is the identity matrix. And $C_0$ can be built row-by-row… but we won’t get into the details here. Indeed homomorphic encryption is already interesting without the public-key feature. If you assume the person encrypting the data knows $upright(bold(v))$, it’s easy (linear algebra, again) to find $C$ such -that $C upright(bold(v)) approx mu upright(bold(v))$. +that $ C upright(bold(v)) approx mu upright(bold(v)). $ To make homomorphic encryption work, we need to explain how to operate on $mu$. We’ll describe three operations: addition, NOT, and @@ -97,10 +105,11 @@ value of $a_1$. Now suppose I give you the vector $ upright(bold(x)) = (9 , 0 , 0 , 0) . 
$ I ask you for another vector -$ "Flatten" (upright(bold(x))) = upright(bold(x)) prime , $ where -$upright(bold(x)) prime$ has to have the following two properties: \* -$upright(bold(x)) prime dot.op upright(bold(v)) = upright(bold(x)) dot.op upright(bold(v))$, -and \* All the entries of $upright(bold(x)) prime$ are either 0 or 1. +$ "Flatten"(upright(bold(x))) = upright(bold(x)) prime , $ where +$upright(bold(x)) prime$ has to have the following two properties: +- $upright(bold(x)) prime dot.op upright(bold(v)) = upright(bold(x)) dot.op upright(bold(v))$, + and +- All the entries of $upright(bold(x)) prime$ are either 0 or 1. And you have to find this vector $upright(bold(x)) prime$ without knowing $a_1$. @@ -119,9 +128,10 @@ safe to reduce it mod 11. Similarly, if you know $upright(bold(v))$ has the form $ upright(bold(v)) = (a_1 , 2 a_1 , 4 a_1 , dots.h , 2^k a_1 , a_2 , 2 a_2 , 4 a_2 , dots.h , 2^k a_2 , dots.h , a_r , 2 a_r , 4 a_r , dots.h , 2^k a_r) , $ and you are given some matrix $C$ with coefficients in -\$\\ZZ / q \\ZZ\$, then you can compute another matrix $"Flatten" (C)$ -such that: \* $"Flatten" (C) upright(bold(v)) = C upright(bold(v))$, and -\* All the entries of $"Flatten" (C)$ are either 0 or 1. +$ZZ \/ q ZZ$, then you can compute another matrix $"Flatten"(C)$ +such that: +- $"Flatten"(C) upright(bold(v)) = C upright(bold(v))$, and +- All the entries of $"Flatten"(C)$ are either 0 or 1. The $"Flatten"$ process is essentially the same binary-expansion process we used above to turn $upright(bold(x))$ into $upright(bold(x)) prime$, @@ -131,7 +141,7 @@ So now, using this $"Flatten"$ operation, we can insist that all of our cyphertexts $C$ are matrices with coefficients in ${ 0 , 1 }$. For example, to multiply two messages $mu_1$ and $mu_2$, we first multiply the corresponding cyphertexts, then flatten the resulting product: -$ "Flatten" (C_1 C_2) . $ +$ "Flatten"(C_1 C_2) . 
$ Of course, revealing that the secret key $upright(bold(v))$ has this special form will degrade security. This cryptosystem is as secure as an @@ -170,16 +180,27 @@ bounded by $(n + 1) B$. In summary: We can start with cyphertexts having a very small error (if you think carefully about this -#link("https://hackmd.io/mQB8_nWPTm-Kyua7QgNLNw")[protocol];, you will +protocol, you will see that the error is bounded by approximately $n log q$). Every addition operation will double the error bound; every multiplication -("and" gate) will multiply it by $(n + 1)$. And you can’t allow the +(AND gate) will multiply it by $(n + 1)$. And you can’t allow the error to exceed $q \/ 2$ – otherwise the message cannot be decrypted. So you can perform calculations of up to approximately $log_n q$ steps. (In fact, it’s a question of #emph[circuit depth];: you can start with many more than $log_n q$ input bits, but no bit can follow a path of length greater than $log_n q$ AND gates.) -This gives us a #emph[levelled] fully homomorphic encryption protocol. -Next we’ll see a trick called "bootstrapping," which lets us turn this -into FHE. +This gives us a #emph[levelled] fully homomorphic encryption protocol: +it lets us evaluate arbitrary circuits on encrypted data, +as long as those circuits have bounded depth. +If we need to evaluate a bigger circuit, we have two options. ++ Increase the value of $q$. + Of course, the cost of the computations increases with $q$. ++ Use some technique to "reset" the error + and start anew, as if with a freshly encrypted cyphertext. + + This approach is called "bootstrapping" and it incurs some hefty + computational costs. + But for very, very large circuits, it's the only viable option. + +Bootstrapping is beyond the scope of this book. 
diff --git a/easy/src/h-zksnark.typ b/easy/src/h-zksnark.typ new file mode 100644 index 0000000..8a0d0ef --- /dev/null +++ b/easy/src/h-zksnark.typ @@ -0,0 +1,29 @@ +#import "preamble.typ":* + +// copied from bigger project, incorporate into this one + +This part covers two constructions of the zkSNARK, +the *PLONK* and *Groth16* constructions. +Despite being fairly modern constructions, +these are arguably simpler and more informative to learn about than +the PCP construction that preceded them (which is covered in @pcp). + +The dependency chart of this part goes as follows: + +- @ec describes the discrete logarithm problem on an elliptic curve, + which provides a basis for everything afterwards. + +- @kzg and @ipa give two different *polynomial commitment schemes*, + which allow a prover Peggy to + + - commit to some polynomial $P(X) in FF_q [X]$ ahead of time, + - and then *open the commitment* at any input $z in FF_q$ while not revealing $P$ itself. + + The KZG scheme from @kzg is quite simple and elegant but requires a trusted setup. + In contrast, IPA from @ipa has fewer assumptions and is more versatile, + but it's slower and more complicated. + +- Regardless of which scheme (KZG or IPA) is used, + we then show two constructions of a zkSNARK. + In @plonk we construct PLONK; + in @groth16 we construct Groth16. diff --git a/easy/src/intro.typ b/easy/src/intro.typ new file mode 100644 index 0000000..211a06f --- /dev/null +++ b/easy/src/intro.typ @@ -0,0 +1,163 @@ +#import "preamble.typ":* + +// copied from bigger project, edit into this one + += Introduction + +== What is programmable cryptography? + +Cryptography is everywhere now and needs no introduction. +"Programmable cryptography" is a term coined by 0xPARC for a second generation +of cryptographic primitives that have arisen in the last 15 or so years. + +To be concrete, let's consider two examples of what protocols designed by +classical cryptography can achieve: + +- _Proofs_. 
An example of this is digital signature algorithms like RSA, + where Alice can do some protocol to prove to Bob that a message was sent by her. + A more complicated example might be a + #link("https://w.wiki/9fXW", "group signature scheme"), + allowing one member of a group to sign a message on behalf of a group. + +- _Hiding inputs_: for example, consider + #link("https://w.wiki/9fXQ", "Yao's millionaire problem"), + where Alice and Bob want to know which of them has more money + without revealing the actual amounts. + +Classically, first-generation cryptography relied on coming up with a protocol +for solving given problems or computing certain functions. +The goal of the second-generation "programmable cryptography" can +then be described as: + +#quote[ + We want to devise cryptographic primitives that could + be programmed to work on *arbitrary* problems and functions, + rather than designing protocols on a per-problem or per-function basis. +] + +To draw an analogy, it's sort of like going from older single-purpose hardware, +like a digital alarm clock or thermostat, +to having a general-purpose device like a smartphone which can +do any computation so long as someone writes code for it. + +The quote on the title page +("I have a message $M$ such that $op("sha")(M) = "0x91af3ac..."$") +is a concrete example; +the hash function SHA is a particular set of arbitrary instructions, +yet programmable cryptography promises that such a proof can be made +using a general compiler rather than inventing an algorithm specific to SHA256. + +#todo[Brian's image of an alarm clock and a computer chip] + +== Ideas in programmable cryptography + +These notes focus on the following specific topics. + +=== The zkSNARK: proofs of general problems + +The *zkSNARK*, first described in 2012, was the first type of primitive +that arguably falls under the "programmable cryptography" umbrella. 
+It provides a way to produce proofs of _arbitrary_ problem statements, +at least once they are encoded as a system of equations in a certain way. +The name stands for: + +- *Zero-knowledge*: a person reading the proof doesn't learn anything + about the solution besides that it's correct. +- *Succinct*: the proof length is short (actually constant length). +- *Non-interactive*: the protocol is not interactive. +- *Argument*: technically not a "proof," but we won't worry about the difference. +- *of Knowledge*: the proof doesn't just show the system of equations has a solution; + it also shows that the prover knows one. + +So, you can think of these as generalizing something like a group signature +scheme to authenticating any sort of transaction: + +- A normal signature scheme is a (zero-knowledge, succinct, non-interactive) + proof that "I know Alice's private key". +- A group signature scheme can be construed as a succinct proof that + "I know one of Alice, Bob, or Charlie's private keys". +- But you could also use a zkSNARK to prove a statement like + "I know a message $M$ such that $sha(M) = "0x91af3ac..."$", + of course without revealing $M$ or anything about $M$. +- ... Or really any arbitrarily complicated statement. + +#todo[gubsheep's slide had a funny example with emoji, link it] + +These notes focus on two constructions, PLONK (@plonk) and Groth16 (@groth16). + +=== Multi-party computation (MPC) + +In a *multi-party computation*, $n >= 2$ people want to +jointly compute some known function +$ F(x_1, ..., x_n) $ +where the $i$th person only knows the input $x_i$ +and does not learn the other inputs. + +For example, we saw earlier Yao's millionaire problem --- Alice and Bob +want to know who has a higher income without revealing the incomes themselves. +This is the case where $n=2$, $F$ outputs which input is larger, and $x_i$ is the $i$th person's income. 
+ +Multi-party computation makes a promise that we'll be able to do this +for _any_ function $F$ as long as we implement it in code. + +=== Fully homomorphic encryption (FHE) + +In *fully homomorphic encryption*, one person encrypts some data $x$, +and then anybody can perform arbitrary operations on the encrypted data $x$ +without being able to read $x$. + +For example, imagine you have some private text that you want to +translate into another language. +You encrypt the text and feed it to your favorite FHE machine translation server. +You decrypt the server's output and get the translation. +The server only ever sees encrypted text, +so the server learns nothing about the text you translated. + +== Where these fit together + +zkSNARKs, MPC, and FHE are just some of a huge zoo of cryptographic primitives, +from the elementary (public-key cryptography) +to the impossibly powerful (indistinguishability obfuscation). +There are protocols for zkSNARKs, MPC, and FHE; +they are very slow, but they can be implemented and used in practice. + +This whole field is an active area of research. +On the one hand: Can we make existing tools (zkSNARKs, etc.) more efficient? +For example, the cost of doing a computation in zero knowledge +is currently about $10^6$ times the cost of doing the computation directly. +Can we bring that number down? +On the other hand: What other cryptographic games can we play +to develop new sorts of programmable cryptography functionality? + +At 0xPARC, we see this as a door to a new world. +What sort of systems can we build on top of programmable cryptography? + +#todo[Import Brian's tree. Talk about reduction? Evan, take a look at the flavor text, idk if I like it - Aard] + +== What's all the fuss about zero-knowledge anyhow? + +#figure( + image("../figures/care-about.png", width:90%), + caption: [Expectations vs. reality.] 
+) + +#todo[Aard suggests deleting the figure, it's cute but Aard isn't sure about the message] + +When we think about how to use programmable cryptography, we need to be creative. +As an example, what can you do with a zkSNARK? + +One answer: You can prove that you have a solution to a system of equations. +Sounds pretty boring, unless you're an algebra student. + +Slightly better answer: You can prove that you have executed a program correctly, +revealing some or all of the inputs and outputs, as you please. +For example: You know a message $M$ such that +$op("sha")(M) = "0xa91af3ac..."$, but you don't want to reveal $M$. +Or: You only want to reveal the first 30 bytes of $M$. +Or: You know a message $M$, and a digital signature proving that $M$ was signed by +[trusted authority], such that a certain neural network, run on the input $M$, outputs "Good." + +One recent application along these lines is +#link("https://tlsnotary.org", "TLSNotary"). +TLSNotary lets you certify a transcript of communications with a server +in a privacy-preserving way: you only reveal the parts you want to. diff --git a/easy/src/lwe.typ b/easy/src/lwe.typ index 3741a58..e553910 100644 --- a/easy/src/lwe.typ +++ b/easy/src/lwe.typ @@ -132,18 +132,18 @@ will be to make vectors with many $0$’s in the same places. $ a_1 in { 10 , 2 , 6 , 9 }. $ + $ (10 , 4 , 4 , 3 lr(|1|) { 0 , - 1 }) + (7 , 7 , 7 , 8 lr(|5|) { 0 , - 1 }) = (6 , 0 , 0 , 0 lr(|6|) { 0 , - 1 , - 2 }), $ which is nice because it has 3 zeroes! This gives - $ a_1 in { 1 , 8 , 10 } $. Combining with (2), we conclude that - $ a_1 = 10 $. + $ a_1 in { 1 , 8 , 10 }. $ Combining with (2), we conclude that + $ a_1 = 10. $ + We can reuse $ (4 , 7 , 0 , 0) : 8. $ Since we knew from (2) that $ 4 a_1 + 7 a_2 in { 7 , 8 }, $ we can substitute $ a_1 = 10 $ to get - $ 7 a_2 in { 0 , 1 } $. This forces $ a_2 = 0 $ because of (1). + $ 7 a_2 in { 0 , 1 }. $ This forces $ a_2 = 0 $ because of (1). 
At this point, basically any isolation of the first two variables would force a contradiction. For example, we can compute $ (8 , 6 , 6 , 9 lr(|1|) { 0 , - 1 }) + (5 , 4 , 5 , 2 lr(|2|) { 0 , - 1 }) = (2 , 10 , 0 , 0 lr(|3|) { 0 , - 1 , - 2 }) . $ -Since $ 2 a_1 + 10 a_2 = 9$, but $3 + { 0 , - 1 , - 2 } = { 1 , 2 , 3 } $, +Since $ 2 a_1 + 10 a_2 = 9$, but $3 + { 0 , - 1 , - 2 } = { 1 , 2 , 3 }, $ we have a contradiction. === Blue Set @@ -185,4 +185,4 @@ be done in various ways, as any triple of equations will set up 3 This is enough to conclude that $a_1 = 10$ and $a_2 = 8$, giving the answer $(10 , 8 , 10 , 10)$. -] \ No newline at end of file +] diff --git a/easy/src/plonk.typ b/easy/src/plonk.typ index 32c10ca..df32b35 100644 --- a/easy/src/plonk.typ +++ b/easy/src/plonk.typ @@ -288,7 +288,7 @@ $ $ Then the accumulator $F_Q in FF_q[T]$ is defined analogously. -So to proev @permcheck-poly, the following algorithm works: +So to prove @permcheck-poly, the following algorithm works: #algorithm[Permutation-check][ Suppose Peggy has committed $Com(P)$ and $Com(Q)$. diff --git a/easy/src/zkintro.typ b/easy/src/zkintro.typ new file mode 100644 index 0000000..6468529 --- /dev/null +++ b/easy/src/zkintro.typ @@ -0,0 +1,17 @@ +#import "preamble.typ":* + += Introduction to zkSNARKs + +Peggy has done some very difficult calculation. +She wants to prove to Victor that she did it. +Victor wants to check that Peggy did it, but he +is too lazy to redo the whole calculation himself. + +- Maybe Peggy wants to keep part of the calculation secret. + Maybe her calculation was "find a solution to this puzzle," + and she wants to prove that she found a solution + without saying what the solution is. +- Maybe it's just a really long, annoying calculation, + and Victor doesn't have the energy to check it all line-by-line. + +