Merge branch 'main' of github.com:tideofwords/0xparc-intro-book

0xPARC · Jul 8, 2024 · e701337
2 parents 3f8ec5e + 084fe53
commit e701337
Showing 13 changed files with 114 additions and 78 deletions.
diff --git a/easy/main.typ b/easy/main.typ
@@ -5,7 +5,7 @@
 }
 #let part(s) = {
   pagebreak(weak: true)
-  set text(fill: rgb("#002299"))
+  //set text(fill: rgb("#002299"))
   heading(offset: 0, s)
 }
 
@@ -18,8 +18,9 @@
 
 #quote[
   I can now prove to you that I have a message $M$ such that
-  $op("SHA")(M) = "0xa91af3ac..."$, without revealing $M$.
-  But not just for SHA. I can do this for any function you want.
+  $sha(M) = "0xa91af3ac..."$, without revealing $M$.
+  But not just for the hash function sha. 
+  I can do this for any function you want.
 ]
 
 #toc

diff --git a/easy/src/2pc-takeaways.typ b/easy/src/2pc-takeaways.typ
@@ -7,13 +7,13 @@
   function over their respective secret inputs. We can think of this 
   as your prototypical _2PC_ (two-party computation).
 2. The main ingredient of a garbled circuit is _garbled gates_, 
-  which area gates whose functionality is hidden. This can be done e.g. 
+  which are gates whose functionality is hidden. This can be done 
   by Alice precomputing different outputs of the garbled circuit 
   based on all possible inputs of Bob, and then letting Bob pick one.
 3. Bob "picks an input" with the technique of _oblivious transfer (OT)_. 
   This can be built in various ways, including with commutative 
   encryption or public-key cryptography.
-4. More generally, this means in theory a group of people can 
+4. More generally, it is also possible for a group of people to
   compute whatever secret function they want, which is the field of 
   _multiparty computation (MPC)_.
 ]
diff --git a/easy/src/ec.typ b/easy/src/ec.typ
@@ -328,7 +328,7 @@ for the prime $p := 2^(255)-19$.
 Its order is $8$ times a large prime
 $ q' := 2^(252) + 27742317777372353535851937790883648493. $
 In that case, to generate a random point on Curve25519 with order $q'$,
-one will usually take a random point in it and multiply it by $8$.
+one will usually take a random point on the curve and multiply it by $8$.
 
 BN254 is also engineered to have a property called _pairing-friendly_,
 which is defined in @pairing-friendly when we need it later.
@@ -372,7 +372,7 @@ given her published public key $[d]$.
   1. Alice picks a random scalar $r in FF_q$ (keeping this secret)
     and publishes $[r] in E$.
   2. Alice generates a number $n in FF_q$ by hashing $msg$ with all public information,
-    say $ n := sha([r], msg, [d]). $
+    say $ n := hash([r], msg, [d]). $
   3. Alice publishes the integer $ s := (r + d n) mod q. $
 
   In other words, the signature is the ordered pair $([r], s)$.
@@ -394,7 +394,7 @@ The number $r$ is called a _blinding factor_ because
 its use prevents Bob from stealing Alice's secret key $d$ from the published $s$.
 It's therefore imperative that $r$ is neither known to Bob
 nor reused between signatures.
-One way to do this would be to pick $r = sha(d, msg)$; this has the
+One way to do this would be to pick $r = hash(d, msg)$; this has the
 bonus that it's deterministic as a function of the message and signer.
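
The signing steps above can be sketched in a few lines. This is a toy illustration only: it assumes a multiplicative group mod a small prime as a stand-in for the curve $E$ (so `bracket(x)` plays the role of $[x]$), a hypothetical `hash_to_scalar` built from SHA-256, and the standard check $[s] = [r] + n [d]$, which is not shown in this excerpt.

```python
import hashlib
import secrets

# Toy parameters (assumption, far too small for real use): P = 2Q + 1 with
# P and Q prime, and G = 4 generating the subgroup of order Q in (Z/P)^*.
P, Q, G = 2039, 1019, 4

def bracket(x: int) -> int:
    """Stand-in for [x]: here G^x mod P instead of a curve point."""
    return pow(G, x % Q, P)

def hash_to_scalar(*parts) -> int:
    """Hypothetical hash into F_q: SHA-256 of the parts, reduced mod Q."""
    data = "|".join(str(part) for part in parts).encode()
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % Q

def sign(d: int, msg: str):
    r = secrets.randbelow(Q - 1) + 1           # step 1: secret blinding factor
    R = bracket(r)                             # published [r]
    n = hash_to_scalar(R, msg, bracket(d))     # step 2: n := hash([r], msg, [d])
    s = (r + d * n) % Q                        # step 3: s := (r + d n) mod q
    return R, s

def verify(pub: int, msg: str, sig) -> bool:
    R, s = sig
    n = hash_to_scalar(R, msg, pub)
    # Additively: [s] = [r] + n[d]. In this multiplicative toy group the
    # same check reads G^s == R * pub^n (mod P).
    return bracket(s) == (R * pow(pub, n, P)) % P
```

For example, with `d = 123`, `verify(bracket(d), "hello", sign(d, "hello"))` holds, while changing `s` in the signature makes the check fail.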
 
 In @kzg we will use ideas quite similar to this to

diff --git a/easy/src/fhe-takeaways.typ b/easy/src/fhe-takeaways.typ
@@ -4,7 +4,8 @@
 
 #green[
 1. A _fully homomorphic encryption_ protocol allows Alice to delegate to Bob the computation of some function $f(x)$ in such a way that Bob learns nothing about $x$.
-2. The hard problem backing known FHE protocols is the _learning with errors (LWE)_ problem, which comes down to deciding if a system of "approximate equations" over $F_q$ are consistent.
-3. The main idea of this approach to FHEs is to use approximate eigenvalues as the encrypted computation and an "approximate eigenvector" as the secret key. Intuitively, adding and multiplying two matrices with different approximate eigenvalues for the same eigenvector approximately adds and multiplies the eigenvalues, respectively.
+2. The hard problem backing known FHE protocols is the _learning with errors (LWE)_ problem, which comes down to deciding if a system of "approximate equations" over $F_q$ is consistent.
+3. The main idea of this approach to FHEs is to use "approximate eigenvalues" as the encrypted computation and an "approximate eigenvector" as the secret key. 
+  Intuitively, adding and multiplying two matrices with different approximate eigenvalues for the same eigenvector approximately adds and multiplies the eigenvalues, respectively.
 4. To carefully do this, we actually need to control the error blowup with the _flatten_ operation. This creates a _leveled FHE_ protocol.
 ]
diff --git a/easy/src/fhe2.typ b/easy/src/fhe2.typ
@@ -147,8 +147,8 @@ computes $ upright(bold(x)) dot.op upright(bold(a)) = 4 . $ Plugging in
 $y = 1$, we see that $ 4 + epsilon.alt = 1 + m . $
 
 Now it’s a simple "rounding" problem. We know that $epsilon.alt$ is
-small and positive, so $1 + m$ is either $4$ or … a little more (In
-fact, it’s one of $4 , 5 , 6 , 7 , 8$.) On the other hand, since $m$ is
+small and positive, so $1 + m$ is either $4$ or … a little more. 
+(In fact, it’s one of $4 , 5 , 6 , 7 , 8$.) On the other hand, since $m$ is
 0 or 5, $1 + m$ had better be 1 or 6, so the only possibility is
 that $m = 5$ (so $1+m = 6$).
 

diff --git a/easy/src/fhe3.typ b/easy/src/fhe3.typ
@@ -161,7 +161,7 @@ bigger, say $n approx r log q$, to get the same level of security.
 Now let’s compute more carefully what happens to the error when we add,
 negate, and multiply bits. Suppose
 $ C_1 upright(bold(v)) = mu_1 upright(bold(v)) + epsilon.alt_1 , $ where
-$epsilon.alt_1$ is some vector with all its entries upper bounded by some
+$epsilon.alt_1$ is some vector with all its entries bounded by some
 $B$. (And similarly for $C_2$ and $mu_2$.)
 
 When we add two ciphertexts, the errors add:
@@ -207,4 +207,4 @@ If we need to evaluate a bigger circuit, we have two options:
 + Use some technique to "reset" the error
   and start anew, as if with a freshly encrypted ciphertext. This approach is called _bootstrapping_ and it incurs some hefty
   computational costs.
-  But for very, very large circuits, it's the only viable option. Bootstrapping is beyond the scope of this book.
+  But for large circuits, it's the only viable option. Bootstrapping is beyond the scope of this book.
diff --git a/easy/src/fs.typ b/easy/src/fs.typ
@@ -62,10 +62,10 @@ Fiat--Shamir turns it into the following noninteractive protocol.
   known to both Peggy and Victor.
 
   1. Peggy sends $Com(F)$ and $Com(H)$.
-  2. Peggy computes $lambda in FF_q$ by $lambda = sha(Com(F), Com(H))$.
+  2. Peggy computes $lambda in FF_q$ by $lambda = hash(Com(F), Com(H))$.
   3. Peggy opens both $Com(F)$ and $Com(H)$ at $lambda$.
   4. Victor verifies that
-    $lambda = sha(Com(F), Com(H))$ and $F(lambda) = Z(lambda) H(lambda)$.
+    $lambda = hash(Com(F), Com(H))$ and $F(lambda) = Z(lambda) H(lambda)$.
 ]
 
 We can apply the Fiat--Shamir heuristic to the full PLONK protocol.

diff --git a/easy/src/intro.typ b/easy/src/intro.typ
@@ -8,12 +8,21 @@
 
 Cryptography is so ubiquitous that it has become invisible:
 - _Encryption_ (hiding and then decoding messages) secures both people talking to each other over apps and computers talking to each other over protocols (like SSH).
-- _Digital signatures_ (signing a message with some data that anyone can verify must come from some specific identity) authenticates people's identity, so you know that the website you are going to is actually what it says it is.
-- _Key exchanges_ (allowing two parties to agree on a secret piece of data, even talking over an public channel) allows people to set up instructure remotely to do other cryptography, such as faster encryption algorithms.
-
-However, there is actually a lot more cryptography that have been implemented in academic and other smaller circles, such as #cite("https://w.wiki/9fXW", "group signature schemes") (more advanced versions of digital signatures supporting multiple participants) or  commitment schemes (general methods to commit to some secret that is to be revealed later in a way that prevents cheating).
-
-Even beyond this, there is cryptography that have been theoretically constructed but barely (or never) tried in practice, often with a ambitious sense of scale. Their spirit can be summarized as:
+- _Digital signatures_ 
+  (signing a message with some data that anyone can verify must come from some specific identity) 
+  authenticate people's identity, so you know that the website you are going to is actually what it says it is.
+- _Key exchanges_ (allowing two parties to agree on a secret piece of data, even when talking over a public channel) 
+  allow people to set up secure connections remotely, 
+  without having to meet in person to agree on a key.
+
+However, there is actually a lot more cryptography that has been implemented in academic and other smaller circles, 
+such as #cite("https://w.wiki/9fXW", "group signature schemes") 
+(more advanced versions of digital signatures supporting multiple participants) 
+and commitment schemes (general methods to commit to some secret that is to be revealed later in a way that prevents cheating).
+
+Even beyond this, there is cryptography that has been theoretically constructed 
+but barely (or never) tried in practice, often with an ambitious sense of scale. 
+Its spirit can be summarized as:
 
 #quote[
   We want cryptography that can
@@ -28,15 +37,15 @@ do any computation so long as someone writes code for it.
 
 #remark[
   The quote on the title page
-("I have a message $M$ such that $op("sha")(M) = "0x91af3ac..."$")
+("I have a message $M$ such that $sha(M) = "0x91af3ac..."$")
 is a concrete example.
-The hash function SHA is a particular set of arbitrary instructions,
+The hash function sha is a particular set of arbitrary instructions,
 yet programmable cryptography promises that such a proof can be made
-using a general compiler rather than inventing an algorithm specific to SHA256.
+using a general compiler rather than inventing an algorithm specific to SHA-256.
 ]
 
 This led 0xPARC to coin the term _programmable cryptography_ to differentiate 
-this "second generation" technology from "classical" cryptography that solve 
+this "second-generation" technology from "classical" cryptography that solves 
 specific problems and/or involves specific functions. 
 
 == Ideas in programmable cryptography
@@ -76,7 +85,7 @@ statements of the form:
 #quote[
   I know $X$ such that $F(X, Y) = Z$, where $Y,Z$ are public.
 ]
-once the statement is encoded as a system of equations. One such statement would be "I know $M$ such that $op("SHA256") (M) = Y$."
+once the statement is encoded as a system of equations. One such statement would be "I know $M$ such that $sha(M) = Y$."
 
 SNARKs are an active area of research, and many different SNARKs are known.
 Our work focuses on a particular example, PLONK (@plonk).
@@ -88,19 +97,24 @@ language. While many services today will do this, even for free, we can also
 imagine that you care about security a lot and you really don't want the 
 translating service to know anything about your text at all (e.g. selling the
 text to someone else, adding your text to large language models that can then
-be reverse-engineered to find your private information, blackmail you...).
+be reverse-engineered to find your private information, blackmailing you...).
 
 In _fully homomorphic encryption (FHE)_, one person encrypts some data $x$,
 and then a second person can perform arbitrary operations on the encrypted data
 $x$ without being able to read $x$. 
 
-With this technology, you have a solution to your problem! (and also much more, 
-such as a dating service who does not even know the names of people it provides 
-matchmaking to) You simply encrypt your text $Enc(x)$ and send it to your FHE machine translation server. The server will faithfully translate it into 
+With this technology, you have a solution to your problem!  
+You simply encrypt your text $Enc(x)$ and send it to your FHE machine translation server. 
+The server will faithfully translate it into 
 another language and give you $Enc(y)$, where $y$ is the translation of $x$. 
 You can then decrypt and obtain $y$, knowing that the server cannot extract 
 anything meaningful from $Enc(x)$ without your secret key.
 
+(You could imagine many more applications of FHE, 
+such as a dating service that does not even know the names of people it 
+provides 
+matchmaking to.)
+
 == From One Door to the Next
 
 Programmable cryptography has both a surprisingly high amount of theory but 
@@ -114,14 +128,6 @@ At least for the protocols we mention, they can be implemented, but usually at a
 cost of doing the computation directly). Can we bring that number down? What
 other cryptographic systems can be built on top of this technology?
 
-In the Labyrinth of Cryptography, behind us are a series of doors and rooms 
-that housed great Ideas in first-generation cryptography; we have
-explored, exploited, and mastered these Ideas for 
-many decades. After a specific door, however, the rooms in the Labyrinth
-suddenly now house Ideas at a much bigger scale, as if we stepped into a 
-completely different biome. In front of us, intrepid explorers have actually gone even further, into rooms that house even bigger behemoths of Ideas, such
-as witness encryption (WE) and indistinguishability obfuscation (IO). 
-
 It is easy to be carried away by the staggering possibilities and imagine a
 perfect "post-cryptographic" world where everyone has control over all their 
 data and everyone's security preferences are completely fulfilled. It is also 

diff --git a/easy/src/kzg.typ b/easy/src/kzg.typ
@@ -42,7 +42,8 @@ Then anyone in the world can use the resulting sequence for KZG commitments.
   this is a case of the discrete logarithm problem.
 
   You can make the protocol somewhat more secure by involving several different trusted parties.
-  The first party chooses a random $s_1$, computes $[s_1^0], ..., [s_1^M]$, and then discards s_1.
+  The first party chooses a random $s_1$, computes $[s_1^0], ..., [s_1^M]$, 
+  and then discards $s_1$.
   The second party chooses $s_2$ and computes
   $[(s_1 s_2)^0], ..., [(s_1 s_2)^M]$.
   And so forth.
@@ -167,12 +168,25 @@ To be fully explicit, here is the algorithm:
   Peggy can establish the value of $F$ at any point in $FF_q$.
   Peggy wants to convince Victor that $F$ vanishes on a given finite set $S subset.eq FF_q$.
 
-  1. Both parties compute the polynomial
+  1. If she has not already done so, Peggy sends to Victor
+    a commitment $Com(F)$ to $F$.#footnote[
+      In fact, it is enough for Peggy to have some way
+      to prove to Victor the values of $F$.
+
+      So for example, if $F$ is a product of two polynomials
+      $F = F_1 F_2$, 
+      and Peggy has already sent commitments to $F_1$ and $F_2$,
+      then there is no need for Peggy to commit to $F$.
+
+      Instead, in Step 5 below, Peggy opens $Com(F_1)$ and $Com(F_2)$ at $lambda$,
+      and that proves to Victor the value of $F(lambda) = F_1 (lambda) F_2 (lambda)$.
+    ]
+  2. Both parties compute the polynomial
     $ Z(X) := product_(z in S) (X-z) in FF_q [X]. $
-  2. Peggy does polynomial long division to compute $H(X) = F(X) / Z(X)$.
-  3. Peggy sends $Com(H)$.
-  4. Victor picks a random challenge $lambda in FF_q$
+  3. Peggy does polynomial long division to compute $H(X) = F(X) / Z(X)$.
+  4. Peggy sends $Com(H)$.
+  5. Victor picks a random challenge $lambda in FF_q$
     and asks Peggy to open $Com(H)$ at $lambda$,
     as well as the value of $F$ at $lambda$.
-  5. Victor verifies $F(lambda) = Z(lambda) H(lambda)$.
+  6. Victor verifies $F(lambda) = Z(lambda) H(lambda)$.
 ] <root-check>
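
The arithmetic behind this root check can be sketched as follows. The sketch makes several assumptions: a toy field $FF_q$ with $q = 101$, polynomials stored as coefficient lists (lowest degree first), and no commitments at all, so Victor simply evaluates the polynomials himself where the real protocol would have Peggy open $Com(F)$ and $Com(H)$. All function names are hypothetical.

```python
import secrets

Q = 101  # toy prime field F_q (assumption; real protocols use much larger primes)

def poly_eval(coeffs, x):
    """Evaluate a polynomial (lowest-degree coefficient first) at x, mod Q."""
    acc = 0
    for c in reversed(coeffs):
        acc = (acc * x + c) % Q
    return acc

def poly_mul(a, b):
    """Multiply two polynomials mod Q."""
    out = [0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] = (out[i + j] + ai * bj) % Q
    return out

def vanishing_poly(points):
    """Z(X) = product over z in S of (X - z)."""
    z = [1]
    for pt in points:
        z = poly_mul(z, [(-pt) % Q, 1])
    return z

def poly_divmod(num, den):
    """Long division by a monic polynomial; returns (quotient, remainder)."""
    num = num[:]
    quot = [0] * max(len(num) - len(den) + 1, 1)
    for i in range(len(num) - len(den), -1, -1):
        coef = num[i + len(den) - 1]
        quot[i] = coef
        for j, dj in enumerate(den):
            num[i + j] = (num[i + j] - coef * dj) % Q
    return quot, num[: len(den) - 1]

S = [2, 5, 7]
Z = vanishing_poly(S)                 # both parties compute Z
F = poly_mul(Z, [3, 0, 1])            # F = Z * (X^2 + 3), so F vanishes on S
H, rem = poly_divmod(F, Z)            # Peggy's long division; rem is all zeros
lam = secrets.randbelow(Q)            # Victor's random challenge
check = poly_eval(F, lam) == poly_eval(Z, lam) * poly_eval(H, lam) % Q
```

If $F$ did not vanish on $S$, the division would leave a nonzero remainder, and no polynomial $H$ would satisfy $F = Z H$ identically. A dishonest Peggy would then have to hope that Victor's random $lambda$ happens to hit a root of $F - Z H$.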
diff --git a/easy/src/mpc.typ b/easy/src/mpc.typ
@@ -12,7 +12,7 @@ what could be learned by knowing both $a$ and $f (a , b)$), and likewise
 for Bob.
 
 Yao’s Garbled Circuits is one of the most well-known 2PC protocols
-(Vitalik has a great explanation on his
+(Vitalik Buterin has a great explanation on his
 #cite("https://vitalik.eth.limo/general/2020/03/21/garbled.html")[blog];).
 The protocol is quite clever, and optimized variants of the protocol are
 being
@@ -98,7 +98,7 @@ what you think of
 when you think of plain-vanilla encryption:
 You use a secret key $K$ to encrypt a message $m$,
 and then you use the same secret key $K$ to decrypt it.]
-encryption scheme#footnote[We'll talk later about what sort of encryption scheme is suitable for this...]
+encryption scheme
 $Enc$ and publish the following table:
 
 #table(
@@ -169,10 +169,10 @@ We'll need to make two changes to the protocol.
   so the outputs will be (the passwords encoding) 0, 0, 0, 1.
   #table(
   columns: 2,
-  [$sha(P_0^(text("left")), P_0^(text("right")))$], [$Enc_(P_0^(text("left")), P_0^(text("right"))) (P_0^(text("out")))$],
-  [$sha(P_0^(text("left")), P_1^(text("right")))$], [$Enc_(P_0^(text("left")), P_1^(text("right"))) (P_0^(text("out")))$],
-  [$sha(P_1^(text("left")), P_0^(text("right")))$], [$Enc_(P_1^(text("left")), P_0^(text("right"))) (P_0^(text("out")))$],
-  [$sha(P_1^(text("left")), P_1^(text("right")))$], [$Enc_(P_1^(text("left")), P_1^(text("right"))) (P_1^(text("out")))$],
+  [$hash(P_0^(text("left")), P_0^(text("right")))$], [$Enc_(P_0^(text("left")), P_0^(text("right"))) (P_0^(text("out")))$],
+  [$hash(P_0^(text("left")), P_1^(text("right")))$], [$Enc_(P_0^(text("left")), P_1^(text("right"))) (P_0^(text("out")))$],
+  [$hash(P_1^(text("left")), P_0^(text("right")))$], [$Enc_(P_1^(text("left")), P_0^(text("right"))) (P_0^(text("out")))$],
+  [$hash(P_1^(text("left")), P_1^(text("right")))$], [$Enc_(P_1^(text("left")), P_1^(text("right"))) (P_1^(text("out")))$],
 )
 
 == How Bob uses one gate
@@ -189,11 +189,11 @@ Let's play through one round of Bob's gate-using protocol.
 2. Bob takes the two passwords, concatenates them, and computes a hash.
   Now Bob has
   $
-    sha(P_0^(text("left")), P_1^(text("right"))).
+    hash(P_0^(text("left")), P_1^(text("right"))).
   $
 
 3. Bob finds the row of the table indexed by
-  $sha(P_0^(text("left")), P_1^(text("right")))$,
+  $hash(P_0^(text("left")), P_1^(text("right")))$,
   and he uses it to look up
   $
     Enc_(P_0^(text("left")), P_1^(text("right"))) (P_0^(text("out"))).
@@ -204,7 +204,7 @@ Let's play through one round of Bob's gate-using protocol.
   to decrypt
   $P_0^(text("out")).$
 
-5. Now Bob has the password for the bit 0, to feed into the next gate --
+5. Now Bob has the password for the bit 0 to feed into the next gate --
   but he doesn't know his bit is 0.
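
The round above can be sketched for a single AND gate. This is a minimal illustration under stated assumptions: passwords are random 16-byte strings, $hash$ is SHA-256, and $Enc$ is a toy cipher that XORs the plaintext with a hash-derived pad. The helper names and the `b"index"` / `b"key"` tags are hypothetical and not taken from the book.

```python
import hashlib
import secrets

def h(tag: bytes, left: bytes, right: bytes) -> bytes:
    """SHA-256 of the two passwords, with a tag to separate different uses."""
    return hashlib.sha256(tag + left + right).digest()

def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

def enc(left: bytes, right: bytes, plaintext: bytes) -> bytes:
    """Toy Enc_{left,right}: XOR with a pad derived from both passwords."""
    return xor(plaintext, h(b"key", left, right)[: len(plaintext)])

dec = enc  # for an XOR pad, decryption is the same operation

def garble_and_gate():
    """Alice: pick a password per wire and bit, then build the 4-row table."""
    pw = {wire: [secrets.token_bytes(16), secrets.token_bytes(16)]
          for wire in ("left", "right", "out")}
    table = {}
    for a in (0, 1):
        for b in (0, 1):
            l, r = pw["left"][a], pw["right"][b]
            # Row index: hash of the input passwords.
            # Row value: the encrypted password of the output bit a AND b.
            table[h(b"index", l, r)] = enc(l, r, pw["out"][a & b])
    return pw, table

def evaluate(table, left_pw: bytes, right_pw: bytes) -> bytes:
    """Bob: find the row by hash, then decrypt the output password."""
    row = table[h(b"index", left_pw, right_pw)]
    return dec(left_pw, right_pw, row)

pw, table = garble_and_gate()
out = evaluate(table, pw["left"][0], pw["right"][1])  # Bob holds P_0^left, P_1^right
```

Here `out` equals `pw["out"][0]`, the password for the bit 0, yet nothing Bob sees tells him which bit it encodes. The row index and the encryption pad use differently tagged hashes so that the published index does not double as the decryption key.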
 
 So Bob is exactly where he started:

diff --git a/easy/src/ot.typ b/easy/src/ot.typ
@@ -61,7 +61,7 @@ because he doesn't know the keys.
 No problem!
 Bob just picks out the $i$-th ciphertext $Enc_a (x_i)$,
 adds his own layer of encryption onto it,
-and sends the resulting doubly-encoded message back to Alice:
+and sends the resulting doubly-encrypted message back to Alice:
 $
   Enc_b (Enc_a (x_i)).
 $
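
The double encryption can be tried out with a toy commutative cipher. The sketch below assumes exponentiation modulo a prime as the encryption scheme, which is one way to get $Enc_a (Enc_b (x)) = Enc_b (Enc_a (x))$. It also assumes the exchange continues in the usual way, with Alice stripping her layer and Bob stripping his. Neither detail is shown in this excerpt, and the parameters are illustrative only.

```python
import math
import secrets

P = 2**127 - 1  # a Mersenne prime used as a toy modulus (assumption)

def keygen() -> int:
    """Pick an exponent invertible mod P - 1, so that decryption exists."""
    while True:
        k = secrets.randbelow(P - 3) + 2
        if math.gcd(k, P - 1) == 1:
            return k

def enc(k: int, m: int) -> int:
    """Toy commutative encryption: m^k mod P, for 1 <= m < P."""
    return pow(m, k, P)

def dec(k: int, c: int) -> int:
    """Undo enc by raising to the inverse exponent mod P - 1."""
    return pow(c, pow(k, -1, P - 1), P)

a, b = keygen(), keygen()          # Alice's and Bob's secret keys
xs = [1111, 2222, 3333]            # Alice's messages
sent = [enc(a, x) for x in xs]     # Alice -> Bob: Enc_a(x_1), ..., Enc_a(x_n)
i = 1                              # Bob's secret choice
double = enc(b, sent[i])           # Bob -> Alice: Enc_b(Enc_a(x_i))
single = dec(a, double)            # Alice strips her layer, leaving Enc_b(x_i)
recovered = dec(b, single)         # Bob strips his layer and reads x_i
```

Because the two layers commute, Alice can remove her encryption without learning which ciphertext Bob picked, and Bob ends up with `recovered == xs[i]` and nothing else.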