FHE stuff draft

0xPARC · Jun 10, 2024 · b167fad
1 parent d8b2b33
commit b167fad

Showing 6 changed files with 613 additions and 0 deletions.
diff --git a/easy/main.typ b/easy/main.typ
@@ -33,3 +33,10 @@
 #chapter("src/kzg.typ")
 #chapter("src/plonk.typ")
 
+#part[Levelled fully homomorphic encryption]
+#chapter("src/fhe0.typ")
+#chapter("src/lwe.typ")
+#chapter("src/fhe2.typ")
+#chapter("src/fhe3.typ")
+
+
diff --git a/easy/src/fhe0.typ b/easy/src/fhe0.typ
@@ -0,0 +1,59 @@
+#import "preamble.typ":*
+
+= Introduction to fully homomorphic encryption
+<fhe-intro>
+
+Fully homomorphic encryption (FHE) lets you encrypt a message, and then
+other people can perform arbitrary operations on the encrypted message
+without being able to read the message.
+
+Levelled FHE is a sort of weaker version of FHE. Like FHE, levelled FHE
+lets you perform operations on encrypted data. But unlike FHE, there
+will be a limit on the number of operations you can perform before the
+data must be decrypted.
+
+Loosely speaking, the encryption procedure will involve some sort of
+"noise" or "error." As long as the error is not too big, the message can
+be decoded without trouble. But each operation on the encrypted data
+will cause the error to grow – and if it grows beyond some maximum error
+tolerance, the message will be lost. So there is a limit on how many
+operations you can do before the error gets too big.
+
+As a sort of silly example, imagine your message is a whole number
+between 0 and 10 (so it’s one of 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10), and
+your "encryption" scheme encodes the message as a real number that is
+very close to the message. So if the cyphertext is 1.999832, well then
+that means the original message was 2. The decryption procedure is
+"round to the nearest integer."
+
+(You might be thinking: This is some pretty terrible cryptography,
+because the message isn’t secure. Anyone can figure out how to round a
+number, no secret key required. Yep, you’re right. The actual encryption
+#link("https://hackmd.io/mQB8_nWPTm-Kyua7QgNLNw")[scheme] is more
+complicated. But it still has this "rounding-off-errors" feature, and
+that’s what I want to focus on right now.)
+
+Now imagine that the "operations" you want to perform are addition. (If
+you like, imagine doing the addition modulo 11, so if a number gets too
+big, it "wraps around.") Well, every time you add two encrypted numbers
+($1.999832 + 2.999701 = 4.999533$), the errors add as well. After too
+many operations, the error will exceed $0.5$, and the rounding procedure
+won’t give the right answer anymore.
+
+But as long as you’re careful not to go over the error limit, you can
+add cyphertexts with confidence.
+
+In fact, for our levelled FHE protocol, our message will be a bit,
+either 0 or 1,
+and our operations will be the logic gates AND and NOT.
+Any logic circuit can be built out of AND and NOT gates,
+so we'll be able to perform arbitrary calculations
+within the FHE encryption.
+
+Our protocol uses a cryptosystem built 
+from a problem called "learning with errors."
+"Learning with errors" is kind of a strange name;
+I'd call it "approximate linear algebra modulo $q$."
+Anyway, we'll start with the learning-with-errors problem
+(@lwe) and how to build cryptography on top of it (@lwe-crypto)
+before we get back to levelled FHE.
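
To make the rounding example from fhe0.typ concrete, here is a minimal Python sketch of the toy scheme (the insecure "round to the nearest integer" one, not the real cryptosystem). The noise bound `MAX_NOISE = 0.01` and the helper names `encode`, `add` and `decode` are assumptions made for illustration; the text only says the cyphertext is "very close" to the message.

```python
import random

MODULUS = 11      # messages are whole numbers 0..10, added modulo 11
MAX_NOISE = 0.01  # assumed bound on the error of one fresh cyphertext


def encode(message, rng):
    """Toy 'encryption': the message plus a small real-valued error."""
    return message + rng.uniform(-MAX_NOISE, MAX_NOISE)


def add(c1, c2):
    """Add two cyphertexts; their errors add as well."""
    return c1 + c2


def decode(cyphertext):
    """Toy 'decryption': round to the nearest integer, then wrap modulo 11."""
    return round(cyphertext) % MODULUS


# The sum from the text: 1.999832 + 2.999701 = 4.999533, which rounds to 5.
decode(add(1.999832, 2.999701))  # → 5
```

Under these assumptions a sum of k fresh cyphertexts has error at most k times 0.01, so rounding is guaranteed to work only while that product stays below 0.5. That limit on k is the "level" in this toy picture.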
diff --git a/easy/src/fhe2.typ b/easy/src/fhe2.typ
@@ -0,0 +1,172 @@
+#import "preamble.typ":*
+
+= Public-Key Cryptography from Learning with Errors
+<lwe-crypto>
+The
+learning with errors
+problem (@lwe) is one of those "hard problems that you can build cryptography
+on." The problem is to solve for constants
+$ a_1, dots, a_n in ZZ / (q ZZ), $ given a bunch of
+#emph[approximate] equations of the form
+$ a_1 x_1 + dots.h + a_n x_n + epsilon.alt = y , $ where each
+$epsilon.alt$ is a "small" error (in the example from @lwe, $epsilon.alt$
+is either 0 or 1).
+
+In @lwe
+we saw how even a small case of this problem ($q = 11$, $n = 4$) can be
+annoyingly tricky. In the real world, you should imagine that $n$ and
+$q$ are much bigger – maybe $n$ is in the range
+$100 lt.eq n lt.eq 1000$, and $q$ could be anywhere from $n^2$ to
+$2^(sqrt(n))$, say.
+
+Now let’s see how to turn this into a public-key cryptosystem. We’ll use
+the same numbers from the "blue set" in @lwe. In fact, that "blue
+set" will be exactly the public key.
+
+#figure(
+  align(center)[#table(
+    columns: 1,
+    align: (auto,),
+    table.header([Public Key],),
+    table.hline(),
+    [(1, 0, 1, 7) : 2],
+    [(5, 8, 4, 10) : 9],
+    [(7, 7, 8, 5) : 3],
+    [(5, 1, 10, 6) : 3],
+    [(8, 0, 2, 4) : 1],
+    [(9, 3, 0, 6) : 9],
+    [(0, 6, 1, 6) : 9],
+    [(0, 4, 9, 7) : 5],
+    [(10, 7, 4, 10) : 10],
+    [(5, 5, 10, 6) : 8],
+    [(10, 7, 3, 1) : 9],
+    [(0, 2, 5, 5) : 6],
+    [(9, 10, 2, 1) : 2],
+    [(3, 7, 2, 1) : 5],
+    [(2, 3, 4, 5) : 3],
+    [(2, 1, 6, 9) : 3],
+  )]
+  , kind: table
+  )
+
+The private key is simply the vector $upright(bold(a))$.
+
+#figure(
+  align(center)[#table(
+    columns: 1,
+    align: (auto,),
+    table.header([Private Key],),
+    table.hline(),
+    [$upright(bold(a))$ = (10, 8, 10, 10)],
+  )]
+  , kind: table
+  )
+
+== How to encrypt $mu$?
+<how-to-encrypt-mu>
+Suppose you have a message $mu in { 0 , 5 }$. (You’ll see in a moment why
+we insist that $mu$ is one of these two values.) The cyphertext
+encrypting $mu$ will be a pair $(upright(bold(x)) : y)$, where $upright(bold(x))$ is a
+vector, $y$ is a scalar, and
+$upright(bold(x)) dot.op upright(bold(a)) + epsilon.alt = y + mu$ (mod $q$), where
+$epsilon.alt$ is "small".
+
+How to do the encryption? If you’re trying to encrypt, you only have
+access to the public key – that list of pairs $(upright(bold(x)) : y)$
+above. You want to make up your own $upright(bold(x))$, for which you
+know approximately the value $upright(bold(x)) dot.op upright(bold(a))$.
+You could just take one of the vectors $upright(bold(x))$ from the
+table, but that wouldn’t be very secure: if I see your cyphertext, I can
+find that $upright(bold(x))$ in the table and use it to decrypt $mu$.
+
+Instead, you are going to combine several rows of the table to get your
+vector $upright(bold(x))$. Now you have to be careful: when you combine
+rows of the table, the errors add up. We’re guaranteed that each
+row of the table has $epsilon.alt$ either $0$ or $1$, so if you add at
+most $4$ rows, the total $epsilon.alt$ is between $0$ and $4$. The two
+possible messages $0$ and $5$ are at least $5$ apart modulo $q = 11$, so an
+error in that range still determines $mu$ uniquely (an error of $5$ would not).
+
+So, here’s the method. You choose at random 4 (or fewer) rows of the
+table, and add them up to get a pair $(upright(bold(x)) : y_0)$ with
+$upright(bold(x)) dot.op upright(bold(a)) approx y_0$. Then you take
+$y = y_0 - mu$ (mod $q = 11$ of course), and send the cyphertext
+$(upright(bold(x)) : y)$.
+
+== An example
+<an-example>
+Let’s say you randomly choose the first 4 rows:
+
+#figure(
+  align(center)[#table(
+    columns: 1,
+    align: (auto,),
+    table.header([Some rows of public key],),
+    table.hline(),
+    [(1, 0, 1, 7) : 2],
+    [(5, 8, 4, 10) : 9],
+    [(7, 7, 8, 5) : 3],
+    [(5, 1, 10, 6) : 3],
+  )]
+  , kind: table
+  )
+
+Now you add them up (mod 11) to get the following pair:
+$ (upright(bold(x)) : y_0) = ((7, 5, 1, 6) : 6) . $
+
+Finally, let’s say your message is $mu = 5$. So you set
+$y = y_0 - mu = 6 - 5 = 1$, and send the cyphertext
+$ (upright(bold(x)) : y) = ((7, 5, 1, 6) : 1) . $
+
+== Decryption
+<decryption>
+Decryption is easy! The decryptor knows
+$ upright(bold(x)) dot.op upright(bold(a)) + epsilon.alt = y + mu $
+where $0 lt.eq epsilon.alt lt.eq 4$.
+
+Plugging in $upright(bold(x))$ and $upright(bold(a))$, the decryptor
+computes $ upright(bold(x)) dot.op upright(bold(a)) = 4 . $ Plugging in
+$y = 1$, we see that $ 4 + epsilon.alt = 1 + mu . $
+
+Now it’s a simple "rounding" problem. We know that $epsilon.alt$ is
+small and nonnegative, so $1 + mu$ is either $4$ or … a little more. (In
+fact, it’s one of $4 , 5 , 6 , 7 , 8$.) On the other hand, since $mu$ is
+0 or 5, well, $1 + mu$ had better be 1 or 6… so the only possibility is
+that $1 + mu = 6$, and $mu = 5$.
+
+== How does this work in general?
+<how-does-this-work-in-general>
+In practice, $n$ and $q$ are often much larger. Maybe $n$ is in the
+hundreds, and $q$ could be anywhere from "a little bigger than $n$" to
+"almost exponentially large in $n$," say $q = 2^(sqrt(n))$. In fact, to
+do FHE, we’re going to want to take $q$ pretty big, so you should
+imagine that $q approx 2^(sqrt(n))$.
+
+For security, the encryption algorithm shouldn’t just add up 3 or 4
+rows of the public key. In fact we want the encryption algorithm to add
+at least $log (q^n) = n log q$ rows – to be safe, maybe make that number
+a little bigger, say $m = 2 n log q$. Of course, for this to work, the
+public key has to have at least $m$ rows.
+
+So in practice, the public key will have $m = 2 n log q$ rows, and the
+encryption algorithm will be "select some subset of the rows at random,
+and add them up".
+
+Of course, combining up to $m$ rows can multiply the
+error by as much as $m$ – so if the initial $epsilon.alt$ was bounded by $1$, then
+the error in the cyphertext will be at most $m$. But remember that $q$
+is far larger than any polynomial in $m$ and $n$ anyway, so a mere factor
+of $m$ isn’t going to scare us!
+
+Now we could insist that the message is just a single bit – either $0$
+or $⌊q / 2⌋$. Or we could allow the message to be any multiple of some
+constant $r$, where $r$ is bigger than the error bound (right now that’s
+$m$) – which allows you to encode a message space of size $q \/ r$
+rather than just a single bit.
+
+When we do FHE, we’re going to apply many operations to a cyphertext,
+and each is going to cause the error to grow. We’re going to have to put
+some effort into keeping the error under control – and then, when the
+error inevitably grows beyond the permissible bound, we’ll need a
+special technique ("bootstrapping") to refresh the cyphertext and start
+anew.
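
The encryption and decryption steps in fhe2.typ can be sketched in Python roughly as follows. This is only an illustration with the toy parameters from the text ($q = 11$, messages in {0, 5}, at most 4 rows added), using the convention x·a + ε = y + μ from the encryption section. The function names are made up for this sketch, and `keygen` is a hypothetical helper that generates a fresh toy key pair rather than reusing the table from the chapter.

```python
import random

Q = 11  # toy modulus from the text


def dot(x, a, q=Q):
    """Dot product modulo q."""
    return sum(xi * ai for xi, ai in zip(x, a)) % q


def keygen(n, rows, rng, q=Q):
    """Hypothetical toy key generator: every public row (x : y)
    satisfies x.a + eps = y (mod q) with eps in {0, 1}."""
    a = [rng.randrange(q) for _ in range(n)]
    public = []
    for _ in range(rows):
        x = [rng.randrange(q) for _ in range(n)]
        eps = rng.randrange(2)
        public.append((x, (dot(x, a, q) + eps) % q))
    return a, public


def encrypt(mu, public, rng, max_rows=4, q=Q):
    """Add up between 1 and max_rows distinct random rows to get
    (x : y0), then output (x : y0 - mu)."""
    chosen = rng.sample(public, rng.randint(1, max_rows))
    n = len(chosen[0][0])
    x = [sum(row[0][i] for row in chosen) % q for i in range(n)]
    y0 = sum(row[1] for row in chosen) % q
    return x, (y0 - mu) % q


def decrypt(cyphertext, a, max_error=4, q=Q):
    """Find the mu in {0, 5} for which eps = y + mu - x.a (mod q)
    lies in the allowed range 0..max_error."""
    x, y = cyphertext
    for mu in (0, 5):
        eps = (y + mu - dot(x, a, q)) % q
        if eps <= max_error:
            return mu
    raise ValueError("error too large to decrypt")


# The worked example from the text: x.a = 4 and y = 1, so eps = 2 and mu = 5.
decrypt(([7, 5, 1, 6], 1), [10, 8, 10, 10])  # → 5
```

Trying both candidate messages is just the "rounding" step written out: for the wrong candidate the implied error lands between 5 and 10 modulo 11, outside the allowed range.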