Skip to content
Patrick Jattke edited this page Aug 27, 2020 · 13 revisions

Task

The cardio benchmark is based on a medical diagnostic algorithm that was presented by Carpov et al. in Practical privacy-preserving medical diagnosis using homomorphic encryption as demonstration for a cloud-based and privacy-preserving cardiac risk factor assessment service. The algorithm is used by Cingulata as an example/test application.

The cardio program has in total 7 encrypted inputs. The inputs of the program are:

  • sex (1 bit): female (false) or male (true)
  • antecedents (1 bit): true if there was a cardiology disease in the family history, otherwise false
  • smoker (1 bit): no (false) or yes (true)
  • diabetes (1 bit): no (false) or yes (true)
  • high_blood_pressure (1 bit): high blood pressure (true) or normal blood pressure (false)
  • age (8 bit): the age in years
  • hdl_cholesterol (8 bit): the HDL cholesterol level
  • height (8 bit): height in centimeters
  • weight (8 bit): weight in kilograms
  • phy_activity (8 bit): physical activity in minutes per day
  • drinking_habits (8 bit): number of glasses of alcoholic beverages per day

The cardio program uses trans-ciphering that allows to switch from some data encrypted under a classic overhead-free symmetric cryptosystem to the same data encrypted under FHE, without exposing the data in clear text at any time. For that, the program uses the stream cipher Kreyvium, a 128-bit version of the Trivium algorithm that allows efficient FHE execution by only involving XOR operations.

The inputs are encrypted on the client-side using a key-stream generated by the Kreyvium cipher with the private key SYMSK and an initialization value IV. In our benchmark program, we use a precomputed keystream (like in the Cingulata test program). The encryption is a simple XOR operation and as such can be performed efficiently on a device with low resources. On the server-side, the Kreyvium algorithm is executed homomorphically using the FHE-encrypted private key [SYMSK]FHE with the same IV to ensure that client and server use the same key-stream. This process is called key-stream mining and takes places in advance to minimize latency. After receiving the data from the client and homomorphically executing the Kreyvium algorithm on the server, which basically consists of simple XOR operations, the data is encrypted under FHE and can be processed further. This transciphering procedure is depicted in Figure 1.

Figure 1: The computation workflow of the cardio program using transciphering and FHE. Image taken from Carpov et al.

The main part is the homomorphic execution of the cardiac risk factor assessment algorithm. This consists of a set of rules that are given in Listing 1. These rules compute a score between 0 and 9 where a larger value means a higher risk for a cardiology disease.

+1  if man and age > 50 years
+1  if cardiology disease in family history
+1  if smoking
+1  if diabetic
+1  if high blood pressure
+1  if HDL cholesterol less than 40
+1  if weight > height-90
+1  if daily physical activity less than 30 minutes
+1  if man and alcohol consumption more than 3 glasses/day
+1  if woman and alcohol consumption more than 2 glasses/day

Listing 1: Rules of the cardiac risk factor assessment algorithm.

Implementation in Tools

Cingulata / Lobster / Multi-Start

We use Cingulata to run the Cardio circuit that is created by Cingulata and its ABC-optimized variant. Further, we run the optimized circuit generated by Lobster and the optimized circuit generated by the Multi-Start heuristic. As we do not have access to the Multi-Start code base, we use the optimized circuit that is generated in the Lobster codebase as part of Lobster's evaluation.

Summarized, our benchmark considers following configurations:

Folder Name Description Num. Circuits
cardio-cingulata The Cardio circuit generated by Cingulata and the ABC-optimized circuit therefrom. Both use the original Cingulata parameters. 2
cardio-lobster-baseline The Cardio circuit used by Lobster and the ABC-optimized circuit therefrom. Both use the original Cingulata parameters. 2
cardio-lobster The Lobster-optimized circuit with the original Cingulata parameters. 1
cardio-lobster-optimal-params The Lobster-optimized circuit with custom parameters determined by CinguParam. 1
cardio-multistart The Multi-Start-optimized circuit with the original Cingulata parameters. 1
cardio-multistart-optimal-params The Multi-Start-optimized circuit with custom parameters determined by CinguParam. 1

Cingulata

The implementation in Cingulata uses an illegal transformation of the initial rule that is described in the paper (Practical privacy-preserving medical diagnosis using homomorphic encryption):

Original rule:
+1 if weight > height-90

--— is not equal to ----

Cingulata's Cardio implementation:
+1 if height+10 < weight

Also, the addition (+10) did not work properly as 10 must be declared as a CiInt to make use of the overloaded addition. The error was fixed by changing the rule to +1 if height < (weight+90) and declaring 90 as CiInt. This change is also considered in the implementation of cardio in all other tools.

Cingulata encodes the inputs as 7*8 bit = 56 bits. The 1-bit flags (sex, antecedents, smoker, diabetes, and high_blood_pressure) are encoded in a single 8-bit ciphertext.

At the end of the test run, a file fhe_parameters.txt is created containing the FHE parameters that looks as follow:

== FHE parameters ====
n: 16384
q: {40,50,56,60,60,60} (326 bit)
T: 2

The parameters are extracted from the file fhe_params.xml, created by CinguParam, and is also included in our SoK repository. The fhe_parameters.txt file is uploaded to the respective folder in the S3 bucket for documentation purposes.

Lobster Baseline

For comparison, we executed the original Lobster circuit and the Multi-Start circuit using the Lobster code base.

The following results show the Lobster benchmarks executed on the same AWS machine as our benchmarks use (m5n.large):

root@dfcf63f5b942:~/Lobster# ./gen_table_rewriting.sh

Optimization logs will be saved in paper_result directory

  Name      Old Depth            Carpov.el.al                       Lobster
                           New Depth      Opt Time(s)      New Depth      Opt Time(s)
cardio.eqn         10              9              20s              8              32s
dsort.eqn           9              8           1m 25s              8           2m 18s
hd01.eqn            6              6               2s              6              22s
hd02.eqn            6              6               3s              6              33s
hd03.eqn            5              5               1s              5               5s
hd04.eqn           10              9               3s              8              16s
hd05.eqn            7              7               9s              7           3m  5s
hd06.eqn            7              7               9s              7           4m 35s
hd07.eqn            5              5               0s              3               4s
hd08.eqn            6              5               1s              5               3s
hd09.eqn           14             12               6s             12           2m 23s
hd10.eqn            6              5               1s              5               5s
hd11.eqn           18             17               9s             15           1m 12s
hd12.eqn           16             15               9s             15              42s
bar.eqn            12             12              57s       ... aborted because would take too long!

For the circuit evaluation, we obtained the following results:

root@dfcf63f5b942:~/Lobster# ./gen_table_eval_time.sh

Evaluation result will be saved in paper_result directory

 Name                             Eval.Time
                   Original     Baseline-opted   Lobster-opted
cardio.eqn          14m 54s            10m 42s          7m 12s
dsort.eqn           11m 26s             8m 51s          8m 54s
hd01.eqn                  -                  -               -
hd02.eqn                  -                  -               -
hd03.eqn                  -                  -               -
hd04.eqn             8m 28s             5m 25s          3m 24s
hd05.eqn                  -                  -               -
hd06.eqn                  -                  -               -
hd07.eqn             1m  1s                  -             24s
hd08.eqn             2m 24s             1m  3s          1m  2s
hd09.eqn            13m 33s             9m 58s          9m 30s
hd10.eqn             3m  3s             1m 30s          1m 28s
... aborted manually.

Notably, the execution time of the Cardio circuit is much higher than using Cingulata. Although Lobster uses an outdated version of HElib from 2017 (derived from license in header files), we do not expect such a large difference in the circuit's evaluation time. However, as the circuits in Lobster are specified on a very low-level and thus not easily comparable with the Cingulata circuit, we cannot verify whether this circuit is correct or if there are any other issues.

SEAL

The parameters used by each of the configurations explained following (SEAL-BFV, SEAL-BFV-SIMD, SEAL-CKKS-SIMD) are written at the end of the test run to a file named fhe_parameters.txt that looks as follow:

/
| Encryption parameters :
|   scheme: BFV
|   poly_modulus_degree: 16384
|   coeff_modulus size: 338 (30 + 40 + 44 + 50 + 54 + 60 + 60) bits
|   plain_modulus: 2
\

This file is uploaded to the respective folder in the S3 bucket for documentation purposes.

SEAL-BFV

SEAL does not provide a native support for binary data and computation. Binary support was emulated by choosing the plaintext poly modulus degree 2 and implementing binary circuits for the following arithmetic functions manually:

  • Addition (Sklansky adder by Harris)
  • Lower operator (Logarithmic depth lower comparator by Garay et al.)

Each bit is represented by a single ciphertext. Eight ciphertexts are united into a vector (std::vector<seal::Ciphertext>) to represent each cardio input in a 8-bit binary representation. This bit vector is also the input type for the manually implemented arithmetic functions, however, these functions itself work on the bit-/ciphertext-level by using the arithmetic functions provided by SEAL (addition, multiplication).

The cardiac risk factor assessment algorithm is implemented based on the Cingulata implementation. The only deviation from that is that involved constants (e.g., 50 in man && age > 50) are also encrypted, i.e., represented as vector consisting of seal::Ciphertexts. This is made to facilitate the implementation as otherwise some changes in the manually implemented arithmetic circuits would be required, e.g., we need to use special SEAL operations that accept a seal::Plaintext as argument.

SEAL-BFV-SIMD

Binary emulation with batching cannot easily be realized in SEAL-BFV due to the limitations in the choice of the plaintext modulus. The SEAL Manual v2.3.1 states:

Batching only works when the plaintext modulus t is chosen to be a prime number and congruent to 1 (mod 2n), which we assume to be the case 2. (Footnote: Note that this means t > 2n, which can in some cases turn out to be an annoying limitation.)

Because of this limitation, the result of a XOR b which is computed as a + b is no guaranteed to be a valid binary number (i.e., 0 or 1) in case that the plaintext modulus t is greater than 2. However, as described in a SO post, we can compute a XOR b as (a-b)^2 where a, b are binary digits to bypass this limitation and get the correct result.

The batched BFV-variant of the cardio program was derived from the batched CKKS variant by removing all modulus switching operations, rescaling operations, and by adapting the parameters accordingly.

SEAL-CKKS-SIMD

For implementing the cardio program in SEAL based on the CKKS scheme, we use a non-binary plaintext space. The AND operation can easily be emulated in arithmetic by using multiplication:

IN_1  IN_2  AND(IN_1, IN_2)
----------------------------
0     0     0
0     1     1
1     0     1
1     1     1

In a binary plaintext space, we would typically use addition to represent the XOR operation:

IN_1  IN_2  XOR(IN_1, IN_2)
----------------------------
0     0     0
0     1     1
1     0     1
1     1     0   <-- this does not work in a non-binary field

However, if we have a non-binary plaintext space then 1+1 would simply be 2 instead of 0. Because of that, we cannot use addition to emulate XOR but need to use an approximation of XOR for binary inputs:

XOR(a, b)  ==  (a-b)^2,   for a,b ∈ [0,1]

(See Mathematical (Arithmetic) representation of XOR for its derivation.)

The input given to the program is:

idx  name
----------
00   man 
01   antecedent 
02   smoking 
03   diabetic 
04   pressure 
05   man 
06   !man
07   age 
08   1 
09   1 
10   1 
11   1 
12   drinking 
13   drinking 
14   hdl 
15   height 
16   phy_act 
17   weight+90

Each value of the input is encoded into a 8-bit binary representation wherein each ciphertext slot represents one bit. The inputs 1 (index 08-11) are only given to simplify the data alignment for batching. Also, we pass for simplicity man multiple times, the negation (!man), and also weight+90. The negation of man could also easily be computed at server side. As the computation of weight+90 would require additionally an (8-bit) adder circuit, the client does this computation. To not expose any information about our risk assessment algorithm, we could in a real scenario require the client to send the numeric inputs with different offsets, e.g., weight, weight+10, weight+20, ..., weight+90, etc., or just implement a adder circuit and do the maths by ourselves.

The server converts the single input ciphertext into 17 different ciphertexts, a ciphertext for the boolean flags, one ciphertext for each bit of the lower-equation's left-hand side, and one for each bit of the lower-equation's right-hand side.

For implementing cardio using batching in SEAL-CKKS, we then reformulate the initial conditions:

+1  if man                                  && 50 < age
+1  if antecedent                           && 0 < 1
+1  if smoking                              && 0 < 1
+1  if diabetic                             && 0 < 1
+1  if high blood pressure                  && 0 < 1 
+1  if man                                  && 3 < alc_consumption
+1  if (!man)                               && 2 < alc_consumption
+1  if TRUE                                 && HDL < 40 
+1  if TRUE                                 && height < weight+90
+1  if TRUE                                 && phy_act < 30

into a structure that allows to easily execute the batched operations. Values enclosed in brackets [...] are those passed as input to the program. Other values are merged into the ciphertexts at the server side.

+1  if [man                      &&   50        < [age]
+1  if [antecedent               &&   0         < [1]
+1  if [smoking                  &&   0         < [1]
+1  if [diabetic                 &&   0         < [1]
+1  if [high blood pressure      &&   0         < [1] 
+1  if [man]                     &&   3         < [alc_consumption]
+1  if [!man]                    &&   2         < [alc_consumption]
+1  if [1]                       &&   [HDL]     < 40 
+1  if [1]                       &&   [height]  < weight+90
+1  if [1]                       &&   [phy_act] < 30

Challenges in the implementation of Cardio were primarily related to the peculiarities of CKKS. For example, in addition to the already existing relinearizations that were added for BFV, it was required to take care of a ciphertext's scale and its current moduli level. For simplicity, we always rescale all ciphertexts to the initial scale (40 bits), after each operation that modifies the scale. This was very helpful also for methods that are called with different ciphertexts because by ensuring a scale of 40, we could apply operations on the inputs without checking the compatibility of their scale first.

Important to note is that CKKS is an approximate numbers scheme, i.e., numbers are not always exact. For example, a 1.0 could be represented in SEAL as 1.12134. Because of that, the impreciseness adds up and can lead to a wrong result if the parameters are not chosen properly. For example, the coefficient moduli used for the ciphertext had to be increased from initially 30 bits to 40 bits to achieve a correct result.

Optimizations that our batched CKKS implementation includes:

  • Generate exactly those Galois keys that are required for the used rotations to reduce the time needed for key generation.
  • Minimize the minimum number of moduli levels (i.e., ciphertext size) based on the computation.
  • Align the data to equidistant positions in the ciphertext based on how data is arranged in the input, e.g., place data of interest at ciphertext slots 8, 16, 24, etc. It does not make any difference if data is in sequential order (e.g., in slots 1, 2, 3) but aligning data to a sequential order requires more expensive rotations.
  • Use Ciphertext-Plaintext operations instead of Ciphertext-Ciphertext operations whenever possible as these are cheaper and do not increase the scale of the result, which saves an expensive rescale operation.
Clone this wiki locally