-
Notifications
You must be signed in to change notification settings - Fork 16
Cardio
The cardio
benchmark is based on a medical diagnostic algorithm that was presented by Carpov et al. in Practical privacy-preserving medical diagnosis using homomorphic encryption as demonstration for a cloud-based and privacy-preserving cardiac risk factor assessment service.
The algorithm is used by Cingulata as an example/test application.
The cardio program has in total 7 encrypted inputs. The inputs of the program are:
-
sex
(1 bit): female (false) or male (true) -
antecedents
(1 bit): true if there was a cardiology disease in the family history, otherwise false -
smoker
(1 bit): no (false) or yes (true) -
diabetes
(1 bit): no (false) or yes (true) -
high_blood_pressure
(1 bit): high blood pressure (true) or normal blood pressure (false) -
age
(8 bit): the age in years -
hdl_cholesterol
(8 bit): the HDL cholesterol level -
height
(8 bit): height in centimeters -
weight
(8 bit): weight in kilograms -
phy_activity
(8 bit): physical activity in minutes per day -
drinking_habits
(8 bit): number of glasses of alcoholic beverages per day
The cardio
program uses trans-ciphering that allows to switch from some data encrypted under a classic overhead-free symmetric cryptosystem to the same data encrypted under FHE, without exposing the data in clear text at any time.
For that, the program uses the stream cipher Kreyvium, a 128-bit version of the Trivium algorithm that allows efficient FHE execution by only involving XOR operations.
The inputs are encrypted on the client-side using a key-stream generated by the Kreyvium cipher with the private key SYMSK and an initialization value IV. In our benchmark program, we use a precomputed keystream (like in the Cingulata test program). The encryption is a simple XOR operation and as such can be performed efficiently on a device with low resources. On the server-side, the Kreyvium algorithm is executed homomorphically using the FHE-encrypted private key [SYMSK]FHE with the same IV to ensure that client and server use the same key-stream. This process is called key-stream mining and takes places in advance to minimize latency. After receiving the data from the client and homomorphically executing the Kreyvium algorithm on the server, which basically consists of simple XOR operations, the data is encrypted under FHE and can be processed further. This transciphering procedure is depicted in Figure 1.
Figure 1: The computation workflow of the cardio program using transciphering and FHE. Image taken from Carpov et al.
The main part is the homomorphic execution of the cardiac risk factor assessment algorithm. This consists of a set of rules that are given in Listing 1. These rules compute a score between 0 and 9 where a larger value means a higher risk for a cardiology disease.
+1 if man and age > 50 years
+1 if cardiology disease in family history
+1 if smoking
+1 if diabetic
+1 if high blood pressure
+1 if HDL cholesterol less than 40
+1 if weight > height-90
+1 if daily physical activity less than 30 minutes
+1 if man and alcohol consumption more than 3 glasses/day
+1 if woman and alcohol consumption more than 2 glasses/day
Listing 1: Rules of the cardiac risk factor assessment algorithm.
We use Cingulata to run the Cardio circuit that is created by Cingulata and its ABC-optimized variant. Further, we run the optimized circuit generated by Lobster and the optimized circuit generated by the Multi-Start heuristic. As we do not have access to the Multi-Start code base, we use the optimized circuit that is generated in the Lobster codebase as part of Lobster's evaluation.
Summarized, our benchmark considers following configurations:
Folder Name | Description | Num. Circuits |
---|---|---|
cardio-cingulata | The Cardio circuit generated by Cingulata and the ABC-optimized circuit therefrom. Both use the original Cingulata parameters. | 2 |
cardio-lobster-baseline | The Cardio circuit used by Lobster and the ABC-optimized circuit therefrom. Both use the original Cingulata parameters. | 2 |
cardio-lobster | The Lobster-optimized circuit with the original Cingulata parameters. | 1 |
cardio-lobster-optimal-params | The Lobster-optimized circuit with custom parameters determined by CinguParam. | 1 |
cardio-multistart | The Multi-Start-optimized circuit with the original Cingulata parameters. | 1 |
cardio-multistart-optimal-params | The Multi-Start-optimized circuit with custom parameters determined by CinguParam. | 1 |
The implementation in Cingulata uses an illegal transformation of the initial rule that is described in the paper (Practical privacy-preserving medical diagnosis using homomorphic encryption):
Original rule:
+1 if weight > height-90
--— is not equal to ----
Cingulata's Cardio implementation:
+1 if height+10 < weight
Also, the addition (+10
) did not work properly as 10
must be declared as a CiInt
to make use of the overloaded addition.
The error was fixed by changing the rule to +1 if height < (weight+90)
and declaring 90
as CiInt
.
This change is also considered in the implementation of cardio in all other tools.
Cingulata encodes the inputs as 7*8 bit = 56 bits. The 1-bit flags (sex
, antecedents
, smoker
, diabetes
, and high_blood_pressure
) are encoded in a single 8-bit ciphertext.
At the end of the test run, a file fhe_parameters.txt
is created containing the FHE parameters that looks as follow:
== FHE parameters ====
n: 16384
q: {40,50,56,60,60,60} (326 bit)
T: 2
The parameters are extracted from the file fhe_params.xml
, created by CinguParam, and is also included in our SoK repository.
The fhe_parameters.txt
file is uploaded to the respective folder in the S3 bucket for documentation purposes.
For comparison, we executed the original Lobster circuit and the Multi-Start circuit using the Lobster code base.
The following results show the Lobster benchmarks executed on the same AWS machine as our benchmarks use (m5n.large):
root@dfcf63f5b942:~/Lobster# ./gen_table_rewriting.sh
Optimization logs will be saved in paper_result directory
Name Old Depth Carpov.el.al Lobster
New Depth Opt Time(s) New Depth Opt Time(s)
cardio.eqn 10 9 20s 8 32s
dsort.eqn 9 8 1m 25s 8 2m 18s
hd01.eqn 6 6 2s 6 22s
hd02.eqn 6 6 3s 6 33s
hd03.eqn 5 5 1s 5 5s
hd04.eqn 10 9 3s 8 16s
hd05.eqn 7 7 9s 7 3m 5s
hd06.eqn 7 7 9s 7 4m 35s
hd07.eqn 5 5 0s 3 4s
hd08.eqn 6 5 1s 5 3s
hd09.eqn 14 12 6s 12 2m 23s
hd10.eqn 6 5 1s 5 5s
hd11.eqn 18 17 9s 15 1m 12s
hd12.eqn 16 15 9s 15 42s
bar.eqn 12 12 57s ... aborted because would take too long!
For the circuit evaluation, we obtained the following results:
root@dfcf63f5b942:~/Lobster# ./gen_table_eval_time.sh
Evaluation result will be saved in paper_result directory
Name Eval.Time
Original Baseline-opted Lobster-opted
cardio.eqn 14m 54s 10m 42s 7m 12s
dsort.eqn 11m 26s 8m 51s 8m 54s
hd01.eqn - - -
hd02.eqn - - -
hd03.eqn - - -
hd04.eqn 8m 28s 5m 25s 3m 24s
hd05.eqn - - -
hd06.eqn - - -
hd07.eqn 1m 1s - 24s
hd08.eqn 2m 24s 1m 3s 1m 2s
hd09.eqn 13m 33s 9m 58s 9m 30s
hd10.eqn 3m 3s 1m 30s 1m 28s
... aborted manually.
Notably, the execution time of the Cardio circuit is much higher than using Cingulata. Although Lobster uses an outdated version of HElib from 2017 (derived from license in header files), we do not expect such a large difference in the circuit's evaluation time. However, as the circuits in Lobster are specified on a very low-level and thus not easily comparable with the Cingulata circuit, we cannot verify whether this circuit is correct or if there are any other issues.
The parameters used by each of the configurations explained following (SEAL-BFV, SEAL-BFV-SIMD, SEAL-CKKS-SIMD) are written at the end of the test run to a file named fhe_parameters.txt
that looks as follow:
/
| Encryption parameters :
| scheme: BFV
| poly_modulus_degree: 16384
| coeff_modulus size: 338 (30 + 40 + 44 + 50 + 54 + 60 + 60) bits
| plain_modulus: 2
\
This file is uploaded to the respective folder in the S3 bucket for documentation purposes.
SEAL does not provide a native support for binary data and computation. Binary support was emulated by choosing the plaintext poly modulus degree 2 and implementing binary circuits for the following arithmetic functions manually:
- Addition (Sklansky adder by Harris)
- Lower operator (Logarithmic depth lower comparator by Garay et al.)
Each bit is represented by a single ciphertext.
Eight ciphertexts are united into a vector (std::vector<seal::Ciphertext>
) to represent each cardio input in a 8-bit binary representation.
This bit vector is also the input type for the manually implemented arithmetic functions, however, these functions itself work on the bit-/ciphertext-level by using the arithmetic functions provided by SEAL (addition, multiplication).
The cardiac risk factor assessment algorithm is implemented based on the Cingulata implementation.
The only deviation from that is that involved constants (e.g., 50
in man && age > 50
) are also encrypted, i.e., represented as vector consisting of seal::Ciphertext
s.
This is made to facilitate the implementation as otherwise some changes in the manually implemented arithmetic circuits would be required, e.g., we need to use special SEAL operations that accept a seal::Plaintext
as argument.
Binary emulation with batching cannot easily be realized in SEAL-BFV due to the limitations in the choice of the plaintext modulus. The SEAL Manual v2.3.1 states:
Batching only works when the plaintext modulus t is chosen to be a prime number and congruent to 1 (mod 2n), which we assume to be the case 2. (Footnote: Note that this means t > 2n, which can in some cases turn out to be an annoying limitation.)
Because of this limitation, the result of a XOR b
which is computed as a + b
is no guaranteed to be a valid binary number (i.e., 0 or 1) in case that the plaintext modulus t is greater than 2.
However, as described in a SO post, we can compute a XOR b
as (a-b)^2
where a, b are binary digits to bypass this limitation and get the correct result.
The batched BFV-variant of the cardio program was derived from the batched CKKS variant by removing all modulus switching operations, rescaling operations, and by adapting the parameters accordingly.
For implementing the cardio program in SEAL based on the CKKS scheme, we use a non-binary plaintext space. The AND operation can easily be emulated in arithmetic by using multiplication:
IN_1 IN_2 AND(IN_1, IN_2)
----------------------------
0 0 0
0 1 1
1 0 1
1 1 1
In a binary plaintext space, we would typically use addition to represent the XOR operation:
IN_1 IN_2 XOR(IN_1, IN_2)
----------------------------
0 0 0
0 1 1
1 0 1
1 1 0 <-- this does not work in a non-binary field
However, if we have a non-binary plaintext space then 1+1 would simply be 2 instead of 0. Because of that, we cannot use addition to emulate XOR but need to use an approximation of XOR for binary inputs:
XOR(a, b) == (a-b)^2, for a,b ∈ [0,1]
(See Mathematical (Arithmetic) representation of XOR for its derivation.)
The input given to the program is:
idx name
----------
00 man
01 antecedent
02 smoking
03 diabetic
04 pressure
05 man
06 !man
07 age
08 1
09 1
10 1
11 1
12 drinking
13 drinking
14 hdl
15 height
16 phy_act
17 weight+90
Each value of the input is encoded into a 8-bit binary representation wherein each ciphertext slot represents one bit.
The inputs 1
(index 08-11) are only given to simplify the data alignment for batching.
Also, we pass for simplicity man
multiple times, the negation (!man
), and also weight+90
.
The negation of man
could also easily be computed at server side.
As the computation of weight+90
would require additionally an (8-bit) adder circuit, the client does this computation.
To not expose any information about our risk assessment algorithm, we could in a real scenario require the client to send the numeric inputs with different offsets, e.g., weight
, weight+10
, weight+20
, ..., weight+90
, etc., or just implement a adder circuit and do the maths by ourselves.
The server converts the single input ciphertext into 17 different ciphertexts, a ciphertext for the boolean flags, one ciphertext for each bit of the lower-equation's left-hand side, and one for each bit of the lower-equation's right-hand side.
For implementing cardio using batching in SEAL-CKKS, we then reformulate the initial conditions:
+1 if man && 50 < age
+1 if antecedent && 0 < 1
+1 if smoking && 0 < 1
+1 if diabetic && 0 < 1
+1 if high blood pressure && 0 < 1
+1 if man && 3 < alc_consumption
+1 if (!man) && 2 < alc_consumption
+1 if TRUE && HDL < 40
+1 if TRUE && height < weight+90
+1 if TRUE && phy_act < 30
into a structure that allows to easily execute the batched operations. Values enclosed in brackets [...] are those passed as input to the program. Other values are merged into the ciphertexts at the server side.
+1 if [man && 50 < [age]
+1 if [antecedent && 0 < [1]
+1 if [smoking && 0 < [1]
+1 if [diabetic && 0 < [1]
+1 if [high blood pressure && 0 < [1]
+1 if [man] && 3 < [alc_consumption]
+1 if [!man] && 2 < [alc_consumption]
+1 if [1] && [HDL] < 40
+1 if [1] && [height] < weight+90
+1 if [1] && [phy_act] < 30
Challenges in the implementation of Cardio were primarily related to the peculiarities of CKKS. For example, in addition to the already existing relinearizations that were added for BFV, it was required to take care of a ciphertext's scale and its current moduli level. For simplicity, we always rescale all ciphertexts to the initial scale (40 bits), after each operation that modifies the scale. This was very helpful also for methods that are called with different ciphertexts because by ensuring a scale of 40, we could apply operations on the inputs without checking the compatibility of their scale first.
Important to note is that CKKS is an approximate numbers scheme, i.e., numbers are not always exact. For example, a 1.0 could be represented in SEAL as 1.12134. Because of that, the impreciseness adds up and can lead to a wrong result if the parameters are not chosen properly. For example, the coefficient moduli used for the ciphertext had to be increased from initially 30 bits to 40 bits to achieve a correct result.
Optimizations that our batched CKKS implementation includes:
- Generate exactly those Galois keys that are required for the used rotations to reduce the time needed for key generation.
- Minimize the minimum number of moduli levels (i.e., ciphertext size) based on the computation.
- Align the data to equidistant positions in the ciphertext based on how data is arranged in the input, e.g., place data of interest at ciphertext slots 8, 16, 24, etc. It does not make any difference if data is in sequential order (e.g., in slots 1, 2, 3) but aligning data to a sequential order requires more expensive rotations.
- Use Ciphertext-Plaintext operations instead of Ciphertext-Ciphertext operations whenever possible as these are cheaper and do not increase the scale of the result, which saves an expensive rescale operation.
- Home
- Compilers & Optimizations
- Libraries
- Benchmark Programs
- Implementation Docs