Hello,

I've been doing some tests with Gramine and I found a result that I can't explain.

I've been testing the same application that can be found here with different enclave sizes on Microsoft Azure. For these tests I used different machines from the DCdsv3-series: Standard_DC1s_v3, Standard_DC2s_v3, Standard_DC4s_v3 and Standard_DC24s_v3. I am testing each machine size with different input sizes (from 2^10 to 2^16 rows). You can find the results here.

My assumption was that all enclave sizes would be sufficient to store and compute the data, so I didn't expect enclave size to have any influence on execution time. But the figure clearly shows that execution time is lower when enclave size is higher. Do you have an idea that could explain this?
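For context, the enclave size is the value I vary in the application's Gramine manifest; a minimal illustrative fragment (the size shown is just a placeholder, not my exact configuration):

```toml
# Enclave size as set in the Gramine manifest (placeholder value; I sweep
# over several sizes in the experiments):
sgx.enclave_size = "2G"
```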
First, note that different Azure VM sizes may be backed by different underlying hardware. In particular, different hardware has different versions of SGX, different EPC sizes, different quirks (for example, older machines do not support the rdtsc instruction inside the SGX enclave, while newer machines do), etc.
Second, I recommend also re-doing your experiments with pre-heating of the enclave enabled, see https://gramine.readthedocs.io/en/stable/manifest-syntax.html#pre-heating-enclave. My suspicion is that the fact that some enclave memory is not pre-faulted plays a significant role in your experiments.
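If I remember the manifest syntax correctly, this is a one-line manifest change (please double-check the option name and default against the linked docs):

```toml
# Pre-fault all enclave pages at startup, trading longer startup time for
# more stable steady-state performance.
sgx.preheat_enclave = true
```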
Third, your experiments seem too short: I notice execution times of less than 0.1s, which makes the "jitter" factor too significant for proper analysis. Maybe there were fluctuations during some runs? Or a background job kicked in during some runs? Or the enclave thread was re-scheduled to a different CPU core by Linux, forcing it to re-populate the local CPU cache?
I would suggest running your main execution logic in a loop (or similar), to get execution times of at least 10s; see the sketch below. This will smooth out all random perf-affecting factors.
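A minimal sketch of what I mean, in Python, with a hypothetical run_workload() standing in for your actual computation:

```python
import time

def run_workload():
    # Placeholder for the actual computation under test (hypothetical name).
    pass

start = time.perf_counter()
iterations = 0
while time.perf_counter() - start < 10.0:  # repeat until ~10s have elapsed
    run_workload()
    iterations += 1
elapsed = time.perf_counter() - start
print(f"{iterations} iterations in {elapsed:.2f} s "
      f"({elapsed / iterations * 1e3:.3f} ms per iteration)")
```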
Fourth, for such experiments I highly recommend running taskset -c ... gramine-sgx, to limit the CPU cores on which Linux will schedule enclave threads. This way you'll remove one factor of performance variation.
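For example (the core list and the application name are placeholders; adjust them to your setup):

```sh
# Pin the enclave threads to a fixed set of cores:
taskset -c 0-3 gramine-sgx your_app
```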
UPDATE: I noticed that you have sgx.profile.enable = "main". This may also affect performance. Moreover, this option implies that you built Gramine in debug or debugoptimized mode, which is not advised for performance experiments. I'd recommend removing this manifest option, rebuilding Gramine in release mode, and redoing the experiments.
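A rough sketch of the rebuild, assuming the usual meson-based flow from the Gramine build docs (check the docs for the exact options your setup needs):

```sh
# Reconfigure and rebuild Gramine in release mode (flags are indicative only):
meson setup build-release --buildtype=release -Dsgx=enabled
ninja -C build-release
sudo ninja -C build-release install
```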