-
Notifications
You must be signed in to change notification settings - Fork 193
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Page Cache support for trusted files #1776
base: master
Are you sure you want to change the base?
Page Cache support for trusted files #1776
Conversation
Signed-off-by: jkr0103 <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: 0 of 13 files reviewed, 1 unresolved discussion, not enough approvals from maintainers (3 more required), not enough approvals from different teams (2 more required, approved so far: ) (waiting on @jkr0103)
pal/src/host/linux-sgx/enclave_framework.c
line 697 at r1 (raw file):
buf_pos += chunk_size; } else { memcpy(tmp_chunk, chunk->data, chunk_size);
Don't need to use tmp_chunk
and this memcpy
since we don't SHA256Update the chunk.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: 0 of 13 files reviewed, 2 unresolved discussions, not enough approvals from maintainers (3 more required), not enough approvals from different teams (2 more required, approved so far: ) (waiting on @jkr0103)
pal/src/host/linux-sgx/enclave_framework.c
line 690 at r1 (raw file):
if (lruc_find(tf->cache, chunk_number) != NULL) { tf_chunk_t* chunk = lruc_get(tf->cache, chunk_number);
Can move lruc_get
to if()
and remove the lruc_find
.
Signed-off-by: jkr0103 <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: 0 of 13 files reviewed, 2 unresolved discussions, not enough approvals from maintainers (3 more required), not enough approvals from different teams (2 more required, approved so far: ), "fixup! " found in commit messages' one-liners (waiting on @llly)
pal/src/host/linux-sgx/enclave_framework.c
line 690 at r1 (raw file):
Previously, llly (Li Xun) wrote…
Can move
lruc_get
toif()
and remove thelruc_find
.
Done.
pal/src/host/linux-sgx/enclave_framework.c
line 697 at r1 (raw file):
Previously, llly (Li Xun) wrote…
Don't need to use
tmp_chunk
and thismemcpy
since we don't SHA256Update the chunk.
Done.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 12 of 13 files at r1, 1 of 1 files at r2, all commit messages.
Reviewable status: all files reviewed, 3 unresolved discussions, not enough approvals from maintainers (3 more required), not enough approvals from different teams (2 more required, approved so far: ), "fixup! " found in commit messages' one-liners (waiting on @jkr0103)
pal/src/host/linux-sgx/enclave_framework.c
line 653 at r2 (raw file):
if (g_tf_max_chunks_in_cache > 0 && ++lcu_remove_count == 100) { log_always("High frequenty of this log indicates Trusted files chunks exceed the" " `tf_max_chunks_in_cache` limit. Please increase it in the manifest"
sgx.tf_max_chunks_in_cache
Code quote:
tf_max_chunks_in_cache
pal/src/host/linux-sgx/enclave_framework.c
line 714 at r2 (raw file):
* directly copy into buf (without a scratch buffer) and hash in-place */ if (!sgx_copy_to_enclave(buf_pos, chunk_size, umem + chunk_offset, chunk_size)) { goto failed;
ret
is not set before goto failed
. Seems to be a bug in current code.
pal/src/host/linux-sgx/enclave_framework.c
line 733 at r2 (raw file):
* needed by the caller */ if (!sgx_copy_to_enclave(tmp_chunk, chunk_size, umem + chunk_offset, chunk_size)) { goto failed;
ditto.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: all files reviewed, 3 unresolved discussions, not enough approvals from maintainers (3 more required), not enough approvals from different teams (2 more required, approved so far: ), "fixup! " found in commit messages' one-liners (waiting on @jkr0103 and @llly)
pal/src/host/linux-sgx/enclave_framework.c
line 714 at r2 (raw file):
Previously, llly (Li Xun) wrote…
ret
is not set beforegoto failed
. Seems to be a bug in current code.
+1. Should set smth like ret = -PAL_ERROR_DENIED;
pal/src/host/linux-sgx/enclave_framework.c
line 733 at r2 (raw file):
Previously, llly (Li Xun) wrote…
ditto.
+1. Should set smth like ret = -PAL_ERROR_DENIED;
Signed-off-by: jkr0103 <[email protected]>
Signed-off-by: jkr0103 <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: 12 of 13 files reviewed, 3 unresolved discussions, not enough approvals from maintainers (3 more required), not enough approvals from different teams (2 more required, approved so far: ), "fixup! " found in commit messages' one-liners (waiting on @dimakuv and @llly)
pal/src/host/linux-sgx/enclave_framework.c
line 653 at r2 (raw file):
Previously, llly (Li Xun) wrote…
sgx.tf_max_chunks_in_cache
Done.
pal/src/host/linux-sgx/enclave_framework.c
line 714 at r2 (raw file):
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
+1. Should set smth like
ret = -PAL_ERROR_DENIED;
Done.
pal/src/host/linux-sgx/enclave_framework.c
line 733 at r2 (raw file):
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
+1. Should set smth like
ret = -PAL_ERROR_DENIED;
Done.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: 12 of 13 files reviewed, 8 unresolved discussions, not enough approvals from maintainers (3 more required), not enough approvals from different teams (2 more required, approved so far: ), "fixup! " found in commit messages' one-liners (waiting on @jkr0103 and @llly)
CI-Examples/nginx/nginx-gramine.conf.template
line 43 at r3 (raw file):
open_file_cache_valid 60s; open_file_cache_min_uses 2; open_file_cache_errors on;
Why did you modify this file?
common/src/protected_files/meson.build
line 5 at r3 (raw file):
'../../../pal/include/pal', '../../../pal/include/arch' / host_machine.cpu_family(), '../../../pal/include/arch' / host_machine.cpu_family() / 'linux'),
Why is this required? This is wrong, common code must not depend on PAL at all.
pal/src/host/linux-sgx/enclave_framework.c
line 702 at r3 (raw file):
buf_pos += copy_end - copy_start; } } else {
Sorry, this if {} else {}
looks ugly. Why not just continue
?
for (...) {
if (g_tf_max_chunks_in_cache > 0 && (chunk = lruc_get(tf->cache, chunk_number)) != NULL) {
...
continue;
}
/* we didn't find the chunk in the TF cache, must copy into enclave and add to the TF cache */
... old code plus your `tf_append_chunk()` ...
pal/src/host/linux-sgx/enclave_framework.c
line 774 at r3 (raw file):
failed: free(tmp_chunk);
Can you add here before free()
the following line:
assert(ret < 0);
This way we'll always immediately know about the problems with "forgotten" ret
values.
pal/src/host/linux-sgx/enclave_tf.h
line 92 at r3 (raw file):
*/ int tf_append_chunk(struct trusted_file* tf, uint8_t* chunk, uint64_t chunk_size, uint64_t chunk_number);
What's the point of this declaration? This func is a helper and it is used only in a single file. So just declare it there as static
and remove from this header.
Also, the comments are useless, it's obvious what this does and what its parameters are.
Signed-off-by: Dmitrii Kuvaiskii <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: 10 of 13 files reviewed, 7 unresolved discussions, not enough approvals from maintainers (3 more required), not enough approvals from different teams (2 more required, approved so far: ), "fixup! " found in commit messages' one-liners (waiting on @jkr0103 and @llly)
common/src/protected_files/meson.build
line 5 at r3 (raw file):
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
Why is this required? This is wrong, common code must not depend on PAL at all.
Done. I fixed it myself in the fixup commit (since it was a bit non-trivial to do).
Signed-off-by: jkr0103 <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: 8 of 13 files reviewed, 7 unresolved discussions, not enough approvals from maintainers (3 more required), not enough approvals from different teams (2 more required, approved so far: ), "fixup! " found in commit messages' one-liners (waiting on @dimakuv and @llly)
pal/src/host/linux-sgx/enclave_framework.c
line 702 at r3 (raw file):
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
Sorry, this
if {} else {}
looks ugly. Why not justcontinue
?for (...) { if (g_tf_max_chunks_in_cache > 0 && (chunk = lruc_get(tf->cache, chunk_number)) != NULL) { ... continue; } /* we didn't find the chunk in the TF cache, must copy into enclave and add to the TF cache */ ... old code plus your `tf_append_chunk()` ...
Done.
pal/src/host/linux-sgx/enclave_framework.c
line 774 at r3 (raw file):
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
Can you add here before
free()
the following line:assert(ret < 0);
This way we'll always immediately know about the problems with "forgotten"
ret
values.
Done.
CI-Examples/nginx/nginx-gramine.conf.template
line 43 at r3 (raw file):
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
Why did you modify this file?
removed
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 2 of 2 files at r4, 3 of 3 files at r5, all commit messages.
Reviewable status: all files reviewed, 4 unresolved discussions, not enough approvals from maintainers (3 more required), not enough approvals from different teams (2 more required, approved so far: ), "fixup! " found in commit messages' one-liners (waiting on @dimakuv and @jkr0103)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: all files reviewed, 4 unresolved discussions, not enough approvals from maintainers (3 more required), not enough approvals from different teams (2 more required, approved so far: ), "fixup! " found in commit messages' one-liners (waiting on @dimakuv)
pal/src/host/linux-sgx/enclave_tf.h
line 92 at r3 (raw file):
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
What's the point of this declaration? This func is a helper and it is used only in a single file. So just declare it there as
static
and remove from this header.Also, the comments are useless, it's obvious what this does and what its parameters are.
Done.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: all files reviewed, 5 unresolved discussions, not enough approvals from maintainers (3 more required), not enough approvals from different teams (2 more required, approved so far: ), "fixup! " found in commit messages' one-liners (waiting on @dimakuv)
pal/src/host/linux-sgx/enclave_tf_structs.h
line 47 at r5 (raw file):
typedef struct _tf_chunk { uint64_t chunk_number; uint8_t data[TRUSTED_CHUNK_SIZE];
Shell we allocate memory to this array dynamically as it will be waste of 16k memory when sgx.tf_max_chunks_in_cache
is not used in the manifest?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 8 of 13 files at r1, 2 of 2 files at r4, 3 of 3 files at r5, all commit messages.
Reviewable status: all files reviewed, 38 unresolved discussions, not enough approvals from maintainers (3 more required), not enough approvals from different teams (2 more required, approved so far: ), "fixup! " found in commit messages' one-liners (waiting on @jkr0103)
a discussion (no related file):
Please test your PR extensively on many workloads, including multi-threaded ones.
a discussion (no related file):
The current PR has a cache-per-trusted-file. I wonder if we want to have a global cache for all trusted files. This will need a global lock unfortunately, and some more sophisticated cache->key
-- currently we can get away with a simple "chunk index"; but with a global cache the key must reflect (trusted file, chunk index)
tuple.
@mkow @kailun-qin Thoughts?
-- commits
line 2 at r5:
We'll need to modify this title and we'll need to add the explanatory text in the commit body.
Need to mention:
- New manifest option
sgx.tf_max_chunks_in_cache
- That we have an explicit
common_lru_cache_dep
dependency in Meson files now - Copy some text from Documentation, to have a quick explanation what it is and why needed
I can do it during final rebase.
CI-Examples/nginx/nginx.manifest.template
line 36 at r5 (raw file):
sgx.max_threads = {{ '1' if env.get('EDMM', '0') == '1' else '4' }} # Tune this parameters based on your application and system configuration for best performance
That's a useless comment. It's too generic.
Can you specify exactly why our example with Nginx requires this 48
, and what it means exactly (i.e., 48 trusted-file chunks is enough to cover common Nginx static HTTP files which will be cached by Gramine).
common/src/meson.build
line 14 at r5 (raw file):
'init.c', 'location.c', 'lru_cache.c',
This looks wrong. Please remove from this list. You should always use common_src_lru_cache_dep
from now on.
Documentation/manifest-syntax.rst
line 973 at r5 (raw file):
.. _trusted-files-max-chunks-in-cache: Trusted files Page Cache optimization: maximum chunks in cache
Depending on the name that we will choose for the manifest option, this title will be the same as the manifest option. I.e. something like:
Trusted files' cache size
^^^^^^^^^^^^^^^^^^^^^^^^^
Documentation/manifest-syntax.rst
line 978 at r5 (raw file):
:: sgx.tf_max_chunks_in_cache = [NUM]
Hm, now I don't like the name... I think sgx.trusted_files_cache_size = [NUM]
would be more understandable to end users.
Let's wait for other reviewers to chime in before we change this name, especially @kailun-qin and @mkow
Documentation/manifest-syntax.rst
line 981 at r5 (raw file):
(Default: 0) This syntax specifies a limit on the number of chunks in the cache for trusted
Hm, now I understand that this "file chunks" is confusing to end users. Users should specify the actual "cache size" in KB/MB/GB, just like other sizes in the manifest.
Then Gramine internally will divide this cache size (in bytes) over the single-chunk-size to get the number of chunks, and Gramine will use this number of chunks in its code.
@kailun-qin @mkow What do you think? Will the size-in-bytes be more understandable for users?
Documentation/manifest-syntax.rst
line 983 at r5 (raw file):
This syntax specifies a limit on the number of chunks in the cache for trusted files. By default, this optimization is disabled as the limit should be application-specific.
There is a lot of information missing from this description:
- What is this option exactly? (It specifies the sub-region of each trusted file is cached. Also, only files that are opened more than once are being cached.)
- What is the cache eviction policy? (It is classic Least Recently Used aka LRU.)
- When users would want to use this optimization? (When there are just a few frequently-reused files.)
- What is the downside of this optimization? (If the files are rarely reused, then this optimization unnecessarily fills out enclave memory. Or if there are too many files, then too much memory will be spent on the cache).
Documentation/performance.rst
line 312 at r5 (raw file):
As the trusted file is read, it is split into chunks, and we compute SHA256 hash for each chunk. Gramine doesn't implement a Page Cache feature, so it
I would remove this doesn't implement a Page Cache feature
-- this is taken from our GitHub discussions, and I referred in that sentence to the Linux feature. I would simply start with the second half of the sentence:
...
hash for each chunk. Gramine doesn't have an optimization of keeping the trusted file's
content in enclave memory, which implies re-reading and re-hashing the same file contents
every time the file is read at the same offset. To address this performance bottleneck ...
pal/src/host/linux-sgx/enclave_framework.c
line 629 at r5 (raw file):
static int tf_append_chunk(struct trusted_file* tf, uint8_t* chunk, uint64_t chunk_size, uint64_t chunk_number) {
Fix indentation
pal/src/host/linux-sgx/enclave_framework.c
line 629 at r5 (raw file):
static int tf_append_chunk(struct trusted_file* tf, uint8_t* chunk, uint64_t chunk_size, uint64_t chunk_number) {
The operations on tf->times_first_chunk_loaded
and tf->cache
must be protected by a spinlock, otherwise two threads working on the same file can lead to a segfault or inconsistent state.
pal/src/host/linux-sgx/enclave_framework.c
line 631 at r5 (raw file):
uint64_t chunk_size, uint64_t chunk_number) { if (chunk_number == 0) { tf->times_first_chunk_loaded++;
We can have an integer overflow here theoretically. Since this is purely theoretical, I would smth like this:
if (chunk_number == 0) {
tf->times_first_chunk_loaded++;
if (tf->times_first_chunk_loaded == UINT64_MAX)
BUG(); /* should not be possible in practice */
}
pal/src/host/linux-sgx/enclave_framework.c
line 632 at r5 (raw file):
if (chunk_number == 0) { tf->times_first_chunk_loaded++; }
You need to explain what this logic is in a comment.
pal/src/host/linux-sgx/enclave_framework.c
line 634 at r5 (raw file):
} if (tf->times_first_chunk_loaded > 1) {
Please revert for better readability:
if (tf->times_first_chunk_loaded < 1)
pal/src/host/linux-sgx/enclave_framework.c
line 635 at r5 (raw file):
if (tf->times_first_chunk_loaded > 1) { tf_chunk_t* new_chunk = calloc(1, sizeof(*new_chunk));
Please use malloc()
You then will have to add memset(0)
after memcpu()
, to zero out the rest of the chunk (if the requested size was smaller than the chunk size)
pal/src/host/linux-sgx/enclave_framework.c
line 645 at r5 (raw file):
free(new_chunk); return -PAL_ERROR_NOMEM; }
Please add a newline after this line
pal/src/host/linux-sgx/enclave_framework.c
line 646 at r5 (raw file):
return -PAL_ERROR_NOMEM; } if (lruc_size(tf->cache) > (size_t)g_tf_max_chunks_in_cache) {
You don't need the type cast
pal/src/host/linux-sgx/enclave_framework.c
line 650 at r5 (raw file):
lruc_remove_last(tf->cache); #ifdef DEBUG static int lcu_remove_count = 0;
uint64_t
pal/src/host/linux-sgx/enclave_framework.c
line 650 at r5 (raw file):
lruc_remove_last(tf->cache); #ifdef DEBUG static int lcu_remove_count = 0;
This variable is more like a log-frequency throttler, call it tf_cache_log_throttler
pal/src/host/linux-sgx/enclave_framework.c
line 651 at r5 (raw file):
#ifdef DEBUG static int lcu_remove_count = 0; if (g_tf_max_chunks_in_cache > 0 && ++lcu_remove_count == 100) {
If you will do:
if (!g_tf_max_chunks_in_cache)
return 0;
at the beginning of this func, then you won't need the first clause in this IF.
pal/src/host/linux-sgx/enclave_framework.c
line 651 at r5 (raw file):
#ifdef DEBUG static int lcu_remove_count = 0; if (g_tf_max_chunks_in_cache > 0 && ++lcu_remove_count == 100) {
lcu_remove_count % 100 == 0
pal/src/host/linux-sgx/enclave_framework.c
line 655 at r5 (raw file):
" `sgx.tf_max_chunks_in_cache` limit. Please increase it in the manifest" " file to get the best performance."); lcu_remove_count = 0;
No need to reset to 0, instead better move the increment to here:
lcu_remove_count++;
pal/src/host/linux-sgx/enclave_framework.c
line 682 at r5 (raw file):
off_t chunk_offset = aligned_offset; struct trusted_file* tf = get_trusted_or_allowed_file(path);
This is an expensive operation.
You should instead pass tf
as an argument to this function. The callers of this function must have access to tf
. To do this, seems like you need to memorize tf
during file_open()
inside a new file handle.file.tf
.
pal/src/host/linux-sgx/enclave_framework.c
line 683 at r5 (raw file):
struct trusted_file* tf = get_trusted_or_allowed_file(path); int chunk_number = chunk_offset/TRUSTED_CHUNK_SIZE;
What's the point of this helper variable?
You can just calculate this variable on demand, based on chunk_offset
. No need to drag it around.
pal/src/host/linux-sgx/enclave_framework.c
line 703 at r5 (raw file):
} continue; }
Please add a newline and a comment, to make it clear that below we start another case:
/* we didn't find the chunk in the TF cache, must copy into enclave and add to the TF cache */
pal/src/host/linux-sgx/enclave_framework.c
line 713 at r5 (raw file):
if (chunk_offset >= offset && chunk_end <= end) { /* if current chunk-to-copy completely resides in the requested region-to-copy, * directly copy into buf (without a scratch buffer) and hash in-place */
Fix indentation (revert this accidental change)
pal/src/host/linux-sgx/enclave_framework.c
line 719 at r5 (raw file):
} if(g_tf_max_chunks_in_cache > 0) {
if (
-- add space
pal/src/host/linux-sgx/enclave_framework.c
line 719 at r5 (raw file):
} if(g_tf_max_chunks_in_cache > 0) {
Why do you need this if (g_tf_...)
here? Why not move it inside td_append_chunk()
itself?
pal/src/host/linux-sgx/enclave_framework.c
line 739 at r5 (raw file):
} if(g_tf_max_chunks_in_cache > 0) {
if (
-- add space
pal/src/host/linux-sgx/enclave_framework.c
line 739 at r5 (raw file):
} if(g_tf_max_chunks_in_cache > 0) {
Why do you need this if (g_tf_...)
here? Why not move it inside td_append_chunk()
itself?
pal/src/host/linux-sgx/enclave_framework.c
line 764 at r5 (raw file):
if (memcmp(chunk_hashes_item, &chunk_hash[0], sizeof(*chunk_hashes_item))) { log_error("Accessing file '%s' is denied: incorrect hash of file chunk at %lu-%lu.", path, chunk_offset, chunk_end);
Fix indentation (revert this accidental change)
pal/src/host/linux-sgx/enclave_tf_structs.h
line 47 at r5 (raw file):
Previously, jkr0103 (Jitender Kumar) wrote…
Shell we allocate memory to this array dynamically as it will be waste of 16k memory when
sgx.tf_max_chunks_in_cache
is not used in the manifest?
I don't understand this question. Where do we waste memory?
pal/src/host/linux-sgx/enclave_tf_structs.h
line 48 at r5 (raw file):
uint64_t chunk_number; uint8_t data[TRUSTED_CHUNK_SIZE]; } tf_chunk_t;
Please don't use a typedef, we consider it deprecated for all new code. Just use:
struct tf_chunk {
uint64_t chunk_number;
uint8_t data[TRUSTED_CHUNK_SIZE];
};
/* use like below */
struct tf_chunk chunk;
chunk.data = ...
pal/src/host/linux-sgx/pal_files.c
line 140 at r5 (raw file):
/* we lazily update the size of the trusted file */ tf->size = st.st_size; tf->cache = lruc_create();
Shouldn't you check for if (!tf->cache) return NOMEM
?
pal/src/host/linux-sgx/pal_files.c
line 140 at r5 (raw file):
/* we lazily update the size of the trusted file */ tf->size = st.st_size; tf->cache = lruc_create();
Where is the cleanup of this cache-per-trusted-file? You should remove all items in the cache when file is closed, and remove the cache itself.
pal/src/host/linux-sgx/pal_files.c
line 141 at r5 (raw file):
tf->size = st.st_size; tf->cache = lruc_create(); tf->times_first_chunk_loaded = 0;
These two assignments should better be done in load_trusted_or_allowed_file()
pal/src/host/linux-sgx/pal_main.c
line 869 at r5 (raw file):
ret = toml_int_in(g_pal_public_state.manifest_root, "sgx.tf_max_chunks_in_cache", /*defaultval=*/0, &g_tf_max_chunks_in_cache);
Please fix indentation
Signed-off-by: jkr0103 <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: 5 of 15 files reviewed, 38 unresolved discussions, not enough approvals from maintainers (2 more required), not enough approvals from different teams (2 more required, approved so far: ), "fixup! " found in commit messages' one-liners (waiting on @dimakuv and @llly)
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
We'll need to modify this title and we'll need to add the explanatory text in the commit body.
Need to mention:
- New manifest option
sgx.tf_max_chunks_in_cache
- That we have an explicit
common_lru_cache_dep
dependency in Meson files now- Copy some text from Documentation, to have a quick explanation what it is and why needed
I can do it during final rebase.
Thanks
CI-Examples/nginx/nginx.manifest.template
line 36 at r5 (raw file):
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
That's a useless comment. It's too generic.
Can you specify exactly why our example with Nginx requires this
48
, and what it means exactly (i.e., 48 trusted-file chunks is enough to cover common Nginx static HTTP files which will be cached by Gramine).
Done.
common/src/meson.build
line 14 at r5 (raw file):
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
This looks wrong. Please remove from this list. You should always use
common_src_lru_cache_dep
from now on.
Done.
Documentation/manifest-syntax.rst
line 983 at r5 (raw file):
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
There is a lot of information missing from this description:
- What is this option exactly? (It specifies the sub-region of each trusted file is cached. Also, only files that are opened more than once are being cached.)
- What is the cache eviction policy? (It is classic Least Recently Used aka LRU.)
- When users would want to use this optimization? (When there are just a few frequently-reused files.)
- What is the downside of this optimization? (If the files are rarely reused, then this optimization unnecessarily fills out enclave memory. Or if there are too many files, then too much memory will be spent on the cache).
Done.
pal/src/host/linux-sgx/enclave_framework.c
line 629 at r5 (raw file):
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
Fix indentation
Done.
pal/src/host/linux-sgx/enclave_framework.c
line 629 at r5 (raw file):
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
The operations on
tf->times_first_chunk_loaded
andtf->cache
must be protected by a spinlock, otherwise two threads working on the same file can lead to a segfault or inconsistent state.
Done.
pal/src/host/linux-sgx/enclave_framework.c
line 631 at r5 (raw file):
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
We can have an integer overflow here theoretically. Since this is purely theoretical, I would smth like this:
if (chunk_number == 0) { tf->times_first_chunk_loaded++; if (tf->times_first_chunk_loaded == UINT64_MAX) BUG(); /* should not be possible in practice */ }
I added one more check to limit it's value to 10, hence no overflow is possible.
pal/src/host/linux-sgx/enclave_framework.c
line 632 at r5 (raw file):
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
You need to explain what this logic is in a comment.
Done.
pal/src/host/linux-sgx/enclave_framework.c
line 634 at r5 (raw file):
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
Please revert for better readability:
if (tf->times_first_chunk_loaded < 1)
Did not understand it?
pal/src/host/linux-sgx/enclave_framework.c
line 635 at r5 (raw file):
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
Please use
malloc()
You then will have to add
memset(0)
aftermemcpu()
, to zero out the rest of the chunk (if the requested size was smaller than the chunk size)
Done.
pal/src/host/linux-sgx/enclave_framework.c
line 645 at r5 (raw file):
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
Please add a newline after this line
Done.
pal/src/host/linux-sgx/enclave_framework.c
line 646 at r5 (raw file):
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
You don't need the type cast
Done.
pal/src/host/linux-sgx/enclave_framework.c
line 650 at r5 (raw file):
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
uint64_t
I feel it's not required to change the variable type as it's value won't cross 100.
pal/src/host/linux-sgx/enclave_framework.c
line 650 at r5 (raw file):
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
This variable is more like a log-frequency throttler, call it
tf_cache_log_throttler
Done.
pal/src/host/linux-sgx/enclave_framework.c
line 651 at r5 (raw file):
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
If you will do:
if (!g_tf_max_chunks_in_cache) return 0;
at the beginning of this func, then you won't need the first clause in this IF.
Done.
pal/src/host/linux-sgx/enclave_framework.c
line 651 at r5 (raw file):
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
lcu_remove_count % 100 == 0
If we keep increamenting the variable value, it will overflow, instead we can reset when value reaches 100.
pal/src/host/linux-sgx/enclave_framework.c
line 655 at r5 (raw file):
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
No need to reset to 0, instead better move the increment to here:
lcu_remove_count++;
need to reset to prevent value overflow.
pal/src/host/linux-sgx/enclave_framework.c
line 682 at r5 (raw file):
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
This is an expensive operation.
You should instead pass
tf
as an argument to this function. The callers of this function must have access totf
. To do this, seems like you need to memorizetf
duringfile_open()
inside a new filehandle.file.tf
.
Done.
pal/src/host/linux-sgx/enclave_framework.c
line 683 at r5 (raw file):
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
What's the point of this helper variable?
You can just calculate this variable on demand, based on
chunk_offset
. No need to drag it around.
It's used at multiple places and makes code more readable. Do you still want it to be changes?
pal/src/host/linux-sgx/enclave_framework.c
line 703 at r5 (raw file):
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
Please add a newline and a comment, to make it clear that below we start another case:
/* we didn't find the chunk in the TF cache, must copy into enclave and add to the TF cache */
Done.
pal/src/host/linux-sgx/enclave_framework.c
line 713 at r5 (raw file):
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
Fix indentation (revert this accidental change)
Done.
pal/src/host/linux-sgx/enclave_framework.c
line 719 at r5 (raw file):
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
if (
-- add space
Done.
pal/src/host/linux-sgx/enclave_framework.c
line 719 at r5 (raw file):
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
Why do you need this
if (g_tf_...)
here? Why not move it insidetd_append_chunk()
itself?
Done.
pal/src/host/linux-sgx/enclave_framework.c
line 739 at r5 (raw file):
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
if (
-- add space
Done.
pal/src/host/linux-sgx/enclave_framework.c
line 739 at r5 (raw file):
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
Why do you need this
if (g_tf_...)
here? Why not move it insidetd_append_chunk()
itself?
Done.
pal/src/host/linux-sgx/enclave_framework.c
line 764 at r5 (raw file):
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
Fix indentation (revert this accidental change)
Done.
pal/src/host/linux-sgx/enclave_tf_structs.h
line 47 at r5 (raw file):
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
I don't understand this question. Where do we waste memory?
understood, no memory waste here until instantiated.
pal/src/host/linux-sgx/enclave_tf_structs.h
line 48 at r5 (raw file):
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
Please don't use a typedef, we consider it deprecated for all new code. Just use:
struct tf_chunk { uint64_t chunk_number; uint8_t data[TRUSTED_CHUNK_SIZE]; }; /* use like below */ struct tf_chunk chunk; chunk.data = ...
Done.
pal/src/host/linux-sgx/pal_files.c
line 140 at r5 (raw file):
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
Shouldn't you check for
if (!tf->cache) return NOMEM
?
done at other place where this code is moved as per other comment.
pal/src/host/linux-sgx/pal_files.c
line 140 at r5 (raw file):
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
Where is the cleanup of this cache-per-trusted-file? You should remove all items in the cache when file is closed, and remove the cache itself.
Added cleanup code in file_destroy
function.
pal/src/host/linux-sgx/pal_files.c
line 141 at r5 (raw file):
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
These two assignments should better be done in
load_trusted_or_allowed_file()
Done.
pal/src/host/linux-sgx/pal_main.c
line 869 at r5 (raw file):
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
Please fix indentation
Done.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: 5 of 15 files reviewed, 39 unresolved discussions, not enough approvals from maintainers (2 more required), not enough approvals from different teams (2 more required, approved so far: ), "fixup! " found in commit messages' one-liners (waiting on @dimakuv and @llly)
pal/src/host/linux-sgx/enclave_framework.c
line 752 at r6 (raw file):
} else { /* if current chunk-to-copy only partially overlaps with the requested region-to-copy, * read the file contents into a scratch buffer, verify hash and then copy only the part
Will fix indentation issue with future commit
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: 5 of 15 files reviewed, 39 unresolved discussions, not enough approvals from maintainers (2 more required), not enough approvals from different teams (2 more required, approved so far: ), "fixup! " found in commit messages' one-liners (waiting on @dimakuv and @llly)
pal/src/host/linux-sgx/enclave_framework.c
line 632 at r5 (raw file):
Previously, jkr0103 (Jitender Kumar) wrote…
Done.
We start adding a file to cache if file is resed for 10 times or more. We can adjustg this number based on the data from other workloads, added below data from nginx run which shows the list of file with number of times reused. I will collect similer data for other workloads as well:
No. of times file used-------------file path
4 /usr/local/lib/x86_64-linux-gnu/gramine/runtime/glibc/ld-linux-x86-64.so.2
5 /usr/local/lib/x86_64-linux-gnu/gramine/runtime/glibc/libdl.so.2
5 /usr/local/lib/x86_64-linux-gnu/gramine/runtime/glibc/libpthread.so.0
3 /lib/x86_64-linux-gnu/libcrypt.so.1
2 /lib/x86_64-linux-gnu/libssl.so.1.1
2 /lib/x86_64-linux-gnu/libcrypto.so.1.1
3 /lib/x86_64-linux-gnu/libz.so.1
4 /usr/local/lib/x86_64-linux-gnu/gramine/runtime/glibc/libc.so.6
4 install/sbin/nginx
1 install/conf/nginx-gramine.conf
2 install/conf/mime.types
1 install/conf/server.crt
1 install/conf/server.key
14320000+ install/html/random/10K.1.html
pal/src/host/linux-sgx/enclave_framework.c
line 635 at r5 (raw file):
Previously, jkr0103 (Jitender Kumar) wrote…
Done.
Are memset
and memcpu
really required. We are filling value few line below:
new_chunk->chunk_number = chunk_number;
memcpy(new_chunk->data, chunk, chunk_size);
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: 5 of 15 files reviewed, 39 unresolved discussions, not enough approvals from maintainers (2 more required), not enough approvals from different teams (2 more required, approved so far: ), "fixup! " found in commit messages' one-liners (waiting on @dimakuv and @llly)
Documentation/performance.rst
line 312 at r5 (raw file):
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
I would remove this
doesn't implement a Page Cache feature
-- this is taken from our GitHub discussions, and I referred in that sentence to the Linux feature. I would simply start with the second half of the sentence:... hash for each chunk. Gramine doesn't have an optimization of keeping the trusted file's content in enclave memory, which implies re-reading and re-hashing the same file contents every time the file is read at the same offset. To address this performance bottleneck ...
Done.
Signed-off-by: jkr0103 <[email protected]>
Signed-off-by: jkr0103 <[email protected]>
Signed-off-by: jkr0103 <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: 5 of 15 files reviewed, 39 unresolved discussions, not enough approvals from maintainers (2 more required), not enough approvals from different teams (2 more required, approved so far: ), "fixup! " found in commit messages' one-liners (waiting on @dimakuv and @llly)
pal/src/host/linux-sgx/enclave_tf.h
line 64 at r8 (raw file):
* \brief Copy and check file contents from untrusted outside buffer to in-enclave buffer * * \param tf Trusted file struct corresponding to this file.
Will fix this comment with next commit.
Description of the changes
Please check issue #1712 for details.
This PR introduces a manifest option
sgx.tf_max_chunks_in_cache
which specifies the sub-region of each trusted file is cached if the file is opened more than 10 times. Cache eviction policy is Least Recently Used aka LRU. This optimization is used when there are frequently-reused files.Downside of this optimization:
If the files are rarely reused, then this optimization unnecessarily fills out enclave memory. and if there are too many files, then too much memory will be spent on the cache.
Below are the results with and without this PR for nginx run:
Fixes #1712
How to test this PR?
Change
worker_processes
value toauto
inCI-Examples/nginx/nginx-gramine.conf.template
Add
open_file_cache max=1024 inactive=10s;
underhttp {
block inCI-Examples/nginx/nginx-gramine.conf.template
Native run:
SGX run:
Initial 1-3 run with gramine-sgx might give low performance results as trusted file chuks are added to the page cache for the first time. After 1-3 iterations results become consistent.
This change is