This repository has been archived by the owner on Sep 25, 2024. It is now read-only.
Problem
For a CoCo guest, a malicious host/VMM can prevent IPIs from being delivered across vCPUs. We need to ensure that all missing IPIs can be detected, or force waiting on failed IPI deliveries.
Solution
Analyze the respective APIs and their users within the kernel, and analyze the consequences of a missing IPI.
Ideally, develop a method to take the IPIs outside of host/VMM control, utilizing techniques such as IPI virtualization (https://lwn.net/Articles/863190/) or similar.
Initial analysis
The analysis below was done a while back. It is not complete and needs checking by someone with deeper knowledge.
Here is the list of x86 callers of smp_call_function that do not wait:
xen_pv_stop_other_cpus. Xen code is currently excluded.
sysrq_showregs_othercpus. Just printing info, safe to be lost.
scftorture_invoke_one & scf_torture_cleanup. Torture test for smp_call_function() and friends. We do not care about torture test module.
Rest are using wait=true and therefore should be safe.
Here is the list of x86 callers of smp_call_function_many that do not wait:
raise_mce. The MCE feature should be disabled in a CoCo guest by the respective CPUID bit.
scftorture_invoke_one. Torture test for smp_call_function() and friends. We do not care about torture test module.
kvm_kick_many_cpus. Kernel-based Virtual Machine driver for Linux; we don't care about this one in a TDX guest.
Rest are using wait=true and therefore should be safe.
Here is the list of x86 callers of on_each_cpu that do not wait (note this omits the generic driver users, since we assume that only a minimal set of drivers is enabled; if this assumption is not correct, this list must be extended):
fix_erratum_688. ??
lapic_update_tsc_freq. TSC re-calibration. Does not seem to have any strong implications if this fails.
rcu_gp_init & rcu_gp_cleanup. These are only for RCU_STRICT_GRACE_PERIOD, which, as far as I understand, is used for debugging only due to the large performance degradation. However, if this is used in a TDX guest kernel, I think it might create problems for RCU operation.
Rest are using wait=true and therefore should be safe.
on_each_cpu_cond_mask, on_each_cpu_cond, smp_call_function_any & on_each_cpu_mask do not have x86 callers that do not wait.
Here is the list of x86 callers of smp_call_function_single that do not wait:
aperfmperf_snapshot_cpu. Aperf/Mperf should be disabled in a CoCo guest by the respective CPUID bits.
do_inject. If the MCE feature is disabled (see above), injection should also be turned off.
cpuhp_report_idle_dead. If CPU hotplug is disabled, this is also disabled?
trc_wait_for_one_reader. If not delivered, I think this will disturb the correctness of RCU operation. However, I don't know if there are any direct security consequences.
scftorture_invoke_one. Torture test for smp_call_function() and friends. We do not care about torture test module.
sync_rcu_exp_select_node_cpus & sync_sched_exp_online_cleanup. Same as trc_wait_for_one_reader above; not sure of the consequences for proper RCU operation.
Rest are using wait=true and therefore should be safe.
x86 callers of smp_call_function_single_async:
cpuid_read. We are ok with such reads failing.
rdmsr_safe_on_cpu. We are ok with such reads failing.