PC just freezes after starting ksm_um.exe #22
Comments
Works fine for me with the same build configuration. I have Windows 10 build 15063.296; however, the CPU configuration shouldn't really be an issue. The initialization of EPAGE_HOOK should never cause this issue, as it's just a spin lock init plus an identity hash table initialization. Are you calling |
No, like I mentioned. It freezes my PC right after executing the user mode application. Maybe I haven't expressed myself clearly enough: the driver itself (sc start) loads just fine, but after the UM application sends the 'subvert' command, the PC freezes. |
The annoying thing is that I don't even have a dump file or something similar to look through. |
Yeah, I understand. I am saying since it's only initialization of epage hook that is the suspect then it shouldn't really cause the issue. If it really is epage that's causing the issue, then can you modify

    static bool do_ept_violation(struct ept_ve_around *ve)
    {
        struct vcpu *vcpu = ve->vcpu;
        struct ept *ept = &vcpu->ept;
        struct ksm *k = vcpu_to_ksm(vcpu);
        struct ve_except_info *info = ve->info;

        if ((info->exit & EPT_VE_RWX) == 0) { /* no access */
            if (!ept_alloc_page(EPT4(ept, info->eptp),
                                EPT_ACCESS_ALL, info->gpa, info->gpa))
                return false;
            return true;
        }

        KSM_PANIC(EPT_BUGCHECK_CODE, EPT_UNHANDLED_VIOLATION, info->exit, info->gpa);
        return false;
    }

And let me know if a bugcheck occurs then? Also, since you enabled fileprint, post the log too. |
I don't think it has anything to do with epage hook. The log isn't very eventful, but here it is:
Edit: |
Weird about the memory issue; 8 physical memory ranges shouldn't be too much for it to handle. I don't know what the most likely cause is. If it were too much for NonPagedPool, it would have crashed, I think. |
You mean the minidump for the 4GB RAM crash? |
Now I finally got a crash and a dump file that can be examined. Edit: This here may actually be the right kernel file. At least WinDbg says it is the mapped one. |
So seems to be an
This happens when KSM fails to read from a guest VA and therefore injects a page fault. I think it might be a fault in how KSM reads from guest virtual memory. I will take a look at this again in a few minutes, but in the meantime... can you try without commit c956379? |
So, I have tried it without that commit and got a crash. |
Commit 172ca1f should do it. |
ok, I will try |
So, I have tried the latest commit but have had no luck so far. |
Does it freeze when coming back from a sleep state? I wonder if you can let a VM freeze while having a debugger attached then break when it hangs to see where it's hanging at. |
I would have to set up a new VM since I somehow corrupted my old one when experimenting with hypervisors. And I also don't really know how to debug a VM :/ |
Just use the release build for the UM application, and install the VC runtime for it. You can keep the debug build of the driver. See VirtualKD, it makes debugging easier with VMware: http://virtualkd.sysprogs.org/ |
Ugh, when I try to subvert the cpus, I get the error code Edit2: So, I have got it running in a VM now and WinDbg is attached to it. |
The VM seems to be stable ..? |
What if you reduce EPTPs, change |
So, just tried reducing |
Well, this is interesting. Edit: Ah, and also, my VS complains about the |
Hmm, that's odd. Do you mind bisecting? Since I don't get this on my machine. |
Sorry for my dumb question, what do you mean by 'bisecting'? |
I mean using git bisect. This will help find the offending commit since v1.4. |
Ah, ok sure. Just give me a bit of time :) |
So, this is what
I have to mention that this commit here
gave me two crashes (for which I don't have dump files) but worked after that weirdly enough. |
I don't know whether this has anything to do with the freezing, but the first bad commit mentioned above is the first commit that has the user mode application. |
So to confirm, 7654361 causes a crash/freeze? Can you upload the minidump and ksm then? |
So, this is weird. Look at this:
|
That's the virtualization probe callback. This is the call tree:
So if you want to investigate then looking at |
It seems like all 8 CPUs get virtualized successfully.
And then in the percpu dpc callback:
The crash dump shows that 8 virtual CPUs were initialized |
Now I am a bit baffled...

    if (k->active_vcpus >= 8)
        return k->active_vcpus;

And this for the callback:

    if (DPC_RET() > 8) \
        KSM_PANIC(DPC_RET(), 0, 0, 0xFFF); \

If I understand it right, this should never catch? |
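The boundary case asked about above can be checked in isolation. This is a minimal sketch, not KSM code: `probe_result` and `would_panic` are hypothetical stand-ins for the two quoted snippets, and the "return 0 otherwise" branch is an assumption made only so the function is complete.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the first snippet: once the active count
 * reaches 8 it is returned as-is. Returning 0 otherwise is an assumption
 * made for this sketch only. */
static int probe_result(int active_vcpus)
{
    if (active_vcpus >= 8)
        return active_vcpus;
    return 0;
}

/* Hypothetical stand-in for the second snippet: true exactly where the
 * quoted macro would reach KSM_PANIC. */
static bool would_panic(int dpc_ret)
{
    return dpc_ret > 8;
}
```

With exactly 8 active VCPUs the first check returns 8, and `8 > 8` is false, so the panic is not reached. Under these assumptions it would only fire if the counter ever went past 8.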
Your latest commit (d650239) with the bitmap fix causes an
|
Your new commit (b895792) seems to either fix or break something.
|
Latest should fix bitmap once and for all. |
Is it really supposed to be:

    static inline unsigned long __ffs(unsigned long x)
    {
    #ifdef _MSC_VER
        unsigned long i;
        _BitScanForward(&i, x);
        return i + 1;
    #else
        return __builtin_ffs(x);
    #endif
    }

I am asking, because in this commit (fa654b6), you removed the addition, but now in your latest commit (07e4334), you added it again? With the addition, I am getting a freeze (like always) and without, |
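The `+ 1` matters because the two intrinsics in the quoted function use different indexing conventions: `_BitScanForward` reports a 0-based bit position, while `__builtin_ffs` returns one plus that position (and 0 when no bit is set). A portable sketch of both conventions, using plain loops rather than the intrinsics; `ffs_zero_based` and `ffs_one_based` are hypothetical names for illustration:

```c
/* 0-based index of the lowest set bit, the convention _BitScanForward
 * uses for its output index. The caller must pass x != 0. */
static unsigned ffs_zero_based(unsigned long x)
{
    unsigned i = 0;

    while ((x & 1UL) == 0) {
        x >>= 1;
        i++;
    }
    return i;
}

/* 1-based index of the lowest set bit, or 0 when no bit is set: the
 * convention of POSIX ffs() and GCC's __builtin_ffs(). */
static unsigned ffs_one_based(unsigned long x)
{
    return x == 0 ? 0 : ffs_zero_based(x) + 1;
}
```

For `x = 8` the first returns 3 and the second returns 4. Whichever convention the bitmap code expects, both compiler branches have to agree on it; this sketch does not settle which one KSM's callers want.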
Commit 07e4334 is all the bitmap commits merged into one, because it was a mess. That's the final version.
So this basically explains why we need to decrement I don't see why it would fail with that error code (invalid control field) which I'd assume is the Or just add a function to tell you which

    static inline u8 debug_vmx_vmwrite(const char *field, size_t nr, size_t value)
    {
        u8 err = __vmx_vmwrite(nr, value);
        if (err != 0)
            KSM_DEBUG("error writing 0x%016llX to %s: %d\n", value, field, err);
        return err;
    }

    #define DEBUG_VMX_VMWRITE(field, value) \
        debug_vmx_vmwrite(#field, field, value)

And replace |
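The wrapper above relies on the preprocessor's `#` operator to turn the field name into a string for the log. A user-mode sketch of the same pattern, so it can be tried outside the driver: `fake_vmwrite`, `checked_vmwrite`, `FIELD_OK` and `FIELD_BAD` are all hypothetical and only mimic a write that fails for one field.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical field encodings, chosen only for this sketch. */
#define FIELD_OK  0x1000
#define FIELD_BAD 0x2000

/* Hypothetical stand-in for the vmwrite intrinsic: rejects FIELD_BAD and
 * accepts everything else. */
static unsigned char fake_vmwrite(size_t field, size_t value)
{
    (void)value;
    return field == FIELD_BAD ? 1 : 0;
}

/* Name of the last field whose write failed, kept for inspection. */
static const char *last_failed_field = NULL;

static unsigned char checked_vmwrite(const char *name, size_t field, size_t value)
{
    unsigned char err = fake_vmwrite(field, value);

    if (err != 0) {
        last_failed_field = name;
        fprintf(stderr, "error writing 0x%zX to %s: %d\n", value, name, err);
    }
    return err;
}

/* #field yields the argument exactly as spelled at the call site. */
#define CHECKED_VMWRITE(field, value) checked_vmwrite(#field, field, value)
```

Calling `CHECKED_VMWRITE(FIELD_BAD, 1)` logs the name `FIELD_BAD` rather than the number, because `#` stringizes the argument before macro expansion.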
May have been an MTRR issue, can you re-try with latest? |
Ok, I will when I come home. |
Sorry for the late response. |
I git bisected again, and this time I got a different result. |
I don't think that's relevant, the bug is in some other commit for sure, keep bisecting. |
Check your Windows '__ffs64' implementation. The first argument to _BitScanForward64 is the wrong size. |
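For context on the size remark: MSVC's `_BitScanForward64` takes an `unsigned long *` for the index, and `unsigned long` is 32 bits wide on Windows even in 64-bit builds. The thread does not show KSM's `__ffs64`, so the following is only a sketch of what can go wrong, assuming a 64-bit variable was passed where a 32-bit one is written. `scan_forward64` is a hypothetical portable stand-in for the intrinsic, and the stale bytes are simulated with a marker value.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for _BitScanForward64: stores a 32-bit index of
 * the lowest set bit in mask, returns 0 if mask is zero. */
static unsigned char scan_forward64(uint32_t *index, uint64_t mask)
{
    uint32_t i = 0;

    if (mask == 0)
        return 0;
    while ((mask & 1) == 0) {
        mask >>= 1;
        i++;
    }
    *index = i;
    return 1;
}

/* Assumed buggy pattern: the result variable is 64 bits wide but only
 * 4 of its 8 bytes are written, so the rest keeps stale contents. */
static uint64_t ffs64_buggy(uint64_t x)
{
    uint64_t wide = 0xDEADBEEFDEADBEEFULL; /* simulated stale contents */
    uint32_t idx = 0;

    scan_forward64(&idx, x);
    memcpy(&wide, &idx, sizeof(idx)); /* 32-bit store into a 64-bit slot */
    return wide;
}

/* Fixed pattern: the index variable has the width that is written, and
 * is widened only on return. */
static uint64_t ffs64_fixed(uint64_t x)
{
    uint32_t idx = 0;

    scan_forward64(&idx, x);
    return idx;
}
```

In the real case the stale half would be uninitialized stack contents, so the returned index could differ from run to run.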
Thanks! Should be fixed now. |
Was this bug fixed? I've encountered the same problem. Every time I run ksm_um.exe, the system freezes, even when I remove all macros other than ENABLE_DBGPRINT. OS: Windows 10 x64, CPU: i5 7200U, Kernel: 16299. |
No, it wasn't. I can't reproduce it; if you can find which part is broken, I will look into it. |
Tried several combinations, finally found:
Then I ran ksm_um.exe: no freeze or crash. But after a while, the system hangs too. |
What did you change the Also, if you're gonna change the IDTR, then comment out |
Changed from vcpu->idt.base to idtr->base. But if I disable SECONDARY_EXEC_ENABLE_VMFUNC, it seems I can't use the EPT-related tricks (EPT hook, introspect...) |
Yep, that's because they require vmfunc. You can use VMFUNC without #VE. However, if you want to fix that freeze/crash, can you try backporting #VE handling and IDT shadowing to how v1.4 does them? v1.4 is commit 0cb7dd5. |
I reverted to ksm-1.4, commented out #VE and SECONDARY_EXEC_DESC_TABLE_EXITING, and changed GUEST_IDTR_BASE to idtr->base; the epage_hook works fine. Hope the current version gets this bug fixed. |
I don't understand what you changed from the current modifications you already made anyways. There is a reason why I told you to only backport these specific parts, v1.4 is v1.4. If you don't want to work on fixing it or help to fix it, then it won't be fixed any time soon. |
@asamy |
@hzqst That's a different issue, pretty sure this was posted before KPTI was even discovered and reported. What you reported could be related, however. So can you open an issue with that? I can look into this later. |
I'm getting a SYSTEM_THREAD_EXCEPTION_NOT_HANDLED bluescreen both when I try the epage hook example in the driver and when I try your test binary. I'm on Win10 1709 with a latest-gen i5. This happens both in VMware 14 and on normal Windows. Also, I have CPU overclocking disabled. Here's a memory dump of the epage hook test |
I have found that this problem is caused by: mov cr8, rax, in nt!KiGenericCallDpcWorker+0x111 (Win10 1903), but I do not know why |
Type of this issue (please specify)
System information
Build Configuration
I have to mention that this also happens when I just enable EPAGE_HOOK
Issue description
My issue is that after I start the user mode application (ksm_um.exe), my PC just freezes.
I have waited up to about 10 minutes without anything happening. No crash, nothing.
The last log entry in the log file is right before the DPC call. After that, nothing gets through.
When I tested other hypervisors, I always got some kind of feedback (good or bad) like a BSOD which would help to track down the issue.