Support ELF core dump creation on guest crash #417

Open
wants to merge 10 commits into
base: main
from

Conversation

dblnz
Contributor

@dblnz dblnz commented Apr 14, 2025

Purpose

Description

This depends on PR.
This closes #310

  • The crashdump feature enables core dump creation for Hyperlight Sandboxes.
    Core dumps can also be configured at the Sandbox level. By default, every Sandbox has core dump generation enabled. Set guest_core_dump = false in the SandboxConfiguration to disable core dumps for a specific Sandbox.
  • By default, Hyperlight writes core dumps to the platform-specific temporary directory.
    To change this, set the HYPERLIGHT_CORE_DUMP_DIR environment variable to the desired directory. If the directory does not exist, it is created.
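
The directory selection described above can be sketched as follows. This is a minimal illustration using only the standard library, not the code from this PR; the function names resolve_core_dump_dir and core_dump_dir_from_env are hypothetical, and treating an empty variable as unset is an assumption.

```rust
use std::env;
use std::fs;
use std::io;
use std::path::PathBuf;

/// Picks the core dump directory: the override if one is given, otherwise the
/// platform temporary directory. Creates the directory when it is missing.
/// (Hypothetical helper, not the PR's actual function.)
fn resolve_core_dump_dir(override_dir: Option<PathBuf>) -> io::Result<PathBuf> {
    let dir = override_dir.unwrap_or_else(env::temp_dir);
    fs::create_dir_all(&dir)?;
    Ok(dir)
}

/// Reads the override from the environment variable named in the description.
/// An empty value is treated as "not set" (an assumption of this sketch).
fn core_dump_dir_from_env() -> io::Result<PathBuf> {
    let override_dir = env::var_os("HYPERLIGHT_CORE_DUMP_DIR")
        .filter(|value| !value.is_empty())
        .map(PathBuf::from);
    resolve_core_dump_dir(override_dir)
}
```

Keeping the environment lookup separate from the resolution logic makes the fallback and the directory creation easy to test without touching process-wide state.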

@dblnz dblnz added the kind/enhancement For PRs adding features, improving functionality, docs, tests, etc. label Apr 14, 2025
@dblnz dblnz force-pushed the gdb-coredump-elfcore branch from d5892ec to 52c3c71 Compare April 15, 2025 06:47
@dblnz dblnz self-assigned this Apr 15, 2025
@dblnz dblnz force-pushed the gdb-coredump-elfcore branch 7 times, most recently from cc8a48d to 0126ce3 Compare April 29, 2025 12:40
@dblnz dblnz requested a review from Copilot April 29, 2025 12:41
@Copilot Copilot AI left a comment

Pull Request Overview

This PR adds support for ELF core dump generation on guest crash by introducing crashdump metadata and integrating it into the hypervisor initialization and crashdump context API. Key changes include:

  • Introducing the SandboxMetadata structure and propagating a new crashdump parameter across hypervisor modules.
  • Implementing a crashdump_context method that gathers register and memory state from various hypervisors.
  • Integrating the elfcore dependency and functionality in a new crashdump module.

Reviewed Changes

Copilot reviewed 12 out of 12 changed files in this pull request and generated no comments.

| File | Description |
| --- | --- |
| src/hyperlight_host/src/sandbox/uninitialized_evolve.rs | Passes crashdump metadata to hypervisor handler initialization. |
| src/hyperlight_host/src/sandbox/uninitialized.rs | Introduces SandboxMetadata to support crash dump context. |
| src/hyperlight_host/src/hypervisor/windows_hypervisor_platform.rs | Adds a get_xsave function and crash dump integration. |
| src/hyperlight_host/src/hypervisor/mod.rs | Updates the hypervisor trait to use crashdump_context instead of get_memory_regions. |
| src/hyperlight_host/src/hypervisor/kvm.rs | Implements crashdump_context with register mapping and xsave conversion. |
| src/hyperlight_host/src/hypervisor/inprocess.rs | Marks crashdump_context as unsupported in in-process mode. |
| src/hyperlight_host/src/hypervisor/hyperv_windows.rs | Integrates crashdump_context and replaces the get_memory_regions implementation. |
| src/hyperlight_host/src/hypervisor/hyperv_linux.rs | Adds a crashdump_context implementation with register mapping and xsave handling. |
| src/hyperlight_host/src/hypervisor/crashdump.rs | Implements core dump creation using elfcore and adds custom memory reading. |
| src/hyperlight_host/Cargo.toml | Adds the elfcore dependency for ELF core dump support. |

Comments suppressed due to low confidence (2)

src/hyperlight_host/src/hypervisor/hyperv_windows.rs:535

  • Using unwrap() for OsString to String conversion may lead to a runtime panic if the conversion fails; consider handling conversion errors or using to_string_lossy() instead.
Path::new(&path).file_name().map(|name| name.to_os_string().into_string().unwrap()).unwrap()

src/hyperlight_host/src/hypervisor/hyperv_linux.rs:707

  • Using unwrap() for converting an OsString to a String here might result in a panic if the conversion fails; consider using a more robust conversion strategy such as to_string_lossy().
Path::new(&path).file_name().map(|name| name.to_os_string().into_string().unwrap()).unwrap()
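
The two suppressed comments above suggest avoiding the double unwrap(). A minimal sketch of that suggestion, using only standard library calls, might look like this; the helper name file_name_lossy is hypothetical and not part of the PR.

```rust
use std::path::Path;

/// Returns the final path component as a `String`, or `None` when the path
/// has no file name (for example `/` or a path ending in `..`).
/// Invalid UTF-8 is replaced with U+FFFD instead of causing a panic.
fn file_name_lossy<P: AsRef<Path>>(path: P) -> Option<String> {
    path.as_ref()
        .file_name()
        .map(|name| name.to_string_lossy().into_owned())
}
```

Returning an Option leaves the "no file name" case to the caller, which could fall back to a placeholder name rather than aborting while a crash is already being handled.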

@dblnz dblnz force-pushed the gdb-coredump-elfcore branch from 0126ce3 to 51ce291 Compare April 29, 2025 13:07
@dblnz dblnz requested a review from Copilot April 29, 2025 13:09
@Copilot Copilot AI left a comment

Pull Request Overview

This PR adds support for generating ELF core dump files when a guest crashes. The changes integrate a new crashdump feature across the codebase by introducing a new SandboxMetadata structure, updating various hypervisor implementations with a crashdump_context function, and adding a dedicated crashdump module along with necessary dependency updates.

  • Introduces crashdump support via conditional compilation (#[cfg(crashdump)]) and metadata propagation.
  • Implements crashdump_context functions in multiple hypervisor modules.
  • Adds a new crashdump module and updates Cargo.toml to include the elfcore dependency.

Reviewed Changes

Copilot reviewed 12 out of 12 changed files in this pull request and generated 1 comment.

| File | Description |
| --- | --- |
| src/hyperlight_host/src/sandbox/uninitialized_evolve.rs | Introduces crashdump metadata propagation in sandbox initialization. |
| src/hyperlight_host/src/sandbox/uninitialized.rs | Defines SandboxMetadata and integrates metadata handling in guest binary parsing. |
| src/hyperlight_host/src/hypervisor/windows_hypervisor_platform.rs | Adds a new get_xsave function for crashdump support. |
| src/hyperlight_host/src/hypervisor/kvm.rs | Implements crashdump_context using vCPU register and XSAVE information. |
| src/hyperlight_host/src/hypervisor/inprocess.rs | Marks crashdump_context as unimplemented for in-process mode. |
| src/hyperlight_host/src/hypervisor/hypervisor_handler.rs | Updates hypervisor handler functions to accept crashdump metadata. |
| src/hyperlight_host/src/hypervisor/hyperv_windows.rs | Adds crashdump_context support and propagates metadata in HypervWindowsDriver. |
| src/hyperlight_host/src/hypervisor/hyperv_linux.rs | Implements crashdump_context for HypervLinuxDriver with register extraction and metadata. |
| src/hyperlight_host/src/hypervisor/crashdump.rs | Introduces the crashdump module with GuestView, GuestMemReader, and the core dump generation function. |
| src/hyperlight_host/Cargo.toml | Adds the elfcore dependency needed for ELF core dump creation. |

@dblnz dblnz marked this pull request as ready for review April 29, 2025 14:06
ludfjig
ludfjig previously approved these changes Apr 29, 2025
Contributor

@ludfjig ludfjig left a comment

Looks great. Left some nit comments.

By the way (unrelated to your PR), we should simplify each driver to include only the core VM functionality (like run, set_reg, get_regs, etc.). Then we can rely on a dyn Hypervisor to avoid duplicating functionality across the different implementations.
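
One way to picture that suggestion is sketched below. It assumes a hypothetical trait named VmCore with two primitives borrowed from the comment (get_regs, set_reg); the real Hypervisor trait and driver types in Hyperlight differ. Shared logic is written once against the trait object instead of once per driver.

```rust
/// Hypothetical minimal interface each backend would implement.
trait VmCore {
    fn get_regs(&self) -> Vec<u64>;
    fn set_reg(&mut self, index: usize, value: u64);
}

/// Shared helper written once against the trait object, not once per driver.
fn zero_all_regs(vm: &mut dyn VmCore) {
    let count = vm.get_regs().len();
    for index in 0..count {
        vm.set_reg(index, 0);
    }
}

/// Toy backend used only to exercise the helper; it stands in for a real
/// KVM or Hyper-V driver.
struct FakeVm {
    regs: Vec<u64>,
}

impl VmCore for FakeVm {
    fn get_regs(&self) -> Vec<u64> {
        self.regs.clone()
    }

    fn set_reg(&mut self, index: usize, value: u64) {
        self.regs[index] = value;
    }
}
```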

@dblnz dblnz force-pushed the gdb-coredump-elfcore branch 4 times, most recently from 964a139 to 94a55d1 Compare May 7, 2025 11:28
ludfjig
ludfjig previously approved these changes May 7, 2025
simongdavies
simongdavies previously approved these changes May 8, 2025
Contributor

@simongdavies simongdavies left a comment

A couple of minor comments/suggestions. The main thing is that I think we should enable this for release builds and show how to get debug info in a release build, as above.

@dblnz dblnz dismissed stale reviews from simongdavies and ludfjig via 9f60095 May 20, 2025 14:51
@dblnz dblnz force-pushed the gdb-coredump-elfcore branch from 94a55d1 to 9f60095 Compare May 20, 2025 14:51
@dblnz dblnz requested a review from danbugs as a code owner May 20, 2025 14:51
@Copilot Copilot AI left a comment

Pull Request Overview

Adds end-to-end support for generating ELF core dumps when a guest VM crashes by propagating binary metadata, implementing dump-context methods in each hypervisor backend, and wiring up an elfcore-based dump writer.

  • Introduce SandboxMetadata to track the guest binary path and pass it through sandbox initialization.
  • Implement crashdump_context in KVM, Hyper-V (Linux/Windows) drivers and add crashdump::CrashDumpContext plus the crashdump_to_tempfile function.
  • Update build configuration, Cargo features, and docs to enable the crashdump feature and document usage.

Reviewed Changes

Copilot reviewed 13 out of 13 changed files in this pull request and generated 2 comments.

| File | Description |
| --- | --- |
| src/hyperlight_host/src/sandbox/uninitialized_evolve.rs | Propagate SandboxMetadata into hv_init and evolution calls. |
| src/hyperlight_host/src/sandbox/uninitialized.rs | Define SandboxMetadata and set binary_path based on GuestBinary. |
| src/hyperlight_host/src/hypervisor/windows_hypervisor_platform.rs | Add get_xsave under crashdump and import HyperlightError. |
| src/hyperlight_host/src/hypervisor/mod.rs | Replace get_memory_regions with crashdump_context in the trait. |
| src/hyperlight_host/src/hypervisor/kvm.rs | Store metadata, implement crashdump_context for KVM. |
| src/hyperlight_host/src/hypervisor/hypervisor_handler.rs | Pass SandboxMetadata through handler setup and set_up_hypervisor_partition. |
| src/hyperlight_host/src/hypervisor/hyperv_windows.rs | Store metadata and implement crashdump_context for Windows Hyper-V. |
| src/hyperlight_host/src/hypervisor/hyperv_linux.rs | Store metadata and implement crashdump_context for Linux Hyper-V. |
| src/hyperlight_host/src/hypervisor/crashdump.rs | Add CrashDumpContext, GuestView, GuestMemReader, and crashdump_to_tempfile. |
| src/hyperlight_host/build.rs | Remove debug_assertions gate for crashdump feature. |
| src/hyperlight_host/Cargo.toml | Add chrono and elfcore dependencies under the crashdump feature. |
| docs/how-to-debug-a-hyperlight-guest.md | Document ELF core dump generation and inspection workflows. |

Comments suppressed due to low confidence (2)

src/hyperlight_host/src/hypervisor/crashdump.rs:1

  • The crashdump module and its imports from optional crates (chrono, elfcore) are always compiled, which will break builds when the crashdump feature is disabled. Wrap the entire module (or relevant imports and functions) with #[cfg(crashdump)] so it's only included when that feature is active.
use chrono;
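
The gating that this comment asks for could look roughly like the sketch below. It assumes the crashdump cfg flag is passed by the build (for example from build.rs) and uses a placeholder function, enabled, that is not part of the PR. Builds without the flag get a stub with the same interface, so call sites compile either way and nothing from the optional crates is referenced.

```rust
// Compiled only when the build passes `--cfg crashdump`; imports from the
// optional crates (chrono, elfcore) would live inside this module.
#[cfg(crashdump)]
mod crashdump {
    pub fn enabled() -> bool {
        true
    }
}

// Stub with the same interface for builds without the flag, so nothing that
// depends on the optional crates is compiled.
#[cfg(not(crashdump))]
mod crashdump {
    pub fn enabled() -> bool {
        false
    }
}

/// Call sites use the same path regardless of how the crate was built.
fn crashdump_enabled() -> bool {
    crashdump::enabled()
}
```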

src/hyperlight_host/src/hypervisor/crashdump.rs:246

  • Add unit tests for crashdump_to_tempfile and CrashDumpContext::new to verify that core dump files are generated correctly and metadata is set as expected.
pub(crate) fn crashdump_to_tempfile(hv: &dyn Hypervisor) -> Result<()> {

dblnz added 9 commits May 21, 2025 13:44
- the core dump file is an ELF file with special segments that describe
the guest's memory when it crashed, the CPU register values, and other
special notes that tell the debugger how to set up a debugging session
starting from the core dump

Signed-off-by: Doru Blânzeanu <[email protected]>
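
As a small illustration of the commit message above: an ELF core file is identified in its header by the object type ET_CORE (value 4 in the ELF specification). The sketch below checks only that header field and does not parse segments or notes; the function name is_elf_core is hypothetical and not taken from the PR or from elfcore.

```rust
/// ELF object type for core files (`ET_CORE` in the ELF specification).
const ET_CORE: u16 = 4;

/// Returns true when `bytes` starts with an ELF header whose type is ET_CORE.
fn is_elf_core(bytes: &[u8]) -> bool {
    // The header starts with the 16-byte e_ident array, followed by the
    // 2-byte e_type field at offset 16.
    if bytes.len() < 18 || !bytes.starts_with(b"\x7fELF") {
        return false;
    }
    let raw = [bytes[16], bytes[17]];
    // e_ident[5] (EI_DATA) gives the byte order: 1 = little, 2 = big endian.
    let e_type = match bytes[5] {
        1 => u16::from_le_bytes(raw),
        2 => u16::from_be_bytes(raw),
        _ => return false,
    };
    e_type == ET_CORE
}
```

A check like this could serve as a cheap sanity test on a generated dump before loading it into a debugger.
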
…he sandbox

- only store the binary path for now

Signed-off-by: Doru Blânzeanu <[email protected]>
…b and lldb using vscode

Signed-off-by: Doru Blânzeanu <[email protected]>
Signed-off-by: Doru Blânzeanu <[email protected]>
Signed-off-by: Doru Blânzeanu <[email protected]>
- Improve documentation and describe how a user can create separate files for debug information
- Change the output file directory to be configurable using HYPERLIGHT_CORE_DUMP_DIR environment variable
- Change output file name to include a timestamp

Signed-off-by: Doru Blânzeanu <[email protected]>
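
A timestamped output name, as described in the commit above, can be sketched with the standard library alone. The PR itself adds chrono for this purpose; the hl_core_<seconds>.elf pattern and the function name core_dump_path below are assumptions for illustration, not the PR's actual naming.

```rust
use std::path::{Path, PathBuf};
use std::time::{SystemTime, UNIX_EPOCH};

/// Builds `<dir>/hl_core_<seconds since the Unix epoch>.elf`.
/// The file name pattern is hypothetical, chosen only for this sketch.
fn core_dump_path(dir: &Path, now: SystemTime) -> PathBuf {
    // A clock set before the epoch falls back to 0 instead of panicking.
    let secs = now
        .duration_since(UNIX_EPOCH)
        .map(|elapsed| elapsed.as_secs())
        .unwrap_or(0);
    dir.join(format!("hl_core_{secs}.elf"))
}
```

Passing the time in as a parameter keeps the function deterministic; a caller would typically supply SystemTime::now().
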
@dblnz dblnz force-pushed the gdb-coredump-elfcore branch 3 times, most recently from 27ba8a0 to ea7dfcc Compare May 22, 2025 09:00
- this allows a user to configure the crash dump feature at sandbox level
- create a SandboxRuntimeConfig struct to contain all the configuration
a sandbox would need at runtime to avoid passing the information as multiple
function arguments
- add unit tests to verify crashdump behavior

Signed-off-by: Doru Blânzeanu <[email protected]>
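
The idea of bundling runtime settings into one struct might look like the sketch below. The field names binary_path and guest_core_dump are assumptions based on names mentioned elsewhere on this page; the real SandboxRuntimeConfig is not shown here and may differ.

```rust
/// Hypothetical shape of the runtime configuration; the real struct's fields
/// are not shown on this page.
#[derive(Debug, Clone)]
struct SandboxRuntimeConfig {
    /// Path of the guest binary, if it was loaded from a file.
    binary_path: Option<String>,
    /// Whether a core dump should be written when the guest crashes.
    guest_core_dump: bool,
}

impl Default for SandboxRuntimeConfig {
    fn default() -> Self {
        // The PR description says core dumps are enabled by default.
        Self {
            binary_path: None,
            guest_core_dump: true,
        }
    }
}

/// Takes one parameter instead of a growing list of separate arguments.
fn should_write_core_dump(config: &SandboxRuntimeConfig) -> bool {
    config.guest_core_dump
}
```

Adding another runtime option later then only touches the struct and the code that reads it, not every function signature along the way.
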
@dblnz dblnz force-pushed the gdb-coredump-elfcore branch from ea7dfcc to 3543cfb Compare May 22, 2025 09:56
Labels
kind/enhancement For PRs adding features, improving functionality, docs, tests, etc.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Improve experience when an unexpected VM exit or crash occurs
3 participants