sha2 crate = runtime error #207

Open
brandonros opened this issue Apr 27, 2025 · 6 comments

Comments

@brandonros

https://github.com/RustCrypto/hashes/blob/master/sha2/Cargo.toml vs https://github.com/brandonros/rust-ed25519-compact

$ cargo run --release -- aa $BLOCKS_PER_GRID $THREADS_PER_BLOCK
   Compiling ed25519_vanity v0.1.0 (/home/brandon/ed25519-vanity-rs)
    Finished `release` profile [optimized] target(s) in 0.90s
     Running `target/release/ed25519_vanity aa 128 128`
Found 1 CUDA devices
Starting device 0
[0] Loading module...
[0] Starting search loop...

thread '<unnamed>' panicked at /home/brandon/.cargo/registry/src/index.crates.io-1949cf8c6b5b557f/cudarc-0.16.0/src/driver/safe/core.rs:470:36:
called `Result::unwrap()` on an `Err` value: DriverError(CUDA_ERROR_LAUNCH_FAILED, "unspecified launch failure")
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace

thread '<unnamed>' panicked at /home/brandon/.cargo/registry/src/index.crates.io-1949cf8c6b5b557f/cudarc-0.16.0/src/driver/safe/core.rs:246:58:
called `Result::unwrap()` on an `Err` value: DriverError(CUDA_ERROR_LAUNCH_FAILED, "unspecified launch failure")
stack backtrace:
   0:     0x55e313ca18e3 - <std::sys::backtrace::BacktraceLock::print::DisplayBacktrace as core::fmt::Display>::fmt::hdbd106d724e72c20
   1:     0x55e313cc3493 - core::fmt::write::h861eecc74abebf7a
   2:     0x55e313c9f003 - std::io::Write::write_fmt::h493b3152b071fba0
   3:     0x55e313ca1732 - std::sys::backtrace::BacktraceLock::print::h71f315c25fc266cb
   4:     0x55e313ca26ca - std::panicking::default_hook::{{closure}}::h8019dc6a2c6c0fe7
   5:     0x55e313ca253a - std::panicking::default_hook::h497f769686a88dd6
   6:     0x55e313ca2fd2 - std::panicking::rust_panic_with_hook::h98fc165e90ef379e
   7:     0x55e313ca2e6a - std::panicking::begin_panic_handler::{{closure}}::h2c1a60d0a908eaec
   8:     0x55e313ca1dd9 - std::sys::backtrace::__rust_end_short_backtrace::he8aba8f9b7ddf304
   9:     0x55e313ca2afd - rust_begin_unwind
  10:     0x55e313cc2010 - core::panicking::panic_fmt::hcbf39f8c1e585f84
  11:     0x55e313cc23a6 - core::result::unwrap_failed::haf1491c6d679786d
  12:     0x55e313c7bc98 - <cudarc::driver::safe::core::CudaEvent as core::ops::drop::Drop>::drop::heca398ef781d7d06
  13:     0x55e313c60323 - core::ptr::drop_in_place<core::option::Option<cudarc::driver::safe::core::CudaEvent>>::h0f0fd586ce97c35f
  14:     0x55e313c601f6 - core::ptr::drop_in_place<cudarc::driver::safe::core::CudaSlice<u8>>::he9b6484b4f5db6c9
  15:     0x55e313c61cae - ed25519_vanity::device_main::h99b65f8ab2f8c263
  16:     0x55e313c65f2b - std::sys::backtrace::__rust_begin_short_backtrace::he6efac01710238fd
  17:     0x55e313c656f1 - core::ops::function::FnOnce::call_once{{vtable.shim}}::h486bd94640b67ff5
  18:     0x55e313ca4d8b - std::sys::pal::unix::thread::Thread::new::thread_start::h20288ab9ea215a81
  19:     0x7fe278c381f5 - <unknown>
  20:     0x7fe278cb889c - <unknown>
  21:                0x0 - <unknown>

thread '<unnamed>' panicked at library/core/src/panicking.rs:226:5:
panic in a destructor during cleanup
thread caused non-unwinding panic. aborting.
Aborted

use sha2::Digest as _;
use ed25519_compact::ge_scalarmult_base;
use rand_core::{SeedableRng, RngCore};
use rand_xorshift::XorShiftRng;
use bs58;

// fails
fn sha512(input: &[u8]) -> [u8; 64] {
    let mut hasher = sha2::Sha512::new();
    hasher.update(input);
    hasher.finalize().into()
}

// works
fn sha512_compact(input: &[u8]) -> [u8; 64] {
    let mut hasher = ed25519_compact::sha512::Hash::new();
    hasher.update(input);
    hasher.finalize()
}

@adamcavendish
Contributor

Hi @brandonros, sha2 has many CPU-specific optimizations (e.g. AVX2), so crates like this cannot be used directly in a CUDA kernel. To use such a crate directly in a kernel, it would need its own implementation for that purpose, gated behind a CUDA-like feature flag.
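
For illustration, gating of the kind described here could look roughly like the sketch below. This is not sha2's actual code: `checksum`, `checksum_portable`, and `checksum_host` are hypothetical names, and the toy checksum only stands in for a real hash. The sketch relies on Rust's `nvptx64-nvidia-cuda` target setting `target_os = "cuda"`, and uses a `cfg` on the target rather than a Cargo feature to keep it self-contained.

```rust
/// Portable implementation: plain integer arithmetic, no CPU intrinsics.
/// (Hypothetical stand-in for a "soft" hash backend.)
fn checksum_portable(data: &[u8]) -> u32 {
    data.iter()
        .fold(0u32, |acc, &b| acc.rotate_left(5) ^ u32::from(b))
}

/// Host-only path. A real crate might dispatch to SIMD code here; this
/// sketch simply reuses the portable code so both paths agree.
#[cfg(not(target_os = "cuda"))]
fn checksum_host(data: &[u8]) -> u32 {
    checksum_portable(data)
}

/// Entry point when compiling for the GPU: only the portable path exists.
#[cfg(target_os = "cuda")]
pub fn checksum(data: &[u8]) -> u32 {
    checksum_portable(data)
}

/// Entry point when compiling for the host.
#[cfg(not(target_os = "cuda"))]
pub fn checksum(data: &[u8]) -> u32 {
    checksum_host(data)
}
```

With this layout, the host-only function is never compiled for the GPU target, so it cannot end up in the kernel.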

@brandonros
Author

I would have guessed the compiler would be able to tell AVX2 was not available and not try to include them.

I believe even with this non-AVX2 implementation (soft) the issue still occurs: https://github.com/RustCrypto/hashes/blob/master/sha2/src/sha512/soft.rs

https://github.com/RustCrypto/hashes/blob/master/sha2/src/sha512.rs#L2-L4

Any suggestions on how to debug exactly what the problem is or tell the compiler those options aren't available? Are you saying host CPU features are accidentally used when compiling with the CUDA GPU compiler?
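
One possible way to check what the compiler believes about the target is sketched below. `cfg!` is evaluated separately for each compilation target, so a helper like the hypothetical `target_report` reflects the settings of whichever crate it is compiled into (GPU or host). It only shows which `cfg`-gated code paths are eligible to be compiled; it does not by itself explain a launch failure.

```rust
/// Returns (label, enabled) pairs describing the target this code was
/// compiled for. Hypothetical helper; uses a fixed-size array so it also
/// works without an allocator.
pub fn target_report() -> [(&'static str, bool); 4] {
    [
        ("target_arch = x86_64", cfg!(target_arch = "x86_64")),
        ("target_arch = nvptx64", cfg!(target_arch = "nvptx64")),
        ("target_os = cuda", cfg!(target_os = "cuda")),
        ("target_feature = avx2", cfg!(target_feature = "avx2")),
    ]
}
```

On the host, the pairs can simply be printed in a loop. Inside a kernel, the values would have to be written to an output buffer instead.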

@jorge-ortega
Collaborator

The error shown here happens at runtime, so I'm assuming that your GPU crate compiled successfully with the nvvm codegen. If so, then there shouldn't be an issue with the use of the SHA crate. You are, however, using cudarc, which is a different crate than the one we maintain here and is where the error originates. In theory, these should be identical bindings to the CUDA driver API, and the PTX generated should be loadable by any program that can load and launch kernels, but I've only ever used the bindings provided through cust to launch kernels compiled by the nvvm backend. If this issue is in how cudarc launches the kernel, then it might be better to open an issue with them so they can help pinpoint why the kernel is failing to launch, and whether it has something to do with the PTX generated from our backend. If you have the same issue launching the kernel with cust, I can look further.

@brandonros
Author

cudarc replaced with cust: brandonros/ed25519-vanity-rs@2b04c7e

The _compact functions (ed25519_compact) work; the non-compact ones (sha2) do not.

PTX:

//
// Generated by NVIDIA NVVM Compiler
//
// Compiler Build ID: CL-35059454
// Cuda compilation tools, release 12.6, V12.6.85
// Based on NVVM 7.0.1
//

.version 8.5
.target sm_61
.address_size 64

        // .globl       find_vanity_private_key

.visible .entry find_vanity_private_key(
        .param .u64 find_vanity_private_key_param_0,
        .param .u64 find_vanity_private_key_param_1,
        .param .u64 find_vanity_private_key_param_2,
        .param .u64 find_vanity_private_key_param_3,
        .param .u64 find_vanity_private_key_param_4,
        .param .u64 find_vanity_private_key_param_5,
        .param .u64 find_vanity_private_key_param_6
)
{



        bar.sync        0;
        bar.sync        0;
        bar.sync        0;
        trap;

}
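
The entry point above contains only barriers followed by an unconditional `trap`, which would plausibly account for the launch failure seen at runtime. In larger PTX dumps, a quick scan can flag such entries. The following is a minimal sketch, not part of any existing tool: `trap_only_entries` is a hypothetical helper, and it is a line-oriented heuristic that assumes the layout shown in this dump (the `.entry` line, then `{` and `}` each on their own line).

```rust
/// Returns the names of `.entry` kernels whose body consists solely of
/// `bar.sync` instructions and at least one `trap`.
/// Hypothetical helper; a heuristic, not a PTX parser.
pub fn trap_only_entries(ptx: &str) -> Vec<String> {
    let mut found = Vec::new();
    let mut current: Option<String> = None; // entry seen, body pending or open
    let mut in_body = false;
    let mut has_trap = false;
    let mut has_other = false;

    for raw in ptx.lines() {
        // Drop `//` comments and surrounding whitespace.
        let line = raw.split("//").next().unwrap_or("").trim();
        if line.is_empty() {
            continue;
        }
        if !in_body {
            if let Some(pos) = line.find(".entry") {
                let rest = line[pos + ".entry".len()..].trim_start();
                let name: String = rest
                    .chars()
                    .take_while(|c| c.is_alphanumeric() || *c == '_' || *c == '$')
                    .collect();
                current = Some(name);
            } else if line == "{" && current.is_some() {
                in_body = true;
                has_trap = false;
                has_other = false;
            }
            continue;
        }
        if line == "}" {
            // End of the body: report it only if nothing but barriers and traps were seen.
            if let Some(name) = current.take() {
                if has_trap && !has_other {
                    found.push(name);
                }
            }
            in_body = false;
            continue;
        }
        for stmt in line.split(';').map(str::trim).filter(|s| !s.is_empty()) {
            if stmt == "trap" {
                has_trap = true;
            } else if !stmt.starts_with("bar.sync") {
                has_other = true;
            }
        }
    }
    found
}
```

Run over the dump above, this heuristic would report `find_vanity_private_key`. Entries with nested brace blocks are treated as having other instructions and are not reported.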

@jorge-ortega
Collaborator

Thanks for the extra context. I'll look further.

@jorge-ortega
Collaborator

Thanks again for all the reports. I won't have as much availability to look into this as I thought, but I will as soon as I can. Someone else should feel free to look further in the meantime.
