This repository has been archived by the owner on Jan 26, 2022. It is now read-only.

CUDA Scalar Mul #17

Merged
177 commits merged on Nov 10, 2020

Conversation


@jon-chuang jon-chuang commented Oct 6, 2020

Exposes a single function, cpu_gpu_scalar_mul:

impl<G: AffineCurve> GPUScalarMulSlice<G> for [G] {
    fn cpu_gpu_scalar_mul(
        &mut self,
        exps_h: &[<<G as AffineCurve>::ScalarField as PrimeField>::BigInt],
        cuda_group_size: usize,
        // size of the batch for cpu scalar mul
        cpu_chunk_size: usize,
    ) {
        if accel::Device::init() && cfg!(feature = "cuda") {
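            // The `cuda` feature is enabled and a device initialized: run the
            // kernel that statically partitions the work between GPU and CPU.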
            <G as AffineCurve>::Projective::cpu_gpu_static_partition_run_kernel(
                self,
                exps_h,
                cuda_group_size,
                cpu_chunk_size,
            );
        } else {
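            // No GPU available: fall back to batched CPU scalar multiplication,
            // processing bases and exponents in chunks of `cpu_chunk_size`.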
            let mut exps_mut = exps_h.to_vec();
            cfg_chunks_mut!(self, cpu_chunk_size)
                .zip(cfg_chunks_mut!(exps_mut, cpu_chunk_size))
                .for_each(|(b, s)| {
                    b[..].batch_scalar_mul_in_place(&mut s[..], 4);
                });
        }
    }
}
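
For illustration, a minimal sketch of how a caller might invoke this; the import path, helper name, and parameter values below are assumptions for the sketch, not taken from this PR:

use algebra_core::{AffineCurve, PrimeField};
// `GPUScalarMulSlice` must also be in scope; its module path is omitted here.

// Hypothetical helper: multiplies each base in `bases` by the matching
// exponent in `exps` in place, dispatching to the GPU kernel when the
// `cuda` feature is enabled and a device is found, and to chunked CPU
// batch scalar multiplication otherwise.
fn scalar_mul_all<G: AffineCurve>(
    bases: &mut [G],
    exps: &[<G::ScalarField as PrimeField>::BigInt],
) {
    // 1 << 5 and 1 << 12 are illustrative values for `cuda_group_size`
    // and `cpu_chunk_size`.
    bases.cpu_gpu_scalar_mul(exps, 1 << 5, 1 << 12);
}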

The majority of the PR lives in the algebra/curves/cuda folder (850 of the 1400 lines of code). sw_projective.rs is deleted, as it is not actually used anywhere and Pratyush mentioned he plans to drop it in a future version of zexe. The rest of the diff is boilerplate implementations for the different curves or moved code (300 loc). There is also a test.

Potential TODOs:

  • Clean up the GPUScalarMul interface and choose what to expose (ctx?). Maybe make GPUScalarMul pub(crate) instead of pub, and have GPUScalarMulSlice be the only pub trait.
  • Alternatively, make a new pub(crate) trait GPUScalarMulInternal and choose what to expose in the public GPUScalarMul interface (I prefer this; see the sketch after this list).
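
Purely to illustrate the second option, a rough sketch; the trait shapes and method names are placeholders rather than proposed signatures, and the import path is assumed:

use algebra_core::{AffineCurve, PrimeField};

// Crate-internal trait holding the low-level kernel plumbing; not exported.
// (Placeholder method, modelled on the call in the snippet above.)
pub(crate) trait GPUScalarMulInternal<G: AffineCurve> {
    fn cpu_gpu_static_partition_run_kernel(
        bases: &mut [G],
        exps_h: &[<G::ScalarField as PrimeField>::BigInt],
        cuda_group_size: usize,
        cpu_chunk_size: usize,
    );
}

// Public trait exposing only what downstream users are meant to call
// (placeholder method name).
pub trait GPUScalarMul<G: AffineCurve> {
    fn scalar_mul(
        bases: &mut [G],
        exps_h: &[<G::ScalarField as PrimeField>::BigInt],
        cuda_group_size: usize,
        cpu_chunk_size: usize,
    );
}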
