Quantization routines use unsigned 32-bit integers to pack quantized data. Since PyTorch doesn't support this datatype, we use NumPy arrays before casting back to a Tensor, which currently limits us to running on the CPU. This restriction means we are forced to move data to/from the CPU when sending/receiving data between stages, rather than using a more efficient direct network transfer to/from GPUs and NICs.
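A minimal sketch of the kind of packing described above, done in NumPy because of the missing `uint32` dtype. The helper names (`pack_uint32`, `unpack_uint32`) and the 4-bit element width are illustrative assumptions, not the actual routines from this codebase:

```python
import numpy as np

def pack_uint32(quantized: np.ndarray, bits: int = 4) -> np.ndarray:
    """Pack low-bit quantized values into uint32 words.

    Hypothetical helper: packs 32 // bits values per word. Assumes
    quantized values already fit in `bits` bits and the array length
    is a multiple of the number of values per word.
    """
    per_word = 32 // bits
    assert quantized.size % per_word == 0
    q = quantized.astype(np.uint32).reshape(-1, per_word)
    shifts = np.arange(per_word, dtype=np.uint32) * bits
    # Shift each value into its slot and OR them together via a sum
    # (slots don't overlap, so sum and bitwise-OR are equivalent here).
    return (q << shifts).sum(axis=1, dtype=np.uint32)

def unpack_uint32(packed: np.ndarray, bits: int = 4) -> np.ndarray:
    """Inverse of pack_uint32: extract the low-bit values from each word."""
    per_word = 32 // bits
    shifts = np.arange(per_word, dtype=np.uint32) * bits
    mask = np.uint32((1 << bits) - 1)
    return ((packed[:, None] >> shifts) & mask).reshape(-1)
```

Because `torch.from_numpy` cannot accept a `uint32` array, the packed buffer would have to be reinterpreted bitwise (e.g. `packed.view(np.int32)`) before becoming a Tensor, and this NumPy round-trip is what pins the operation to the CPU.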