Let's add GPU acceleration with a PyTorch index. Dot product and cosine similarity both reduce to a matrix multiplication, so hardware accelerators are a natural fit here. With 32 GB of VRAM, a single GPU could hold roughly 22 million MiniLM embeddings (384 dimensions at f32 precision, i.e. 1536 bytes per vector).
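As a minimal sketch of what such an index could look like (the class name and API below are hypothetical, not anything already in this repository): normalize the corpus once so that cosine similarity becomes a plain dot product, then compute the full similarity matrix with one matmul and pick the best matches with `torch.topk`.

```python
import torch


class BruteForceGpuIndex:
    """Illustrative flat GPU index; hypothetical API, not the repo's.

    Stores L2-normalized embeddings on the target device so that
    cosine similarity reduces to a single matrix multiplication.
    """

    def __init__(self, embeddings: torch.Tensor, device: str = "cuda"):
        # Normalize once so dot product == cosine similarity.
        self.vectors = torch.nn.functional.normalize(embeddings, dim=1).to(device)

    def search(self, queries: torch.Tensor, k: int = 10):
        q = torch.nn.functional.normalize(queries, dim=1).to(self.vectors.device)
        # (num_queries, dim) @ (dim, num_vectors) -> similarity matrix
        scores = q @ self.vectors.T
        # torch.topk selects the k best matches per query on the GPU.
        top_scores, top_indices = torch.topk(scores, k, dim=1)
        return top_scores, top_indices
```

Passing `device="cpu"` works the same way for testing without a GPU; only the `.to(device)` placement changes.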
I've been implementing and using pretty much the same ideas you're describing, in TensorFlow and in Java.
I've also done the same thing with PyTorch, and covered top-k retrieval, batch processing, dynamic batching, and so on.
If you take a look at my code and agree with the direction I think the implementation should go, I'll contribute to this repository.
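For the batch-processing part, one common pattern (a sketch under assumed names, not the commenter's actual code) is to chunk the queries so the intermediate score matrix never exceeds a fixed number of rows, which bounds peak GPU memory:

```python
import torch


def batched_topk_search(corpus: torch.Tensor, queries: torch.Tensor,
                        k: int = 10, batch_size: int = 1024):
    """Illustrative batched top-k search over a pre-normalized corpus.

    Processes queries in chunks of at most batch_size rows, so the
    (batch_size, num_vectors) score matrix stays within a memory budget.
    """
    all_scores, all_indices = [], []
    for start in range(0, queries.shape[0], batch_size):
        chunk = queries[start:start + batch_size]
        scores = chunk @ corpus.T            # (chunk_rows, num_vectors)
        s, i = torch.topk(scores, k, dim=1)  # best k matches per query
        all_scores.append(s)
        all_indices.append(i)
    return torch.cat(all_scores), torch.cat(all_indices)
```

Dynamic batching would pick `batch_size` from the free VRAM and corpus size rather than hard-coding it.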