@kozistr kozistr commented Sep 10, 2025

Change Log

Related to huggingface/text-embeddings-inference#717.

Thanks to the following resources, I was able to implement these kernels and adapt them to TEI :) (I've also tested that they work.)

  • topk_softmax kernel from vLLM
  • fused MoE kernel from here
    • Qwen3 MoE
    • Nomic MoE
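
For reviewers less familiar with the routing step, here is a minimal reference sketch of what a top-k softmax computes for one token in an MoE layer. It is plain Python, not the CUDA kernel from this PR or from vLLM. The function name, the `renormalize` flag, and the tie-breaking rule are assumptions made for illustration only.

```python
import math


def topk_softmax_reference(logits, k, renormalize=False):
    """Reference semantics only (hypothetical helper, not the kernel's API).

    Applies a softmax over the router logits of one token, then keeps the
    k experts with the largest probabilities.
    """
    # Numerically stable softmax: subtract the max before exponentiating.
    max_logit = max(logits)
    exps = [math.exp(x - max_logit) for x in logits]
    total = sum(exps)
    probs = [e / total for e in exps]

    # Pick the k most probable experts; ties go to the lower index
    # (an assumption here, real kernels may differ).
    indices = sorted(range(len(probs)), key=lambda i: (-probs[i], i))[:k]
    weights = [probs[i] for i in indices]

    # Optionally rescale the kept weights so they sum to 1.
    if renormalize:
        kept = sum(weights)
        weights = [w / kept for w in weights]
    return weights, indices


# One token, four experts, route to the top two.
weights, indices = topk_softmax_reference([1.0, 3.0, 2.0, 0.0], k=2)
```

A reference like this can be useful for checking a fused kernel's output on small inputs, since the selected expert indices and weights should agree up to floating-point tolerance.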

Also, I'm very new to CUDA programming, so any feedback or suggestions would be greatly appreciated 🤗

@Narsil @alvarobartt @EricLBuehler
