Hi @delock, AutoTP is not supported for Bloom because it uses ALiBi. Any model that uses ALiBi is not currently supported with AutoTP. We would like to add support for more models in the future but are not actively working on supporting Bloom at this time. You can still use Bloom with kernel injection.
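For context on the ALiBi point: ALiBi replaces positional embeddings with a fixed, head-specific linear bias added to the attention scores, so each head carries its own slope. Sharding heads across ranks then also has to partition the slope list consistently, which may be part of why ALiBi models need special handling. A minimal sketch of the slope/bias computation (closed form for power-of-two head counts, as in the ALiBi paper; not DeepSpeed code):

```python
import math

def alibi_slopes(n_heads):
    """Per-head slopes for a power-of-two number of heads, e.g. 8 heads -> 1/2 .. 1/256."""
    start = 2.0 ** (-2.0 ** -(math.log2(n_heads) - 3))
    return [start * start ** i for i in range(n_heads)]

def alibi_bias(slope, seq_len):
    """One head's bias on the causal attention scores: -slope * distance to each earlier token."""
    return [[-slope * (i - j) if j <= i else 0.0 for j in range(seq_len)]
            for i in range(seq_len)]
```

Under tensor parallelism, each rank must build the bias only for the head slopes it owns, so the head split and the slope split have to agree.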
Hi @molly-smith, we have recently been enabling LLM inference on CPU and want to use AutoTP to reduce inference latency. In our experiments, AutoTP works well with GPT-J 6B inference, but does not support BLOOM-176B. May I ask why AutoTP cannot support BLOOM? Is it due to BLOOM's unique model architecture or some other reason? Do you plan to support more models, e.g. BLOOM, in AutoTP?
We would appreciate a brief answer from you. We used AutoTP for GPT-J and it's very helpful. We wonder whether we can do the same for BLOOM, or whether we need to implement the kernel injection path in order to use tensor parallelism for BLOOM. Any insight on this would be very helpful to us. Thanks!
Guokai
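For background on what AutoTP automates here: Megatron-style tensor parallelism splits an MLP's first weight matrix by columns and the second by rows, so each rank computes a partial output and a single all-reduce (a sum) restores the full result. A minimal pure-Python sketch of that identity (illustrative shapes and helper names, not DeepSpeed code):

```python
import math

def matmul(A, B):
    """Multiply two matrices represented as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def gelu(row):
    """Exact GELU applied elementwise to one row."""
    return [0.5 * x * (1.0 + math.erf(x / math.sqrt(2.0))) for x in row]

def split_cols(W, parts):
    """Split a weight matrix into `parts` column blocks (one per rank)."""
    n = len(W[0]) // parts
    return [[row[p * n:(p + 1) * n] for row in W] for p in range(parts)]

def split_rows(W, parts):
    """Split a weight matrix into `parts` row blocks (one per rank)."""
    n = len(W) // parts
    return [W[p * n:(p + 1) * n] for p in range(parts)]

def mlp_full(x, W1, W2):
    """Reference single-device MLP: GELU(x @ W1) @ W2."""
    h = [gelu(row) for row in matmul(x, W1)]
    return matmul(h, W2)

def mlp_tensor_parallel(x, W1, W2, world_size=2):
    """Each simulated rank holds a column shard of W1 and the matching row shard of W2."""
    partials = []
    for W1_p, W2_p in zip(split_cols(W1, world_size), split_rows(W2, world_size)):
        # GELU is elementwise, so each rank's hidden shard stays independent.
        h_p = [gelu(row) for row in matmul(x, W1_p)]
        partials.append(matmul(h_p, W2_p))
    # The single all-reduce: sum the per-rank partial outputs.
    return [[sum(p[i][j] for p in partials) for j in range(len(partials[0][0]))]
            for i in range(len(partials[0]))]
```

Attention blocks are sharded the same way, one group of heads per rank, which is where a head-dependent bias like ALiBi adds a constraint that the automatic split has to respect.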