
Support for DeepseekV2ForCausalLM #2340

Closed
tgandrew opened this issue Oct 15, 2024 · 9 comments
Assignees
Labels
feature request (New feature or request), new model, triaged (Issue has been triaged by maintainers)

Comments

@tgandrew

Question

Can we get DeepseekV2 supported?

Code to reproduce

from tensorrt_llm import LLM, SamplingParams


prompts = [
    "Hello, my name is",
    "The president of the United States is",
    "The capital of France is",
    "The future of AI is",
]

sampling_params = SamplingParams(temperature=0.8, top_p=0.95)

llm = LLM(model="deepseek-ai/DeepSeek-Coder-V2-Lite-Base", trust_remote_code=True)

outputs = llm.generate(prompts, sampling_params)

# Print the outputs.
for output in outputs:
    prompt = output.prompt
    generated_text = output.outputs[0].text
    print(f"Prompt: {prompt!r}, Generated text: {generated_text!r}")

Error

Traceback (most recent call last):
  File "/workspace/test.py", line 13, in <module>
    llm = LLM(model="deepseek-ai/DeepSeek-Coder-V2-Lite-Base", trust_remote_code=True)
  File "/usr/local/lib/python3.10/dist-packages/tensorrt_llm/hlapi/llm.py", line 152, in __init__
    raise e
  File "/usr/local/lib/python3.10/dist-packages/tensorrt_llm/hlapi/llm.py", line 147, in __init__
    self._build_model()
  File "/usr/local/lib/python3.10/dist-packages/tensorrt_llm/hlapi/llm.py", line 310, in _build_model
    self._engine_dir = model_loader()
  File "/usr/local/lib/python3.10/dist-packages/tensorrt_llm/hlapi/llm_utils.py", line 1318, in __call__
    return self._build_model()
  File "/usr/local/lib/python3.10/dist-packages/tensorrt_llm/hlapi/llm_utils.py", line 1434, in _build_model
    build_task(self.get_engine_dir())
  File "/usr/local/lib/python3.10/dist-packages/tensorrt_llm/hlapi/llm_utils.py", line 1403, in build_task
    model_loader(engine_dir=engine_dir)
  File "/usr/local/lib/python3.10/dist-packages/tensorrt_llm/hlapi/llm_utils.py", line 1000, in __call__
    pipeline()
  File "/usr/local/lib/python3.10/dist-packages/tensorrt_llm/hlapi/llm_utils.py", line 942, in __call__
    self.step_forward()
  File "/usr/local/lib/python3.10/dist-packages/tensorrt_llm/hlapi/llm_utils.py", line 971, in step_forward
    self.step_handlers[self.counter]()
  File "/usr/local/lib/python3.10/dist-packages/tensorrt_llm/hlapi/llm_utils.py", line 1116, in _load_model_from_hf
    model_cls = AutoModelForCausalLM.get_trtllm_model_class(
  File "/usr/local/lib/python3.10/dist-packages/tensorrt_llm/models/automodel.py", line 54, in get_trtllm_model_class
    raise NotImplementedError(
NotImplementedError: The given huggingface model architecture DeepseekV2ForCausalLM is not supported in TRT-LLM yet
@fengyang95

@tgandrew FYI: #1758

@Superjomn added the feature request and triaged labels on Oct 16, 2024
@dominicshanshan
Collaborator

@fengyang95, a little update as promised: CI passes and code review is ongoing. I will add some benchmark data and analysis to the README. The MR is pretty large, so it took a little longer; I will update the status on Monday. Thanks for supporting TRT-LLM!

@fengyang95

nice work!!!

@dominicshanshan self-assigned this on Oct 25, 2024
@dominicshanshan
Collaborator

A little update on Friday: we are rebasing the code and may merge into the main branch next week... fingers crossed. Thanks!

@dominicshanshan
Collaborator

@fengyang95, thanks for your attention to TensorRT-LLM. DeepSeek-V2 is live in the main branch and will be officially announced in v0.15, so I will close this issue now. Feel free to file any suggestions or issues you hit when using TensorRT-LLM.

@fengyang95

@dominicshanshan Nice work!!! Looking forward to support for fp8 and cc8.9 (e.g., L40) since our GPU resources are relatively limited.

@dominicshanshan
Collaborator

Message received; working on FP8 and sm89. sm89 support should be available soon, and I will update the status once it passes our internal tests.

@Pernekhan

@dominicshanshan, really appreciate your work on DeepSeek.

Is the FP8 support getting closer?

@dominicshanshan
Collaborator

Yes, we are working on it; it should be enabled soon.

Projects
None yet
Development

No branches or pull requests

6 participants