Move additional tips/notes to the top
DarkLight1337 committed Oct 24, 2024
1 parent 8a02cd0 commit c7ea3cb
Showing 1 changed file with 41 additions and 48 deletions.
docs/source/models/supported_models.rst — 41 additions, 48 deletions
@@ -3,10 +3,47 @@
Supported Models
================

- vLLM supports a variety of generative Transformer models in `HuggingFace (HF) Transformers <https://huggingface.co/models>`_.
- The following is the list of model architectures that are currently supported by vLLM.
vLLM supports a variety of generative and embedding models from `HuggingFace (HF) Transformers <https://huggingface.co/models>`_.
This page lists the model architectures that are currently supported by vLLM.
Alongside each architecture, we include some popular models that use it.

For other models, you can check the :code:`config.json` file inside the model repository.
If the :code:`"architectures"` field contains a model architecture listed below, then it should be supported in theory.
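
For example, you can inspect this field programmatically. The following is a minimal sketch using :code:`transformers.AutoConfig`; the model name is only an illustrative placeholder:

.. code-block:: python

    from transformers import AutoConfig

    # Hypothetical example repository; substitute the model you want to check.
    config = AutoConfig.from_pretrained("Qwen/Qwen2.5-7B-Instruct")
    print(config.architectures)  # e.g. ['Qwen2ForCausalLM']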

.. tip::
    The easiest way to check if your model is really supported at runtime is to run the program below:

    .. code-block:: python

        from vllm import LLM

        llm = LLM(model=...)  # Name or path of your model
        output = llm.generate("Hello, my name is")
        print(output)

    If vLLM successfully generates text, it indicates that your model is supported.

Otherwise, please refer to :ref:`Adding a New Model <adding_a_new_model>` and :ref:`Enabling Multimodal Inputs <enabling_multimodal_inputs>`
for instructions on how to implement your model in vLLM.
Alternatively, you can `open an issue on GitHub <https://github.com/vllm-project/vllm/issues/new/choose>`_ to request vLLM support.

.. note::
    To use models from `ModelScope <https://www.modelscope.cn>`_ instead of HuggingFace Hub, set an environment variable:

    .. code-block:: shell

        $ export VLLM_USE_MODELSCOPE=True

    And use it with :code:`trust_remote_code=True`:

    .. code-block:: python

        from vllm import LLM

        llm = LLM(model=..., revision=..., trust_remote_code=True)  # Name or path of your model
        output = llm.generate("Hello, my name is")
        print(output)

Text-only Language Models
^^^^^^^^^^^^^^^^^^^^^^^^^

@@ -144,7 +181,7 @@ Text Generation
- ✅︎
* - :code:`JAISLMHeadModel`
- Jais
- - :code:`inceptionai/jais-13b`, :code:`inceptionai/jais-13b-chat`, :code:`inceptionai/jais-30b-v3`, :code:`inceptionai/jais-30b-chat-v3`, etc.
- :code:`core42/jais-13b`, :code:`core42/jais-13b-chat`, :code:`core42/jais-30b-v3`, :code:`core42/jais-30b-chat-v3`, etc.
-
- ✅︎
* - :code:`JambaForCausalLM`
@@ -425,7 +462,7 @@ Text Generation
-
* - :code:`MolmoForCausalLM`
- Molmo
- - T + I
- Image
- :code:`allenai/Molmo-7B-D-0924`, :code:`allenai/Molmo-72B-0924`, etc.
-
- ✅︎
@@ -459,12 +496,6 @@ Text Generation
- :code:`Qwen/Qwen-VL`, :code:`Qwen/Qwen-VL-Chat`, etc.
-
- ✅︎
- * - :code:`Qwen2AudioForConditionalGeneration`
- - Qwen2-Audio
- - T + A\ :sup:`+`
- - :code:`Qwen/Qwen2-Audio-7B-Instruct`
- -
- - ✅︎
* - :code:`Qwen2VLForConditionalGeneration`
- Qwen2-VL
- T + I\ :sup:`E+` + V\ :sup:`+`
@@ -515,44 +546,6 @@ Multimodal Embedding
Some model architectures support both generation and embedding tasks.
In this case, you have to pass :code:`--task embedding` to run the model in embedding mode.
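
For example, a hedged offline sketch of embedding mode — this assumes the :code:`task` keyword argument of :code:`LLM` mirrors the :code:`--task` CLI flag, and uses :code:`intfloat/e5-mistral-7b-instruct` purely as an illustrative embedding model:

.. code-block:: python

    from vllm import LLM

    # Assumption: task="embedding" is the offline equivalent of --task embedding.
    llm = LLM(model="intfloat/e5-mistral-7b-instruct", task="embedding")

    (output,) = llm.encode("Hello, my name is")
    print(len(output.outputs.embedding))  # dimensionality of the embedding vector

When serving the model, the same flag is passed on the server command line instead.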

- ----
-
- If your model uses one of the above model architectures, you can seamlessly run your model with vLLM.
- Otherwise, please refer to :ref:`Adding a New Model <adding_a_new_model>` and :ref:`Enabling Multimodal Inputs <enabling_multimodal_inputs>`
- for instructions on how to implement support for your model.
- Alternatively, you can raise an issue on our `GitHub <https://github.com/vllm-project/vllm/issues>`_ project.
-
- .. tip::
-     The easiest way to check if your model is supported is to run the program below:
-
-     .. code-block:: python
-
-         from vllm import LLM
-
-         llm = LLM(model=...)  # Name or path of your model
-         output = llm.generate("Hello, my name is")
-         print(output)
-
-     If vLLM successfully generates text, it indicates that your model is supported.
-
- .. tip::
-     To use models from `ModelScope <https://www.modelscope.cn>`_ instead of HuggingFace Hub, set an environment variable:
-
-     .. code-block:: shell
-
-         $ export VLLM_USE_MODELSCOPE=True
-
-     And use with :code:`trust_remote_code=True`.
-
-     .. code-block:: python
-
-         from vllm import LLM
-
-         llm = LLM(model=..., revision=..., trust_remote_code=True)  # Name or path of your model
-         output = llm.generate("Hello, my name is")
-         print(output)

Model Support Policy
=====================
