Skip to content

Latest commit

 

History

History
59 lines (53 loc) · 2.49 KB

README.md

File metadata and controls

59 lines (53 loc) · 2.49 KB

AI Singapore SEA-LION model served by vLLM inference server with Docker Compose

Requirements

SEA-LION

This section describes the setup of the SEA-LION models.

SEA-LION v2.1

  • Download LLaMA3 8B CPT SEA-LIONv2.1 Instruct.
  • Copy the model or add a symbolic link in the models directory. The path is ./models/llama3-8b-cpt-sea-lionv2.1-instruct. For example, if the model was downloaded to ~/downloads/llama3-8b-cpt-sea-lionv2.1-instruct, the symbolic link is added by:
    ln -s ~/downloads/llama3-8b-cpt-sea-lionv2.1-instruct models/

SEA-LION v3

  • Download Gemma2 9B CPT SEA-LIONv3 Instruct.
  • Copy the model or add a symbolic link in the models directory. The path is ./models/gemma2-9b-cpt-sea-lionv3-instruct. For example, if the model was downloaded to ~/downloads/gemma2-9b-cpt-sea-lionv3-instruct, the symbolic link is added by:
    ln -s ~/downloads/gemma2-9b-cpt-sea-lionv3-instruct models/
  • Set MODEL_NAME.
     export MODEL_NAME=gemma2-9b-cpt-sea-lionv3-instruct

Start vLLM

  • Start the service.
    docker compose up
  • vLLM is deployed as a server that implements the OpenAI API protocol. By default, it starts the server at http://localhost:8000. This server can be queried in the same format as OpenAI API. For example, list the models:
    curl http://localhost:8000/v1/models
  • Test the service. Update the model name accordingly.
    curl http://localhost:8000/v1/completions \
      -H "Content-Type: application/json" \
      -d '{
          "model": "llama3-8b-cpt-sea-lionv2.1-instruct",
          "prompt": "Artificial Intelligence is",
          "max_tokens": 20,
          "temperature": 0.8,
          "repetition_penalty": 1.2
      }'

Customisation

  • To use another model:
    • Download the model to the models directory.
    • Update the $MODEL_NAME environment variable. For example, if the model is downloaded to ./models/foo-model-30b:
      export MODEL_NAME=foo-model-30b