A Helm chart, that allows your easy way to deploy anything-llm. But also allows you to deploy anything-llm with different components like chromadb, nvidia-device-plugin, ollama, and more.
Name | Url | |
---|---|---|
la-cc | [email protected] | https://github.com/la-cc |
deB4SH | https://github.com/deB4SH |
Thanks to the work of Mintplex-Labs for creating anything-llm! If you like it, feel free to leave a ⭐️ on the anything-llm or contribute to the project or booth!
This chart allows you to deploy Anything-LLM on a Kubernetes cluster using the Helm package manager. Anything-LLM is a versatile API that can be used to interact with various language models, embedding models, and vector databases.
To get an idea, here is a visual representation of a simplified architecture:
The full list of supported LLMs, Vector DBs and Embedder can be found under Supported LLMs, Embedder Models, Speech models, and Vector Databases
The easiest way to start with anything-llm is to use the default components with OpenAI API like:
This is how the ui looks like after deploying the easiest way:
To install the chart with the release name anything-llm
:
$ helm repo add anything-llm https://la-cc.github.io/anything-llm-helm-chart
$ helm repo update
$ helm install anything-llm anything-llm/anything-llm
Or if you like you can also template the manifest and apply it directly:
Note: Don't template the secret, its not recommended to store the secret in the manifest. Its just for demonstration purposes and to keep the process simple. Please create a secret and reference it in the
values.yaml
.
$ helm template anything-llm anything-llm/anything-llm -f values.yaml | kubectl apply -f -
The next section "Requirements" is only required, if you want replace anyrhing-llm components like llm, embedded, vector db with your own components. If you want to use the default components, you can skip the next section.
Repository | Name | Version |
---|---|---|
https://amikos-tech.github.io/chromadb-chart/ | chromadb | 0.1.20 |
https://la-cc.github.io/nvidia-device-plugin-helm-chart | nvidia-device-plugin | 0.3.9 |
https://otwld.github.io/ollama-helm/ | ollama | 0.61.0 |
Key | Type | Default | Description |
---|---|---|---|
chromadb.chromadb.auth.enabled | bool | false |
|
chromadb.enabled | bool | false |
|
chromadb.service.type | string | "ClusterIP" |
|
config | object | {"EMBEDDING_MODEL_MAX_CHUNK_LENGTH":"8192","EMBEDDING_MODEL_PREF":"nomic-embed-text:1.5","STORAGE_DIR":"/app/server/storage","TTS_PROVIDER":"native","VECTOR_DB":"lancedb","WHISPER_PROVIDER":"local"} |
Configuration for the application. |
config.EMBEDDING_MODEL_PREF | string | "nomic-embed-text:1.5" |
Configuration for the embedding model. |
config.VECTOR_DB | string | "lancedb" |
Configuration for the vector db like lanceDB (in storage) or chroma DB (external), etc. |
fullnameOverride | string | "anything-llm" |
Override the full name of the chart. |
image | object | {"pullPolicy":"IfNotPresent","repository":"ghcr.io/mintplex-labs/anything-llm","tag":"1.2.2"} |
Configuration for the Docker image used by the pod. |
image.pullPolicy | string | "IfNotPresent" |
The pull policy for the image. IfNotPresent means the image will only be pulled if it is not already present locally. |
image.repository | string | "ghcr.io/mintplex-labs/anything-llm" |
The Docker repository to pull the image from. |
image.tag | string | "1.2.2" |
The specific tag of the image to use. |
ingress | object | {"annotations":{"cert-manager.io/cluster-issuer":"letsencrypt-dns","cert-manager.io/renew-before":"360h","nginx.ingress.kubernetes.io/rewrite-target":"/"},"enabled":true,"hosts":[{"host":"llm.example.com","paths":[{"path":"/","pathType":"Prefix"}]}],"ingressClassName":"nginx","tls":[{"hosts":["llm.example.com"],"secretName":"anything-llm-tls"}]} |
Ingress configuration. |
ingress.annotations | object | {"cert-manager.io/cluster-issuer":"letsencrypt-dns","cert-manager.io/renew-before":"360h","nginx.ingress.kubernetes.io/rewrite-target":"/"} |
Ingress annotations. |
ingress.enabled | bool | true |
Enable ingress. |
ingress.hosts | list | [{"host":"llm.example.com","paths":[{"path":"/","pathType":"Prefix"}]}] |
Ingress hosts. |
ingress.ingressClassName | string | "nginx" |
Ingress class name. |
ingress.tls | list | [{"hosts":["llm.example.com"],"secretName":"anything-llm-tls"}] |
TLS configuration for ingress. |
nvidia-device-plugin.enabled | bool | false |
|
nvidia-device-plugin.fullnameOverride | string | "nvidia-device-plugin" |
|
nvidia-device-plugin.resources.limits."nvidia.com/gpu" | int | 1 |
|
nvidia-device-plugin.tolerations[0].effect | string | "NoSchedule" |
|
nvidia-device-plugin.tolerations[0].key | string | "nvidia.com/gpu" |
|
nvidia-device-plugin.tolerations[0].operator | string | "Exists" |
|
ollama.autoscaling.enabled | bool | false |
|
ollama.autoscaling.maxReplicas | int | 1 |
|
ollama.enabled | bool | false |
|
ollama.fullnameOverride | string | "ollama" |
|
ollama.image.repository | string | "ollama/ollama" |
|
ollama.image.tag | string | "0.3.12" |
|
ollama.ollama.gpu.enabled | bool | true |
|
ollama.ollama.models[0] | string | "gemma2" |
|
ollama.ollama.number | int | 1 |
|
ollama.ollama.type | string | "nvidia" |
|
ollama.persistentVolume.accessModes[0] | string | "ReadWriteOnce" |
|
ollama.persistentVolume.enabled | bool | true |
|
ollama.persistentVolume.size | string | "50Gi" |
|
ollama.persistentVolume.storageClass | string | "" |
|
ollama.tolerations[0].effect | string | "NoSchedule" |
|
ollama.tolerations[0].key | string | "ai" |
|
ollama.tolerations[0].operator | string | "Equal" |
|
ollama.tolerations[0].value | string | "true" |
|
persistence | object | {"accessMode":"ReadWriteOnce","enabled":true,"size":"10Gi","volumes":[{"mountPath":"/app/server/storage","name":"server-storage"}]} |
Persistence configuration. |
persistence.accessMode | string | "ReadWriteOnce" |
Access mode for the persistent volume. |
persistence.enabled | bool | true |
Enable persistence. |
persistence.size | string | "10Gi" |
Size of the persistent volume. |
persistence.volumes | list | [{"mountPath":"/app/server/storage","name":"server-storage"}] |
List of volumes to create. |
replicaCount | int | 1 |
Number of pod replicas to deploy. |
secret | object | {"data":{"AUTH_TOKEN":"replace-me","JWT_SECRET":"replace-me"},"enabled":true,"name":""} |
Secret configuration. |
secret.data | object | {"AUTH_TOKEN":"replace-me","JWT_SECRET":"replace-me"} |
Secret data. |
secret.enabled | bool | true |
Enable secrets. |
secret.name | string | "" |
Name of the secret, if not set, a secret is generated. |
service | object | {"port":3001,"type":"ClusterIP"} |
Service configuration. |
service.port | int | 3001 |
Service port. |
service.type | string | "ClusterIP" |
Service type. |
Autogenerated from chart metadata using helm-docs v1.14.2
Configuration | Example Value | Description |
---|---|---|
SERVER_PORT |
3001 |
Port on which the server will run. |
STORAGE_DIR |
"/app/server/storage" |
Directory for storing application data. |
UID |
1000 |
User ID for running the application. |
GID |
1000 |
Group ID for running the application. |
SIG_KEY |
'passphrase' |
Passphrase for signing (requires at least 32 characters). |
SIG_SALT |
'salt' |
Salt for signing (requires at least 32 characters). |
JWT_SECRET |
"my-random-string-for-seeding" |
Secret for JWT authentication (requires at least 12 characters). |
Configuration | Example Value | Description |
---|---|---|
LLM_PROVIDER |
'openai' |
Provider for the LLM API. |
OPEN_AI_KEY |
sk-xxxx |
API key for OpenAI. |
OPEN_MODEL_PREF |
'gpt-4o' |
Preferred OpenAI model. |
GEMINI_API_KEY |
sk-gemini-xxxx |
API key for Gemini. |
GEMINI_LLM_MODEL_PREF |
'gemini-pro' |
Preferred Gemini model. |
AZURE_OPENAI_ENDPOINT |
https://replace-me.openai.azure.com/ |
Azure OpenAI endpoint. |
AZURE_OPENAI_KEY |
50f.. |
API key for Azure OpenAI. |
ANTHROPIC_API_KEY |
sk-ant-xxxx |
API key for Anthropic. |
ANTHROPIC_MODEL_PREF |
'claude-2' |
Preferred Anthropic model. |
LMSTUDIO_BASE_PATH |
'http://your-server:1234/v1' |
Base path for LMStudio API. |
LMSTUDIO_MODEL_PREF |
'Loaded from Chat UI' |
Preferred LMStudio model. |
LMSTUDIO_MODEL_TOKEN_LIMIT |
4096 |
Token limit for LMStudio model. |
LOCAL_AI_BASE_PATH |
'http://host.docker.internal:8080/v1' |
Base path for Local AI API. |
LOCAL_AI_MODEL_PREF |
'luna-ai-llama2' |
Preferred Local AI model. |
LOCAL_AI_MODEL_TOKEN_LIMIT |
4096 |
Token limit for Local AI model. |
LOCAL_AI_API_KEY |
"sk-123abc" |
API key for Local AI. |
OLLAMA_BASE_PATH |
'http://host.docker.internal:11434' |
Base path for Ollama API. |
OLLAMA_MODEL_PREF |
'llama2' |
Preferred Ollama model. |
OLLAMA_MODEL_TOKEN_LIMIT |
4096 |
Token limit for Ollama model. |
TOGETHER_AI_API_KEY |
'my-together-ai-key' |
API key for Together AI. |
TOGETHER_AI_MODEL_PREF |
'mistralai/Mixtral-8x7B-Instruct-v0.1' |
Preferred Together AI model. |
MISTRAL_API_KEY |
'example-mistral-ai-api-key' |
API key for Mistral. |
MISTRAL_MODEL_PREF |
'mistral-tiny' |
Preferred Mistral model. |
PERPLEXITY_API_KEY |
'my-perplexity-key' |
API key for Perplexity. |
PERPLEXITY_MODEL_PREF |
'codellama-34b-instruct' |
Preferred Perplexity model. |
OPENROUTER_API_KEY |
'my-openrouter-key' |
API key for OpenRouter. |
OPENROUTER_MODEL_PREF |
'openrouter/auto' |
Preferred OpenRouter model. |
HUGGING_FACE_LLM_ENDPOINT |
https://uuid-here.us-east-1.aws.endpoints.huggingface.cloud |
Endpoint for Hugging Face LLM. |
HUGGING_FACE_LLM_API_KEY |
hf_xxxxxx |
API key for Hugging Face LLM. |
HUGGING_FACE_LLM_TOKEN_LIMIT |
8000 |
Token limit for Hugging Face LLM. |
GROQ_API_KEY |
gsk_abcxyz |
API key for Groq. |
GROQ_MODEL_PREF |
'llama3-8b-8192' |
Preferred Groq model. |
KOBOLD_CPP_BASE_PATH |
'http://127.0.0.1:5000/v1' |
Base path for KoboldCPP API. |
KOBOLD_CPP_MODEL_PREF |
'koboldcpp/codellama-7b-instruct.Q4_K_S' |
Preferred KoboldCPP model. |
KOBOLD_CPP_MODEL_TOKEN_LIMIT |
4096 |
Token limit for KoboldCPP model. |
TEXT_GEN_WEB_UI_BASE_PATH |
'http://127.0.0.1:5000/v1' |
Base path for TextGenWebUI API. |
TEXT_GEN_WEB_UI_TOKEN_LIMIT |
4096 |
Token limit for TextGenWebUI model. |
TEXT_GEN_WEB_UI_API_KEY |
"sk-123abc" |
API key for TextGenWebUI. |
GENERIC_OPEN_AI_BASE_PATH |
'http://proxy.url.openai.com/v1' |
Base path for Generic OpenAI API. |
GENERIC_OPEN_AI_MODEL_PREF |
'gpt-3.5-turbo' |
Preferred Generic OpenAI model. |
GENERIC_OPEN_AI_MODEL_TOKEN_LIMIT |
4096 |
Token limit for Generic OpenAI model. |
GENERIC_OPEN_AI_API_KEY |
"sk-123abc" |
API key for Generic OpenAI. |
LITE_LLM_MODEL_PREF |
'gpt-3.5-turbo' |
Preferred LiteLLM model. |
LITE_LLM_MODEL_TOKEN_LIMIT |
4096 |
Token limit for LiteLLM model. |
LITE_LLM_BASE_PATH |
'http://127.0.0.1:4000' |
Base path for LiteLLM API. |
LITE_LLM_API_KEY |
"sk-123abc" |
API key for LiteLLM. |
COHERE_API_KEY |
"" |
API key for Cohere. |
COHERE_MODEL_PREF |
'command-r' |
Preferred Cohere model. |
Configuration | Example Value | Description |
---|---|---|
EMBEDDING_ENGINE |
'openai' |
Embedding engine to use. |
EMBEDDING_MODEL_PREF |
'text-embedding-ada-002' |
Preferred embedding model. |
EMBEDDING_MODEL_MAX_CHUNK_LENGTH |
8192 |
Maximum chunk length for embedding model. |
EMBEDDING_BASE_PATH |
'http://localhost:8080/v1' |
Base path for embedding API. |
GENERIC_OPEN_AI_EMBEDDING_API_KEY |
"sk-123abc" |
API key for Generic OpenAI Embedding. |
Configuration | Example Value | Description |
---|---|---|
VECTOR_DB |
'chroma' |
Vector database to use. |
CHROMA_ENDPOINT |
'http://host.docker.internal:8000' |
Endpoint for Chroma database. |
CHROMA_API_HEADER |
"X-Api-Key" |
API header for Chroma database. |
CHROMA_API_KEY |
"sk-123abc" |
API key for Chroma database. |
PINECONE_API_KEY |
"" |
API key for Pinecone database. |
PINECONE_INDEX |
"" |
Index for Pinecone database. |
WEAVIATE_ENDPOINT |
'http://localhost:8080' |
Endpoint for Weaviate database. |
WEAVIATE_API_KEY |
"" |
API key for Weaviate database. |
QDRANT_ENDPOINT |
'http://localhost:6333' |
Endpoint for Qdrant database. |
QDRANT_API_KEY |
"" |
API key for Qdrant database. |
MILVUS_ADDRESS |
'http://localhost:19530' |
Address for Milvus database. |
MILVUS_USERNAME |
"" |
Username for Milvus database. |
MILVUS_PASSWORD |
"" |
Password for Milvus database. |
ZILLIZ_ENDPOINT |
'https://sample.api.gcp-us-west1.zillizcloud.com' |
Endpoint for Zilliz database. |
ZILLIZ_API_TOKEN |
'api-token-here' |
API token for Zilliz database. |
ASTRA_DB_APPLICATION_TOKEN |
"" |
Application token for Astra DB. |
ASTRA_DB_ENDPOINT |
"" |
Endpoint for Astra DB. |
Configuration | Example Value | Description |
---|---|---|
WHISPER_PROVIDER |
'local' |
Provider for Whisper model. |
OPEN_AI_KEY |
sk-xxxxxxx |
API key for OpenAI (for Whisper model). |
Configuration | Example Value | Description |
---|---|---|
TTS_PROVIDER |
'native' |
Provider for TTS (Text-to-Speech). |
TTS_OPEN_AI_KEY |
sk-example |
API key for OpenAI (for TTS model). |
TTS_OPEN_AI_VOICE_MODEL |
'nova' |
Preferred OpenAI TTS voice model. |
TTS_ELEVEN_LABS_KEY |
"" |
API key for Eleven Labs (for TTS model). |
TTS_ELEVEN_LABS_VOICE_MODEL |
'21m00Tcm4TlvDq8ikWAM' |
Preferred Eleven Labs TTS voice model (e.g., Rachel). |
Configuration | Example Value | Description |
---|---|---|
AUTH_TOKEN |
"hunter2" |
Password for your application if remote hosting. |
DISABLE_TELEMETRY |
"false" |
Disable telemetry if set to true. |
Configuration | Example Value | Description |
---|---|---|
PASSWORDMINCHAR |
8 |
Minimum password length. |
PASSWORDMAXCHAR |
250 |
Maximum password length. |
PASSWORDLOWERCASE |
1 |
Minimum number of lowercase letters in the password. |
PASSWORDUPPERCASE |
1 |
Minimum number of uppercase letters in the password. |
PASSWORDNUMERIC |
1 |
Minimum number of numeric digits in the password. |
PASSWORDSYMBOL |
1 |
Minimum number of symbols in the password. |
PASSWORDREQUIREMENTS |
4 |
Total number of password requirements to be met. |
Configuration | Example Value | Description |
---|---|---|
ENABLE_HTTPS |
"true" |
Enable HTTPS server. |
HTTPS_CERT_PATH |
"sslcert/cert.pem" |
Path to the SSL certificate. |
HTTPS_KEY_PATH |
"sslcert/key.pem" |
Path to the SSL key. |
Configuration | Example Value | Description |
---|---|---|
AGENT_GSE_KEY |
"" |
API key for Google Search. |
AGENT_GSE_CTX |
"" |
Context key for Google Search. |
AGENT_SERPER_DEV_KEY |
"" |
API key for Serper.dev. |
AGENT_BING_SEARCH_API_KEY |
"" |
API key for Bing Search. |
AGENT_SERPLY_API_KEY |
"" |
API key for Serply.io. |
AGENT_SEARXNG_API_URL |
"" |
API URL for SearXNG. |