Gateway change resources (#816)
srihari-tf authored Nov 21, 2024
1 parent 55fcd98 commit ef7d0ce
Showing 2 changed files with 81 additions and 75 deletions.
120 changes: 62 additions & 58 deletions charts/tfy-llm-gateway/README.md
@@ -5,61 +5,65 @@ LLM-Gateway Helm Chart

 ### Configuration for LLM Gateway

-| Name | Description | Value |
-| -------------------------------------------- | ---------------------------------- | ------------------------------------------------- |
-| `global` | Truefoundry global values | `{}` |
-| `image.repository` | Image repository for tfyLLMGateway | `tfy.jfrog.io/tfy-private-images/tfy-llm-gateway` |
-| `image.tag` | Image tag for the tfyLLMGateway | `510cb55e8ff708cbc0b0fbdf02ea9b104bbdc846` |
-| `replicaCount` | Number of replicas | `1` |
-| `environmentName` | The environment name | `default` |
-| `envSecretName` | The environment secret name | `tfy-llm-gateway-env-secret` |
-| `imagePullPolicy` | Image pull policy | `IfNotPresent` |
-| `nameOverride` | Name override | `""` |
-| `fullnameOverride` | Fullname override | `""` |
-| `podAnnotations` | Pod annotations | `{}` |
-| `podSecurityContext` | Pod security context | `{}` |
-| `commonLabels` | Common labels | `{}` |
-| `securityContext` | Security context configuration | `{}` |
-| `healthcheck.enabled` | Enable healthcheck | `true` |
-| `healthcheck.readiness.port` | Port to probe | `8787` |
-| `healthcheck.readiness.path` | Path to probe | `/` |
-| `healthcheck.readiness.initialDelaySeconds` | Initial delay in seconds | `10` |
-| `healthcheck.readiness.periodSeconds` | Period in seconds | `10` |
-| `healthcheck.readiness.timeoutSeconds` | Timeout in seconds | `5` |
-| `healthcheck.readiness.successThreshold` | Success threshold | `1` |
-| `healthcheck.readiness.failureThreshold` | Failure threshold | `3` |
-| `healthcheck.liveness.port` | Port to probe | `8787` |
-| `healthcheck.liveness.path` | Path to probe | `/` |
-| `resources.limits.cpu` | CPU limit | `2` |
-| `resources.limits.memory` | Memory limit | `1024Mi` |
-| `resources.limits.ephemeral-storage` | Ephemeral storage limit | `512Mi` |
-| `resources.requests.cpu` | CPU request | `1` |
-| `resources.requests.memory` | Memory request | `512Mi` |
-| `resources.requests.ephemeral-storage` | Ephemeral storage request | `256Mi` |
-| `nodeSelector` | Node selector | `{}` |
-| `tolerations` | Tolerations | `{}` |
-| `affinity` | Affinity | `{}` |
-| `topologySpreadConstraints` | Topology spread constraints | `{}` |
-| `ingress.enabled` | Enable ingress configuration | `false` |
-| `ingress.annotations` | Ingress annotations | `{}` |
-| `ingress.labels` | Ingress labels | `{}` |
-| `ingress.ingressClassName` | Ingress class name | `istio` |
-| `ingress.tls` | Ingress TLS configuration | `[]` |
-| `ingress.hosts` | Ingress hosts | `[]` |
-| `istio.virtualservice.enabled` | Enable virtual service | `false` |
-| `istio.virtualservice.annotations` | Virtual service annotations | `{}` |
-| `istio.virtualservice.gateways` | Virtual service gateways | `[]` |
-| `istio.virtualservice.hosts` | Virtual service hosts | `[]` |
-| `service.type` | Service type | `ClusterIP` |
-| `service.port` | Service port | `8787` |
-| `service.annotations` | Service annotations | `{}` |
-| `serviceAccount.create` | Create service account | `true` |
-| `serviceAccount.annotations` | Service account annotations | `{}` |
-| `serviceAccount.name` | Service account name | `tfy-llm-gateway` |
-| `extraVolumes` | Extra volumes | `[]` |
-| `extraVolumeMounts` | Extra volume mounts | `[]` |
-| `rbac.enabled` | Enable rbac | `true` |
-| `autoscaling.enabled` | Enable autoscaling | `false` |
-| `autoscaling.minReplicas` | Minimum number of replicas | `3` |
-| `autoscaling.maxReplicas` | Maximum number of replicas | `100` |
-| `autoscaling.targetCPUUtilizationPercentage` | Target CPU utilization percentage | `60` |
+| Name | Description | Value |
+| -------------------------------------------- | ------------------------------------------ | ------------------------------------------------- |
+| `global.controlPlaneURL` | Control plane URL | `""` |
+| `global.truefoundryReleaseName` | Truefoundry release name | `truefoundry` |
+| `global.llmGatewayInfra.enabled` | Bool if llm gateway infra is enabled | `false` |
+| `global.llmGatewayInfra.releaseName` | Release name for the tfy-llm-gateway-infra | `tfy-llm-gateway-infra` |
+| `global.llmGatewayInfra.natsAdminPassword` | NATS admin password | `""` |
+| `image.repository` | Image repository for tfyLLMGateway | `tfy.jfrog.io/tfy-private-images/tfy-llm-gateway` |
+| `image.tag` | Image tag for the tfyLLMGateway | `510cb55e8ff708cbc0b0fbdf02ea9b104bbdc846` |
+| `fullnameOverride` | Full name override for the tfy-llm-gateway | `""` |
+| `replicaCount` | Number of replicas | `1` |
+| `environmentName` | The environment name | `default` |
+| `envSecretName` | The environment secret name | `tfy-llm-gateway-env-secret` |
+| `imagePullPolicy` | Image pull policy | `IfNotPresent` |
+| `nameOverride` | Name override | `""` |
+| `podAnnotations` | Pod annotations | `{}` |
+| `podSecurityContext` | Pod security context | `{}` |
+| `commonLabels` | Common labels | `{}` |
+| `securityContext` | Security context configuration | `{}` |
+| `healthcheck.enabled` | Enable healthcheck | `true` |
+| `healthcheck.readiness.port` | Port to probe | `8787` |
+| `healthcheck.readiness.path` | Path to probe | `/` |
+| `healthcheck.readiness.initialDelaySeconds` | Initial delay in seconds | `10` |
+| `healthcheck.readiness.periodSeconds` | Period in seconds | `10` |
+| `healthcheck.readiness.timeoutSeconds` | Timeout in seconds | `5` |
+| `healthcheck.readiness.successThreshold` | Success threshold | `1` |
+| `healthcheck.readiness.failureThreshold` | Failure threshold | `3` |
+| `healthcheck.liveness.port` | Port to probe | `8787` |
+| `healthcheck.liveness.path` | Path to probe | `/` |
+| `resources.limits.cpu` | CPU limit | `2` |
+| `resources.limits.memory` | Memory limit | `1024Mi` |
+| `resources.limits.ephemeral-storage` | Ephemeral storage limit | `512Mi` |
+| `resources.requests.cpu` | CPU request | `1` |
+| `resources.requests.memory` | Memory request | `512Mi` |
+| `resources.requests.ephemeral-storage` | Ephemeral storage request | `256Mi` |
+| `nodeSelector` | Node selector | `{}` |
+| `tolerations` | Tolerations | `{}` |
+| `affinity` | Affinity | `{}` |
+| `topologySpreadConstraints` | Topology spread constraints | `{}` |
+| `ingress.enabled` | Enable ingress configuration | `false` |
+| `ingress.annotations` | Ingress annotations | `{}` |
+| `ingress.labels` | Ingress labels | `{}` |
+| `ingress.ingressClassName` | Ingress class name | `istio` |
+| `ingress.tls` | Ingress TLS configuration | `[]` |
+| `ingress.hosts` | Ingress hosts | `[]` |
+| `istio.virtualservice.enabled` | Enable virtual service | `false` |
+| `istio.virtualservice.annotations` | Virtual service annotations | `{}` |
+| `istio.virtualservice.gateways` | Virtual service gateways | `[]` |
+| `istio.virtualservice.hosts` | Virtual service hosts | `[]` |
+| `service.type` | Service type | `ClusterIP` |
+| `service.port` | Service port | `8787` |
+| `service.annotations` | Service annotations | `{}` |
+| `serviceAccount.create` | Create service account | `true` |
+| `serviceAccount.annotations` | Service account annotations | `{}` |
+| `serviceAccount.name` | Service account name | `tfy-llm-gateway` |
+| `extraVolumes` | Extra volumes | `[]` |
+| `extraVolumeMounts` | Extra volume mounts | `[]` |
+| `rbac.enabled` | Enable rbac | `true` |
+| `autoscaling.enabled` | Enable autoscaling | `true` |
+| `autoscaling.minReplicas` | Minimum number of replicas | `3` |
+| `autoscaling.maxReplicas` | Maximum number of replicas | `100` |
+| `autoscaling.targetCPUUtilizationPercentage` | Target CPU utilization percentage | `60` |
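
For illustration, an override file that sets the new `global.*` keys from the updated table might look like the sketch below. The file name, URL, and password are placeholders rather than values from this commit, and the diff does not show how the chart templates consume these keys.

```yaml
# my-values.yaml (hypothetical file name)
global:
  # Placeholder URL; substitute the real control plane address.
  controlPlaneURL: "https://truefoundry.example.com"
  # Chart default per the table above.
  truefoundryReleaseName: "truefoundry"
  llmGatewayInfra:
    # Defaults to false; enabled here only to show the nested keys.
    enabled: true
    releaseName: "tfy-llm-gateway-infra"
    # Placeholder; avoid committing a real password in plain text.
    natsAdminPassword: "<nats-admin-password>"
```

A file like this would typically be passed to Helm with `-f my-values.yaml` at install or upgrade time.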
36 changes: 19 additions & 17 deletions charts/tfy-llm-gateway/values.yaml
@@ -1,13 +1,27 @@
 ## @section Configuration for LLM Gateway

-## @param global Truefoundry global values
-global: {}
+global:
+  ## @param global.controlPlaneURL Control plane URL
+  controlPlaneURL: ""
+  ## @param global.truefoundryReleaseName Truefoundry release name
+  truefoundryReleaseName: "truefoundry"
+  llmGatewayInfra:
+    ## @param global.llmGatewayInfra.enabled Bool if llm gateway infra is enabled
+    enabled: false
+    ## @param global.llmGatewayInfra.releaseName Release name for the tfy-llm-gateway-infra
+    releaseName: "tfy-llm-gateway-infra"
+    ## @param global.llmGatewayInfra.natsAdminPassword NATS admin password
+    natsAdminPassword: ""
+
 ## Image configuration for llm-gateway
 image:
   ## @param image.repository Image repository for tfyLLMGateway
   repository: tfy.jfrog.io/tfy-private-images/tfy-llm-gateway
   ## @param image.tag Image tag for the tfyLLMGateway
   tag: 510cb55e8ff708cbc0b0fbdf02ea9b104bbdc846

+## @param fullnameOverride Full name override for the tfy-llm-gateway
+fullnameOverride: ""
+
 ## @param replicaCount Number of replicas
 replicaCount: 1
@@ -19,8 +33,6 @@ envSecretName: tfy-llm-gateway-env-secret
 imagePullPolicy: IfNotPresent
 ## @param nameOverride Name override
 nameOverride: ''
-## @param fullnameOverride Fullname override
-fullnameOverride: ''
 ## @param podAnnotations Pod annotations
 podAnnotations: {}
 ## @param podSecurityContext Pod security context
@@ -120,9 +132,9 @@ serviceAccount:
   ## @param serviceAccount.create Create service account
   create: true
   ## @param serviceAccount.annotations Service account annotations
-  name: tfy-llm-gateway
-  ## @param serviceAccount.name Service account name
   annotations: {}
+  ## @param serviceAccount.name Service account name
+  name: tfy-llm-gateway
 ## @param extraVolumes Extra volumes
 extraVolumes: []
 ## @param extraVolumeMounts Extra volume mounts
@@ -131,21 +143,11 @@ extraVolumeMounts: []
 rbac:
   ## @param rbac.enabled Enable rbac
   enabled: true
-## @skip env
-env:
-  CONTROL_PLANE_URL: ""
-  TFY_API_KEY: ${k8s-secret/truefoundry-creds/TFY_API_KEY}
-  AUTH_SERVER_URL: https://auth.truefoundry.com
-  LOG_LEVEL: info
-  GATEWAY_NATS_CONFIGURATION: ""
-  DEPLOYED_LLM_GATEWAY_URL: ""
-  CONTROL_PLANE_NATS_URL: ""
-  ENABLE_EXTERNAL_OAUTH: "false"

 ## Autoscaling configuration
 autoscaling:
   ## @param autoscaling.enabled Enable autoscaling
-  enabled: false
+  enabled: true
   ## @param autoscaling.minReplicas Minimum number of replicas
   minReplicas: 3
   ## @param autoscaling.maxReplicas Maximum number of replicas
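
Since this commit changes the `autoscaling.enabled` default to `true` (with `minReplicas: 3`), an installation that wants a fixed replica count would presumably need to opt out explicitly. A minimal sketch of such an override follows; it assumes the chart falls back to `replicaCount` when autoscaling is disabled, which this diff does not show.

```yaml
# Hypothetical override restoring the pre-commit behaviour.
autoscaling:
  # The new chart default is true; set to false to opt out.
  enabled: false
# Assumed to take effect only when autoscaling is disabled.
replicaCount: 1
```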