ChatQnA benchmark tests(#995) #998

Draft
wants to merge 37 commits into
base: main
Changes from 19 commits
Commits
d6b04b3
benchmark helmcharts (#995)
Zhenzhong1 Oct 21, 2024
3dd5475
updated chatqna helmcharts
Oct 21, 2024
a70775d
updated chatqna helmcharts image name
Oct 21, 2024
5c2f3f0
move image & replicas path
Oct 21, 2024
a0b2263
updated customize deployment template
Oct 21, 2024
2416661
removed spec
Oct 21, 2024
9ee1a74
rename
Oct 21, 2024
58ff7d9
moved HUGGINGFACEHUB_API_TOKEN
Oct 21, 2024
4e1237d
refactored GaqGen
Oct 21, 2024
fdb8a33
refactored GaqGen
Oct 21, 2024
048b4e1
refactored AudioQNA
Oct 21, 2024
d68ce80
refactored AudioQNA
Oct 21, 2024
d290bd8
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Oct 21, 2024
6dc4bb5
refactoered image
Oct 21, 2024
124143e
removed values.yaml
Oct 21, 2024
bcaffd7
added more cases
Oct 21, 2024
bb46f5b
added visual qna & update deployment template
Oct 22, 2024
0d3876d
removed multiple yamls
Oct 22, 2024
8effe7a
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Oct 22, 2024
e21ee76
updated tgiparams
Oct 22, 2024
f3cbcad
fixed visualqna image issues & tgi params issues
Oct 22, 2024
27e9832
fixed visualqna issues
Oct 22, 2024
b9c646a
update README
Oct 22, 2024
9da0c09
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Oct 22, 2024
3f596d9
update README
Oct 22, 2024
065222f
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Oct 22, 2024
24de14e
fixed the audioqna benchmark path
Oct 22, 2024
a953632
added the tuned tgi params
Oct 22, 2024
2876677
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Oct 22, 2024
1046aad
removed benchmark template
Oct 23, 2024
4f183c2
restore README
Zhenzhong1 Oct 23, 2024
4f32f86
update cpu core into 80
chensuyue Oct 23, 2024
93bbd51
updated oob manifests
Oct 24, 2024
1601808
added some envs
Oct 25, 2024
bb4c1db
Update configmap.yaml
Zhenzhong1 Oct 28, 2024
48e5b7e
Update tuned_single_gaudi_with_rerank.yaml
Zhenzhong1 Nov 8, 2024
6ddfb9d
Update tuned_single_gaudi_with_rerank.yaml
Zhenzhong1 Nov 8, 2024
23 changes: 23 additions & 0 deletions AudioQnA/benchmark/helm_charts/.helmignore
@@ -0,0 +1,23 @@
# Patterns to ignore when building packages.
# This supports shell glob matching, relative path matching, and
# negation (prefixed with !). Only one pattern per line.
.DS_Store
# Common VCS dirs
.git/
.gitignore
.bzr/
.bzrignore
.hg/
.hgignore
.svn/
# Common backup files
*.swp
*.bak
*.tmp
*.orig
*~
# Various IDEs
.project
.idea/
*.tmproj
.vscode/
27 changes: 27 additions & 0 deletions AudioQnA/benchmark/helm_charts/Chart.yaml
@@ -0,0 +1,27 @@
# Copyright (C) 2024 Intel Corporation
# SPDX-License-Identifier: Apache-2.0

apiVersion: v2
name: audioqna-charts
description: A Helm chart for Kubernetes

# A chart can be either an 'application' or a 'library' chart.
#
# Application charts are a collection of templates that can be packaged into versioned archives
# to be deployed.
#
# Library charts provide useful utilities or functions for the chart developer. They're included as
# a dependency of application charts to inject those utilities and functions into the rendering
# pipeline. Library charts do not define any templates and therefore cannot be deployed.
type: application

# This is the chart version. This version number should be incremented each time you make changes
# to the chart and its templates, including the app version.
# Versions are expected to follow Semantic Versioning (https://semver.org/)
version: 1.0.0

# This is the version number of the application being deployed. This version number should be
# incremented each time you make changes to the application. Versions are not expected to
# follow Semantic Versioning. They should reflect the version the application is using.
# It is recommended to use it with quotes.
appVersion: "1.16.0"
36 changes: 36 additions & 0 deletions AudioQnA/benchmark/helm_charts/README.md
@@ -0,0 +1,36 @@
# AudioQnA Deployment

This document guides you through deploying AudioQnA pipelines with Helm charts. A Helm chart packages the Kubernetes resources and configuration of an application so that they can be installed and managed as one unit.

## Getting Started

### Preparation

```bash
# On the k8s master node
cd GenAIExamples/AudioQnA/benchmark/helm_charts

# Replace the value of HUGGINGFACEHUB_API_TOKEN with your actual Hugging Face token:
# vim customize.yaml
# HUGGINGFACEHUB_API_TOKEN: hf_xxxxx
```

### Deploy your AudioQnA pipeline

```bash
# Deploy an AudioQnA pipeline using the specified YAML configuration.
# To deploy with a different configuration, pass a different YAML file to -f.
# The chart path is "." because the previous step changed into the chart directory.
helm install audioqna . -f customize.yaml
```

Note: The provided [BKC manifests](https://github.com/opea-project/GenAIExamples/tree/main/ChatQnA/benchmark) for single-node, two-node, and four-node Kubernetes clusters were generated using this tool.

## Customize your own AudioQnA pipelines (optional)

You can specify two YAML config files:

- `customize.yaml`
  This file specifies the image names, the number of replicas, and the CPU cores for your pods.

- `values.yaml`
  This file contains the default microservice configurations for AudioQnA. Review and understand each parameter before making any changes.
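
As a rough illustration of the overrides described above, a `customize.yaml` entry might look like the sketch below. The keys `name`, `replicas`, `image`, and `resources` are the ones the chart's deployment template reads from `podSpecs`. The image reference and the CPU figures are placeholders assumed for illustration, not values taken from this repository.

```yaml
podSpecs:
  - name: whisper-deploy
    replicas: 2
    # Hypothetical image reference; substitute the image you actually use.
    image: example.registry.local/whisper:latest
    resources:
      limits:
        cpu: 8      # assumed core count, tune for your nodes
      requests:
        cpu: 8
```

Any field omitted from an entry falls back to the default defined for the microservice with the same name.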
23 changes: 23 additions & 0 deletions AudioQnA/benchmark/helm_charts/customize.yaml
@@ -0,0 +1,23 @@
# Copyright (C) 2024 Intel Corporation
# SPDX-License-Identifier: Apache-2.0

podSpecs:
  - name: audioqna-backend-server-deploy
    replicas: 1

  - name: asr-deploy
    replicas: 1

  - name: whisper-deploy
    replicas: 1


  - name: tts-deploy
    replicas: 1

  - name: speecht5-deploy
    replicas: 1


  - name: llm-dependency-deploy
    replicas: 1
25 changes: 25 additions & 0 deletions AudioQnA/benchmark/helm_charts/templates/configmap.yaml
@@ -0,0 +1,25 @@
# Copyright (C) 2024 Intel Corporation
# SPDX-License-Identifier: Apache-2.0

apiVersion: v1
kind: ConfigMap
metadata:
  name: {{ .Values.config.CONFIG_MAP_NAME }}
  namespace: default
data:
  HUGGINGFACEHUB_API_TOKEN: {{ .Values.config.HUGGINGFACEHUB_API_TOKEN }}
  LLM_MODEL_ID: {{ .Values.config.LLM_MODEL_ID }}
  NODE_SELECTOR: {{ .Values.config.NODE_SELECTOR }}

  ASR_ENDPOINT: http://whisper-svc.default.svc.cluster.local:7066
  TTS_ENDPOINT: http://speecht5-svc.default.svc.cluster.local:7055
  TGI_LLM_ENDPOINT: http://llm-dependency-svc.default.svc.cluster.local:3006
  MEGA_SERVICE_HOST_IP: audioqna-backend-server-svc
  ASR_SERVICE_HOST_IP: asr-svc
  ASR_SERVICE_PORT: "3001"
  LLM_SERVICE_HOST_IP: llm-svc
  LLM_SERVICE_PORT: "3007"
  TTS_SERVICE_HOST_IP: tts-svc
  TTS_SERVICE_PORT: "3002"
---
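
The ConfigMap template above reads four settings from `.Values.config`. A minimal sketch of a matching values block follows. Every value shown is a placeholder assumed for illustration, and which values file should carry this block is not shown in this diff.

```yaml
config:
  # Hypothetical ConfigMap name; the deployment template references the same key.
  CONFIG_MAP_NAME: audioqna-config
  # Replace with your own Hugging Face token.
  HUGGINGFACEHUB_API_TOKEN: hf_xxxxx
  # Hypothetical model ID.
  LLM_MODEL_ID: example-org/example-model
  # Hypothetical value; the deployment template uses it as the node-type node selector.
  NODE_SELECTOR: audioqna-node
```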
131 changes: 131 additions & 0 deletions AudioQnA/benchmark/helm_charts/templates/deployment.yaml
@@ -0,0 +1,131 @@
# Copyright (C) 2024 Intel Corporation
# SPDX-License-Identifier: Apache-2.0

{{- $global := .Values }}
{{- range $microservice := .Values.microservices }}
apiVersion: apps/v1
kind: Deployment
metadata:
  name: {{ $microservice.name }}
  namespace: default
spec:
  {{- $replicas := $microservice.replicas }}
  {{- range $podSpec := $global.podSpecs }}
    {{- if eq $podSpec.name $microservice.name }}
      {{- $replicas = $podSpec.replicas | default $microservice.replicas }}
    {{- end }}
  {{- end }}
  replicas: {{ $replicas }}

  selector:
    matchLabels:
      app: {{ $microservice.name }}
  template:
    metadata:
      annotations:
        sidecar.istio.io/rewriteAppHTTPProbers: 'true'
      labels:
        app: {{ $microservice.name }}
    spec:
      containers:
      - envFrom:
        - configMapRef:
            name: {{ $global.config.CONFIG_MAP_NAME }}
        {{- if $microservice.args }}
        args:
        {{- range $arg := $microservice.args }}
        {{- if $arg.name }}
        - {{ $arg.name }}
        {{- end }}
        {{- if $arg.value }}
        - "{{ $arg.value }}"
        {{- end }}
        {{- end }}
        {{- end }}

        {{- if $microservice.env }}
        env:
        {{- range $env := $microservice.env }}
        - name: {{ $env.name }}
          value: "{{ $env.value }}"
        {{- end }}
        {{- end }}

        {{- $image := $microservice.image }}
        {{- range $podSpec := $global.podSpecs }}
          {{- if eq $podSpec.name $microservice.name }}
            {{- $image = $podSpec.image | default $microservice.image }}
          {{- end }}
        {{- end }}
        image: {{ $image }}

        imagePullPolicy: IfNotPresent
        name: {{ $microservice.name }}

        {{- if $microservice.ports }}
        ports:
        {{- range $port := $microservice.ports }}
        {{- range $port_name, $port_id := $port }}
        - {{ $port_name }}: {{ $port_id }}
        {{- end }}
        {{- end }}
        {{- end }}

        {{- $resources := $microservice.resources }}
        {{- range $podSpec := $global.podSpecs }}
          {{- if eq $podSpec.name $microservice.name }}
            {{- if $podSpec.resources }}
              {{- $resources = $podSpec.resources }}
            {{- end }}
          {{- end }}
        {{- end }}

        {{- if $resources }}
        resources:
        {{- range $resourceType, $resource := $resources }}
          {{ $resourceType }}:
          {{- range $limitType, $limit := $resource }}
            {{ $limitType }}: {{ $limit }}
          {{- end }}
        {{- end }}
        {{- end }}

        {{- if $microservice.volumeMounts }}
        volumeMounts:
        {{- range $volumeMount := $microservice.volumeMounts }}
        - mountPath: {{ $volumeMount.mountPath }}
          name: {{ $volumeMount.name }}
        {{- end }}
        {{- end }}

      hostIPC: true
      nodeSelector:
        node-type: {{ $global.config.NODE_SELECTOR }}
      serviceAccountName: default
      topologySpreadConstraints:
      - labelSelector:
          matchLabels:
            app: {{ $microservice.name }}
        maxSkew: 1
        topologyKey: kubernetes.io/hostname
        whenUnsatisfiable: ScheduleAnyway


      {{- if $microservice.volumes }}
      volumes:
      {{- range $index, $volume := $microservice.volumes }}
      - name: {{ $volume.name }}
      {{- if $volume.hostPath }}
        hostPath:
          path: {{ $volume.hostPath.path }}
          type: {{ $volume.hostPath.type }}
      {{- else if $volume.emptyDir }}
        emptyDir:
          medium: {{ $volume.emptyDir.medium }}
          sizeLimit: {{ $volume.emptyDir.sizeLimit }}
      {{- end }}
      {{- end }}
      {{- end }}

---
{{- end }}
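
To make the shape of the input clearer, here is a sketch of one `microservices` entry that the deployment template above could consume. The field names mirror what the template reads (`name`, `image`, `replicas`, `args`, `env`, `ports`, `resources`, `volumeMounts`, `volumes`). All concrete values (image, argument, environment variable, paths, sizes) are hypothetical and chosen only for illustration.

```yaml
microservices:
  - name: llm-dependency-deploy
    # Hypothetical image reference.
    image: example.registry.local/llm-server:latest
    replicas: 1
    args:
      # Each entry renders as up to two list items: the name, then the quoted value.
      - name: "--max-input-length"
        value: 2048
    env:
      - name: EXAMPLE_SETTING   # hypothetical variable
        value: "enabled"
    ports:
      - containerPort: 80
    resources:
      limits:
        cpu: 8                  # assumed figure
    volumeMounts:
      - mountPath: /data
        name: model-volume
    volumes:
      - name: model-volume
        hostPath:
          path: /mnt/models     # hypothetical host path
          type: Directory
```

A `podSpecs` entry with the same `name` takes precedence over this entry's `replicas`, `image`, and `resources`. The other fields are always taken from the `microservices` entry.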
24 changes: 24 additions & 0 deletions AudioQnA/benchmark/helm_charts/templates/service.yaml
@@ -0,0 +1,24 @@
# Copyright (C) 2024 Intel Corporation
# SPDX-License-Identifier: Apache-2.0

{{- range $service := .Values.services }}
apiVersion: v1
kind: Service
metadata:
  name: {{ $service.name }}
  namespace: default
spec:
  ports:
  {{- range $port := $service.spec.ports }}
  - name: {{ $port.name }}
    {{- range $port_name, $port_id := $port }}
    {{- if ne $port_name "name"}}
    {{ $port_name }}: {{ $port_id }}
    {{- end }}
    {{- end }}
  {{- end }}
  selector:
    app: {{ $service.spec.selector.app }}
  type: {{ $service.spec.type }}
---
{{- end }}
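
For reference, a `services` entry shaped for the template above might look like the following sketch. The service name and port 7066 match the `ASR_ENDPOINT` used in the ConfigMap. The port name, `targetPort`, and service type are assumptions made for illustration.

```yaml
services:
  - name: whisper-svc
    spec:
      ports:
        # "name" is emitted first; every other key in the map is copied through as-is.
        - name: service          # hypothetical port name
          port: 7066
          targetPort: 7066       # assumed container port
      selector:
        app: whisper-deploy      # must equal the Deployment's app label
      type: ClusterIP            # assumed service type
```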