issue #1: GKE cluster using Terraform (first iteration) #4

Merged
Changes from 28 commits
36 commits
6fc5018
issue #1: GKE cluster using Terraform
ThomasCardin Dec 22, 2023
2c8d565
issue #1: Added vault to manage secrets
ThomasCardin Jan 3, 2024
7dd8399
issue #1: removed aws and azure provider it doesnt belong to this issue
ThomasCardin Jan 3, 2024
cb82279
issue #1: removed unused cluster node pool
ThomasCardin Jan 5, 2024
4d9ed24
issue #1: nachet-backend kubernetes deployment
ThomasCardin Jan 5, 2024
31b1d44
issue #1: added namespace to sa inside nachet deployment + finesse de…
ThomasCardin Jan 5, 2024
6385349
increse proxy-read-timeout to 30m
ThomasCardin Jan 8, 2024
d49b0d5
issue #1: kube-prometheus-stack (Prometheus, grafana and alertmanager)
ThomasCardin Jan 9, 2024
e1f5807
issue #1: Nachet deployment, with backend and frontend
ThomasCardin Jan 12, 2024
9a13373
issue #1: updated the nachet backend image
ThomasCardin Jan 15, 2024
f4f98e4
issue #1: fixed both ingress name for nachet
ThomasCardin Jan 15, 2024
e7883a4
issue #1: finesse frontend and backend deployed with our images from GCR
ThomasCardin Jan 15, 2024
826e1b9
issue #1: reviewed and added EOF on files
ThomasCardin Jan 16, 2024
e07aad8
issue #1: added new version to finesse-frontend
ThomasCardin Jan 16, 2024
96ed032
issue #1: added github workflow
ThomasCardin Jan 17, 2024
e2d6231
issue #1: adding the workflow to the right folder
ThomasCardin Jan 17, 2024
91917a4
issue #1: fixed md max line length
ThomasCardin Jan 17, 2024
8267004
issue #1: testing the workflow repo standard with applied patch
ThomasCardin Jan 17, 2024
1b072b7
issue #1: removed testing branch for repo standard action
ThomasCardin Jan 17, 2024
5dc9f1c
issue #1: testing the repo standard with new token
ThomasCardin Jan 17, 2024
9d363cd
issue #1: testing the repo standard with new token
ThomasCardin Jan 17, 2024
d7d16f8
issue #1: EOF for workflow
ThomasCardin Jan 17, 2024
d5b616e
issue #1: fixed 413 error from frontend to backend
ThomasCardin Jan 17, 2024
a0c1546
issue #1: changed nachet images tag for PR number
ThomasCardin Jan 18, 2024
3c19f9e
issue #1: changed README.md content
ThomasCardin Jan 31, 2024
de56ac5
issue #1: removed unused code
ThomasCardin Jan 31, 2024
01f7948
issue #1: changed image version for finesse (default is main represen…
ThomasCardin Jan 31, 2024
8f5f250
issue #1: changed default version of nachet-frontend
ThomasCardin Feb 1, 2024
82ada60
deleted the GKE cluster
ThomasCardin Feb 6, 2024
02801c9
added providers for the GKE cluster. Note: the cluster isn't supporte…
ThomasCardin Feb 6, 2024
4c6f99c
Merge remote-tracking branch 'origin/main' into 1-create-a-kubernetes…
ThomasCardin Feb 7, 2024
a3b942b
issue #1: fixed EOF and completing the merge from main
ThomasCardin Feb 7, 2024
b90a705
issue #1: fixed yaml linting error for nginx deployment
ThomasCardin Feb 7, 2024
edf3c83
issue #1: fixed yaml linting error for nginx deployment
ThomasCardin Feb 7, 2024
23f4db3
issue #1: fixed yaml linting error for nginx deployment
ThomasCardin Feb 7, 2024
b7d1910
issue #1: fixed some yaml linting error
ThomasCardin Feb 7, 2024
18 changes: 18 additions & 0 deletions .github/workflows/workflow.yml
@@ -0,0 +1,18 @@
name: Ai-cfia repo standard and markdown check

on:
  pull_request:
    types:
      - opened
      - closed
      - synchronize

jobs:
  repo-standard:
    uses: ai-cfia/github-workflows/.github/workflows/workflow-repo-standards-validation.yml@main
    secrets: inherit

  markdown-check:
    uses: ai-cfia/github-workflows/.github/workflows/workflow-markdown-check.yml@main
    with:
      config-file-path: '.mlc_config.json'
3 changes: 3 additions & 0 deletions .mlc_config.json
@@ -0,0 +1,3 @@
{
  "aliveStatusCodes": [999,200,403]
}

EOF newline missing! Also, the same comment about 999 applies as before. I also don't think we should accept 403 (documentation behind authentication).

Member Author

Done

102 changes: 102 additions & 0 deletions .terraform.lock.hcl

Some generated files are not rendered by default.

9 changes: 9 additions & 0 deletions .vscode/extensions.json
@@ -0,0 +1,9 @@
{
  "recommendations": [
    "stkb.rewrap",
    "DavidAnson.vscode-markdownlint"
  ],
  "unwantedRecommendations": [

  ]
}
6 changes: 6 additions & 0 deletions .vscode/settings.json
@@ -0,0 +1,6 @@
{
  "editor.rulers": [80],
  "files.trimTrailingWhitespace": true,
  "files.trimFinalNewlines": true,
  "files.insertFinalNewline": true
}
35 changes: 25 additions & 10 deletions README.md
@@ -1,16 +1,31 @@
# Infrastructure Repository for ACIA-CFIA AI-Lab
rngadam marked this conversation as resolved.

-This repository is dedicated to the infrastructure management of the ACIA-CFIA AI-Lab. It contains scripts, configurations, and documentation pertinent to infrastructure and DevOps practices within the lab, facilitating setup, deployment, and management across multiple cloud platforms including AWS, GCP, and Azure.
+This repository contains all the infrastructure used by the ACIA/CFIA AI Lab.
+In this repository, you can find the Kubernetes manifests that deploy each of
+the applications on the three different cloud providers: Google Cloud Platform
+(GCP), Amazon Web Services (AWS), and Azure.

-## Contents:
+## Content

-Cross-Cloud Setup Scripts: Automation scripts for seamless configuration across AWS, GCP, and Azure, covering project initiation, billing account association, artifact repository orchestration, and service account setup.
-GitHub Repository Creation Guide: Detailed instructions for creating new repositories in alignment with ACIA-CFIA standards.
-Getting Started:
+- The Terraform configuration for the GCP cluster.
+- Kubernetes manifests used to deploy the following applications:
+  - [Nachet backend](https://github.com/ai-cfia/nachet-backend)
+  - [Nachet frontend](https://github.com/ai-cfia/nachet-frontend)
+  - [Finesse backend](https://github.com/ai-cfia/finesse-backend)
+  - [Finesse frontend](https://github.com/ai-cfia/finesse-frontend)
+- Configuration for Vault, Grafana, Prometheus, Alert Manager, Ingress NGINX,
+  and Cert Manager to meet our requirements.

-## Clone this repository.
-1. Navigate to the desired script or documentation.
-2. Follow the provided instructions.
-3. Related Repositories:
+## Tooling

-Dev-Rel-Docs: Contains introductory files and documentation related to developer relations at ACIA-CFIA AI-Lab.
+- [Hashicorp Vault](https://www.vaultproject.io/)
+- [Grafana](https://grafana.com/)
+- [Prometheus](https://prometheus.io/docs/visualization/grafana/)
+- [Alert manager](https://github.com/prometheus/alertmanager)
+- [Cert manager](https://cert-manager.io/)
+- [Ingress NGINX](https://docs.nginx.com/nginx-ingress-controller/)
+- [OTEL](https://opentelemetry.io/)
+
+## Useful links
+
+[ai-cfia github container registry](https://github.com/orgs/ai-cfia/packages)
88 changes: 88 additions & 0 deletions kubernetes/apps/demo/nginx-deployment.yml
@@ -0,0 +1,88 @@
apiVersion: v1
kind: Namespace
metadata:
  name: nginx
  labels:
    name: nginx

---
rngadam marked this conversation as resolved.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-deployment
  namespace: nginx
spec:
  replicas: 2
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
        - name: nginx
          image: nginx:1.14.2
          ports:
            - containerPort: 80

---
apiVersion: v1
kind: Service
metadata:
  name: nginx
  namespace: nginx
spec:
  clusterIP: None
  selector:
    app: nginx
  ports:
    - protocol: TCP
      port: 80

---
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: nginx-ingress
  namespace: nginx
  annotations:
    cert-manager.io/cluster-issuer: letsencrypt-http
    ingress.kubernetes.io/force-ssl-redirect: "true"
    kubernetes.io/tls-acme: "true"
spec:
  ingressClassName: nginx
  tls:
    - hosts:
        - nginx.ninebasetwo.xyz
      secretName: aciacfia-tls
  rules:
    - host: nginx.ninebasetwo.xyz
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: nginx
                port:
                  number: 80

# ---
# apiVersion: gateway.networking.k8s.io/v1beta1
# kind: HTTPRoute
# metadata:
#   name: nginx-http-route
#   namespace: nginx
# spec:
#   parentRefs:
#     - name: gateway-gke-l7-rilb
#   rules:
#     - matches:
#         - path:
#             type: PathPrefix
#             value: "/"
#       backendRefs:
#         - name: nginx
#           port: 80
rngadam marked this conversation as resolved.
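
The Ingress above references a `letsencrypt-http` cluster issuer that is not part of this diff. A minimal sketch of what such a cert-manager `ClusterIssuer` could look like, assuming an HTTP-01 solver served through the `nginx` ingress class. The email address and the account key secret name are placeholders, not values taken from this repository.

```yaml
# Hypothetical ClusterIssuer matching the name used in the
# cert-manager.io/cluster-issuer annotation of the nginx Ingress.
apiVersion: cert-manager.io/v1
kind: ClusterIssuer
metadata:
  name: letsencrypt-http
spec:
  acme:
    # Let's Encrypt production ACME endpoint.
    server: https://acme-v02.api.letsencrypt.org/directory
    # Placeholder contact address (assumption).
    email: admin@example.com
    # Secret where cert-manager stores the ACME account key (assumed name).
    privateKeySecretRef:
      name: letsencrypt-http-account-key
    solvers:
      - http01:
          ingress:
            # Recent cert-manager releases accept ingressClassName;
            # older ones use `class: nginx` instead.
            ingressClassName: nginx
```

With an issuer like this in place, cert-manager would answer the HTTP-01 challenge for `nginx.ninebasetwo.xyz` and write the resulting certificate to the `aciacfia-tls` secret named in the Ingress `tls` section.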
76 changes: 76 additions & 0 deletions kubernetes/apps/finesse/finesse-backend-deployment.yml
@@ -0,0 +1,76 @@
apiVersion: v1
kind: ServiceAccount
metadata:
  name: secrets-reader
  namespace: finesse

---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: finesse-backend
  namespace: finesse
spec:
  replicas: 2
  selector:
    matchLabels:
      app: finesse-backend
  template:
    metadata:
      labels:
        app: finesse-backend
      annotations:
        vault.hashicorp.com/agent-inject: 'true'
        vault.hashicorp.com/role: 'secrets-reader'
        vault.hashicorp.com/tls-skip-verify: 'true'
        vault.hashicorp.com/agent-inject-template-.env: |
          {{- with secret "apps/finesse" -}}
          AZURE_OPENAI_CHATGPT_DEPLOYMENT="{{ .Data.data.AZURE_OPENAI_CHATGPT_DEPLOYMENT }}"
          AZURE_OPENAI_GPT_DEPLOYMENT="{{ .Data.data.AZURE_OPENAI_GPT_DEPLOYMENT }}"
          FINESSE_BACKEND_AZURE_SEARCH_API_KEY="{{ .Data.data.FINESSE_BACKEND_AZURE_SEARCH_API_KEY }}"
          FINESSE_BACKEND_AZURE_SEARCH_ENDPOINT="{{ .Data.data.FINESSE_BACKEND_AZURE_SEARCH_ENDPOINT }}"
          FINESSE_BACKEND_AZURE_SEARCH_INDEX_NAME="{{ .Data.data.FINESSE_BACKEND_AZURE_SEARCH_INDEX_NAME }}"
          FINESSE_BACKEND_GITHUB_STATIC_FILE_URL="{{ .Data.data.FINESSE_BACKEND_GITHUB_STATIC_FILE_URL }}"
          FINESSE_BACKEND_STATIC_FILE_URL="{{ .Data.data.FINESSE_BACKEND_STATIC_FILE_URL }}"
          FINESSE_BACKEND_DEBUG_MODE="{{ .Data.data.FINESSE_BACKEND_DEBUG_MODE }}"
          FINESSE_WEIGHTS="{{ .Data.data.FINESSE_WEIGHTS }}"
          LOUIS_DSN="{{ .Data.data.LOUIS_DSN }}"
          LOUIS_SCHEMA="{{ .Data.data.LOUIS_SCHEMA }}"
          OPENAI_API_ENGINE="{{ .Data.data.OPENAI_API_ENGINE }}"
          OPENAI_API_KEY="{{ .Data.data.OPENAI_API_KEY }}"
          OPENAI_API_VERSION="{{ .Data.data.OPENAI_API_VERSION }}"
          OPENAI_ENDPOINT="{{ .Data.data.OPENAI_ENDPOINT }}"
          {{- end }}
    spec:
      serviceAccountName: secrets-reader
      containers:
        - name: finesse-backend
          image: ghcr.io/ai-cfia/finesse-backend:main
          imagePullPolicy: Always
          command: ["/bin/sh", "-c"]
          args:
            - >
              cp /vault/secrets/.env . &&
              gunicorn --bind :8080 --workers 1 --threads 8 --timeout 0 --forwarded-allow-ips "*" app:app
          ports:
            - containerPort: 8080
          livenessProbe:
            httpGet:
              path: /health
              port: 8080
            initialDelaySeconds: 60
            periodSeconds: 10

---
apiVersion: v1
kind: Service
metadata:
  name: finesse-backend-svc
  namespace: finesse
spec:
  clusterIP: None
  selector:
    app: finesse-backend
  ports:
    - protocol: TCP
      port: 8080
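
The container `args` in the finesse-backend Deployment above use a YAML folded scalar (`>`), which joins the two lines with a single space before `/bin/sh -c` receives them. As an illustration only (not a change proposed in this PR), the same two container fields could be written with the command on one line:

```yaml
# Equivalent spelling of the command/args pair from the finesse-backend
# container. The folded scalar (>) replaces the line break between the two
# lines with a space and keeps one trailing newline, so the shell receives
# a single command line.
command: ["/bin/sh", "-c"]
args:
  - "cp /vault/secrets/.env . && gunicorn --bind :8080 --workers 1 --threads 8 --timeout 0 --forwarded-allow-ips \"*\" app:app\n"
```

Because of the `&&`, gunicorn starts only if copying the Vault-rendered `/vault/secrets/.env` into the working directory succeeds.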
42 changes: 42 additions & 0 deletions kubernetes/apps/finesse/finesse-frontend-deployment.yml
@@ -0,0 +1,42 @@
apiVersion: apps/v1
kind: Deployment
metadata:
  name: finesse-frontend
  namespace: finesse
spec:
  replicas: 2
  selector:
    matchLabels:
      app: finesse-frontend
  template:
    metadata:
      labels:
        app: finesse-frontend
    spec:
      serviceAccountName: secrets-reader
      containers:
        - name: finesse-frontend
          image: ghcr.io/ai-cfia/finesse-frontend:main
          imagePullPolicy: Always
          ports:
            - containerPort: 3000
          livenessProbe:
            httpGet:
              path: /health
              port: 3000
            initialDelaySeconds: 60
            periodSeconds: 10

---
apiVersion: v1
kind: Service
metadata:
  name: finesse-frontend-svc
  namespace: finesse
spec:
  clusterIP: None
  selector:
    app: finesse-frontend
  ports:
    - protocol: TCP
      port: 3000
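
The Services in this diff set `clusterIP: None`, which makes them headless: cluster DNS resolves the Service name directly to the IPs of the matching pods, and no virtual IP is allocated. For comparison only (not a change proposed in this PR), a sketch of the same frontend Service as a regular `ClusterIP` Service; the explicit `type` and `targetPort` fields are added here for illustration and are not in the original manifest.

```yaml
# Comparison sketch: finesse-frontend-svc as a regular ClusterIP Service.
apiVersion: v1
kind: Service
metadata:
  name: finesse-frontend-svc
  namespace: finesse
spec:
  # ClusterIP is the default type; Kubernetes allocates a stable virtual IP
  # and traffic to it is spread across the selected pods.
  type: ClusterIP
  selector:
    app: finesse-frontend
  ports:
    - protocol: TCP
      port: 3000
      # targetPort defaults to the value of port; shown for clarity.
      targetPort: 3000
```

Either form can serve as an Ingress backend, since ingress controllers typically route to the pod endpoints behind the Service.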