docs: add ClickHouse cluster documentation
Ian2012 committed Nov 5, 2024
1 parent 162fc6b commit 2d7f6cb
Showing 4 changed files with 201 additions and 0 deletions.
38 changes: 38 additions & 0 deletions README.md
@@ -316,6 +316,44 @@ populated with random value to ensure uniqueness.

In order for SSL to work without warnings, the CA certificate needs to be mounted in the relevant pods. This is not yet implemented due to an [outstanding issue in tutor](https://github.com/overhangio/tutor/issues/791) that had not been resolved at the time of writing.

### ClickHouse Cluster

ClickHouse is required to run Aspects. For medium and large instances, however, a single ClickHouse node can become a
bottleneck for Aspects, and the default ClickHouse deployment in Aspects can take down other services running on the
same node as the ClickHouse pod. If you want to run a ClickHouse cluster instead, you can enable the
Altinity ClickHouse Operator and follow the templates available in `charts/examples/clickhouse` to set up a ClickHouseKeeper
quorum (needed for replication) and a ClickHouse cluster sized to your needs.
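
As a minimal sketch, the operator can be toggled from your Harmony values file. The `enabled` flag name is an assumption here; confirm the exact keys in `charts/harmony-chart/values.yaml` (the `secret` block matches the chart defaults shown at the bottom of this commit):

```yaml
# Minimal sketch of a Harmony values file enabling the ClickHouse operator.
# "enabled" is an assumed flag name; verify it against the chart's values.yaml.
clickhouse-operator:
  enabled: true
  secret:
    username: "change_me"
    password: "change_me"
```

With the operator running, apply the example manifests (adjusted to your needs) with `kubectl apply -f charts/examples/clickhouse/clickhouse-keeper.yml` and `kubectl apply -f charts/examples/clickhouse/clickhouse-installation.yml`.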

Once your cluster is up and running on Kubernetes, update your installation settings:

```yaml
# See the clickhouse-installation.yml template for more details
CLICKHOUSE_ADMIN_USER: default
CLICKHOUSE_ADMIN_PASSWORD: change_me
CLICKHOUSE_CLUSTER_NAME: openedx-demo
# Set the first ClickHouse node as the DDL node.
CLICKHOUSE_CLUSTER_DDL_NODE_HOST: chi-clickhouse-{{CLICKHOUSE_CLUSTER_NAME}}-0-0.{{namespace}}
CLICKHOUSE_HOST: clickhouse-clickhouse.{{namespace}}
CLICKHOUSE_SECURE_CONNECTION: false
# Disable the single-node ClickHouse that Aspects would otherwise deploy.
RUN_CLICKHOUSE: false
```
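
The Altinity operator names server pods `chi-<installation>-<cluster>-<shard>-<replica>`, which is where the `CLICKHOUSE_CLUSTER_DDL_NODE_HOST` value above comes from. As an illustration only (the `openedx-harmony` namespace is a hypothetical placeholder), the example manifests in this commit would render roughly to:

```yaml
# Hypothetical rendered values for installation "clickhouse" and cluster
# "openedx-demo", assuming an example namespace called "openedx-harmony".
CLICKHOUSE_CLUSTER_DDL_NODE_HOST: chi-clickhouse-openedx-demo-0-0.openedx-harmony
CLICKHOUSE_HOST: clickhouse-clickhouse.openedx-harmony
```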

For multitenancy you have two options: run a separate ClickHouse cluster per tenant, or share a single cluster using different databases and users.

*Using different users and databases*: make sure to update the users and databases in your config:

```yaml
ASPECTS_CLICKHOUSE_CMS_USER: openedx_demo_ch_cms
ASPECTS_CLICKHOUSE_LRS_USER: openedx_demo_ch_lrs
ASPECTS_CLICKHOUSE_REPORT_USER: openedx_demo_ch_report
ASPECTS_CLICKHOUSE_VECTOR_USER: openedx_demo_ch_vector
ASPECTS_XAPI_DATABASE: openedx_demo_xapi
ASPECTS_EVENT_SINK_DATABASE: openedx_demo_event_sink
ASPECTS_VECTOR_DATABASE: openedx_demo_openedx
DBT_PROFILE_TARGET_DATABASE: openedx_demo_reporting
```
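
For example, a second tenant sharing the same cluster would get its own set of users and databases; the `openedx_demo2_*` names below are purely illustrative:

```yaml
# Hypothetical config for a second tenant on the same cluster.
ASPECTS_CLICKHOUSE_CMS_USER: openedx_demo2_ch_cms
ASPECTS_CLICKHOUSE_LRS_USER: openedx_demo2_ch_lrs
ASPECTS_CLICKHOUSE_REPORT_USER: openedx_demo2_ch_report
ASPECTS_CLICKHOUSE_VECTOR_USER: openedx_demo2_ch_vector
ASPECTS_XAPI_DATABASE: openedx_demo2_xapi
ASPECTS_EVENT_SINK_DATABASE: openedx_demo2_event_sink
ASPECTS_VECTOR_DATABASE: openedx_demo2_openedx
DBT_PROFILE_TARGET_DATABASE: openedx_demo2_reporting
```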


## Extended Documentation

### How to uninstall this chart
76 changes: 76 additions & 0 deletions charts/examples/clickhouse/clickhouse-installation.yml
@@ -0,0 +1,76 @@
---
apiVersion: "clickhouse.altinity.com/v1"
kind: "ClickHouseInstallation"
metadata:
  name: "clickhouse"
spec:
  configuration:
    clusters:
      - name: "openedx-demo"
        layout:
          shardsCount: 1  # Sharding has not been tested with Aspects, and we don't recommend it.
          replicasCount: 2  # Scale as you need/can
        templates:
          podTemplate: server
          volumeClaimTemplate: storage
    users:
      test/networks/ip:
        - "::/0"
      test/profile: default
      test/password: change_me
      test/quota: default
      # Default permissions needed for user creation
      test/access_management: 1
      test/named_collection_control: 1
      test/show_named_collections: 1
      test/show_named_collections_secrets: 1
    # Hosts of the ClickHouseKeeper pods created by clickhouse-keeper.yml
    zookeeper:
      nodes:
        - host: clickhouse-keeper-0.clickhouse-keeper-headless
        - host: clickhouse-keeper-1.clickhouse-keeper-headless
        - host: clickhouse-keeper-2.clickhouse-keeper-headless
    files:
      # Enable user replication
      users-replication.xml: |
        <clickhouse>
          <user_directories replace="replace">
            <users_xml>
              <path>/etc/clickhouse-server/users.xml</path>
            </users_xml>
            <replicated>
              <zookeeper_path>/clickhouse/access/</zookeeper_path>
            </replicated>
          </user_directories>
        </clickhouse>
      # Enable function replication
      functions-replication.xml: |
        <clickhouse>
          <user_defined_zookeeper_path>/udf</user_defined_zookeeper_path>
        </clickhouse>
  templates:
    podTemplates:
      - name: server
        spec:
          containers:
            - name: clickhouse
              image: clickhouse/clickhouse-server:24.8
          # If you are running a dedicated node group for ClickHouse (and you
          # should), make sure to add the matching tolerations.
          tolerations:
            - key: "clickhouseInstance"
              operator: "Exists"
              effect: "NoSchedule"
          # Optional: set the node group name
          nodeSelector:
            eks.amazonaws.com/nodegroup: clickhouse_worker
    volumeClaimTemplates:
      - name: storage
        # Do not delete the PV when the installation is deleted. If a new
        # ClickHouseInstallation is created, the data will be re-used,
        # allowing recovery.
        reclaimPolicy: Retain
        spec:
          accessModes:
            - ReadWriteOnce
          resources:
            requests:
              storage: 50Gi
86 changes: 86 additions & 0 deletions charts/examples/clickhouse/clickhouse-keeper.yml
@@ -0,0 +1,86 @@
apiVersion: "clickhouse-keeper.altinity.com/v1"
kind: "ClickHouseKeeperInstallation"
metadata:
  name: clickhouse-keeper
spec:
  configuration:
    clusters:
      - name: "openedx-demo"
        layout:
          # ClickHouseKeeper needs at least three pods to form a quorum for
          # high availability.
          replicasCount: 3
    settings:
      logger/level: "trace"
      logger/console: "true"
      listen_host: "0.0.0.0"
      keeper_server/storage_path: /var/lib/clickhouse-keeper
      keeper_server/tcp_port: "2181"
      keeper_server/four_letter_word_white_list: "*"
      keeper_server/coordination_settings/raft_logs_level: "information"
      keeper_server/raft_configuration/server/port: "9444"
      prometheus/endpoint: "/metrics"
      prometheus/port: "7000"
      prometheus/metrics: "true"
      prometheus/events: "true"
      prometheus/asynchronous_metrics: "true"
      prometheus/status_info: "false"
  templates:
    podTemplates:
      - name: default
        spec:
          # Anti-affinity spreads keeper pods across nodes; remove this block
          # to run in a single-node test environment.
          affinity:
            podAntiAffinity:
              requiredDuringSchedulingIgnoredDuringExecution:
                - labelSelector:
                    matchExpressions:
                      - key: "app"
                        operator: In
                        values:
                          - clickhouse-keeper
                  topologyKey: "kubernetes.io/hostname"
          containers:
            - name: clickhouse-keeper
              imagePullPolicy: IfNotPresent
              # Keep this version compatible with the ClickHouse server version
              image: "clickhouse/clickhouse-keeper:24.8-alpine"
              resources:
                requests:
                  memory: "256M"
                  cpu: "0.25"
                limits:
                  memory: "1Gi"
                  cpu: "1"
          priorityClassName: clickhouse-priority
    volumeClaimTemplates:
      - name: default
        spec:
          accessModes:
            - ReadWriteOnce
          resources:
            requests:
              storage: 10Gi
      - name: snapshot-storage-path
        spec:
          accessModes:
            - ReadWriteOnce
          resources:
            requests:
              storage: 10Gi
      - name: log-storage-path
        spec:
          accessModes:
            - ReadWriteOnce
          resources:
            requests:
              storage: 10Gi

---
apiVersion: scheduling.k8s.io/v1
kind: PriorityClass
metadata:
  name: clickhouse-priority
value: 1000000
globalDefault: false
description: "This priority class should be used for ClickHouse service pods only."
1 change: 1 addition & 0 deletions charts/harmony-chart/values.yaml
@@ -374,3 +374,4 @@ clickhouse-operator:
  secret:
    username: "change_me"
    password: "change_me"
  # Pin the operator's resource names regardless of the Helm release name.
  fullnameOverride: "clickhouse-operator"
