Merge pull request #22 from sendbird/release-1.1.1
Release v1.1.1
jjh-kim authored Nov 1, 2024
2 parents e6742fe + 30460ed commit 3e84f96
Showing 23 changed files with 440 additions and 122 deletions.
5 changes: 3 additions & 2 deletions .github/workflows/linters.yml
@@ -1,10 +1,11 @@
name: Linters

on:
pull_request:
push:
branches:
- master
- main
pull_request:
workflow_dispatch:

jobs:
flake8_py3:
11 changes: 2 additions & 9 deletions README.md
@@ -59,22 +59,15 @@ load when production traffic increases.

## Requirements

SB-OSC is designed to work with Aurora MySQL database, and it's an EKS-based tool.

It requires the following resources to run:

- Aurora MySQL database (v2, v3)
- EKS cluster
- AWS SecretsManager secret
- IAM role
SB-OSC is designed to work with Aurora MySQL databases. It is a containerized application that can run in both Kubernetes and Docker environments.

SB-OSC accepts `ROW` for binlog format. It is recommended to set `binlog-ignore-db` to `sbosc` to prevent SB-OSC from
processing its own binlog events.

- `binlog_format` set to `ROW`
- `binlog-ignore-db` set to `sbosc` (Recommended)

Detailed requirements and setup instructions can be found in the [usage guide](doc/usage.md).
Detailed requirements and setup instructions can be found in the [deployment guide](deploy/README.md).

## Performance

15 changes: 15 additions & 0 deletions deploy/README.md
@@ -0,0 +1,15 @@
# Usage Guide

SB-OSC is designed to be deployed as a containerized application.
It can be run on both Kubernetes and Docker environments.

For Kubernetes deployment refer to [charts](./charts) directory, and for Docker deployment refer to [compose](./compose) directory.

### Building Docker Image
You can build Docker image using Dockerfile in the root directory.
```bash
docker build -t sb-osc .
```

### Troubleshooting
Issues and solutions that may occur when using SB-OSC can be found in [troubleshooting.md](../doc/troubleshooting.md).
File renamed without changes.
77 changes: 77 additions & 0 deletions deploy/charts/README.md
@@ -0,0 +1,77 @@
# Deploying on EKS Cluster

## 1. Create AWS Resources

### IAM Role

Two IAM roles are required: one for `ExternalSecrets` to access the SecretsManager secret and another for the `monitor` to access CloudWatch metrics. Each role will be attached to a separate service account.

Create IAM roles with the following policies:

**sb-osc-external-role**
```json
{
"Version": "2012-10-17",
"Statement": [
{
"Effect": "Allow",
"Action": [
"secretsmanager:GetSecretValue",
"secretsmanager:DescribeSecret",
"secretsmanager:ListSecretVersionIds"
],
"Resource": "arn:aws:secretsmanager:REGION:ACCOUNT_ID:secret:SECRET_NAME"
}
]
}
```

**sb-osc-role**
```json
{
"Version": "2012-10-17",
"Statement": [
{
"Effect": "Allow",
"Action": [
"cloudwatch:GetMetricStatistics"
],
"Resource": "*"
}
]
}
```
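As a sketch, the two roles above could be created with the AWS CLI. The file names here are assumptions: `trust-policy.json` stands in for an IRSA trust policy referencing your EKS cluster's OIDC provider, and the two policy files are the JSON documents above saved locally.

```shell
# Sketch only — file names and trust policy contents are assumptions.
# trust-policy.json must allow the cluster's OIDC provider (IRSA).
aws iam create-role --role-name sb-osc-external-role \
    --assume-role-policy-document file://trust-policy.json
aws iam put-role-policy --role-name sb-osc-external-role \
    --policy-name secretsmanager-read \
    --policy-document file://sb-osc-external-policy.json

aws iam create-role --role-name sb-osc-role \
    --assume-role-policy-document file://trust-policy.json
aws iam put-role-policy --role-name sb-osc-role \
    --policy-name cloudwatch-read \
    --policy-document file://sb-osc-policy.json
```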

### SecretsManager Secret
SB-OSC uses ExternalSecrets with SecretsManager for credentials. The following keys should be defined:

- `username`: Database username
- `password`: Database password
- `port`: Database port
- `redis_host`: Redis endpoint (k8s Service name)
- `redis_password`: Redis password
- `slack_channel`: Slack channel ID (Optional)
- `slack_token`: Slack app token (Optional)

You can find these keys in [secret.py](../../src/config/secret.py)
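As a hedged example, the keys above could be stored as a single secret with the AWS CLI. The secret name and all values below are placeholders:

```shell
# Placeholder values — replace before use.
aws secretsmanager create-secret --name sb-osc-secret \
    --secret-string '{
      "username": "root",
      "password": "<db-password>",
      "port": "3306",
      "redis_host": "redis-master",
      "redis_password": "<redis-password>",
      "slack_channel": "",
      "slack_token": ""
    }'
```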

## 2. Create Destination Table
SB-OSC does not create the destination table on its own. The table should be created manually before starting the migration.
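One possible way to create it, assuming the destination table lives on the same cluster as the source (endpoint, credentials, and names are placeholders), is to clone the source schema and then apply the DDL being migrated:

```shell
# Sketch: clone the source schema, then apply the target DDL.
mysql -h <writer-endpoint> -u <user> -p -e "
  CREATE TABLE destination_db.destination_table LIKE source_db.source_table;
  -- Apply the schema change the migration is for, e.g.:
  -- ALTER TABLE destination_db.destination_table ADD COLUMN new_col INT NULL;
"
```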

## 3. Enable Binlog
SB-OSC requires binlog to be enabled on the source database. Please set `binlog_format` to `ROW`.

### Other Parameters
- Setting `binlog-ignore-db` to `sbosc` is recommended to prevent SB-OSC from processing its own binlog events.
- Set `range_optimizer_max_mem_size` to `0` or a large value to prevent bad query plans on queries with large `IN` clauses (especially on Aurora v3).
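On Aurora these are cluster parameter group settings. A sketch with the AWS CLI (the parameter group name is a placeholder; `binlog_format` is a static parameter and takes effect only after a reboot):

```shell
aws rds modify-db-cluster-parameter-group \
    --db-cluster-parameter-group-name <source-cluster-pg> \
    --parameters "ParameterName=binlog_format,ParameterValue=ROW,ApplyMethod=pending-reboot"

# Verify from a client session after the reboot:
mysql -h <source-writer-endpoint> -u <user> -p -e "SELECT @@binlog_format;"
```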

## 4. Run SB-OSC
When all of the above steps are completed, you can start the migration process by installing the [helm chart]().

```bash
helm install sb-osc ./charts -n sb-osc --create-namespace

# or
helm upgrade -i sb-osc ./charts -n sb-osc
```
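After installation you can check that the components came up. The label selector below is an assumption; adjust it to whatever the chart actually sets on its workloads:

```shell
kubectl get pods -n sb-osc
kubectl logs -n sb-osc -l app=controller --tail=50
```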
File renamed without changes.
@@ -42,9 +42,6 @@ spec:
- name: redis-data
persistentVolumeClaim:
claimName: redis-pvc
- name: redis-config
configMap:
name: redis-config
- name: redis-secret
secret:
secretName: sb-osc-secret
File renamed without changes.
File renamed without changes.
File renamed without changes.
67 changes: 67 additions & 0 deletions deploy/compose/README.md
@@ -0,0 +1,67 @@
# Deploying with Docker Compose

## 1. Create IAM Role

### IAM Role

An IAM role is required for the `monitor` to access CloudWatch metrics.

Create an IAM role with the following policy:
```json
{
"Version": "2012-10-17",
"Statement": [
{
"Effect": "Allow",
"Action": [
"cloudwatch:GetMetricStatistics"
],
"Resource": "*"
}
]
}
```

Attach this role to the instance where SB-OSC is running.
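For an EC2 host, attaching the role is done through an instance profile; a sketch (the profile name and instance ID are placeholders):

```shell
aws iam create-instance-profile --instance-profile-name sb-osc-profile
aws iam add-role-to-instance-profile \
    --instance-profile-name sb-osc-profile --role-name sb-osc-role
aws ec2 associate-iam-instance-profile \
    --instance-id <instance-id> \
    --iam-instance-profile Name=sb-osc-profile
```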

## 2. Write Config Files
You have to write three config files for SB-OSC to run properly.

### `config.yaml`
This file contains the configuration for SB-OSC. You can find the template in [config.yaml](config.yaml).
All values are loaded into `Config` class in [config.py](../../src/config/config.py).

### `secret.json`
This file contains the credentials for the database, redis, and slack. You can find the template in [secret.json](secret.json). All values are loaded into `Secret` class in [secret.py](../../src/config/secret.py).

- `username`: Database username
- `password`: Database password
- `port`: Database port
- `redis_host`: Redis endpoint (Docker container name)
- `redis_password`: Redis password (Optional)
- `slack_channel`: Slack channel ID (Optional)
- `slack_token`: Slack app token (Optional)

`redis_password` is optional. Keep in mind that if you set a password in `redis.conf`, you should set the same password in `secret.json`.

### `redis.conf`
This file contains the configuration for the Redis server. You can find the template in [redis.conf](redis.conf).
- `requirepass ""`: Match the `redis_password` set in `secret.json`.
- If `requirepass ""` is set, this means that the Redis server does not require a password. Fill in the password between the quotes to set a password.
- `appendonly yes`: Enable AOF persistence
- `save ""`: Disable RDB persistence
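Once the stack is up, you can confirm that the password in `redis.conf` and `secret.json` match by pinging Redis through the container (the container name follows the compose file):

```shell
docker exec redis redis-cli -a "<redis-password>" ping
# A matching password returns PONG; a mismatch returns an auth error.
```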

## 3. Create Destination Table
SB-OSC does not create the destination table on its own. The table should be created manually before starting the migration.

## 4. Enable Binlog
SB-OSC requires binlog to be enabled on the source database. Please set `binlog_format` to `ROW`.

### Other Parameters
- Setting `binlog-ignore-db` to `sbosc` is recommended to prevent SB-OSC from processing its own binlog events.
- Set `range_optimizer_max_mem_size` to `0` or a large value to prevent bad query plans on queries with large `IN` clauses (especially on Aurora v3).

## 5. Run SB-OSC
When all of the above steps are completed, you can start the migration process by running docker compose.

Please double-check that the `docker-compose.yml` file is correctly configured (e.g. `image`, `AWS_REGION`).
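With the config files in place, a typical invocation from the directory containing `docker-compose.yml` would be:

```shell
docker compose up -d                 # start all components in the background
docker compose logs -f controller    # follow the controller's logs
docker compose down                  # stop everything when the migration is done
```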
78 changes: 78 additions & 0 deletions deploy/compose/config.yaml
@@ -0,0 +1,78 @@
####################
# Required configs #
####################

# Migration plan
source_writer_endpoint: "" # If source_cluster_id is not provided, this must be cluster writer endpoint.
source_reader_endpoint: ""
destination_writer_endpoint: "" # If destination_cluster_id is not provided, this must be cluster writer endpoint.
destination_reader_endpoint: ""
source_db: ""
source_table: ""
destination_db: ""
destination_table: ""

auto_swap: false # Whether to swap tables automatically. (Default: false)
preferred_window: "00:00-23:59" # Preferred window for swapping tables & bulk import validation. (Default: "00:00-23:59")

# Worker config
min_batch_size: 500 # Starting batch size to use. (Default: 500)
max_batch_size: 3000 # Desired batch size to use. (Default: 3000)
batch_size_step_size: 500 # Step size to increase batch size. (Default: 500)

min_thread_count: 1 # Starting thread count to use. (Default: 1)
max_thread_count: 8 # Desired thread count to use. (Default: 8)
thread_count_step_size: 1 # Step size to increase thread count. (Default: 1)

commit_interval_in_seconds: 1 # Time to wait after each query executed by worker. (Default: 1)

# Validator
bulk_import_validation_batch_size: 10000 # Batch size for bulk import validation (Default: 10000)
apply_dml_events_validation_batch_size: 1000 # Batch size for DML event validation (Default: 1000)
validation_thread_count: 4 # Number of threads to use for validation process (Default: 4)

####################
# Optional configs #
####################

# Migration plan
# sbosc_db: "sbosc" # Database to create sb-osc tables. (Default: "sbosc")
# source_cluster_id: ~ # If not provided, cluster id will be retrieved from source_writer_endpoint (Default: ~)
# destination_cluster_id: ~ # If not provided, cluster id will be retrieved from destination_writer_endpoint (Default: ~)
# min_chunk_size: 100000 # Minimum chunk size to create. (Default: 100000)
# max_chunk_count: 200 # Maximum number of chunks to create. (Default: 200)
# wait_interval_until_auto_swap_in_seconds: 60 # Interval to wait until auto swap. (Default: 60)
# skip_bulk_import: false # Whether to skip bulk import. (Default: false)
# disable_apply_dml_events: false # Whether to disable applying dml events. (Default: false)
# operation_class: BaseOperation # Operation class to use. (Default: BaseOperation)
# indexes: [] # Indexes to create after bulk import. (Default: [])
# index_created_per_query: 4 # Number of indexes to create per iteration. (Default: 4)
# innodb_ddl_buffer_size: ~ # innodb_ddl_buffer_size for MySQL. (Default: ~)
# innodb_ddl_threads: ~ # innodb_ddl_threads for MySQL. (Default: ~)
# innodb_parallel_read_threads: ~ # innodb_parallel_read_threads for MySQL. (Default: ~)

# Worker config
# use_batch_size_multiplier: false # Whether to use batch size multiplier. (Default: false)

# EventHandler config
# eventhandler_thread_count: 4 # Number of threads for EventHandler. Max number of binlog files to read at once. (Default 4. Max 4 recommended)
# eventhandler_thread_timeout_in_seconds: 300 # Timeout for EventHandler thread. If the thread is not finished within this time, it raises exception and restarts EventHandler. (Default: 300)
# init_binlog_file: ~ # Initial binlog file to start reading. (Default: ~)
# init_binlog_position: ~ # Initial binlog position to start reading. (Default: ~)

# Monitor threshold
# cpu_soft_threshold: 40 # Soft threshold for CPU usage. If the CPU usage exceeds this value, thread count will be decreased into half. (Default: 40)
# cpu_hard_threshold: 60 # Hard threshold for CPU usage. If the CPU usage exceeds this value, thread count will be decreased to 0. (Default: 60)
# write_latency_soft_threshold: 30 # Soft threshold for WriteLatency. If the latency exceeds this value, batch size will be decreased into half. (Default: 30)
# write_latency_hard_threshold: 50 # Hard threshold for WriteLatency. If the latency exceeds this value, batch size will be decreased to 0. (Default: 50)

# Validation config
# apply_dml_events_validation_interval_in_seconds: 10 # Interval for DML event validation (seconds) (Default: 10)
# full_dml_event_validation_interval_in_hours: 0 # Interval for full DML event validation. 0 disables full DML event validation (hours) (Default: 0)

# EventLoader config
# pk_set_max_size: 100000 # Max number of DML PKs to load from DB at once. No more than 2 * pk_set_max_size will be kept in Redis. This is used for memory optimization. (Default: 100000)
# event_batch_duration_in_seconds: 3600 # Timestamp range of DML events to load from DB at once (seconds). (Default: 3600)

# Operation class config
# operation_class_config: ~ # Operation class specific configurations. (Default: ~)
54 changes: 54 additions & 0 deletions deploy/compose/docker-compose.yml
@@ -0,0 +1,54 @@
services:
controller: &component-base
image: "" # SB-OSC image
container_name: controller
environment: &component-env
AWS_REGION: "" # AWS region
CONFIG_FILE: "/opt/sb-osc/config.yaml"
SECRET_FILE: "/opt/sb-osc/secret.json"
volumes:
- ./config.yaml:/opt/sb-osc/config.yaml
- ./secret.json:/opt/sb-osc/secret.json
command: ["python", "-m", "sbosc.controller.main"]
restart: always
depends_on:
- redis

eventhandler:
<<: *component-base
container_name: eventhandler
command: ["python", "-m", "sbosc.eventhandler.main"]
depends_on:
- controller

monitor:
<<: *component-base
container_name: monitor
command: ["python", "-m", "sbosc.monitor.main"]
depends_on:
- controller

worker:
<<: *component-base
container_name: worker
command: ["python", "-m", "sbosc.worker.main"]
environment:
<<: *component-env
POD_NAME: "worker"
depends_on:
- controller

redis:
image: "redis:7.0.4"
container_name: redis
command:
- redis-server
- /usr/local/etc/redis/redis.conf
ports:
- "6379:6379"
volumes:
- redis-data:/data
- ./redis.conf:/usr/local/etc/redis/redis.conf

volumes:
redis-data:
3 changes: 3 additions & 0 deletions deploy/compose/redis.conf
@@ -0,0 +1,3 @@
requirepass ""
appendonly yes
save ""
9 changes: 9 additions & 0 deletions deploy/compose/secret.json
@@ -0,0 +1,9 @@
{
"username": "root",
"password": "",
"port": "3306",
"redis_host": "redis",
"redis_password": "",
"slack_channel": "",
"slack_token": ""
}