Make alt efs sync with live #2025

Draft
wants to merge 34 commits into
base: develop
34 commits
9d28f55
feat: create wrapper script
kencho51 Sep 4, 2024
d5a577f
feat: create bats test
kencho51 Sep 4, 2024
87eb288
chore: create gitignore
kencho51 Sep 4, 2024
38729c8
refactor: make teardown work
kencho51 Sep 5, 2024
041c707
feat: bring in config steps
kencho51 Sep 5, 2024
70aa24c
refactor: update the log dir name to logs
kencho51 Sep 5, 2024
b131322
feat: make bats test to check the log
kencho51 Sep 5, 2024
6eaac03
refactor: update gitignore
kencho51 Sep 6, 2024
e14356d
refactor: make the script more versitle
kencho51 Sep 6, 2024
98aaa7b
refactor: update the bats tests
kencho51 Sep 6, 2024
a17c771
refactor: create data dir in dev
kencho51 Sep 9, 2024
f9110e3
refactor: get the upstream private key
kencho51 Sep 9, 2024
d8f8ea4
refactor: read the private key from ssh dir
kencho51 Sep 9, 2024
edc9490
feat: include misc url in ansible
kencho51 Sep 9, 2024
d46e84e
feat: make sync dropbox tool available in bastion
kencho51 Sep 9, 2024
7331f87
refactr: make the cnhk url working
kencho51 Sep 9, 2024
e0db781
refactor: make centos user to execute the script
kencho51 Sep 9, 2024
68f2324
feat: make user can execute the script
kencho51 Sep 9, 2024
e13e96e
refactor: make rds skip snapshot when destroy
kencho51 Sep 19, 2024
c25466e
feat: create cronjob for sync dropbox
kencho51 Sep 19, 2024
615d834
refactor: create config dir
kencho51 Sep 19, 2024
e3f9ccd
refactor: specify the exact path of the rclone executable in productions
kencho51 Sep 20, 2024
54c7346
refactor: remove the date suffix from the log file
kencho51 Sep 26, 2024
1bf8435
refactor: create tasks for logrotate implementation
kencho51 Sep 26, 2024
0cf9b05
refactor: make the created log file rw to the gigadb group
kencho51 Sep 30, 2024
f4f444b
doc: update change log
kencho51 Sep 30, 2024
1b7e808
refactor: remove extra dist file
kencho51 Oct 23, 2024
79e4190
refactor: add back rclone config file
kencho51 Oct 23, 2024
418572e
refactor: make the log dir name consistent
kencho51 Oct 23, 2024
62709ac
Revert "refactor: make rds skip snapshot when destroy"
kencho51 Oct 23, 2024
5784aa8
refactor: update the test after changing the log dir name
kencho51 Oct 23, 2024
b208b5d
refactor: update the config path
kencho51 Oct 23, 2024
7ca6611
refactor: make sure get the rclone profile name for the upsteam live
kencho51 Oct 24, 2024
9e6ffc4
refactor: remove extra task for change the dir permission
kencho51 Oct 25, 2024
1 change: 1 addition & 0 deletions CHANGELOG.md
@@ -5,6 +5,7 @@ The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/).

## Unreleased

- Feat #2000: Keep user dropboxes in the alternative environment in sync with live
- Fix #1838: switch datepicker format to yyy-mm-dd
- Feat #1768: Alphabetically sorted dataset author dropdown options in adminDatasetAuthor form
- Fix #1843: Add top margin to table footer in dataset page
4 changes: 4 additions & 0 deletions gigadb/app/tools/sync-dropbox/.gitignore
@@ -0,0 +1,4 @@
log
data

.env
23 changes: 23 additions & 0 deletions gigadb/app/tools/sync-dropbox/config-sources/env.example
@@ -0,0 +1,23 @@

### GitLab Private Token ######################################################
# This token allows access to the secret variables stored on GitLab.com for
# logins, passwords, api keys and secrets required by the project. If you don't
# have one, generate one from your user account settings on GitLab.com.

#GITLAB_PRIVATE_TOKEN=

### GitLab API URLs ###########################################################
# The urls can be different for each environment and forks. Group variables are
# supplied by Gigadb and set at group level (gigascience, forks and upstream) in
# GitLab. Project variables are set at a project level by the owner of the fork.
# The variables will be merged into a .secrets file.
# Replace <Your fork name here> with url-encoded name of the fork

REPO_NAME= <Your fork name here>
CI_PROJECT_URL="https://gitlab.com/api/v4/projects/gigascience/forks/$REPO_NAME/"
GROUP_VARIABLES_URL="https://gitlab.com/api/v4/groups/gigascience/variables?per_page=100"
FORK_VARIABLES_URL="https://gitlab.com/api/v4/groups/3501869/variables"
PROJECT_VARIABLES_URL="https://gitlab.com/api/v4/projects/gigascience%2Fforks%2F$REPO_NAME/variables"
MISC_VARIABLES_URL="https://gitlab.com/api/v4/projects/gigascience%2Fcnhk-infra/variables"

GIGADB_ENV=dev
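
The template above asks for the url-encoded name of the fork. A minimal sketch of how the encoded project path in `PROJECT_VARIABLES_URL` could be produced follows. `encode_project_path` is a hypothetical helper that is not part of this PR, `my-fork` is a placeholder fork name, and the helper only escapes `/`, which is all the path form used in these URLs needs.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of this PR): encode a GitLab project path for
# /api/v4/projects/<path>/... URLs by escaping each "/" as "%2F".
encode_project_path() {
  printf '%s\n' "$1" | sed 's#/#%2F#g'
}

# "my-fork" is a placeholder fork name used only for illustration.
REPO_NAME="my-fork"
PROJECT_VARIABLES_URL="https://gitlab.com/api/v4/projects/$(encode_project_path "gigascience/forks/${REPO_NAME}")/variables"
echo "${PROJECT_VARIABLES_URL}"
```

Fork names containing other reserved characters would need a fuller encoder than this one.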
17 changes: 17 additions & 0 deletions gigadb/app/tools/sync-dropbox/config/rclone.conf
@@ -0,0 +1,17 @@
[production-staging]
type = sftp
host = bastion-stg.gigadb.host
user = centos
key_file = ~/.ssh/id-rsa-aws-hk-gigadb.pem
shell_type = unix
md5sum_command = md5sum
sha1sum_command = sha1sum

[production-live]
type = sftp
host = bastion.gigadb.host
user = centos
key_file = ~/.ssh/id-rsa-aws-hk-gigadb.pem
shell_type = unix
md5sum_command = md5sum
sha1sum_command = sha1sum
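
The sync script in this PR reads the remote name from this file by line number (`NR==1` and `NR==10`), which depends on the layout staying exactly as shown. As a sketch of an alternative, the section headers can be read by pattern instead. `list_remotes` and `has_remote` are hypothetical helpers, and the demo runs against a throwaway file rather than the real config.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print every remote (section) name found in an rclone
# config file. Section headers look like "[name]"; the brackets are stripped.
list_remotes() {
  sed -n 's/^\[\(.*\)]$/\1/p' "$1"
}

# Hypothetical helper: succeed only if the named remote is defined in the file.
has_remote() {
  list_remotes "$1" | grep -qxF -- "$2"
}

# Demo against a throwaway file holding the two section headers from this diff.
demo_conf="$(mktemp)"
printf '[production-staging]\ntype = sftp\n\n[production-live]\ntype = sftp\n' > "${demo_conf}"
list_remotes "${demo_conf}"
```

With this approach the script could check `has_remote "${RCLONE_CONF}" production-live` before syncing, regardless of where the section sits in the file.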
23 changes: 23 additions & 0 deletions gigadb/app/tools/sync-dropbox/configure
@@ -0,0 +1,23 @@
#!/usr/bin/env bash

# load the library of configuration functions
source ../../configurator/src/dotfiles.sh

# bail out upon error
set -e

# bail out if an unset variable is used
set -u

# set up the application source
appSource=.

set -a
makeDotEnv $appSource
set +a

# report which environment was loaded from .env
echo "Current environment: $GIGADB_ENV"

# fetch the upstream SSH private key from the GitLab variables and save it where rclone.conf expects it
curl -s --header "PRIVATE-TOKEN: $GITLAB_PRIVATE_TOKEN" "$MISC_VARIABLES_URL/id_rsa_aws_hk_gigadb_pem" | jq -r .value > ~/.ssh/id-rsa-aws-hk-gigadb.pem
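
The last line of the script redirects the key straight into `~/.ssh`, so the file gets whatever permissions the current umask allows, and an empty or `null` API response is written as if it were a key. A minimal sketch of a stricter write step follows. `save_private_key` is a hypothetical helper that is not in the PR, and the demo feeds it placeholder text rather than a real key.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not in the PR): write key material from stdin to a file
# readable only by its owner, and fail if nothing usable was received.
save_private_key() {
  dest="$1"
  content="$(cat)"
  # jq -r prints "null" when the requested field is missing
  if [ -z "${content}" ] || [ "${content}" = "null" ]; then
    echo "No key material received for ${dest}" >&2
    return 1
  fi
  # The subshell keeps the restrictive umask from leaking into the caller
  (umask 077 && printf '%s\n' "${content}" > "${dest}")
  chmod 600 "${dest}"
}

# Demo with placeholder content; in configure, stdin would be the curl | jq output.
demo_key="$(mktemp)"
printf 'placeholder-key\n' | save_private_key "${demo_key}"
ls -l "${demo_key}" | cut -c1-10
```

In the configure script the helper would sit at the end of the existing pipeline, with `~/.ssh/id-rsa-aws-hk-gigadb.pem` as its argument.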
103 changes: 103 additions & 0 deletions gigadb/app/tools/sync-dropbox/scripts/sync_dropbox.sh
@@ -0,0 +1,103 @@
#!/usr/bin/env bash

# stop the script upon error
set -e

# hosts whose name contains "compute" use the deployed config; elsewhere use the local dev config
if [[ $(uname -n) =~ compute ]]; then
  source "./.bash_profile"
  RCLONE_CONF='/etc/sync_dropbox/rclone.conf'
else
  source "./.env"
  RCLONE_CONF='config/rclone.conf'
fi

usage_message="Usage: $0 <option> \n

Available Option:
--dry-run Do a trial run
--apply Escape dry run mode

Example usages:
$0 --dry-run
$0 --apply"

if [ $# -eq 0 ]; then
  echo -e "$usage_message"
  exit 1
fi

# Setup logging
function set_up_logging() {
  if [[ $(uname -n) =~ compute ]]; then
    LOGFILE="/var/log/gigadb/sync_dropbox.log"
  else
    currentPath=$(pwd)
    LOGDIR="$currentPath/log"
    LOGFILE="$LOGDIR/sync_dropbox.log"
    mkdir -p "${LOGDIR}"
    touch "${LOGFILE}"

    SHAREDATADIR="data/share/dropbox"
    mkdir -p "${SHAREDATADIR}"
  fi
}


# rclone sync is executed in dry run mode as default
dry_run=true

while [[ $# -gt 0 ]]; do
  case "$1" in
    --dry-run)
      dry_run=true
      ;;
    --apply)
      dry_run=false
      ;;
    *)
      echo "Invalid option: $1"
      exit 1
      ;;
  esac
  shift
done

function start_sync () {
  # Determine the rclone sync command based on the environment
  case "${GIGADB_ENV}" in
    dev)
      ENV=$(awk 'NR==1 {print $1}' "${RCLONE_CONF}" | tr -d '[]')
      echo -e "$(date +'%Y/%m/%d %H:%M:%S') INFO : Start sync dropbox from ${ENV} to alt ${GIGADB_ENV}" | tee -a "${LOGFILE}"
      rclone_sync_cmd="rclone sync production-staging:/share/dropbox/ ${SHAREDATADIR}"
      ;;
    staging)
      ENV=$(awk 'NR==1 {print $1}' "${RCLONE_CONF}" | tr -d '[]')
      echo -e "$(date +'%Y/%m/%d %H:%M:%S') INFO : Start sync dropbox from ${ENV} to alt ${GIGADB_ENV}" | tee -a "${LOGFILE}"
      rclone_sync_cmd="/usr/local/bin/rclone sync production-staging:/share/dropbox/ /share/dropbox"
      ;;
    live)
      ENV=$(awk 'NR==10 {print $1}' "${RCLONE_CONF}" | tr -d '[]')
      echo -e "$(date +'%Y/%m/%d %H:%M:%S') INFO : Start sync dropbox from ${ENV} to alt ${GIGADB_ENV}" | tee -a "${LOGFILE}"
      rclone_sync_cmd="/usr/local/bin/rclone sync production-live:/share/dropbox/ /share/dropbox"
      ;;
  esac

  # Append options
  [[ "${dry_run}" == true ]] && rclone_sync_cmd+=" --dry-run"
  rclone_sync_cmd+=" --config ${RCLONE_CONF}"
  rclone_sync_cmd+=" --log-file ${LOGFILE} --log-level INFO --stats-log-level DEBUG"

  # Execute command. Capture the exit code with "||" so that "set -e" does not
  # abort the script before a failed sync can be logged below.
  rclone_sync_exit_code=0
  eval "${rclone_sync_cmd}" || rclone_sync_exit_code=$?

  echo "$(date +'%Y/%m/%d %H:%M:%S') INFO : Executed: ${rclone_sync_cmd}" | tee -a "$LOGFILE"
  if [ ${rclone_sync_exit_code} -eq 0 ]; then
    echo -e "$(date +'%Y/%m/%d %H:%M:%S') INFO : Successfully sync dropbox from ${ENV} to alt ${GIGADB_ENV}" | tee -a "${LOGFILE}"
  else
    echo -e "$(date +'%Y/%m/%d %H:%M:%S') ERROR : Problem with the sync - rclone has exit code: ${rclone_sync_exit_code}" | tee -a "${LOGFILE}"
  fi

  # Propagate rclone's exit code so callers (cron, bats) still see a failure
  return "${rclone_sync_exit_code}"
}

set_up_logging
start_sync
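
The script builds the rclone command as a single string and runs it through `eval`, which would split any path containing spaces. As a sketch of an alternative, the same command can be assembled as a bash array. `build_sync_cmd` and `sync_cmd` are hypothetical names, the arguments mirror the variables used in the script, and the demo prints the command instead of running rclone.

```shell
#!/usr/bin/env bash
# Sketch only: assemble the rclone invocation as an array rather than a string
# for eval. build_sync_cmd is a hypothetical helper that fills the global
# array sync_cmd; its arguments mirror the variables used in the script.
build_sync_cmd() {
  local source_remote="$1" dest_dir="$2" conf="$3" logfile="$4" dry_run="$5"
  sync_cmd=(rclone sync "${source_remote}" "${dest_dir}")
  if [[ "${dry_run}" == true ]]; then
    sync_cmd+=(--dry-run)
  fi
  sync_cmd+=(--config "${conf}" --log-file "${logfile}" --log-level INFO --stats-log-level DEBUG)
}

build_sync_cmd "production-staging:/share/dropbox/" "data/share/dropbox" \
  "config/rclone.conf" "log/sync_dropbox.log" true

# Print the command instead of running it; "${sync_cmd[@]}" on its own line
# would execute it with each element passed as exactly one argument.
printf '%s ' "${sync_cmd[@]}"
echo
```

The logged "Executed:" line could then be produced from `"${sync_cmd[*]}"`, which keeps the log format the bats tests look for.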
59 changes: 59 additions & 0 deletions gigadb/app/tools/sync-dropbox/tests/bats/sync_dropbox.bats
@@ -0,0 +1,59 @@
teardown () {
  echo "executing teardown code"

  rm -rf log/sync_dropbox.log
  rm -rf data/share/dropbox/*
}

@test "No parameter provided" {
  run scripts/sync_dropbox.sh
  [ "$status" -eq 1 ]
  [[ "$output" =~ "Usage: scripts/sync_dropbox.sh <option>" ]]
  [[ "$output" =~ "Available Option:" ]]
  [[ "$output" =~ "--dry-run Do a trial run" ]]
  [[ "$output" =~ "--apply Escape dry run mode" ]]
  [[ "$output" =~ "Example usages:" ]]
  [[ "$output" =~ "scripts/sync_dropbox.sh --dry-run" ]]
  [[ "$output" =~ "scripts/sync_dropbox.sh --apply" ]]
}

@test "Execute in dry run mode" {
  run scripts/sync_dropbox.sh --dry-run
  [ "$status" -eq 0 ]

  expected_lines=(
    "INFO : Start sync dropbox from production-staging to alt dev"
    "NOTICE: rija_test.txt: Skipped copy as --dry-run is set (size 166)"
    "NOTICE: user27: Skipped set directory modification time as --dry-run is set"
    "INFO : Executed: rclone sync production-staging:/share/dropbox/ data/share/dropbox --dry-run --config config/rclone.conf"
    "INFO : Successfully sync dropbox from production-staging to alt dev"
  )

  # Check the log
  for line in "${expected_lines[@]}"; do
    run grep -F "$line" log/sync_dropbox.log
    [ "$status" -eq 0 ]
  done
}

@test "Execute in apply mode" {
  run scripts/sync_dropbox.sh --apply
  [ "$status" -eq 0 ]

  expected_lines=(
    "INFO : Start sync dropbox from production-staging to alt dev"
    "INFO : user27/.dotfiles.txt: Copied (new)"
    "INFO : rija_test.txt: Copied (new)"
    "INFO : user0/change.log: Copied (new)"
    "INFO : user4/brassicaceae_NCBI/amas.tar.gz: Copied (new)"
    "INFO : user109: Set directory modification time (using DirSetModTime)"
    "INFO : Executed: rclone sync production-staging:/share/dropbox/ data/share/dropbox --config config/rclone.conf"
    "INFO : Successfully sync dropbox from production-staging to alt dev"
  )

  # Check the log
  for line in "${expected_lines[@]}"; do
    run grep -F "$line" log/sync_dropbox.log
    [ "$status" -eq 0 ]
  done
}
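
Both log checks above repeat the same grep loop, and a failure does not say which expected line was missing. A minimal sketch of a shared helper follows. `assert_log_contains` is a hypothetical name and the demo uses a throwaway log file; in the bats file it would be called with `log/sync_dropbox.log` and `"${expected_lines[@]}"`.

```shell
#!/usr/bin/env bash
# Hypothetical helper for the bats file: check that every expected line occurs
# in the log, and name the first one that does not.
assert_log_contains() {
  logfile="$1"
  shift
  for expected in "$@"; do
    if ! grep -qF -- "${expected}" "${logfile}"; then
      echo "Missing from ${logfile}: ${expected}" >&2
      return 1
    fi
  done
}

# Demo with a throwaway log file
demo_log="$(mktemp)"
printf '%s\n' "INFO : Start sync dropbox from production-staging to alt dev" > "${demo_log}"
assert_log_contains "${demo_log}" "Start sync dropbox" && echo "all expected lines found"
```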
121 changes: 113 additions & 8 deletions ops/infrastructure/bastion_playbook.yml
@@ -81,6 +81,55 @@
group:
when: backup_file

- name: Create and update logrotate service
  hosts: name_bastion_server_{{gigadb_env}}*
  become: yes
  tags:
    - logrotate

  tasks:
    - name: Create gigadb group if it does not exist
      ansible.builtin.group:
        name: gigadb
        state: present

    - name: Add centos user to gigadb group
      ansible.builtin.user:
        name: centos
        groups: gigadb
        append: yes

    - name: Create directory for storing log output
      ansible.builtin.file:
        path: /var/log/gigadb
        state: directory
        mode: '0755'
        owner: root
        group: gigadb

    - name: Create logrotate config file for gigadb services
      ansible.builtin.file:
        path: /etc/logrotate.d/gigadb
        state: touch
        owner: root
        group: root
        mode: '0644'

    - name: Add in configuration
      ansible.builtin.blockinfile:
        path: /etc/logrotate.d/gigadb
        block: |
          /var/log/gigadb/*.log {
            daily
            missingok
            rotate 7
            compress
            delaycompress
            notifempty
            create 0660 root gigadb
            dateformat -%Y%m%d
          }

- name: Setup files-url-updater so to load latest DB backup in RDS daily
hosts: name_bastion_server_{{gigadb_env}}*
tags:
@@ -394,14 +443,6 @@
        group: centos
        mode: 0644

    - name: Create directory for storing log output
      ansible.builtin.file:
        path: /var/log/gigadb
        state: directory
        mode: '0777'
        owner: centos
        group: centos

    - name: Create a .aws directory for the centos user
      ansible.builtin.file:
        path: "/home/centos/.aws"
@@ -426,6 +467,70 @@
        group: centos
        mode: a+x

- name: Setup sync dropbox tool
  hosts: name_bastion_server_{{gigadb_env}}*
  tags:
    - sync-dropbox-tool

  tasks:
    - name: Create dir for storing rclone config
      ansible.builtin.file:
        path: /etc/sync_dropbox
        state: directory
        owner: centos
        group: centos
        mode: '0777'

    - name: Copy rclone config for dropbox sync
      ansible.builtin.copy:
        src: "../../../../gigadb/app/tools/sync-dropbox/config/rclone.conf"
        dest: /etc/sync_dropbox/rclone.conf
        owner: centos
        group: centos
        mode: '0644'

    - name: Create log file
      ansible.builtin.file:
        path: /var/log/gigadb/sync_dropbox.log
        state: touch
        owner: root
        group: gigadb
        mode: '0664'

    - name: Copy the wrapper script to sync dropbox from upstream to alt
      ansible.builtin.copy:
        src: "../../../../gigadb/app/tools/sync-dropbox/scripts/sync_dropbox.sh"
        dest: /usr/local/bin/sync_dropbox
        owner: centos
        group: centos
        mode: a+x

    - name: Get the upstream private key
      ansible.builtin.uri:
        url: "{{ gitlab_misc_url }}/variables/id_rsa_aws_hk_gigadb_pem"
        method: GET
        headers:
          PRIVATE-TOKEN: "{{ gitlab_private_token }}"
        body_format: json
      register: private_key_from_gl

    - name: Copy the upstream private key
      ansible.builtin.copy:
        content: "{{ private_key_from_gl.json.value }}"
        dest: "/home/centos/.ssh/id-rsa-aws-hk-gigadb.pem"
        owner: centos
        group: centos
        mode: g-rw,o-rw

    - name: Set up cronjob to execute the wrapper script
      ansible.builtin.cron:
        name: "Sync efs dropbox from upstream to alt upstream"
        minute: "00"
        hour: "11"
        user: "centos"
        job: "/usr/local/bin/sync_dropbox --apply"


- name: Set up and configuration of rclone on bastion server
  hosts: name_bastion_server_{{gigadb_env}}*
  tags:
1 change: 1 addition & 0 deletions ops/infrastructure/inventories/hosts
@@ -19,6 +19,7 @@ grafana_contact_smtp_from_name = "{{ lookup('ini', 'grafana_contact_smtp_from_na
[type_aws_instance:vars]

gitlab_url = "https://gitlab.com/api/v4/projects/{{ lookup('ini', 'gitlab_project type=properties file=ansible.properties') | urlencode | regex_replace('/','%2F') }}"
gitlab_misc_url = "https://gitlab.com/api/v4/projects/{{ lookup('ini', 'gitlab_misc type=properties file=ansible.properties') | urlencode | regex_replace('/','%2F') }}"
ansible_ssh_private_key_file = "{{ lookup('ini', 'ssh_private_key_file type=properties file=ansible.properties') }}"
ansible_ssh_common_args="-o ProxyCommand='ssh -W %h:%p -q {{ lookup('ini', 'ec2_bastion_login_account type=properties file=ansible.properties') }} -i {{ lookup('ini', 'ssh_private_key_file type=properties file=ansible.properties') }}'"
ansible_user = "centos"