try out BackBlaze integration #270

Closed
3 tasks done
rahulbot opened this issue Apr 3, 2024 · 8 comments
Comments

@rahulbot
Contributor

rahulbot commented Apr 3, 2024

Initial calculations show switching some offsite backups to BackBlaze could save 60-70% of ongoing monthly storage costs. We need to test it out to see how it performs on something low-risk and redundant. Candidates:

  • add a 2nd backup via Dokku for the web or rss-fetcher Postgres DBs (is this possible?)
  • do a manual backup of a recently ILM-rolled-over ES index

This requires:

  • check if dokku allows multiple backups of PG
  • check if dokku supports BackBlaze (or if it can "speak" s3 DSNs)
  • create BackBlaze account and credentials and store to BitWarden

(split off from #260)

@rahulbot rahulbot added enhancement New feature or request infrastructure labels Apr 3, 2024
@rahulbot rahulbot added this to the Production Beta 5 milestone Apr 3, 2024
@rahulbot rahulbot self-assigned this Apr 3, 2024
@rahulbot rahulbot changed the title from "test out BackBlaze integration" to "try out BackBlaze integration" Apr 3, 2024
@rahulbot
Contributor Author

rahulbot commented Apr 8, 2024

I created a trial account for us to test with. Credentials and an application key are in our password manager. I created a "mediacloud-test" bucket to use for testing.

BackBlaze does have an S3-compatible API.

@rahulbot
Contributor Author

rahulbot commented Apr 8, 2024

It doesn't look like Dokku supports multiple backup destinations. See the postgres plugin docs and an open enhancement issue requesting this feature.

@rahulbot
Contributor Author

Decided to try out BackBlaze via either (a) daily RSS file posting or (b) daily archive WARCs.

@philbudne
Contributor

Opened PR to allow archiver to write to both S3 and B2 at same time, and to support both s3://bucket/prefix and b2://bucket/prefix command line arguments to Queuer subclasses: #279

Needed:

  • add config vars to {dev,staging,prod}.sh files, deploy.sh, docker-compose.yml.j2
  • figure out backblaze access control
  • create B2 buckets
  • decide how to structure access (how many different keys to create)
  • rss-fetcher can sync to either S3, B2 or both using variation of command line args to aws s3 sync commands
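A minimal sketch of how the s3://bucket/prefix and b2://bucket/prefix arguments could be handled, and how the same `aws s3 sync` command could target either service. This is not the code from the PR: `parse_destination`, `sync_argv`, and the `b2_region` parameter are hypothetical names, and the endpoint pattern is taken from the `backup-auth` command elsewhere in this thread.

```python
from typing import NamedTuple
from urllib.parse import urlsplit


class Destination(NamedTuple):
    scheme: str  # "s3" or "b2"
    bucket: str
    prefix: str  # may be empty


def parse_destination(url: str) -> Destination:
    """Split s3://bucket/prefix or b2://bucket/prefix into its parts."""
    parts = urlsplit(url)
    if parts.scheme not in ("s3", "b2"):
        raise ValueError(f"unsupported scheme in {url!r}")
    if not parts.netloc:
        raise ValueError(f"missing bucket name in {url!r}")
    return Destination(parts.scheme, parts.netloc, parts.path.lstrip("/"))


def sync_argv(local_dir: str, url: str, b2_region: str = "") -> list[str]:
    """Build an `aws s3 sync` command line for an S3 or B2 destination.

    B2 is reached through its S3-compatible API, so the target is still
    written as s3://...; only the endpoint URL changes.
    """
    dest = parse_destination(url)
    argv = ["aws", "s3", "sync", local_dir, f"s3://{dest.bucket}/{dest.prefix}"]
    if dest.scheme == "b2":
        if not b2_region:
            raise ValueError("b2:// destinations need a B2 region")
        # Assumption: endpoint follows https://s3.REGION.backblazeb2.com
        argv += ["--endpoint-url", f"https://s3.{b2_region}.backblazeb2.com"]
    return argv


if __name__ == "__main__":
    # "us-west-004" is only an example region value.
    print(sync_argv("output", "b2://mediacloud-test/rss", "us-west-004"))
```

Syncing to both services would then mean running the command twice, once per destination, with credentials for each supplied through separate AWS CLI profiles or environment variables.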

@philbudne
Contributor

Was able to use dokku postgres:backup command to back up staging-rss-fetcher PG DB after doing:
dokku postgres:backup-auth staging-rss-fetcher KEY_ID SECRET REGION s3v4 https://s3.REGION.backblazeb2.com
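For scripting this, the two Dokku invocations could be assembled as below. This is a sketch only: the helper names are hypothetical, the argument order for `postgres:backup-auth` mirrors the command above, and `postgres:backup` is assumed to take the service name followed by the bucket name. KEY_ID, SECRET and REGION are left as placeholders.

```python
import shlex


def b2_endpoint(region: str) -> str:
    """S3-compatible endpoint URL for a B2 region (pattern from the command above)."""
    return f"https://s3.{region}.backblazeb2.com"


def backup_auth_argv(service: str, key_id: str, secret: str, region: str) -> list[str]:
    """Arguments for `dokku postgres:backup-auth`, pointing the service at B2."""
    return [
        "dokku", "postgres:backup-auth", service,
        key_id, secret, region,
        "s3v4",               # signature version
        b2_endpoint(region),  # endpoint URL overrides the AWS default
    ]


def backup_argv(service: str, bucket: str) -> list[str]:
    """Arguments for a one-off `dokku postgres:backup` run to the given bucket."""
    return ["dokku", "postgres:backup", service, bucket]


if __name__ == "__main__":
    # Printed rather than executed, so the commands can be reviewed first.
    print(shlex.join(backup_auth_argv("staging-rss-fetcher", "KEY_ID", "SECRET", "REGION")))
    print(shlex.join(backup_argv("staging-rss-fetcher", "mediacloud-test")))
```

Passing the result to `subprocess.run(argv, check=True)` on the Dokku host would execute it; the secret then appears in the process list, so reading it from the password manager at run time is preferable to hard-coding it.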

@philbudne
Contributor

Looking out for backblaze pitfalls, I went looking for API rate limits. Answer is hazy:
https://www.reddit.com/r/backblaze/comments/15j7nwt/b2_rate_limit/
says:

The docs says there's a rate limit, but it does not specify how much. Does that rate limit applies to ListObjects only? Or also to Get/Put/Delete/Copy...?

From what I've seen some people talking, the API limit is 1000/5 minutes. Seems too little to download/delete the data if you have a large amount

A backblaze person made it clear they don't publish any limit:

Christopher from the Backblaze team here ->

We do not publish rate limits in order to help mitigate bad actors. If you believe you are being rate limited, feel free to reach out to our support team
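Since the limits are unpublished, one defensive option is to retry throttled calls with exponential backoff and jitter. The sketch below is generic and not tied to any B2 client: `RateLimited` is a hypothetical exception that the caller would raise when it sees whatever throttling response the service returns, and the attempt count and delays are arbitrary starting points.

```python
import random
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class RateLimited(Exception):
    """Hypothetical: raised by the caller's code when a request is throttled."""


def with_backoff(
    call: Callable[[], T],
    attempts: int = 5,
    base_delay: float = 1.0,
    max_delay: float = 60.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Run call(), retrying on RateLimited with exponentially growing delays."""
    for attempt in range(attempts):
        try:
            return call()
        except RateLimited:
            if attempt == attempts - 1:
                raise  # out of retries, let the caller see the failure
            # Delay doubles each attempt, capped at max_delay; jitter keeps
            # parallel workers from retrying in lockstep.
            delay = min(max_delay, base_delay * 2 ** attempt)
            sleep(delay * random.uniform(0.5, 1.0))
    raise ValueError("attempts must be at least 1")
```

The `sleep` parameter exists so the delays can be stubbed out in tests.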

@rahulbot
Contributor Author

Update: switching RSS Postgres backups from S3 to B2.

@rahulbot
Contributor Author

Closing as done. Other switches will be their own issues.
