monthly offsite backup to Amazon glacier #3
Comments
It would probably make sense to plan the transfer to the Amazon cloud so that the data is available both for backups (e.g. in Amazon Glacier) and for computation (e.g. Amazon Elastic MapReduce). |
Agreed. I will get the ball rolling. -Charles
|
We might want to consider duply, which is supposed to make duplicity easier to use. This might be a useful overview: http://blog.phusion.nl/2013/11/11/duplicity-s3-easy-cheap-encrypted-automated-full-disk-backups-for-your-servers/ Once we get duplicity working and see how well it performs, we might want to consider making it our sole backup solution so we don't have to maintain both. |
This sounds like an awesome idea. Do you want me to set up an S3 bucket? -Charles Charles Y. Lin, Ph.D. On Wed, Oct 29, 2014 at 9:45 PM, John DiMatteo [email protected]
|
@bradnerComputation : would you like to create a I'm testing out duplicity now. I just installed duplicity/duply, and I think that was successful but it seems like the unrelated libpam-systemd package is in a broken state. Have you seen this error on TOD before?
I saw this related thread. I guess we can just ignore it for now, since it doesn't seem fixed in Ubuntu Trusty yet. |
@bradnerComputation I finally got duplicity running with a small sample. I don't think a full backup to EC2 of something as big as tod:/grail is feasible. I just measured tod's upload bandwidth to be about 30 megabits/second. Grail is about 12TB, and at that rate it would take about 37 days of uninterrupted upload. Your IT department might not appreciate you using all this upload bandwidth for 37 days either.

We might want to consider instead copying /Grail to 6 or 7 internal 2TB disks and mailing them to Amazon via their import program (http://calculator.s3.amazonaws.com/index.html?s=importexport). You could probably buy the drives for about $500, and I think the Amazon upload cost would be about $200, so around $700 in total (and they would return the drives so you could use them for something else). This would allow us to get the initial backup done, then we could do incremental backups over TOD's internet connection.

This seems like significant hassle -- are we sure Dana Farber IT doesn't have some other offsite solution (e.g. maybe an internal network to storage in another building that could effectively count as offsite)? If going forward we anticipate processing BAMs on EC2, then maybe this hassle is worth it. Please let me know your thoughts. I think last time we discussed this you suggested just trying the backup over the internet and seeing how it goes -- let me know if you'd like me to just start the 30-day transfer.

I successfully followed the instructions at http://blog.phusion.nl/2013/11/11/duplicity-s3-easy-cheap-encrypted-automated-full-disk-backups-for-your-servers/ after figuring out the following:
|
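As a sanity check on the estimate in the comment above (about 12TB at about 30 megabits/second, copied to 2TB drives), here is a minimal sketch of the arithmetic. It assumes decimal units (1TB = 10^12 bytes, 1 megabit = 10^6 bits) and a perfectly sustained upload with no protocol, compression, or encryption overhead; the function names are hypothetical and not part of duplicity or any Amazon tool.

```python
import math

SECONDS_PER_DAY = 86_400


def upload_days(size_tb: float, mbit_per_s: float) -> float:
    """Days needed to upload size_tb (decimal TB) at a sustained mbit_per_s."""
    total_bits = size_tb * 1e12 * 8           # TB -> bytes -> bits
    seconds = total_bits / (mbit_per_s * 1e6)  # megabits/s -> bits/s
    return seconds / SECONDS_PER_DAY


def drives_needed(size_tb: float, drive_tb: float) -> int:
    """Minimum number of drives of drive_tb capacity needed to hold size_tb."""
    return math.ceil(size_tb / drive_tb)


if __name__ == "__main__":
    print(f"upload time: {upload_days(12, 30):.1f} days")
    print(f"2TB drives needed: {drives_needed(12, 2)}")
```

Under these assumptions the upload comes out at roughly 37 days, matching the figure above, and 6 drives is the bare minimum, so a seventh leaves room for filesystem overhead.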
I think I can get 40TB of storage from them. would that be sufficient? -Charles Charles Y. Lin, Ph.D. On Wed, Nov 5, 2014 at 5:03 AM, John DiMatteo [email protected]
|
40TB sounds good. You're talking about 40TB of storage at another building at Dana Farber, right? Have you tested the upload bandwidth to that 40TB? |