How are you using Polo? #19

Open
nettofarah opened this issue Nov 30, 2015 · 7 comments
@nettofarah
Contributor

I'm curious to know what cool stuff people are using Polo for.

  • What environments are you generating data for?
  • What version of Rails?
  • How big is your final .sql file?
  • What database are you using?
  • How many database tables is Polo touching?
  • How hard was it to find a good sample size?
  • Are you using any advanced features?
  • How are you running Polo in prod? rake task, rails runner, rails console, etc.
  • Do you have an automated process to generate the files?
  • How do you transfer your data across environments? Publishing artifacts, rsync, scp...

No need to answer everything, but I would love to know how people are using the library so we know what to prioritize.

Thank you for using Polo <3

@khani3s

khani3s commented Dec 2, 2015

We're going to use it to let customers dump their data from our web application (www.runrun.it).
Rails 3.2
Depends on customer account size.
PostgreSQL
10 tables
We are running as a background job
We upload the data to S3 and generate a link.

@nettofarah
Contributor Author

@khani3s that is really cool!
Good luck.

@belt-ascendlearning

  • What environments are you generating data for? Depends on the task.
  • What version of Rails? rails-api-0.4.0
  • How big is your final .sql file? Varies on the task.
  • What database are you using? mysql-enterprise-5.6.25
  • How many database tables is Polo touching? Varies on the task.
  • How hard was it to find a good sample size? N/A
  • Are you using any advanced features? Not officially.
  • How are you running Polo in prod? rake task, rails runner, rails console, etc. Not yet.
  • Do you have an automated process to generate the files? https://gist.github.com/belt/c8f7b1c45834ce6fa485
  • How do you transfer your data across environments? Publishing artifacts, rsync, scp... rsync & scp

@craigmcnamara
Contributor

  • Generating compact and scrubbed DBs for development environments.
  • Our smallest working set is an 8k .sql file, but it's growing slowly.
  • MySQL DB
  • We're using Rails 3.2 but migrating to Rails 4.2+.
  • Getting good data for a full runnable db subset was rather easy.
  • I was using the obfuscation features mentioned in pull request #28 ("Only obfuscate if the instance has the field in question").
  • Running polo in production soon with rake
  • Fully automated and pushes to S3

Polo is great, thanks!
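
The idea in that pull request title, obfuscating a field only when the record actually has it, can be sketched in plain Ruby. This is not Polo's implementation: `scrub` is a hypothetical helper, records are plain hashes rather than ActiveRecord instances, and shuffling characters is just one possible scrambling strategy.

```ruby
# Hypothetical helper: scramble only the listed fields that a record
# actually carries. Not Polo's code; records here are plain hashes.
def scrub(record, fields)
  record.each_with_object({}) do |(key, value), out|
    # Untargeted fields and non-string values pass through unchanged.
    targeted = fields.include?(key) && value.is_a?(String)
    out[key] = targeted ? value.chars.shuffle.join : value
  end
end

user = { id: 1, email: "jane@example.com" }
tag  = { id: 7, name: "ruby" } # no :email key, so nothing is touched

scrubbed_user = scrub(user, [:email])
scrubbed_tag  = scrub(tag, [:email])
```

Because the helper iterates over the keys the record has, asking for `:email` on a record without that field is a no-op rather than an error.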

@volkanunsal
Contributor

volkanunsal commented May 4, 2016

  • What environments are you generating data for?

Staging and development

  • What version of Rails?

4.2.3

  • How big is your final .sql file?

50MB

  • What database are you using?

Postgres

  • How many database tables is Polo touching?

15

  • How hard was it to find a good sample size?

A little hard, because one model in particular, Project, can belong to two separate models. When I first used Polo, I picked a random sample of each table and specified the Project dependency in only one of them, and as a result ended up with quite a few orphaned records. I don't know the best way to sample so that none of the dependent records are left out.

  • Are you using any advanced features?

Not yet.

  • How are you running Polo in prod? rake task, rails runner, rails console, etc.

Rake task.

  • Do you have an automated process to generate the files?

Just the rake task. I run it manually right now.

  • How do you transfer your data across environments? Publishing artifacts, rsync, scp...

None of the above. I'm just connecting to the staging DB and running the SQL script.
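
One way to catch the orphaned-record problem described above before loading a sample is to check every foreign key against the sampled parent rows. The sketch below runs on plain hashes with no database; the table names, the `client_id`/`team_id` columns, and the `orphans` helper are all hypothetical and only stand in for a Project that belongs to two parents.

```ruby
# Sampled rows as plain hashes (hypothetical data, no database involved).
sample = {
  clients:  [{ id: 1 }],
  teams:    [{ id: 10 }],
  projects: [
    { id: 100, client_id: 1, team_id: 10 },
    { id: 101, client_id: 2, team_id: 10 } # client 2 was not sampled
  ]
}

# Foreign keys to verify: child table => { fk column => parent table }.
foreign_keys = { projects: { client_id: :clients, team_id: :teams } }

# Returns [child_table, row_id, fk_column] for each dangling reference.
def orphans(sample, foreign_keys)
  foreign_keys.flat_map do |child, refs|
    refs.flat_map do |column, parent|
      parent_ids = sample.fetch(parent).map { |row| row[:id] }
      sample.fetch(child)
            .reject { |row| row[column].nil? || parent_ids.include?(row[column]) }
            .map { |row| [child, row[:id], column] }
    end
  end
end

dangling = orphans(sample, foreign_keys)
```

Here `dangling` lists project 101, whose `client_id` points at a client that is missing from the sample. Sampling from the child side and following both parent associations avoids the gap in the first place.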

@bessey

bessey commented May 29, 2016

I've just open sourced a tool we've been using internally for a month or two largely built around Polo, called Brillo.

It uses Polo to make prod DB scrubs, uploads them to S3, downloads them to dev machines, and loads them into the DB.

It used to take us over an hour to load a "lightweight" copy of our DB with 4% of our biggest tables on a dev machine. Now with Polo we just take the last 1000 records from a few tables, and crawl their associations. Loads are down to < 10 minutes.
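
A minimal sketch of the "take recent records and crawl their associations" approach, assuming the `Polo.explore(model, id, associations)` call from Polo's README, which returns an array of INSERT statements. The `Post` model, its `:author` association, and the `sample_inserts` helper are hypothetical, and this is not how Brillo itself is implemented. The explorer is injected so the sketch can be dry-run without a database.

```ruby
# Collect INSERT statements for a batch of seed ids. `explore` is injected so
# the sketch runs without a database; the default assumes the Polo gem is
# loaded and a hypothetical Post model with an :author association exists.
def sample_inserts(ids, explore: ->(id) { Polo.explore(Post, id, :author) })
  # A parent shared by several seeds (one author, many posts) would be
  # emitted more than once, so duplicates are dropped.
  ids.flat_map { |id| explore.call(id) }.uniq
end

# In a Rails app the seed ids might come from something like:
#   ids = Post.order(id: :desc).limit(1000).pluck(:id)

# Stand-in explorer for a dry run:
fake = ->(id) { ["INSERT post #{id}", "INSERT author #{id % 2}"] }
statements = sample_inserts([3, 2, 1], explore: fake)
```

In the dry run, posts 3 and 1 share an author, so that author's statement appears only once in `statements`.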

Thank you for making an awesome gem, and getting me to contribute to open source myself for the first time ever :)

@jdelStrother
Contributor

  • What environments are you generating data for?
    Development
  • What version of Rails?
    4.2.7
  • How big is your final .sql file?
    ~ 100MB unzipped
  • What database are you using?
    MySQL
  • How many database tables is Polo touching?
    28
  • How hard was it to find a good sample size?
    Reasonably easy. By default we only fetch a fairly shallow copy of recent data (e.g. we find recent posts and copy those and the posts' authors, but not the posts' authors' followers or other associated metadata). Devs can manually specify individual entities when they do want to get a deep copy of data. Otherwise the dataset ballooned in size very quickly.
  • Are you using any advanced features?
    Don't think so.
  • How are you running Polo in prod? rake task, rails runner, rails console, etc.
    A capistrano task, which then uses rails-runner to generate the dump on a remote machine, e.g.:
    • cap production db:fetch FETCH=all to get a relatively shallow copy of recent data
    • cap production db:fetch FETCH=user=jon to get a deep copy of a specific entity & associations.
  • Do you have an automated process to generate the files?
    No, we just run capistrano by hand, on demand.
  • How do you transfer your data across environments? Publishing artifacts, rsync, scp...
    Capistrano (so basically scp)
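
The `FETCH=all` versus `FETCH=user=jon` switch above could be interpreted along these lines. This is only a guess at the shape: the thread does not show the real task, and `parse_fetch` and its return format are hypothetical.

```ruby
# Hypothetical parser for a FETCH value such as "all" or "user=jon".
# Treating a missing value as shallow is an assumption, based on the
# comment that a shallow copy is the default.
def parse_fetch(value)
  return { mode: :shallow } if value.nil? || value == "all"

  kind, key = value.split("=", 2)
  if key.nil? || key.empty?
    raise ArgumentError, "expected all or kind=key, got #{value.inspect}"
  end

  { mode: :deep, kind: kind, key: key }
end

shallow_plan = parse_fetch("all")
deep_plan    = parse_fetch("user=jon")
```

A runner script could read `ENV["FETCH"]`, pass it through such a parser, and then decide which associations to crawl for the requested entity.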
