Compress with pbzip2 #1

Open

ltworf opened this issue Apr 4, 2014 · 1 comment

ltworf commented Apr 4, 2014

I think that rather than having a compression module, the code could just run an external compression program and then read back from it.

This would allow us to use more diverse tools.

I think we should use pbzip2 for this, since database servers typically have many cores.
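To make the proposal concrete, here is a minimal sketch of piping data through an external compressor and reading the result back. It is an illustration, not this project's code: the function name `compress_stream` and the 64 KiB read size are made up for the example, and the compressor is assumed to read from stdin and write to stdout (as `pbzip2 -c` and `bzip2 -c` do).

```python
import subprocess
import threading


def compress_stream(chunks, command):
    """Feed byte chunks to an external compressor and yield its output.

    `command` is the argv list of a program that reads raw data on stdin
    and writes compressed data on stdout (hypothetical helper, not part
    of the project).
    """
    proc = subprocess.Popen(
        command, stdin=subprocess.PIPE, stdout=subprocess.PIPE
    )

    def feed():
        # Write from a separate thread so that reading the output below
        # cannot deadlock when the pipe buffers fill up.
        try:
            for chunk in chunks:
                proc.stdin.write(chunk)
        except BrokenPipeError:
            pass  # the compressor exited early; the exit code is checked below
        finally:
            proc.stdin.close()

    writer = threading.Thread(target=feed, daemon=True)
    writer.start()

    while True:
        block = proc.stdout.read(64 * 1024)
        if not block:
            break
        yield block

    proc.stdout.close()
    writer.join()
    if proc.wait() != 0:
        raise RuntimeError("compressor exited with status %d" % proc.returncode)
```

With this shape, switching tools only means changing the command, for example `compress_stream(chunks, ["pbzip2", "-c"])` or `compress_stream(chunks, ["bzip2", "-c"])`, assuming the chosen program is installed on the server.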

I ran this experiment:

# asd is a file filled with random data, sized 149MiB
$ time (cat asd | pbzip2 > asd.bz2)

real    0m6.155s
user    0m40.856s
sys     0m0.660s

salvo@vulcano /tmp$ time bzip2 asd 

real    0m21.327s
user    0m20.692s
sys     0m0.156s
salvo@vulcano /tmp$

As you can see, pbzip2 is clearly faster (about 6 s versus 21 s of wall-clock time), even on streamed input and not just on mappable files.
This would reduce the backup time.
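As a rough reading of the timings above, the wall-clock speedup and the average number of cores pbzip2 kept busy can be worked out from the `real` and `user` values. The variable names below are only for illustration.

```python
# Timings copied from the experiment above, in seconds.
pbzip2_real = 6.155
pbzip2_user = 40.856
bzip2_real = 21.327

# Wall-clock speedup of pbzip2 over bzip2 on this file.
speedup = bzip2_real / pbzip2_real

# user/real approximates how many cores were busy on average.
busy_cores = pbzip2_user / pbzip2_real

print("speedup: %.1fx, average busy cores: %.1f" % (speedup, busy_cores))
```

So pbzip2 finished roughly 3.5 times sooner while using about twice the total CPU time of bzip2, which is a reasonable trade on a machine with idle cores.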

Author

ltworf commented Apr 4, 2014

It would also make the compress-test useless.
