Compress the snapshot & commit log #2034

Draft
wants to merge 1 commit into
base: master

mamcx
Contributor

@mamcx mamcx commented Dec 3, 2024

Description of Changes

Compress the snapshot, including the commit log, using Zstd.

Closes #1592 & #1594.

NOTE: This PR is a draft while we await more test data.

API and ABI breaking changes

Previously written uncompressed data remains readable: the reader uses magic bytes to detect which compression algorithm, if any, was used.
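
As an illustration of magic-byte detection, here is a minimal std-only sketch. It is not the code in this PR: `CompressType` and `sniff` are hypothetical names, and it assumes files start with the standard Zstd and LZ4 frame magic numbers, whereas the PR may use its own header layout. Anything that does not match is treated as uncompressed, which is what keeps older data readable.

```rust
use std::io::{self, Cursor, Read};

/// Hypothetical enum, loosely named after the `compressed_type` field
/// shown in the bench output.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum CompressType {
    None,
    Zstd,
    Lz4,
}

/// Zstd frame magic number 0xFD2FB528, stored little-endian.
const ZSTD_MAGIC: [u8; 4] = [0x28, 0xB5, 0x2F, 0xFD];
/// LZ4 frame magic number 0x184D2204, stored little-endian.
const LZ4_MAGIC: [u8; 4] = [0x04, 0x22, 0x4D, 0x18];

/// Peek at up to four leading bytes, classify the stream, and return a
/// reader that replays those bytes so the caller still sees the whole file.
fn sniff<R: Read>(mut inner: R) -> io::Result<(CompressType, impl Read)> {
    let mut head = [0u8; 4];
    let mut filled = 0;
    // `read` may return fewer bytes than requested, so loop until the
    // buffer is full or the input ends.
    while filled < head.len() {
        let n = inner.read(&mut head[filled..])?;
        if n == 0 {
            break;
        }
        filled += n;
    }
    let seen = &head[..filled];
    let ty = if seen == &ZSTD_MAGIC[..] {
        CompressType::Zstd
    } else if seen == &LZ4_MAGIC[..] {
        CompressType::Lz4
    } else {
        // No known magic (or a file shorter than four bytes):
        // assume legacy, uncompressed data.
        CompressType::None
    };
    Ok((ty, Cursor::new(seen.to_vec()).chain(inner)))
}
```

A caller would pick the decoder based on the returned `CompressType` and then read from the returned reader, which yields the original bytes from offset zero.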

Because the files are wrapped with CompressReader / CompressWriter, the change is transparent to the rest of the engine.
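
To illustrate why a wrapper keeps the change transparent, here is a rough sketch of the pattern. The enum shapes below are assumptions and not the PR's actual `CompressReader` / `CompressWriter` definitions, and `toy_transform` is a placeholder that only flips bits (it does not compress) so the example runs without a codec crate. The point is that `save_snapshot` and `load_snapshot`, hypothetical stand-ins for engine code, depend only on `Read` / `Write`.

```rust
use std::io::{self, Read, Write};

/// Placeholder for a real codec such as Zstd. It only flips bits and
/// does NOT compress; it exists so the sketch has no dependencies.
fn toy_transform(buf: &mut [u8]) {
    for b in buf.iter_mut() {
        *b ^= 0xFF;
    }
}

/// Assumed shape of a writer wrapper: one variant per encoding.
enum CompressWriter<W: Write> {
    None(W),
    Toy(W),
}

impl<W: Write> Write for CompressWriter<W> {
    fn write(&mut self, buf: &[u8]) -> io::Result<usize> {
        match self {
            CompressWriter::None(w) => w.write(buf),
            CompressWriter::Toy(w) => {
                // Encode a copy, then report the caller's bytes as consumed.
                let mut tmp = buf.to_vec();
                toy_transform(&mut tmp);
                w.write_all(&tmp)?;
                Ok(buf.len())
            }
        }
    }

    fn flush(&mut self) -> io::Result<()> {
        match self {
            CompressWriter::None(w) | CompressWriter::Toy(w) => w.flush(),
        }
    }
}

/// Assumed shape of the matching reader wrapper.
enum CompressReader<R: Read> {
    None(R),
    Toy(R),
}

impl<R: Read> Read for CompressReader<R> {
    fn read(&mut self, buf: &mut [u8]) -> io::Result<usize> {
        match self {
            CompressReader::None(r) => r.read(buf),
            CompressReader::Toy(r) => {
                // Decode in place only the bytes that were actually read.
                let n = r.read(buf)?;
                toy_transform(&mut buf[..n]);
                Ok(n)
            }
        }
    }
}

/// Stand-in for engine code: it only knows about `Write`.
fn save_snapshot<W: Write>(mut out: W, payload: &[u8]) -> io::Result<()> {
    out.write_all(payload)?;
    out.flush()
}

/// Stand-in for engine code: it only knows about `Read`.
fn load_snapshot<R: Read>(mut input: R) -> io::Result<Vec<u8>> {
    let mut payload = Vec::new();
    input.read_to_end(&mut payload)?;
    Ok(payload)
}
```

Switching a file between encodings then means choosing a different variant where the file is opened; the functions that serialize and deserialize the payload stay unchanged.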

Expected complexity level and risk

1

Testing

  • Added a simple test that opens/writes the snapshot
  • Included an [ignored] test that uses the environment variables `SNAPSHOT="path"` and `IDENTITY=hex` to check the compression ratio on existing data
  • Added benches for both cases
  • Manually checked with standalone & private that data gets compressed
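
For readers unfamiliar with environment-driven ignored tests, the pattern might look roughly like the sketch below. Only the `SNAPSHOT` and `IDENTITY` variable names come from this PR; `decode_hex`, `inputs_from`, and the test name are hypothetical, and the step that actually opens the snapshot and measures sizes is left as a comment because it depends on engine APIs not shown here.

```rust
use std::env;
use std::path::PathBuf;

/// Decode a hex string into bytes. Returns `None` for odd-length input
/// or any non-hex character.
fn decode_hex(s: &str) -> Option<Vec<u8>> {
    if s.len() % 2 != 0 || !s.bytes().all(|b| b.is_ascii_hexdigit()) {
        return None;
    }
    (0..s.len())
        .step_by(2)
        .map(|i| u8::from_str_radix(&s[i..i + 2], 16).ok())
        .collect()
}

/// Hypothetical helper: turn the two raw variable values into a snapshot
/// path and identity bytes, or `None` if either is missing or malformed.
fn inputs_from(snapshot: Option<String>, identity: Option<String>) -> Option<(PathBuf, Vec<u8>)> {
    let path = PathBuf::from(snapshot?);
    let id = decode_hex(identity?.trim())?;
    Some((path, id))
}

#[test]
#[ignore = "needs SNAPSHOT and IDENTITY pointing at existing data"]
fn compression_ratio_on_existing_data() {
    let vars = (env::var("SNAPSHOT").ok(), env::var("IDENTITY").ok());
    let Some((path, identity)) = inputs_from(vars.0, vars.1) else {
        eprintln!("SNAPSHOT / IDENTITY not set or invalid; nothing to check");
        return;
    };
    // A real test would open the snapshot at `path` for this identity,
    // rewrite it with each compression type, and compare the sizes.
    eprintln!(
        "would check {} for an identity of {} bytes",
        path.display(),
        identity.len()
    );
}
```

With standard Cargo tooling, a test marked this way is skipped by default and runs only on request, for example with `cargo test -- --ignored` after exporting both variables.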

Bench

[crates/bench/benches/special.rs:234:9] &size = SnapshotSize {
    compressed_type: None,
    object_count   : 9,
    file_size      :      525 bytes,
    object_size    :   592145 bytes,
    total_size     :   592670 bytes,
}
[crates/bench/benches/special.rs:234:9] &size = SnapshotSize {
    compressed_type: Zstd,
    object_count   : 9,
    file_size      :      420 bytes,
    object_size    :    17896 bytes,
    total_size     :    18316 bytes,
}
[crates/bench/benches/special.rs:234:9] &size = SnapshotSize {
    compressed_type: Lz4,
    object_count   : 9,
    file_size      :      442 bytes,
    object_size    :    35989 bytes,
    total_size     :    36431 bytes,
}
[crates/bench/benches/special.rs:234:9] &size = SnapshotSize {
    compressed_type: Snap,
    object_count   : 9,
    file_size      :      447 bytes,
    object_size    :    53486 bytes,
    total_size     :    53933 bytes,
}
special/snapshot/synthetic/save_compression_None
                        time:   [3.4299 ms 3.4705 ms 3.5146 ms]
                        thrpt:  [160.82 MiB/s 162.86 MiB/s 164.79 MiB/s]
Found 1 outliers among 50 measurements (2.00%)
  1 (2.00%) high mild
special/snapshot/synthetic/open_compression_None
                        time:   [670.31 µs 671.53 µs 672.81 µs]
                        thrpt:  [840.07 MiB/s 841.68 MiB/s 843.21 MiB/s]
special/snapshot/synthetic/save_compression_Zstd
                        time:   [4.3128 ms 4.3491 ms 4.3853 ms]
                        thrpt:  [128.89 MiB/s 129.96 MiB/s 131.06 MiB/s]
Found 1 outliers among 50 measurements (2.00%)
  1 (2.00%) high mild
special/snapshot/synthetic/open_compression_Zstd
                        time:   [1.0121 ms 1.0257 ms 1.0401 ms]
                        thrpt:  [543.40 MiB/s 551.04 MiB/s 558.47 MiB/s]
special/snapshot/synthetic/save_compression_Lz4
                        time:   [3.8175 ms 3.8567 ms 3.8971 ms]
                        thrpt:  [145.03 MiB/s 146.56 MiB/s 148.06 MiB/s]
Found 2 outliers among 50 measurements (4.00%)
  1 (2.00%) low mild
  1 (2.00%) high severe
special/snapshot/synthetic/open_compression_Lz4
                        time:   [1.3209 ms 1.3221 ms 1.3236 ms]
                        thrpt:  [427.02 MiB/s 427.52 MiB/s 427.90 MiB/s]
Found 2 outliers among 50 measurements (4.00%)
  2 (4.00%) high severe
special/snapshot/synthetic/save_compression_Snap
                        time:   [4.2330 ms 4.3074 ms 4.3845 ms]
                        thrpt:  [128.91 MiB/s 131.22 MiB/s 133.53 MiB/s]
Found 7 outliers among 50 measurements (14.00%)
  3 (6.00%) low mild
  3 (6.00%) high mild
  1 (2.00%) high severe
special/snapshot/synthetic/open_compression_Snap
                        time:   [1.0540 ms 1.0563 ms 1.0590 ms]
                        thrpt:  [533.74 MiB/s 535.08 MiB/s 536.25 MiB/s]

special/snapshot/synthetic/save_compression_None #2
                        time:   [3.4767 ms 3.5054 ms 3.5347 ms]
                        thrpt:  [159.91 MiB/s 161.24 MiB/s 162.57 MiB/s]
Found 1 outliers among 50 measurements (2.00%)
  1 (2.00%) high mild
special/snapshot/synthetic/open_compression_None #2
                        time:   [673.88 µs 674.12 µs 674.37 µs]
                        thrpt:  [838.14 MiB/s 838.45 MiB/s 838.75 MiB/s]
Found 2 outliers among 50 measurements (4.00%)
  2 (4.00%) high mild
special/snapshot/synthetic/save_compression_Zstd #2
                        time:   [4.2698 ms 4.3035 ms 4.3393 ms]
                        thrpt:  [130.25 MiB/s 131.34 MiB/s 132.37 MiB/s]
Found 1 outliers among 50 measurements (2.00%)
  1 (2.00%) high severe
special/snapshot/synthetic/open_compression_Zstd #2
                        time:   [1.0497 ms 1.0531 ms 1.0569 ms]
                        thrpt:  [534.78 MiB/s 536.72 MiB/s 538.46 MiB/s]
Found 4 outliers among 50 measurements (8.00%)
  3 (6.00%) high mild
  1 (2.00%) high severe
special/snapshot/synthetic/save_compression_Lz4 #2
                        time:   [3.7103 ms 3.7474 ms 3.7875 ms]
                        thrpt:  [149.23 MiB/s 150.83 MiB/s 152.34 MiB/s]
Found 2 outliers among 50 measurements (4.00%)
  1 (2.00%) high mild
  1 (2.00%) high severe
special/snapshot/synthetic/open_compression_Lz4 #2
                        time:   [1.3267 ms 1.3529 ms 1.3953 ms]
                        thrpt:  [405.09 MiB/s 417.78 MiB/s 426.02 MiB/s]
Found 6 outliers among 50 measurements (12.00%)
  2 (4.00%) high mild
  4 (8.00%) high severe
special/snapshot/synthetic/save_compression_Snap #2
                        time:   [4.0390 ms 4.0786 ms 4.1180 ms]
                        thrpt:  [137.25 MiB/s 138.58 MiB/s 139.94 MiB/s]
Found 2 outliers among 50 measurements (4.00%)
  1 (2.00%) low mild
  1 (2.00%) high mild
special/snapshot/synthetic/open_compression_Snap #2
                        time:   [1.0285 ms 1.0347 ms 1.0424 ms]
                        thrpt:  [542.25 MiB/s 546.26 MiB/s 549.56 MiB/s]
Found 7 outliers among 50 measurements (14.00%)
  1 (2.00%) high mild
  6 (12.00%) high severe
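
To make the size figures above easier to compare, the following sketch derives compression ratios from the reported `total_size` values. The constants are copied from the bench output; the function names are illustrative only.

```rust
/// `total_size` in bytes reported by the bench for each `compressed_type`.
/// The first entry is the uncompressed baseline.
const TOTALS: [(&str, u64); 4] = [
    ("None", 592_670),
    ("Zstd", 18_316),
    ("Lz4", 36_431),
    ("Snap", 53_933),
];

/// How many times smaller the compressed snapshot is than the baseline.
fn ratio(uncompressed: u64, compressed: u64) -> f64 {
    uncompressed as f64 / compressed as f64
}

/// Share of the baseline size that compression removes, in percent.
fn percent_saved(uncompressed: u64, compressed: u64) -> f64 {
    100.0 * (1.0 - compressed as f64 / uncompressed as f64)
}

/// One summary line per compressed variant, in the order listed above.
fn report() -> Vec<String> {
    let baseline = TOTALS[0].1;
    TOTALS
        .iter()
        .skip(1)
        .map(|&(name, total)| {
            format!(
                "{name}: {:.1}x smaller, {:.1}% saved",
                ratio(baseline, total),
                percent_saved(baseline, total)
            )
        })
        .collect()
}
```

On this synthetic snapshot the lines come out as roughly 32.4x (96.9% saved) for Zstd, 16.3x (93.9%) for Lz4, and 11.0x (90.9%) for Snap. In the timings above, Zstd saves take about 4.3 ms against about 3.5 ms uncompressed. Since the data is synthetic, these ratios may not carry over to real snapshots, which is why the PR is waiting on more test data.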

@mamcx mamcx added the Do not merge and release-1.0 labels Dec 3, 2024
@mamcx mamcx self-assigned this Dec 3, 2024
@mamcx mamcx requested a review from kim December 3, 2024 18:17
Labels
Do not merge, release-1.0
Development

Successfully merging this pull request may close these issues.

[STORAGE USE REDUCTION] Commitlog segment compression
1 participant