Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Performance optimizations #131

Open
2 of 5 tasks
madadam opened this issue Jun 29, 2023 · 6 comments
Open
2 of 5 tasks

Performance optimizations #131

madadam opened this issue Jun 29, 2023 · 6 comments

Comments

@madadam
Copy link
Collaborator

madadam commented Jun 29, 2023

Quisync performance is not great and should be improved. There are two main areas for this:

Storage performance

Reading blobs

Reading a block involves loading the index nodes from the root to the corresponding leaf node to find the block id and then to load the block itself. This involves multiple db queries. If multliple blocks are read in succession (say when reading a file that consists of multiple blocks), some of the index nodes are shared but we still load them again.

To improve this we should cache the previously loaded nodes/blocks.

Writing blobs

To write a block we need to start a db transaction, load all the index nodes in the path from the root to the block and for each node also all its siblings. Then we need to recalculate the hashes (this is why we need the siblings), then save all the nodes one by one into the db and finally commit the transaction.

This gets very inefficient when multiple blocks are saved in succession because we often load and re-hash the same set of sibling nodes over and over. Also we currently create a new snapshot (and delete the old one) for every written block. Finally performing a lot of small db write transaction is slower than a single large one.

To improve this we should collect the blocks to be written (that is, locator + content) into a memory-only structure and write them all at once at some meaningful point (say after certain number of blocks collected and/or after explicit flush). The cache mentioned in the previous section could also help here.

Index nodes

When we receive an index node from another peer we always perform an expensive operation called update_summaries which involves traversing the index up and down in order to update the block presence checksums. We also use a separate db transaction for every received node.

To Improve this we should again collect the received nodes into a memory-only structure and commit them into the db all at once. This can happen when we collect certain number of nodes and/or when the current snapshot becomes complete. The update_summaries can then be called only once per this commit. This means that most responses would be processed without involving any db operation at all which should improve throughput. Also the cache might help here as well.

Db schema and operations

It's possible that our db schema is not optimal and it might be worth looking into how to improve it. For example, the index nodes tables are not fully normalized (some data is duplicated) and normalizing it might help. Also using integers as primary/foreign keys might be better than using blobs (e.g. block id, locator, ...). We should also check whether our db indices are set up correctly.

Syncing performance

Currently syncing is mostly bottlenecked by the storage so improving the storage also improves syncing and so we should start there. In addition to that, here are some other areas that we might want to look into:

  • In-flight messages limiting: do we need it? Is so, what should the limit be? Should we dynamically adapt the limit based on the throughput? ...
  • Use multiple substreams (they are native in QUIC and can be emulated on TCP)
  • Limit the number of simultaneously used connections (choking, ...)
  • ...

To do

  • Reading blobs
  • Writing blobs
  • Index nodes
  • Db schema and operations
  • Syncing
@madadam madadam self-assigned this Jul 13, 2023
@madadam
Copy link
Collaborator Author

madadam commented Jul 13, 2023

Benchmarks from 2023-07-13 on branch master:

(obtained with cargo bench --bench bench_lib)

lib/write_file/128 kiB  time:   [19.222 ms 20.215 ms 20.950 ms]
                        thrpt:  [5.9665 MiB/s 6.1834 MiB/s 6.5029 MiB/s]
lib/write_file/256 kiB  time:   [33.346 ms 40.202 ms 43.283 ms]
                        thrpt:  [5.7759 MiB/s 6.2185 MiB/s 7.4971 MiB/s]
Benchmarking lib/write_file/512 kiB: Warming up for 3.0000 s
Warning: Unable to complete 10 samples in 5.0s. You may wish to increase target time to 5.5s or enable flat sampling.
lib/write_file/512 kiB  time:   [50.525 ms 51.799 ms 52.852 ms]
                        thrpt:  [9.4604 MiB/s 9.6526 MiB/s 9.8961 MiB/s]
Benchmarking lib/write_file/1024 kiB: Warming up for 3.0000 s
Warning: Unable to complete 10 samples in 5.0s. You may wish to increase target time to 8.4s or enable flat sampling.
lib/write_file/1024 kiB time:   [100.82 ms 103.16 ms 105.25 ms]
                        thrpt:  [9.5012 MiB/s 9.6932 MiB/s 9.9190 MiB/s]
Found 1 outliers among 10 measurements (10.00%)
  1 (10.00%) low mild
lib/write_file/2048 kiB time:   [178.23 ms 211.39 ms 242.97 ms]
                        thrpt:  [8.2315 MiB/s 9.4613 MiB/s 11.222 MiB/s]
Benchmarking lib/write_file/4096 kiB: Warming up for 3.0000 s
Warning: Unable to complete 10 samples in 5.0s. You may wish to increase target time to 5.3s.
lib/write_file/4096 kiB time:   [480.64 ms 507.00 ms 538.34 ms]
                        thrpt:  [7.4302 MiB/s 7.8895 MiB/s 8.3222 MiB/s]
Found 1 outliers among 10 measurements (10.00%)
  1 (10.00%) high mild
Benchmarking lib/write_file/8192 kiB: Warming up for 3.0000 s
Warning: Unable to complete 10 samples in 5.0s. You may wish to increase target time to 14.8s.
lib/write_file/8192 kiB time:   [1.5271 s 1.6786 s 1.8560 s]
                        thrpt:  [4.3103 MiB/s 4.7659 MiB/s 5.2386 MiB/s]
Found 1 outliers among 10 measurements (10.00%)
  1 (10.00%) high mild

Benchmarking lib/read_file/128 kiB: Warming up for 3.0000 s
Warning: Unable to complete 10 samples in 5.0s. You may wish to increase target time to 6.3s or enable flat sampling.
lib/read_file/128 kiB   time:   [27.402 ms 28.013 ms 28.769 ms]
                        thrpt:  [4.3449 MiB/s 4.4622 MiB/s 4.5617 MiB/s]
Found 1 outliers among 10 measurements (10.00%)
  1 (10.00%) high mild
Benchmarking lib/read_file/256 kiB: Warming up for 3.0000 s
Warning: Unable to complete 10 samples in 5.0s. You may wish to increase target time to 5.8s or enable flat sampling.
lib/read_file/256 kiB   time:   [24.503 ms 26.424 ms 29.024 ms]
                        thrpt:  [8.6136 MiB/s 9.4611 MiB/s 10.203 MiB/s]
Found 1 outliers among 10 measurements (10.00%)
  1 (10.00%) high mild
Benchmarking lib/read_file/512 kiB: Warming up for 3.0000 s
Warning: Unable to complete 10 samples in 5.0s. You may wish to increase target time to 7.4s or enable flat sampling.
lib/read_file/512 kiB   time:   [35.454 ms 36.564 ms 37.601 ms]
                        thrpt:  [13.298 MiB/s 13.674 MiB/s 14.103 MiB/s]
Found 1 outliers among 10 measurements (10.00%)
  1 (10.00%) high mild
lib/read_file/1024 kiB  time:   [63.847 ms 65.889 ms 67.264 ms]
                        thrpt:  [14.867 MiB/s 15.177 MiB/s 15.662 MiB/s]
Found 3 outliers among 10 measurements (30.00%)
  1 (10.00%) low severe
  2 (20.00%) high mild
lib/read_file/2048 kiB  time:   [95.851 ms 97.228 ms 98.743 ms]
                        thrpt:  [20.255 MiB/s 20.570 MiB/s 20.866 MiB/s]
Found 2 outliers among 10 measurements (20.00%)
  1 (10.00%) low mild
  1 (10.00%) high mild
Benchmarking lib/read_file/4096 kiB: Warming up for 3.0000 s
Warning: Unable to complete 10 samples in 5.0s. You may wish to increase target time to 7.6s.
lib/read_file/4096 kiB  time:   [264.35 ms 277.44 ms 292.85 ms]
                        thrpt:  [13.659 MiB/s 14.417 MiB/s 15.132 MiB/s]
Found 1 outliers among 10 measurements (10.00%)
  1 (10.00%) high mild
Benchmarking lib/read_file/8192 kiB: Warming up for 3.0000 s
Warning: Unable to complete 10 samples in 5.0s. You may wish to increase target time to 34.6s.
lib/read_file/8192 kiB  time:   [709.99 ms 819.00 ms 929.15 ms]
                        thrpt:  [8.6100 MiB/s 9.7680 MiB/s 11.268 MiB/s]

lib/remove_file/128 KiB time:   [5.6563 ms 6.1096 ms 6.4388 ms]
                        thrpt:  [19.414 MiB/s 20.459 MiB/s 22.099 MiB/s]
Benchmarking lib/remove_file/256 KiB: Warming up for 3.0000 s
Warning: Unable to complete 10 samples in 5.0s. You may wish to increase target time to 5.6s or enable flat sampling.
lib/remove_file/256 KiB time:   [8.5118 ms 9.1404 ms 9.6413 ms]
                        thrpt:  [25.930 MiB/s 27.351 MiB/s 29.371 MiB/s]
Found 1 outliers among 10 measurements (10.00%)
  1 (10.00%) high mild
Benchmarking lib/remove_file/512 KiB: Warming up for 3.0000 s
Warning: Unable to complete 10 samples in 5.0s. You may wish to increase target time to 5.8s or enable flat sampling.
lib/remove_file/512 KiB time:   [5.6618 ms 5.9423 ms 6.2007 ms]
                        thrpt:  [80.636 MiB/s 84.142 MiB/s 88.311 MiB/s]
Benchmarking lib/remove_file/1024 KiB: Warming up for 3.0000 s
Warning: Unable to complete 10 samples in 5.0s. You may wish to increase target time to 8.8s or enable flat sampling.
lib/remove_file/1024 KiB
                        time:   [6.5695 ms 6.7217 ms 6.8699 ms]
                        thrpt:  [145.56 MiB/s 148.77 MiB/s 152.22 MiB/s]
lib/remove_file/2048 KiB
                        time:   [5.1991 ms 5.3472 ms 5.4865 ms]
                        thrpt:  [364.53 MiB/s 374.03 MiB/s 384.68 MiB/s]
Benchmarking lib/remove_file/4096 KiB: Warming up for 3.0000 s
Warning: Unable to complete 10 samples in 5.0s. You may wish to increase target time to 5.3s.
lib/remove_file/4096 KiB
                        time:   [9.3682 ms 10.674 ms 12.085 ms]
                        thrpt:  [331.00 MiB/s 374.75 MiB/s 426.98 MiB/s]
Benchmarking lib/remove_file/8192 KiB: Warming up for 3.0000 s
Warning: Unable to complete 10 samples in 5.0s. You may wish to increase target time to 15.3s.
lib/remove_file/8192 KiB
                        time:   [15.593 ms 17.756 ms 19.861 ms]
                        thrpt:  [402.80 MiB/s 450.56 MiB/s 513.07 MiB/s]

lib/sync/128 kiB        time:   [210.97 ms 218.02 ms 224.95 ms]
                        thrpt:  [569.02 KiB/s 587.11 KiB/s 606.73 KiB/s]
lib/sync/256 kiB        time:   [273.63 ms 281.76 ms 290.14 ms]
                        thrpt:  [882.32 KiB/s 908.56 KiB/s 935.55 KiB/s]
Found 1 outliers among 10 measurements (10.00%)
  1 (10.00%) high mild
Benchmarking lib/sync/512 kiB: Warming up for 3.0000 s
Warning: Unable to complete 10 samples in 5.0s. You may wish to increase target time to 6.2s.
lib/sync/512 kiB        time:   [400.15 ms 408.85 ms 418.31 ms]
                        thrpt:  [1.1953 MiB/s 1.2229 MiB/s 1.2495 MiB/s]
Benchmarking lib/sync/1024 kiB: Warming up for 3.0000 s
Warning: Unable to complete 10 samples in 5.0s. You may wish to increase target time to 11.2s.
lib/sync/1024 kiB       time:   [781.41 ms 792.72 ms 805.63 ms]
                        thrpt:  [1.2413 MiB/s 1.2615 MiB/s 1.2797 MiB/s]
Found 1 outliers among 10 measurements (10.00%)
  1 (10.00%) high mild
Benchmarking lib/sync/2048 kiB: Warming up for 3.0000 s
Warning: Unable to complete 10 samples in 5.0s. You may wish to increase target time to 24.1s.
lib/sync/2048 kiB       time:   [1.3538 s 1.5255 s 1.7144 s]
                        thrpt:  [1.1666 MiB/s 1.3111 MiB/s 1.4773 MiB/s]
Benchmarking lib/sync/4096 kiB: Warming up for 3.0000 s
Warning: Unable to complete 10 samples in 5.0s. You may wish to increase target time to 44.8s.
lib/sync/4096 kiB       time:   [5.1564 s 5.6887 s 6.1328 s]
                        thrpt:  [667.88 KiB/s 720.02 KiB/s 794.35 KiB/s]
Found 1 outliers among 10 measurements (10.00%)
  1 (10.00%) low mild
Benchmarking lib/sync/8192 kiB: Warming up for 3.0000 s
Warning: Unable to complete 10 samples in 5.0s. You may wish to increase target time to 206.3s.
lib/sync/8192 kiB       time:   [16.336 s 17.464 s 18.408 s]
                        thrpt:  [445.02 KiB/s 469.08 KiB/s 501.48 KiB/s]
Found 2 outliers among 10 measurements (20.00%)
  2 (20.00%) low mild

Summary

8MiB:

  • read: 819.00 ms | 9.7680 MiB/s
  • write: 1.6786 s | 4.7659 MiB/s
  • sync: 17.464 s | 469.08 KiB/s

@madadam
Copy link
Collaborator Author

madadam commented Jul 13, 2023

Benchmarks from 2023-07-13 on branch store:

lib/write_file/1 MiB    time:   [199.57 ms 206.09 ms 212.92 ms]
                        thrpt:  [4.6966 MiB/s 4.8523 MiB/s 5.0107 MiB/s]

lib/write_file/8 MiB    time:   [2.1103 s 2.3460 s 2.6424 s]
                        thrpt:  [3.0275 MiB/s 3.4100 MiB/s 3.7909 MiB/s]
Found 1 outliers among 10 measurements (10.00%)
  1 (10.00%) high severe

lib/read_file/1 MiB     time:   [60.425 ms 73.095 ms 85.435 ms]
                        thrpt:  [11.705 MiB/s 13.681 MiB/s 16.550 MiB/s]

lib/read_file/8 MiB     time:   [729.31 ms 802.35 ms 874.28 ms]
                        thrpt:  [9.1504 MiB/s 9.9707 MiB/s 10.969 MiB/s]

lib/remove_file/1 MiB   time:   [5.7220 ms 6.5447 ms 7.3397 ms]
                        thrpt:  [136.24 MiB/s 152.79 MiB/s 174.76 MiB/s]

lib/remove_file/8 MiB   time:   [16.199 ms 18.490 ms 20.133 ms]
                        thrpt:  [397.36 MiB/s 432.67 MiB/s 493.85 MiB/s]
Found 1 outliers among 10 measurements (10.00%)
  1 (10.00%) low severe

lib/sync/1 MiB          time:   [581.84 ms 691.68 ms 802.11 ms]
                        thrpt:  [1.2467 MiB/s 1.4458 MiB/s 1.7187 MiB/s]
                        
lib/sync/8 MiB          time:   [16.042 s 16.859 s 17.591 s]
                        thrpt:  [465.68 KiB/s 485.91 KiB/s 510.65 KiB/s]
Found 2 outliers among 10 measurements (20.00%)
  2 (20.00%) low mild

@madadam
Copy link
Collaborator Author

madadam commented Jul 20, 2023

Benchmarks from 2023-07-20, branch master, commit be546bd7 (after implementing block cache)

lib/write_file/1 MiB    time:   [31.179 ms 31.461 ms 31.816 ms]
                        thrpt:  [31.431 MiB/s 31.785 MiB/s 32.073 MiB/s]
lib/write_file/8 MiB    time:   [817.12 ms 841.96 ms 862.66 ms]
                        thrpt:  [9.2737 MiB/s 9.5017 MiB/s 9.7905 MiB/s]

lib/read_file/1 MiB     time:   [20.560 ms 20.682 ms 20.778 ms]
                        thrpt:  [48.128 MiB/s 48.352 MiB/s 48.639 MiB/s]
lib/read_file/8 MiB     time:   [448.97 ms 473.12 ms 505.27 ms]
                        thrpt:  [15.833 MiB/s 16.909 MiB/s 17.819 MiB/s]

lib/remove_file/1 MiB   time:   [4.0579 ms 4.2400 ms 4.3779 ms]
                        thrpt:  [228.42 MiB/s 235.85 MiB/s 246.43 MiB/s]
lib/remove_file/8 MiB   time:   [12.471 ms 13.143 ms 13.754 ms]
                        thrpt:  [581.66 MiB/s 608.68 MiB/s 641.50 MiB/s]

lib/sync/1 MiB          time:   [440.09 ms 461.65 ms 485.87 ms]
                        thrpt:  [2.0582 MiB/s 2.1661 MiB/s 2.2722 MiB/s]
lib/sync/8 MiB          time:   [9.8484 s 10.870 s 11.606 s]
                        thrpt:  [705.84 KiB/s 753.64 KiB/s 831.81 KiB/s]

@madadam
Copy link
Collaborator Author

madadam commented Jul 25, 2023

Benchmarks from 2023-07-25, commit dc65140b (after implementing index cache)

lib/write_file/1 MiB    time:   [39.940 ms 40.560 ms 41.332 ms]
                        thrpt:  [24.194 MiB/s 24.655 MiB/s 25.037 MiB/s]
lib/write_file/8 MiB    time:   [655.15 ms 681.66 ms 710.10 ms]
                        thrpt:  [11.266 MiB/s 11.736 MiB/s 12.211 MiB/s]

lib/read_file/1 MiB     time:   [18.998 ms 19.818 ms 20.515 ms]
                        thrpt:  [48.746 MiB/s 50.460 MiB/s 52.637 MiB/s]
lib/read_file/8 MiB     time:   [132.52 ms 140.76 ms 149.80 ms]
                        thrpt:  [53.405 MiB/s 56.834 MiB/s 60.368 MiB/s]

lib/remove_file/1 MiB   time:   [4.1927 ms 4.3455 ms 4.5174 ms]
                        thrpt:  [221.37 MiB/s 230.13 MiB/s 238.51 MiB/s]
lib/remove_file/8 MiB   time:   [7.6434 ms 9.9009 ms 12.424 ms]
                        thrpt:  [643.90 MiB/s 808.01 MiB/s 1.0221 GiB/s]

lib/sync/1 MiB          time:   [516.70 ms 526.93 ms 536.37 ms]
                        thrpt:  [1.8644 MiB/s 1.8978 MiB/s 1.9353 MiB/s]
lib/sync/8 MiB          time:   [11.312 s 12.845 s 13.932 s]
                        thrpt:  [588.00 KiB/s 637.74 KiB/s 724.16 KiB/s]

@madadam
Copy link
Collaborator Author

madadam commented Aug 30, 2023

Benchmarks from 2023-08-30, commit eb88357 (after fixing some bugs in the index cache and implementing block expiration)

lib/write_file/1 MiB    time:   [41.547 ms 45.087 ms 49.703 ms]
                        thrpt:  [20.120 MiB/s 22.179 MiB/s 24.069 MiB/s]
lib/write_file/8 MiB    time:   [731.90 ms 757.83 ms 781.03 ms]
                        thrpt:  [10.243 MiB/s 10.557 MiB/s 10.930 MiB/s]

lib/read_file/1 MiB     time:   [18.647 ms 19.848 ms 21.076 ms]
                        thrpt:  [47.446 MiB/s 50.384 MiB/s 53.628 MiB/s]
lib/read_file/8 MiB     time:   [157.04 ms 168.33 ms 180.30 ms]
                        thrpt:  [44.371 MiB/s 47.525 MiB/s 50.943 MiB/s]

lib/remove_file/1 MiB   time:   [4.3012 ms 4.6538 ms 4.9002 ms]
                        thrpt:  [204.07 MiB/s 214.88 MiB/s 232.49 MiB/s]
lib/remove_file/8 MiB   time:   [8.8383 ms 10.772 ms 13.173 ms]
                        thrpt:  [607.29 MiB/s 742.69 MiB/s 905.15 MiB/s]

lib/sync/1 MiB          time:   [514.49 ms 567.70 ms 658.60 ms]
                        thrpt:  [1.5184 MiB/s 1.7615 MiB/s 1.9437 MiB/s]
lib/sync/8 MiB          time:   [11.353 s 12.212 s 13.120 s]
                        thrpt:  [624.39 KiB/s 670.79 KiB/s 721.58 KiB/s]

@madadam
Copy link
Collaborator Author

madadam commented Aug 31, 2023

Benchmarks from 2023-08-31, commit a6fe1d8 (after implementing local write batching)

lib/write_file/1 MiB    time:   [38.736 ms 41.759 ms 45.865 ms]
                        thrpt:  [21.803 MiB/s 23.947 MiB/s 25.816 MiB/s]
lib/write_file/8 MiB    time:   [184.70 ms 191.09 ms 197.60 ms]
                        thrpt:  [40.486 MiB/s 41.865 MiB/s 43.314 MiB/s]
lib/write_file/16 MiB   time:   [335.61 ms 345.49 ms 356.27 ms]
                        thrpt:  [44.910 MiB/s 46.312 MiB/s 47.674 MiB/s]

lib/read_file/1 MiB     time:   [26.984 ms 28.736 ms 30.969 ms]
                        thrpt:  [32.291 MiB/s 34.799 MiB/s 37.059 MiB/s]
lib/read_file/8 MiB     time:   [130.89 ms 139.32 ms 150.49 ms]
                        thrpt:  [53.161 MiB/s 57.420 MiB/s 61.122 MiB/s]
lib/read_file/16 MiB    time:   [279.10 ms 286.92 ms 295.59 ms]
                        thrpt:  [54.129 MiB/s 55.765 MiB/s 57.327 MiB/s]

lib/sync/1 MiB          time:   [716.29 ms 777.41 ms 850.79 ms]
                        thrpt:  [1.1754 MiB/s 1.2863 MiB/s 1.3961 MiB/s]
lib/sync/8 MiB          time:   [11.510 s 13.102 s 14.790 s]
                        thrpt:  [553.88 KiB/s 625.22 KiB/s 711.70 KiB/s]

Note: removed the remove_file benchmark because it's not meaningful (the actual removal happens in the background and is not counted towards the reported time).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

1 participant