blobovnicza: Add benchmark to test different tree settings #2457

Merged


cthulhu-rider
Contributor

@cthulhu-rider cthulhu-rider commented Jul 25, 2023

current results:

goos: linux
goarch: amd64
pkg: github.com/nspcc-dev/neofs-node/pkg/local_object_storage/blobstor/blobovniczatree
cpu: Intel(R) Core(TM) i5-10210U CPU @ 1.60GHz
BenchmarkBlobovniczas_Put/tree=1x0-8                 100          53517515 ns/op         4609697 B/op        109 allocs/op
BenchmarkBlobovniczas_Put/tree=10x0-8                100          53417156 ns/op         5213543 B/op        119 allocs/op
BenchmarkBlobovniczas_Put/tree=2x2-8                 100          52124287 ns/op         5117641 B/op        158 allocs/op
BenchmarkBlobovniczas_Put/tree=4x4-8                 100          37065895 ns/op         2129467 B/op        188 allocs/op

According to the test, tree management doesn't really help when the number of layers is small. 4x4 performed better, which looks suspicious, but for now it may be expected since we haven't tried to optimize working with a single DB instance as described in #2453.

P.S. In the current test implementation it's not obvious whether we actually force a switch to other DBs. I tried to achieve this with -benchtime=100x, where 100 is fullSizeLimit/singleObjectSize.
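
The iteration count in the P.S. follows from the two size constants of the benchmark. A minimal sketch of that arithmetic, assuming `fullSizeLimit = 100 << 20` and 1 MiB objects as in the test at this point; the helper name `putsToFill` is hypothetical, and the estimate ignores any per-object overhead added by the database:

```go
package main

import "fmt"

// putsToFill estimates how many fixed-size objects fit into a storage
// budget of fullSizeLimit bytes. It is a hypothetical helper and ignores
// any per-object overhead added by the database.
func putsToFill(fullSizeLimit, objSize uint64) uint64 {
	if objSize == 0 {
		return 0
	}
	return fullSizeLimit / objSize
}

func main() {
	const fullSizeLimit = 100 << 20 // 100 MiB

	fmt.Println(putsToFill(fullSizeLimit, 1<<20)) // 1 MiB objects
	fmt.Println(putsToFill(fullSizeLimit, 1<<12)) // 4 KiB objects
}
```

By this estimate, the number of puts needed to approach the size limit grows quickly as objects get smaller, so a fixed `-benchtime=100x` only comes close to the limit for 1 MiB objects.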

@codecov

codecov bot commented Jul 25, 2023

Codecov Report

Merging #2457 (55243c8) into master (7b86fa2) will increase coverage by 0.02%.
Report is 5 commits behind head on master.
The diff coverage is 100.00%.

❗ Current head 55243c8 differs from pull request most recent head d22c757. Consider uploading reports for the commit d22c757 to get more accurate results

@@            Coverage Diff             @@
##           master    #2457      +/-   ##
==========================================
+ Coverage   29.44%   29.46%   +0.02%     
==========================================
  Files         399      399              
  Lines       30385    30392       +7     
==========================================
+ Hits         8946     8955       +9     
+ Misses      20696    20694       -2     
  Partials      743      743              
Files Changed Coverage Δ
pkg/local_object_storage/blobstor/put.go 88.46% <100.00%> (+4.25%) ⬆️

... and 1 file with indirect coverage changes


@roman-khimov
Member

Here:

BenchmarkBlobovniczas_Put/tree=1x0-8                  57          27046852 ns/op         4786905 B/op        113 allocs/op
BenchmarkBlobovniczas_Put/tree=1x0-8                  36          32247503 ns/op         4681081 B/op        109 allocs/op
BenchmarkBlobovniczas_Put/tree=1x0-8                  40          31945389 ns/op         4855398 B/op        110 allocs/op
BenchmarkBlobovniczas_Put/tree=10x0-8                 34          33063479 ns/op         5213628 B/op        120 allocs/op
BenchmarkBlobovniczas_Put/tree=10x0-8                 37          33285487 ns/op         5195977 B/op        119 allocs/op
BenchmarkBlobovniczas_Put/tree=10x0-8                 45          33081278 ns/op         5128587 B/op        120 allocs/op
BenchmarkBlobovniczas_Put/tree=2x2-8                  44          30096345 ns/op         5198224 B/op        150 allocs/op
BenchmarkBlobovniczas_Put/tree=2x2-8                  70          19756034 ns/op         4950343 B/op        151 allocs/op
BenchmarkBlobovniczas_Put/tree=2x2-8                  74          20097768 ns/op         5203916 B/op        153 allocs/op
BenchmarkBlobovniczas_Put/tree=4x4-8                  74          17488102 ns/op         2129200 B/op        183 allocs/op
BenchmarkBlobovniczas_Put/tree=4x4-8                  70          16829594 ns/op         2128992 B/op        182 allocs/op
BenchmarkBlobovniczas_Put/tree=4x4-8                  74          18119955 ns/op         2129311 B/op        187 allocs/op

But it's intended for small objects, so I've made:

--- a/pkg/local_object_storage/blobstor/blobovniczatree/put_test.go
+++ b/pkg/local_object_storage/blobstor/blobovniczatree/put_test.go
@@ -47,7 +47,7 @@ func benchmarkPutMN(b *testing.B, depth, width uint64) {
                nBlobovniczas *= width
        }
 
-       const objSizeLimit = 1 << 20
+       const objSizeLimit = 1 << 12
        const fullSizeLimit = 100 << 20
 
        bbcz := NewBlobovniczaTree(
@@ -66,6 +66,7 @@ func benchmarkPutMN(b *testing.B, depth, width uint64) {
                RawData: make([]byte, objSizeLimit),
        }
 
+       rand.Read(prm.RawData)
        b.ReportAllocs()
        b.ResetTimer()
 
@@ -74,7 +75,6 @@ func benchmarkPutMN(b *testing.B, depth, width uint64) {
        for i := 0; i < b.N; i++ {
                b.StopTimer()
                prm.Address = oidtest.Address()
-               rand.Read(prm.RawData)
                b.StartTimer()
 
                _, err = bbcz.Put(prm)

And got:

BenchmarkBlobovniczas_Put/tree=1x0-8                  87          12323530 ns/op           33656 B/op         77 allocs/op
BenchmarkBlobovniczas_Put/tree=1x0-8                  92          13031141 ns/op           33912 B/op         77 allocs/op
BenchmarkBlobovniczas_Put/tree=1x0-8                  84          12944640 ns/op           33466 B/op         77 allocs/op
BenchmarkBlobovniczas_Put/tree=10x0-8                 93          12600201 ns/op           33818 B/op         77 allocs/op
BenchmarkBlobovniczas_Put/tree=10x0-8                 88          12914871 ns/op           33901 B/op         77 allocs/op
BenchmarkBlobovniczas_Put/tree=10x0-8                 88          12480286 ns/op           33302 B/op         77 allocs/op
BenchmarkBlobovniczas_Put/tree=2x2-8                  81          12740758 ns/op           34692 B/op        111 allocs/op
BenchmarkBlobovniczas_Put/tree=2x2-8                  93          13036401 ns/op           34883 B/op        111 allocs/op
BenchmarkBlobovniczas_Put/tree=2x2-8                  92          13253987 ns/op           35103 B/op        111 allocs/op
BenchmarkBlobovniczas_Put/tree=4x4-8                  91          13660381 ns/op           37916 B/op        179 allocs/op
BenchmarkBlobovniczas_Put/tree=4x4-8                  88          15612270 ns/op           37448 B/op        179 allocs/op
BenchmarkBlobovniczas_Put/tree=4x4-8                  85          15245248 ns/op           38210 B/op        180 allocs/op

Then const objSizeLimit = 1 for the fun of it:

BenchmarkBlobovniczas_Put/tree=1x0-8                  84          13757404 ns/op           12682 B/op         68 allocs/op
BenchmarkBlobovniczas_Put/tree=1x0-8                  80          14032831 ns/op           12531 B/op         68 allocs/op
BenchmarkBlobovniczas_Put/tree=1x0-8                  82          13922973 ns/op           12701 B/op         68 allocs/op
BenchmarkBlobovniczas_Put/tree=10x0-8                 94          14235156 ns/op           12628 B/op         69 allocs/op
BenchmarkBlobovniczas_Put/tree=10x0-8                 85          14489805 ns/op           12677 B/op         68 allocs/op
BenchmarkBlobovniczas_Put/tree=10x0-8                 81          14262814 ns/op           12712 B/op         68 allocs/op
BenchmarkBlobovniczas_Put/tree=2x2-8                  81          14078772 ns/op           12554 B/op         99 allocs/op
BenchmarkBlobovniczas_Put/tree=2x2-8                  88          14096657 ns/op           12192 B/op        100 allocs/op
BenchmarkBlobovniczas_Put/tree=2x2-8                  96          14447775 ns/op           12224 B/op         99 allocs/op
BenchmarkBlobovniczas_Put/tree=4x4-8                  93          14432768 ns/op           20945 B/op        164 allocs/op
BenchmarkBlobovniczas_Put/tree=4x4-8                  93          13946014 ns/op           20724 B/op        164 allocs/op
BenchmarkBlobovniczas_Put/tree=4x4-8                  81          14332493 ns/op           21495 B/op        167 allocs/op

4×4 tests take noticeably more time to initialize, probably related to #2215.

@roman-khimov
Member

And for everyone who loves Bolt's Batch(), 4K rawdata again:

BenchmarkBlobovniczas_Put/tree=1x0-8                  79          13972488 ns/op           32855 B/op         76 allocs/op
BenchmarkBlobovniczas_Put/tree=1x0-8                  75          14311804 ns/op           32893 B/op         76 allocs/op
BenchmarkBlobovniczas_Put/tree=1x0-8                  70          14741192 ns/op           33038 B/op         77 allocs/op
BenchmarkBlobovniczas_Put/tree=10x0-8                 74          21875172 ns/op           33024 B/op         77 allocs/op
BenchmarkBlobovniczas_Put/tree=10x0-8                 51          19937993 ns/op           33017 B/op         77 allocs/op
BenchmarkBlobovniczas_Put/tree=10x0-8                 60          19911600 ns/op           33390 B/op         77 allocs/op
BenchmarkBlobovniczas_Put/tree=2x2-8                  57          19892171 ns/op           34676 B/op        112 allocs/op
BenchmarkBlobovniczas_Put/tree=2x2-8                  55          20462990 ns/op           35634 B/op        112 allocs/op
BenchmarkBlobovniczas_Put/tree=2x2-8                  51          20171586 ns/op           35801 B/op        112 allocs/op
BenchmarkBlobovniczas_Put/tree=4x4-8                  74          16178143 ns/op           37526 B/op        177 allocs/op
BenchmarkBlobovniczas_Put/tree=4x4-8                  82          15692310 ns/op           38175 B/op        180 allocs/op
BenchmarkBlobovniczas_Put/tree=4x4-8                  70          15463406 ns/op           38270 B/op        181 allocs/op

And now 🪄:

--- a/pkg/local_object_storage/blobovnicza/put.go
+++ b/pkg/local_object_storage/blobovnicza/put.go
@@ -52,7 +52,7 @@ func (b *Blobovnicza) Put(prm PutPrm) (PutRes, error) {
        bucketName := bucketForSize(sz)
        key := addressKey(prm.addr)
 
-       err := b.boltDB.Batch(func(tx *bbolt.Tx) error {
+       err := b.boltDB.Update(func(tx *bbolt.Tx) error {
                if b.full() {
                        return ErrFull
                }

Which leads to

BenchmarkBlobovniczas_Put/tree=1x0-8                 702           2028573 ns/op           34427 B/op         78 allocs/op
BenchmarkBlobovniczas_Put/tree=1x0-8                 666           1835332 ns/op           34542 B/op         78 allocs/op
BenchmarkBlobovniczas_Put/tree=1x0-8                 670           1910506 ns/op           34466 B/op         78 allocs/op
BenchmarkBlobovniczas_Put/tree=10x0-8                649           2054211 ns/op           34187 B/op         78 allocs/op
BenchmarkBlobovniczas_Put/tree=10x0-8                633           1892979 ns/op           34329 B/op         78 allocs/op
BenchmarkBlobovniczas_Put/tree=10x0-8                662           2001137 ns/op           34367 B/op         78 allocs/op
BenchmarkBlobovniczas_Put/tree=2x2-8                 556           1927446 ns/op           34935 B/op        103 allocs/op
BenchmarkBlobovniczas_Put/tree=2x2-8                 634           1939887 ns/op           34880 B/op        104 allocs/op
BenchmarkBlobovniczas_Put/tree=2x2-8                 616           1912483 ns/op           35060 B/op        104 allocs/op
BenchmarkBlobovniczas_Put/tree=4x4-8                 483           3807073 ns/op           37531 B/op        153 allocs/op
BenchmarkBlobovniczas_Put/tree=4x4-8                 532           3495096 ns/op           37740 B/op        152 allocs/op
BenchmarkBlobovniczas_Put/tree=4x4-8                 362           4435523 ns/op           36668 B/op        156 allocs/op

goos: linux
goarch: amd64
pkg: github.com/nspcc-dev/neofs-node/pkg/local_object_storage/blobstor/blobovniczatree
cpu: Intel(R) Core(TM) i5-10210U CPU @ 1.60GHz
BenchmarkBlobovniczas_Put/tree=1x0-8                  61          19747177 ns/op           33398 B/op         77 allocs/op
BenchmarkBlobovniczas_Put/tree=10x0-8                 60          18623677 ns/op           33600 B/op         77 allocs/op
BenchmarkBlobovniczas_Put/tree=2x2-8                  56          20861449 ns/op           36191 B/op        112 allocs/op
BenchmarkBlobovniczas_Put/tree=4x4-8                  43          25999988 ns/op           38511 B/op        182 allocs/op

Signed-off-by: Leonard Lyubich <[email protected]>
@cthulhu-rider cthulhu-rider force-pushed the bugfix/blobovnicza-tree-zero-depth branch from 2080185 to cc3fb47 Compare July 25, 2023 15:12
@cthulhu-rider
Contributor Author

But it's intended for small objects, so I've made

Yep, I got carried away with 1M, which is the default for a storage node. I changed it to 4K, got similar results, and updated the test.

And for everyone who loves Bolt's Batch()

Personally, I haven't yet had the opportunity to test Bolt's native batching in practice, but according to these results its benefits are doubtful in the blobovnicza tree.


I think we have sufficient evidence to try optimizing the data structure into a single database with custom batching, right? @roman-khimov @carpawell

@roman-khimov
Member

Want more fun? Add some threading into the mix:

@@ -66,28 +67,33 @@ func benchmarkPutMN(b *testing.B, depth, width uint64) {
                RawData: make([]byte, objSizeLimit),
        }
 
+       rand.Read(prm.RawData)
        b.ReportAllocs()
        b.ResetTimer()
 
-       var err error
+       var wg sync.WaitGroup
 
-       for i := 0; i < b.N; i++ {
-               b.StopTimer()
-               prm.Address = oidtest.Address()
-               rand.Read(prm.RawData)
-               b.StartTimer()
+       var f = func(prm common.PutPrm) {
+               var err error
+               for i := 0; i < b.N; i++ {
+                       prm.Address = oidtest.Address()
 
-               _, err = bbcz.Put(prm)
+                       _, err = bbcz.Put(prm)
 
-               b.StopTimer()
-               if err != nil {
-                       if errors.Is(err, common.ErrNoSpace) {
-                               break
+                       if err != nil {
+                               if errors.Is(err, common.ErrNoSpace) {
+                                       break
+                               }
+                               require.NoError(b, err)
                        }
-                       require.NoError(b, err)
                }
-               b.StartTimer()
+               wg.Done()
        }
+       for j := 0; j < 20; j++ {
+               wg.Add(1)
+               go f(prm)
+       }
+       wg.Wait()
 }
 
 func BenchmarkBlobovniczas_Put(b *testing.B) {

Batched:

BenchmarkBlobovniczas_Put/tree=1x0-8                  82          14652080 ns/op          527065 B/op        771 allocs/op
BenchmarkBlobovniczas_Put/tree=1x0-8                  79          17694800 ns/op          521569 B/op        770 allocs/op
BenchmarkBlobovniczas_Put/tree=1x0-8                  49          22042910 ns/op          504812 B/op        718 allocs/op
BenchmarkBlobovniczas_Put/tree=10x0-8                 49          23533586 ns/op          498820 B/op        709 allocs/op
BenchmarkBlobovniczas_Put/tree=10x0-8                 54          43626269 ns/op          503995 B/op        723 allocs/op
BenchmarkBlobovniczas_Put/tree=10x0-8                 73          14796724 ns/op          515087 B/op        754 allocs/op
BenchmarkBlobovniczas_Put/tree=2x2-8                  86          14203539 ns/op          589119 B/op       1528 allocs/op
BenchmarkBlobovniczas_Put/tree=2x2-8                  81          14044176 ns/op          576998 B/op       1511 allocs/op
BenchmarkBlobovniczas_Put/tree=2x2-8                  84          14373252 ns/op          581395 B/op       1520 allocs/op
BenchmarkBlobovniczas_Put/tree=4x4-8                  68          15251037 ns/op          866981 B/op       3052 allocs/op
BenchmarkBlobovniczas_Put/tree=4x4-8                  72          22075904 ns/op          881648 B/op       3063 allocs/op
BenchmarkBlobovniczas_Put/tree=4x4-8                  61          37769824 ns/op          867537 B/op       3084 allocs/op

Non-batched:

BenchmarkBlobovniczas_Put/tree=1x0-8                  16          74108820 ns/op          680705 B/op       1510 allocs/op
BenchmarkBlobovniczas_Put/tree=1x0-8                  14          74459352 ns/op          679138 B/op       1492 allocs/op
BenchmarkBlobovniczas_Put/tree=1x0-8                  15          75338695 ns/op          674179 B/op       1489 allocs/op
BenchmarkBlobovniczas_Put/tree=10x0-8                 15          77471599 ns/op          686840 B/op       1500 allocs/op
BenchmarkBlobovniczas_Put/tree=10x0-8                 15          73161863 ns/op          676613 B/op       1504 allocs/op
BenchmarkBlobovniczas_Put/tree=10x0-8                 15          73685846 ns/op          679125 B/op       1498 allocs/op
BenchmarkBlobovniczas_Put/tree=2x2-8                  38          28185505 ns/op          711560 B/op       2115 allocs/op
BenchmarkBlobovniczas_Put/tree=2x2-8                  37          29134521 ns/op          713372 B/op       2116 allocs/op
BenchmarkBlobovniczas_Put/tree=2x2-8                  40          31421310 ns/op          714485 B/op       2128 allocs/op
BenchmarkBlobovniczas_Put/tree=4x4-8                 194           5928512 ns/op          927681 B/op       2907 allocs/op
BenchmarkBlobovniczas_Put/tree=4x4-8                 177           5974973 ns/op          927027 B/op       2909 allocs/op
BenchmarkBlobovniczas_Put/tree=4x4-8                 182           6380906 ns/op          925895 B/op       2908 allocs/op

100 threads batched:

BenchmarkBlobovniczas_Put/tree=1x0-8                  46          32179025 ns/op         2479720 B/op       3685 allocs/op
BenchmarkBlobovniczas_Put/tree=1x0-8                  63          36400444 ns/op         2534475 B/op       3832 allocs/op
BenchmarkBlobovniczas_Put/tree=1x0-8                  38          39817475 ns/op         2443816 B/op       3571 allocs/op
BenchmarkBlobovniczas_Put/tree=10x0-8                 32          38128756 ns/op         2335914 B/op       3434 allocs/op
BenchmarkBlobovniczas_Put/tree=10x0-8                 34          36719362 ns/op         2328869 B/op       3380 allocs/op
BenchmarkBlobovniczas_Put/tree=10x0-8                 34          38328624 ns/op         2318948 B/op       3381 allocs/op
BenchmarkBlobovniczas_Put/tree=2x2-8                  36          66034753 ns/op         2782671 B/op       7459 allocs/op
BenchmarkBlobovniczas_Put/tree=2x2-8                  49          25038323 ns/op         2680942 B/op       6895 allocs/op
BenchmarkBlobovniczas_Put/tree=2x2-8                  50          24928362 ns/op         2631979 B/op       6858 allocs/op
BenchmarkBlobovniczas_Put/tree=4x4-8                  34          30344009 ns/op         4119804 B/op      14370 allocs/op
BenchmarkBlobovniczas_Put/tree=4x4-8                  44          23330084 ns/op         4158518 B/op      14287 allocs/op
BenchmarkBlobovniczas_Put/tree=4x4-8                  44          56350572 ns/op         4269706 B/op      14784 allocs/op

100 threads non-batched:

BenchmarkBlobovniczas_Put/tree=1x0-8                   6         378786863 ns/op         3420346 B/op       7844 allocs/op
BenchmarkBlobovniczas_Put/tree=1x0-8                   4         387815959 ns/op         3413120 B/op       7665 allocs/op
BenchmarkBlobovniczas_Put/tree=1x0-8                   3         347173756 ns/op         3375656 B/op       7548 allocs/op
BenchmarkBlobovniczas_Put/tree=10x0-8                  3         382709638 ns/op         3385261 B/op       7535 allocs/op
BenchmarkBlobovniczas_Put/tree=10x0-8                  3         384898018 ns/op         3405813 B/op       7527 allocs/op
BenchmarkBlobovniczas_Put/tree=10x0-8                  4         375371866 ns/op         3444208 B/op       7691 allocs/op
BenchmarkBlobovniczas_Put/tree=2x2-8                   9         154033942 ns/op         3552957 B/op      10697 allocs/op
BenchmarkBlobovniczas_Put/tree=2x2-8                   7         154035856 ns/op         3545157 B/op      10573 allocs/op
BenchmarkBlobovniczas_Put/tree=2x2-8                   7         157911228 ns/op         3545810 B/op      10581 allocs/op
BenchmarkBlobovniczas_Put/tree=4x4-8                  31          37806730 ns/op         4648195 B/op      24533 allocs/op
BenchmarkBlobovniczas_Put/tree=4x4-8                  15          82339257 ns/op         4463610 B/op      19863 allocs/op
BenchmarkBlobovniczas_Put/tree=4x4-8                  60          21126481 ns/op         4325975 B/op      16357 allocs/op
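
Since each configuration was run three times, a per-configuration mean makes these tables easier to compare. `benchstat` from golang.org/x/perf does this properly; the following is only a small stdlib sketch that averages the `ns/op` column per benchmark name. The function name `meanNsPerOp` is hypothetical:

```go
package main

import (
	"bufio"
	"fmt"
	"strconv"
	"strings"
)

// meanNsPerOp reads `go test -bench` output and returns the mean ns/op for
// every benchmark name. Lines that do not look like results are skipped.
func meanNsPerOp(output string) map[string]float64 {
	sum := map[string]float64{}
	cnt := map[string]int{}

	sc := bufio.NewScanner(strings.NewReader(output))
	for sc.Scan() {
		f := strings.Fields(sc.Text())
		// Expected shape: name, iterations, value, "ns/op", ...
		if len(f) < 4 || !strings.HasPrefix(f[0], "Benchmark") || f[3] != "ns/op" {
			continue
		}
		v, err := strconv.ParseFloat(f[2], 64)
		if err != nil {
			continue
		}
		sum[f[0]] += v
		cnt[f[0]]++
	}

	mean := make(map[string]float64, len(sum))
	for name, s := range sum {
		mean[name] = s / float64(cnt[name])
	}
	return mean
}

func main() {
	// Three runs copied from the 100-thread non-batched table.
	const out = `
BenchmarkBlobovniczas_Put/tree=1x0-8   6   378786863 ns/op   3420346 B/op   7844 allocs/op
BenchmarkBlobovniczas_Put/tree=1x0-8   4   387815959 ns/op   3413120 B/op   7665 allocs/op
BenchmarkBlobovniczas_Put/tree=1x0-8   3   347173756 ns/op   3375656 B/op   7548 allocs/op
`
	for name, m := range meanNsPerOp(out) {
		fmt.Printf("%s: %.0f ns/op\n", name, m)
	}
}
```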

@cthulhu-rider
Contributor Author

Want more fun?

not really, swing in one direction or the other

Add some threading into the mix

I can add parallelism (as an option or not).

@roman-khimov
Member

not really, swing in one direction or the other

It doesn't, in fact. It's very consistent and predictable. But the point remains: the bbcz tree adds zero value.

@roman-khimov
Member

We can add all modes, merge this one (just to save it for the future) and then take a 🔪 and do #2453.

goos: linux
goarch: amd64
pkg: github.com/nspcc-dev/neofs-node/pkg/local_object_storage/blobstor/blobovniczatree
cpu: Intel(R) Core(TM) i5-10210U CPU @ 1.60GHz
BenchmarkBlobovniczas_Put/tree=1x0_parallel-8                 50          25138722 ns/op          503770 B/op        718 allocs/op
BenchmarkBlobovniczas_Put/tree=10x0_parallel-8                49          24535074 ns/op          502562 B/op        723 allocs/op
BenchmarkBlobovniczas_Put/tree=2x2_parallel-8                 30          53013230 ns/op          630769 B/op       1748 allocs/op
BenchmarkBlobovniczas_Put/tree=4x4_parallel-8                 19          54977576 ns/op          762231 B/op       3308 allocs/op

Signed-off-by: Leonard Lyubich <[email protected]>
@cthulhu-rider
Contributor Author

Added parallel runs in d22c757.

@roman-khimov roman-khimov merged commit 782f658 into nspcc-dev:master Jul 26, 2023
6 of 8 checks passed