Replies: 11 comments 8 replies
-
Thanks for your research into this @werbenhu, it's very comprehensive!
Did you mean temporarily use Badger/v1?
-
@mochi-co I personally prefer V4, as V1 consumes a very large amount of disk space. Even with the GC issue in V4, it uses far less disk space. There may be some inaccuracies in my test results, so please feel free to test it in your own environments as well.
-
@werbenhu Ah, I understand what you mean. Although it's not ideal, this sounds like the best compromise for now, since most complaints stem from disk-space usage. A 10x reduction in disk usage should be beneficial even with the GC issues.
-
@mochi-co I've submitted PR #376; please take a look. It would be best if we could find a better solution (such as the Badger community fixing this GC issue tomorrow 😁).
-
Here at https://github.com/smallnest/kvbench there are many key-value store implementations written in pure Go. Perhaps we could pick a stable one and implement a storage hook for it as a built-in persistent database for users to choose from.
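If we go that route, it might help to write the hook against a small key-value adapter rather than any one engine, so the underlying store stays swappable. A minimal sketch of what such an adapter could look like (the interface and method names here are hypothetical, not part of mochi-mqtt or kvbench):

```go
package store

// KV is a hypothetical minimal adapter a persistence hook could target,
// allowing any pure-Go engine (Pebble, Badger, Bolt, ...) to be plugged in.
type KV interface {
    Set(key, value []byte) error
    Get(key []byte) ([]byte, error)
    Delete(key []byte) error
    // IteratePrefix calls fn for every key/value pair whose key starts with prefix.
    IteratePrefix(prefix []byte, fn func(key, value []byte) error) error
    Close() error
}
```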
-
I've found Pebble to be nearly perfect for this. Inserting 1 million data entries with the NoSync option takes only 2 seconds, and disk usage is only about 80 MB. Searching for data by key prefix across the 1 million entries is also very fast. I've decided to make time to implement the Pebble hook. Here's the code I tested:

```go
package main

import (
    "fmt"
    "log"
    "net/http"
    "os"
    "os/signal"
    "strconv"
    "syscall"

    "github.com/cockroachdb/pebble"
    "github.com/gin-gonic/gin"
)

const (
    PATH  = ".pebble"
    RATIO = 0.5
)

var db *pebble.DB

// insert writes `size` keys under the given prefix, each with a 1 KiB value.
func insert(c *gin.Context) {
    prefix := c.Query("prefix")
    size := c.Query("size")
    cnt, _ := strconv.Atoi(size)
    fmt.Printf("insert cnt:%d, prefix:%s\n", cnt, prefix)
    for i := 1; i <= cnt; i++ {
        bs := make([]byte, 1024)
        for j := 0; j < 1024; j++ {
            bs[j] = byte(i)
        }
        // NoSync trades durability for speed: writes are not fsynced before returning.
        err := db.Set([]byte(prefix+strconv.Itoa(i)), bs, pebble.NoSync)
        if err != nil {
            c.JSON(http.StatusOK, gin.H{
                "err": err.Error(),
            })
            return
        }
    }
    c.JSON(http.StatusOK, gin.H{
        "message": "success",
    })
}

// count iterates over all keys with the given prefix (or the whole DB) and returns the total.
func count(c *gin.Context) {
    prefix := c.Query("prefix")
    var count int
    var iter *pebble.Iterator
    if len(prefix) == 0 {
        iter, _ = db.NewIter(nil) // errors ignored for brevity in this test
    } else {
        iter, _ = db.NewIter(&pebble.IterOptions{
            LowerBound: []byte(prefix),
            UpperBound: keyUpperBound([]byte(prefix)),
        })
    }
    for iter.First(); iter.Valid(); iter.Next() {
        count++
    }
    if err := iter.Close(); err != nil {
        log.Fatal(err)
    }
    c.JSON(http.StatusOK, gin.H{
        "count": count,
    })
}

// delete removes `size` keys under the given prefix.
func delete(c *gin.Context) {
    prefix := c.Query("prefix")
    size := c.Query("size")
    cnt, _ := strconv.Atoi(size)
    fmt.Printf("delete cnt:%d, prefix:%s\n", cnt, prefix)
    for i := 1; i <= cnt; i++ {
        err := db.Delete([]byte(prefix+strconv.Itoa(i)), pebble.NoSync)
        if err != nil {
            c.JSON(http.StatusOK, gin.H{
                "err": err.Error(),
            })
            return
        }
    }
    c.JSON(http.StatusOK, gin.H{
        "message": "success",
    })
}

// find looks up a single key and returns its value.
func find(c *gin.Context) {
    key := c.Query("key")
    value, closer, err := db.Get([]byte(key))
    if err != nil {
        c.JSON(http.StatusOK, gin.H{
            "err": err.Error(),
        })
        return
    }
    defer closer.Close()
    c.JSON(http.StatusOK, gin.H{
        "message": "success",
        "bs":      value,
    })
}

// keyUpperBound returns the smallest key strictly greater than every key
// starting with b, for use as an exclusive iterator upper bound.
func keyUpperBound(b []byte) []byte {
    end := make([]byte, len(b))
    copy(end, b)
    for i := len(end) - 1; i >= 0; i-- {
        end[i] = end[i] + 1
        if end[i] != 0 {
            return end[:i+1]
        }
    }
    return nil // no upper-bound
}

func main() {
    sigs := make(chan os.Signal, 1)
    done := make(chan bool, 1)
    signal.Notify(sigs, syscall.SIGINT, syscall.SIGTERM)
    go func() {
        <-sigs
        done <- true
    }()

    var err error
    db, err = pebble.Open(PATH, &pebble.Options{})
    if err != nil {
        log.Fatal(err)
    }

    r := gin.Default()
    r.GET("/insert", insert)
    r.GET("/delete", delete)
    r.GET("/find", find)
    r.GET("/count", count)
    go r.Run(":9000")

    <-done
    fmt.Printf("Server done\n")
    if err := db.Close(); err != nil {
        fmt.Printf("close db err:%s\n", err.Error())
    }
}
```
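To reproduce the numbers, the endpoints can be driven from a browser or curl, for example http://localhost:9000/insert?prefix=test-&size=1000000, then /count?prefix=test-, /find?key=test-1, and /delete?prefix=test-&size=1000000. One caveat: pebble.NoSync means writes are not synced to disk before returning, so the 2-second insert figure trades crash durability for speed.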
-
I will test this later this weekend. I have not used Pebble before, but I trust almost anything that comes out of CockroachDB. It really seems like BadgerDB is almost more trouble than it is worth.
-
I agree with @dgduncan @werbenhu regarding Pebble. For what it's worth, I only implemented Badger as it was touted as the successor to Bolt at the time. I didn't put any extensive thought or usage research into it. Absolutely happy to deprecate it and integrate Pebble as the preferred storage mechanism.
-
@dgduncan I don't have any time this weekend. It would be great if you could implement this hook if you have time.
-
Hello, what's the status now?
-
@mochi-co @dgduncan PR #378 has added a persistence hook based on cockroachdb/pebble. Please review it.
-
Regarding issues with BadgerDB:
The difference between badgerhold and using Badger/v4 is essentially the difference between Badger/v1 and Badger/v4, since badgerhold uses V1.
With github.com/dgraph-io/badger (V1) and Badger/v2, inserting 1 million data entries consumes over 2 GB of disk space, and insertion is very slow. Even after deleting all the data and calling GC, around 1 GB of residual disk space remains; I haven't been able to figure out why.
With Badger/v3 and Badger/v4, insertion is extremely fast, and disk usage for the same 1 million entries is only 200-300 MB. However, both versions suffer from GC not working properly: after deleting all the data and calling GC, there is no reduction in disk space. You can refer to this issue: GC doesn't seem to run.
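For context on what "calling GC" means here: Badger's value-log GC has to be triggered by the application, and since each call rewrites at most one value-log file, it is usually run in a loop until nothing is left to rewrite. A minimal sketch for Badger/v4 (the 0.5 discard ratio and the .badger path are example values, not the exact test setup):

```go
package main

import (
    "log"

    badger "github.com/dgraph-io/badger/v4"
)

func main() {
    db, err := badger.Open(badger.DefaultOptions(".badger"))
    if err != nil {
        log.Fatal(err)
    }
    defer db.Close()

    // Value-log GC is not automatic; run it repeatedly until Badger reports
    // there was nothing worth rewriting (badger.ErrNoRewrite).
    for {
        if err := db.RunValueLogGC(0.5); err != nil {
            break
        }
    }
}
```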
Considering the current situation, I think it might be best to temporarily use Badger/v4. I am still exploring whether there are alternative solutions. @thedevop @mochi-co @dgduncan I would like to hear your opinions on this matter.
The following is the code I tested: