While testing Pulsar 4.0.3 / BookKeeper 4.17.1, I noticed that BookKeeper crashes with the following JVM fatal error:
#
# A fatal error has been detected by the Java Runtime Environment:
#
# SIGSEGV (0xb) at pc=0x0000ffff7f2d5f48, pid=1, tid=237
#
# JRE version: OpenJDK Runtime Environment Corretto-21.0.6.7.1 (21.0.6+7) (build 21.0.6+7-LTS)
# Java VM: OpenJDK 64-Bit Server VM Corretto-21.0.6.7.1 (21.0.6+7-LTS, mixed mode, tiered, compressed oops, compressed class ptrs, g1 gc, linux-aarch64)
# Problematic frame:
# C [librocksdbjni14395278800560636484.so+0x2a9f48] Java_org_rocksdb_RocksDB_getLongProperty+0x150
#
# No core dump will be written. Core dumps have been disabled. To enable core dumping, try "ulimit -c unlimited" before starting Java again
#
# An error report file with more information is saved as:
# /tmp/hs_err_pid1.log
#
# If you would like to submit a bug report, please visit:
# https://github.com/corretto/corretto-21/issues/
# The crash happened outside the Java Virtual Machine in native code.
# See problematic frame for where to report the bug.
> I checked the code in KeyValueStorageRocksDB and there doesn't seem to be a solution to prevent calling count() after the storage is closed.
Yes, I think we need to stop the gauge when the object is closed, otherwise it will continue.
> When looking at the close implementation, I noticed that before closing, the RocksDB WAL isn't flushed with fsync. There seems to be another issue where a graceful shutdown isn't performed for RocksDb when running with BookKeeper.
We don't rely on flushing the WAL on close. We don't care about the WAL, because we already rely on the BK journal.
BUG REPORT
Describe the bug
While testing Pulsar 4.0.3 / BookKeeper 4.17.1, I noticed that BookKeeper crashes with a SIGSEGV in the RocksDB JNI library.
The crash occurs in the getLongProperty call at
bookkeeper/bookkeeper-server/src/main/java/org/apache/bookkeeper/bookie/storage/ldb/KeyValueStorageRocksDB.java
Lines 511 to 518 in 26da346
This gets called from stats:
bookkeeper/bookkeeper-server/src/main/java/org/apache/bookkeeper/bookie/storage/ldb/EntryLocationIndex.java
Lines 57 to 65 in 26da346
To Reproduce
The exact steps are unclear. I upgraded a local Kubernetes cluster running Apache Pulsar Helm chart version 3.9.0 / Pulsar 4.0.2 to the current master branch version of the Helm chart / Pulsar 4.0.3.
Expected behavior
Crashes shouldn't happen.
Additional context
I checked the code in KeyValueStorageRocksDB and there doesn't seem to be anything that prevents count() from being called after the storage is closed.
When looking at the close implementation, I noticed that the RocksDB WAL isn't flushed with fsync before closing. There seems to be another issue where a graceful shutdown isn't performed for RocksDB when running with BookKeeper.
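To illustrate the kind of guard discussed here, below is a minimal sketch of one possible way to keep a stats gauge from reaching the native handle after close. It is not the actual KeyValueStorageRocksDB code: the class name GuardedStore, the nativeCount supplier (standing in for the getLongProperty call), and the -1 sentinel returned after close are all hypothetical, and the real fix in BookKeeper may look different.

```java
import java.util.concurrent.locks.ReentrantReadWriteLock;
import java.util.function.LongSupplier;

/** Hypothetical wrapper, for illustration only; not BookKeeper's implementation. */
final class GuardedStore implements AutoCloseable {
    private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
    // Assumption: stands in for the native call, e.g. a getLongProperty lookup.
    private final LongSupplier nativeCount;
    // Guarded by lock: written under the write lock, read under the read lock.
    private boolean closed = false;

    GuardedStore(LongSupplier nativeCount) {
        this.nativeCount = nativeCount;
    }

    /** Returns the count, or -1 (hypothetical sentinel) once the store is closed. */
    long count() {
        lock.readLock().lock();
        try {
            if (closed) {
                return -1L; // never touch the native handle after close
            }
            return nativeCount.getAsLong();
        } finally {
            lock.readLock().unlock();
        }
    }

    /** Waits for in-flight count() calls to finish, then marks the store closed. */
    @Override
    public void close() {
        lock.writeLock().lock();
        try {
            closed = true;
            // Real code would release the native handle here, while holding the write lock.
        } finally {
            lock.writeLock().unlock();
        }
    }
}
```

The read/write lock lets concurrent gauge reads proceed in parallel while ensuring close() cannot free the handle in the middle of a read. A gauge callback that polls count() after shutdown then sees the sentinel instead of dereferencing freed native memory.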