🌲 B-Tree Implementation

📚 Overview

This project implements a B-tree data structure, a self-balancing tree that maintains sorted data and allows searches, insertions, and deletions in logarithmic time. Unlike binary trees, B-trees are optimized for systems that read and write large blocks of data, such as databases and file systems.

Reference: "Baeldung B-Tree Data Structure", "Database Index Internals: Understanding the Data Structures"

🔑 Key Algorithm Points

B-Tree Properties

🏢 Each node contains multiple keys and children (high branching factor)
🔄 All leaf nodes are at the same depth (perfectly balanced)
📏 Nodes have between t-1 and 2t-1 keys (where t is the minimum degree)
🌿 Non-leaf nodes have between t and 2t children
📦 All keys within a node are sorted
🔍 Every key serves as a separator between child subtrees

Core Operations

🔎 Search: Navigate from root to leaf by comparing keys at each node
➕ Insert: Add key and split nodes when they become too full
➖ Delete: Remove key and rebalance by redistributing or merging nodes

🛠️ Core Methods

BTree Class

public class BTree<K extends Comparable<K>, V> {
    // Main methods
}

Method	Description	Time Complexity
`search(K key)`	🔍 Searches for a key and returns its value if found	O(log n)
`insert(K key, V value)`	➕ Inserts a key-value pair into the tree	O(log n)
`delete(K key)`	➖ Removes a key (and its value) from the tree	O(log n)
`print()`	📄 Prints the B-tree structure	O(n)

Internal Helper Methods

Method	Description
`splitChild(Node<K, V> parent, int index)`	🪓 Splits a full child node during insertion
`insertNonFull(Node<K, V> node, K key, V value)`	📥 Inserts a key into a non-full node
`removeFromLeaf(Node<K, V> node, int idx)`	🍃 Removes a key from a leaf node
`removeFromNonLeaf(Node<K, V> node, int idx)`	🌳 Removes a key from an internal node
`getPredecessor(Node<K, V> node, int idx)`	⬅️ Finds the predecessor of a key
`getSuccessor(Node<K, V> node, int idx)`	➡️ Finds the successor of a key
`borrowFromPrev(Node<K, V> node, int idx)`	◀️ Borrows a key from the previous child during deletion
`borrowFromNext(Node<K, V> node, int idx)`	▶️ Borrows a key from the next child during deletion
`merge(Node<K, V> node, int idx)`	🔄 Merges two child nodes during deletion

📊 Performance Test Scenarios

🚀 B-Tree Strengths

Random Access with Large Datasets 📈
- Test Description: Compares B-tree vs TreeMap for lookups in large datasets (100,000+ elements)
- Expected Results:
  - 🥇 B-tree should demonstrate comparable or slightly better performance for random lookups
  - With larger minimum degree (10+), B-tree's advantage becomes more significant
- Why: B-trees minimize the number of disk/memory accesses with higher branching factor. Each node visit retrieves multiple keys at once, unlike binary trees where each node contains a single key.
- Real-world Impact: In database systems where disk I/O is expensive, B-trees significantly outperform binary search trees.
Range Queries 🔢
- Test Description: Evaluates sequential range data retrieval between B-tree and TreeMap
- Expected Results:
  - 🥇 B-tree should excel at retrieving keys in a specific range
  - Advantage increases with larger node degrees and range sizes
- Why: Keys stored in sorted order within nodes allow efficient traversal between adjacent keys with fewer node jumps. Once a node is loaded, multiple keys in sequence can be processed.
- Real-world Impact: Critical for database query optimizers that need to retrieve ranges of records efficiently.

🐢 B-Tree Weaknesses

Memory Overhead 💾
- Test Description: Measures memory consumption with different tree structures and node degrees
- Expected Results:
  - 🚨 B-trees should show higher memory usage compared to binary tree implementations
  - Memory usage increases with node degree
- Why: B-tree nodes must allocate space for the maximum possible number of keys and children, often resulting in partially filled nodes. This is a fundamental space-time tradeoff in the B-tree design.
- Real-world Impact: In memory-constrained environments, B-trees may not be the optimal choice unless disk I/O savings outweigh the memory cost.
Update Performance ⚙️
- Test Description: Compares update operation speed between B-tree, TreeMap, and HashMap
- Expected Results:
  - 🚨 B-tree updates should be slower than both TreeMap and significantly slower than HashMap
  - Performance gap widens as the size of the tree increases
- Why: B-tree updates require complex rebalancing operations, including node splits, merges, and rotations. Each update might trigger cascading operations affecting multiple nodes.
- Real-world Impact: Applications with extremely high write volumes might benefit from alternative structures or buffering strategies.
Sequential Operation Patterns 📉
- Test Description: Tests sequential vs random insertion patterns in B-trees
- Expected Results:
  - 🚨 Sequential inserts should perform worse than random inserts in some cases
  - Performance degradation increases with strictly increasing or decreasing key sequences
- Why: Sequential inserts can cause repeated splitting at the rightmost (or leftmost) path of the tree, leading to poor performance and potentially unbalanced structures. This is particularly problematic in disk-based implementations.
- Real-world Impact: Applications with naturally sequential data (like timestamps) may need to consider key transformation strategies or bulk loading techniques.

🎯 Use Cases

💽 Databases: Ideal for database indexing due to efficient disk I/O
📁 File Systems: Many file systems use B-trees for directory structures
🔄 External Memory Algorithms: Where data doesn't fit entirely in memory
📊 Data Warehousing: For large dataset operations with infrequent updates

🧪 Running the Tests

Execute tests with Gradle:

./gradlew test

The test suite includes both functional tests (BTreeTest) and performance benchmarks (BTreePerformanceTest) that demonstrate both strengths and weaknesses of the B-tree implementation.

Name		Name	Last commit message	Last commit date
Latest commit History 6 Commits
gradle		gradle
src		src
.gitattributes		.gitattributes
.gitignore		.gitignore
README.md		README.md
build.gradle		build.gradle
gradle.properties		gradle.properties
gradlew		gradlew
gradlew.bat		gradlew.bat
settings.gradle		settings.gradle

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Repository files navigation

🌲 B-Tree Implementation

📚 Overview

🔑 Key Algorithm Points

B-Tree Properties

Core Operations

🛠️ Core Methods

BTree Class

Internal Helper Methods

📊 Performance Test Scenarios

🚀 B-Tree Strengths

🐢 B-Tree Weaknesses

🎯 Use Cases

🧪 Running the Tests

About

Uh oh!

Releases

Packages

Languages

akulik512/b-tree

Folders and files

Latest commit

History

Repository files navigation

🌲 B-Tree Implementation

📚 Overview

🔑 Key Algorithm Points

B-Tree Properties

Core Operations

🛠️ Core Methods

BTree Class

Internal Helper Methods

📊 Performance Test Scenarios

🚀 B-Tree Strengths

🐢 B-Tree Weaknesses

🎯 Use Cases

🧪 Running the Tests

About

Resources

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages