Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Bug]: Basic/LRU Segment Cache is not thread-safe #3334

Open
tazarov opened this issue Dec 19, 2024 · 0 comments · May be fixed by #3335
Open

[Bug]: Basic/LRU Segment Cache is not thread-safe #3334

tazarov opened this issue Dec 19, 2024 · 0 comments · May be fixed by #3335
Assignees
Labels
bug Something isn't working by-chroma Local Chroma An improvement to Local (single node) Chroma

Comments

@tazarov
Copy link
Contributor

tazarov commented Dec 19, 2024

What happened?

A user on discord brought up the following error when discussing memory management strategies and the use of the LRU cache:

image

image

Ref: https://discord.com/channels/1073293645303795742/1318044900134092920/1318478281468805152

After some digging and experimentation, turns out this is a thread-safety issue with LRU Segment cache implementation (also applies to the Basic cache, granted that there isn't much that can go wrong there with the current implementation).

As it turns out the LRU cache never had proper testing.

A simple test can reproduce the issue:

import threading
import numpy as np
import pytest
import uuid
from typing import Dict

from chromadb.segment.impl.manager.cache.cache import SegmentLRUCache
from chromadb.types import Segment, SegmentScope


def new_segment(collection_id: uuid.UUID) -> Segment:
    return Segment(
        id=uuid.uuid4(),
        type="test",
        scope=SegmentScope.VECTOR,
        collection=collection_id,
        metadata=None,
        file_paths={},
    )


@pytest.fixture
def cache_setup():
    def _get_segment_disk_size(key: uuid.UUID) -> int:
        return np.random.randint(1, 100)

    def callback_cache_evict(segment: Segment) -> None:
        pass

    class SetupData:
        def __init__(self):
            self.cache = SegmentLRUCache(
                capacity=1000,
                callback=lambda k, v: callback_cache_evict(v),
                size_func=lambda k: _get_segment_disk_size(k),
            )
            self.iterations = 10000
            self.num_threads = 50
            self.errors: Dict[str, int] = {
                "concurrency_error": 0
            }
            self.lock = threading.Lock()

    return SetupData()


def test_thread_safety(cache_setup):
    """Test that demonstrates thread safety issues in the LRU cache"""

    def worker():
        """Worker that performs multiple cache operations"""
        iter =0
        try:
            while cache_setup.errors["concurrency_error"] <=0:
                iter += 1
                key = uuid.uuid4()
                segment = new_segment(key)
                cache_setup.cache.set(key, segment)
                if iter >= cache_setup.iterations:
                    print(f"Stopping due to max iterations: {iter} reached")
                    break

        except Exception as e:
            if "dictionary changed size during iteration" in str(e):
                with cache_setup.lock:
                    cache_setup.errors["concurrency_error"] += 1

    # Create and start threads
    threads = []
    for _ in range(cache_setup.num_threads):
        t = threading.Thread(target=worker)
        threads.append(t)
        t.start()

    # Wait for all threads to complete
    for t in threads:
        t.join()

    # Assert that we found thread safety issues
    assert cache_setup.errors["concurrency_error"] > 0, "Expected to find thread safety issues but none were detected"

Versions

Chroma 0.4.23+, any python version and any OS

Relevant log output

No response

@tazarov tazarov added the bug Something isn't working label Dec 19, 2024
@tazarov tazarov added Local Chroma An improvement to Local (single node) Chroma by-chroma labels Dec 19, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working by-chroma Local Chroma An improvement to Local (single node) Chroma
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants