Skip to content

s3fifo add ghost fifo #21653

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 39 commits into from
May 9, 2025
Merged

s3fifo add ghost fifo #21653

merged 39 commits into from
May 9, 2025

Conversation

cpegeric
Copy link
Contributor

@cpegeric cpegeric commented Apr 4, 2025

User description

What type of PR is this?

  • API-change
  • BUG
  • Improvement
  • Documentation
  • Feature
  • Test and CI
  • Code Refactoring

Which issue(s) this PR fixes:

issue #21654 #21589

What this PR does / why we need it:

  1. add ghost fifo to s3fifo cache as stated in the paper.
  2. clean up shard map to avoid race conditions with items under different locks (shards locks and queue lock).
  3. each item has its own mutex to protect postFunc() and its data and improve the concurrency.
  4. all queues have their own mutex to improve concurrency.
  5. replace VERY SLOW CompareAndSwap() with mutex to protect CacheItem for increment and decrement the frequency.
  6. introduce disable-s3fifo in config.

PR Type

Enhancement


Description

  • Introduced a new ghost structure for S3FIFO caching.

  • Enhanced Cache to integrate ghost for better eviction handling.

  • Added logic to manage ghost entries during enqueue and eviction.

  • Implemented a new file ghost.go for the ghost structure.


Changes walkthrough 📝

Relevant files
Enhancement
fifo.go
Integrate `ghost` structure into S3FIFO cache                       

pkg/fileservice/fifocache/fifo.go

  • Added a ghost field to the Cache structure.
  • Integrated ghost logic into enqueue and eviction processes.
  • Improved cache management by leveraging ghost for better eviction
    handling.
  • +20/-4   
    ghost.go
    Implement `ghost` structure for S3FIFO caching                     

    pkg/fileservice/fifocache/ghost.go

  • Introduced a new ghost structure for S3FIFO caching.
  • Implemented methods for adding, removing, and checking keys in ghost.
  • Added logic to manage the size of the ghost list.
  • +54/-0   

    Need help?
  • Type /help how to ... in the comments thread for any questions about Qodo Merge usage.
  • Check out the documentation for more information.
  • Copy link

    qodo-merge-pro bot commented Apr 4, 2025

    Qodo Merge was enabled for this repository. To continue using it, please link your Git account with your Qodo account here.

    PR Reviewer Guide 🔍

    Here are some key observations to aid the review process:

    ⏱️ Estimated effort to review: 2 🔵🔵⚪⚪⚪
    🧪 No relevant tests
    🔒 No security concerns identified
    ⚡ Recommended focus areas for review

    Potential Memory Leak

    The ghost implementation doesn't limit the size of the items map when keys are manually removed. While the list is capped, the map could grow unbounded if keys are frequently added and removed.

    func (b *ghost[K]) remove(key K) {
    	if e, ok := b.items[key]; ok {
    		b.ll.Remove(e)
    		delete(b.items, key)
    	}
    }
    Code Duplication

    The logic for handling ghost items is duplicated in both the main enqueue method and the help enqueue section. This could lead to maintenance issues if the logic needs to change.

    if c.ghost.contains(item.key) {
    	c.ghost.remove(item.key)
    	c.queue2.enqueue(item)
    	c.used2 += item.size
    } else {
    	c.queue1.enqueue(item)
    	c.used1 += item.size
    }
    
    // help enqueue
    for {
    	select {
    	case item := <-c.itemQueue:
    		if c.ghost.contains(item.key) {
    			c.ghost.remove(item.key)
    			c.queue2.enqueue(item)
    			c.used2 += item.size
    
    		} else {
    			c.queue1.enqueue(item)
    			c.used1 += item.size
    		}

    Copy link

    qodo-merge-pro bot commented Apr 4, 2025

    Qodo Merge was enabled for this repository. To continue using it, please link your Git account with your Qodo account here.

    PR Code Suggestions ✨

    Explore these optional code suggestions:

    CategorySuggestion                                                                                                                                    Impact
    Possible issue
    Prevent nil pointer dereference
    Suggestion Impact:The commit addresses the nil pointer dereference concern by restructuring the code. Instead of adding the exact suggested nil check, the code was completely refactored. The problematic code at lines 137-138 now directly accesses the ghost field without a nil check, but the ghost field is properly initialized in the New() function at line 99, ensuring it won't be nil.

    code diff:

     	if c.ghost.contains(item.key) {
     		c.ghost.remove(item.key)

    Add a nil check for the ghost field before accessing it. The ghost field is
    initialized in New(), but if the Cache is created through another method or if
    initialization fails, accessing c.ghost could cause a nil pointer dereference.

    pkg/fileservice/fifocache/fifo.go [164-171]

    -if c.ghost.contains(item.key) {
    +if c.ghost != nil && c.ghost.contains(item.key) {
         c.ghost.remove(item.key)
         c.queue2.enqueue(item)
         c.used2 += item.size
     } else {
         c.queue1.enqueue(item)
         c.used1 += item.size
     }

    [Suggestion has been applied]

    Suggestion importance[1-10]: 8

    __

    Why: This suggestion adds a critical nil check that prevents potential runtime panics if the ghost field is nil. Since this code appears in the main execution path of the cache, a nil pointer dereference here would cause the entire cache to fail, making this a high-impact safety improvement.

    Medium
    Add nil check
    Suggestion Impact:The commit completely refactored the ghost implementation, including changing the eviction logic. The suggestion's concern about nil checks was addressed in the new implementation at lines 68-72, where a nil check is performed before accessing the element.

    code diff:

    +	if g.queue.Len() >= g.capacity {
    +		elem := g.queue.Front()
    +		if elem != nil {
    +			evictedKey := g.queue.Remove(elem).(K)
    +			delete(g.keys, evictedKey)
    +		}
     	}

    Add a nil check before accessing e.Value to prevent a potential panic if e is
    nil. The current code assumes e.Back() always returns a valid element, but it
    could return nil if the list is empty.

    pkg/fileservice/fifocache/ghost.go [27-31]

     for b.ll.Len() >= b.size {
         e := b.ll.Back()
    +    if e == nil {
    +        break
    +    }
         delete(b.items, e.Value.(K))
         b.ll.Remove(e)
     }

    [Suggestion has been applied]

    Suggestion importance[1-10]: 7

    __

    Why: This suggestion adds an important safety check to prevent a potential panic if b.ll.Back() returns nil. While the list should rarely be empty given the context, defensive programming with nil checks is a good practice for robustness.

    Medium
    • Update

    @mergify mergify bot merged commit 25554fb into matrixorigin:main May 9, 2025
    18 checks passed
    LeftHandCold added a commit to LeftHandCold/matrixone that referenced this pull request May 15, 2025
    cpegeric added a commit to cpegeric/matrixone that referenced this pull request May 15, 2025
    mergify bot pushed a commit that referenced this pull request May 20, 2025
    XuPeng-SH pushed a commit to XuPeng-SH/matrixone that referenced this pull request May 21, 2025
    Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
    Labels
    kind/enhancement Review effort 2/5 size/L Denotes a PR that changes [500,999] lines
    Projects
    None yet
    Development

    Successfully merging this pull request may close these issues.

    6 participants