Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Page leak in restore_savepoint() #832

Closed
mconst opened this issue Jul 26, 2024 · 6 comments · Fixed by #833 or #846
Closed

Page leak in restore_savepoint() #832

mconst opened this issue Jul 26, 2024 · 6 comments · Fixed by #833 or #846

Comments

@mconst
Copy link

mconst commented Jul 26, 2024

When restoring a savepoint, any pages which have been freed and then reallocated in the time since the savepoint was created are leaked permanently. Here's the sequence of events:

read starts (reading tx 0)
tx 1:
    adds page A to the freed tree
tx 2:
    can't free page A due to the live read
read ends
tx 3:
    creates savepoint
    processes the freed tree, freeing page A
tx 4:
    allocates page A
    restores savepoint

When tx 4 calls restore_savepoint(), it diffs the allocator state to find out which pages have been allocated since the savepoint was created -- but page A won't show up in the diff, because it was allocated at both times! The restored freed tree does have an entry saying that page A should be freed, but restore_savepoint() clears out the old freed-tree entries without processing them. At that point, the page is no longer referenced anywhere and has been leaked.

I believe restore_savepoint() shouldn't be throwing away the old freed-tree entries without looking at them. Instead, I think it needs to either look at each entry and free the page if it's currently allocated, or (maybe nicer) remove all those pages from the allocator snapshot before doing the diff, which will cause any reallocated pages to correctly show up in allocated_since_savepoint.

@mconst
Copy link
Author

mconst commented Jul 26, 2024

Note that the example above was slightly complicated in order to work around #830. Now that that's fixed, here's a simpler repro:

tx 1:
    adds page A to the freed tree
tx 2:
    creates savepoint
    processes the freed tree, freeing page A
tx 3:
    allocates page A
    restores savepoint

@cberner
Copy link
Owner

cberner commented Jul 27, 2024

Nice find! I think there may still be some bugs in the savepoint handling. But I believe I fixed this one and another one in the attached PR. And I'm modifying the fuzzer to detect leaked pages

@mconst
Copy link
Author

mconst commented Jul 28, 2024

Yay, thanks for fixing this so quickly!

You probably know this already, but just in case: I noticed that 819ccf9 removed the fast-path frees for uncommitted pages, saying "This can't happen, since we only restore at the beginning of a clean transaction". But I don't think that's true: redb does allow restore_savepoint() in a dirty transaction, and in fact both of those fast-path frees were reachable.

I don't think there's any need to fix this! The current code looks correct to me, and restoring in a dirty transaction doesn't seem like an important use case; it certainly doesn't need to be optimized. But if you were relying on the transaction being clean for some other safety properties, that's not actually being enforced right now.

@cberner
Copy link
Owner

cberner commented Jul 28, 2024

Oh ya, oops. I forgot I had allowed that :) Fixed in #835

@cberner cberner reopened this Aug 10, 2024
@cberner
Copy link
Owner

cberner commented Aug 10, 2024

I discovered there are still some cases where pages can leak, and added some design documentation here: #840

@cberner
Copy link
Owner

cberner commented Aug 11, 2024

Fixed another leak here: #841

Unfortunately that PR will make restore_savepoint() quite a bit slower on large databases

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants