Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Broken page cache expiration logic #89

Open
sattlerc opened this issue Oct 25, 2022 · 9 comments
Open

Broken page cache expiration logic #89

sattlerc opened this issue Oct 25, 2022 · 9 comments

Comments

@sattlerc
Copy link
Contributor

sattlerc commented Oct 25, 2022

This is currently the logic for expiring a page:

  def expire_caches(page)
    expire_cached_summary_pages(page.web)
    pages_to_expire = [page.name] +
       WikiReference.pages_redirected_to(page.web, @will_expire) +
       WikiReference.pages_that_include(page.web, @will_expire)
    unless (page.name == @will_expire)
      pages_to_expire.concat ([@will_expire] +
        WikiReference.pages_that_link_to(page.web, @will_expire) +
        WikiReference.pages_that_reference(page.web, page.name))
    end
    pages_to_expire.uniq.each { |page_name| expire_cached_page(page.web, page_name) }
  end

But shouldn't it be as follows?

Notation for a set of pages X:

  • Write WITH_REDIRECTS(X) for the set of page names in X and redirects to some page in X.
  • Call X closed under inclusion if whenever p includes a page in WITH_REDIRECTS(X) X, then p is in X. Write INCLUSION_CLOSURE(X) for the smallest extension of X closed under inclusion. This can be computed recursively.
  • Write INCLUSION_REDIRECTS(X) for WITH_REDIRECTS(INCLUSION_CLOSURE(X)).
  • Write LINK_TO(X) for the set of pages linking to some page in X.

Then when a page p is edited:

  • For content changes: expire INCLUSION_REDIRECTS(p) INCLUSION_CLOSURE(p).
  • For wanted link status changes: expire INCLUSION_REDIRECTS(LINK_TO(WITH_REDIRECTS(p))) INCLUSION_CLOSURE(LINK_TO(WITH_REDIRECTS(p))).

There some further problems, which seem however by design and not so easily fixed:

  • None of the page/revision edit and wiki_reference query database transactions are atomic, so presumably there are race conditions with simultaneous edits.
  • Links to deleted redirects don't get expired, etc.
@distler
Copy link
Collaborator

distler commented Oct 25, 2022

Have you looked at WikiReferences::pages_redirected_to? Does it not do what you want?

I think your complaint is that WikiReferences::pages_that_include is not computed recursively (or, possibly, that it does not handle inclusions of redirected pages). But that's not quite right.

  • Let page A include page B and let page B include page C.
  • Edit page C.
  • When you save C, page A is expired.

I'm not sure what you mean by "links to deleted redirects don't get expired".

  • Say I have a page A that redirects for (the nonexistent) page B and pages C and D that link to B. Clicking on one of those links will land me on page A.
  • Now say I edit the page A and remove the redirect. Pages C and D gets expired, and the links on the are turned into "wanted page" links.
  • Now say I click on the link on page C. This creates the new page B. When I save B, page D is expired.

Is this not the behaviour you expect?

@distler
Copy link
Collaborator

distler commented Oct 25, 2022

Just to be clear, inclusion of redirected pages is not allowed.

  • Page A redirects for B.
  • On page C, [[!include A]] is allowed, but [[!include B]] is not.

@sattlerc
Copy link
Contributor Author

sattlerc commented Oct 25, 2022

I wanted to try, but I couldn't figure out how to install Instiki again. I tried on two systems with ruby-3.0 (Arch Linux and Debian) and also the Dockerfile. Following the README, the command

ruby bundle install --path vendor/bundle

finishes successfully. But when running instiki I get the same error in every configuration:

~/instiki/vendor/rails/activesupport/lib/active_support/dependencies.rb:182:in `require': cannot load such file -- rack/handler (LoadError)
        from ~/instiki/vendor/rails/activesupport/lib/active_support/dependencies.rb:182:in `block in require'
        from ~/instiki/vendor/rails/activesupport/lib/active_support/dependencies.rb:547:in `new_constants_in'
        from ~/instiki/vendor/rails/activesupport/lib/active_support/dependencies.rb:182:in `require'
        from ~/instiki/script/server:58:in `rescue in <top (required)>'
        from ~/instiki/script/server:55:in `<top (required)>'
        from ./instiki:6:in `load'
        from ./instiki:6:in `<main>'
~/instiki/vendor/rails/activesupport/lib/active_support/dependencies.rb:182:in `require': cannot load such file -- rack/handler (LoadError)
        from ~/instiki/vendor/rails/activesupport/lib/active_support/dependencies.rb:182:in `block in require'
        from ~/instiki/vendor/rails/activesupport/lib/active_support/dependencies.rb:547:in `new_constants_in'
        from ~/instiki/vendor/rails/activesupport/lib/active_support/dependencies.rb:182:in `require'
        from ~/instiki/script/server:56:in `<top (required)>'
        from ./instiki:6:in `load'
        from ./instiki:6:in `<main>'

What am I doing wrong?

EDIT: When I tried the Dockerfile, I had to lift the Ruby verison to 2.6. Otherwise, the container build failed.

@sattlerc
Copy link
Contributor Author

A separate cache expiration problem I just noticed when testing:

  • Every page says which pages it is included in. This is not expired when that information changes.

@sattlerc
Copy link
Contributor Author

Just to be clear, inclusion of redirected pages is not allowed.

* Page A redirects for B.

* On page C, `[[!include A]]` is allowed, but `[[!include B]]` is not.

I see. I didn't know that.

@sattlerc
Copy link
Contributor Author

I'm not sure what you mean by "links to deleted redirects don't get expired".

Example:

  • Create X with link to Y. Link is wanted.
  • Create Y1 with redirect for Y. Link on X is not wanted.
  • Delete the redirect on Y1. Link on X is not wanted (but should be wanted).

It seems even links to renamed pages don't get expired. Example:

  • Create X with link to Y. Link is wanted.
  • Create Y. Link on X is not wanted.
  • Rename X to X1 without setting a redirect. Link on X is not wanted (but should be wanted).

@sattlerc
Copy link
Contributor Author

  • Say I have a page A that redirects for (the nonexistent) page B and pages C and D that link to B. Clicking on one of those links will land me on page A.
  • Now say I edit the page A and remove the redirect. Pages C and D gets expired, and the links on the are turned into "wanted page" links.

I can't reproduce this.

@sattlerc
Copy link
Contributor Author

Here is another bug:

  • Create A with inclusion of B. Shows "could not include B".
  • Create C.
  • Rename C to B. A still shows "could not include B".

@sattlerc
Copy link
Contributor Author

sattlerc commented Oct 25, 2022

I think your complaint is that WikiReferences::pages_that_include is not computed recursively (or, possibly, that it does not handle inclusions of redirected pages). But that's not quite right.

* Let page A include page B and let page B include page C.

* Edit page C.

* When you save C, page A is expired.

Counterexample (not involving redirects):

  • Create A with inclusion of B. Shows "could not include B".
  • Create B with inclusion of C. Shows "could not include C". Footer says "Linked from: A" (should say: included from A).
  • Refresh A. Shows "could not include C".
  • Create C. Refresh A. Still says "could not include C" (should include C).

EDIT: Page C is not necessary in this example. So just:

  • Create A with inclusion of B. Shows "could not include B".
  • Create B. Footer says "Linked from: A" (should say: included from A).
  • Refresh A. Shows content of B.
  • Edit B. Refresh A. Still shows old content of B.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants