Prune messages after merge #1989
Conversation
🦋 Changeset detected. Latest commit: f5d1705. The changes in this PR will be included in the next version bump. This PR includes changesets to release 1 package.
@@ -803,7 +803,7 @@ impl Store {
        ts_hash: &[u8; TS_HASH_LENGTH],
        message: &Message,
    ) -> Result<Vec<u8>, HubError> {
        // If the store supports compact state messages, we don't merge messages that don't exist in the compact state
I think this comment was wrong, or am I missing something?
I think it should say: "we don't merge messages that are older than the compact state message and that don't exist within it". A compact state message is like a checkpoint: you cannot go back and add messages that weren't present at the time of compaction.
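The checkpoint rule described above can be sketched as follows. This is a hypothetical model, not the hub's actual Rust implementation: assume a compact state records the timestamp of its oldest retained message plus the set of message hashes it kept.

```typescript
// Hypothetical model of the compact-state checkpoint semantics.
type CompactState = {
  oldestRetainedTimestamp: number;
  retainedHashes: Set<string>;
};

function isMergeableUnderCompactState(
  state: CompactState,
  messageTimestamp: number,
  messageHash: string,
): boolean {
  // Newer than the checkpoint: always mergeable.
  if (messageTimestamp >= state.oldestRetainedTimestamp) return true;
  // Older than the checkpoint: only mergeable if it survived compaction.
  return state.retainedHashes.has(messageHash);
}
```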
Codecov Report — Attention: Patch coverage is
Additional details and impacted files

@@ Coverage Diff @@
##             main    #1989       +/-   ##
===========================================
- Coverage   74.05%   48.98%   -25.08%
===========================================
  Files          99      118       +19
  Lines        9425    15653     +6228
  Branches     2097     4885     +2788
===========================================
+ Hits         6980     7667      +687
- Misses       2327     7985     +5658
+ Partials      118        1      -117
@@ -117,19 +123,26 @@ export abstract class RustStoreBase<TAdd extends Message, TRemove extends Messag
        }
    }

    for (const [fid, _] of fidToPrune) {
      const pruneResult = await this.pruneMessages(fid);
This approach works, but I'm curious if it slows down the merge code path.
I suspect this is somewhat inefficient, but I don't know for sure.
e.g. current flow is:
- check if is prunable in js (gets the storage count and earliest ts hash)
- merge in rust
- call the prune logic from js, which again gets the storage count etc and passes it to rust which iterates the db to find the earliest hash and prunes it.
But isPrunable already knows the storage units, message count, and the earliestTsHash. One approach could be to pass in the earliestTsHash if it would cause a prune, and have the store prune the earliest message within the same transaction as the merge. Then we don't have to go back and forth between js and rust and we save some db calls.
But if you think perf impact is not that bad, then this is definitely simpler code wise.
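The three-step flow described in the comment above can be sketched roughly as follows. All names here are illustrative, and the js/rust boundary is collapsed into a single class; the point is that step 3 re-derives state that step 1 already computed.

```typescript
// Hypothetical sketch of the current merge-then-prune flow.
type StoredMessage = { tsHash: string; timestamp: number };

class SketchStore {
  private messages: StoredMessage[] = [];
  constructor(private maxMessages: number) {}

  // Step 1 (js): check whether the store is at capacity.
  isPrunable(): boolean {
    return this.messages.length >= this.maxMessages;
  }

  // Step 2 (rust): the merge itself.
  merge(msg: StoredMessage): void {
    this.messages.push(msg);
  }

  // Step 3 (js -> rust): a separate prune call that re-scans the store
  // for the earliest messages and removes them.
  prune(): StoredMessage[] {
    this.messages.sort((a, b) => a.timestamp - b.timestamp);
    const excess = this.messages.length - this.maxMessages;
    return excess > 0 ? this.messages.splice(0, excess) : [];
  }

  get size(): number {
    return this.messages.length;
  }
}
```

The proposal above would instead hand the already-known earliestTsHash to the merge so the eviction happens inside the same transaction, skipping the re-scan in step 3.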
I'll update my benchmarking branch to see how much overhead the pruneMessage call adds to try and get some real numbers.
Reading getMessageCount, it seems like the value is cached, so the second load shouldn't be too bad.

> One approach could be to pass in the earliestTsHash if it would cause a prune, and have the store prune the earliest message within the same transaction as the merge.

I like the idea of that. When you add a single message it can optionally remove the earliest stored item in the same operation. Is the earliestTsHash for a particular FID & store cached somewhere on the javascript side? If not, we'd probably still need to hit the db again. It would also change mergeMessages a bit, as you'd need to do a remove for each merge rather than for each unique FID at the end (probably doesn't matter that much).
Here's an alternative design that would probably help with perf. We keep the prune job, but mergeMessages simply helps the job track which FIDs on which stores need pruning. The job runs more frequently and only targets what needs to be updated. That would avoid the extra logic on the hot path and make the prune job run much faster. Thoughts? Might just be worth making that change as the design is still fairly simple.
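The alternative design above could be sketched as a dirty-FID set: merges do O(1) bookkeeping, and the frequently-run job drains only what changed. This is a hypothetical illustration, not the hub's code.

```typescript
// Hypothetical sketch: merges mark FIDs dirty; a frequent job drains them.
class PruneTracker {
  private dirtyFids = new Set<number>();

  // Called from the merge path: O(1), no DB work on the hot path.
  markDirty(fid: number): void {
    this.dirtyFids.add(fid);
  }

  // Called by the frequently-running job: prune only the FIDs touched
  // since the last run, then reset the set.
  runJob(pruneFid: (fid: number) => void): number {
    const fids = [...this.dirtyFids];
    this.dirtyFids.clear();
    for (const fid of fids) pruneFid(fid);
    return fids.length;
  }
}
```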
Yes, isPrunable fetches the earliest tsHash from cache. I think a design where we can perform the prune at the same time as the merge so we don't even need the job would be better.
If that is too complicated, I prefer the current approach of doing the prune within the merge call but outside the tx. I don't think we should take the third option of just having the job run more frequently; I think we should eventually get rid of the job entirely.
The first approach above is more effort, so maybe let's decide based on the benchmark results?
@@ -615,14 +636,19 @@ impl Store {
        Ok(())
    }

-   pub fn merge(&self, message: &Message) -> Result<Vec<u8>, HubError> {
+   pub fn merge(&self, message: &Message, max_mesages_for_fid: u32) -> Result<Vec<u8>, HubError> {
        // Grab a merge lock. The typescript code does this by individual fid, but we don't have a
New prune logic is going here. TODO: move argument max_mesages_for_fid to FidInfoCache.
Updates isPrunable to be getPruneAction, which lets the caller know if the message should be pruned (original function) or would cause a prune action (added functionality). When applicable we prune for the given FID after the message is merged.

The prune job is still running in the background but can be removed in the future once it's confirmed there are no stragglers.

Benchmarking with production data, this change seems to add a 5% overhead on average. The FID lock added to Store::prune_messages is required for performance reasons. Without the lock, the overhead is run dependent and takes 100%-500% longer.
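The per-FID lock mentioned above lives in Rust inside Store::prune_messages; this TypeScript sketch only models the granularity it provides: work on the same FID contends, work on different FIDs does not. The class and method names are illustrative.

```typescript
// Hypothetical model of per-FID locking granularity.
class FidLocks {
  private held = new Set<number>();

  // Returns true if the lock for this fid was acquired.
  tryLock(fid: number): boolean {
    if (this.held.has(fid)) return false;
    this.held.add(fid);
    return true;
  }

  unlock(fid: number): void {
    this.held.delete(fid);
  }
}
```

With this granularity a long-running prune on one FID can delay merges for that FID, but merges for every other FID proceed, which matches the concern raised in the Additional Context section about prune jobs blocking merges.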
Looks good, I would just update the tests to cover both the mergeMessages and mergeMessage codepaths. Doesn't have to be extensive; just one of the tests using mergeMessages should be good.
for (const result of mergeResults.values()) {
  expect(result.isOk()).toBeTruthy();
}

test("earlier remove messages gets pruned", async () => {
This test seems duplicated? One of these should test the mergeMessages (bundles) code path
Oops. A lot of the tests are similar across stores, I must have copied / migrated the test over incorrectly. Will fix 👍
Closing due to age, and not actively being pursued. Thanks!
Motivation

Currently hubs are seeing regular latency spikes due to the periodic pruning job. This update causes the hub to immediately prune messages for a given FID after the message is merged.

Change Summary

Updates isPrunable to be getPruneAction, which lets the caller know if the message should be pruned (original function) or would cause a prune action (added functionality). When applicable we prune for the given FID after the message is merged.

Merge Checklist

Choose all relevant options below by adding an x now or at any time before submitting for review.

Additional Context

Adds a FID-based lock to Store::prune_messages. This might be an issue for the prune jobs which are left running. If a prune job is taking a long time it may block merges. I suspect this won't be too different from what's happening today during prunes, as the threadpool is limited to 4 threads, so a slow prune would already be blocking things.

PR-Codex overview
This PR focuses on enhancing message merge operations in the store.rs file and adding a PruneAction enum in storeEventHandler.ts.

Detailed summary

- Adds a PruneAction enum to determine pruning actions
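The PruneAction and getPruneAction names come from the PR itself, but the enum variants and function signature below are illustrative assumptions about how such a check could distinguish "this message is itself prunable" from "merging this message would trigger a prune".

```typescript
// Hypothetical sketch of a PruneAction-style check.
enum PruneAction {
  None = "none",                   // under the limit, nothing to do
  PruneMessage = "pruneMessage",   // the incoming message is older than everything stored
  PruneEarliest = "pruneEarliest", // merging would exceed the limit; evict the earliest
}

function getPruneAction(
  messageTimestamp: number,
  earliestStoredTimestamp: number,
  messageCount: number,
  maxMessages: number,
): PruneAction {
  if (messageCount < maxMessages) return PruneAction.None;
  // At capacity: a message older than everything stored would itself be pruned.
  if (messageTimestamp <= earliestStoredTimestamp) return PruneAction.PruneMessage;
  // Otherwise merge it and prune the earliest message afterwards.
  return PruneAction.PruneEarliest;
}
```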