
Partial autoload for postgres-persister #245


Closed

Conversation

jakubriedl
Collaborator

Summary

Implements partial updates in the autoload from Postgres, to save significant resources and avoid a full load on every change. Related issue: #236

How did you test this change?

All Postgres tests are passing and I've tested it manually as well, but I'm open to suggestions on best practices for testing the partial loads.

@jakubriedl jakubriedl force-pushed the postgres-partial-autoload branch from 7872be3 to 91e5ace on May 12, 2025 12:06
@jakubriedl jakubriedl force-pushed the postgres-partial-autoload branch from 8fbbb60 to a514726 on May 13, 2025 08:49
@jamesgpearce
Contributor

My main questions are:

1. I wonder if we can wire up something similar for sqlite (even though the implementation will be quite different). I can look into this.

2a) We are sending both rows over the channel and out to the client if it's an update. Is it possible (or more efficient) to calculate the changes in the SQL function and just send the diff?

2b) We are sending the old row if it's a delete. We don't need it!

...which makes me wonder if we should have three separate SQL functions (we already have three triggers) to create different payloads for each operation, and then the JS function can just apply whatever it's been sent.

2a may be tricky for SQL, I admit. But if possible, let's try.
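
For illustration, a rough sketch of that per-operation wiring (the table, trigger, and function names here are hypothetical, not the persister's actual identifiers):

-- One trigger per operation, each calling its own payload-building function
CREATE TRIGGER my_table_insert AFTER INSERT ON my_table
  FOR EACH ROW EXECUTE FUNCTION notify_insert();  -- payload: the NEW row
CREATE TRIGGER my_table_update AFTER UPDATE ON my_table
  FOR EACH ROW EXECUTE FUNCTION notify_update();  -- payload: diff of NEW vs OLD
CREATE TRIGGER my_table_delete AFTER DELETE ON my_table
  FOR EACH ROW EXECUTE FUNCTION notify_delete();  -- payload: the OLD row's id only

Each function would then only build the payload its operation needs, rather than one generic function branching on TG_OP.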

@jakubriedl
Collaborator Author

1. I'll look into that.

2a) It seems to be possible.

2b) We still need a row.id to know what to delete, though we could theoretically send just that.

So theoretically we could make the payload look like { rowId: "xy", operation: "insert/update/delete", changes: "{}:json" }

For both of the 2) items it feels like a bit of a trade-off between simplicity and ease of understanding versus squeezing out tiny fractions of performance. In that case I'd probably pick simplicity, but I do understand if you want to go the other direction.
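
To make that concrete, here is a rough sketch (not the persister's actual code; the id column, channel name, and function name are assumptions) of a single trigger function emitting the payload shape proposed above over a NOTIFY channel:

-- Hypothetical sketch: one function shared by the insert/update/delete triggers
CREATE OR REPLACE FUNCTION notify_row_change() RETURNS trigger AS $$
DECLARE
  row_id text;
BEGIN
  -- Only OLD is available on DELETE; use NEW otherwise (assumes an id column)
  IF TG_OP = 'DELETE' THEN
    row_id := OLD.id;
  ELSE
    row_id := NEW.id;
  END IF;
  PERFORM pg_notify(
    'table_changed',  -- hypothetical channel name
    json_build_object(
      'rowId', row_id,
      'operation', lower(TG_OP),  -- 'insert' / 'update' / 'delete'
      'changes', CASE WHEN TG_OP = 'DELETE' THEN NULL
                      ELSE row_to_json(NEW) END  -- full row here; a diff could be substituted
    )::text
  );
  RETURN COALESCE(NEW, OLD);
END;
$$ LANGUAGE plpgsql;

One practical constraint worth noting: NOTIFY payloads are limited to just under 8000 bytes by default, which is another argument for keeping what goes over the channel small.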

@jamesgpearce
Contributor

Yes it might be overkill. But theoretically anything going over the channel will be over a network connection, whereas anything between the trigger and the function stays in memory. If you had a row that looked like |id|booleanFlag|largeSlabOfText| and only the booleanFlag changed, it would be a shame to send (two!) largeSlabOfText fields in JSON for no reason.

But don't sweat it too much. We can always optimize later if it gets too tricky.

@jamesgpearce
Contributor

jamesgpearce commented May 14, 2025

Crude, but something like this in the pgsql should allow you to detect new and changed cells:

SELECT * FROM
  (SELECT key, value FROM json_each(row_to_json(NEW))) a
    LEFT OUTER JOIN
  (SELECT key, value FROM json_each(row_to_json(OLD))) b
    ON a.key = b.key

(maybe a full outer join to get deletions too!)
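
Building on that, a sketch of how the join could be restricted to cells that actually changed and folded back into a single JSON object, assuming it runs inside the trigger function where NEW and OLD are in scope (the text casts are just to keep the comparison simple):

SELECT json_object_agg(a.key, a.value) AS changes
FROM json_each(row_to_json(NEW)) a
LEFT OUTER JOIN json_each(row_to_json(OLD)) b ON a.key = b.key
WHERE b.key IS NULL                                  -- cell is new
   OR a.value::text IS DISTINCT FROM b.value::text;  -- cell changed

As noted above, swapping in a FULL OUTER JOIN would also surface keys present only in OLD.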

@jakubriedl
Collaborator Author

Yeah, I've seen something similar on Stack Overflow and am keen to try that.

@jamesgpearce
Contributor

(Do we want to keep these now you have your dedicated connector?)

@jakubriedl
Collaborator Author

Let's close this. Making the Postgres partial autoload work well needed much more work than just this, and it's now solved by the custom persister.

@jakubriedl jakubriedl closed this Jun 12, 2025