-
Notifications
You must be signed in to change notification settings - Fork 161
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add support to include a list of column names for updates in the case… #99
base: master
Are you sure you want to change the base?
Conversation
… that toasted data for those columns is missing
Could you elaborate the use case? If this feature is related to UPDATE, the columns that was not informed will stay as is, hence, wal2json don't need to mention them. Even if you force with REPLICA IDENTITY FULL, you won't get them. |
Sure. So, I am creating a data warehouse where I want to use the logical decoding as a source to stage changes so I can have a full audit trail of the history of my data (as it's updated over time). The process I use to update this data back into my data warehouse expects all the columns to be present on an update. If I know that certain columns were omitted from the data stream (due to the fact that they are un-updated TOAST fields), I can easily go back and re-read the old values for those fields. Otherwise, I have to create some crazy logic where I compare the list of columns given by wal2json with the list of actual columns in the table and figure out what's missing that way. So, in short, it's just a way to know which columns were excluded from the data stream in the event of an update so that I can handle that case specially. See also that I added a test case and updated the docs for this. Thanks for your consideration. |
Hello. Is there any further information I can provide for this or anything else I need to do before we can proceed with this? Thanks. |
@euIerto I think that I'll run into this issue as well shortly. Nearly an identical use case. |
As an alternative to what I've done here, I could make it so that instead of having a new data element "missingtoastcols", I could include the "missing columns" in the columnnames array and then just use some kind of sentinel value in the columnvalues array indicating that that value is toasted and was not changed. If this is preferable to what I have here, let me know and I can submit a new PR. |
@eulerto this would be really useful. What would it take to get this tested and merged? |
Any reason why this change is not merged? Is there any way in the changelogs to find out which of the columns are made null because they are unchanged toasted columns? If we clearly know all the column names which are made null because they are toasted and unchanged, we can differentiate between the actual nulls and nulls because of the unchanged toasted column. |
… that toasted data for those columns is missing
I have added an option called include-missing-toast so that an additional data element will be added to the output which would list the names of all the columns which are not included in the case where the toast fields were not updated. This way consumers of the data could easily determine that certain fields are missing from the output and handle this case accordingly.