Add support to include a list of column names for updates in the case… #99

Open
wants to merge 3 commits into
base: master


greigwise

… that toasted data for those columns is missing

I have added an option called include-missing-toast. When it is enabled, the output for an update gains an additional data element listing the names of all columns that were left out because their TOAST values were not updated. This way, consumers of the data can easily determine that certain fields are missing from the output and handle that case accordingly.
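To make the proposal concrete, here is a minimal Python sketch of how a consumer might read such an update. The "change", "columnnames" and "columnvalues" keys follow wal2json's existing JSON output; the "missingtoastcols" key and its list-of-names shape are assumptions based on the description in this PR, and the table and column names are made up.

```python
import json

# Hypothetical update record. The "missingtoastcols" key and its shape
# (a list of column names) are assumptions for illustration only.
raw = """
{"change": [{
  "kind": "update", "schema": "public", "table": "docs",
  "columnnames": ["id", "title"],
  "columnvalues": [1, "new title"],
  "missingtoastcols": ["body"]
}]}
"""


def split_update(change):
    """Return (values present in the stream, names of omitted TOAST columns)."""
    present = dict(zip(change["columnnames"], change["columnvalues"]))
    # Fall back to an empty list when the element is absent.
    missing = list(change.get("missingtoastcols", []))
    return present, missing


for change in json.loads(raw)["change"]:
    present, missing = split_update(change)
    print(present, missing)
```

A consumer would then re-read only the columns named in the second return value instead of guessing which ones were dropped.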

@eulerto
Owner

eulerto commented Feb 12, 2019

Could you elaborate on the use case? If this feature is related to UPDATE, the columns that are not included will stay as they are; hence, wal2json doesn't need to mention them. Even if you force it with REPLICA IDENTITY FULL, you won't get them.

@greigwise
Author

Sure. I am creating a data warehouse where I want to use logical decoding as a source to stage changes, so that I can have a full audit trail of the history of my data as it is updated over time. The process I use to apply this data to my data warehouse expects all the columns to be present on an update.

If I know that certain columns were omitted from the data stream (due to the fact that they are un-updated TOAST fields), I can easily go back and re-read the old values for those fields. Otherwise, I have to create some crazy logic where I compare the list of columns given by wal2json with the list of actual columns in the table and figure out what's missing that way.
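For comparison, the workaround described above (without the new option) might look roughly like the following Python sketch. The helper name and the sample table are hypothetical; in practice the full column list would have to come from the database catalog, and the sketch assumes that any column absent from an update was omitted as unchanged TOAST data.

```python
def infer_missing_columns(table_columns, change):
    """Return the table columns that do not appear in an update record.

    table_columns: every column of the table, in order (fetched separately).
    change: one decoded wal2json change with a "columnnames" list.
    """
    present = set(change["columnnames"])
    # Preserve table order so the result is stable across calls.
    return [name for name in table_columns if name not in present]


# Made-up example table and update record.
docs_columns = ["id", "title", "body", "attachment"]
update = {
    "kind": "update",
    "table": "docs",
    "columnnames": ["id", "title"],
    "columnvalues": [1, "new title"],
}

print(infer_missing_columns(docs_columns, update))
```

The consumer has to keep its copy of the column list in sync with schema changes, which is the extra bookkeeping the proposed option would avoid.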

So, in short, it's just a way to know which columns were excluded from the data stream in the event of an update so that I can handle that case specially.

See also that I added a test case and updated the docs for this.

Thanks for your consideration.

@greigwise
Author

Hello. Is there any further information I can provide, or anything else I need to do, before we can proceed with this? Thanks.

@jmealo

jmealo commented Apr 6, 2019

@eulerto I think that I'll run into this issue shortly as well. My use case is nearly identical.

@greigwise
Author

As an alternative to what I've done here, instead of adding a new data element "missingtoastcols", I could include the missing columns in the columnnames array and put some kind of sentinel value in the columnvalues array to indicate that the value is toasted and was not changed. If this is preferable to what I have here, let me know and I can submit a new PR.
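A rough Python sketch of how a consumer could handle the sentinel variant follows. The sentinel string and the helper name are hypothetical, since no concrete sentinel value is proposed in this thread.

```python
# Hypothetical sentinel; the thread does not say what value would be used.
UNCHANGED_TOAST = "__unchanged_toast__"


def merge_update(previous_row, change):
    """Apply an update to the last known row, keeping unchanged TOAST values."""
    merged = dict(previous_row)
    for name, value in zip(change["columnnames"], change["columnvalues"]):
        if value == UNCHANGED_TOAST:
            continue  # column was not rewritten; keep the previous value
        merged[name] = value  # includes genuine NULLs, decoded as None
    return merged


previous = {"id": 1, "title": "old title", "body": "long text", "note": "x"}
update = {
    "columnnames": ["id", "title", "body", "note"],
    "columnvalues": [1, "new title", UNCHANGED_TOAST, None],
}

print(merge_update(previous, update))
```

One trade-off with this variant is that a string sentinel can collide with real column data, whereas a separate list of column names cannot.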

@jfinzel

jfinzel commented May 30, 2019

@eulerto this would be really useful. What would it take to get this tested and merged?

@ramaguruprakash

Is there any reason why this change has not been merged? Is there any way in the changelogs to find out which columns were made null because they are unchanged toasted columns?

If we know exactly which columns were made null because they are toasted and unchanged, we can differentiate between actual nulls and nulls caused by unchanged toasted columns.

5 participants