Restore the portal-level snapshot after procedure COMMIT/ROLLBACK. #511

Howard229 · 2024-07-07T06:49:51Z

COMMIT/ROLLBACK necessarily destroys all snapshots within the session. The original implementation of intra-procedure transactions just cavalierly did that, ignoring the fact that this left us executing in a rather different environment than normal. In particular, it turns out that handling of toasted datums depends rather critically on there being an outer ActiveSnapshot: otherwise, when SPI or the core executor pop whatever snapshot they used and return, it's unsafe to dereference any toasted datums that may appear in the query result. It's possible to demonstrate "no known snapshots" and "missing chunk number N for toast value" errors as a result of this oversight.
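To make the failure mode concrete, here is a minimal toy model in C, assuming a snapshot "stack" reduced to a plain depth counter. `ToySnapStack`, `toy_detoast_is_safe` and `toy_run_query_then_check` are hypothetical names for illustration only and are not part of PostgreSQL. It only shows why popping the query-level snapshot is harmless when an outer snapshot exists and unsafe when it does not.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model only: hypothetical names, not PostgreSQL's snapshot manager. */
typedef struct ToySnapStack
{
    int depth;              /* number of snapshots currently pushed */
} ToySnapStack;

/* In this model, a toasted value may be dereferenced only while at
 * least one snapshot is still active. */
bool toy_detoast_is_safe(const ToySnapStack *st)
{
    return st->depth > 0;
}

/* Models SPI or the executor running one query: push a snapshot,
 * produce a result, pop the snapshot, return to the caller.
 * Returns whether the caller may still dereference the result. */
bool toy_run_query_then_check(ToySnapStack *st)
{
    st->depth++;            /* query-level snapshot pushed */
    /* ... query runs and hands back a reference to a toasted datum ... */
    assert(st->depth > 0);
    st->depth--;            /* query-level snapshot popped on return */
    return toy_detoast_is_safe(st);
}
```

Starting from a depth of 1 (an outer snapshot is held) the check passes; starting from 0, which is the state right after an intra-procedure COMMIT/ROLLBACK, it fails.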

Historically this outer snapshot has been held by the Portal code, and that seems like a good plan to preserve. So add infrastructure to pquery.c to allow re-establishing the Portal-owned snapshot if it's not there anymore, and add enough bookkeeping support that we can tell whether it is or not.
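The bookkeeping idea can be sketched roughly as follows. This is a self-contained toy, not the pquery.c code: `ToySession`, `toy_end_transaction` and `toy_ensure_portal_snapshot` are hypothetical names, and snapshots are modeled as small integer ids. The point is only that the owner remembers whether its snapshot still exists, so re-establishing it is a no-op when it does.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model only: every name here is hypothetical, not PostgreSQL code. */
#define TOY_MAX_SNAPSHOTS 8

typedef struct ToySession
{
    int stack[TOY_MAX_SNAPSHOTS]; /* ids of active snapshots, top is last */
    int depth;                    /* number of stack entries in use */
    int next_id;                  /* next snapshot id to hand out, from 1 */
    int portal_snapshot;          /* id of the portal-owned snapshot, 0 = none */
} ToySession;

void toy_session_init(ToySession *s)
{
    s->depth = 0;
    s->next_id = 1;
    s->portal_snapshot = 0;
}

/* Push a fresh snapshot and return its id. */
int toy_push_snapshot(ToySession *s)
{
    assert(s->depth < TOY_MAX_SNAPSHOTS);
    int id = s->next_id++;
    s->stack[s->depth++] = id;
    return id;
}

/* COMMIT/ROLLBACK inside a procedure: every snapshot goes away, so the
 * portal must also forget the one it used to own. */
void toy_end_transaction(ToySession *s)
{
    s->depth = 0;
    s->portal_snapshot = 0;
}

/* Re-establish the portal-owned snapshot only if it is missing.
 * Returns true when a new snapshot was pushed. */
bool toy_ensure_portal_snapshot(ToySession *s)
{
    if (s->portal_snapshot != 0)
        return false;
    s->portal_snapshot = toy_push_snapshot(s);
    return true;
}
```

Calling `toy_ensure_portal_snapshot` twice in a row pushes only once; after `toy_end_transaction` the next call pushes a new snapshot with a new id.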

We can't, however, just re-establish the Portal snapshot as part of COMMIT/ROLLBACK. As in normal transaction start, acquiring the first snapshot should wait until after SET and LOCK commands. Hence, teach spi.c about doing this at the right time. (Note that this patch doesn't fix the problem for any PLs that try to run intra-procedure transactions without using SPI to execute SQL commands.)
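The timing rule can be illustrated with a small sketch, assuming a simplified command classification in which only SET and LOCK run without a snapshot. The enum and both functions are hypothetical and much coarser than whatever test the real code applies; they only show that the first snapshot after COMMIT/ROLLBACK is deferred past any leading SET/LOCK commands.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model only: a deliberately coarse, hypothetical command taxonomy. */
typedef enum ToyCmd
{
    TOY_SET,
    TOY_LOCK,
    TOY_SELECT,
    TOY_UPDATE
} ToyCmd;

/* In this sketch, SET and LOCK are the commands that must be able to
 * run before the transaction's first snapshot is taken. */
bool toy_cmd_needs_snapshot(ToyCmd cmd)
{
    return cmd != TOY_SET && cmd != TOY_LOCK;
}

/* Given the commands executed after a COMMIT/ROLLBACK, return the index
 * of the command at which the snapshot would first be acquired, or -1
 * if no command in the list needs one. */
int toy_first_snapshot_at(const ToyCmd *cmds, int ncmds)
{
    assert(ncmds >= 0);
    for (int i = 0; i < ncmds; i++)
    {
        if (toy_cmd_needs_snapshot(cmds[i]))
            return i;
    }
    return -1;
}
```

For the sequence SET, LOCK, SELECT, UPDATE the snapshot is first taken at the SELECT (index 2), so the preceding SET and LOCK behave as they would at normal transaction start.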

This makes SPI's no_snapshots parameter rather a misnomer, so in HEAD, rename that to allow_nonatomic.

replication/logical/worker.c also needs some fixes, because it wasn't careful to hold a snapshot open around AFTER trigger execution. That code doesn't use a Portal, which I suspect someday we're gonna have to fix. But for now, just rearrange the order of operations. This includes back-patching the recent addition of finish_estate() to centralize the cleanup logic there.

This also back-patches commit 2ecfeda into v13, to improve the test coverage for worker.c (it was that test that exposed that worker.c's snapshot management is wrong).

Per bug #15990 from Andreas Wicht. Back-patch to v11 where intra-procedure COMMIT was added.

Discussion: https://postgr.es/m/[email protected]

CLAassistant · 2024-07-07T06:49:56Z

All committers have signed the CLA.

polardb-bot · 2024-07-07T06:51:22Z

Hi @Howard229 ~ Congratulations on your first PR to PolarDB. 🎉

Please make sure that your PR conforms to the standard and has passed all the checks.

We will review your PR as soon as possible.

polardb-bot · 2024-07-07T07:34:11Z

Hey @Howard229 :

Something went wrong during the checks of your commit 😟, please check the details:

⚠️ build-and-publish-rpm (rocky8) View more details

⚠️ build-and-publish-rpm (rocky9) View more details

polardb-bot bot added the ci/failure (CI status is failure) label Jul 7, 2024

Howard229 closed this Jul 7, 2024
