Skip to content

Conversation

@plampio
Copy link

@plampio plampio commented Jun 13, 2025

       in void trx_t::commit_in_memory(const mtr_t*)

This bug fix prevents debug assertion failure when Galera feature retry applying is enabled. This patch introduces also a general mechanism for temporarily disabling some debug assertions in InnoDB code.

The unwanted debug assertion failure may occur when a slave tries to apply again a WSREP transaction. The initial step in the retry is to rollback (locally) the failed transaction. During the rollback the WSREP transaction is temporarily marked as a non-WSREP transaction. When the rollback is completed, the transaction is marked a WSREP transaction again.

The failing assertion in InnoDB code (trx/trx0trx.cc) detects that a transaction that was flagged a WSREP transaction at creation is now a non-WSREP transaction. This causes the assertion failure.

@plampio plampio requested review from sjaakola and temeo June 13, 2025 12:19
@janlindstrom
Copy link

@plampio Can you improve commit message why assertion does not hold in retry applying and why we can't instead get it to hold on retry applying or why not?


#ifdef WITH_WSREP
ut_ad(is_wsrep() == wsrep_on(mysql_thd));
if (!mysql_thd || !wsrep_get_assert(mysql_thd, WSREP_ASSERT_INNODB_TRX)) {
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Having a function in wsrep service just for avoiding this assert seems a bit excessive to me. Would there be some more lightweight solution for this, like

/* Assert that any transaction which is started as a wsrep transaction, also commits
* or rolls back as a wsrep transaction. BF transactions may turn wsrep off temporarily
* for rollback when they are retried. */
ut_ad(is_wsrep() == wsrep_on(mysql_thd) || (in_rollback && wsrep_thd_is_BF(mysql_thd));

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree Teemu and comment above makes things a lot clearer.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, that is a more light-weight solution.
I suggested this new function in anticipation that there would be similar problems in the future:
this new function would allow solving them easily.

Your suggestion is certainly simpler and nicer if we think that this problem is just one of a kind.

Copy link
Author

@plampio plampio Jun 17, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I implemented @temeo 's suggestion but unfortunately it does not work.

The problem is that the trx->in_rollback flag is already false when the failing assert is reached:
the flag is false at the beginning of the trx_t::rollback_finish() from which the failing trx_t::commit() is called. Here is the call stack:

`#12 0x00007fe66fab7e96 in __GI___assert_fail (assertion=0x5571ab7dc298 "is_wsrep() == wsrep_on(mysql_thd) || (in_rollback && wsrep_thd_is_BF(mysql_thd, false))",
file=0x5571ab7dae18 "/home/pekka/git/mariadb-server4/storage/innobase/trx/trx0trx.cc", line=1507, function=0x5571ab7dc1a0 "void trx_t::commit_in_memory(const mtr_t*)") at ./assert/assert.c:103

#13 0x00005571aaf8dd9f in trx_t::commit_in_memory (mtr=0x0, this=0x7fe63f3fec00) at /home/pekka/git/mariadb-server4/storage/innobase/trx/trx0trx.cc:1507

#14 trx_t::commit_low (this=0x7fe63f3fec00, mtr=0x0) at /home/pekka/git/mariadb-server4/storage/innobase/trx/trx0trx.cc:1602

#15 0x00005571aaf8deff in trx_t::commit_persist (this=0x7fe63f3fec00) at /home/pekka/git/mariadb-server4/storage/innobase/trx/trx0trx.cc:1616

#16 0x00005571aaf8e020 in trx_t::commit (this=0x7fe63f3fec00) at /home/pekka/git/mariadb-server4/storage/innobase/trx/trx0trx.cc:1625

#17 0x00005571aaf7c38c in trx_t::rollback_finish (this=0x7fe63f3fec00) at /home/pekka/git/mariadb-server4/storage/innobase/trx/trx0roll.cc:65
`

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@plampio Why trx->in_rollback flag setting to false can't be moved later ?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@janlindstrom Moving the trx->in_rollback flag setting to false later does not help because it is never set to true during this particular rollback. This flag is set to true only in some code paths associated with rollback.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Re-implemented the fix:

Modified the bug fix for MDEV-36554 by replacing the general
mechanism for disabling assertions in InnoDB code with solution
specific to this rollback issue.

The appliers sets a flag in the applier thread for the duration of the
local rollback. This is used in InnoDB code to detect a local
rollback.

Copy link

@janlindstrom janlindstrom left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@plampio plampio self-assigned this Jul 7, 2025
           in void trx_t::commit_in_memory(const mtr_t*)

This bug fix prevents debug assertion failure when Galera feature
retry applying is enabled.

The applier sets a flag in the applier thread for the duration of the
local rollback. This is used in InnoDB code to detect a local
rollback and to skip the failing assertion.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants