ADBDEV-4251: Fix use by ORCA of a newer index with HOT-chain in an older transaction. #619
Conversation
5d29ce0 to a3af2f5 Compare
Allure report https://allure-ee.adsw.io/launch/54747
a3af2f5 to a7a900c Compare
Allure report https://allure-ee.adsw.io/launch/54823
…e current transaction as the postgres optimizer does

When updating a row in a heap table, a HOT (Heap-Only Tuple) chain is created in which the old version of the row points to the new version. If an index is then created on such a table while another index already exists, the new index's entries will point to the beginning of the chain (the old version) while being keyed by the new value. A parallel transaction reading this table through the new index may therefore find the old row. For more information, see src/backend/access/heap/README.HOT.

Example: in the initial state, we have a tuple that an index points to. In session 2, we open a transaction at the repeatable read isolation level and run a select. In session 1, we update the tuple, creating a HOT chain, and then create an index on the updated column; the new index references the first version of the tuple. If session 2 now searches the new index for the new value, it finds the old tuple, since for that session the first version is still live.

The PostgreSQL optimizer handles this situation correctly and ignores the new index in the old transaction; ORCA does not. This patch fixes that by adding a visibility check for the index in ORCA. Additionally, in this case we mark the cache and the plan as transient, so that ORCA's mdcache (and the cached plan, in the case of a prepared statement) is reset at the beginning of the next transaction.
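The two-session scenario above can be modeled in miniature. The sketch below is a deliberately simplified, hypothetical Python model (names such as `Tuple`, `visible`, and `index_lookup` are illustrative, not Postgres or ORCA code): the new index entry is keyed by the new value but points at the head of the HOT chain, so a snapshot that still considers the head version live gets back the old tuple even though it probed with the new key.

```python
# Hypothetical model of the HOT-chain anomaly described above.
# All names are illustrative; this is not Postgres/ORCA code.

class Tuple:
    def __init__(self, value, xmin, next=None):
        self.value = value   # indexed column value of this version
        self.xmin = xmin     # xid of the transaction that created this version
        self.next = next     # next (newer) version in the HOT chain

def visible(tup, snapshot_xmax):
    # Simplified MVCC rule: a version is visible if it was created by a
    # transaction that committed before the snapshot was taken.
    return tup.xmin < snapshot_xmax

def index_lookup(index_entry, search_value, snapshot_xmax):
    # The index entry (keyed by search_value) points at the HEAD of the
    # HOT chain. The chain walk checks only visibility, not the key:
    # the key already matched the single index entry for the chain.
    tup = index_entry
    while tup is not None:
        if visible(tup, snapshot_xmax):
            return tup
        tup = tup.next
    return None

# Session 2 takes a repeatable-read snapshot after xid 10 committed.
old_snapshot = 11

# Session 1 (xid 20) performs a HOT update of the row...
new_version = Tuple(value="new", xmin=20)
old_version = Tuple(value="old", xmin=10, next=new_version)

# ...then creates a second index on the updated column. The new index
# entry is keyed by the NEW value but points at the chain head.
found = index_lookup(old_version, search_value="new",
                     snapshot_xmax=old_snapshot)

# Session 2 searched for "new" but gets the OLD tuple: the returned row
# does not satisfy the search condition.
assert found.value == "old"
```

This is exactly why the new index must be hidden from transactions whose snapshot predates the CREATE INDEX: for them the chain head is still live, and the chain walk stops there.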
a7a900c to 2e76a39 Compare
aed85a1 to f4c4ee2 Compare
Allure report https://allure-ee.adsw.io/launch/55104
Allure report https://allure-ee.adsw.io/launch/57898
Failed job Regression tests with Postgres on ppc64le: https://gitlab.adsw.io/arenadata/github_mirroring/gpdb/-/jobs/760819
Failed job Regression tests with Postgres on x86_64: https://gitlab.adsw.io/arenadata/github_mirroring/gpdb/-/jobs/760818
Failed job Regression tests with ORCA on x86_64: https://gitlab.adsw.io/arenadata/github_mirroring/gpdb/-/jobs/760820
Failed job Regression tests with ORCA on ppc64le: https://gitlab.adsw.io/arenadata/github_mirroring/gpdb/-/jobs/760821
…ly calling functions from gdb. Added a cache invalidation test.
b48aa81 to d3c0ff7 Compare
Allure report https://allure-ee.adsw.io/launch/57910
Failed job Regression tests with Postgres on ppc64le: https://gitlab.adsw.io/arenadata/github_mirroring/gpdb/-/jobs/761870
Failed job Regression tests with Postgres on x86_64: https://gitlab.adsw.io/arenadata/github_mirroring/gpdb/-/jobs/761869
4042701 to 3fb85e0 Compare
3fb85e0 to b757ec5 Compare
Allure report https://allure-ee.adsw.io/launch/60413
Failed job Resource group isolation tests on x86_64: https://gitlab.adsw.io/arenadata/github_mirroring/gpdb/-/jobs/887142
Failed job Resource group isolation tests on ppc64le: https://gitlab.adsw.io/arenadata/github_mirroring/gpdb/-/jobs/887143
Allure report https://allure-ee.adsw.io/launch/60460
Failed job Resource group isolation tests on x86_64: https://gitlab.adsw.io/arenadata/github_mirroring/gpdb/-/jobs/888541
Failed job Resource group isolation tests on ppc64le: https://gitlab.adsw.io/arenadata/github_mirroring/gpdb/-/jobs/888542
…on. (#619)

When a heap tuple is updated by the legacy planner and the updated tuple is placed on the same page (heap-only tuple, HOT), an update chain is created: a chain of updated tuples in which each tuple's ctid points to the next tuple in the chain. HOT chains allow storing only one index entry, which points to the first tuple in the chain. During an Index Scan we walk the chain and take the first tuple visible to the current transaction (for more information, see src/backend/access/heap/README.HOT).

If we create a second index on a column that has been updated, it will store the ctid of the beginning of the existing HOT chain. If a repeatable read transaction started before the transaction in which the second index was created, that index could be used in the query plan, and the index search could return a tuple that does not satisfy the search condition (searched by a new value that is not visible to the transaction).

For the legacy planner, this problem is solved as follows: "To address this issue, regular (non-concurrent) CREATE INDEX makes the new index usable only by new transactions and transactions that don't have snapshots older than the CREATE INDEX command. This prevents queries that can see the inconsistent HOT chains from trying to use the new index and getting incorrect results. Queries that can see the index can only see the rows that were visible after the index was created, hence the HOT chains are consistent for them."

ORCA, however, does not handle this case and can use an index with a broken HOT chain. This patch resolves the issue for ORCA in the same way as the legacy planner: during planning, we ignore newly created indexes based on their xmin.

Additionally, ORCA has a related problem. Since ORCA has its own cache (MD Cache), it can cache a relation object without an index that cannot be used under the current snapshot (because the MDCacheSetTransientState function returns true), so the index would remain unusable even after the problematic snapshot goes away. Therefore, we need to reset the cache after the snapshot changes in order to use the index. This patch solves the problem as follows: during index filtering, if we encounter an index that we cannot use, we save TransactionXmin in the mdcache_transaction_xmin variable. In subsequent queries, we check the saved xmin, and if it is valid and differs from the current one, we reset the cache.

The create_index_hot test has also been changed: the optimizer is now turned off before the update, since ORCA always uses Split Update, in which case HOT chains are not created and the problem does not reproduce. That is why ORCA wasn't actually being tested before.
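The xmin-based cache reset described above can be sketched in miniature. This is a hypothetical Python model of the mechanism, not the actual C++ code: the names `mdcache_transaction_xmin` and `TransactionXmin` come from the description, while `filter_index`, `maybe_reset_mdcache`, the `mdcache` dict, and the specific xids are invented for illustration.

```python
# Hypothetical model of the xmin-based MD Cache reset described above.
# Names and xids are illustrative; this is not the patch's actual code.

INVALID_XID = 0
mdcache_transaction_xmin = INVALID_XID  # saved when an unusable index is seen
mdcache = {}                            # stands in for ORCA's MD Cache

def filter_index(index_xmin, transaction_xmin):
    """During index filtering: skip an index created by a transaction that
    our snapshot cannot see, and remember TransactionXmin so a later query
    knows when the cache must be reset."""
    global mdcache_transaction_xmin
    usable = index_xmin < transaction_xmin
    if not usable:
        mdcache_transaction_xmin = transaction_xmin
    return usable

def maybe_reset_mdcache(current_transaction_xmin):
    """At the start of a later query: if an index was skipped and the
    snapshot has since advanced, drop the cache so the relation is
    re-fetched with the now-usable index."""
    global mdcache_transaction_xmin
    if (mdcache_transaction_xmin != INVALID_XID
            and mdcache_transaction_xmin != current_transaction_xmin):
        mdcache.clear()
        mdcache_transaction_xmin = INVALID_XID
        return True
    return False

# Query 1: snapshot with TransactionXmin 100 cannot use an index
# created by xid 150; the relation is cached without that index.
mdcache["rel"] = {"indexes": []}
assert filter_index(index_xmin=150, transaction_xmin=100) is False

# Another query under the same snapshot: nothing to reset yet.
assert maybe_reset_mdcache(100) is False

# Next transaction: TransactionXmin advanced, so the cache is reset
# and the relation will be re-cached with the index included.
assert maybe_reset_mdcache(200) is True
assert mdcache == {}
```

The design point this models: the reset is deferred and cheap. Nothing happens on the query that skips the index; only a later query with a different TransactionXmin pays for one cache rebuild.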
…on. (#619) (cherry-picked from commit 5894018)
…on. (#619) When heap tuple is updated by legacy planner and the updated tuple is placed at the same page (heap-only tuple, HOT), an update chain is created. It's a chain of updated tuples, in which each tuple's ctid points to the next tuple in the chain. HOT chains allow to store only one index entry, which points to the first tuple in the chain. And during Index Scan we pass through the chain, and the first tuple visible for the current transaction is taken (for more information, see src/backend/access/heap/README.HOT). If we create a second index on column that has been updated, it will store the ctid of the beginning of the existing HOT chain. If a repeatable read transaction started before the transaction in which the second index was created, then this index could be used in the query plan. As a result of the search for this index, a tuple could be found that does not meet the search condition (by a new value that is not visible to the transaction) In the case of the legacy planner, this problem is solved the following way: "To address this issue, regular (non-concurrent) CREATE INDEX makes the new index usable only by new transactions and transactions that don't have snapshots older than the CREATE INDEX command. This prevents queries that can see the inconsistent HOT chains from trying to use the new index and getting incorrect results. Queries that can see the index can only see the rows that were visible after the index was created, hence the HOT chains are consistent for them." But ORCA does not handle this case and can use an index with a broken HOT-chain. This patch resolves the issue for ORCA in the same way as legacy planner. During planning we ignore newly created indexes based on their xmin. Additionally, ORCA faced another related problem. 
(cherry-picked from commit 5894018)
When a heap tuple is updated by the legacy planner and the updated tuple is
placed on the same page (heap-only tuple, HOT), an update chain is created.
It is a chain of updated tuples in which each tuple's ctid points to the
next tuple in the chain.
HOT chains allow storing only one index entry, which points to the first
tuple in the chain. During an Index Scan we walk the chain, and the first
tuple visible to the current transaction is taken (for more information,
see src/backend/access/heap/README.HOT).
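The chain walk above can be sketched with a toy model (the struct, the
field names, and the plain integer-xid visibility rule are illustrative
simplifications, not the real heap code):

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy model of a HOT chain: each tuple version records the transaction id
 * (xmin) that created it and a pointer to its updated successor. Only the
 * chain head has an index entry pointing at it. */
typedef struct Tuple {
    unsigned int xmin;   /* transaction id that created this version */
    int value;           /* the indexed column's value */
    struct Tuple *next;  /* next version in the HOT chain, or NULL */
} Tuple;

/* An index scan lands on the chain head and walks forward, returning the
 * first version visible to the reader's snapshot: created before the
 * snapshot and not yet replaced by a successor the snapshot can also see. */
const Tuple *hot_chain_fetch(const Tuple *head, unsigned int snapshot_xid)
{
    for (const Tuple *t = head; t != NULL; t = t->next) {
        bool superseded = t->next != NULL && t->next->xmin < snapshot_xid;
        if (!superseded && t->xmin < snapshot_xid)
            return t;
    }
    return NULL;
}
```

With an old version (xmin 100, value 7) chained to a new one (xmin 200,
value 42), a snapshot taken at xid 150 fetches value 7 from the chain head,
while a snapshot at 250 walks through to value 42. This is exactly why an
index entry that points at the chain head but was built from the new value
can surface a non-matching row for an old snapshot.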
If we create a second index on a column that has been updated, it will
store the ctid of the beginning of the existing HOT chain. If a repeatable
read transaction started before the transaction in which the second index
was created, this index could be used in the query plan. A search through
this index could then find a tuple that does not meet the search condition
(the index matches by a new value that is not visible to the transaction).
In the case of the legacy planner, this problem is solved in the following
way:
"To address this issue, regular (non-concurrent) CREATE INDEX makes the
new index usable only by new transactions and transactions that don't
have snapshots older than the CREATE INDEX command. This prevents
queries that can see the inconsistent HOT chains from trying to use the
new index and getting incorrect results. Queries that can see the index
can only see the rows that were visible after the index was created,
hence the HOT chains are consistent for them."
But ORCA does not handle this case and can use an index with a broken HOT
chain. This patch resolves the issue for ORCA in the same way as the legacy
planner: during planning we ignore newly created indexes based on their
xmin.
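A minimal sketch of that filtering rule, modeled on how the legacy planner
treats indexes flagged with indcheckxmin (the type names and the plain
integer comparison are simplifications of this patch; real transaction ids
need wraparound-aware comparison via TransactionIdPrecedes):

```c
#include <stdbool.h>

typedef unsigned int TransactionId;

/* Simplified view of the index metadata relevant to the check. */
typedef struct IndexInfo {
    bool indcheckxmin;        /* set when the index was built over
                               * potentially broken HOT chains */
    TransactionId index_xmin; /* xmin of the index's pg_index row, i.e.
                               * the CREATE INDEX transaction */
} IndexInfo;

/* Returns true when the index may be offered to the optimizer: either no
 * HOT hazard was recorded at build time, or our oldest snapshot
 * (TransactionXmin) is at least as new as the CREATE INDEX transaction. */
bool index_usable(const IndexInfo *idx, TransactionId transaction_xmin)
{
    if (!idx->indcheckxmin)
        return true;
    return transaction_xmin >= idx->index_xmin;
}
```

An index built at xid 500 with indcheckxmin set is skipped for a
transaction whose snapshot predates xid 500, which is precisely the stale
repeatable read scenario described above.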
Additionally, ORCA faced another related problem. Since ORCA has its own
cache (MD Cache), it can cache a relation object without an index that
cannot be used under the current snapshot (because the
MDCacheSetTransientState function returns true); we would then be unable
to use the index even after the problematic snapshot goes away. Therefore,
we need to reset the cache after the snapshot changes in order to use the
index.
This patch solves the problem in the following way: during index
filtering, if we encounter an index that we cannot use, we save
TransactionXmin in the mdcache_transaction_xmin variable. In subsequent
queries we check the saved xmin, and if it is valid and differs from the
current one, we reset the cache.
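The save-and-compare policy can be sketched as follows (apart from
mdcache_transaction_xmin, the function names and the integer xid model are
hypothetical; the real change lives in ORCA's C++ translator):

```c
#include <stdbool.h>

typedef unsigned int TransactionId;
#define InvalidTransactionId ((TransactionId) 0)

/* Snapshot under which an index was last rejected by the HOT-chain
 * visibility check; invalid when no index has been skipped. */
static TransactionId mdcache_transaction_xmin = InvalidTransactionId;

/* Called from index filtering when an index is rejected: remember the
 * snapshot (TransactionXmin) under which the rejection happened. */
void remember_unusable_index(TransactionId current_xmin)
{
    mdcache_transaction_xmin = current_xmin;
}

/* Called at the start of the next query: if an index was skipped under an
 * older snapshot that is now gone, the cached index-less relation metadata
 * is stale, so the MD cache must be reset. */
bool mdcache_needs_reset(TransactionId current_xmin)
{
    if (mdcache_transaction_xmin == InvalidTransactionId)
        return false;                 /* nothing was ever skipped */
    if (mdcache_transaction_xmin == current_xmin)
        return false;                 /* same snapshot, same filtering */
    mdcache_transaction_xmin = InvalidTransactionId;
    return true;                      /* snapshot changed: reset cache */
}
```

Resetting only when the saved xmin is valid and differs from the current
one avoids flushing the cache on every query while still making the index
visible as soon as the blocking snapshot disappears.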
The create_index_hot test has also been changed: the optimizer is now
turned off before the update, because ORCA always uses Split Update, in
which case HOT chains are not created and the problem is not reproduced.
That is also why ORCA was not actually exercised by this test before.