forked from greenplum-db/gpdb-archive
ADBDEV-5465: Split temp tables storage from temp files storage (#889) #1069
Merged
silent-observer force-pushed the ADBDEV-5465 branch from e8c8227 to be22a27 (October 4, 2024 06:28)
bandetto reviewed Oct 7, 2024
silent-observer force-pushed the ADBDEV-5465 branch from be22a27 to 3718611 (October 7, 2024 08:37)
bandetto reviewed Oct 7, 2024
Some operations create temporary files on disk. The growth of temporary files in conjunction with temporary tables can lead to 100% usage of the data directory and an emergency stop of the DBMS.

Implement temp_spill_files_tablespaces GUC that controls tablespaces where temporary files are stored. Temporary tables will still be stored in temp_tablespaces.

In case temp_spill_files_tablespaces is not set: the behavior will remain the same, and both files and tables will be stored in the specified table spaces according to the list in temp_tablespaces, or if it's also empty, in system table spaces.

If the temp_spill_files_tablespaces is not empty, but temp_tablespaces is empty, only temporary files will be saved in the specified table spaces. Temporary table files will still be stored in the system tablespace.

If temp_spill_files_tablespaces is set to the empty string ("") or 'pg_default', it will not fall back to temp_tablespaces and will instead use the default tablespace.

Changes from original commit:
1. Remove DITA format docs since they are missing in GPDB 7.
2. OpenNamedTemporaryFile is missing in GPDB 7, no fix needed.
3. Use GPDB 7 implementation of get_session_temp_tablespace instead of GPDB 6.
4. Additionally fix GetTempTablespaces (used by SharedFileSetInit) which now uses spill file tablespaces if there are any.
5. There is no GetSessionTempTableSpace, so call get_session_temp_tablespace directly from GetNextTempTableSpace.

Test-related changes:
1. Add file_monitor.c from GPDB 6.
2. Add GNUMakefile entry for file_monitor from GPDB 6.
3. Add gp_tablespace_tmppath function to regress.c from GPDB 6.
4. Add plpython functions from GPDB 6 with fixes for GPDB 7 (switch to plpython3, add SETOF, add utf-8 encoding).
5. Fix last query from gp_tablespace_file_report to execute on primary segments using gp_segment_configuration.

(cherry picked from commit 4bb482a)
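The fallback rules in the commit message above can be summarized as a small Python model. This is only an illustrative sketch, not the C implementation from the patch: the function names are hypothetical, the GUC values are treated as plain comma-separated strings, and it assumes that an unset temp_spill_files_tablespaces shows up as an empty value while the explicit empty string shows up as the two-character value `""`.

```python
from typing import List

# Stands in for the system/default tablespace (assumption for this sketch).
DEFAULT_TABLESPACE = "pg_default"


def parse_list(raw: str) -> List[str]:
    """Split a comma-separated GUC value into names, dropping blanks."""
    return [name.strip() for name in raw.split(",") if name.strip()]


def temp_table_tablespaces(temp_tablespaces: str) -> List[str]:
    """Temporary tables always follow temp_tablespaces (hypothetical helper)."""
    return parse_list(temp_tablespaces) or [DEFAULT_TABLESPACE]


def spill_file_tablespaces(temp_spill_files_tablespaces: str,
                           temp_tablespaces: str) -> List[str]:
    """Resolve where temporary (spill) files would go (hypothetical helper)."""
    raw = temp_spill_files_tablespaces.strip()
    if raw == "":
        # Not set: behavior is unchanged, spill files follow temp_tablespaces.
        return temp_table_tablespaces(temp_tablespaces)
    if raw in ('""', DEFAULT_TABLESPACE):
        # Explicit empty string or pg_default: no fallback to temp_tablespaces.
        return [DEFAULT_TABLESPACE]
    # Otherwise use the listed spill tablespaces.
    return parse_list(raw) or [DEFAULT_TABLESPACE]
```

For example, `spill_file_tablespaces("", "ts_a, ts_b")` follows temp_tablespaces, while `spill_file_tablespaces('""', "ts_a")` ignores it and resolves to the default tablespace.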
Previously GUC sync worked by sending SET commands to segments. However, some values (for example the empty string for quoted GUCs) cannot be set this way. This affects specifically temp_spill_files_tablespaces since "" and "\"\"" have different semantic meaning for it.

This patch changes the way GUC sync works by using pg_catalog.set_config() function instead of SET commands. This function sets the value of the GUC directly without any quoting issues, and so now empty strings are handled correctly.

Ticket: ADBDEV-6438

(cherry picked from commit 993b6c4)
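To illustrate why passing the value as a function argument avoids the ambiguity, here is a minimal Python sketch of building a pg_catalog.set_config() call. It is not the dispatcher code from the patch: `quote_literal` and `set_config_command` are hypothetical helpers, the third argument (`is_local`) is assumed to be false, and the quoting assumes standard_conforming_strings is on so that only single quotes need doubling.

```python
def quote_literal(value: str) -> str:
    """Quote a string as an SQL literal by doubling single quotes.

    Assumes standard_conforming_strings = on (backslashes are literal).
    """
    return "'" + value.replace("'", "''") + "'"


def set_config_command(name: str, value: str) -> str:
    """Build a set_config() call; is_local = false is an assumption here."""
    return "SELECT pg_catalog.set_config({}, {}, false)".format(
        quote_literal(name), quote_literal(value)
    )


# The truly empty value and the two-character value "" stay distinct,
# because each is passed verbatim as a text argument.
print(set_config_command("temp_spill_files_tablespaces", ""))
print(set_config_command("temp_spill_files_tablespaces", '""'))
```

The first call carries `''` as the new value and the second carries `'""'`, so the receiving side sees exactly the string that was set on the sender.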
silent-observer force-pushed the ADBDEV-5465 branch from 7d06628 to 3f40630 (October 28, 2024 08:17)
bandetto approved these changes Oct 29, 2024
bimboterminator1 approved these changes Oct 31, 2024
Split temp tables storage from temp files storage (#889)
Fix temp_spill_files_tablespaces GUC sync (#1074)
Note: do not squash the commit to preserve authorship.