ADBDEV-5465: Split temp tables storage from temp files storage (#889) #1069

Merged: Stolb27 merged 3 commits into adb-7.2.0 from ADBDEV-5465 on Nov 1, 2024

@silent-observer silent-observer commented Oct 3, 2024

Split temp tables storage from temp files storage (#889)

Some operations create temporary files on disk. The growth of temporary files in
conjunction with temporary tables can lead to 100% usage of the data directory
and an emergency stop of the DBMS.

Implement the temp_spill_files_tablespaces GUC, which controls the tablespaces
where temporary files are stored. Temporary tables are still stored in
temp_tablespaces.

If temp_spill_files_tablespaces is not set, the behavior is unchanged: both
temporary files and temporary tables are stored in the tablespaces listed in
temp_tablespaces or, if that list is also empty, in the system tablespace.

If temp_spill_files_tablespaces is not empty but temp_tablespaces is empty,
only temporary files are stored in the specified tablespaces. Temporary tables
are still stored in the system tablespace.

If temp_spill_files_tablespaces is set to the empty string ("") or
'pg_default', temporary files do not fall back to temp_tablespaces and are
stored in the default tablespace instead.
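
The fallback rules above can be summarized as a small Python model. This is only an illustrative sketch, not the C code from this patch: the function names are hypothetical, an unset GUC is modeled as `None`, a GUC set to the empty string is modeled as an empty list, and `pg_default` stands in for the system (default) tablespace.

```python
from typing import List, Optional, Sequence

# Stand-in name for the system (default) tablespace.
DEFAULT_TABLESPACE = "pg_default"


def temp_table_tablespaces(temp_tablespaces: Sequence[str]) -> List[str]:
    """Tablespaces for temporary tables: temp_tablespaces, else the default."""
    return list(temp_tablespaces) or [DEFAULT_TABLESPACE]


def spill_file_tablespaces(
    spill_tablespaces: Optional[Sequence[str]],
    temp_tablespaces: Sequence[str],
) -> List[str]:
    """Tablespaces for temporary (spill) files.

    spill_tablespaces is None when temp_spill_files_tablespaces is not set
    (assumption of this model), and an empty list when it is set to "".
    """
    if spill_tablespaces is None:
        # Not set: old behavior, files share the rules for temporary tables.
        return temp_table_tablespaces(temp_tablespaces)
    if len(spill_tablespaces) == 0 or list(spill_tablespaces) == [DEFAULT_TABLESPACE]:
        # Explicit "" or 'pg_default': no fallback to temp_tablespaces.
        return [DEFAULT_TABLESPACE]
    # Explicit list: only temporary files go to these tablespaces.
    return list(spill_tablespaces)
```

For example, `spill_file_tablespaces([], ["t1"])` selects only the default tablespace, while `spill_file_tablespaces(None, ["t1"])` falls back to `t1`.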

Changes from original commit:

  1. Remove DITA format docs since they are missing in GPDB 7.
  2. OpenNamedTemporaryFile is missing in GPDB 7, no fix needed.
  3. Use GPDB 7 implementation of get_session_temp_tablespace instead of GPDB 6.
  4. Additionally fix GetTempTablespaces (used by SharedFileSetInit) which now
    uses spill file tablespaces if there are any.
  5. There is no GetSessionTempTableSpace, so call get_session_temp_tablespace
    directly from GetNextTempTableSpace.

Test-related changes:

  1. Add file_monitor.c from GPDB 6.
  2. Add GNUMakefile entry for file_monitor from GPDB 6.
  3. Add gp_tablespace_tmppath function to regress.c from GPDB 6.
  4. Add plpython functions from GPDB 6 with fixes for GPDB 7 (switch to
    plpython3, add SETOF, add utf-8 encoding).
  5. Fix last query from gp_tablespace_file_report to execute on primary
    segments using gp_segment_configuration.

(cherry picked from commit 4bb482a)


Fix temp_spill_files_tablespaces GUC sync (#1074)

Previously, GUC sync worked by sending SET commands to segments. However,
some values (for example, the empty string for quoted GUCs) cannot be set this
way. This specifically affects temp_spill_files_tablespaces, since "" and
"\"\"" have different meanings for it.

This patch changes GUC sync to use the pg_catalog.set_config() function
instead of SET commands. This function sets the value of the GUC directly,
without any quoting issues, so empty strings are now handled correctly.
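
As a rough illustration of why a function call avoids the quoting problem, the sketch below builds a `pg_catalog.set_config(setting_name, new_value, is_local)` call in which the value travels as an ordinary SQL string literal. It is not the C code from this patch: the helper names are hypothetical, and the literal quoting assumes `standard_conforming_strings = on`.

```python
def quote_literal(value: str) -> str:
    """Quote a value as a SQL string literal by doubling single quotes.

    Assumes standard_conforming_strings = on, so backslashes are literal.
    """
    return "'" + value.replace("'", "''") + "'"


def set_config_statement(name: str, value: str) -> str:
    """Build a statement that sets a GUC via pg_catalog.set_config().

    The third argument (is_local) is false, so the setting applies to the
    whole session rather than only to the current transaction.
    """
    return "SELECT pg_catalog.set_config({}, {}, false)".format(
        quote_literal(name), quote_literal(value)
    )
```

With this approach the empty string and the two-character value `""` produce distinct statements: `set_config_statement("temp_spill_files_tablespaces", "")` passes `''`, while `set_config_statement("temp_spill_files_tablespaces", '""')` passes `'""'`.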

(cherry picked from commit 993b6c4)


Note: do not squash the commit to preserve authorship.

bandetto and others added 2 commits October 28, 2024 08:16
Ticket: ADBDEV-6438
(cherry picked from commit 993b6c4)
@Stolb27 Stolb27 enabled auto-merge (rebase) November 1, 2024 08:49
@Stolb27 Stolb27 merged commit 22fc4aa into adb-7.2.0 Nov 1, 2024
5 checks passed
@Stolb27 Stolb27 deleted the ADBDEV-5465 branch November 1, 2024 11:50