Split temp tables storage from temp files storage (#889)
Some operations create temporary files on disk. The growth of temporary files in
conjunction with temporary tables can lead to 100% usage of the data directory
and an emergency stop of the DBMS.

Implement the temp_spill_files_tablespaces GUC, which controls the tablespaces
where temporary files are stored. Temporary tables will still be stored in
temp_tablespaces.

If temp_spill_files_tablespaces is not set, the behavior remains the same:
both files and tables are stored in the tablespaces listed in
temp_tablespaces or, if that list is also empty, in the system tablespaces.

If temp_spill_files_tablespaces is not empty but temp_tablespaces is empty,
only temporary files are stored in the specified tablespaces. Temporary
table files are still stored in the system tablespace.

If temp_spill_files_tablespaces is set to the empty string ("") or 'pg_default',
it will not fall back to temp_tablespaces and will instead use the default
tablespace.
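
The placement rules above can be summarized in a minimal C sketch. It follows only the rules stated in this commit message and is not the code from the patch. The enum, the function name `resolve_spill_placement`, and the use of `NULL` to model "not set" are all assumptions made for illustration.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical labels for where a temporary spill file is placed. */
typedef enum
{
	SPILL_USES_SPILL_LIST,	/* a tablespace from temp_spill_files_tablespaces */
	SPILL_USES_TEMP_LIST,	/* a tablespace from temp_tablespaces */
	SPILL_USES_DEFAULT		/* the default (system) tablespace */
} SpillPlacement;

/*
 * spill_value: value of temp_spill_files_tablespaces; NULL models "not set"
 *              (an assumption of this sketch).
 * temp_value:  value of temp_tablespaces; NULL or "" models "empty".
 */
SpillPlacement
resolve_spill_placement(const char *spill_value, const char *temp_value)
{
	if (spill_value == NULL)
	{
		/* Not set: previous behavior, follow temp_tablespaces if present. */
		if (temp_value != NULL && temp_value[0] != '\0')
			return SPILL_USES_TEMP_LIST;
		return SPILL_USES_DEFAULT;
	}

	/* Explicit "" or pg_default: no fallback to temp_tablespaces. */
	if (spill_value[0] == '\0' || strcmp(spill_value, "pg_default") == 0)
		return SPILL_USES_DEFAULT;

	/* Otherwise temporary files go to the spill-file list. */
	return SPILL_USES_SPILL_LIST;
}
```

Temporary tables are not covered by this function: according to the message they keep following temp_tablespaces in every case.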

Changes from original commit:
1. Remove DITA format docs since they are missing in GPDB 7.
2. OpenNamedTemporaryFile is missing in GPDB 7, no fix needed.
3. Use GPDB 7 implementation of get_session_temp_tablespace instead of GPDB 6.
4. Additionally fix GetTempTablespaces (used by SharedFileSetInit) which now
   uses spill file tablespaces if there are any.
5. There is no GetSessionTempTableSpace, so call get_session_temp_tablespace
   directly from GetNextTempTableSpace.

Test-related changes:
1. Add file_monitor.c from GPDB 6.
2. Add GNUMakefile entry for file_monitor from GPDB 6.
3. Add gp_tablespace_tmppath function to regress.c from GPDB 6.
4. Add plpython functions from GPDB 6 with fixes for GPDB 7 (switch to
   plpython3, add SETOF, add utf-8 encoding).
5. Fix last query from gp_tablespace_file_report to execute on primary
   segments using gp_segment_configuration.

(cherry picked from commit 4bb482a)
bandetto authored and silent-observer committed Oct 4, 2024
1 parent c9ba080 commit be22a27
Showing 15 changed files with 943 additions and 53 deletions.
2 changes: 2 additions & 0 deletions gpdb-doc/markdown/admin_guide/ddl/ddl-tablespace.html.md
@@ -54,6 +54,8 @@ CREATE TABLE foo(i int);

There is also the `temp_tablespaces` configuration parameter, which determines the placement of temporary tables and indexes, as well as temporary files that are used for purposes such as sorting large data sets. This can be a comma-separated list of tablespace names, rather than only one, so that the load associated with temporary objects can be spread over multiple tablespaces. A random member of the list is picked each time a temporary object is to be created.

If you need to separate temporary files from temporary tables, the `temp_spill_files_tablespaces` configuration parameter lets you change where temporary files are placed. If this parameter is empty, temporary files are created according to `temp_tablespaces`.

The tablespace associated with a database stores that database's system catalogs, temporary files created by server processes using that database, and is the default tablespace selected for tables and indexes created within the database, if no `TABLESPACE` is specified when the objects are created. If you do not specify a tablespace when you create a database, the database uses the same tablespace used by its template database.

You can use a tablespace from any database in the Greenplum Database system if you have appropriate privileges.
12 changes: 11 additions & 1 deletion gpdb-doc/markdown/ref_guide/config_params/guc-list.html.md
@@ -3343,7 +3343,17 @@ When setting `temp_tablespaces` interactively, avoid specifying a nonexistent ta

The default value is an empty string, which results in all temporary objects being created in the default tablespace of the current database.

See also [default\_tablespace](#default_tablespace).
See also [temp\_spill\_files\_tablespaces](#temp_spill_files_tablespaces), [default\_tablespace](#default_tablespace).

|Value Range|Default|Set Classifications|
|-----------|-------|-------------------|
|one or more tablespace names|unset|master, session, reload|

## <a id="temp_spill_files_tablespaces"></a>temp\_spill\_files\_tablespaces

Specifies tablespaces in which to create temporary files for purposes such as large data set sorting. This setting takes precedence over `temp_tablespaces` for temporary files.

The value is a comma-separated list of tablespace names. When the list contains more than one tablespace name, Greenplum chooses a random list member each time it creates a temporary file. An exception applies within a transaction, where successively created temporary files are placed in successive tablespaces from the list. If the selected element of the list is an empty string, Greenplum automatically falls back to tablespaces specified in `temp_tablespaces`. If `temp_tablespaces` is empty, Greenplum uses the default tablespace of the current database instead.
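
The selection rule described above (a random starting member, then successive members within a transaction) can be sketched in C as a small rotation over the list. This is an illustration only, loosely modeled on the wraparound counter pattern used for temporary tablespaces in upstream PostgreSQL; the `SpillRotation` type and both function names are hypothetical, and the `Oid` typedef is a stand-in for the server's own type.

```c
typedef unsigned int Oid;		/* stand-in for the server's Oid type */
#define InvalidOid ((Oid) 0)

/* Hypothetical per-transaction rotation state over a tablespace list. */
typedef struct
{
	const Oid  *spaces;			/* resolved tablespace OIDs */
	int			count;			/* number of entries in spaces */
	int			next;			/* index of the most recently used entry */
} SpillRotation;

/*
 * Start a rotation. In real use the caller would pass a random value as
 * "start" so that different transactions begin at different members.
 */
void
rotation_init(SpillRotation *r, const Oid *spaces, int count, unsigned int start)
{
	r->spaces = spaces;
	r->count = count;
	r->next = (count > 0) ? (int) (start % (unsigned int) count) : 0;
}

/* Return the next tablespace, wrapping around at the end of the list. */
Oid
rotation_next(SpillRotation *r)
{
	if (r->count <= 0)
		return InvalidOid;		/* empty list: caller must fall back */

	if (++r->next >= r->count)
		r->next = 0;
	return r->spaces[r->next];
}
```

With three tablespaces and a starting value of 0, successive calls to `rotation_next` return the second, third, and then first member, so consecutive temporary files in one transaction are spread across the list.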

|Value Range|Default|Set Classifications|
|-----------|-------|-------------------|
@@ -362,6 +362,7 @@ These configuration parameters set defaults that are used for client connections
- [gin_pending_list_limit](guc-list.html#gin_pending_list_limit)
- [statement_timeout](guc-list.html#statement_timeout)
- [temp_tablespaces](guc-list.html#temp_tablespaces)
- [temp_spill_files_tablespaces](guc-list.html#temp_spill_files_tablespaces)
- [vacuum_cleanup_index_scale_factor](guc-list.html#vacuum_cleanup_index_scale_factor)
- [vacuum_freeze_min_age](guc-list.html#vacuum_freeze_min_age)

52 changes: 40 additions & 12 deletions src/backend/commands/tablespace.c
@@ -111,6 +111,7 @@
/* GUC variables */
char *default_tablespace = NULL;
char *temp_tablespaces = NULL;
char *temp_spill_files_tablespaces = NULL;
bool allow_in_place_tablespaces = false;


@@ -1790,26 +1791,39 @@ assign_temp_tablespaces(const char *newval, void *extra)
SetTempTablespaces(NULL, 0);
}

/* assign_hook: do extra actions as needed */
void
assign_temp_spill_files_tablespaces(const char *newval, void *extra)
{
	temp_tablespaces_extra *myextra = (temp_tablespaces_extra *) extra;

	/*
	 * If check_temp_tablespaces was executed inside a transaction, then pass
	 * the list it made to fd.c. Otherwise, clear fd.c's list; we must be
	 * still outside a transaction, or else restoring during transaction exit,
	 * and in either case we can just let the next PrepareTempTablespaces call
	 * make things sane.
	 */
	if (myextra)
		SetTempFileTablespaces(myextra->tblSpcs, myextra->numSpcs);
	else
		SetTempFileTablespaces(NULL, 0);
}

/*
* PrepareTempTablespaces -- prepare to use temp tablespaces
*
* If we have not already done so in the current transaction, parse the
* temp_tablespaces GUC variable and tell fd.c which tablespace(s) to use
* for temp files.
* for temporary files or tables.
*/
void
PrepareTempTablespaces(void)
static void
PrepareTempTablespacesImpl(char *gucstr, void (*setTablespacesFunc)(Oid *, int))
{
char *rawname;
List *namelist;
Oid *tblSpcs;
int numSpcs;
ListCell *l;

/* No work if already done in current transaction */
if (TempTablespacesAreSet())
return;

/*
* Can't do catalog access unless within a transaction. This is just a
* safety check in case this function is called by low-level code that
@@ -1821,13 +1835,13 @@ PrepareTempTablespaces(void)
return;

/* Need a modifiable copy of string */
rawname = pstrdup(temp_tablespaces);
rawname = pstrdup(gucstr);

/* Parse string into list of identifiers */
if (!SplitIdentifierString(rawname, ',', &namelist))
{
/* syntax error in name list */
SetTempTablespaces(NULL, 0);
setTablespacesFunc(NULL, 0);
pfree(rawname);
list_free(namelist);
return;
@@ -1879,12 +1893,26 @@ PrepareTempTablespaces(void)
tblSpcs[numSpcs++] = curoid;
}

SetTempTablespaces(tblSpcs, numSpcs);
setTablespacesFunc(tblSpcs, numSpcs);

pfree(rawname);
list_free(namelist);
}

/*
 * PrepareTempTablespaces -- prepare to use tablespaces set in temp_tablespaces
 * and temp_spill_files_tablespaces. No work if already done in current
 * transaction.
 */
void
PrepareTempTablespaces(void)
{
	if (!TempTablespacesAreSet())
		PrepareTempTablespacesImpl(temp_tablespaces, SetTempTablespaces);

	if (!TempFileTablespacesAreSet())
		PrepareTempTablespacesImpl(temp_spill_files_tablespaces, SetTempFileTablespaces);
}
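
The refactoring above shares one parsing routine between two GUCs by passing the setter as a function pointer. A self-contained sketch of that pattern, outside the server, might look like the following. Every name here is hypothetical, the setters only record how many names they received, and `strtok` is a deliberate simplification of `SplitIdentifierString` (no quoting, no catalog lookups).

```c
#include <stdlib.h>
#include <string.h>

#define MAX_NAMES 16

typedef void (*set_names_fn) (char **names, int count);

/* -1 means "not prepared yet"; stands in for the *AreSet() checks. */
int			temp_table_spaces = -1;
int			spill_file_spaces = -1;

/* Hypothetical setters: record only the number of parsed names. */
void
set_temp_tables(char **names, int count)
{
	(void) names;
	temp_table_spaces = count;
}

void
set_spill_files(char **names, int count)
{
	(void) names;
	spill_file_spaces = count;
}

/* Shared worker: parse one comma-separated list, hand it to one setter. */
void
prepare_impl(const char *gucstr, set_names_fn setter)
{
	char	   *names[MAX_NAMES];
	int			count = 0;
	size_t		len = strlen(gucstr) + 1;
	char	   *copy = malloc(len);	/* strtok needs a modifiable copy */
	char	   *tok;

	if (copy == NULL)
	{
		setter(NULL, 0);
		return;
	}
	memcpy(copy, gucstr, len);

	/* Split on commas and spaces; names beyond MAX_NAMES are ignored. */
	for (tok = strtok(copy, ", ");
		 tok != NULL && count < MAX_NAMES;
		 tok = strtok(NULL, ", "))
		names[count++] = tok;

	/* names point into copy, so the setter must not keep them. */
	setter(names, count);
	free(copy);
}

/* Wrapper: prepare each list at most once, independently of the other. */
void
prepare_all(const char *temp_list, const char *spill_list)
{
	if (temp_table_spaces < 0)
		prepare_impl(temp_list, set_temp_tables);

	if (spill_file_spaces < 0)
		prepare_impl(spill_list, set_spill_files);
}
```

Keeping the "already prepared" checks in the wrapper rather than in the worker lets each list be prepared independently, which mirrors why the check moved out of the shared function in the diff.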

/*
* get_tablespace_oid - given a tablespace name, look up the OID