ADBDEV-6442: Refactor diskquota local_table_stats_map #34

Merged 16 commits on Oct 17, 2024

@RekGRpth RekGRpth commented Oct 10, 2024

Refactor diskquota local_table_stats_map

During initialization, diskquota used a suboptimal structure for the local
hashmap local_table_stats_map: one entry per (table, segment) pair. Every
hashmap entry carries significant overhead, so a large number of small entries
increased RAM consumption during cluster startup. Change the structure to use
the table OID as the key and an array of per-segment sizes as the value. This
significantly reduces memory consumption, because there are now SEGCOUNT times
fewer entries. Also fix a small bug that put duplicate table OIDs into the
active_oids string array in the dispatch_rejectmap function.
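
The layout change described above can be sketched roughly as follows. This is only an illustration: the struct names `PerSegmentEntry` and `CombinedEntry`, their fields, and the helper functions are hypothetical stand-ins, not the actual diskquota definitions. The extra slot in `combined_entry_size` mirrors the `(1000 + 1)` used in the size estimate in this PR; what that slot holds is an assumption here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint32_t Oid;  /* stand-in for PostgreSQL's Oid */
typedef size_t   Size; /* stand-in for PostgreSQL's Size */

/* Hypothetical old layout: one hash entry per (table, segment) pair. */
typedef struct PerSegmentEntry
{
	Oid  reloid;    /* table OID */
	int  segid;     /* segment id */
	Size tablesize; /* size of the table on this segment */
} PerSegmentEntry;

/* Hypothetical new layout: one hash entry per table, sizes indexed by segment. */
typedef struct CombinedEntry
{
	Oid  reloid;      /* table OID, the hash key */
	Size tablesize[]; /* flexible array: one slot per segment plus one extra */
} CombinedEntry;

/* Bytes needed for one combined entry on a cluster with segcount segments. */
static inline size_t
combined_entry_size(int segcount)
{
	assert(segcount >= 0);
	return offsetof(CombinedEntry, tablesize) + ((size_t) segcount + 1) * sizeof(Size);
}

/* Number of hash entries under the old layout. */
static inline size_t
entries_per_segment_layout(size_t ntables, int segcount)
{
	assert(segcount >= 0);
	return ntables * (size_t) segcount;
}

/* Number of hash entries under the new layout. */
static inline size_t
entries_combined_layout(size_t ntables)
{
	return ntables;
}
```

Each combined entry is much larger than a per-segment entry, but the per-entry hashmap overhead is now paid once per table rather than once per table per segment.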


Tests are not provided, but you can estimate the hashmap size using the hash_estimate_size function, for example, like this:

diff --git a/src/gp_activetable.c b/src/gp_activetable.c
index 0888c4d..5b7654d 100644
--- a/src/gp_activetable.c
+++ b/src/gp_activetable.c
@@ -954,6 +954,9 @@ load_table_size(HTAB *local_table_stats_map)
 	Portal                    portal;
 	char                     *sql = "select tableid, size, segid from diskquota.table_size";
 
+	elog(WARNING, "DiskQuotaActiveTableEntry = %li", hash_estimate_size(1000*1000*1000, sizeof(DiskQuotaActiveTableEntry)));
+	elog(WARNING, "ActiveTableEntryCombined = %li", hash_estimate_size(1000*1000, offsetof(ActiveTableEntryCombined, tablesize) + (1000 + 1) * sizeof(Size)));
+
 	if ((plan = SPI_prepare(sql, 0, NULL)) == NULL)
 		ereport(ERROR, (errmsg("[diskquota] SPI_prepare(\"%s\") failed", sql)));
 	if ((portal = SPI_cursor_open(NULL, plan, NULL, NULL, true)) == NULL)

that gives

2024-10-10 14:15:23.186300 +05,,,p684028,th1126361536,,,,0,con9,,seg-1,,,,sx1,"WARNING","01000","DiskQuotaActiveTableEntry = 40623489136",,,,,,,0,,"gp_activetable.c",957,
2024-10-10 14:15:23.186315 +05,,,p684028,th1126361536,,,,0,con9,,seg-1,,,,sx1,"WARNING","01000","ActiveTableEntryCombined = 8040421488",,,,,,,0,,"gp_activetable.c",958,

That is, the estimated hashmap size for 1,000,000 tables on a 1000-segment cluster drops from about 38 GiB to about 7.5 GiB.
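
To check the arithmetic, the two byte counts from the log can be converted with a couple of small helpers. The function names below are hypothetical and exist only for this illustration; the inputs are the values printed by the two `elog` calls.

```c
#include <assert.h>
#include <stdint.h>

/* Convert a byte count to binary gigabytes (GiB, 2^30 bytes). */
static inline double
bytes_to_gib(uint64_t bytes)
{
	return (double) bytes / (1024.0 * 1024.0 * 1024.0);
}

/* How many times smaller the second estimate is than the first. */
static inline double
reduction_factor(uint64_t before, uint64_t after)
{
	assert(after > 0);
	return (double) before / (double) after;
}
```

With the logged values, `bytes_to_gib(40623489136)` is roughly 37.8, `bytes_to_gib(8040421488)` is roughly 7.49, and `reduction_factor(40623489136, 8040421488)` is roughly 5.05. The entry count falls by a factor of 1000, but each combined entry stores 1001 sizes, so the saving comes mainly from the per-entry overhead.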


It is easier to view the changes with the "Hide whitespace" option enabled.

@RekGRpth RekGRpth marked this pull request as ready for review October 10, 2024 10:21
@RekGRpth RekGRpth merged commit 24546b2 into gpdb Oct 17, 2024
2 checks passed
@RekGRpth RekGRpth deleted the ADBDEV-6442 branch October 17, 2024 03:16
3 participants