
ADBDEV-6443: Refactor diskquota load_table_size #39

Closed · wants to merge 133 commits
Conversation

@RekGRpth (Member) commented Oct 17, 2024

Refactor diskquota load_table_size

diskquota used a backend-local hashmap, local_table_stats_map, in the
gp_fetch_active_tables function. During initialization, the load_table_size
function loaded everything from the size table diskquota.table_size into it,
and during normal operation diskquota filled the same local hashmap with the
sizes of active tables fetched from the segments, using the
pull_active_list_from_seg, convert_map_to_string, and
pull_active_table_size_from_seg functions. This increased memory consumption,
especially during initialization: with a large number of active tables,
local_table_stats_map grew quite large.
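
For illustration, a minimal sketch of the old local-hashmap pattern; the entry
layout and function below are hypothetical, not the actual diskquota code:

```c
#include "postgres.h"
#include "utils/hsearch.h"

/* hypothetical entry layout for the backend-local size map */
typedef struct TableSizeEntry
{
	Oid		reloid;		/* hash key: relation oid */
	int64	totalsize;	/* accumulated size in bytes */
} TableSizeEntry;

static HTAB *
create_local_table_stats_map(void)
{
	HASHCTL		ctl;

	memset(&ctl, 0, sizeof(ctl));
	ctl.keysize = sizeof(Oid);
	ctl.entrysize = sizeof(TableSizeEntry);

	/* every row loaded from diskquota.table_size becomes one entry here,
	 * so the map's memory footprint scales with the number of tracked
	 * tables */
	return hash_create("local_table_stats_map", 1024, &ctl,
					   HASH_ELEM | HASH_BLOBS);
}
```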

At each iteration of the loop in the calculate_table_disk_usage function,
diskquota loaded the oids of all user tables into one common list, oidlist,
using the get_rel_oid_list function. During initialization, the oids from the
size table diskquota.table_size were added to the same list. Then, for each
oid in the list, diskquota looked up its indexes with the
diskquota_get_index_list function and appended their oids as well. The oids of
uncommitted tables, extracted from the shared relation_cache hashmap, were
also appended by the merge_uncommitted_table_to_oidlist function. Such a large
list of oids likewise increased memory consumption, and the functions that
built it also took locks.
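
An illustrative sketch of this accumulation; the helper get_index_oids below
is a hypothetical stand-in for diskquota_get_index_list, and no actual
diskquota code is reproduced:

```c
#include "postgres.h"
#include "nodes/pg_list.h"

/* hypothetical stand-in for diskquota_get_index_list */
extern List *get_index_oids(Oid relOid);

static List *
build_oidlist_sketch(List *user_table_oids)
{
	List	   *oidlist = NIL;
	ListCell   *lc;

	/* all user table oids go into one flat backend-local list ... */
	foreach(lc, user_table_oids)
	{
		Oid			relOid = lfirst_oid(lc);
		ListCell   *ilc;

		oidlist = lappend_oid(oidlist, relOid);

		/* ... followed by the oids of each table's indexes ... */
		foreach(ilc, get_index_oids(relOid))
			oidlist = lappend_oid(oidlist, lfirst_oid(ilc));
	}

	/* ... and, in the real code, the oids of uncommitted tables read from
	 * the shared relation_cache hashmap under a lock, so the list grows
	 * with every relation in the cluster */
	return oidlist;
}
```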

Next, calculate_table_disk_usage looped over this large oidlist. At each
iteration, diskquota tried to find the oid in the syscache using the
SearchSysCacheCopy1 function. On success, it read the owner, schema, and
tablespace; on failure, it probed the shared hashmap of uncommitted tables,
relation_cache, once again using hash_search, and if the oid was found there,
it retrieved the same information from that entry. diskquota then checked the
oid against the local hashmap of active tables, local_table_stats_map, with
another hash_search, and if the table was active, it calculated its size.
Locks were taken here as well.
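
Roughly, the old per-oid lookup chain looked like the sketch below; the
function and hash-table handles are illustrative, and locking is elided:

```c
#include "postgres.h"
#include "access/htup_details.h"
#include "catalog/pg_class.h"
#include "utils/hsearch.h"
#include "utils/syscache.h"

static void
resolve_relation_sketch(Oid relOid, HTAB *relation_cache,
						HTAB *local_table_stats_map)
{
	HeapTuple	tuple;
	bool		found;

	/* 1. committed relations: copy the pg_class tuple from the syscache */
	tuple = SearchSysCacheCopy1(RELOID, ObjectIdGetDatum(relOid));
	if (HeapTupleIsValid(tuple))
	{
		Form_pg_class relform = (Form_pg_class) GETSTRUCT(tuple);
		Oid			owner = relform->relowner;
		Oid			schema = relform->relnamespace;
		Oid			tablespace = relform->reltablespace;

		(void) owner; (void) schema; (void) tablespace;
		heap_freetuple(tuple);
	}
	else
	{
		/* 2. not in the syscache: probe the shared hashmap of
		 * uncommitted tables (the real code takes a lock here) */
		(void) hash_search(relation_cache, &relOid, HASH_FIND, &found);
	}

	/* 3. finally, probe the local map of active tables to decide
	 * whether this relation's size must be recalculated */
	(void) hash_search(local_table_stats_map, &relOid, HASH_FIND, &found);
}
```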

This patch completely replaces the logic described above, getting rid of the
unnecessary hashmaps, lists, functions, and locks. Instead of a list of oids
and a loop over it, a local hashmap with the sizes of active tables, and a
probe of the shared hashmap of uncommitted tables, a single general query is
now issued and its result is read through a cursor. This query already returns
all the necessary information: oids, owners, schemas, and tablespaces, as well
as the sizes of active tables.
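
A minimal sketch of the new shape, assuming a hypothetical query text and
column list (the real query also folds in active-table sizes); it shows how a
cursor keeps only one batch of rows resident at a time instead of a full
hashmap or oid list:

```c
#include "postgres.h"
#include "executor/spi.h"

#define FETCH_BATCH 1024

static void
scan_table_sizes_sketch(void)
{
	/* hypothetical query joining size data with catalog attributes */
	const char *sql =
		"SELECT c.oid, c.relowner, c.relnamespace, c.reltablespace, t.size"
		"  FROM diskquota.table_size t JOIN pg_class c ON c.oid = t.tableid";
	Portal		portal;

	SPI_connect();
	portal = SPI_cursor_open_with_args("table_size_cur", sql,
									   0, NULL, NULL, NULL, true, 0);
	for (;;)
	{
		uint64		i;

		/* fetch the next batch; only FETCH_BATCH rows are in memory */
		SPI_cursor_fetch(portal, true, FETCH_BATCH);
		if (SPI_processed == 0)
			break;

		for (i = 0; i < SPI_processed; i++)
		{
			bool		isnull;
			Oid			reloid = DatumGetObjectId(
				SPI_getbinval(SPI_tuptable->vals[i],
							  SPI_tuptable->tupdesc, 1, &isnull));

			/* ... update quota accounting for reloid ... */
			(void) reloid;
		}
		SPI_freetuptable(SPI_tuptable);
	}
	SPI_cursor_close(portal);
	SPI_finish();
}
```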


It is easier to view the changes with the "Hide whitespace" option enabled.

Base automatically changed from ADBDEV-6520 to gpdb November 12, 2024 09:31
@RekGRpth (Member, Author) commented: #46

@RekGRpth closed this Dec 18, 2024
@RekGRpth deleted the ADBDEV-6443 branch December 18, 2024 04:40