Run migrations and vacuum on staging and prod postgres via CLI #17220
There is a lot to be considered before executing this process, and ad-hoc events can have a negative impact on the system's structure, performance, and availability of services.

**1. Understand What This Process Does — Safe Approach Considerations**
**2. Run This Process on All Tables**

Option 1: Simple command (safe for small databases)

```sql
VACUUM ANALYZE;
```

This applies to all tables but may be expensive on large databases.

Option 2: Run on each table iteratively (safer for large databases)

```sql
DO $$
DECLARE
    r RECORD;
BEGIN
    FOR r IN (
        SELECT schemaname, tablename
        FROM pg_tables
        WHERE schemaname NOT IN ('pg_catalog', 'information_schema')
    )
    LOOP
        EXECUTE format('VACUUM ANALYZE %I.%I', r.schemaname, r.tablename);
    END LOOP;
END $$;
```
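The statement-building half of the iterative loop can be sketched outside the database. This is a minimal, hypothetical Python helper (names are mine, not from the ticket) that builds the same per-table statements from `(schema, table)` pairs, quoting identifiers the way `format('%I')` does — except that it always quotes, which is safe but more verbose than Postgres's output:

```python
# Build per-table VACUUM ANALYZE statements with quoted identifiers:
# wrap each name in double quotes and double any embedded double quotes.
def quote_ident(name: str) -> str:
    return '"' + name.replace('"', '""') + '"'

def vacuum_statements(tables):
    """tables: iterable of (schemaname, tablename) pairs, e.g. from pg_tables."""
    return [
        f"VACUUM ANALYZE {quote_ident(schema)}.{quote_ident(table)};"
        for schema, table in tables
    ]

stmts = vacuum_statements([("public", "users"), ("cycle2425", "student")])
# Each statement can then be run one at a time, keeping lock windows short.
```

Running the statements individually (rather than one database-wide `VACUUM ANALYZE`) is what keeps any single lock window small.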
Option 3: Use parallel execution (faster when there are many tables)

If you have a large number of tables, parallel execution can speed things up. Using GNU Parallel in bash:

```bash
psql -d your_database -t -c "
  SELECT 'VACUUM ANALYZE ' || schemaname || '.' || tablename || ';'
  FROM pg_tables
  WHERE schemaname NOT IN ('pg_catalog', 'information_schema');
" | parallel -j 4 psql -d your_database -c {}
```
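The fan-out pattern behind the GNU Parallel pipeline can be sketched locally. In this hedged Python sketch, `run_statement` is a stand-in for invoking `psql` (it does not touch a database), and the thread pool mirrors `parallel -j 4` by keeping at most four statements in flight:

```python
from concurrent.futures import ThreadPoolExecutor

def run_statement(stmt: str) -> str:
    # Stand-in for `psql -d your_database -c "<stmt>"`; in practice this
    # would be a subprocess call or a database-driver execute().
    return f"done: {stmt}"

def run_parallel(statements, workers=4):
    # Mirror `parallel -j 4`: at most `workers` statements run concurrently;
    # map() returns results in input order.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(run_statement, statements))

results = run_parallel([
    "VACUUM ANALYZE public.users;",
    "VACUUM ANALYZE public.orders;",
])
```

Note that each parallel worker still takes its own locks, so concurrency trades lock-window duration for higher momentary load.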
**3. Automate Regular Vacuuming with a Cron Job**

For automated safe execution, you can set a cron job on the database server:

```
0 3 * * * psql -d your_database -c "VACUUM ANALYZE;"
```
**4. Check for Long-Running Transactions (Before Running Vacuum)**

Long-running transactions can prevent vacuum from removing dead rows. Identify them first:

```sql
SELECT pid,
       age(clock_timestamp(), query_start),
       state,
       query
FROM pg_stat_activity
WHERE state = 'active'
  AND query NOT LIKE 'VACUUM%';
```
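The filtering logic of that query — active, non-VACUUM sessions older than some cutoff — can be sketched over plain tuples. This is an illustrative Python sketch (the row shape and the 5-minute cutoff are my assumptions, not part of the ticket):

```python
from datetime import datetime, timedelta

def long_running(rows, now, threshold=timedelta(minutes=5)):
    """rows: (pid, query_start, state, query) tuples mirroring pg_stat_activity.
    Returns pids of active, non-VACUUM queries running longer than `threshold`."""
    return [
        pid
        for pid, query_start, state, query in rows
        if state == "active"
        and not query.startswith("VACUUM")
        and now - query_start > threshold
    ]

now = datetime(2025, 2, 7, 14, 0)
rows = [
    (101, datetime(2025, 2, 7, 13, 30), "active", "SELECT ..."),   # 30 min old
    (102, datetime(2025, 2, 7, 13, 59), "active", "UPDATE ..."),   # 1 min old
    (103, datetime(2025, 2, 7, 13, 0), "active", "VACUUM ANALYZE public.users"),
]
# long_running(rows, now) -> [101]
```

Only pid 101 is flagged: 102 is under the cutoff and 103 is itself a VACUUM.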
**5. Monitor Vacuum Progress and Logs**

After running vacuum, check when each table was last vacuumed and analyzed:

```sql
SELECT relname,
       last_vacuum,
       last_autovacuum,
       last_analyze,
       last_autoanalyze
FROM pg_stat_user_tables
ORDER BY last_vacuum DESC;
```
**6. Optimize Autovacuum Settings**

Instead of manual vacuuming, fine-tune autovacuum in `postgresql.conf`:

```
autovacuum_vacuum_threshold = 50
autovacuum_analyze_threshold = 50
autovacuum_vacuum_cost_limit = 2000
autovacuum_vacuum_cost_delay = 10ms
```
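To see how these knobs interact: Postgres triggers autovacuum on a table once its dead tuples exceed `autovacuum_vacuum_threshold + autovacuum_vacuum_scale_factor * reltuples` (scale factor defaults to 0.2 and is not shown in the fragment above). A quick Python sketch of that documented formula:

```python
def autovacuum_trigger_point(live_tuples, threshold=50, scale_factor=0.2):
    # Autovacuum fires on a table when its dead-tuple count exceeds
    # threshold + scale_factor * reltuples (defaults: 50 and 0.2).
    return threshold + scale_factor * live_tuples

# With the defaults, a 1,000,000-row table is only vacuumed after roughly
# 200,050 dead tuples accumulate; lowering the scale factor to 0.05
# (as in the per-table ALTER TABLE overrides later in this thread)
# drops that to about 50,050.
default_trigger = autovacuum_trigger_point(1_000_000)            # 200050.0
tuned_trigger = autovacuum_trigger_point(1_000_000, scale_factor=0.05)  # 50050.0
```

This is why large tables can accumulate huge amounts of bloat under default settings: the trigger point scales with table size.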
**7. Final Recommendations**
I have created a separate GitHub Workflow to handle these operations on a scheduled basis, but I will track that development in a separate ticket.
Based on the current scope and the instructions provided to me, this operation will be requested from the CDC DBA engineers, who will perform it on all databases (across all environments).
I will be posting the ticket numbers associated with these operations shortly.
Requests filed at the CDC Service Desk (implementation time: 02/07/2025 13:48:25):

- Production: pdhprod-pgsql.postgres.database.azure.com
- Staging: pdhstaging-pgsql.postgres.database.azure.com
- Test: pdhtest-pgsql.postgres.database.azure.com
I have learned from the CDC DBA team that Vacuum Analyze (contrary to popular belief) is already enabled at the system level in the current setup. Whenever these operations run, there is a significant amount of table locking, along with CPU spikes that affect the system. Vacuum Analyze is currently set up with a percentage threshold and is triggered whenever those conditions are met.

```sql
--ALTER TABLE cycle2425.student SET (autovacuum_vacuum_scale_factor = 0.05);
--ALTER TABLE cycle2425.student SET (autovacuum_vacuum_threshold = 2000000);
ALTER TABLE cycle2425.student SET (autovacuum_analyze_scale_factor = 0.20);
ALTER TABLE cycle2425.student SET (autovacuum_analyze_threshold = 20000000);
```
We have these queries that can be run to determine the status of these operations:

Last autovacuum:

```sql
select schemaname,
       relname,
       autovacuum_count,
       last_autovacuum,
       last_autoanalyze,
       last_vacuum,
       last_analyze
from pg_stat_user_tables
order by autovacuum_count desc, last_autovacuum;
```

Last auto and manual times and counts:

```sql
select schemaname,
       relname,
       autovacuum_count,
       autoanalyze_count,
       last_autovacuum,
       last_autoanalyze,
       last_vacuum,
       last_analyze
from pg_stat_user_tables
order by autoanalyze_count desc, last_autovacuum;
```

Last auto times and counts:

```sql
select schemaname,
       relname,
       autovacuum_count,
       autoanalyze_count,
       last_autovacuum,
       last_autoanalyze
from pg_stat_user_tables
order by autoanalyze_count desc, last_autovacuum;
```

All autovacuum info:

```sql
select *
from pg_stat_user_tables
order by autovacuum_count desc;
```
This is our current understanding. Be mindful that the DBA team is still performing a proper assessment of the database systems and their configurations. Certain adjustments will need to be made, but in time they will be able to give us a final verdict.
Unless the development team has objections, I propose cancelling these requests to enact scheduled and manual Vacuum Analyze operations. Nevertheless, someone with more knowledge should make that decision; I am only stating the facts.
As a last note, and based on the DBA team's preliminary recommendations, it is possible to override these system-wide configurations on a table-by-table basis. Note: it is also recommended that we stay away from "automated" auto-vacuuming operations.
Target location for SOP documentation for running raw queries