Skip to content

Improve autovacuum "queue and progress" howto #30

@NikolayS

Description

@NikolayS

Context

The howto at docs/postgres-howtos/database-administration/maintenance/autovacuum-queue-and-progress.md covers autovacuum queue depth and live progress monitoring. A review against the Cybertec autovacuum monitoring article and the postgres-ai SQL style guide surfaced a number of gaps and improvements.

Correctness / coverage

  • Add PG 18+ autovacuum_vacuum_max_threshold to the trigger formula — effective threshold is least(max_threshold, threshold + scale_factor * reltuples); without it the query overstates urgency on large tables under PG 18
  • Handle autovacuum_vacuum_insert_threshold / autovacuum_vacuum_insert_scale_factor (PG 13+) — insert-mostly tables are invisible to the current query
  • Fix inverted autovacuum_enabled label — case when autovacuum_enabled then 'DISABLED' else 'enabled' end is backwards
  • Add anti-wraparound urgency — compute age(relfrozenxid) vs. autovacuum_freeze_max_age and mxid_age(relminmxid) vs. autovacuum_multixact_freeze_max_age, surface wraparound_urgency and per-table overrides
  • Add a trigger_reason column — {dead_tuples, inserts, freeze, multixact, in_progress} so each queue row says why it qualifies

Query quality

  • Replace regex-based reloption parsing with LATERAL unnest(reloptions) + FILTER — shorter, faster, easier to extend
  • Compute explicit vacuum_urgency = n_dead_tup / effective_threshold ratio and order by it, replacing the current boolean in/out-of-queue approach
  • Make query version-portable — use current_setting('autovacuum_vacuum_max_threshold', true) with 'Infinity'::float8 fallback for PG 13–17
  • Handle reltuples = -1 (PG 14+ initial value for freshly created tables) — urgency is meaningless in that case
  • Filter out temp tables / unlogged tables that won't be autovacuumed
  • Use format('%I.%I', nspname, relname) for the relation column instead of string concatenation
  • Simplify the full outer join — a left join + optional union handles the "another DB" edge case more clearly

Adjacent diagnostics

  • Add a pg_stat_progress_analyze companion query (the "natural extension" already mentioned in the page)
  • Add visibility-map health query via pg_visibility_map_summary — different signal from dead-tuple count, relevant for index-only scans
  • Cross-link last_autoanalyze / n_mod_since_analyze alongside last_autovacuum
  • Surface throttle math next to in-progress rows — effective MB/s ceiling from [auto]vacuum_cost_limit, [auto]vacuum_cost_delay, page-cost params

Tuning guidance / prose

  • Add a concrete worked example of the "load average" analogy (e.g., "10 tables in queue, 3 workers → effective depth of 3.3")
  • Explain when per-table overrides are better than bumping autovacuum_max_workers globally
  • Add a "why is this table stuck in the queue?" checklist: long-running transactions, replication slots, prepared transactions, hot-standby feedback, DDL locks, corruption
  • Note vacuum_buffer_usage_limit (PG 16+) when discussing "make workers run faster"
  • Cross-link how-to-monitor-xmin-horizon.md as the natural next step when cleanup is blocked
  • Add a version-compatibility callout (which columns/settings require which PG version)
  • Cite the Cybertec article as further reading

Output / UX

  • Refresh the screenshot (0067_tables_to_be_autovacuumed_2.png)
  • Provide a compact variant (no progress columns) for the common case when nothing is actively vacuuming
  • Provide a \watch-friendly variant for staring at an in-progress vacuum

References

Metadata

Metadata

Assignees

No one assigned

    Labels

    documentationImprovements or additions to documentationenhancementNew feature or request

    Type

    No type
    No fields configured for issues without a type.

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions