Vacuum, Autovacuum, and Analyze — Everything a PostgreSQL DBA Must Know
Vacuum, Autovacuum, and Analyze — Everything a PostgreSQL DBA Must Know
Introduction
PostgreSQL uses MVCC (Multi-Version Concurrency Control) to handle concurrent transactions.
-
When rows are updated or deleted, PostgreSQL doesn’t overwrite them.
-
Instead, it marks the old versions as dead tuples and keeps new versions.
-
Over time, dead tuples accumulate → bloat, wasted space, slower queries.
This is where VACUUM, AUTOVACUUM, and ANALYZE come in.
They are essential DBA tools to keep PostgreSQL lean, performant, and healthy.
Why Do We Need VACUUM?
-
Dead tuples waste disk space.
-
Queries scanning bloated tables run slower.
-
Without cleanup, PostgreSQL risks transaction ID wraparound, which can lead to downtime.
Types of VACUUM
1. VACUUM
-
Removes dead tuples.
-
Space is reusable inside the table (not returned to OS).
2. VACUUM FULL
-
Rewrites table → physically shrinks it.
-
Space is returned to OS.
-
Requires exclusive lock.
3. VACUUM FREEZE
-
Marks tuples as “frozen” → prevents transaction ID wraparound issues.
What is AUTOVACUUM?
Autovacuum = PostgreSQL’s automatic background cleaner.
-
Runs automatically based on thresholds.
-
Prevents table bloat and transaction wraparound.
Key settings in postgresql.conf
:
Example: For a 1,000,000 row table with scale factor 0.2
:
-
Autovacuum triggers at 200,000 dead tuples.
What is ANALYZE?
ANALYZE
updates statistics used by PostgreSQL’s query planner.
-
Without up-to-date stats → bad query plans → poor performance.
Examples:
Always run ANALYZE
after bulk loads, deletes, or schema changes.
Step-by-Step Demo with Outputs
1. Create Test Table
2. Insert 100,000 Rows
✅ Row count:
Table size:
3. Delete 50,000 Rows
Check dead tuples:
Example result:
Table | Live Rows | Dead Rows |
---|---|---|
vacuum_demo | 50,000 | 50,000 |
4. Run Plain VACUUM
Re-check stats:
Example result:
Table | Live Rows | Dead Rows |
---|---|---|
vacuum_demo | 50,000 | 0 |
Dead tuples are gone.
Table size still:
5. Run VACUUM FULL
Check table size:
✅ Table shrank from 16 MB → 8 MB.
6. Run ANALYZE
Check stats:
Query planner now has fresh statistics for optimal execution plans.
7. Watch Autovacuum in Action
Insert & delete again:
Check autovacuum process:
Check dead tuples trend over time:
After some minutes, autovacuum will reduce dead tuples automatically.
Visual Bloat Check
Using pgstattuple
Example before cleanup (after delete):
After VACUUM
:
After VACUUM FULL
:
Quick Approximate Bloat Query
Fast way to spot bloated tables.
DBA Best Practices
-
✅ Keep autovacuum enabled (don’t disable unless you have a custom scheduler).
-
✅ Tune autovacuum scale factors for high-update tables.
-
✅ Monitor bloat regularly with
pgstattuple
or catalog queries. -
✅ Run
VACUUM FULL
only during maintenance windows. -
✅ Always
ANALYZE
after bulk inserts or deletes. -
✅ Watch for transaction ID wraparound — PostgreSQL will stop writes if ignored.
Final Thoughts
-
VACUUM
keeps your tables clean. -
AUTOVACUUM
keeps your database running automatically. -
ANALYZE
keeps your queries fast and efficient.
Together, these tools are not optional — they are mission-critical maintenance tasks every PostgreSQL DBA must master.
By monitoring, tuning, and scheduling these correctly, your PostgreSQL database will stay lean, performant, and production-ready .