
Vacuum, Autovacuum, and Analyze — Everything a PostgreSQL DBA Must Know


Introduction

PostgreSQL uses MVCC (Multi-Version Concurrency Control) to handle concurrent transactions.

  • When rows are updated or deleted, PostgreSQL doesn’t overwrite them.

  • Instead, it marks the old versions as dead tuples and keeps new versions.

  • Over time, dead tuples accumulate → bloat, wasted space, slower queries.

This is where VACUUM, AUTOVACUUM, and ANALYZE come in.
They are essential DBA tools to keep PostgreSQL lean, performant, and healthy.
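
A quick illustration (my_table here stands for any table with id and name columns): a single UPDATE leaves the old row version behind, and it shows up as a dead tuple in the statistics view.

-- The UPDATE does not overwrite the row; the old version becomes a dead tuple
UPDATE my_table SET name = 'updated' WHERE id = 1;

-- Check live vs. dead tuple counts (stats are updated asynchronously,
-- so allow a moment before checking)
SELECT relname, n_live_tup, n_dead_tup
FROM pg_stat_user_tables
WHERE relname = 'my_table';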


Why Do We Need VACUUM?

  • Dead tuples waste disk space.

  • Queries scanning bloated tables run slower.

  • Without cleanup, PostgreSQL risks transaction ID wraparound, which can lead to downtime.
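
To gauge wraparound risk, a standard catalog query shows the age of each database's oldest unfrozen transaction ID (PostgreSQL refuses new writes as this approaches roughly 2 billion):

-- Databases with the oldest unfrozen transaction IDs are at highest risk
SELECT datname, age(datfrozenxid) AS xid_age
FROM pg_database
ORDER BY age(datfrozenxid) DESC;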


Types of VACUUM

1. VACUUM

  • Removes dead tuples.

  • Space is reusable inside the table (not returned to OS).

VACUUM my_table;
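
To see what a vacuum run actually did, add the VERBOSE option:

-- Prints per-table details: dead row versions removed, pages scanned, etc.
VACUUM (VERBOSE) my_table;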

2. VACUUM FULL

  • Rewrites table → physically shrinks it.

  • Space is returned to OS.

  • Requires exclusive lock.

VACUUM FULL my_table;

3. VACUUM FREEZE

  • Marks tuples as “frozen” → prevents transaction ID wraparound issues.

VACUUM FREEZE my_table;
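
To see which tables are closest to needing a freeze, check the age of relfrozenxid in the catalog (autovacuum forces a freeze once this exceeds autovacuum_freeze_max_age, 200 million by default):

-- Tables with the highest transaction ID age will be frozen first
SELECT relname, age(relfrozenxid) AS xid_age
FROM pg_class
WHERE relkind = 'r'
ORDER BY age(relfrozenxid) DESC
LIMIT 10;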

What is AUTOVACUUM?

Autovacuum = PostgreSQL’s automatic background cleaner.

  • Runs automatically based on thresholds.

  • Prevents table bloat and transaction wraparound.

Key settings in postgresql.conf:

autovacuum = on
autovacuum_naptime = 60 # seconds between checks
autovacuum_vacuum_threshold = 50 # min dead tuples before vacuum
autovacuum_vacuum_scale_factor = 0.2 # % of table that must be dead

Example: For a 1,000,000-row table with the settings above:

  • Autovacuum triggers at 50 + (0.2 × 1,000,000) = 200,050 dead tuples.
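
For large or high-churn tables, the global scale factor is often too lax (a 100-million-row table would accumulate 20 million dead tuples first). Per-table storage parameters override the global settings; the values below are illustrative, not a recommendation:

-- Illustrative per-table override: vacuum after 1,000 + 1% dead tuples
ALTER TABLE my_table SET (
    autovacuum_vacuum_scale_factor = 0.01,
    autovacuum_vacuum_threshold = 1000
);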


What is ANALYZE?

ANALYZE updates statistics used by PostgreSQL’s query planner.

  • Without up-to-date stats → bad query plans → poor performance.

Examples:

ANALYZE my_table; -- all columns
ANALYZE my_table (column1, column2); -- specific columns

Always run ANALYZE after bulk loads, deletes, or schema changes.
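
To verify when a table's statistics were last refreshed, pg_stat_user_tables tracks both manual and automatic runs:

SELECT relname, last_analyze, last_autoanalyze
FROM pg_stat_user_tables
WHERE relname = 'my_table';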


Step-by-Step Demo with Outputs

1. Create Test Table

CREATE TABLE vacuum_demo (
    id SERIAL PRIMARY KEY,
    name TEXT,
    created_at TIMESTAMP DEFAULT now()
);

2. Insert 100,000 Rows

INSERT INTO vacuum_demo (name)
SELECT 'User_' || g
FROM generate_series(1, 100000) g;

✅ Row count:

SELECT count(*) FROM vacuum_demo;
-- 100000

Table size:

SELECT pg_size_pretty(pg_total_relation_size('vacuum_demo'));
-- 16 MB

3. Delete 50,000 Rows

DELETE FROM vacuum_demo WHERE id <= 50000;

Check dead tuples:

SELECT relname, n_live_tup, n_dead_tup
FROM pg_stat_user_tables
WHERE relname = 'vacuum_demo';

Example result:

Table       | Live Rows | Dead Rows
vacuum_demo | 50,000    | 50,000

4. Run Plain VACUUM

VACUUM vacuum_demo;

Re-check stats:

SELECT relname, n_live_tup, n_dead_tup
FROM pg_stat_user_tables
WHERE relname = 'vacuum_demo';

Example result:

Table       | Live Rows | Dead Rows
vacuum_demo | 50,000    | 0

Dead tuples are gone, but the table size is unchanged (plain VACUUM keeps the freed space for reuse inside the table):

SELECT pg_size_pretty(pg_total_relation_size('vacuum_demo'));
-- 16 MB

5. Run VACUUM FULL

VACUUM FULL vacuum_demo;

Check table size:

SELECT pg_size_pretty(pg_total_relation_size('vacuum_demo'));
-- 8 MB

✅ Table shrank from 16 MB → 8 MB.


6. Run ANALYZE

ANALYZE vacuum_demo;

Check stats:

SELECT attname, n_distinct, most_common_vals
FROM pg_stats
WHERE tablename = 'vacuum_demo';

Query planner now has fresh statistics for optimal execution plans.
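
To confirm this, compare the planner's row estimate against the actual count; after ANALYZE the estimate in the plan should closely match reality:

-- The "rows=" estimate should now be close to the real row count
EXPLAIN SELECT * FROM vacuum_demo WHERE id > 90000;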


7. Watch Autovacuum in Action

Insert & delete again:

INSERT INTO vacuum_demo (name)
SELECT 'User_New_' || g
FROM generate_series(1, 300000) g;

DELETE FROM vacuum_demo WHERE id % 2 = 0;

Check autovacuum process:

SELECT datname, pid, state, query
FROM pg_stat_activity
WHERE query LIKE 'autovacuum%';

Check dead tuples trend over time:

SELECT relname, n_live_tup, n_dead_tup
FROM pg_stat_user_tables
WHERE relname = 'vacuum_demo';

Within a few minutes, autovacuum will reduce the dead tuple count automatically.
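
You can confirm when the last automatic run happened:

SELECT relname, last_vacuum, last_autovacuum, last_autoanalyze
FROM pg_stat_user_tables
WHERE relname = 'vacuum_demo';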


Visual Bloat Check

Using pgstattuple

CREATE EXTENSION IF NOT EXISTS pgstattuple;
SELECT * FROM pgstattuple('vacuum_demo');

Example before cleanup (after delete):

table_len | 16 MB
tuple_count | 100000
dead_tuple_count | 50000
dead_tuple_percent | 49.8
free_space | 1 MB

After VACUUM:

table_len | 16 MB
tuple_count | 50000
dead_tuple_count | 0
dead_tuple_percent | 0.0
free_space | 8 MB

After VACUUM FULL:

table_len | 8 MB
tuple_count | 50000
dead_tuple_count | 0
dead_tuple_percent | 0.0
free_space | 0.1 MB

Quick Approximate Bloat Query

WITH table_stats AS (
    SELECT
        relname AS table_name,
        n_dead_tup,
        n_live_tup,
        ROUND(100 * n_dead_tup::numeric / (n_live_tup + 1), 2) AS dead_pct
    FROM pg_stat_user_tables
)
SELECT * FROM table_stats
ORDER BY dead_pct DESC;

Fast way to spot bloated tables.


DBA Best Practices

  • ✅ Keep autovacuum enabled (don’t disable unless you have a custom scheduler).

  • ✅ Tune autovacuum scale factors for high-update tables.

  • ✅ Monitor bloat regularly with pgstattuple or catalog queries.

  • ✅ Run VACUUM FULL only during maintenance windows.

  • ✅ Always ANALYZE after bulk inserts or deletes.

  • ✅ Watch for transaction ID wraparound — PostgreSQL will stop writes if ignored.


Final Thoughts

  • VACUUM keeps your tables clean.

  • AUTOVACUUM keeps your database running automatically.

  • ANALYZE keeps your queries fast and efficient.

Together, these tools are not optional — they are mission-critical maintenance tasks every PostgreSQL DBA must master.

By monitoring, tuning, and scheduling these correctly, your PostgreSQL database will stay lean, performant, and production-ready.
