The Most Common PostgreSQL Performance Mistakes
Most PostgreSQL performance problems in production come from a short list of root causes: missing indexes, N+1 query patterns generated by ORMs, bloated tables from infrequent vacuuming, and connection exhaustion under concurrent load. This guide walks through diagnosing each problem and the concrete fixes that actually work in production.
Start With EXPLAIN ANALYZE
Before optimising anything, understand what the database is actually doing. EXPLAIN ANALYZE executes the query and shows the actual execution plan with timing at each node.
EXPLAIN (ANALYZE, BUFFERS, FORMAT TEXT)
SELECT u.id, u.email, COUNT(o.id) as order_count
FROM users u
LEFT JOIN orders o ON o.user_id = u.id
WHERE u.created_at > NOW() - INTERVAL '30 days'
GROUP BY u.id, u.email
ORDER BY order_count DESC
LIMIT 100;
Look for: Seq Scan on large tables (should be an Index Scan), high actual rows vs estimated rows (stale statistics), and Hash Join on very large datasets (may benefit from an index).
Index Strategy
The most impactful indexes to add are on columns used in WHERE, JOIN ON, and ORDER BY clauses on large tables. Use partial indexes when you frequently filter on a specific condition.
-- Partial index — only index active users
CREATE INDEX idx_users_email_active
ON users (email)
WHERE deleted_at IS NULL;
-- Composite index — column order matters
-- This serves WHERE status = ? AND created_at > ?
CREATE INDEX idx_orders_status_created
ON orders (status, created_at DESC);
-- Covering index — avoids table heap fetch entirely
CREATE INDEX idx_orders_user_covering
ON orders (user_id)
INCLUDE (total, status, created_at);
A covering index (using INCLUDE) lets PostgreSQL answer the query entirely from the index without touching the table. This eliminates random I/O on heap pages and is often a 5-10x improvement for read-heavy queries.
Connection Pooling with PgBouncer
PostgreSQL creates an OS process per connection. At high concurrency, thousands of connections exhaust RAM and CPU. PgBouncer sits in front of PostgreSQL and multiplexes many client connections onto a small pool of actual database connections.
# pgbouncer.ini
[databases]
mydb = host=localhost port=5432 dbname=mydb
[pgbouncer]
pool_mode = transaction
max_client_conn = 1000
default_pool_size = 25
reserve_pool_size = 5
reserve_pool_timeout = 3
Use transaction mode for stateless API backends — it is the most efficient. Avoid session mode unless you use session-level features like advisory locks or SET LOCAL.
Autovacuum Tuning for Write-Heavy Tables
PostgreSQL's MVCC model means old row versions accumulate in the heap. Autovacuum reclaims this dead space, but default settings are tuned for small databases. Write-heavy tables in production need more aggressive vacuum settings.
-- Per-table autovacuum tuning
ALTER TABLE orders SET (
autovacuum_vacuum_scale_factor = 0.01, -- vacuum when 1% of rows are dead
autovacuum_analyze_scale_factor = 0.005,
autovacuum_vacuum_cost_delay = 2
);
-- Check table bloat
SELECT
schemaname, tablename,
pg_size_pretty(pg_total_relation_size(schemaname||'.'||tablename)) AS total_size,
n_dead_tup, n_live_tup,
round(n_dead_tup::numeric / nullif(n_live_tup, 0) * 100, 2) AS dead_ratio
FROM pg_stat_user_tables
ORDER BY n_dead_tup DESC
LIMIT 20;
Query Patterns to Avoid
- SELECT *: Always select only the columns you need.
SELECT *defeats covering indexes and wastes network bandwidth. - OFFSET pagination on large tables:
OFFSET 10000 LIMIT 20scans and discards 10,000 rows. Use keyset pagination:WHERE id > last_seen_id LIMIT 20. - Functions on indexed columns in WHERE:
WHERE LOWER(email) = ?cannot use an index onemail. Use a functional index:CREATE INDEX ON users (LOWER(email)). - Non-selective indexes: An index on a boolean column with 95%
truevalues is useless — the planner will choose a seq scan. Use partial indexes instead.
Monitoring Slow Queries
-- Enable in postgresql.conf
log_min_duration_statement = 100 -- log queries over 100ms
shared_preload_libraries = 'pg_stat_statements'
-- Find the worst queries
SELECT query, calls, mean_exec_time, total_exec_time
FROM pg_stat_statements
ORDER BY total_exec_time DESC
LIMIT 20;
Run pg_stat_statements in production from day one. The data it collects is the fastest path to finding your worst-performing queries without needing to reproduce them in a dev environment.