A beginner-friendly guide to setting up PostgreSQL replication with 1 primary and 2 replicas using Docker.
- Understanding Replication
- Setup: 1 Primary + 2 Replicas
- Verify and Test Replication
- Managing Data (CRUD Operations)
- Failover and Promotion
- Monitoring and Troubleshooting
- Cleanup
What is Streaming Replication?
- Primary server accepts writes and reads
- Replica servers receive changes via WAL (Write-Ahead Log) streaming
- Replicas are read-only (hot standby mode)
- Changes replicate in near real-time
Key Concepts:
- WAL (Write-Ahead Log): Transaction log that records all database changes
- Hot Standby: Replicas can serve read queries while replicating
- pg_basebackup: Tool to clone initial data from primary to replica
- Promotion: Converting a replica to primary (failover)
Architecture:
┌─────────────────┐
│ Primary Server │ (Read + Write)
│ postgres-primary│
│ Port: 5432 │
└────────┬────────┘
│
├──── WAL Stream ────> Replica 1 (postgres-replica1, Port: 5433) [Read-Only]
│
└──── WAL Stream ────> Replica 2 (postgres-replica2, Port: 5434) [Read-Only]
# Create network for database communication
docker network create postgres-network
# Create volumes for persistent storage
docker volume create postgres-primary-data
docker volume create postgres-replica1-data
docker volume create postgres-replica2-dataWhy? Containers need a shared network to communicate, and volumes persist data even if containers are deleted.
docker run -d \
--name postgres-primary \
--network postgres-network \
-e POSTGRES_PASSWORD=primarypass \
-e POSTGRES_USER=postgres \
-e POSTGRES_DB=testdb \
-v postgres-primary-data:/var/lib/postgresql/data \
-p 5432:5432 \
postgres:15Test connection:
docker exec -it postgres-primary psql -U postgres -d testdb -c "SELECT version();"Access container and edit configuration:
docker exec -it postgres-primary bashInside the container:
# Add replication settings to postgresql.conf
cat >> /var/lib/postgresql/data/postgresql.conf << EOF
# Replication Configuration
wal_level = replica
max_wal_senders = 3
max_replication_slots = 3
hot_standby = on
EOFWhat these settings mean:
wal_level = replica: Enables WAL logging for replicationmax_wal_senders = 3: Allows up to 3 replicas to connect simultaneouslymax_replication_slots = 3: Reserves slots for replicas (prevents WAL deletion)hot_standby = on: Allows replicas to accept read queries
Create replication user:
psql -U postgres -c "CREATE ROLE replicator WITH REPLICATION PASSWORD 'replicapass' LOGIN;"Allow replication connections in pg_hba.conf:
echo "host replication replicator 0.0.0.0/0 md5" >> /var/lib/postgresql/data/pg_hba.confExit and restart:
exit
docker restart postgres-primary# Clone data from primary using temporary container
docker run --rm \
--network postgres-network \
-v postgres-replica1-data:/var/lib/postgresql/data \
postgres:15 \
bash -c "PGPASSWORD=replicapass pg_basebackup -h postgres-primary -D /var/lib/postgresql/data -U replicator -Fp -Xs -P -R"
# Start replica container with cloned data
docker run -d \
--name postgres-replica1 \
--network postgres-network \
-e POSTGRES_PASSWORD=replicapass \
-v postgres-replica1-data:/var/lib/postgresql/data \
-p 5433:5432 \
postgres:15What pg_basebackup does:
-h postgres-primary: Connect to primary server-D /var/lib/postgresql/data: Target data directory-U replicator: Use replication user-Fp: Plain format (directory copy)-Xs: Stream WAL during backup-P: Show progress-R: Auto-create standby.signal and replication configuration
# Clone data from primary
docker run --rm \
--network postgres-network \
-v postgres-replica2-data:/var/lib/postgresql/data \
postgres:15 \
bash -c "PGPASSWORD=replicapass pg_basebackup -h postgres-primary -D /var/lib/postgresql/data -U replicator -Fp -Xs -P -R"
# Start replica container
docker run -d \
--name postgres-replica2 \
--network postgres-network \
-e POSTGRES_PASSWORD=replicapass \
-v postgres-replica2-data:/var/lib/postgresql/data \
-p 5434:5432 \
postgres:15On primary - view connected replicas:
docker exec -it postgres-primary psql -U postgres -c \
"SELECT client_addr, state, sync_state, write_lag, replay_lag FROM pg_stat_replication;"Expected output:
client_addr | state | sync_state | write_lag | replay_lag
-------------+-----------+------------+-----------+------------
172.18.0.3 | streaming | async | 00:00:00 | 00:00:00
172.18.0.4 | streaming | async | 00:00:00 | 00:00:00
Check if each instance is primary or replica:
# Returns 'f' (false) = primary, 't' (true) = replica
docker exec -it postgres-primary psql -U postgres -c "SELECT pg_is_in_recovery();"
docker exec -it postgres-replica1 psql -U postgres -c "SELECT pg_is_in_recovery();"
docker exec -it postgres-replica2 psql -U postgres -c "SELECT pg_is_in_recovery();"Check replication slots:
docker exec -it postgres-primary psql -U postgres -c "SELECT * FROM pg_replication_slots;"On primary:
# Create table and insert 1000 rows
docker exec -it postgres-primary psql -U postgres -d testdb -c \
"CREATE TABLE test_replication (id SERIAL PRIMARY KEY, data TEXT, created_at TIMESTAMP DEFAULT NOW());"
docker exec -it postgres-primary psql -U postgres -d testdb -c \
"INSERT INTO test_replication (data) SELECT 'Row ' || g FROM generate_series(1,1000) g;"Verify count on primary:
docker exec -it postgres-primary psql -U postgres -d testdb -c \
"SELECT count(*) FROM test_replication;"Wait a moment for replication, then check replicas:
sleep 2
# Check replica 1
docker exec -it postgres-replica1 psql -U postgres -d testdb -c \
"SELECT count(*) FROM test_replication;"
docker exec -it postgres-replica1 psql -U postgres -d testdb -c \
"SELECT * FROM test_replication ORDER BY id DESC LIMIT 5;"
# Check replica 2
docker exec -it postgres-replica2 psql -U postgres -d testdb -c \
"SELECT count(*) FROM test_replication;"
docker exec -it postgres-replica2 psql -U postgres -d testdb -c \
"SELECT * FROM test_replication ORDER BY id DESC LIMIT 5;"Result: Both replicas should show the same data as primary.
# Insert more rows
docker exec -it postgres-primary psql -U postgres -d testdb -c \
"INSERT INTO test_replication (data) SELECT 'New row ' || g FROM generate_series(1,100) g;"
# Update existing rows
docker exec -it postgres-primary psql -U postgres -d testdb -c \
"UPDATE test_replication SET data = 'Updated: ' || data WHERE id <= 10;"Verify changes replicated:
sleep 2
docker exec -it postgres-replica1 psql -U postgres -d testdb -c \
"SELECT * FROM test_replication WHERE id <= 10;"docker exec -it postgres-primary psql -U postgres -d testdb -c \
"DELETE FROM test_replication WHERE id BETWEEN 50 AND 100;"Verify deletion replicated:
docker exec -it postgres-replica1 psql -U postgres -d testdb -c \
"SELECT count(*) FROM test_replication;"Trying to write on replica will fail:
docker exec -it postgres-replica1 psql -U postgres -d testdb -c \
"INSERT INTO test_replication (data) VALUES ('test');"Error: ERROR: cannot execute INSERT in a read-only transaction
Simulate primary failure:
docker stop postgres-primaryPromote replica1 to primary:
docker exec -it postgres-replica1 psql -U postgres -c "SELECT pg_promote(wait_seconds => 60);"Verify promotion:
# Should return 'f' (false) = now a primary
docker exec -it postgres-replica1 psql -U postgres -c "SELECT pg_is_in_recovery();"Test writes on new primary:
docker exec -it postgres-replica1 psql -U postgres -d testdb -c \
"INSERT INTO test_replication (data) VALUES ('Written to new primary');"Point replica2 to new primary (re-clone approach):
# Stop replica2
docker stop postgres-replica2
docker rm postgres-replica2
# Re-clone from new primary (replica1)
docker run --rm \
--network postgres-network \
-v postgres-replica2-data:/var/lib/postgresql/data \
postgres:15 \
bash -c "rm -rf /var/lib/postgresql/data/* && PGPASSWORD=replicapass pg_basebackup -h postgres-replica1 -D /var/lib/postgresql/data -U replicator -Fp -Xs -P -R"
# Start replica2
docker run -d \
--name postgres-replica2 \
--network postgres-network \
-e POSTGRES_PASSWORD=replicapass \
-v postgres-replica2-data:/var/lib/postgresql/data \
-p 5434:5432 \
postgres:15Verify new topology:
# replica1 is now primary
docker exec -it postgres-replica1 psql -U postgres -c \
"SELECT client_addr, state FROM pg_stat_replication;"When old primary comes back, convert it to replica:
# Stop and remove old primary container
docker stop postgres-primary
docker rm postgres-primary
# Re-clone data from current primary (replica1)
docker run --rm \
--network postgres-network \
-v postgres-primary-data:/var/lib/postgresql/data \
postgres:15 \
bash -c "rm -rf /var/lib/postgresql/data/* && PGPASSWORD=replicapass pg_basebackup -h postgres-replica1 -D /var/lib/postgresql/data -U replicator -Fp -Xs -P -R"
# Start as replica
docker run -d \
--name postgres-primary \
--network postgres-network \
-e POSTGRES_PASSWORD=primarypass \
-v postgres-primary-data:/var/lib/postgresql/data \
-p 5432:5432 \
postgres:15Verify it's now a replica:
docker exec -it postgres-primary psql -U postgres -c "SELECT pg_is_in_recovery();"
# Should return 't' (true)If you accidentally promoted a replica and want to revert:
# Stop and remove the promoted replica
docker stop postgres-replica1
docker rm postgres-replica1
# Re-clone from current primary
docker run --rm \
--network postgres-network \
-v postgres-replica1-data:/var/lib/postgresql/data \
postgres:15 \
bash -c "rm -rf /var/lib/postgresql/data/* && PGPASSWORD=replicapass pg_basebackup -h postgres-primary -D /var/lib/postgresql/data -U replicator -Fp -Xs -P -R"
# Restart as replica
docker run -d \
--name postgres-replica1 \
--network postgres-network \
-e POSTGRES_PASSWORD=replicapass \
-v postgres-replica1-data:/var/lib/postgresql/data \
-p 5433:5432 \
postgres:15On primary:
docker exec -it postgres-primary psql -U postgres -c \
"SELECT client_addr, state,
pg_wal_lsn_diff(pg_current_wal_lsn(), replay_lsn) AS lag_bytes,
write_lag, flush_lag, replay_lag
FROM pg_stat_replication;"On primary (current WAL position):
docker exec -it postgres-primary psql -U postgres -c "SELECT pg_current_wal_lsn();"On replica (last replayed WAL position):
docker exec -it postgres-replica1 psql -U postgres -c "SELECT pg_last_wal_replay_lsn();"Stop a replica:
docker stop postgres-replica1Insert data on primary while replica is down:
docker exec -it postgres-primary psql -U postgres -d testdb -c \
"INSERT INTO test_replication (data) VALUES ('Data while replica1 down');"Restart replica - it will automatically catch up:
docker start postgres-replica1
sleep 5
# Verify it caught up
docker exec -it postgres-replica1 psql -U postgres -d testdb -c \
"SELECT * FROM test_replication ORDER BY id DESC LIMIT 5;"# Primary logs
docker logs postgres-primary --tail 50
# Replica logs
docker logs postgres-replica1 --tail 50# Stop all containers
docker stop postgres-primary postgres-replica1 postgres-replica2
# Remove containers
docker rm postgres-primary postgres-replica1 postgres-replica2
# Remove volumes (WARNING: deletes all data)
docker volume rm postgres-primary-data postgres-replica1-data postgres-replica2-data
# Remove network
docker network rm postgres-networkdocker stop postgres-primary postgres-replica1 postgres-replica2
docker rm postgres-primary postgres-replica1 postgres-replica2
# Volumes remain - you can recreate containers using same volumesSolution: Check network and pg_hba.conf configuration on primary.
docker exec -it postgres-primary cat /var/lib/postgresql/data/pg_hba.conf | grep replicationSolution: Pick one as canonical primary, re-clone others from it (see Scenario 2 & 3).
Solution: Use the temporary container approach which cleans and clones in one step.
Solution: Check network, disk I/O, and consider synchronous replication or connection pooling.
- Replication Slots: Create named slots to prevent WAL deletion before replica catches up
- Synchronous Replication: Wait for replica confirmation before commit (safer, slower)
- Automated Failover: Use Patroni or repmgr for automatic promotion
- Connection Pooling: Use PgBouncer to manage connections efficiently
- Monitoring: Set up Prometheus + Grafana for metrics
- Partitioning: Learn table partitioning for large datasets
- Clustering: Explore Citus or Patroni + etcd for distributed PostgreSQL
# Check if primary or replica
docker exec -it <container> psql -U postgres -c "SELECT pg_is_in_recovery();"
# View replication status
docker exec -it postgres-primary psql -U postgres -c "SELECT * FROM pg_stat_replication;"
# Promote replica to primary
docker exec -it <replica> psql -U postgres -c "SELECT pg_promote();"
# Check WAL position
docker exec -it <container> psql -U postgres -c "SELECT pg_current_wal_lsn();"
# View replication slots
docker exec -it postgres-primary psql -U postgres -c "SELECT * FROM pg_replication_slots;"
# Create replication slot
docker exec -it postgres-primary psql -U postgres -c "SELECT pg_create_physical_replication_slot('slot_name');"