Some tips and gotchas for setting up PostgreSQL replication between Docker containers.
Setting up replication in Docker containers, and especially starting up a replica node, is less straightforward than one would assume. Here are some tips and some of the gotchas that I encountered while trying to get the Postgres Docker Official Image to cooperate.
1: Use
docker network create postgresnet
to be able to address all nodes directly by name. (You can obviously replace 'postgresnet' with anything else, but I do like that name.)
2: Do not map the pgdata directory directly to a local folder on macOS; the init script will screw up the permissions. (This was on a MacBook; it may be fine on Windows or Linux.)
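As a rough illustration of tips 1 and 2, the sketch below starts nodes on the shared network and keeps pgdata in a Docker-managed named volume instead of a mapped local folder. The container names, volume names, password and the postgres:16 tag are assumptions made for the example, as is the data path /var/lib/postgresql/data (the image's default PGDATA for that tag).

```shell
#!/bin/sh
# Sketch only: names, password and image tag are illustrative assumptions.
DOCKER="${DOCKER:-docker}"   # set DOCKER=echo to print the commands instead of running them
NETWORK="postgresnet"

# Start one Postgres node on the shared network.
# $1 = container name (also its hostname on the network)
# $2 = named volume that holds pgdata (not a folder on the host)
start_node() {
  $DOCKER run -d --name "$1" --network "$NETWORK" \
    -e POSTGRES_PASSWORD=change-me \
    -v "$2":/var/lib/postgresql/data \
    postgres:16
}

# Usage:
#   $DOCKER network create "$NETWORK"
#   start_node pg-primary pgprimary-data
```

Once a node is up, the other containers on the network should be able to reach it by its container name, for example with host=pg-primary.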
3: Files inside the /docker-entrypoint-initdb.d folder in the Docker container (which can be mapped to some local folder) are executed if there is no database in the pgdata folder. Great for things that have to happen once, at first start. However, this happens after initdb has run and the database has been started (as a temporary instance for the init phase).
That means that at this point it is not possible to have pg_basebackup stream the contents of the master database into the data directory, because that directory is already populated and in use!
It is also not possible to stop the database, as that will trigger the container to restart in order to 'fix the problem' of the database not running.
To solve this, the pg_basebackup has to be done on the master, with the result dumped into a folder that is accessible from the host (e.g. a mounted one), and then copied into the replica's pgdata folder (with docker cp) while the replica is off.
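The workaround in tip 3 could look roughly like the untested sketch below. It assumes a master container named pg-primary with a host folder mounted at /backup, the default postgres superuser with local access, and a replica container that exists but is not running and whose pgdata is still empty (for example made with docker create and never started). All of these names and paths are hypothetical. The -R flag makes pg_basebackup write standby.signal and a primary_conninfo line, which is the line tip 4 is about.

```shell
#!/bin/sh
# Sketch only: container names and paths are illustrative assumptions.
DOCKER="${DOCKER:-docker}"   # set DOCKER=echo to print the commands instead of running them
PGDATA_DIR="/var/lib/postgresql/data"

# Step 1, on the master: write a base backup into /backup/base,
# where /backup is assumed to be mounted from the host.
take_base_backup() {   # $1 = master container
  $DOCKER exec "$1" pg_basebackup -U postgres -D /backup/base -X stream -R
}

# Step 2, with the replica off: copy the backup into its pgdata, then start it.
seed_replica() {       # $1 = replica container, $2 = host folder holding the backup
  $DOCKER cp "$2/." "$1:$PGDATA_DIR"
  $DOCKER start "$1"
}

# Usage:
#   take_base_backup pg-primary
#   (inspect ./backup/base/postgresql.auto.conf first)
#   seed_replica pg-replica ./backup/base
```

The trailing "/." on the source path asks docker cp to copy the contents of the folder rather than the folder itself.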
4: When copying the pgdata to the replica, do not copy the postgresql.auto.conf file (if present) without inspecting it first. It contains a primary_conninfo line that pg_basebackup prepared for the replica, but I have yet to see it be correct, and copying it over when it is improperly configured will actually prevent the replication from working!
It is better to craft the primary_conninfo line properly by hand.
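One plausible reason for the wrong line is that pg_basebackup records the connection settings it used itself, which on the master point at the local server rather than at a host the replica can reach. A minimal sketch of replacing that line in the backup folder before it is copied is shown below. The helper name set_primary_conninfo, the host pg-primary, the user replicator and the password are all hypothetical placeholders; a real password containing quotes or spaces would need escaping.

```shell
#!/bin/sh
# Sketch only: host, user and password are illustrative placeholders.
# Replace any existing primary_conninfo line in a config file with a hand-made one.
set_primary_conninfo() {   # $1 = conf file, $2 = master host, $3 = user, $4 = password
  touch "$1"               # make sure the file exists
  tmp="$1.tmp"
  # Keep everything except an existing primary_conninfo line.
  grep -v '^[[:space:]]*primary_conninfo' "$1" > "$tmp" || true
  printf "primary_conninfo = 'host=%s port=5432 user=%s password=%s'\n" "$2" "$3" "$4" >> "$tmp"
  mv "$tmp" "$1"
}

# Usage:
#   set_primary_conninfo ./backup/base/postgresql.auto.conf pg-primary replicator change-me
```

The user in that line needs replication rights on the master, and the master's pg_hba.conf has to allow replication connections from the replica.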
Ellert van Koperen, January 2024.