I. Project Background

        The project requires a primary/standby database setup. My original plan was pgpool + repmgr for load balancing and streaming replication, but that combination is prone to split-brain when the network fails. Take a running environment of primary 1 + standby 2 + witness 3: if primary 1 is cut off from the outside network, it does not know it has been isolated (detecting this requires an extra periodic script) and keeps acting as primary, while standby 2 and witness 3 form a quorum and automatically promote node 2 to primary to keep the cluster available. Once node 1's network recovers, the cluster has two primaries, 1 and 2. That is the split-brain problem, and manual intervention is needed to recover. If writes also landed on node 1 in the meantime, you are additionally into database backup and re-synchronization territory.

        I did consider a background script that periodically checks the network and automatically demotes an unreachable primary to standby, but that scheme is fairly complex and its feasibility remains to be verified...

II. Environment

1. Overview

Deployed with Docker across three servers (haproxy + etcd + patroni).

2. Virtual Machines

patroni1: 192.168.13.30, docker-ce + docker-compose

patroni2: 192.168.13.31, docker-ce + docker-compose

patroni3: 192.168.13.32, docker-ce + docker-compose

3. Versions

haproxy + etcd + patroni:

haproxy 3.2.1, etcd 3.3.13, patroni 4.0.6, postgresql 17.5

Once the image build is finished you get a patroni:4.0.6-latest image; install it on each of the three servers.

For building and using the latest Patroni image, see my other post (《patroni镜像的打包、安装与部署——patroni避坑》).

III. Configuration Files

1. Directory Layout

/home/patroni_cluster/
    --etcd-data
        --etcd1.etcd
        --etcd2.etcd
        --etcd3.etcd
    --patroni-data
    --docker-compose.yml
    --.env
    --haproxy.cfg
    --patroni.yml
    --init.sh
    --init.sql

2. File Contents

① docker-compose.yml

version: '3.8'
services:
  etcd:
    image: patroni:4.0.6-latest
    container_name: etcd${CURRENT_NODE_NAME}
    restart: always
    command: etcd
    environment:
      - ALLOW_NONE_AUTHENTICATION=yes
      - ETCD_NAME=etcd${CURRENT_NODE_NAME}
      - ETCD_ADVERTISE_CLIENT_URLS=${LOCAL_ADDRESS}:2379
      - ETCD_LISTEN_CLIENT_URLS=http://0.0.0.0:2379
      - ETCD_INITIAL_ADVERTISE_PEER_URLS=http://${LOCAL_ADDRESS}:2380
      - ETCD_LISTEN_PEER_URLS=http://0.0.0.0:2380
      - ETCD_INITIAL_CLUSTER_TOKEN=etcd-cluster
      - ETCD_INITIAL_CLUSTER=etcd1=http://${PATRONI1}:2380,etcd2=http://${PATRONI2}:2380,etcd3=http://${PATRONI3}:2380
      - ETCD_INITIAL_CLUSTER_STATE=new
    volumes:
      - ./etcd-data:/home/postgres
    ports:
      - "2379:2379"
      - "2380:2380"

  patroni:
    image: patroni:4.0.6-latest
    container_name: patroni${CURRENT_NODE_NAME}
    restart: always
    command: patroni
    environment:
      PATRONI_NAME: patroni${CURRENT_NODE_NAME}
      PATRONI_SCOPE: demo
      ETCDCTL_ENDPOINTS: http://${PATRONI1}:2379,http://${PATRONI2}:2379,http://${PATRONI3}:2379
      PATRONI_ETCD3_HOSTS: ${PATRONI1}:2379,${PATRONI2}:2379,${PATRONI3}:2379
      PATRONI_POSTGRESQL_LISTEN: 0.0.0.0:5432
      PATRONI_POSTGRESQL_CONNECT_ADDRESS: ${LOCAL_ADDRESS}:5432
      PATRONI_RESTAPI_LISTEN: 0.0.0.0:8008
      PATRONI_RESTAPI_CONNECT_ADDRESS: ${LOCAL_ADDRESS}:8008
      PATRONI_DEMO_LISTEN_ADDRESS: "0.0.0.0:5432"
      PATRONI_DEMO_CONNECT_ADDRESS: "${LOCAL_ADDRESS}:5432"
      PATRONI_POSTGRESQL_PASSWORD: postgres
      PATRONI_POSTGRESQL_USERNAME: postgres
      PATRONI_SUPERUSER_PASSWORD: postgres
      PATRONI_REPLICATION_USERNAME: replicator
      PATRONI_REPLICATION_PASSWORD: replicatorpass
      PATRONI_POSTGRESQL_BIN_DIR: /usr/lib/postgresql/17/bin
    ports:
      - "5432:5432"
      - "8008:8008"
    volumes:
      - ./patroni-data:/home/postgres/data
      - ./patroni.yml:/home/postgres/postgres0.yml
      - ./init.sh:/docker-entrypoint-initdb.d/init.sh
      - ./init.sql:/docker-entrypoint-initdb.d/init.sql

  haproxy:
    image: patroni:4.0.6-latest
    container_name: haproxy${CURRENT_NODE_NAME}
    restart: always
    command: haproxy
    environment: 
      ETCDCTL_ENDPOINTS: http://${PATRONI1}:2379,http://${PATRONI2}:2379,http://${PATRONI3}:2379
      PATRONI_ETCD3_HOSTS: "'${PATRONI1}:2379','${PATRONI2}:2379','${PATRONI3}:2379'"
      PATRONI_SCOPE: demo
    ports:
      - "5000:5000"
      - "5001:5001"
      - "8404:8404"

② .env

CURRENT_NODE_NAME=1
LOCAL_ADDRESS=192.168.13.30
PATRONI1=192.168.13.30
PATRONI2=192.168.13.31
PATRONI3=192.168.13.32

Here CURRENT_NODE_NAME is the index of the current node, LOCAL_ADDRESS is the current node's IP, and PATRONI1/PATRONI2/PATRONI3 are the IPs of nodes 1, 2, and 3; change them to match your own servers.
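Since each of the three servers needs its own .env, a small helper like the following (purely illustrative; the function name is mine, the IPs are the ones used in this post) can print the per-node file:

```shell
# Hypothetical helper: print the .env for node N (1, 2, or 3).
gen_env() {
  local n="$1"
  local ips=(192.168.13.30 192.168.13.31 192.168.13.32)
  printf 'CURRENT_NODE_NAME=%s\n' "$n"
  printf 'LOCAL_ADDRESS=%s\n'     "${ips[n-1]}"
  printf 'PATRONI1=%s\nPATRONI2=%s\nPATRONI3=%s\n' "${ips[@]}"
}

gen_env 2   # the .env for the second server
```

On each server you would redirect the output into /home/patroni_cluster/.env with the matching node number.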

③ haproxy.cfg

global
    maxconn 100

defaults
    log global
    mode tcp
    retries 2
    timeout client 30m
    timeout connect 4s
    timeout server 30m
    timeout check 5s

listen stats
    mode http
    bind *:7000
    stats enable
    stats uri /

listen batman
    bind *:5000
    option httpchk
    http-check expect status 200
    default-server inter 3s fall 3 rise 2 on-marked-down shutdown-sessions
    server postgresql_127.0.0.1_5432 127.0.0.1:5432 maxconn 100 check port 8008
    server postgresql_127.0.0.1_5433 127.0.0.1:5433 maxconn 100 check port 8009
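Note that the file above is essentially the stock Patroni demo config (stats on 7000, two local servers behind port 5000), while docker-compose.yml publishes ports 5000/5001/8404. A config matching those ports on this three-node deployment would presumably look like the sketch below. This is an assumption on my part (the image's haproxy command may generate its own config); the health-check endpoints /primary and /replica are Patroni's REST API, which returns 200 only on the leader and only on replicas, respectively:

```
listen primary
    bind *:5000
    option httpchk GET /primary
    http-check expect status 200
    default-server inter 3s fall 3 rise 2 on-marked-down shutdown-sessions
    server patroni1 192.168.13.30:5432 maxconn 100 check port 8008
    server patroni2 192.168.13.31:5432 maxconn 100 check port 8008
    server patroni3 192.168.13.32:5432 maxconn 100 check port 8008

listen replicas
    bind *:5001
    option httpchk GET /replica
    http-check expect status 200
    balance roundrobin
    default-server inter 3s fall 3 rise 2 on-marked-down shutdown-sessions
    server patroni1 192.168.13.30:5432 maxconn 100 check port 8008
    server patroni2 192.168.13.31:5432 maxconn 100 check port 8008
    server patroni3 192.168.13.32:5432 maxconn 100 check port 8008
```

This way port 5000 always routes to the current leader and 5001 load-balances across replicas, which matches the read-write/read-only split described later.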

④ patroni.yml

#scope: batman
#namespace: /service/
#name: postgresql0

#restapi:
#  listen: 127.0.0.1:8008
#  connect_address: 127.0.0.1:8008
#  cafile: /etc/ssl/certs/ssl-cacert-snakeoil.pem
#  certfile: /etc/ssl/certs/ssl-cert-snakeoil.pem
#  keyfile: /etc/ssl/private/ssl-cert-snakeoil.key
#  authentication:
#    username: username
#    password: password

#ctl:
#  insecure: false # Allow connections to Patroni REST API without verifying certificates
#  certfile: /etc/ssl/certs/ssl-cert-snakeoil.pem
#  keyfile: /etc/ssl/private/ssl-cert-snakeoil.key
#  cacert: /etc/ssl/certs/ssl-cacert-snakeoil.pem

#citus:
#  database: citus
#  group: 0  # coordinator

#etcd:
  #Provide host to do the initial discovery of the cluster topology:
#  host: 127.0.0.1:2379
  #Or use "hosts" to provide multiple endpoints
  #Could be a comma separated string:
  #hosts: host1:port1,host2:port2
  #or an actual yaml list:
  #hosts:
  #- host1:port1
  #- host2:port2
  #Once discovery is complete Patroni will use the list of advertised clientURLs
  #It is possible to change this behavior through by setting:
  #use_proxies: true

#raft:
#  data_dir: .
#  self_addr: 127.0.0.1:2222
#  partner_addrs:
#  - 127.0.0.1:2223
#  - 127.0.0.1:2224

# The bootstrap configuration. Works only when the cluster is not yet initialized.
# If the cluster is already initialized, all changes in the `bootstrap` section are ignored!

bootstrap:
  dcs:
    ttl: 30
    loop_wait: 10
    retry_timeout: 10
    maximum_lag_on_failover: 1048576
    postgresql:
      use_pg_rewind: true
      use_slots: true
  initdb:
    - encoding: UTF8
    - data-checksums
  post_init: /docker-entrypoint-initdb.d/init.sh
  users:
    postgres:
      password: postgres
      options:
        - superuser
        - createdb
  pg_hba:
    - host all all 0.0.0.0/0 md5
    - host replication replicator 0.0.0.0/0 md5

postgresql:
  listen: 127.0.0.1:5432
  connect_address: 127.0.0.1:5432
  data_dir: /var/lib/postgresql/data
  bin_dir: /usr/lib/postgresql/17/bin
  authentication:
    superuser:
      username: postgres
      password: postgres
    replication:
      username: replicator
      password: replicatorpass
  parameters:
    wal_level: replica
    hot_standby: "on"
    wal_keep_segments: 8
    max_wal_senders: 10
    max_replication_slots: 10
    archive_mode: "on"
    archive_timeout: 1800s
    archive_command: 'cd .'

tags:
    # failover_priority: 1
    noloadbalance: false
    clonefrom: false
    nosync: false
    nostream: false

Since most settings are already supplied through docker-compose.yml, this file mainly covers the initialization script and database configuration.

post_init is the script that Patroni runs automatically after bootstrapping the database and connecting to it; users can modify it as needed.

⑤ init.sh

The initialization script; users can modify it as needed.

#!/bin/bash
set -e
psql -U postgres -c "CREATE DATABASE test_init;"
psql -U postgres -d test_init -f /docker-entrypoint-initdb.d/init.sql

⑥ init.sql

The initialization SQL; users can modify it as needed.

DROP SEQUENCE IF EXISTS "public"."test_init_id__seq";
CREATE SEQUENCE "public"."test_init_id__seq" 
INCREMENT 1
MINVALUE  1
MAXVALUE 2147483647
START 1
CACHE 1;

DROP TABLE IF EXISTS "public"."test_init";
CREATE TABLE "public"."test_init" (
  "id" varchar(50) COLLATE "pg_catalog"."default",
  "name" varchar(50) COLLATE "pg_catalog"."default"
);

3. Notes

Before running docker-compose up -d, grant ownership and permissions on the directory and its files.

chown -R 999:999 /home/patroni_cluster
chmod -R 700 /home/patroni_cluster

The etcd-data directory stores etcd data and is mounted through docker-compose.yml. Its three subdirectories (etcd1.etcd | etcd2.etcd | etcd3.etcd) are the data directories of the respective etcd nodes. In normal use there is no need to clear this directory, and the nodes reconnect correctly after a restart. If you do clear the data directory of one etcd node, you must remove the old member from the cluster and add it back; I will cover this later.
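Replacing a wiped member can be sketched roughly as follows (the member ID and name are illustrative examples taken from the `member list` output later in this post; run the commands against a healthy node and adapt them to your own output):

```shell
# 1. On a healthy node, remove the stale member by its hex ID:
docker exec -it etcd2 etcdctl member remove 452b2197e8572737
# 2. Re-register it with its peer URL before restarting it:
docker exec -it etcd2 etcdctl member add etcd1 --peer-urls=http://192.168.13.30:2380
# 3. Restart the wiped node with ETCD_INITIAL_CLUSTER_STATE=existing so it
#    joins the running cluster instead of bootstrapping a new one.
```

Step 3 means temporarily changing ETCD_INITIAL_CLUSTER_STATE from new to existing in docker-compose.yml on the rejoining node.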

The patroni-data directory stores the Patroni (PostgreSQL) data. If you clear it, the post_init initialization script runs again the next time the cluster starts.

IV. Deployment Commands

1. Deployment

cd /home/patroni_cluster

#Check the local Docker images; patroni:4.0.6-latest must be installed
docker image ls

#Grant permissions on the directory
chown -R 999:999 /home/patroni_cluster
chmod -R 700 /home/patroni_cluster

#Start the services
docker-compose up -d haproxy
docker-compose up -d etcd
docker-compose up -d patroni

#Verify the etcd cluster status
#docker exec -it etcd1 etcdctl --write-out=table  member list
source .env && docker exec -it etcd${CURRENT_NODE_NAME} bash -c "etcdctl --write-out=table  member list"


#Check this node's status
#docker exec -it etcd1 etcdctl --write-out=table  endpoint status
source .env && docker exec -it etcd${CURRENT_NODE_NAME} bash -c "etcdctl --write-out=table  endpoint status"

#Verify the database status
#docker exec -it patroni1 patronictl list
source .env && docker exec -it patroni${CURRENT_NODE_NAME} bash -c "patronictl list"


A three-node cluster needs two or more etcd nodes to operate, so the first node you start cannot pass the status checks on its own; verification succeeds only after the second node is up.
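The "two of three" rule is simply Raft majority quorum, floor(n/2) + 1; as a throwaway sanity check:

```shell
# Majority quorum of an n-member etcd/Raft cluster: floor(n/2) + 1.
quorum() { echo $(( $1 / 2 + 1 )); }

quorum 3   # prints 2: a 3-node cluster survives one failure, not two
quorum 5   # prints 3
```

This is also why the deployment uses three etcd nodes rather than two: a 2-node cluster has a quorum of 2 and cannot tolerate any failure.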

2. Verification

Once the nodes have started properly you should see output like the following:

[root@localhost patroni_cluster]# #Verify the etcd cluster status
[root@localhost patroni_cluster]# #docker exec -it etcd1 etcdctl --write-out=table  member list
[root@localhost patroni_cluster]# source .env && docker exec -it etcd${CURRENT_NODE_NAME} bash -c "etcdctl --write-out=table  member list"
+------------------+---------+-------+---------------------------+---------------------------+
|        ID        | STATUS  | NAME  |        PEER ADDRS         |       CLIENT ADDRS        |
+------------------+---------+-------+---------------------------+---------------------------+
| 452b2197e8572737 | started | etcd1 | http://192.168.13.30:2380 | http://192.168.13.30:2379 |
| 780d78ebccec6830 | started | etcd3 | http://192.168.13.32:2380 | http://192.168.13.32:2379 |
| 90417d05415afbb5 | started | etcd2 | http://192.168.13.31:2380 | http://192.168.13.31:2379 |
+------------------+---------+-------+---------------------------+---------------------------+
[root@localhost patroni_cluster]# #Check this node's status
[root@localhost patroni_cluster]# #docker exec -it etcd1 etcdctl --write-out=table  endpoint status
[root@localhost patroni_cluster]# source .env && docker exec -it etcd${CURRENT_NODE_NAME} bash -c "etcdctl --write-out=table  endpoint status"
+----------------+------------------+---------+---------+-----------+-----------+------------+
|    ENDPOINT    |        ID        | VERSION | DB SIZE | IS LEADER | RAFT TERM | RAFT INDEX |
+----------------+------------------+---------+---------+-----------+-----------+------------+
| 127.0.0.1:2379 | 452b2197e8572737 |  3.3.13 |  254 kB |      true |        17 |        690 |
+----------------+------------------+---------+---------+-----------+-----------+------------+
[root@localhost patroni_cluster]# #Verify the database status
[root@localhost patroni_cluster]# #docker exec -it patroni1 patronictl list
[root@localhost patroni_cluster]# source .env && docker exec -it patroni${CURRENT_NODE_NAME} bash -c "patronictl list"
+ Cluster: demo (7519811644213669911) -----------+----+-----------+
| Member   | Host          | Role    | State     | TL | Lag in MB |
+----------+---------------+---------+-----------+----+-----------+
| patroni1 | 192.168.13.30 | Replica | streaming |  1 |         0 |
| patroni2 | 192.168.13.31 | Leader  | running   |  1 |           |
| patroni3 | 192.168.13.32 | Replica | streaming |  1 |         0 |
+----------+---------------+---------+-----------+----+-----------+
[root@localhost patroni_cluster]#

3. Usage

You can also connect via Navicat:

192.168.13.30:5000 (read-write) / 192.168.13.31:5000 (read-write) / 192.168.13.32:5000 (read-write)

192.168.13.30:5001 (read-only) / 192.168.13.31:5001 (read-only) / 192.168.13.32:5001 (read-only)

192.168.13.30:5432 (read-only) / 192.168.13.31:5432 (read-only) / 192.168.13.32:5432 (read-only)
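A quick way to spot-check the routing from the command line (a sketch, assuming psql is installed on the client and the postgres password configured above): port 5000 should always land on the leader and port 5001 on a replica.

```shell
# pg_is_in_recovery() returns f on the leader and t on a replica.
PGPASSWORD=postgres psql -h 192.168.13.30 -p 5000 -U postgres -tAc 'SELECT pg_is_in_recovery();'
PGPASSWORD=postgres psql -h 192.168.13.30 -p 5001 -U postgres -tAc 'SELECT pg_is_in_recovery();'
```

Failing over the leader (for example with patronictl switchover) and re-running the first command is a good end-to-end check that HAProxy follows the leader.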

This completes the installation and deployment of the three-node haproxy + etcd + patroni cluster.
