Redis Clustering and High Availability: Redis Sentinel

Redis offers three modes for clustering and high availability:

  • Redis master-slave replication
  • Redis Sentinel
  • Redis Cluster

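Redis Sentinel

How Sentinel performs a failover

  1. Multiple sentinels detect and agree that the master is down
  2. The sentinels elect one of themselves as leader
  3. The leader selects one slave to promote to master
  4. The remaining slaves are told to replicate from the new master
  5. Clients are notified of the master change
  6. When the old master comes back, it is reconfigured as a slave of the new master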
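  • The Redis nodes in a Sentinel deployment are ordinary Redis instances. Sentinel does not route requests, so read/write splitting has to be implemented in the client program

  • Sentinel is similar to MHA in MySQL: it only handles automatic failover of the master and slave roles and does nothing about the performance bottleneck of a single master

  • Before Redis 3.0, production environments mostly relied on Sentinel mode. Redis 3.0 introduced Redis Cluster, which supports larger, higher-concurrency workloads

In short, Sentinel only provides high availability through automatic failover. To get past the limits of a single master and raise overall throughput, use a distributed cluster solution.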
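Step 3 of the failover process (choosing which slave to promote) follows an ordering described in the Redis Sentinel documentation: slaves with replica-priority 0 are never promoted, a lower replica-priority wins, then the larger replication offset, then the lexicographically smaller run ID. The following is a minimal sketch of that ordering only, not Sentinel's implementation. It ignores the extra check that skips slaves which have been disconnected from the master for too long, and the input format, the function name `pick_replica` and the run IDs are made up for illustration.

```shell
#!/bin/sh
# pick_replica: reads candidate lines on stdin in the (hypothetical) form
#   <ip> <replica-priority> <replication-offset> <run-id>
# and prints the IP of the slave that would be preferred for promotion.
pick_replica() {
    # 1. drop slaves with priority 0 (never promoted)
    # 2. sort by priority ascending, offset descending, run ID ascending
    awk '$2 != 0' |
        LC_ALL=C sort -t' ' -k2,2n -k3,3nr -k4,4 |
        head -n 1 |
        cut -d' ' -f1
}

# Hypothetical candidates: equal priority, so the larger offset decides.
pick_replica <<'EOF'
192.168.23.42 100 481239 bbbb
192.168.23.43 100 481380 aaaa
EOF
# prints 192.168.23.43
```

Setting `replica-priority 0` in a slave's redis.conf is the usual way to keep a node from ever being promoted.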

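Sentinel configuration example:

Prerequisites: master-slave replication must already be working. masterauth must be set in the master's configuration file as well, with the same password as on the slaves, because a demoted master has to authenticate to the new master. Run at least three sentinel nodes, preferably an odd number.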
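The "at least three, preferably odd" rule comes from two separate thresholds: the quorum decides when the master is marked objectively down, while the failover itself has to be authorized by a majority of all known sentinels. The sketch below works through that arithmetic; the function names `majority` and `can_failover` are illustrative and not part of Redis.

```shell
#!/bin/sh
# majority N: smallest number of sentinels that forms a majority of N.
majority() {
    echo $(( $1 / 2 + 1 ))
}

# can_failover TOTAL ALIVE QUORUM
# Succeeds when the reachable sentinels can both mark the master as
# objectively down (ALIVE >= QUORUM) and elect a leader to run the
# failover (ALIVE >= majority of TOTAL).
can_failover() {
    total=$1 alive=$2 quorum=$3
    [ "$alive" -ge "$quorum" ] && [ "$alive" -ge "$(majority "$total")" ]
}

# How many sentinel failures each deployment size tolerates.
for total in 2 3 4 5; do
    echo "sentinels=$total majority=$(majority "$total") tolerated_failures=$(( total - $(majority "$total") ))"
done
```

Two sentinels tolerate no failures and four tolerate no more than three do, which is why an even count adds cost without adding resilience.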

Redis was compiled and installed from source with a script on the following hosts:

master host41 192.168.23.41
slave host42 192.168.23.42
slave host43 192.168.23.43
1. Set up master-slave replication
# On the master: edit the configuration file and restart the service. masterauth and requirepass must be identical
sed -i -e 's/bind 127.0.0.1/bind 0.0.0.0/' \
-e 's/^# masterauth .*$/masterauth 12345678/' \
-e 's/^# requirepass .*$/requirepass 12345678/' \
/apps/redis/etc/redis.conf ;\
systemctl restart redis
# On both slaves: edit the configuration file and restart the service. masterauth and requirepass must be identical
sed -i -e 's/bind 127.0.0.1/bind 0.0.0.0/' \
-e 's/^# masterauth .*$/masterauth 12345678/' \
-e 's/^# requirepass .*$/requirepass 12345678/' \
-e 's/^# replicaof .*$/replicaof 192.168.23.41 6379/' \
/apps/redis/etc/redis.conf ;\
systemctl restart redis

# Check a slave
192.168.23.42:6379> INFO REPLICATION
# Replication
role:slave
master_host:192.168.23.41
master_port:6379
master_link_status:up
master_last_io_seconds_ago:8
master_sync_in_progress:0
slave_read_repl_offset:98
slave_repl_offset:98
slave_priority:100
slave_read_only:1
replica_announced:1
connected_slaves:0
master_failover_state:no-failover
master_replid:91831adacc78acd95246d7bf4a38c048c37b0931
master_replid2:0000000000000000000000000000000000000000
master_repl_offset:98
second_repl_offset:-1
repl_backlog_active:1
repl_backlog_size:1048576
repl_backlog_first_byte_offset:1
repl_backlog_histlen:98

# Check the master
192.168.23.41:6379> INFO REPLICATION
# Replication
role:master
connected_slaves:2
slave0:ip=192.168.23.42,port=6379,state=online,offset=70,lag=0
slave1:ip=192.168.23.43,port=6379,state=online,offset=70,lag=0
master_failover_state:no-failover
master_replid:91831adacc78acd95246d7bf4a38c048c37b0931
master_replid2:0000000000000000000000000000000000000000
master_repl_offset:70
second_repl_offset:-1
repl_backlog_active:1
repl_backlog_size:1048576
repl_backlog_first_byte_offset:1
repl_backlog_histlen:70
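Reading the INFO REPLICATION output by eye works for three nodes, but the same check can be scripted. This is a minimal sketch that looks only at the `role` and `master_link_status` fields shown above; the function name `replica_healthy` is an illustrative choice.

```shell
#!/bin/sh
# replica_healthy: reads INFO REPLICATION output on stdin and succeeds
# only if the node reports role:slave with master_link_status:up.
replica_healthy() {
    # redis-cli emits CRLF line endings when piped, so strip the CR first.
    tr -d '\r' | awk -F: '
        $1 == "role"               { role = $2 }
        $1 == "master_link_status" { link = $2 }
        END { exit !(role == "slave" && link == "up") }
    '
}

# Example usage against a live slave (not run here):
#   redis-cli -h 192.168.23.42 -a 12345678 info replication | replica_healthy \
#       && echo "replication OK"
```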
2. Create the Sentinel configuration file (needed on every node)
# Copy sentinel.conf from the source tree to the install directory
root@host41:~# cp /usr/local/src/redis-7.2.4/sentinel.conf /apps/redis/etc/
root@host41:~# chown redis:redis /apps/redis/etc/sentinel.conf

# Change the following settings
sed -i -e 's/^sentinel monitor mymaster .*$/sentinel monitor mymaster 192.168.23.41 6379 2/' \
-e 's/^# sentinel auth-pass mymaster .*$/sentinel auth-pass mymaster 12345678/' \
-e 's/^sentinel down-after-milliseconds .*$/sentinel down-after-milliseconds mymaster 3000/' \
/apps/redis/etc/sentinel.conf

# For a source-compiled install, sentinel.conf should end up with the settings below.
# Redis configuration files do not accept trailing comments, so each note sits on its own line.
# Run as a background daemon
daemonize yes
pidfile /apps/redis/run/redis-sentinel.pid
logfile /apps/redis/log/redis-sentinel.log
dir /apps/redis/data/
# Initial master address and port, followed by the quorum
sentinel monitor mymaster 192.168.23.41 6379 2
# Password of the monitored master
sentinel auth-pass mymaster 12345678
# Milliseconds without a valid reply before the master is considered subjectively down
sentinel down-after-milliseconds mymaster 3000

# Copy the configuration file to the other nodes
root@host41:~# scp /apps/redis/etc/sentinel.conf 192.168.23.42:/apps/redis/etc/
root@host41:~# scp /apps/redis/etc/sentinel.conf 192.168.23.43:/apps/redis/etc/
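Because the sed expressions silently do nothing when a pattern fails to match, it may be worth confirming that the three directives actually landed in the file before starting the service. The helper below is a hypothetical sketch, not a Redis tool: `check_sentinel_conf` and its messages are invented here. It also flags trailing `#` comments on directive lines, which the Redis configuration parser does not accept.

```shell
#!/bin/sh
# check_sentinel_conf FILE [MASTER_NAME]
# Prints one line per problem found and returns non-zero if there is any.
check_sentinel_conf() {
    conf=$1 name=${2:-mymaster} rc=0
    for directive in monitor auth-pass down-after-milliseconds; do
        if ! grep -Eq "^sentinel[[:space:]]+$directive[[:space:]]+$name[[:space:]]" "$conf"; then
            echo "missing: sentinel $directive $name"
            rc=1
        fi
    done
    # Comments are only valid on lines of their own.
    if grep -En '^[^#[:space:]][^#]*[[:space:]]#' "$conf"; then
        echo "trailing comments found on the lines above"
        rc=1
    fi
    return $rc
}

# Example usage (not run here):
#   check_sentinel_conf /apps/redis/etc/sentinel.conf mymaster && echo "sentinel.conf looks sane"
```

The trailing-comment check is deliberately crude and would also flag a password that contains a space followed by `#`.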
3. Configure and start the service on every node
# Optionally test startup from the command line
root@host41:~# /apps/redis/bin/redis-sentinel /apps/redis/etc/sentinel.conf

# Create the systemd unit file with cat (the quoted 'EOF' keeps $MAINPID literal)
cat > /lib/systemd/system/redis-sentinel.service <<'EOF'
[Unit]
Description=Redis Sentinel
Documentation=https://redis.io/documentation
Wants=network-online.target
After=network-online.target

[Service]
ExecStart=/apps/redis/bin/redis-sentinel /apps/redis/etc/sentinel.conf --supervised systemd
ExecStop=/bin/kill -s QUIT $MAINPID
LimitNOFILE=1000000
NoNewPrivileges=yes
Type=notify
TimeoutStartSec=infinity
TimeoutStopSec=infinity
User=redis
Group=redis
RuntimeDirectory=redis
RuntimeDirectoryMode=0755

[Install]
WantedBy=multi-user.target
EOF

# If you test-started Sentinel from the command line as root, reset ownership afterwards
root@host41:/apps/redis/log# chown -R redis:redis /apps/redis/

# Start the service
root@host41:~# systemctl daemon-reload
root@host41:~# systemctl enable --now redis-sentinel
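When a unit file is written through a shell heredoc, `$MAINPID` has to reach the file literally, because systemd substitutes it at runtime. The sketch below contrasts an unquoted and a quoted heredoc delimiter. `MAINPID` is set to an empty string here only to mimic an interactive shell where the variable has no value.

```shell
#!/bin/sh
MAINPID=""   # empty in an interactive shell; systemd supplies the real value

# Unquoted delimiter: the shell expands $MAINPID before cat sees the text.
unquoted=$(cat <<EOF
ExecStop=/bin/kill -s QUIT $MAINPID
EOF
)

# Quoted delimiter: the text is passed through unchanged.
quoted=$(cat <<'EOF'
ExecStop=/bin/kill -s QUIT $MAINPID
EOF
)

echo "unquoted: [$unquoted]"
echo "quoted:   [$quoted]"
```

Running `grep MAINPID /lib/systemd/system/redis-sentinel.service` after creating the unit shows which form ended up on disk.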
4. Verify the Sentinel service
# Check that port 26379 is listening
root@host41:~# ss -ntl
State Recv-Q Send-Q Local Address:Port Peer Address:Port Process
LISTEN 0 511 0.0.0.0:6379 0.0.0.0:*
LISTEN 0 128 0.0.0.0:22 0.0.0.0:*
LISTEN 0 511 0.0.0.0:26379 0.0.0.0:*
LISTEN 0 1024 127.0.0.53%lo:53 0.0.0.0:*
LISTEN 0 128 [::]:22 [::]:*
LISTEN 0 511 [::1]:6379 [::]:*
LISTEN 0 511 [::]:26379 [::]:*

# View the log
root@host41:~# tail -f /apps/redis/log/redis-sentinel.log
1167:X 19 Jun 2024 02:23:39.819 * Redis version=7.2.4, bits=64, commit=00000000, modified=0, pid=1167, just started
1167:X 19 Jun 2024 02:23:39.819 * Configuration loaded
1167:X 19 Jun 2024 02:23:39.819 * monotonic clock: POSIX clock_gettime
1167:X 19 Jun 2024 02:23:39.864 * Running mode=sentinel, port=26379.
1167:X 19 Jun 2024 02:23:39.865 * Sentinel ID is a42023cc2a11b727d57e4152df7e571e0d2c3c4f
1167:X 19 Jun 2024 02:23:39.865 # +monitor master mymaster 192.168.23.41 6379 quorum 2
1167:X 19 Jun 2024 02:23:42.859 # +sdown sentinel 1054a8b0c6a2c3c2d2bd9e99976cdebdae73eca2 192.168.23.42 26379 @ mymaster 192.168.23.41 6379
1167:X 19 Jun 2024 02:23:42.859 # +sdown sentinel 5c62cfa404d82ff4011f4a0f0d3146ee37a267e9 192.168.23.43 26379 @ mymaster 192.168.23.41 6379
1167:X 19 Jun 2024 02:25:33.924 # -sdown sentinel 1054a8b0c6a2c3c2d2bd9e99976cdebdae73eca2 192.168.23.42 26379 @ mymaster 192.168.23.41 6379
1167:X 19 Jun 2024 02:34:47.798 # -sdown sentinel 5c62cfa404d82ff4011f4a0f0d3146ee37a267e9 192.168.23.43 26379 @ mymaster 192.168.23.41 6379

# View the Sentinel info
root@host41:~# redis-cli -p 26379
127.0.0.1:26379> info sentinel
# Sentinel
sentinel_masters:1
sentinel_tilt:0
sentinel_tilt_since_seconds:-1
sentinel_running_scripts:0
sentinel_scripts_queue_length:0
sentinel_simulate_failure_flags:0
master0:name=mymaster,status=ok,address=192.168.23.41:6379,slaves=2,sentinels=3
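The `master0:` line is the quickest place to see which address Sentinel currently treats as the master. As a small sketch, the address can be pulled out of that line for use in scripts; `master_addr` is an illustrative name, and the sample input is copied from the output above.

```shell
#!/bin/sh
# master_addr NAME: reads INFO SENTINEL output on stdin and prints the
# ip:port that Sentinel currently reports for the named master.
master_addr() {
    tr -d '\r' | sed -n "s/^master[0-9]*:name=$1,.*address=\([^,]*\),.*/\1/p"
}

master_addr mymaster <<'EOF'
sentinel_masters:1
master0:name=mymaster,status=ok,address=192.168.23.41:6379,slaves=2,sentinels=3
EOF
# prints 192.168.23.41:6379
```

Client libraries normally ask Sentinel directly with `SENTINEL get-master-addr-by-name mymaster` rather than parsing INFO output; the parsing approach is shown only because it matches the output displayed here.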
5. Verify failover
# Stop the services on the current master (192.168.23.41)
root@host41:~# systemctl stop redis
root@host41:~# systemctl stop redis-sentinel

# Watch the Sentinel status from a slave: the master moves from 192.168.23.41 to 192.168.23.43
root@host42:~# redis-cli -a 12345678 -p 26379
127.0.0.1:26379> info sentinel
# Sentinel
sentinel_masters:1
sentinel_tilt:0
sentinel_tilt_since_seconds:-1
sentinel_running_scripts:0
sentinel_scripts_queue_length:0
sentinel_simulate_failure_flags:0
master0:name=mymaster,status=ok,address=192.168.23.41:6379,slaves=2,sentinels=3

127.0.0.1:26379> info sentinel
# Sentinel
sentinel_masters:1
sentinel_tilt:0
sentinel_tilt_since_seconds:-1
sentinel_running_scripts:0
sentinel_scripts_queue_length:0
sentinel_simulate_failure_flags:0
master0:name=mymaster,status=ok,address=192.168.23.43:6379,slaves=2,sentinels=3

# Sentinel log on host42
root@host42:~# cat /apps/redis/log/redis-sentinel.log
656:X 19 Jun 2024 03:08:48.074 # +sdown master mymaster 192.168.23.41 6379
656:X 19 Jun 2024 03:08:48.537 * Sentinel new configuration saved on disk
656:X 19 Jun 2024 03:08:48.537 # +new-epoch 3
656:X 19 Jun 2024 03:08:48.646 * Sentinel new configuration saved on disk
656:X 19 Jun 2024 03:08:48.646 # +vote-for-leader a42023cc2a11b727d57e4152df7e571e0d2c3c4f 3
656:X 19 Jun 2024 03:08:49.224 # +odown master mymaster 192.168.23.41 6379 #quorum 3/2
656:X 19 Jun 2024 03:08:49.224 * Next failover delay: I will not start a failover before Wed Jun 19 03:14:48 2024
656:X 19 Jun 2024 03:08:49.530 # +config-update-from sentinel a42023cc2a11b727d57e4152df7e571e0d2c3c4f 192.168.23.41 26379 @ mymaster 192.168.23.41 6379
656:X 19 Jun 2024 03:08:49.530 # +switch-master mymaster 192.168.23.41 6379 192.168.23.43 6379
656:X 19 Jun 2024 03:08:49.531 * +slave slave 192.168.23.42:6379 192.168.23.42 6379 @ mymaster 192.168.23.43 6379
656:X 19 Jun 2024 03:08:49.531 * +slave slave 192.168.23.41:6379 192.168.23.41 6379 @ mymaster 192.168.23.43 6379
656:X 19 Jun 2024 03:08:49.913 * Sentinel new configuration saved on disk
656:X 19 Jun 2024 03:08:52.547 # +sdown slave 192.168.23.41:6379 192.168.23.41 6379 @ mymaster 192.168.23.43 6379
656:X 19 Jun 2024 03:09:06.525 # +sdown sentinel a42023cc2a11b727d57e4152df7e571e0d2c3c4f 192.168.23.41 26379 @ mymaster 192.168.23.43 6379
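In this log, `+sdown` marks the subjective-down decision, `+odown` the objective-down decision once the quorum agrees, and `+switch-master` the moment the new master address takes effect. A minimal sketch for extracting the most recent switch from a log follows; the function name `last_switch` and the output format are illustrative choices, and the sample lines are taken from the log above.

```shell
#!/bin/sh
# last_switch: reads a Sentinel log on stdin and prints the most recent
# +switch-master event as
#   <master-name> <old-ip>:<old-port> -> <new-ip>:<new-port>
# Returns non-zero if the log contains no such event.
last_switch() {
    awk '
        {
            for (i = 1; i <= NF; i++)
                if ($i == "+switch-master")
                    last = $(i+1) " " $(i+2) ":" $(i+3) " -> " $(i+4) ":" $(i+5)
        }
        END { if (last != "") print last; else exit 1 }
    '
}

last_switch <<'EOF'
656:X 19 Jun 2024 03:08:49.224 # +odown master mymaster 192.168.23.41 6379 #quorum 3/2
656:X 19 Jun 2024 03:08:49.530 # +switch-master mymaster 192.168.23.41 6379 192.168.23.43 6379
EOF
# prints mymaster 192.168.23.41:6379 -> 192.168.23.43:6379
```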
# Restart redis and redis-sentinel on 192.168.23.41 and check the log
root@host41:~# systemctl start redis redis-sentinel

root@host41:~# cat /apps/redis/log/redis-sentinel.log
856:X 19 Jun 2024 03:16:39.889 * Supervised by systemd. Please make sure you set appropriate values for TimeoutStartSec and TimeoutStopSec in your service unit.
856:X 19 Jun 2024 03:16:39.889 * oO0OoO0OoO0Oo Redis is starting oO0OoO0OoO0Oo
856:X 19 Jun 2024 03:16:39.889 * Redis version=7.2.4, bits=64, commit=00000000, modified=0, pid=856, just started
856:X 19 Jun 2024 03:16:39.890 * Configuration loaded
856:X 19 Jun 2024 03:16:39.891 * monotonic clock: POSIX clock_gettime
856:X 19 Jun 2024 03:16:39.892 * Running mode=sentinel, port=26379.
856:X 19 Jun 2024 03:16:39.893 * Sentinel ID is a42023cc2a11b727d57e4152df7e571e0d2c3c4f
856:X 19 Jun 2024 03:16:39.893 # +monitor master mymaster 192.168.23.43 6379 quorum 2
# The replicaof setting in the configuration files has been rewritten automatically to point at 192.168.23.43
root@host41:~# grep ^replicaof /apps/redis/etc/redis.conf
replicaof 192.168.23.43 6379

root@host42:~# grep ^replicaof /apps/redis/etc/redis.conf
replicaof 192.168.23.43 6379

# The monitored IP in the Sentinel configuration files has changed as well
root@host41:~# grep 192.168 /apps/redis/etc/sentinel.conf
sentinel monitor mymaster 192.168.23.43 6379 2

root@host42:~# grep 192.168 /apps/redis/etc/sentinel.conf
sentinel monitor mymaster 192.168.23.43 6379 2
# Check replication status on the new master (192.168.23.43)
root@host43:~# redis-cli -a 12345678
Warning: Using a password with '-a' or '-u' option on the command line interface may not be safe.
127.0.0.1:6379> info replication
# Replication
role:master
connected_slaves:2
slave0:ip=192.168.23.42,port=6379,state=online,offset=481380,lag=0
slave1:ip=192.168.23.41,port=6379,state=online,offset=481239,lag=0
master_failover_state:no-failover
master_replid:21bca4b1979437ee3e37477174274c20a59a8f33
master_replid2:39c37024640c52126cbbc6ab64e39e0b5d489024
master_repl_offset:481380
second_repl_offset:231581
repl_backlog_active:1
repl_backlog_size:1048576
repl_backlog_first_byte_offset:3318
repl_backlog_histlen:478063
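The per-slave `offset` values can be compared with `master_repl_offset` to see how many bytes each slave is behind. The sketch below does that subtraction. `replica_lag` is an illustrative name, and the parsing assumes IPv4 addresses, since it splits each line on the first colon.

```shell
#!/bin/sh
# replica_lag: reads INFO REPLICATION output from a master on stdin and
# prints "<ip> <bytes-behind>" for every connected slave, sorted by IP.
replica_lag() {
    tr -d '\r' | awk -F: '
        $1 == "master_repl_offset" { master = $2 }
        $1 ~ /^slave[0-9]+$/ {
            n = split($2, kv, ",")
            for (i = 1; i <= n; i++) {
                split(kv[i], pair, "=")
                if (pair[1] == "ip")     ip[$1]  = pair[2]
                if (pair[1] == "offset") off[$1] = pair[2]
            }
        }
        END { for (s in ip) print ip[s], master - off[s] }
    ' | sort
}

# Example usage on the master (not run here):
#   redis-cli -a 12345678 info replication | replica_lag
```

Fed the output above, this reports 0 bytes for 192.168.23.42 and 141 bytes for 192.168.23.41 (481380 minus 481239).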

Source: https://www.xcjyc.top/2024/06/14/Redis集群与高可用-redis哨兵/
Author: XCJYC
Published: June 14, 2024