<div class="artical-content-bak main-content editor-side-new">
<div class="con editor-preview-side" id="result">
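<p>Redis replication solves the single-point-of-failure problem, but if the master node fails, failover still requires manual intervention. First, consider how failover is carried out in a plain Redis master-replica setup with one master and two replicas (master, slave-1 and slave-2).<br></p>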
<p><br></p>
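<p>1. After the master fails, clients can no longer connect to it, and both replicas lose their connection to it, so replication is interrupted.</p>
<p>2. One replica (slave-1) has to be chosen and promoted to new master (new-master) by running the slaveof no one command on it.</p>
<p>3. Once slave-1 has become the new master, the master address on the application side is updated and the application is restarted.</p>
<p>4. A client instructs the other replica (slave-2) to replicate from the new master (new-master).</p>
<p>5. When the original master recovers, it is made to replicate from the new master.</p>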
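<p>The manual steps above can be sketched as an ordered list of redis-cli commands. This is only an illustration: the helper name manual_failover_plan is hypothetical, the ports are the ones used later in this article, authentication options are omitted, and nothing is executed. Step 3 (updating and restarting the application) is not a Redis command and is therefore not part of the plan.</p>

```python
def manual_failover_plan(new_master, other_replicas, old_master):
    """Return the redis-cli commands for a manual failover, in order.

    Each node is a (host, port) tuple. The commands are only built as
    strings; running them is left to the operator.
    """
    host, port = new_master
    # Step 2: promote the chosen replica to master.
    plan = [f"redis-cli -h {host} -p {port} slaveof no one"]
    # Step 4: point every remaining replica at the new master.
    for r_host, r_port in other_replicas:
        plan.append(f"redis-cli -h {r_host} -p {r_port} slaveof {host} {port}")
    # Step 5: only after the old master is back, make it a replica as well.
    o_host, o_port = old_master
    plan.append(f"redis-cli -h {o_host} -p {o_port} slaveof {host} {port}")
    return plan


plan = manual_failover_plan(
    new_master=("127.0.0.1", 6480),
    other_replicas=[("127.0.0.1", 6481)],
    old_master=("127.0.0.1", 6479),
)
for command in plan:
    print(command)
```

<p>Every one of these commands has to be issued by hand, in the right order, against the right node, which is where mistakes and delays creep in.</p>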
<p><br></p>
<p>A manual procedure like this can hardly guarantee accuracy or timeliness, and that is exactly the problem Redis Sentinel is designed to solve.</p>
<p><br></p>
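<p>Redis Sentinel is a distributed architecture made up of several Sentinel nodes and Redis data nodes. Each Sentinel node monitors the data nodes and the other Sentinel nodes, and marks a node as down when it finds that node unreachable. If the node marked down is the master data node, the Sentinel also negotiates with the other Sentinel nodes; once most Sentinel nodes agree that the master is unreachable, they elect one Sentinel node to carry out the automatic failover and notify the Redis application side of the change in real time. The whole process requires no manual intervention and effectively solves the high-availability problem for Redis.</p>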
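<p>The agreement step can be illustrated with a simplified sketch. It is not Sentinel's actual implementation: the function names are hypothetical, and the model assumes only that a master counts as objectively down once at least quorum Sentinels report it unreachable, and that the Sentinel leading the failover needs votes from a majority of all Sentinels (and no fewer than the quorum).</p>

```python
def is_objectively_down(reports, quorum):
    """reports maps a Sentinel name to True if it cannot reach the master."""
    unreachable = sum(1 for down in reports.values() if down)
    return unreachable >= quorum


def can_lead_failover(votes, total_sentinels, quorum):
    """A candidate needs a majority of all Sentinels and at least quorum votes."""
    majority = total_sentinels // 2 + 1
    return votes >= max(majority, quorum)


# Three Sentinels, quorum 2: two of them report the master as unreachable.
reports = {"sentinel-1": True, "sentinel-2": True, "sentinel-3": False}
print(is_objectively_down(reports, quorum=2))
print(can_lead_failover(votes=2, total_sentinels=3, quorum=2))
```

<p>With three Sentinels, one failed Sentinel still leaves enough votes for a failover, which is why an odd number of at least three Sentinel nodes is the usual choice.</p>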
<p><br></p>
<p><br></p>
<p>Deploying a highly available architecture with Redis Sentinel</p>
<p><br></p>
<p>1. Set up three Redis data nodes. Initial state: the master node on port 6479, slave-1 on port 6480, and slave-2 on port 6481.</p>
<p>127.0.0.1:6479> info replication</p>
<p># Replication</p>
<p>role:master</p>
<p>connected_slaves:2</p>
<p>slave0:ip=127.0.0.1,port=6480,state=online,offset=845,lag=0</p>
<p>slave1:ip=127.0.0.1,port=6481,state=online,offset=845,lag=0</p>
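<p>A check like the one above is easy to automate. The following is a minimal sketch that parses the text of info replication as shown; the helper names parse_info and parse_replica are hypothetical, and the sample text is simply the output above pasted into a string rather than fetched from a live server.</p>

```python
def parse_info(text):
    """Turn 'key:value' lines from INFO output into a dict, skipping headers."""
    info = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition(":")
        info[key] = value
    return info


def parse_replica(value):
    """Split 'ip=...,port=...,state=...' into a dict of its fields."""
    return dict(field.split("=", 1) for field in value.split(","))


sample = """# Replication
role:master
connected_slaves:2
slave0:ip=127.0.0.1,port=6480,state=online,offset=845,lag=0
slave1:ip=127.0.0.1,port=6481,state=online,offset=845,lag=0
"""

info = parse_info(sample)
replicas = [parse_replica(info[f"slave{i}"])
            for i in range(int(info["connected_slaves"]))]
print(info["role"], [r["port"] for r in replicas])
```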
<p><br></p>
<p>2. Set up three Sentinel nodes. The initial configuration file is shown below (the three nodes use ports 26479, 26480 and 26481 respectively):</p>
<p>port 26479</p>
<p>daemonize yes</p>
<p>loglevel notice</p>
<p>dir "/home/redis/stayfoolish/26479/data"</p>
<p>logfile "/home/redis/stayfoolish/26479/log/sentinel.log"</p>
<p>pidfile "/home/redis/stayfoolish/26479/log/sentinel.pid"</p>
<p>unixsocket "/home/redis/stayfoolish/26479/log/sentinel.sock"</p>
<p><br></p>
<p># sfmaster</p>
<p>sentinel monitor sfmaster 127.0.0.1 6479 2</p>
<p>sentinel auth-pass sfmaster abcdefg</p>
<p>sentinel down-after-milliseconds sfmaster 30000</p>
<p>sentinel parallel-syncs sfmaster 1</p>
<p>sentinel failover-timeout sfmaster 180000</p>
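<p>Since the three Sentinel configurations differ only in the port, they can be generated from one template. The sketch below assumes that each node's directories are named after its port, as in the 26479 example above; that layout for 26480 and 26481 is an assumption, and the function name render_sentinel_conf is hypothetical.</p>

```python
TEMPLATE = """port {port}
daemonize yes
loglevel notice
dir "{base}/{port}/data"
logfile "{base}/{port}/log/sentinel.log"
pidfile "{base}/{port}/log/sentinel.pid"
unixsocket "{base}/{port}/log/sentinel.sock"

# sfmaster
sentinel monitor sfmaster 127.0.0.1 6479 2
sentinel auth-pass sfmaster abcdefg
sentinel down-after-milliseconds sfmaster 30000
sentinel parallel-syncs sfmaster 1
sentinel failover-timeout sfmaster 180000
"""


def render_sentinel_conf(port, base="/home/redis/stayfoolish"):
    """Fill in the per-node port; every other setting is shared."""
    return TEMPLATE.format(port=port, base=base)


# One configuration text per Sentinel node, keyed by port.
configs = {port: render_sentinel_conf(port) for port in (26479, 26480, 26481)}
print(configs[26480])
```

<p>The trailing 2 on the sentinel monitor line is the quorum: the number of Sentinels that must agree the master is unreachable before it is considered down.</p>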
<p><br></p>
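<p>Start the Sentinel nodes and check their status. The output shows that the Sentinel has located the master, discovered the two replicas, and found all three Sentinel nodes.</p>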
<p>127.0.0.1:26479> info sentinel</p>
<p># Sentinel</p>
<p>sentinel_masters:1</p>
<p>sentinel_tilt:0</p>
<p>sentinel_running_scripts:0</p>
<p>sentinel_scripts_queue_length:0</p>
<p>sentinel_simulate_failure_flags:0</p>
<p>master0:name=sfmaster,status=ok,address=127.0.0.1:6479,slaves=2,sentinels=3</p>
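<p>The master0 line carries everything needed for a quick health check. As a small sketch (the helper name parse_master_line is hypothetical, and the line is copied from the output above rather than read from a running Sentinel):</p>

```python
def parse_master_line(line):
    """Split 'master0:name=...,status=...' into a dict of its fields."""
    _, _, fields = line.partition(":")
    return dict(item.split("=", 1) for item in fields.split(","))


line = "master0:name=sfmaster,status=ok,address=127.0.0.1:6479,slaves=2,sentinels=3"
master = parse_master_line(line)

# The deployment is as expected when the master is ok and the Sentinel
# sees two replicas and three Sentinels in total.
healthy = (master["status"] == "ok"
           and int(master["slaves"]) == 2
           and int(master["sentinels"]) == 3)
print(master["address"], healthy)
```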
<p><br></p>
<p>Redis Sentinel is now up and running. With Redis replication already in place, the setup is fairly straightforward.</p>
<p><br></p>
<p><br></p>
<p>Next, kill the master on port 6479 with kill -9 to simulate a failure, and follow the failover process through the logs.</p>
<p><br></p>
<p>1. Kill the master on port 6479</p>
<p>$ ps -ef | egrep 'redis-server.*6479' | egrep -v 'egrep' | awk '{print $2}' | xargs kill -9</p>
<p><br></p>
<p>127.0.0.1:6479> info replication</p>
<p>Could not connect to Redis at 127.0.0.1:6479: Connection refused</p>
<p>not connected> </p>
<p><br></p>
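<p>2. The log of the Redis node on port 6480 shows that it failed to connect to port 6479, was promoted to new master by a Sentinel node, and then responded to the replication request from port 6481.</p>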
<p>~/stayfoolish/6480/log $ tail -f redis.log</p>
<p>20047:S 22 Jul 03:03:22.946 # Error condition on socket for SYNC: Connection refused</p>
<p>20047:S 22 Jul 03:03:23.954 * Connecting to MASTER 127.0.0.1:6479</p>
<p>20047:S 22 Jul 03:03:23.955 * MASTER <-> SLAVE sync started</p>
<p>20047:S 22 Jul 03:03:23.955 # Error condition on socket for SYNC: Connection refused</p>
<p>...</p>
<p>20047:S 22 Jul 03:03:38.061 * MASTER <-> SLAVE sync started</p>
<p>20047:S 22 Jul 03:03:38.061 # Error condition on socket for SYNC: Connection refused</p>
<p>20047:M 22 Jul 03:03:38.963 * Discarding previously cached master state.</p>
<p>20047:M 22 Jul 03:03:38.963 * MASTER MODE enabled (user request from 'id=27 addr=127.0.0.1:37972 fd=10 name=sentinel-68102904-cmd age=882 idle=0 flags=x db=0 sub=0 psub=0 multi=3 qbuf=0 qbuf-free=32768 obl=36 oll=0 omem=0 events=r cmd=exec')</p>
<p>20047:M 22 Jul 03:03 |
|