MongoDB复制集选举原理及管理详解-创新互联

MongoDB复制集的节点是通过选举产生主节点的,下面将介绍复制集节点间选举的过程

  • MongoDB复制的原理

复制是基于操作日志oplog,相当于MySQL中的二进制日志,只记录发生改变的记录。复制是将主节点的oplog日志同步并应用到其他从节点过程

  • MongoDB选举的原理

节点类型分为标准(host)节点 、被动(passive)节点和仲裁(arbiter)节点。

(1)只有标准节点可能被选举为主(primary)节点,有选举权;被动节点有完整副本,只能作为复制集保存,不可能成为主节点,没有选举权;仲裁节点不存放数据,只负责投票选举,不可能成为主节点,不存放数据,依然没有选举权

(2)标准节点与被动节点的区别:priority值高者是标准节点,低者则为被动节点

(3)选举规则是票数高者获胜,priority是优先权为0~1000的值,相当于额外增加0~1000的票数。选举结果:票数高者获胜;若票数相同,数据新者获胜

  • MongoDB复制集节点间选举如图所示

MongoDB复制集选举原理及管理详解

为安州等地区用户提供了全套网页设计制作服务,及安州网站建设行业解决方案。主营业务为做网站、网站制作、安州网站设计,以传统方式定制建设网站,并提供域名空间备案等一条龙服务,秉承以专业、用心的态度为用户提供真诚的服务。我们深信只要达到每一位用户的要求,就会得到认可,从而选择与我们长期合作。这样,我们也可以走得更远!

下面通过实例来演示MongoDB复制集节点间的选举原理

  • 在一台CentOS7主机上使用yum在线安装Mongodb,并创建多实例,进行部署MongoDB复制集

首先配置网络YUM源,baseurl(下载路径)指定为mongodb官网提供的yum仓库

vim /etc/yum.repos.d/mongodb.repo

[mongodb-org]

name=MongoDB Repository

baseurl=https://repo.mongodb.org/yum/redhat/$releasever/mongodb-org/3.6/x86_64/             #指定获得下载的路径

gpgcheck=1                     #表示对从这个源下载的rpm包进行校验

enabled=1                   #表示启用这个源。

gpgkey=https://www.mongodb.org/static/pgp/server-3.6.asc

重新加载yum源,并使用yum命令下载安装mongodb

yum list

yum -y install mongodb-org

准备4个实例,设置两个标准节点, 一个被动节点和一个仲裁节点   

  • 创建数据文件和日志文件存储路径,并赋予权限

    [root@localhost ~]# mkdir -p /data/mongodb{2,3,4}
    [root@localhost ~]# mkdir /data/logs
    [root@localhost ~]# touch /data/logs/mongodb{2,3,4}.log
    [root@localhost ~]# chmod 777 /data/logs/mongodb*
    [root@localhost ~]# ll /data/logs/
    总用量 0
    -rwxrwxrwx. 1 root root 0 9月  15 22:31 mongodb2.log
    -rwxrwxrwx. 1 root root 0 9月  15 22:31 mongodb3.log
    -rwxrwxrwx. 1 root root 0 9月  15 22:31 mongodb4.log

编辑4个MongoDB实例的配置文件

  • 先编辑yum安装的默认实例的配置文件/etc/mongod.conf,指定监听IP,端口默认为27017,开启replication参数配置,replSetName:true(自定义)

[root@localhost ~]# vim /etc/mongod.conf

# mongod.conf

# for documentation of all options, see:
#   http://docs.mongodb.org/manual/reference/configuration-options/

# where to write logging data.
systemLog:
  destination: file
  logAppend: true
  path: /var/log/mongodb/mongod.log

# Where and how to store data.
storage:
  dbPath: /var/lib/mongo
  journal:
enabled: true
#  engine:
#  mmapv1:
#  wiredTiger:

# how the process runs
processManagement:
  fork: true  # fork and run in background
  pidFilePath: /var/run/mongodb/mongod.pid  # location of pidfile
  timeZoneInfo: /usr/share/zoneinfo

# network interfaces
net: 

port: 27017                    #默认端口          
  bindIp: 0.0.0.0             #监听任意地址

#security:

#operationProfiling:

replication:                   #去掉前面的“#”注释,开启该参数设置
replSetName: true          #设置复制集名称

  • 复制配置文件给其他实例,并将mongodb2.conf 中的port参数配置为27018,mongod3.conf中的port参数配置为27019,mongod4.conf中的port参数配置为27020。 同样也将dbpath和logpath参数修改为对应的路径值

cp  /etc/mongod.conf /etc/mongod2.conf

cp /etc/mongod2.conf /etc/mongod3.conf

cp /etc/mongod2.conf /etc/mongod4.conf

  • 实例2的配置文件mongodb2.conf 修改

vim /etc/mongod2.conf

systemLog:

destination: file

logAppend: true

path: /data/logs/mongodb2.log   

storage:

dbPath: /data/mongodb/mongodb2

journal:

enabled: true

port: 27018 

bindIp: 0.0.0.0  # Listen to local interface only, comment to listen on all interfaces.

#security:

#operationProfiling:

replication:
replSetName: true

  • 实例3的配置文件mongodb3.conf 修改

vim /etc/mongod3.conf

systemLog:

destination: file

logAppend: true

path: /data/logs/mongodb3.log   

storage:

dbPath: /data/mongodb/mongodb3 

journal:

enabled: true

port: 27019 

bindIp: 0.0.0.0  # Listen to local interface only, comment to listen on all interfaces.

#security:

#operationProfiling:

replication:
replSetName: true

  • 实例4的配置文件mongodb4.conf 修改

vim /etc/mongod4.conf

systemLog:

destination: file

logAppend: true

path: /data/logs/mongodb4.log

storage:

dbPath: /data/mongodb/mongodb4

journal:

enabled: true

port: 27020

bindIp: 0.0.0.0  # Listen to local interface only, comment to listen on all interfaces.

#security:

#operationProfiling:

replication:
replSetName: true

启动mongodb各实例

[root@localhost ~]# mongod -f /etc/mongod.conf
about to fork child process, waiting until server is ready for connections.
forked process: 93576
child process started successfully, parent exiting
[root@localhost ~]# mongod -f /etc/mongod2.conf
about to fork child process, waiting until server is ready for connections.
forked process: 93608
child process started successfully, parent exiting
[root@localhost ~]# mongod -f /etc/mongod3.conf
about to fork child process, waiting until server is ready for connections.
forked process: 93636
child process started successfully, parent exiting
[root@localhost ~]# mongod -f /etc/mongod4.conf
about to fork child process, waiting until server is ready for connections.
forked process: 93664
child process started successfully, parent exiting
[root@localhost ~]# netstat -antp | grep mongod                        //查看mongodb进程状态
tcp        0      0 0.0.0.0:27019           0.0.0.0:*               LISTEN      93636/mongod      
tcp        0      0 0.0.0.0:27020           0.0.0.0:*               LISTEN      93664/mongod      
tcp        0      0 0.0.0.0:27017           0.0.0.0:*               LISTEN      93576/mongod      
tcp        0      0 0.0.0.0:27018           0.0.0.0:*               LISTEN      93608/mongod

配置复制集的优先级

  • 登录默认实例 mongo,配置4个节点 MongoDB 复制集,设置两个标准节点,一个被动节点和一个仲裁节点,

  • 根据优先级确定节点: 优先级为 100的为标准节点,端口号为 27017和27018  ,优先级为0 的为被动节点,端口号为27019;仲裁节点为27020

[root@localhost ~]# mongo
MongoDB shell version v3.6.7
connecting to: mongodb://127.0.0.1:27017
MongoDB server version: 3.6.7

> cfg={"_id":"true","members":[{"_id":0,"host":"192.168.195.137:27017","priority":100},                       

{"_id":1,"host":"192.168.195.137:27018","priority":100},{"_id":2,"host":"192.168.195.137:27019","priority":0},{"_id":3,"host":"192.168.195.137:27020","arbiterOnly":true}]}                       
{
"_id" : "true",
"members" : [
  {
   "_id" : 0,
   "host" : "192.168.195.137:27017",                #标准节点1,优先级为100
   "priority" : 100
  },
  {
   "_id" : 1,
   "host" : "192.168.195.137:27018",               #标准节点2,优先级为100
   "priority" : 100
  },
  {
   "_id" : 2,
   "host" : "192.168.195.137:27019",              #被动节点,优先级为0
   "priority" : 0
  },
  {
   "_id" : 3,
   "host" : "192.168.195.137:27020",                #仲裁节点
   "arbiterOnly" : true

> rs.initiate(cfg)                               #初始化配置
{
"ok" : 1,
"operationTime" : Timestamp(1537077618, 1),
"$clusterTime" : {
  "clusterTime" : Timestamp(1537077618, 1),
  "signature" : {
   "hash" : BinData(0,"AAAAAAAAAAAAAAAAAAAAAAAAAAA="),
   "keyId" : NumberLong(0)
  }
}

  • 使用命令 rs.isMaster()  查看各节点身份

true:PRIMARY> rs.isMaster()
{
"hosts" : [
  "192.168.195.137:27017",               #标准节点
  "192.168.195.137:27018"
],
"passives" : [
  "192.168.195.137:27019"              #被动节点
],
"arbiters" : [
  "192.168.195.137:27020"              #仲裁节点
],
"setName" : "true",
"setVersion" : 1,
"ismaster" : true,
"secondary" : false,
"primary" : "192.168.195.137:27017",
"me" : "192.168.195.137:27017",

  • 在主节点上进行增,删,改。查操作

true:PRIMARY> use kfc
switched to db kfc
true:PRIMARY> db.info.insert({"id":1,"name":"tom"})
WriteResult({ "nInserted" : 1 })
true:PRIMARY> db.info.insert({"id":2,"name":"jack"})
WriteResult({ "nInserted" : 1 })
true:PRIMARY> db.info.find()
{ "_id" : ObjectId("5b9df3ff690f4b20fa330b18"), "id" : 1, "name" : "tom" }
{ "_id" : ObjectId("5b9df40f690f4b20fa330b19"), "id" : 2, "name" : "jack

true:PRIMARY> db.info.update({"id":2},{$set:{"name":"lucy"}})
WriteResult({ "nMatched" : 1, "nUpserted" : 0, "nModified" : 1 })
true:PRIMARY> db.info.remove({"id":1})
WriteResult({ "nRemoved" : 1 })

  • 查看主节点的oplog日志记录所有操作 ,在默认数据库 local 中的oplog.rs   查看

true:PRIMARY> use local
switched to db local
true:PRIMARY> show tables
me
oplog.rs
replset.election
replset.minvalid
startup_log
system.replset
system.rollback.id
true:PRIMARY> db.oplog.rs.find()                   #查看日志记录所有操作    

............                                                       # 通过日志记录,可以找到刚才的操作信息

{ "ts" : Timestamp(1537078271, 2), "t" : NumberLong(1), "h" : NumberLong("-5529983416084904509"), "v" : 2, "op" : "c", "ns" : "kfc.$cmd", "ui" : UUID("2de2277f-df99-4fb2-96ef-164b59dfc768"), "wall" : ISODate("2018-09-16T06:11:11.072Z"), "o" : { "create" : "info", "idIndex" : { "v" : 2, "key" : { "_id" : 1 }, "name" : "_id_", "ns" : "kfc.info" } } }
{ "ts" : Timestamp(1537078271, 3), "t" : NumberLong(1), "h" : NumberLong("-1436300260967761649"), "v" : 2, "op" : "i", "ns" : "kfc.info", "ui" : UUID("2de2277f-df99-4fb2-96ef-164b59dfc768"), "wall" : ISODate("2018-09-16T06:11:11.072Z"), "o" : { "_id" : ObjectId("5b9df3ff690f4b20fa330b18"), "id" : 1, "name" : "tom" } }
{ "ts" : Timestamp(1537078287, 1), "t" : NumberLong(1), "h" : NumberLong("9052955074674132871"), "v" : 2, "op" : "i", "ns" : "kfc.info", "ui" : UUID("2de2277f-df99-4fb2-96ef-164b59dfc768"), "wall" : ISODate("2018-09-16T06:11:27.562Z"), "o" : { "_id" : ObjectId("5b9df40f690f4b20fa330b19"), "id" : 2, "name" : "jack" } }

...............

{ "ts" : Timestamp(1537078543, 1), "t" : NumberLong(1), "h" : NumberLong("-5120962218610090442"), "v" : 2, "op" : "u", "ns" : "kfc.info", "ui" : UUID("2de2277f-df99-4fb2-96ef-164b59dfc768"), "o2" : { "_id" : ObjectId("5b9df40f690f4b20fa330b19") }, "wall" : ISODate("2018-09-16T06:15:43.494Z"), "o" : { "$v" : 1, "$set" : { "name" : "lucy" } } }

模拟标准节点1故障

  • 如果主节点出现故障,另一个标准节点会选举成为新的主节点。

[root@localhost ~]# mongod -f /etc/mongod.conf --shutdown            #关闭主节点服务
killing process with pid: 52986
[root@localhost ~]# mongo --port 27018                               #登录另一个标准节点端口 27018

MongoDB shell version v3.6.7
connecting to: mongodb://127.0.0.1:27018/
MongoDB server version: 3.6.7

true:PRIMARY> rs.status()                              #查看状态,可以看到这台标准节点已经选举为主节点

"members" : [
  {
   "_id" : 0,
   "name" : "192.168.195.137:27017",
   "health" : 0,                                              #健康值为 0 ,说明端口27017 已经宕机了
            "state" : 8,
   "stateStr" : "(not reachable/healthy)",
   "uptime" : 0,
   "optime" : {
"ts" : Timestamp(0, 0),
"t" : NumberLong(-1)
   },
   "optimeDurable" : {
"ts" : Timestamp(0, 0),
"t" : NumberLong(-1)
   },

{
   "_id" : 1,
   "name" : "192.168.195.137:27018",
   "health" : 1,              
   "state" : 1,
   "stateStr" : "PRIMARY",                           #此时另一台标准节点被选举为主节点,端口为 27018
            "uptime" : 3192,
   "optime" : {
"ts" : Timestamp(1537080552, 1),
"t" : NumberLong(2)
   },

模拟标准节点2故障

  • 将标准节点服务全部关闭,查看被动节点是否会被选举为主节点

[root@localhost ~]# mongod -f /etc/mongod2.conf --shutdown         #关闭第二个标准节点服务
killing process with pid: 53018
[root@localhost ~]# mongo --port 27019                           #进入第三个被动节点实例
MongoDB shell version v3.6.7
connecting to: mongodb://127.0.0.1:27019/
MongoDB server version: 3.6.7

true:SECONDARY> rs.status()                            #查看复制集状态信息                      

..............

"members" : [
  {
   "_id" : 0,
   "name" : "192.168.195.137:27017",
   "health" : 0,
   "state" : 8,
   "stateStr" : "(not reachable/healthy)",
   "uptime" : 0,
   "optime" : {
"ts" : Timestamp(0, 0),
"t" : NumberLong(-1)
   },
   "optimeDurable" : {
"ts" : Timestamp(0, 0),
"t" : NumberLong(-1)
   },

.................

{
   "_id" : 1,
   "name" : "192.168.195.137:27018",
   "health" : 0,
   "state" : 8,
   "stateStr" : "(not reachable/healthy)",
   "uptime" : 0,
   "optime" : {
"ts" : Timestamp(0, 0),
"t" : NumberLong(-1)
   },
   "optimeDurable" : {
"ts" : Timestamp(0, 0),
"t" : NumberLong(-1)
   },

..................

{
   "_id" : 2,
   "name" : "192.168.195.137:27019",
   "health" : 1,
   "state" : 2,
   "stateStr" : "SECONDARY",                          #被动节点并没有被选举为主节点,说明被动节点不可能成为活跃节点
            "uptime" : 3972,
   "optime" : {
"ts" : Timestamp(1537081303, 1),
"t" : NumberLong(2)
   },

..................

{
  "_id" : 3,
  "name" : "192.168.195.137:27020",
  "health" : 1,
  "state" : 7,
  "stateStr" : "ARBITER",
  "uptime" : 3722,

另外我们可以通过启动标准节点的先后顺序,实现人为指定主节点,默认谁先启动,谁就是主节点。

允许从节点读取数据

  • 默认MongoDB复制集的从节点不能读取数据,可以使用rs.slaveOk()命令允许能够在从节点读取数据

  • 重新启动两个标准节点

[root@localhost ~]# mongod -f /etc/mongod.conf
about to fork child process, waiting until server is ready for connections.
forked process: 54685
child process started successfully, parent exiting
[root@localhost ~]# mongod -f /etc/mongod2.conf
about to fork child process, waiting until server is ready for connections.
forked process: 54773
child process started successfully, parent exiting

  • 进入复制集的其中一个从节点,配置其允许读取数据

[root@localhost ~]# mongo --port 27018
MongoDB shell version v3.6.7
connecting to: mongodb://127.0.0.1:27018/
MongoDB server version: 3.6.7

true:SECONDARY> rs.slaveOk()                           #允许默认从节点读取数据
true:SECONDARY> show dbs              #读取成功
admin   0.000GB
config  0.000GB
kfc     0.000GB
local   0.000GB

查看复制状态信息

  • 可以使用rs.printReplicationInfo()和rs.printSlaveReplicationInfo()命令来查看复制集状态

true:SECONDARY> rs.printReplicationInfo()         #查看日志文件能够使用的大小 默认oplog大小会占用64位实例5%的可用磁盘空间
configured oplog size:  990MB
log length start to end: 5033secs (1.4hrs)
oplog first event time:  Sun Sep 16 2018 14:00:18 GMT+0800 (CST)
oplog last event time:   Sun Sep 16 2018 15:24:11 GMT+0800 (CST)
now:                     Sun Sep 16 2018 15:24:13 GMT+0800 (CST)
true:SECONDARY> rs.printSlaveReplicationInfo()                #查看节点        
source: 192.168.195.137:27018
    syncedTo: Sun Sep 16 2018 15:24:21 GMT+0800 (CST)
0 secs (0 hrs) behind the primary
source: 192.168.195.137:27019
    syncedTo: Sun Sep 16 2018 15:24:21 GMT+0800 (CST)
0 secs (0 hrs) behind the primary

会发现仲裁节点并不具备数据复制

更改oplog大小

  • oplog即operations log简写,存储在local数据库中。oplog中新操作会自动替换旧的操作,以保证oplog不会超过预设的大小。默认情况下,oplog大小会占用64位的实例5%的可用磁盘

  • 在MongoDB复制的过程中,主节点应用业务操作修改到数据库中,然后记录这些操作到oplog中,从节点复制这些oplog,然后应用这些修改。这些操作是异步的。如果从节点的操作已经被主节点落下很远,oplog日志在从节点还没执行完,oplog可能已经轮滚一圈了,从节点跟不上同步,复制就会停下,从节点需要重新做完整的同步,为了避免此种情况,尽量保证主节点的oplog足够大,能够存放相当长时间的操作记录

  • (1)关闭mongodb

true:PRIMARY> use admin
switched to db admin
true:PRIMARY> db.shutdownServer()

  • (2)修改配置文件,注销掉replication相关设置,并修改端口号,目的使其暂时脱离复制集成为一个独立的单体,

vim /etc/mongod.conf

port: 27027

#replication:

# replSetName: true

  • (3)单实例模式启动,并将之前的oplog备份一下

mongod -f /etc/mongod.conf

mongodump --port=27028 -d local -c oplog.rs -o /opt/

  • (4)进入实例中,删除掉原来的oplog.rs,使用db.runCommand命令重新创建oplog.rs,并更改oplog大小

[root@localhost logs]# mongo --port 27027

> use local
> db.oplog.rs.drop()
> db.runCommand( { create: "oplog.rs", capped: true, size: (2 * 1024 * 1024 * 1024) } )

  • (5)关闭mongodb服务,重新将配置文件项改回原来设置,并添加设置oplogSizeMB: 2048


    > use admin
    > db.shutdownServer()

    另外有需要云服务器可以了解下创新互联cdcxhl.cn,海内外云服务器15元起步,三天无理由+7*72小时售后在线,公司持有idc许可证,提供“云服务器、裸金属服务器、高防服务器、香港服务器、美国服务器、虚拟主机、免备案服务器”等云主机租用服务以及企业上云的综合解决方案,具有“安全稳定、简单易用、服务可用性高、性价比高”等特点与优势,专为企业上云打造定制,能够满足用户丰富、多元化的应用场景需求。


    分享名称:MongoDB复制集选举原理及管理详解-创新互联
    新闻来源:http://csdahua.cn/article/cogdii.html
扫二维码与项目经理沟通

我们在微信上24小时期待你的声音

解答本文疑问/技术咨询/运营咨询/技术建议/互联网交流