Backing Up and Restoring the etcd Database of a Kubernetes Cluster
1. Installing the etcdctl Command-Line Tool
etcdctl differs slightly between etcd versions, but the commands are largely the same; here we take backups with `snapshot save`.
A few points to note:
1. The backup only needs to be run on one node of the etcd cluster.
2. We use the etcd v3 API: since Kubernetes 1.13, k8s no longer supports etcd v2, so all cluster data lives in a v3 etcd. The backup therefore only covers data written through the v3 API; any data written via the v2 API is not backed up. (In the commands below, "ETCDCTL_API=3 etcdctl" is equivalent to plain "etcdctl" on client versions where v3 is already the default API.)
yum install -y etcd
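To confirm that the installed client actually speaks the v3 API, a quick check can be run first. This is a hedged sketch: the exact output format differs between etcdctl releases, and the fallback message covers hosts where the install above has not been run yet.

```shell
# Print the client version; with ETCDCTL_API=3 the "version" subcommand
# is available on etcd 3.x clients.
if command -v etcdctl >/dev/null 2>&1; then
  ETCDCTL_API=3 etcdctl version 2>/dev/null || ETCDCTL_API=3 etcdctl --version
else
  echo "etcdctl not installed"
fi
```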
2. etcd Data Backup and Restore
1. Where the data lives
By default etcd stores its data under /var/lib/etcd/member/. Inside that directory the data is split into two folders:
snap: snapshot files. etcd takes these snapshots to keep the WAL from growing without bound; they store the state of the etcd data.
wal: the write-ahead log, which records the entire history of data changes. In etcd, every modification must be written to the WAL before it is committed.
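The two folders can be inspected directly on a node that runs etcd. A small sketch (on hosts that have never run etcd it simply reports the paths as missing):

```shell
# List the snap/ and wal/ subdirectories of the default etcd data dir.
DATA_DIR=/var/lib/etcd/member
for sub in snap wal; do
  if [ -d "$DATA_DIR/$sub" ]; then
    printf '%s:\n' "$sub"
    ls -lh "$DATA_DIR/$sub" | head -5
  else
    echo "$DATA_DIR/$sub not present on this host"
  fi
done
```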
2. Preparation:
# Back up the /etc/kubernetes directory
cp -r /etc/kubernetes/ /etc/kubernetes_bak/
# Back up the /var/lib/etcd directory
cp -r /var/lib/etcd/ /var/lib/etcd_bak/
# Back up the /var/lib/kubelet directory
cp -r /var/lib/kubelet/ /var/lib/kubelet_bak/
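The three copies above can be wrapped into one date-stamped script. This is only a sketch: the BACKUP_ROOT location is an assumption (override it as needed), and directories that do not exist on the host are skipped so the script also runs on non-master nodes.

```shell
#!/bin/sh
# Copy the three state directories into one date-stamped backup root.
BACKUP_ROOT="${BACKUP_ROOT:-./k8s-backup-$(date +%Y%m%d%H%M)}"   # assumption: adjust to taste
mkdir -p "$BACKUP_ROOT"
for dir in /etc/kubernetes /var/lib/etcd /var/lib/kubelet; do
  if [ -d "$dir" ]; then
    cp -r "$dir" "$BACKUP_ROOT/$(basename "$dir")_bak" 2>/dev/null \
      || echo "warn: could not fully copy $dir" >&2
  else
    echo "skip: $dir not found" >&2
  fi
done
echo "backup written to $BACKUP_ROOT"
```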
3. Single-Node etcd Backup and Restore
For a single node, a plain file-based backup is enough. A default kubeadm install writes etcd's data to /var/lib/etcd/ on the host. Back up the files in this directory periodically; if the etcd data is ever corrupted, restoring the files into this directory is all it takes to recover the single-node etcd data.
Note: the files cannot be overwritten while the etcd container is running. In that case, rename the /etc/kubernetes/manifests folder first (the kubelet then stops the static pods), replace the data files, and rename the folder back; the etcd container will be restarted automatically after a short while. See section 3.1, "Kubeadm-Installed Single-Master Cluster".
3.1 Kubeadm-Installed Single-Master Cluster
1. Backup
v3 API: back up the etcd data (with ETCDCTL_API=3) into the current directory.
ETCDCTL_API=3 etcdctl --endpoints="https://127.0.0.1:2379" --cert="/etc/kubernetes/pki/etcd/server.crt" --key="/etc/kubernetes/pki/etcd/server.key" --cacert="/etc/kubernetes/pki/etcd/ca.crt" snapshot save ./snap-$(date +%Y%m%d%H%M).db
Notes:
1) ETCDCTL_API=3 selects the v3 API of etcd.
2) The endpoints can be found with the command below. There are usually two IPs: 127.0.0.1 and the host's LAN IP, e.g.:
[root@app01 ~]# kubectl describe pod etcd-app01 -n kube-system | grep listen-client-urls
--listen-client-urls=https://127.0.0.1:2379,https://192.168.180.45:2379
2. Restore
1) First stop the kube-apiserver and etcd containers
mv /etc/kubernetes/manifests /etc/kubernetes/manifests_bak
rm -rf /var/lib/etcd
2) Restore the snapshot
ETCDCTL_API=3 etcdctl --endpoints="https://127.0.0.1:2379" --cert="/etc/kubernetes/pki/etcd/server.crt" --key="/etc/kubernetes/pki/etcd/server.key" --cacert="/etc/kubernetes/pki/etcd/ca.crt" snapshot restore snap-202208251559.db --data-dir=/var/lib/etcd
3) Start the kube-apiserver and etcd containers again
mv /etc/kubernetes/manifests_bak /etc/kubernetes/manifests
4) Check whether the pods are back to normal
kubectl get pod -n kube-system
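Before trusting a snapshot file in the first place, `etcdctl snapshot status` can report its hash, revision, key count and size. A hedged sketch (the filename matches the example snapshot above; it requires etcdctl on the host and quietly skips otherwise):

```shell
# Verify a snapshot file's integrity metadata before relying on it.
SNAP=snap-202208251559.db
if command -v etcdctl >/dev/null 2>&1 && [ -f "$SNAP" ]; then
  ETCDCTL_API=3 etcdctl snapshot status "$SNAP" --write-out=table
else
  echo "etcdctl or $SNAP not available on this host; skipping check"
fi
```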
3.2 Backup and Restore of a Binary-Installed etcd Database (untested)
1. Backup
v3 API:
ETCDCTL_API=3 etcdctl snapshot save snap.20220107.db --cacert=/etc/etcd/ssl/ca.pem --cert=/etc/etcd/ssl/etcd.pem --key=/etc/etcd/ssl/etcd-key.pem --endpoints="https://192.168.119.72:2379"
{"level":"info","ts":1630499882.9289303,"caller":"snapshot/v3_snapshot.go:119","msg":"created temporary db file","path":"snap.db.part"}
{"level":"info","ts":"2022-01-07T20:38:02.933+0800","caller":"clientv3/maintenance.go:200","msg":"opened snapshot stream; downloading"}
{"level":"info","ts":1630499882.933808,"caller":"snapshot/v3_snapshot.go:127","msg":"fetching snapshot","endpoint":"https://192.168.119.72:2379"}
{"level":"info","ts":"2022-01-07T20:38:03.040+0800","caller":"clientv3/maintenance.go:208","msg":"completed snapshot read; closing"}
{"level":"info","ts":1630499883.0697453,"caller":"snapshot/v3_snapshot.go:142","msg":"fetched snapshot","endpoint":"https://192.168.119.72:2379","size":"13 MB","took":0.140736973}
{"level":"info","ts":1630499883.0698237,"caller":"snapshot/v3_snapshot.go:152","msg":"saved","path":"snap.db"}
Snapshot saved at snap.20220107.db
2. Restore
The binary etcd cluster restore below has not been verified on an actual binary cluster; it is only the theoretical procedure. Do not run it directly in production!
systemctl stop kube-apiserver
systemctl stop etcd
mv /var/lib/etcd/default.etcd /var/lib/etcd/default.etcd.bak
# If you don't know where the binary cluster's etcd data directory lives, check the unit file:
systemctl cat etcd.service
ETCDCTL_API=3 etcdctl snapshot restore /data/backup/snap.20220107.db --data-dir=/var/lib/etcd/default.etcd
chown -R etcd:etcd /var/lib/etcd
systemctl start etcd.service
systemctl start kube-apiserver
4. Backup and Restore of etcd Cluster Data (untested)
4.1 Kubeadm-Installed Multi-Master Cluster
1. Backup
v3 API:
Back up the etcd data (ETCDCTL_API=3) into the backup directory. The backup can be run on a single master node.
mkdir -p /backup_$(date +%Y%m%d)
ETCDCTL_API=3 etcdctl --endpoints="https://127.0.0.1:2379" --cert="/etc/kubernetes/pki/etcd/server.crt" --key="/etc/kubernetes/pki/etcd/server.key" --cacert="/etc/kubernetes/pki/etcd/ca.crt" snapshot save /backup_$(date +%Y%m%d)/snap-$(date +%Y%m%d%H%M).db
2. Restore
1) Stop kube-apiserver and etcd on all master nodes; perform the same steps on master1, master2, and master3
mv /etc/kubernetes/manifests /etc/kubernetes/manifests_bak
rm -rf /var/lib/etcd
2) On master1, run
ETCDCTL_API=3 etcdctl snapshot restore /backup_20220108/snap-202201081337.db \
--endpoints=192.168.100.171:2379 \
--name=master1 \
--cert=/etc/kubernetes/pki/etcd/server.crt \
--key=/etc/kubernetes/pki/etcd/server.key \
--cacert=/etc/kubernetes/pki/etcd/ca.crt \
--initial-advertise-peer-urls=https://192.168.100.171:2380 \
--initial-cluster-token=etcd-cluster-0 \
--initial-cluster=master1=https://192.168.100.171:2380,master2=https://192.168.100.172:2380,master3=https://192.168.100.173:2380 \
--data-dir=/var/lib/etcd
On master2, run
ETCDCTL_API=3 etcdctl snapshot restore /backup_20220108/snap-202201081337.db \
--endpoints=192.168.100.172:2379 \
--name=master2 \
--cert=/etc/kubernetes/pki/etcd/server.crt \
--key=/etc/kubernetes/pki/etcd/server.key \
--cacert=/etc/kubernetes/pki/etcd/ca.crt \
--initial-advertise-peer-urls=https://192.168.100.172:2380 \
--initial-cluster-token=etcd-cluster-0 \
--initial-cluster=master1=https://192.168.100.171:2380,master2=https://192.168.100.172:2380,master3=https://192.168.100.173:2380 \
--data-dir=/var/lib/etcd
On master3, run
ETCDCTL_API=3 etcdctl snapshot restore /backup_20220108/snap-202201081337.db \
--endpoints=192.168.100.173:2379 \
--name=master3 \
--cert=/etc/kubernetes/pki/etcd/server.crt \
--key=/etc/kubernetes/pki/etcd/server.key \
--cacert=/etc/kubernetes/pki/etcd/ca.crt \
--initial-advertise-peer-urls=https://192.168.100.173:2380 \
--initial-cluster-token=etcd-cluster-0 \
--initial-cluster=master1=https://192.168.100.171:2380,master2=https://192.168.100.172:2380,master3=https://192.168.100.173:2380 \
--data-dir=/var/lib/etcd
Notes:
1) ETCDCTL_API=3 selects the v3 API of etcd;
2) If you don't know the value for --name=, list the cluster members with:
ETCDCTL_API=3 etcdctl --endpoints 192.168.100.171:2379,192.168.100.172:2379,192.168.100.173:2379 --cert="/etc/kubernetes/pki/etcd/server.crt" --key="/etc/kubernetes/pki/etcd/server.key" --cacert="/etc/kubernetes/pki/etcd/ca.crt" member list --write-out=table
Output:
+------------------+---------+---------+------------------------------+------------------------------+------------+
| ID | STATUS | NAME | PEER ADDRS | CLIENT ADDRS | IS LEARNER |
+------------------+---------+---------+------------------------------+------------------------------+------------+
| 442ee8f1d97e7dcd | started | master3 | https://192.168.100.173:2380 | https://192.168.100.173:2379 | false |
| 4972579f39eb9468 | started | master1 | https://192.168.100.171:2380 | https://192.168.100.171:2379 | false |
| 4bff6a42b677cc19 | started | master2 | https://192.168.100.172:2380 | https://192.168.100.172:2379 | false |
+------------------+---------+---------+------------------------------+------------------------------+------------+
3) Restore the manifests on all three master nodes
mv /etc/kubernetes/manifests_bak /etc/kubernetes/manifests
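Once the static pods come back, cluster health can be checked from any master. A sketch assuming the kubeadm cert paths and endpoints used above (it quietly skips on hosts without etcdctl):

```shell
# Check that all three restored members answer on their client ports.
ENDPOINTS=192.168.100.171:2379,192.168.100.172:2379,192.168.100.173:2379
if command -v etcdctl >/dev/null 2>&1; then
  ETCDCTL_API=3 etcdctl --endpoints "$ENDPOINTS" \
    --cert=/etc/kubernetes/pki/etcd/server.crt \
    --key=/etc/kubernetes/pki/etcd/server.key \
    --cacert=/etc/kubernetes/pki/etcd/ca.crt \
    endpoint health --write-out=table
else
  echo "etcdctl not found on this host; install it first"
fi
```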
4.2 Multi-Node etcd Cluster Installed from Binaries (untested)
1. Backup
ETCDCTL_API=3 etcdctl \
snapshot save snap.db \
--endpoints=https://192.168.10.160:2379 \
--cacert=/opt/etcd/ssl/ca.pem \
--cert=/opt/etcd/ssl/server.pem \
--key=/opt/etcd/ssl/server-key.pem
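If the backup command above is run periodically (e.g. from cron), old snapshots pile up. A minimal retention sketch; the snap-*.db naming and the keep-count of 7 are assumptions, not prescribed by this article:

```shell
# Keep only the N newest snap-*.db files in a directory, delete the rest.
prune_snapshots() {
  dir="$1"
  keep="${2:-7}"                                  # assumption: keep 7 by default
  ls -1t "$dir"/snap-*.db 2>/dev/null | tail -n +$((keep + 1)) | while read -r f; do
    rm -f -- "$f"
  done
}
# Example: prune_snapshots /data/backup 7
```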
2. Restore
1) First stop kube-apiserver and etcd
systemctl stop kube-apiserver
systemctl stop etcd
mv /var/lib/etcd/default.etcd /var/lib/etcd/default.etcd.bak
2) Restore on each node
ETCDCTL_API=3 etcdctl snapshot restore snap.db \
--name etcd-1 \
--initial-cluster="etcd-1=https://192.168.10.160:2380,etcd-2=https://192.168.10.161:2380,etcd-3=https://192.168.10.162:2380" \
--initial-advertise-peer-urls=https://192.168.10.160:2380 \
--data-dir=/var/lib/etcd/default.etcd
ETCDCTL_API=3 etcdctl snapshot restore snap.db \
--name etcd-2 \
--initial-cluster="etcd-1=https://192.168.10.160:2380,etcd-2=https://192.168.10.161:2380,etcd-3=https://192.168.10.162:2380" \
--initial-advertise-peer-urls=https://192.168.10.161:2380 \
--data-dir=/var/lib/etcd/default.etcd
ETCDCTL_API=3 etcdctl snapshot restore snap.db \
--name etcd-3 \
--initial-cluster="etcd-1=https://192.168.10.160:2380,etcd-2=https://192.168.10.161:2380,etcd-3=https://192.168.10.162:2380" \
--initial-advertise-peer-urls=https://192.168.10.162:2380 \
--data-dir=/var/lib/etcd/default.etcd
3) Start etcd and kube-apiserver (the restored data is already in /var/lib/etcd/default.etcd, so do not move the .bak copy back over it)
systemctl start etcd.service
systemctl start kube-apiserver
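As a final sanity check, confirm both services actually came back. A sketch; on hosts without systemd it just reports them as not active:

```shell
# Report whether etcd and kube-apiserver are running under systemd.
for svc in etcd kube-apiserver; do
  if systemctl is-active --quiet "$svc" 2>/dev/null; then
    echo "$svc: active"
  else
    echo "$svc: not active (or systemd unavailable here)"
  fi
done
```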