1. Install the etcdctl command-line tool

The etcdctl command differs slightly between etcd versions, but usage is broadly the same; here we take a snapshot backup with `snapshot save`.
A few points to note:
1) The backup only needs to be run on one node of the etcd cluster.
2) The etcd v3 API is used, because since Kubernetes 1.13, k8s no longer supports etcd v2 — all cluster data lives in the v3 store. Consequently only data written through the v3 API is backed up; data written through the v2 API is not. (In the commands below, `ETCDCTL_API=3 etcdctl` is equivalent to plain `etcdctl` on recent etcd releases, where v3 is the default API.)

yum install -y etcd

2. Etcd data backup and restore

2.1 Data location

By default etcd stores its data under /var/lib/etcd/member/, which is split into two directories:

snap: snapshot data. etcd takes these snapshots to keep the WAL from growing unboundedly; they capture the state of the etcd data.
wal: the write-ahead log, which records the entire history of data changes. In etcd, every modification must be written to the WAL before it is committed.

2.2 Preparation

# Back up /etc/kubernetes
cp -r /etc/kubernetes/ /etc/kubernetes_bak/
# Back up /var/lib/etcd
cp -r /var/lib/etcd/ /var/lib/etcd_bak/
# Back up /var/lib/kubelet
cp -r /var/lib/kubelet/ /var/lib/kubelet_bak/
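The three copies above can be bundled into one small script. A minimal sketch, assuming a hypothetical BACKUP_ROOT location (the `backup_dest` helper name and everything except the three source paths are illustrative):

```shell
#!/usr/bin/env bash
# Sketch only: copy the three directories above into a timestamped folder.
# BACKUP_ROOT is an assumed location; adjust for your environment.
BACKUP_ROOT="${BACKUP_ROOT:-/opt/k8s-backup}"

backup_dest() {
    # e.g. /opt/k8s-backup/20220825-1559
    echo "${BACKUP_ROOT}/$(date +%Y%m%d-%H%M)"
}

if [ "${1:-}" = "run" ]; then
    dest="$(backup_dest)"
    mkdir -p "$dest"
    cp -r /etc/kubernetes/  "$dest/kubernetes/"
    cp -r /var/lib/etcd/    "$dest/etcd/"
    cp -r /var/lib/kubelet/ "$dest/kubelet/"
fi
```

Run it as `bash backup.sh run`; without the `run` argument the copies are skipped, so the helper can be sourced safely.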

3. Single-node etcd backup and restore

For this scenario, a file-based backup is sufficient. A default kubeadm install writes etcd's data to /var/lib/etcd/ on the host, so backing up that directory periodically is enough; if the etcd data ever becomes corrupted, restoring the saved files into that directory restores the single-node etcd data.

Note: while the etcd container is running, the files cannot simply be overwritten. Rename the /etc/kubernetes/manifests directory first, replace the data files, then rename it back; after a short while the kubelet will restart the etcd container automatically (see section 3.1 below).

3.1 Single-master cluster installed with kubeadm

1) Backup

v3 API: back up the v3 (ETCDCTL_API=3) etcd data into the current directory.

ETCDCTL_API=3 etcdctl --endpoints="https://127.0.0.1:2379"  --cert="/etc/kubernetes/pki/etcd/server.crt"  --key="/etc/kubernetes/pki/etcd/server.key"  --cacert="/etc/kubernetes/pki/etcd/ca.crt"   snapshot save ./snap-$(date +%Y%m%d%H%M).db

Notes:
1) ETCDCTL_API=3 selects the v3 etcd API.
2) The endpoints can be found with the command below; there are usually two IPs, 127.0.0.1 plus the host's LAN IP:

[root@app01 ~]# kubectl describe pod etcd-app01 -n kube-system | grep listen-client-urls
      --listen-client-urls=https://127.0.0.1:2379,https://192.168.180.45:2379
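To take snapshots on a schedule, the command can be wrapped in a cron job. A sketch only — the /backup directory, the schedule, and the 7-day retention are all assumptions, not part of the original procedure:

```shell
# /etc/cron.d/etcd-backup (illustrative): snapshot daily at 02:00,
# prune snapshots older than 7 days at 03:00. In cron.d files the
# percent sign must be escaped as \%.
0 2 * * * root mkdir -p /backup && ETCDCTL_API=3 etcdctl --endpoints="https://127.0.0.1:2379" --cert="/etc/kubernetes/pki/etcd/server.crt" --key="/etc/kubernetes/pki/etcd/server.key" --cacert="/etc/kubernetes/pki/etcd/ca.crt" snapshot save /backup/snap-$(date +\%Y\%m\%d\%H\%M).db
0 3 * * * root find /backup -name 'snap-*.db' -mtime +7 -delete
```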

2) Restore

(1) Stop the kube-apiserver and etcd containers first

mv /etc/kubernetes/manifests /etc/kubernetes/manifests_bak
rm -rf /var/lib/etcd

(2) Restore the snapshot

ETCDCTL_API=3 etcdctl --endpoints="https://127.0.0.1:2379"  --cert="/etc/kubernetes/pki/etcd/server.crt"  --key="/etc/kubernetes/pki/etcd/server.key"  --cacert="/etc/kubernetes/pki/etcd/ca.crt"   snapshot restore snap-202208251559.db --data-dir=/var/lib/etcd

(3) Restart the kube-apiserver and etcd containers

mv /etc/kubernetes/manifests_bak /etc/kubernetes/manifests

(4) Check that the pods have come back to normal

kubectl get pod -n kube-system
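That check can be automated with a small polling loop. A sketch assuming bash; the `wait_pods_ready` name and the retry/interval values are made up:

```shell
# Sketch: succeed once no kube-system pod reports a status other than
# Running/Completed; retry a fixed number of times.
wait_pods_ready() {  # wait_pods_ready <tries> <sleep_seconds>
    local i
    for ((i = 0; i < $1; i++)); do
        if ! kubectl get pod -n kube-system --no-headers \
                | awk '{print $3}' | grep -qvE 'Running|Completed'; then
            return 0   # every pod is Running or Completed
        fi
        sleep "$2"
    done
    return 1
}
```

For example, `wait_pods_ready 30 10 || echo "pods not ready after 5 minutes"`.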

3.2 Backup and restore for a binary-installed etcd (unverified)

1) Backup

v3 API:

ETCDCTL_API=3  etcdctl snapshot save snap.20220107.db --cacert=/etc/etcd/ssl/ca.pem --cert=/etc/etcd/ssl/etcd.pem --key=/etc/etcd/ssl/etcd-key.pem --endpoints="https://192.168.119.72:2379"


{"level":"info","ts":1630499882.9289303,"caller":"snapshot/v3_snapshot.go:119","msg":"created temporary db file","path":"snap.db.part"}
{"level":"info","ts":"2022-01-07T20:38:02.933+0800","caller":"clientv3/maintenance.go:200","msg":"opened snapshot stream; downloading"}
{"level":"info","ts":1630499882.933808,"caller":"snapshot/v3_snapshot.go:127","msg":"fetching snapshot","endpoint":"https://192.168.119.72:2379"}
{"level":"info","ts":"2022-01-07T20:38:03.040+0800","caller":"clientv3/maintenance.go:208","msg":"completed snapshot read; closing"}
{"level":"info","ts":1630499883.0697453,"caller":"snapshot/v3_snapshot.go:142","msg":"fetched snapshot","endpoint":"https://192.168.119.72:2379","size":"13 MB","took":0.140736973}
{"level":"info","ts":1630499883.0698237,"caller":"snapshot/v3_snapshot.go:152","msg":"saved","path":"snap.db"}
Snapshot saved at snap.20220107.db
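Before attempting a restore from such a file, it is worth verifying it. A minimal sketch — the `snapshot_ok` helper is made up, while `etcdctl snapshot status` is the real integrity check that reports the snapshot's hash, revision, and key count:

```shell
# Sketch: refuse to restore from a missing/empty snapshot file, then let
# etcdctl report its hash, revision, and key count.
snapshot_ok() {
    [ -s "$1" ]   # non-empty regular file is the minimum requirement
}

if [ "${1:-}" = "run" ]; then
    db=snap.20220107.db
    snapshot_ok "$db" || { echo "missing or empty snapshot: $db" >&2; exit 1; }
    ETCDCTL_API=3 etcdctl snapshot status "$db" --write-out=table
fi
```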

2) Restore

The following restore procedure for a binary-installed etcd cluster has not been validated on a real cluster; it is theoretical steps only. Do not run it directly in production!

systemctl stop kube-apiserver
systemctl stop etcd
mv /var/lib/etcd/default.etcd /var/lib/etcd/default.etcd.bak

# If you are not sure where the binary install keeps the etcd data
# directory, check the unit file:
systemctl cat etcd.service
ETCDCTL_API=3 etcdctl snapshot restore /data/backup/snap.20220107.db --data-dir=/var/lib/etcd/default.etcd

chown -R etcd:etcd /var/lib/etcd
systemctl start kube-apiserver
systemctl start etcd.service

4. Backup and restore for an etcd cluster (unverified)

4.1 Multi-master cluster installed with kubeadm

1) Backup

v3 API: back up the v3 (ETCDCTL_API=3) etcd data into the backup directory. The backup only needs to be run on one master node.

mkdir -p /backup_$(date +%Y%m%d)
ETCDCTL_API=3 etcdctl --endpoints="https://127.0.0.1:2379"  --cert="/etc/kubernetes/pki/etcd/server.crt"  --key="/etc/kubernetes/pki/etcd/server.key"  --cacert="/etc/kubernetes/pki/etcd/ca.crt"   snapshot save /backup_$(date +%Y%m%d)/snap-$(date +%Y%m%d%H%M).db

2) Restore

(1) Stop kube-apiserver and etcd on all master nodes; run the same commands on master1, master2, and master3:

mv /etc/kubernetes/manifests  /etc/kubernetes/manifests_bak
rm -rf  /var/lib/etcd

(2) On master1, run:

ETCDCTL_API=3 etcdctl snapshot restore /backup_20220108/snap-202201081337.db \
    --endpoints=192.168.100.171:2379 \
    --name=master1 \
    --cert=/etc/kubernetes/pki/etcd/server.crt \
    --key=/etc/kubernetes/pki/etcd/server.key \
    --cacert=/etc/kubernetes/pki/etcd/ca.crt \
    --initial-advertise-peer-urls=https://192.168.100.171:2380 \
    --initial-cluster-token=etcd-cluster-0 \
    --initial-cluster=master1=https://192.168.100.171:2380,master2=https://192.168.100.172:2380,master3=https://192.168.100.173:2380 \
    --data-dir=/var/lib/etcd

On master2, run:

ETCDCTL_API=3 etcdctl snapshot restore /backup_20220108/snap-202201081337.db \
    --endpoints=192.168.100.172:2379 \
    --name=master2 \
    --cert=/etc/kubernetes/pki/etcd/server.crt \
    --key=/etc/kubernetes/pki/etcd/server.key \
    --cacert=/etc/kubernetes/pki/etcd/ca.crt \
    --initial-advertise-peer-urls=https://192.168.100.172:2380 \
    --initial-cluster-token=etcd-cluster-0 \
    --initial-cluster=master1=https://192.168.100.171:2380,master2=https://192.168.100.172:2380,master3=https://192.168.100.173:2380 \
    --data-dir=/var/lib/etcd

On master3, run:

ETCDCTL_API=3 etcdctl snapshot restore /backup_20220108/snap-202201081337.db \
    --endpoints=192.168.100.173:2379 \
    --name=master3 \
    --cert=/etc/kubernetes/pki/etcd/server.crt \
    --key=/etc/kubernetes/pki/etcd/server.key \
    --cacert=/etc/kubernetes/pki/etcd/ca.crt \
    --initial-advertise-peer-urls=https://192.168.100.173:2380 \
    --initial-cluster-token=etcd-cluster-0 \
    --initial-cluster=master1=https://192.168.100.171:2380,master2=https://192.168.100.172:2380,master3=https://192.168.100.173:2380 \
    --data-dir=/var/lib/etcd
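The three restore commands above differ only in `--name` and the advertised peer URL, so they can be generated from a single helper. A sketch, reusing the node names and IPs of this example (the `restore_cmd` name is made up):

```shell
# Sketch: build the per-node restore command; only <name> and <ip> vary.
CLUSTER="master1=https://192.168.100.171:2380,master2=https://192.168.100.172:2380,master3=https://192.168.100.173:2380"

restore_cmd() {  # restore_cmd <name> <ip> <snapshot>
    echo "ETCDCTL_API=3 etcdctl snapshot restore $3" \
         "--name=$1" \
         "--initial-advertise-peer-urls=https://$2:2380" \
         "--initial-cluster-token=etcd-cluster-0" \
         "--initial-cluster=$CLUSTER" \
         "--data-dir=/var/lib/etcd"
}

# On each master, run the matching command, e.g. on master2:
# eval "$(restore_cmd master2 192.168.100.172 /backup_20220108/snap-202201081337.db)"
```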

Notes:
1) ETCDCTL_API=3 selects the v3 etcd API.
2) If you are not sure of the --name values, list the cluster members:

ETCDCTL_API=3 etcdctl --endpoints 192.168.100.171:2379,192.168.100.172:2379,192.168.100.173:2379 --cert="/etc/kubernetes/pki/etcd/server.crt"  --key="/etc/kubernetes/pki/etcd/server.key"  --cacert="/etc/kubernetes/pki/etcd/ca.crt" member list --write-out=table

Output:

+------------------+---------+---------+------------------------------+------------------------------+------------+
|        ID        | STATUS  |  NAME   |          PEER ADDRS          |         CLIENT ADDRS         | IS LEARNER |
+------------------+---------+---------+------------------------------+------------------------------+------------+
| 442ee8f1d97e7dcd | started | master3 | https://192.168.100.173:2380 | https://192.168.100.173:2379 |      false |
| 4972579f39eb9468 | started | master1 | https://192.168.100.171:2380 | https://192.168.100.171:2379 |      false |
| 4bff6a42b677cc19 | started | master2 | https://192.168.100.172:2380 | https://192.168.100.172:2379 |      false |
+------------------+---------+---------+------------------------------+------------------------------+------------+

(3) Restore the manifests on all three master nodes

mv /etc/kubernetes/manifests_bak  /etc/kubernetes/manifests

4.2 Multi-node etcd cluster installed from binaries (unverified)

1) Backup

ETCDCTL_API=3 etcdctl \
snapshot save snap.db \
--endpoints=https://192.168.10.160:2379 \
--cacert=/opt/etcd/ssl/ca.pem \
--cert=/opt/etcd/ssl/server.pem \
--key=/opt/etcd/ssl/server-key.pem

2) Restore

(1) Stop kube-apiserver and etcd first

systemctl stop kube-apiserver
systemctl stop etcd
mv /var/lib/etcd/default.etcd /var/lib/etcd/default.etcd.bak

(2) Restore on each node

ETCDCTL_API=3 etcdctl snapshot restore snap.db \
--name etcd-1 \
--initial-cluster="etcd-1=https://192.168.10.160:2380,etcd-2=https://192.168.10.161:2380,etcd-3=https://192.168.10.162:2380" \
--initial-advertise-peer-urls=https://192.168.10.160:2380 \
--data-dir=/var/lib/etcd/default.etcd
ETCDCTL_API=3 etcdctl snapshot restore snap.db \
--name etcd-2 \
--initial-cluster="etcd-1=https://192.168.10.160:2380,etcd-2=https://192.168.10.161:2380,etcd-3=https://192.168.10.162:2380" \
--initial-advertise-peer-urls=https://192.168.10.161:2380 \
--data-dir=/var/lib/etcd/default.etcd
ETCDCTL_API=3 etcdctl snapshot restore snap.db \
--name etcd-3 \
--initial-cluster="etcd-1=https://192.168.10.160:2380,etcd-2=https://192.168.10.161:2380,etcd-3=https://192.168.10.162:2380" \
--initial-advertise-peer-urls=https://192.168.10.162:2380 \
--data-dir=/var/lib/etcd/default.etcd

(3) Start kube-apiserver and etcd

systemctl start kube-apiserver
systemctl start etcd.service
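Once all three nodes are back up, cluster health can be checked from any one of them. A sketch reusing the endpoints and certificate paths from the backup step above (the `endpoints_from` helper is a made-up convenience):

```shell
# Sketch: join node IPs into an --endpoints list, then ask every member
# to report its health.
endpoints_from() {
    local out="" ip
    for ip in "$@"; do
        out="${out:+$out,}https://$ip:2379"
    done
    echo "$out"
}

if [ "${1:-}" = "run" ]; then
    ETCDCTL_API=3 etcdctl \
        --endpoints="$(endpoints_from 192.168.10.160 192.168.10.161 192.168.10.162)" \
        --cacert=/opt/etcd/ssl/ca.pem \
        --cert=/opt/etcd/ssl/server.pem \
        --key=/opt/etcd/ssl/server-key.pem \
        endpoint health
fi
```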