Kubernetes Cluster Deployment: A Troubleshooting Case


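When working with k8s, many people have run `kubeadm join` to add a node to the master and hit `error execution phase preflight: couldn't validate the identity of the API Server: abort connecting to API servers after timeout of 5m0s`. The node fails to register, and kubeadm gives up connecting after a five-minute timeout. The full output of one such failed join is shown below. In this run the text after `couldn't validate the identity of the API Server` is the JWS-signature message rather than the timeout: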

```bash
[root@node1 ~]# kubeadm join 192.168.8.120:6443 --token m1xenp.ra8y9d28h88dyfe6     --discovery-token-ca-cert-hash sha256:7dab997afd127fc4c9921808d04be57953694cc694a99c46925ae0f2b50e4308 
[preflight] Running pre-flight checks
	[WARNING SystemVerification]: this Docker version is not on the list of validated versions: 20.10.12. Latest validated version: 19.03
	[WARNING Hostname]: hostname "node1.example.com" could not be reached
	[WARNING Hostname]: hostname "node1.example.com": lookup node1.example.com on 192.168.8.2:53: no such host
error execution phase preflight: couldn't validate the identity of the API Server: could not find a JWS signature in the cluster-info ConfigMap for token ID "m1xenp"
To see the stack trace of this error execute with --v=5 or higher
```

This problem can have many causes, but two are the most common:

1. The token has expired

In this case, generate a new token with kubeadm.
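Before regenerating anything, it can be worth confirming on the master that the token from the failed join is really gone. The sketch below assumes `kubeadm token list` prints one token per line with the full token in the first column. The helper names `token_id` and `token_is_listed` are made up for this illustration and are not part of kubeadm.

```shell
# Hypothetical helpers, not part of kubeadm.

# Print the token ID: the part before the dot in "<id>.<secret>".
token_id() {
  printf '%s\n' "${1%%.*}"
}

# Succeed if the master still lists a token with the same ID.
# Assumes `kubeadm token list` prints the full token in the first column.
token_is_listed() {
  id=$(token_id "$1")
  kubeadm token list 2>/dev/null | grep -q "^${id}\."
}

# Usage (on the master):
#   token_is_listed m1xenp.ra8y9d28h88dyfe6 || echo "token missing or expired"
```

If the token is not listed, it has expired or been deleted, and a new one is needed.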

```bash
# Fix: regenerate the token on the master host
[root@master ~]# kubeadm token generate   # generate a token
fen9ed.98vjvkhle103ufht   # this value is used in the next command
[root@master ~]# kubeadm token create fen9ed.98vjvkhle103ufht --print-join-command --ttl=0   # register the token and print the join command; --ttl=0 means the token never expires
kubeadm join 192.168.8.120:6443 --token fen9ed.98vjvkhle103ufht     --discovery-token-ca-cert-hash sha256:7dab997afd127fc4c9921808d04be57953694cc694a99c46925ae0f2b50e4308
```
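The two steps can also be collapsed into one, because `kubeadm token create` generates a random token itself when none is passed. A minimal sketch, assuming a time-limited token is acceptable. The wrapper name `new_join_command` and the 2h default are choices made for this example, not part of the original steps.

```shell
# Hypothetical wrapper: create a fresh bootstrap token and print the
# matching `kubeadm join` command in one step. Run on the master.
# $1 is the token lifetime (assumed default here: 2h); unlike --ttl=0,
# the token expires on its own.
new_join_command() {
  kubeadm token create --print-join-command --ttl "${1:-2h}"
}

# Usage (on the master):
#   new_join_command        # token valid for 2 hours
#   new_join_command 30m    # token valid for 30 minutes
```

A short lifetime limits how long a leaked join command stays usable. The trade-off is that the command has to be regenerated for nodes added later.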

Then run the `kubeadm join` command printed above on the node you want to add.

```bash
# The join now succeeds
[root@node1 ~]# kubeadm join 192.168.8.120:6443 --token fen9ed.98vjvkhle103ufht     --discovery-token-ca-cert-hash sha256:7dab997afd127fc4c9921808d04be57953694cc694a99c46925ae0f2b50e4308
[preflight] Running pre-flight checks
	[WARNING SystemVerification]: this Docker version is not on the list of validated versions: 20.10.12. Latest validated version: 19.03
	[WARNING Hostname]: hostname "node1.example.com" could not be reached
	[WARNING Hostname]: hostname "node1.example.com": lookup node1.example.com on 192.168.8.2:53: no such host
[preflight] Reading configuration from the cluster...
[preflight] FYI: You can look at this config file with 'kubectl -n kube-system get cm kubeadm-config -o yaml'
[kubelet-start] Writing kubelet configuration to file "/var/lib/kubelet/config.yaml"
[kubelet-start] Writing kubelet environment file with flags to file "/var/lib/kubelet/kubeadm-flags.env"
[kubelet-start] Starting the kubelet
[kubelet-start] Waiting for the kubelet to perform the TLS Bootstrap...

This node has joined the cluster:
* Certificate signing request was sent to apiserver and a response was received.
* The Kubelet was informed of the new secure connection details.

Run 'kubectl get nodes' on the control-plane to see this node join the cluster.

[root@node1 ~]#

# Verify on the master that the node is registered
[root@master ~]# kubectl get nodes
NAME                 STATUS   ROLES                  AGE   VERSION
master.example.com   Ready    control-plane,master   45h   v1.20.0
node1.example.com    Ready    <none>                 20m   v1.20.0
node2.example.com    Ready    <none>                 44h   v1.20.0
[root@master ~]#
```

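2. The k8s API server is unreachable

In this case, check firewalld and SELinux on every server, then switch SELinux to permissive mode and disable firewalld: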

```bash
[root@master ~]# setenforce 0
[root@master ~]# sed -i 's/^SELINUX=enforcing$/SELINUX=permissive/' /etc/selinux/config
[root@master ~]# systemctl disable firewalld --now
```
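To confirm from the node that the API server port is actually unreachable, and to re-check it after the change, a quick TCP probe is usually enough. The sketch below assumes `bash` and the coreutils `timeout` command are available on the node. The function name `api_reachable` is hypothetical, and 192.168.8.120:6443 is the endpoint taken from the join command earlier in this post.

```shell
# Hypothetical helper: succeed if a TCP connection to host:port can be
# opened within the given number of seconds (default 3).
# Uses bash's /dev/tcp pseudo-device, so bash is invoked explicitly.
api_reachable() {
  host=$1
  port=$2
  secs=${3:-3}
  timeout "$secs" bash -c 'exec 3<>"/dev/tcp/$0/$1"' "$host" "$port" 2>/dev/null
}

# Usage (on the node):
#   api_reachable 192.168.8.120 6443 || echo "API server port not reachable"
```

If the probe fails only while firewalld is running on the master, opening the port (for example `firewall-cmd --permanent --add-port=6443/tcp` followed by `firewall-cmd --reload`) is a narrower alternative to disabling the firewall entirely.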