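Error: ImagePullBackOff Troubleshooting

1. Background

I needed a k8s cluster for a Prometheus test. The last time I used these virtual machines I had simply suspended them,
so after resuming the master and node VMs I rebooted all of them.
When I then checked the dashboard, its Pod was stuck in ImagePullBackOff, so I looked at the details with: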

kubectl describe pods -n kubernetes-dashboard kubernetes-dashboard-6948fdc5fd-7szc9

The events show that the image pull failed.

1.1 Error output

Events:
  Type     Reason          Age                From               Message
  ----     ------          ----               ----               -------
  Normal   Scheduled       25s                default-scheduler  Successfully assigned kubernetes-dashboard/kubernetes-dashboard-6948fdc5fd-7szc9 to 192.168.31.112
  Normal   SandboxChanged  23s                kubelet            Pod sandbox changed, it will be killed and re-created.
  Normal   BackOff         20s (x3 over 22s)  kubelet            Back-off pulling image "harbor.intra.com/baseimages/kubernetesui/dashboard:v2.4.0"
  Warning  Failed          20s (x3 over 22s)  kubelet            Error: ImagePullBackOff
  Normal   Pulling         9s (x2 over 24s)   kubelet            Pulling image "harbor.intra.com/baseimages/kubernetesui/dashboard:v2.4.0"
  Warning  Failed          9s (x2 over 24s)   kubelet            Failed to pull image "harbor.intra.com/baseimages/kubernetesui/dashboard:v2.4.0": rpc error: code = Unknown desc = Error response from daemon: Get "https://harbor.intra.com/v2/": x509: certificate signed by unknown authority
  Warning  Failed          9s (x2 over 24s)   kubelet            Error: ErrImagePull
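
The decisive line is the "Failed to pull image" event: the text after the image name states why the pull failed. As a rough aid for reading such messages, the sketch below maps a few common error substrings to a likely cause. The function name and the category labels are made up for illustration, and the substring list is not exhaustive.

```shell
# Map a kubelet "Failed to pull image" message to a likely cause.
# The labels printed here are illustrative, not Kubernetes terminology.
classify_pull_error() {
    case "$1" in
        *"x509: certificate signed by unknown authority"*) echo "tls-trust" ;;
        *"no such host"*)                                   echo "dns" ;;
        *"connection refused"*|*"i/o timeout"*)             echo "unreachable" ;;
        *"unauthorized"*|*"denied"*)                        echo "auth" ;;
        *)                                                  echo "unknown" ;;
    esac
}

# Message copied from the event list above.
msg='rpc error: code = Unknown desc = Error response from daemon: Get "https://harbor.intra.com/v2/": x509: certificate signed by unknown authority'
classify_pull_error "$msg"
```

For the event above this prints tls-trust, which points at certificate trust rather than name resolution or credentials.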

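2. Troubleshooting approach

The failure is clearly caused by the node being unable to pull the image from Harbor. The likely causes are listed below, and we check them one by one.

  1. harbor.intra.com does not resolve correctly or the server is down; check with ping
  2. The Harbor service on harbor.intra.com is not working; check in a browser or with curl
  3. Authentication from the node to Harbor is failing; check with docker login, and inspect daemon.json and config.json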
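
The three checks can also be chained so that the first failing step is reported. This is a minimal sketch rather than the procedure used in this article: the function name check_registry and the REGISTRY variable are hypothetical, and the commands are the same ping, curl, and docker login calls used in the sections below.

```shell
# Registry host taken from the article; override by exporting REGISTRY.
REGISTRY="${REGISTRY:-harbor.intra.com}"

# Run the three checks in order and stop at the first one that fails.
check_registry() {
    # 1. Name resolution and host reachability
    ping -c 3 "$REGISTRY" >/dev/null 2>&1 || { echo "step 1 failed: ping"; return 1; }
    # 2. Harbor web service (-k skips certificate verification, so this tests the service only)
    curl -ksf -o /dev/null "https://$REGISTRY/harbor" || { echo "step 2 failed: curl"; return 2; }
    # 3. Docker's own view of the registry: certificate trust plus stored credentials
    docker login "https://$REGISTRY" </dev/null >/dev/null 2>&1 || { echo "step 3 failed: docker login"; return 3; }
    echo "all checks passed"
}

# Usage, on the affected node:
#   check_registry
```

Step 3 reads from /dev/null so that docker login fails instead of waiting for input when no stored credentials work.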

2.1 ping harbor

Run ping directly on node2 (192.168.31.112, the node the Pod was scheduled to):

root@k8s-node-2:~# ping harbor.intra.com -c 3
PING harbor.intra.com (192.168.31.189) 56(84) bytes of data.
64 bytes from harbor.intra.com (192.168.31.189): icmp_seq=1 ttl=64 time=0.249 ms
64 bytes from harbor.intra.com (192.168.31.189): icmp_seq=2 ttl=64 time=1.36 ms
64 bytes from harbor.intra.com (192.168.31.189): icmp_seq=3 ttl=64 time=0.108 ms

ping works, so the name resolves and the server is at least up.

2.2 Check that the Harbor service is healthy

Use curl to test whether Harbor responds:

root@k8s-node-2:~# curl https://harbor.intra.com/harbor -k
<!doctype html>
<html>

<head>
    <meta charset="utf-8">
    <title>Harbor</title>
    <base href="/">
    <meta name="viewport" content="width=device-width, initial-scale=1">
    <link rel="icon" type="image/x-icon" href="favicon.ico?v=2">
    <link rel="preload" as="style" href="./light-theme.css?buildTimestamp=1639627836207">
    <link rel="preload" as="style" href="./dark-theme.css?buildTimestamp=1639627836207">
<link rel="stylesheet" href="styles.e71e5822ddf4adf262c4.css"></head>
<body>
    <harbor-app>
        <div class="spinner spinner-lg app-loading app-loading-fixed">
            Loading...
        </div>
    </harbor-app>
<script src="runtime.5ed5a3869dd69991407a.js" defer></script><script src="polyfills.a5e9bc0ea6dbbbdc0878.js" defer></script><script src="scripts.fc1928a0f22676249790.js" defer></script><script src="main.8b949aee92f43fe7c3ab.js" defer></script></body>

This confirms that the Harbor service itself is healthy; the web UI is also reachable in a browser.
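
One caveat about this check: -k tells curl to skip certificate verification, so a successful response says nothing about whether the node trusts Harbor's certificate. A possible follow-up is to repeat the request without -k against the /v2/ endpoint named in the kubelet error. This is only a sketch: the function name is hypothetical, and curl verifies against the system CA store, which is assumed here to match what Docker trusts on this node.

```shell
# Probe the registry API endpoint with certificate verification enabled (no -k).
# $1 is the registry host name, e.g. harbor.intra.com.
probe_registry_tls() {
    if curl -sS -o /dev/null "https://$1/v2/"; then
        # Any HTTP response, even 401, means the TLS handshake succeeded.
        echo "certificate trusted"
    else
        echo "TLS or connection error (curl exit code $?)"
    fi
}

# Usage, on the affected node:
#   probe_registry_tls harbor.intra.com
```

A curl exit code of 60 means the server certificate could not be verified against the known CA certificates, which would match the x509 error reported by the kubelet.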

2.3 docker login harbor

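This time the command fails. The login cannot complete because this node's Docker daemon does not trust the registry: it rejects Harbor's certificate as signed by an unknown authority.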

root@k8s-node-2:~# docker login https://harbor.intra.com
Authenticating with existing credentials...
Login did not succeed, error: Error response from daemon: Get "https://harbor.intra.com/v2/": x509: certificate signed by unknown authority
Password:
Error: Password Required

Next, check whether the same login succeeds on node1:

root@k8s-node-1:~# docker login https://harbor.intra.com
Authenticating with existing credentials...
WARNING! Your password will be stored unencrypted in /root/.docker/config.json.
Configure a credential helper to remove this warning. See
https://docs.docker.com/engine/reference/commandline/login/#credentials-store

Login Succeeded

node1 logs in without problems, so its Docker configuration is known to work. Copy node1's /etc/docker/daemon.json to node2:

root@k8s-node-1:~# scp /etc/docker/daemon.json 192.168.31.112:/etc/docker/daemon.json
root@192.168.31.112's password:
daemon.json      

Then restart the docker service on node2. The docker info output now lists https://harbor.intra.com/ among the configured registries:

root@k8s-node-2:~# systemctl restart docker
root@k8s-node-2:~# docker info |tail -10
WARNING: No swap limit support
WARNING: bridge-nf-call-iptables is disabled
WARNING: bridge-nf-call-ip6tables is disabled
 Insecure Registries:
  127.0.0.0/8
  192.168.31.0/24
 Registry Mirrors:
  https://docker.mirrors.ustc.edu.cn/
  http://hub-mirror.c.163.com/
  https://harbor.intra.com/
  https://192.168.31.189/
 Live Restore Enabled: true
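
A quick way to confirm that the copied configuration took effect is to extract just the Insecure Registries section from the docker info output. The helper below is a sketch with a hypothetical name; it assumes the indentation shown above (section headers indented by one space, entries by two). Docker treats a registry as insecure when its resolved IP address falls inside a listed CIDR, so the 192.168.31.0/24 entry, which covers Harbor at 192.168.31.189, is the likely reason the x509 error no longer occurs.

```shell
# Print the entries listed under "Insecure Registries:" in `docker info` text output.
# Reads the docker info output on stdin.
insecure_registries() {
    awk '/^ *Insecure Registries:/ { on = 1; next }   # start of the section
         on && !/^  /              { on = 0 }         # a less-indented line ends it
         on                        { sub(/^ +/, ""); print }'
}

# Usage, on the affected node:
#   docker info 2>/dev/null | insecure_registries
```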

3. Resolution

Log in to Harbor again from node2 and pull the image:

root@k8s-node-2:~# docker login https://harbor.intra.com
Authenticating with existing credentials...
WARNING! Your password will be stored unencrypted in /root/.docker/config.json.
Configure a credential helper to remove this warning. See
https://docs.docker.com/engine/reference/commandline/login/#credentials-store

Login Succeeded
root@k8s-node-2:~# docker pull harbor.intra.com/baseimages/kubernetesui/dashboard:v2.4.0
v2.4.0: Pulling from baseimages/kubernetesui/dashboard
Digest: sha256:2d2ac5c357a97715ee42b2186fda39527b826fdd7df9f7ade56b9328efc92041
Status: Image is up to date for harbor.intra.com/baseimages/kubernetesui/dashboard:v2.4.0
harbor.intra.com/baseimages/kubernetesui/dashboard:v2.4.0

The dashboard Pod is now Running as well:

root@k8s-master-01:~# kubectl get pod -n kubernetes-dashboard
NAME                                         READY   STATUS    RESTARTS      AGE
dashboard-metrics-scraper-6848d4dd7d-g7k6b   1/1     Running   4 (49m ago)   226d
kubernetes-dashboard-6948fdc5fd-7szc9        1/1     Running   0             6m2s
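
To avoid reading the STATUS column by eye, the output can be filtered for Pods that are not Running. A minimal sketch, assuming the default kubectl get pod column layout shown above (STATUS as the third column); the function name is hypothetical.

```shell
# Print the names of Pods whose STATUS column is not "Running".
# Reads `kubectl get pod` output on stdin and skips the header row.
not_running() {
    awk 'NR > 1 && $3 != "Running" { print $1 }'
}

# Usage, on the master:
#   kubectl get pod -n kubernetes-dashboard | not_running
```

An empty result means every Pod in the namespace reports Running.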