Container PaaS (OpenShift) for Machine Learning: Installing Kubeflow
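Kubeflow can be installed on a desktop or server, on a Kubernetes cluster, or on a public cloud. This article covers installation on Kubernetes and OpenShift.
Prerequisites
- A Kubernetes cluster at v1.11 or later, or an OpenShift cluster at v3.11 or later
- Access to the gcr image registry
- Two 10G PVs and two 20G PVs, created in advance
- Kubeflow uses Istio for ingress, so installing Istio beforehand is recommended (for installing Istio on OpenShift, see my other article)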
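The four PVs from the list above can be generated with a small helper. This is a minimal sketch, not part of the original setup: it assumes hostPath-backed volumes under a hypothetical /data/kubeflow directory, treats the sizes as 10Gi and 20Gi, and the function names, PV names, and paths are illustrative only. Replace the hostPath stanza with NFS or another backend on a real cluster.

```shell
#!/usr/bin/env bash
# Print one PersistentVolume manifest.
# hostPath storage is an assumption made for illustration only.
gen_pv() {
  local name="$1" size="$2" path="$3"
  cat <<EOF
---
apiVersion: v1
kind: PersistentVolume
metadata:
  name: ${name}
spec:
  capacity:
    storage: ${size}
  accessModes:
    - ReadWriteOnce
  persistentVolumeReclaimPolicy: Retain
  hostPath:
    path: ${path}
EOF
}

# Two 10Gi and two 20Gi volumes, matching the prerequisites.
# Names and paths are hypothetical.
gen_all_pvs() {
  gen_pv kubeflow-pv-10g-1 10Gi /data/kubeflow/pv-10g-1
  gen_pv kubeflow-pv-10g-2 10Gi /data/kubeflow/pv-10g-2
  gen_pv kubeflow-pv-20g-1 20Gi /data/kubeflow/pv-20g-1
  gen_pv kubeflow-pv-20g-2 20Gi /data/kubeflow/pv-20g-2
}
```

To create the volumes, pipe the output into the cluster client, for example `gen_all_pvs | kubectl apply -f -` (or `oc apply -f -` on OpenShift).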
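If you cannot pull images from gcr, download them from the Baidu Netdisk link below and load them onto every node in the cluster.
Link: Baidu Netdisk
Extraction code: 7ra2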
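Loading the downloaded images onto a node might look like the sketch below. It assumes the download unpacks to `docker save` style `*.tar` archives and that the nodes run Docker; neither is stated in the original, and the function name `load_images` is hypothetical.

```shell
#!/usr/bin/env bash
# Load every image archive in a directory into the local Docker daemon.
# Run this on each node. The *.tar naming is an assumption.
load_images() {
  local dir="$1" archive found=0
  for archive in "$dir"/*.tar; do
    [ -e "$archive" ] || continue        # the glob matched nothing
    docker load -i "$archive" || return 1
    found=1
  done
  if [ "$found" -ne 1 ]; then
    echo "no image archives found in $dir" >&2
    return 1
  fi
}
```

A call such as `load_images /tmp/kubeflow-images` (the path is a placeholder) would then be repeated on every node, since each node keeps its own image cache.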
Installation
Download kfctl
https://github.com/kubeflow/kubeflow/releases/
tar -xvf kfctl__.tar.gz
Or download it with a script:
opsys=linux # darwin for Mac
curl -s https://api.github.com/repos/kubeflow/kubeflow/releases/latest |\
grep browser_download |\
grep $opsys |\
cut -d '"' -f 4 |\
xargs curl -O -L && \
tar -zvxf kfctl_*_${opsys}.tar.gz
Configure and deploy Kubeflow
# Add kfctl to PATH, to make the kfctl binary easier to use.
export PATH=$PATH:"<path to kfctl>"
export KFAPP="<your choice of application directory name>"
# Installs Istio by default. Comment out Istio components in the config file to skip Istio installation. See https://github.com/kubeflow/kubeflow/pull/3663
export CONFIG="https://raw.githubusercontent.com/kubeflow/kubeflow/v0.6-branch/bootstrap/config/kfctl_k8s_istio.0.6.2.yaml"
kfctl init ${KFAPP} --config=${CONFIG} -V
cd ${KFAPP}
kfctl generate all -V
kfctl apply all -V
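Notes
- ${KFAPP}
The name of the directory that stores the Kubeflow configuration. Running kfctl init creates this directory.
- About Istio
If Istio is already deployed, modify kfctl_k8s_istio.0.6.2.yaml before running the commands above. First download the file: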
wget https://raw.githubusercontent.com/kubeflow/kubeflow/v0.6-branch/bootstrap/config/kfctl_k8s_istio.0.6.2.yaml
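Edit kfctl_k8s_istio.0.6.2.yaml and comment out the entries below, then set CONFIG to the path of the edited local copy rather than the URL: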
# Istio install. If not needed, comment out istio-crds and istio-install.
# - kustomizeConfig:
#     parameters:
#     - name: namespace
#       value: istio-system
#     repoRef:
#       name: manifests
#       path: istio/istio-crds
#   name: istio-crds
# - kustomizeConfig:
#     parameters:
#     - name: namespace
#       value: istio-system
#     repoRef:
#       name: manifests
#       path: istio/istio-install
#   name: istio-install
Verify the installation
On a Kubernetes cluster, run:
kubectl -n kubeflow get pods
On an OpenShift cluster, run:
oc get pods -n kubeflow
Confirm that all pods are in the Running state.
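Scanning a long pod list by eye is error-prone, so a small filter can help. The sketch below is an illustration rather than part of the original procedure: the helper name `pods_not_running` is hypothetical, and it assumes the default column layout of `kubectl get pods --no-headers` (NAME READY STATUS RESTARTS AGE).

```shell
#!/usr/bin/env bash
# Print the name and status of every pod that is neither Running nor Completed.
# Reads `kubectl get pods --no-headers` output from stdin.
pods_not_running() {
  awk '$3 != "Running" && $3 != "Completed" { print $1, $3 }'
}
```

Typical use would be `kubectl -n kubeflow get pods --no-headers | pods_not_running` (or the same with `oc`). Empty output means every pod is Running or Completed.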
Access the Kubeflow dashboard
The dashboard is served through Istio's istio-ingressgateway service, which can be exposed through a route (on OpenShift) or a NodePort.
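Both access paths can be sketched with a few helper functions. The assumptions here are that the gateway lives in the istio-system namespace and that its plain-HTTP port is named `http2`, as in upstream Istio charts; check the service definition on your cluster before relying on either. The function names are hypothetical.

```shell
#!/usr/bin/env bash
# Look up the NodePort that maps to the gateway's plain-HTTP port.
# The port name "http2" is an assumption; verify it on your cluster.
ingress_nodeport() {
  kubectl -n istio-system get svc istio-ingressgateway \
    -o jsonpath='{.spec.ports[?(@.name=="http2")].nodePort}'
}

# Build the dashboard URL from a node IP and the NodePort.
dashboard_url() {
  local node_ip="$1"
  echo "http://${node_ip}:$(ingress_nodeport)/"
}

# OpenShift alternative: expose the same service port as a route.
expose_ingress_route() {
  oc -n istio-system expose svc istio-ingressgateway --port=http2
}
```

On Kubernetes, `dashboard_url <node-ip>` prints the address to open in a browser. On OpenShift, after `expose_ingress_route`, the route host shown by `oc -n istio-system get route` serves the same purpose.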
Troubleshooting
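- Loading images
If you cannot pull images from gcr and instead loaded them from the link in the Prerequisites section, change the imagePullPolicy of the affected Deployments and StatefulSets from Always to IfNotPresent. Otherwise the kubelet always tries to pull from gcr first, and the pods stay stuck in an image pull failure state. For example: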
image: 'gcr.io/kubeflow-images-public/ingress-setup:latest'
imagePullPolicy: IfNotPresent
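- Make sure the PVs are provisioned in advance and that every PVC is bound to a PV.
- On OpenShift, add the SCC and permissions (the -z flag refers to a service account in the current project, so run these in the kubeflow project or add -n kubeflow):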
oc adm policy add-scc-to-user anyuid -z jupyter-hub
oc adm policy add-role-to-user cluster-admin -z tf-job-operator
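The first two troubleshooting items can be scripted. The following is a rough sketch under stated assumptions, not the author's method: the function names are hypothetical, the patch only touches the first container of each workload (multi-container pods need additional patches), and the PVC filter assumes the default column layout of `kubectl get pvc --no-headers` (NAME STATUS VOLUME ...).

```shell
#!/usr/bin/env bash
# Set imagePullPolicy to IfNotPresent on the first container of every
# Deployment and StatefulSet in a namespace (default: kubeflow).
set_pull_policy_if_not_present() {
  local ns="${1:-kubeflow}" res
  local patch='[{"op":"replace","path":"/spec/template/spec/containers/0/imagePullPolicy","value":"IfNotPresent"}]'
  for res in $(kubectl -n "$ns" get deployments,statefulsets -o name); do
    kubectl -n "$ns" patch "$res" --type=json -p "$patch"
  done
}

# Print the name and status of every PVC that is not Bound.
# Reads `kubectl get pvc --no-headers` output from stdin.
unbound_pvcs() {
  awk '$2 != "Bound" { print $1, $2 }'
}
```

Usage would be `set_pull_policy_if_not_present kubeflow`, followed by `kubectl -n kubeflow get pvc --no-headers | unbound_pvcs`. Any PVC printed by the second command is still waiting for a matching PV.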