[Docker Series] Docker Compose Service Dependencies and Health Checks
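Preparation
The project files are the same as in the previous article, so they are not repeated here. See: [Docker Series] Docker Compose Environment Variables
Service dependencies
Add the depends_on option to docker-compose.yml.
Startup order: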
- redis-server
- flask
- nginx
version: "3.8"
services:
  flask:
    build:
      context: ./flask
      dockerfile: Dockerfile
    image: flask-demo:latest
    environment:
      - REDIS_HOST=redis-server
      - REDIS_PASS=${REDIS_PASSWORD}
    depends_on: # start only after redis-server has been started
      - redis-server
    networks:
      - backend
      - frontend
  redis-server:
    image: redis:latest
    command: redis-server --requirepass ${REDIS_PASSWORD}
    networks:
      - backend
  nginx:
    image: nginx:stable-alpine
    ports:
      - 8000:80
    depends_on: # start only after flask has been started
      - flask
    volumes:
      - ./nginx/nginx.conf:/etc/nginx/conf.d/default.conf:ro
      - ./var/log/nginx:/var/log/nginx
    networks:
      - frontend
networks:
  backend:
  frontend:
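To make the startup order concrete, here is a minimal Python sketch that derives it from the depends_on entries with a topological sort. It only illustrates the ordering rule; it is not how Compose is implemented, and the dictionary is a hand-copied stand-in for the parsed file. Keep in mind that this short form of depends_on only orders container starts and does not wait for a service to be ready.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# depends_on relationships copied by hand from the Compose file above:
# each service maps to the services that must be started before it.
depends_on = {
    "flask": ["redis-server"],
    "redis-server": [],
    "nginx": ["flask"],
}


def startup_order(dependencies):
    """Return the services in an order that starts dependencies first."""
    return list(TopologicalSorter(dependencies).static_order())


print(startup_order(depends_on))  # ['redis-server', 'flask', 'nginx']
```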
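Health checks
If a service that others depend on fails to start or starts in a broken state, the dependent services cannot do any useful work even if they start successfully. This is where health checks come in.
A health check goes beyond checking that the container is running: it checks whether the process inside the container can actually serve requests. For a database container, for example, it is not enough for the container to be up; the database process must also be able to accept connections.
Docker has built-in support for health checks, but the check has to be defined in the Dockerfile or passed to docker container run through the following options: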
--health-cmd string              Command to run to check health
--health-interval duration       Time between running the check (ms|s|m|h) (default 0s)
--health-retries int             Consecutive failures needed to report unhealthy
--health-start-period duration   Start period for the container to initialize before starting health-retries countdown (ms|s|m|h) (default 0s)
--health-timeout duration        Maximum time to allow one check to run (ms|s|m|h) (default 0s)
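As a rough illustration of how these options fit together, the sketch below assembles a docker container run command line from a few health check settings. The helper name health_flags is hypothetical, and the curl command is the same check that the Dockerfile version later in this article uses; only the flags listed above are emitted.

```python
import shlex


def health_flags(cmd, interval=None, timeout=None, retries=None, start_period=None):
    """Build the docker container run health options for the given settings."""
    flags = ["--health-cmd", cmd]
    optional = [
        ("--health-interval", interval),
        ("--health-timeout", timeout),
        ("--health-retries", retries),
        ("--health-start-period", start_period),
    ]
    for flag, value in optional:
        if value is not None:  # leave unset options to Docker's defaults
            flags += [flag, str(value)]
    return flags


command = [
    "docker", "container", "run", "-d",
    *health_flags("curl -f http://localhost:5000/ || exit 1",
                  interval="30s", timeout="3s", retries=3),
    "flask-demo",
]
print(shlex.join(command))  # a copy-pasteable, shell-quoted command line
```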
HEALTHCHECK
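Official reference: https://docs.docker.com/engine/reference/builder/#healthcheck
The health check command can be written directly in the Dockerfile by adding a HEALTHCHECK instruction.
Remember to install curl in the python:3.9.5-slim image, because the check uses it.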
FROM python:3.9.5-slim
# curl is needed by the HEALTHCHECK below
RUN pip install flask redis && \
    apt-get update && \
    apt-get install -y curl && \
    groupadd -r flask && useradd -r -g flask flask && \
    mkdir /src && \
    chown -R flask:flask /src
USER flask
COPY app.py /src/app.py
WORKDIR /src
ENV FLASK_APP=app.py REDIS_HOST=redis
EXPOSE 5000
HEALTHCHECK --interval=30s --timeout=3s \
    CMD curl -f http://localhost:5000/ || exit 1
CMD ["flask", "run", "-h", "0.0.0.0"]
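Docker runs the check every 30 seconds and gives each run at most 3 seconds. If curl fails, the check command exits with code 1, which counts as a failed check; the container itself keeps running.
HEALTHCHECK --interval=30s --timeout=3s \
    CMD curl -f http://localhost:5000/ || exit 1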
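Since the image already contains Python, the same kind of check could in principle be written without curl. The following is a minimal sketch under that assumption: the function name probe is made up for illustration, and it only roughly mirrors curl -f (exit code 0 for an HTTP status below 400, 1 otherwise).

```python
import http.client
import urllib.error
import urllib.request


def probe(url, timeout=3.0):
    """Return 0 if the URL answers with an HTTP status below 400, else 1."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return 0 if response.status < 400 else 1
    except (urllib.error.URLError, http.client.HTTPException, OSError):
        # Covers HTTP 4xx/5xx (HTTPError), refused connections and timeouts.
        return 1
```

Saved in a hypothetical healthcheck.py that ends with sys.exit(probe("http://localhost:5000/")), it could be called from the HEALTHCHECK instruction instead of curl. One difference to be aware of: urlopen follows redirects, while curl -f without -L does not.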
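Build the image and create the container
Build the image, create a bridge network, and start the container attached to that network.
Scenario: start only the flask-demo container and leave the Redis service it depends on stopped, so Flask cannot connect to Redis.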
$ docker image build -t flask-demo .
$ docker network create mybridge
$ docker container run -d --network mybridge --env REDIS_PASS=abc123 flask-demo
Check the container status
$ docker container ls
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
059c12486019 flask-demo "flask run -h 0.0.0.0" 4 hours ago Up 8 seconds (health: starting) 5000/tcp dazzling_tereshkov
The Docker documentation describes the health status as follows:
When a container has a healthcheck specified, it has a health status in addition to its normal status. This status is initially starting. Whenever a health check passes, it becomes healthy (whatever state it was previously in). After a certain number of consecutive failures, it becomes unhealthy.
The options that can appear before CMD are:
- --interval=DURATION (default: 30s)
- --timeout=DURATION (default: 30s)
- --start-period=DURATION (default: 0s)
- --retries=N (default: 3)
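The status transitions described above can be replayed with a small simulation. This is a simplified model rather than Docker's actual code: it assumes checks run at fixed multiples of the interval, ignores how long each check takes, and the function name replay_health is made up for illustration.

```python
def replay_health(results, interval=30.0, start_period=0.0, retries=3):
    """Replay check results (True = passed) and return the status after each one."""
    status = "starting"
    failing_streak = 0
    history = []
    for number, passed in enumerate(results, start=1):
        elapsed = number * interval  # simplification: check N runs at N * interval
        if passed:
            # Any passing check makes the container healthy and resets the streak.
            status = "healthy"
            failing_streak = 0
        elif status == "starting" and elapsed <= start_period:
            # Failures inside the start period are not counted towards retries.
            pass
        else:
            failing_streak += 1
            if failing_streak >= retries:
                status = "unhealthy"
        history.append(status)
    return history


# Three failed checks, then one that passes (for example once Redis is reachable).
print(replay_health([False, False, False, True]))
# ['starting', 'starting', 'unhealthy', 'healthy']
```

With start_period=40 and the same 30-second interval, the first failure falls inside the start period, so it takes a fourth failed check before the status turns unhealthy.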
You can also run docker container inspect 059 to see the details; the output includes a Health section:
"Health": {
    "Status": "starting",
    "FailingStreak": 1,
    "Log": [
        {
            "Start": "2021-07-14T19:04:46.4054004Z",
            "End": "2021-07-14T19:04:49.4055393Z",
            "ExitCode": -1, # exit code of the health check; 0 means success
            "Output": "Health check exceeded timeout (3s)"
        }
    ]
}
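If you want to read this section from a script, something like the following sketch could work. It assumes the JSON array that docker container inspect prints, with the health data under State.Health; the sample string is a trimmed-down stand-in built from the values shown above, and health_summary is a hypothetical helper name.

```python
import json

# Trimmed-down sample of `docker container inspect` output (a JSON array with
# one object per container), reusing the values shown above.
SAMPLE_INSPECT_OUTPUT = """
[
  {
    "State": {
      "Health": {
        "Status": "starting",
        "FailingStreak": 1,
        "Log": [
          {
            "Start": "2021-07-14T19:04:46.4054004Z",
            "End": "2021-07-14T19:04:49.4055393Z",
            "ExitCode": -1,
            "Output": "Health check exceeded timeout (3s)"
          }
        ]
      }
    }
  }
]
"""


def health_summary(inspect_output):
    """Extract the health status and the most recent check from inspect JSON."""
    containers = json.loads(inspect_output)
    health = containers[0].get("State", {}).get("Health")
    if health is None:  # the container has no health check configured
        return None
    log = health.get("Log") or []
    last = log[-1] if log else {}
    return {
        "status": health["Status"],
        "failing_streak": health["FailingStreak"],
        "last_exit_code": last.get("ExitCode"),
        "last_output": last.get("Output"),
    }


print(health_summary(SAMPLE_INSPECT_OUTPUT))
```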
After three consecutive failed checks, the health status changes from starting to unhealthy.
$ docker container ls
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
059c12486019 flask-demo "flask run -h 0.0.0.0" 4 hours ago Up 2 minutes (unhealthy) 5000/tcp dazzling_tereshkova
Start the Redis server
Start Redis attached to mybridge with --name=redis, and note the password set with --requirepass abc123.
$ docker container run -d --network mybridge --name redis redis:latest redis-server --requirepass abc123
After a few seconds, the flask container becomes healthy.
$ docker container ls
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
bc4e826ee938 redis:latest "docker-entrypoint.s…" 18 seconds ago Up 16 seconds 6379/tcp redis
059c12486019 flask-demo "flask run -h 0.0.0.0" 4 hours ago Up 6 minutes (healthy) 5000/tcp dazzling_tereshkova
docker-compose health checks (key section)
Example from the official documentation:
healthcheck:
  test: ["CMD", "curl", "-f", "http://localhost"]
  interval: 1m30s
  timeout: 10s
  retries: 3
  start_period: 40s
interval, timeout, and start_period are specified as durations.
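To show what a duration such as 1m30s means, here is a minimal parser sketch that converts it to seconds. It assumes the units us, ms, s, m, and h combined without spaces; the function name parse_duration is made up, and the real parser may accept or reject inputs that this sketch does not.

```python
import re

# Seconds per unit, for the units assumed above.
_UNIT_SECONDS = {"us": 1e-6, "ms": 1e-3, "s": 1.0, "m": 60.0, "h": 3600.0}
_PART = re.compile(r"(\d+(?:\.\d+)?)(us|ms|s|m|h)")


def parse_duration(text):
    """Convert a duration string such as '1m30s' to a number of seconds."""
    if not text:
        raise ValueError("empty duration")
    position = 0
    total = 0.0
    while position < len(text):
        match = _PART.match(text, position)
        if match is None:
            raise ValueError(f"invalid duration: {text!r}")
        total += float(match.group(1)) * _UNIT_SECONDS[match.group(2)]
        position = match.end()
    return total


print(parse_duration("1m30s"))  # 90.0
print(parse_duration("40s"))    # 40.0
```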
docker-compose.yml
version: "3.8"
services:
  flask:
    build:
      context: ./flask
      dockerfile: Dockerfile
    image: flask-demo:latest
    environment:
      - REDIS_HOST=redis-server
      - REDIS_PASS=${REDIS_PASSWORD}
    healthcheck: # add a health check
      test: ["CMD", "curl", "-f", "http://localhost:5000"]
      interval: 30s
      timeout: 3s
      retries: 3
      start_period: 40s
    depends_on:
      redis-server:
        condition: service_healthy
    networks:
      - backend
      - frontend
  redis-server:
    image: redis:latest
    command: redis-server --requirepass ${REDIS_PASSWORD}
    healthcheck:
      test: ["CMD", "redis-cli", "ping"]
      interval: 1s
      timeout: 3s
      retries: 30
    networks:
      - backend
  nginx:
    image: nginx:stable-alpine
    ports:
      - 8000:80
    depends_on:
      - flask
    volumes:
      - ./nginx/nginx.conf:/etc/nginx/conf.d/default.conf:ro
      - ./var/log/nginx:/var/log/nginx
    networks:
      - frontend
networks:
  backend:
  frontend:
Why does only the flask container show a health status?
PS C:\Users\柏杉\Downloads\compose-healthcheck-flask> docker-compose ps
Name Command State Ports
------------------------------------------------------------------------------------------------------------------------
compose-healthcheck-flask_flask_1 flask run -h 0.0.0.0 Up (health: starting) 5000/tcp
compose-healthcheck-flask_nginx_1 /docker-entrypoint.sh ngin ... Up 0.0.0.0:8000->80/tcp
compose-healthcheck-flask_redis-server_1 docker-entrypoint.sh redis ... Up 6379/tcp
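In the docker-compose.yml above, the nginx service only declares depends_on and has no healthcheck of its own, so no health status is shown for it.
Change its depends_on so that nginx starts only once the flask service it depends on is healthy, and waits otherwise: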
nginx:
  image: nginx:stable-alpine
  ports:
    - 8000:80
  depends_on:
    flask:
      condition: service_healthy
  volumes:
    - ./nginx/nginx.conf:/etc/nginx/conf.d/default.conf:ro
    - ./var/log/nginx:/var/log/nginx
  networks:
    - frontend
Running docker-compose up -d now blocks, because flask is not yet in the healthy state.
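The waiting behaviour can be pictured as a polling loop. The sketch below is an illustration under assumptions, not Compose's implementation: wait_until_healthy and docker_health_status are hypothetical helpers, and giving up as soon as the dependency reports unhealthy is a choice made for this sketch.

```python
import subprocess
import time


def docker_health_status(container):
    """Ask the Docker CLI for a container's health status.

    Assumes the container defines a health check; otherwise the template
    has nothing to render and the command fails.
    """
    result = subprocess.run(
        ["docker", "container", "inspect",
         "--format", "{{.State.Health.Status}}", container],
        capture_output=True, text=True, check=True,
    )
    return result.stdout.strip()


def wait_until_healthy(get_status, interval=1.0, timeout=60.0,
                       sleep=time.sleep, clock=time.monotonic):
    """Poll get_status() until it returns 'healthy'.

    Returns True once healthy, False if the status becomes 'unhealthy'
    or the timeout expires first.
    """
    deadline = clock() + timeout
    while True:
        status = get_status()
        if status == "healthy":
            return True
        if status == "unhealthy" or clock() >= deadline:
            return False
        sleep(interval)
```

A call such as wait_until_healthy(lambda: docker_health_status("my-flask-container")) would block in the same way, where my-flask-container is a placeholder for the real container name.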
Add a healthcheck to redis-server as well (it is already included in the full file above):
redis-server:
  image: redis:latest
  command: redis-server --requirepass ${REDIS_PASSWORD}
  healthcheck:
    test: ["CMD", "redis-cli", "ping"]
    interval: 1s
    timeout: 3s
    retries: 30
  networks:
    - backend
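A service_healthy condition is only useful when the service it points at actually has a health check. As a hedged sketch, the following check walks a hand-written dictionary that stands in for the parsed Compose file and reports such mismatches. The function name unmet_health_conditions is made up, and the sketch cannot see a HEALTHCHECK that is baked into an image, so treat its findings as hints.

```python
# Hand-written stand-in for the services section of the parsed Compose file;
# only the keys relevant to health checks are kept.
services = {
    "flask": {
        "healthcheck": {"test": ["CMD", "curl", "-f", "http://localhost:5000"]},
        "depends_on": {"redis-server": {"condition": "service_healthy"}},
    },
    "redis-server": {
        "healthcheck": {"test": ["CMD", "redis-cli", "ping"]},
    },
    "nginx": {
        "depends_on": {"flask": {"condition": "service_healthy"}},
    },
}


def unmet_health_conditions(services):
    """List (service, dependency) pairs that wait on a missing healthcheck."""
    problems = []
    for name, config in services.items():
        dependencies = config.get("depends_on", {})
        if isinstance(dependencies, list):
            continue  # short form: start order only, no conditions
        for target, options in dependencies.items():
            waits_for_health = options.get("condition") == "service_healthy"
            has_check = "healthcheck" in services.get(target, {})
            if waits_for_health and not has_check:
                problems.append((name, target))
    return problems


print(unmet_health_conditions(services))  # []
```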
A docker-compose healthcheck example worth reading
GitHub: https://gist.github.com/phuysmans/4f67a7fa1b0c6809a86f014694ac6c3a
version: '2.1'
services:
  php:
    tty: true
    build:
      context: .
      dockerfile: tests/Docker/Dockerfile-PHP
      args:
        version: cli
    volumes:
      - ./src:/var/www/src
      - ./tests:/var/www/tests
      - ./build:/var/www/build
      - ./phpunit.xml.dist:/var/www/phpunit.xml.dist
    depends_on:
      couchbase:
        condition: service_healthy
      memcached:
        condition: service_started
      mysql:
        condition: service_healthy
      postgresql:
        condition: service_healthy
      redis:
        condition: service_healthy
  couchbase:
    build:
      context: .
      dockerfile: tests/Docker/Dockerfile-Couchbase
    healthcheck:
      test: ["CMD", "curl", "-f", "http://Administrator:password@localhost:8091/pools/default/buckets/default"]
      interval: 1s
      timeout: 3s
      retries: 60
  memcached:
    image: memcached
    # not sure how to properly healthcheck
  mysql:
    image: mysql
    environment:
      - MYSQL_ALLOW_EMPTY_PASSWORD=yes
      - MYSQL_ROOT_PASSWORD=
      - MYSQL_DATABASE=cache
    healthcheck:
      test: ["CMD", "mysql", "-h", "mysql", "-P", "3306", "-u", "root", "-e", "SELECT 1", "cache"]
      interval: 1s
      timeout: 3s
      retries: 30
  postgresql:
    image: postgres
    environment:
      - POSTGRES_PASSWORD=
      - POSTGRES_DB=cache
    healthcheck:
      test: ["CMD", "pg_isready"]
      interval: 1s
      timeout: 3s
      retries: 30
  redis:
    image: redis
    healthcheck:
      test: ["CMD", "redis-cli", "ping"]
      interval: 1s
      timeout: 3s
      retries: 30