Nomad Nginx: Exposing an IP Port and Verifying Restart/Reschedule
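1. Preparation
1.1 Set up the Nomad cluster
This test uses three Ubuntu 18.04 virtual machines with the following IP addresses:
VM 1: 192.168.60.10
VM 2: 192.168.60.11
VM 3: 192.168.60.12
For the detailed setup procedure, see "Nomad cluster: self high-availability test".
1.2 Pull the nginx image
Run sudo docker pull nginx on all three VMs. It is best to also create a local copy of the image tagged nginx:v1; otherwise every job start pulls the image again, which adds latency.
Create a Dockerfile on all three VMs: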
FROM nginx:latest
Then run sudo docker build -t nginx:v1 . on all three VMs.
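Why a separate nginx:v1 tag helps can be sketched in a few lines. According to the Nomad Docker driver documentation, an image whose tag is latest or omitted is always pulled, while any other tag reuses a local copy when one exists. The helper name always_pulled is hypothetical and the parsing is simplified (digest references such as name@sha256:... are out of scope); it illustrates the rule and is not the driver's actual code.

```python
def always_pulled(image: str) -> bool:
    """Return True if the image reference has no tag or the tag "latest"."""
    # The tag, if any, follows a ':' in the last path segment; a ':' in an
    # earlier segment belongs to a registry port such as localhost:5000.
    last_segment = image.rsplit("/", 1)[-1]
    if ":" not in last_segment:
        return True  # no tag given, so Docker assumes "latest"
    tag = last_segment.rsplit(":", 1)[1]
    return tag == "latest"


for ref in ("nginx", "nginx:latest", "nginx:v1"):
    print(ref, always_pulled(ref))
```

Under this rule nginx and nginx:latest are pulled on every start, while nginx:v1 is not, which is the behaviour the local retagging step relies on.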
2. Testing driver=docker
2.1 Exposing an IP port for nginx
Create the job file nginx.nomad:
job "nginx" {
  datacenters = ["dc1"]
  group "nginxg" {
    count = 1 # run a single instance
    network {
      port "nginxport" { # custom port label
        static = 8765 # port exposed on the client (the VM) that runs the container; reachable from outside
        to     = 80   # mapped to port 80 inside the container (8765:80)
      }
    }
    task "nginxt" {
      driver = "docker"
      config {
        image = "nginx:v1"
        ports = ["nginxport"]
      }
    }
  }
}
The comments may need to be removed before the job will run.
Run nomad job run nginx.nomad on any of the VMs; the job was placed on VM 1.
ubuntu1@ubuntu1:~$ sudo docker ps
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
7ecc30f20c63 nginx:latest "/docker-entrypoint.…" 19 seconds ago Up 18 seconds 192.168.60.10:8765->80/tcp, 192.168.60.10:8765->80/udp nginxt-b2eff4f3-20d6-a666-7771-e9c8bd4bd73f
Now open 192.168.60.10:8765 in a browser. The nginx page loads, which confirms that the IP and port are reachable.
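The same check can be done without a browser. The sketch below only tests whether a TCP connection to the exposed port succeeds; the function name port_is_open and the 2-second timeout are assumptions chosen for illustration.

```python
import socket


def port_is_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


# Example call with the address and static port from the job above:
#   port_is_open("192.168.60.10", 8765)
```

A successful connection only shows that something is listening on the port; fetching the page is still needed to confirm that it is nginx that answers.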
2.2 Verifying restart/reschedule
References: restart, reschedule
Modify the nginx.nomad file above as follows:
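job "nginx" {
  datacenters = ["dc1"]
  group "nginxg" {
    count = 1 # run a single instance
    network {
      port "nginxport" { # custom port label
        static = 8765 # port exposed on the client (the VM) that runs the container; reachable from outside
        to     = 80   # mapped to port 80 inside the container (8765:80)
      }
    }
    restart {
      attempts = 1     # maximum number of restarts allowed within interval
      interval = "30m" # once restarts within this window exceed attempts, mode decides what happens
      delay    = "2s"  # delay before each restart; minimum 0s
      mode     = "fail" # when the limit is exceeded the allocation fails and is handed to reschedule; see the reference for other modes
    }
    reschedule {
      attempts       = 15         # maximum number of reschedules allowed within interval
      interval       = "1h"       # once reschedules within this window exceed attempts, no further rescheduling happens
      delay          = "5s"       # delay before each reschedule; minimum 5s
      delay_function = "constant" # how the delay grows; constant keeps it fixed, see the reference for others
      unlimited      = false      # disable unlimited rescheduling; when true, attempts and interval are ignored
    }
    task "nginxt" {
      driver = "docker"
      config {
        image = "nginx:v1"
        ports = ["nginxport"]
      }
    }
  }
}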
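How attempts, interval, and mode interact in the restart block can be illustrated with a small model. This is a simplified sketch that counts restarts in a sliding window; the class names RestartPolicy and RestartTracker are hypothetical and the logic is not Nomad's actual implementation.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class RestartPolicy:
    attempts: int       # restarts allowed per interval
    interval: float     # window length in seconds
    mode: str = "fail"  # "fail" or "delay"


@dataclass
class RestartTracker:
    policy: RestartPolicy
    restarts: List[float] = field(default_factory=list)  # times of past restarts

    def on_exit(self, now: float) -> str:
        """Decide what happens when the task exits at time `now` (seconds)."""
        # Only restarts inside the current window count against the limit.
        recent = [t for t in self.restarts if now - t < self.policy.interval]
        if len(recent) < self.policy.attempts:
            self.restarts = recent + [now]
            return "restart"
        # Limit reached: "fail" marks the allocation failed so that the
        # reschedule policy takes over; "delay" waits for the next window.
        return "fail" if self.policy.mode == "fail" else "delay"


tracker = RestartTracker(RestartPolicy(attempts=1, interval=30 * 60))
print(tracker.on_exit(37))   # first stop: restart
print(tracker.on_exit(419))  # second stop inside 30m: fail
```

With attempts = 1 and mode = "fail", the first stop is restarted locally on the same client and the second stop within 30 minutes fails the allocation, which is the sequence exercised in the tests that follow.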
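Repeat the steps from section 2.1 to run the job; the nginx container is now running on VM 1.
2.2.1 restart
Run sudo docker stop [container ID] or sudo kill -9 [container process ID] to stop nginx; both have the same effect.
The container ID can be found with sudo docker ps, and the container process ID with ps -ef.
After stopping it, run sudo docker ps again: the container ID and creation time have changed, which shows that nginx was restarted.
The output of nomad job status nginx is: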
ubuntu1@ubuntu1$ nomad job status nginx
ID = nginx
Name = nginx
Submit Date = 2021-08-30T10:59:50+08:00
Type = service
Priority = 50
Datacenters = home
Namespace = default
Status = running
Periodic = false
Parameterized = false
Summary
Task Group Queued Starting Running Failed Complete Lost
nginxg 0 0 1 0 0 0
Latest Deployment
ID = 3f292432
Status = successful
Description = Deployment completed successfully
Deployed
Task Group Desired Placed Healthy Unhealthy Progress Deadline
nginxg 1 1 1 0 2021-08-30T11:10:00+08:00
Allocations
ID Node ID Task Group Version Desired Status Created Modified
a90bf17e d7dc7cb2 nginxg 0 run running 21s ago 10s ago
Take the allocation ID a90bf17e from the Allocations table and run nomad alloc status a90bf17e:
ubuntu1@ubuntu1$ nomad alloc status a90bf17e
ID = a90bf17e-7bff-73e8-63c8-3daa2bbb0b65
Eval ID = ed640bc9
Name = nginx.nginxg[0]
Node ID = a90bf17e
Node Name = ubuntu1
Job ID = nginx
Job Version = 0
Client Status = failed
Client Description = Failed tasks
Desired Status = stop
Desired Description = alloc was rescheduled because it failed
Created = 7m42s ago
Modified = 38s ago
Deployment ID = 3f292432
Deployment Health = healthy
Replacement Alloc ID = 768c9fbb
Allocation Addresses
Label Dynamic Address
*nginxport yes 192.168.60.10:8765 -> 80
Task "nginxt" is "running"
Task Resources
CPU Memory Disk Addresses
0/100 MHz 1.9 MiB/300 MiB 300 MiB
Task Events:
Started At = 2021-08-30T03:00:28Z
Finished At = 2021-08-30T03:06:49Z
Total Restarts = 1
Last Restart = 2021-08-30T11:00:27+08:00
Recent Events:
Time Type Description
2021-08-30T11:00:28+08:00 Started Task started by client
2021-08-30T11:00:27+08:00 Restarting Task restarting in 1ns
2021-08-30T11:00:27+08:00 Terminated Exit Code: 0
2021-08-30T10:59:50+08:00 Started Task started by client
2021-08-30T10:59:50+08:00 Task Setup Building Task Directory
2021-08-30T10:59:50+08:00 Received Task received by client
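The Recent Events show that nginx was indeed restarted.
2.2.2 reschedule
Continuing from the previous test, stop nginx a second time. VM 1 no longer runs nginx; after a 5-second delay the job moves to VM 2.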
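The 5-second wait comes from delay = "5s" combined with delay_function = "constant". For comparison, the sketch below also models the exponential and fibonacci functions described in the Nomad reschedule documentation. The function name reschedule_delay and the 1-based attempt numbering are assumptions, and the model leaves out details such as the delay being reset after a long healthy run.

```python
def reschedule_delay(attempt: int, base: float,
                     function: str = "constant",
                     max_delay: float = 3600.0) -> float:
    """Delay in seconds before the given reschedule attempt (1-based)."""
    if attempt < 1:
        raise ValueError("attempt must be >= 1")
    if function == "constant":
        return base  # same delay every time; max_delay does not apply
    if function == "exponential":
        return min(base * 2 ** (attempt - 1), max_delay)
    if function == "fibonacci":
        prev, cur = 0.0, base
        for _ in range(attempt - 1):
            prev, cur = cur, prev + cur  # each delay is the sum of the previous two
        return min(cur, max_delay)
    raise ValueError(f"unknown delay function: {function}")


for name in ("constant", "exponential", "fibonacci"):
    print(name, [reschedule_delay(n, 5, name) for n in range(1, 5)])
```

With the job file used here, every reschedule attempt waits the same 5 seconds, up to 15 attempts per hour.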
The output of nomad job status nginx is:
ubuntu1@ubuntu1$ nomad job status nginx
ID = nginx
Name = nginx
Submit Date = 2021-08-30T10:59:50+08:00
Type = service
Priority = 50
Datacenters = home
Namespace = default
Status = running
Periodic = false
Parameterized = false
Summary
Task Group Queued Starting Running Failed Complete Lost
nginxg 0 0 1 1 0 0
Latest Deployment
ID = 3f292432
Status = successful
Description = Deployment completed successfully
Deployed
Task Group Desired Placed Healthy Unhealthy Progress Deadline
nginxg 1 1 1 0 2021-08-30T11:10:00+08:00
Allocations
ID Node ID Task Group Version Desired Status Created Modified
768c9fbb d7dc7cb2 nginxg 0 run running 21s ago 10s ago
a90bf17e a9470c7f nginxg 4 stop failed 7m25s ago 21s ago
A new allocation, 768c9fbb, has been added, and the earlier allocation a90bf17e has failed and stopped.
The output of nomad alloc status a90bf17e is:
...
Recent Events:
Time Type Description
2021-08-30T11:06:49+08:00 Not Restarting Exceeded allowed attempts 1 in interval 30m0s and mode is "fail"
2021-08-30T11:06:49+08:00 Terminated Exit Code: 0
2021-08-30T11:00:28+08:00 Started Task started by client
2021-08-30T11:00:27+08:00 Restarting Task restarting in 1ns
2021-08-30T11:00:27+08:00 Terminated Exit Code: 0
2021-08-30T10:59:50+08:00 Started Task started by client
2021-08-30T10:59:50+08:00 Task Setup Building Task Directory
2021-08-30T10:59:50+08:00 Received Task received by client
The output of nomad alloc status 768c9fbb is:
...
Recent Events:
Time Type Description
2021-08-30T11:06:54+08:00 Started Task started by client
2021-08-30T11:06:54+08:00 Task Setup Building Task Directory
2021-08-30T11:06:54+08:00 Received Task received by client
This shows that nginx was indeed rescheduled.
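The event timestamps also let us check the configured reschedule delay. The short sketch below subtracts the "Not Restarting" time of a90bf17e from the "Received" time of 768c9fbb, both copied from the output above; the helper name seconds_between is only for illustration.

```python
from datetime import datetime


def seconds_between(start: str, end: str) -> float:
    """Seconds elapsed between two ISO 8601 timestamps with UTC offsets."""
    return (datetime.fromisoformat(end) - datetime.fromisoformat(start)).total_seconds()


failed_at = "2021-08-30T11:06:49+08:00"    # "Not Restarting" on a90bf17e
received_at = "2021-08-30T11:06:54+08:00"  # "Received" on 768c9fbb
print(seconds_between(failed_at, received_at))  # 5.0
```

The 5-second gap matches delay = "5s" in the reschedule block.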
3. Testing driver=raw_exec
3.1 Verifying restart/reschedule
References: restart, reschedule
Write a simple program, main.c:
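#include <stdio.h>
#include <unistd.h> /* sleep() */
int main(void)
{
    /* Sleep forever so the process stays alive until it is killed. */
    while (1) {
        sleep(10);
    }
    return 0;
}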
Compile it with gcc main.c -o main.out.
Copy main.out to /usr with sudo cp main.out /usr/.
Add the following to /etc/nomad.d/nomad.hcl:
plugin "raw_exec" {
  config {
    enabled = true
  }
}
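Complete the steps above on all three VMs, then restart the Nomad agent on each so that the new configuration is loaded.
Create exec.nomad as follows: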
job "exec" {
  datacenters = ["home"]
  type = "batch"
  group "execg" {
    count = 1
    restart {
      attempts = 1
      interval = "30m"
      delay    = "0s"
      mode     = "fail"
    }
    reschedule {
      attempts       = 15
      interval       = "1h"
      delay          = "5s"
      delay_function = "constant"
      unlimited      = false
    }
    task "exect" {
      driver = "raw_exec"
      config {
        command = "/usr/main.out"
      }
    }
  }
}
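Run nomad job run exec.nomad; the main.out process is now running on VM 1.
3.1.1 restart
Run sudo kill -9 [PID] to kill the main.out process; the PID can be found with ps -ef.
After killing it, run ps -ef again: the PID of main.out has changed, which shows that the process was restarted.
The output of nomad job status exec is: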
ubuntu1@ubuntu1$ nomad job status exec
ID = exec
Name = exec
Submit Date = 2021-08-30T16:31:06+08:00
Type = batch
Priority = 50
Datacenters = dc1
Namespace = default
Status = running
Periodic = false
Parameterized = false
Summary
Task Group Queued Starting Running Failed Complete Lost
execg 0 0 1 0 0 0
Allocations
ID Node ID Task Group Version Desired Status Created Modified
e18abb7e d7dc7cb2 execg 0 run running 26m47s ago 44s ago
Take the allocation ID e18abb7e from the Allocations table and run nomad alloc status e18abb7e:
ubuntu1@ubuntu1$ nomad alloc status e18abb7e
ID = e18abb7e-4da8-44e6-17e3-2dc7fee4cac7
Eval ID = 10353d31
Name = exec.execg[0]
Node ID = d7dc7cb2
Node Name = ubuntu1
Job ID = exec
Job Version = 0
Client Status = running
Client Description = Tasks are running
Desired Status = run
Desired Description = <none>
Created = 28m3s ago
Modified = 2m ago
Task "exect" is "running"
Task Resources
CPU Memory Disk Addresses
0/100 MHz 34 MiB/300 MiB 300 MiB
Task Events:
Started At = 2021-08-30T08:57:33Z
Finished At = N/A
Total Restarts = 1
Last Restart = 2021-08-30T16:57:33+08:00
Recent Events:
Time Type Description
2021-08-30T16:57:33+08:00 Started Task started by client
2021-08-30T16:57:33+08:00 Restarting Task restarting in 1ns
2021-08-30T16:57:33+08:00 Terminated Exit Code: 137, Signal: 9
2021-08-30T16:31:31+08:00 Started Task started by client
2021-08-30T16:31:31+08:00 Task Setup Building Task Directory
2021-08-30T16:31:31+08:00 Received Task received by client
The Recent Events show that the main.out process was indeed restarted.
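The "Exit Code: 137, Signal: 9" entry follows the common convention that a process killed by signal N is reported with exit code 128 + N, so 137 corresponds to signal 9 (SIGKILL), the signal sent by kill -9. A minimal sketch of that arithmetic, assuming a Linux host and using the hypothetical helper name describe_exit:

```python
import signal


def describe_exit(code: int) -> str:
    """Explain an exit code using the 128 + signal-number convention."""
    if code > 128:
        try:
            return f"killed by {signal.Signals(code - 128).name}"
        except ValueError:
            pass  # code - 128 is not a valid signal number on this platform
    return f"exited with status {code}"


print(describe_exit(137))  # killed by SIGKILL
print(describe_exit(0))    # exited with status 0
```

This also explains the difference from the Docker test above, where stopping nginx was reported as "Exit Code: 0".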
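3.1.2 reschedule
Continuing from the previous test, kill the main.out process a second time. VM 1 no longer runs main.out; after a 5-second delay the job moves to VM 2.
The output of nomad job status exec is: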
ubuntu1@ubuntu1$ nomad job status exec
ID = exec
Name = exec
Submit Date = 2021-08-30T16:31:06+08:00
Type = batch
Priority = 50
Datacenters = dc1
Namespace = default
Status = running
Periodic = false
Parameterized = false
Summary
Task Group Queued Starting Running Failed Complete Lost
execg 0 0 1 1 0 0
Allocations
ID Node ID Task Group Version Desired Status Created Modified
55eeaf58 a9470c7f execg 2 run running 46s ago 46s ago
e18abb7e d7dc7cb2 execg 0 stop failed 30m34s ago 46s ago
A new allocation, 55eeaf58, has been added, and the earlier allocation e18abb7e has failed and stopped.
The output of nomad alloc status e18abb7e is:
...
Recent Events:
Time Type Description
2021-08-30T16:57:36+08:00 Not Restarting Exceeded allowed attempts 1 in interval 30m0s and mode is "fail"
2021-08-30T16:57:36+08:00 Terminated Exit Code: 137, Signal: 9
2021-08-30T16:57:33+08:00 Started Task started by client
2021-08-30T16:57:33+08:00 Restarting Task restarting in 1ns
2021-08-30T16:57:33+08:00 Terminated Exit Code: 137, Signal: 9
2021-08-30T16:31:31+08:00 Started Task started by client
2021-08-30T16:31:31+08:00 Task Setup Building Task Directory
2021-08-30T16:31:31+08:00 Received Task received by client
The output of nomad alloc status 55eeaf58 is:
...
Recent Events:
Time Type Description
2021-08-30T17:01:19+08:00 Started Task started by client
2021-08-30T17:01:19+08:00 Task Setup Building Task Directory
2021-08-30T17:01:19+08:00 Received Task received by client
This shows that the main.out process was indeed rescheduled.