Neutron is one of the core OpenStack projects and provides virtual networking for cloud environments. Its feature set keeps growing, and the module is already integrated into the Horizon dashboard. As one of Neutron's core developers, I believe it is inevitable that Neutron will completely replace the Nova Network module as the networking hub of an OpenStack cloud. To use OpenStack well, it is therefore important to understand Neutron's concepts and the corresponding operations. This post covers the basic concepts of Neutron networking, network planning, and how to use Neutron's networking features from Horizon.

Basic Neutron concepts

  1. Network

To most people, a network is just cables and the ports the cables plug into, provided by some box. To a network engineer, those boxes are switches and routers. So in the physical world a network can loosely be thought of as cables, switches, and routers. Besides the physical gear there are also the soft parts: IP addresses, the configuration and management software for switches and routers, and the various network protocols. Managing a physical network well takes deep networking expertise and experience.

Neutron's goal is to partition the physical network (to make an OpenStack cloud more flexible) so that, in a multi-tenant environment, every tenant gets an independent network environment, and it exposes an API to achieve this. In Neutron, a "network" is an object that users can create. Mapped onto the physical world, it behaves like a huge switch with an essentially unlimited number of virtual ports that can be created and destroyed on demand.

  2. Port

In a physical network, a port is where a device plugs into the network. A port in Neutron plays a similar role: it is the attachment point where routers and virtual machines connect to a network.

  3. Router

Like its physical counterpart, a Neutron router is a component that selects routes and forwards traffic. The difference is that in Neutron it is a software component that can be created and destroyed at will.

  4. Subnet

Simply put, a subnet is an address pool made up of a group of IP addresses. Communication between different subnets requires a router; in this respect Neutron and physical networking agree. In Neutron, a subnet belongs to a network.

 

A typical Neutron network layout

With so many concepts, is Neutron cumbersome to use? Not really; once we sort things out, the basic operations are quite simple.

First, let's plan the network. A typical Neutron network layout looks like the figure below:

Figure: a typical Neutron network layout

In this figure there is an external network connected to the Internet (any other network, such as an office network, could take its place). The external network is the path through which tenant VMs reach the Internet and through which the Internet reaches the VMs. The external network has one subnet, a set of IP addresses that are routable on the Internet. Usually there is only one external network (Neutron does support several), and it is created by the administrator. Tenant networks can be created freely by tenants. When the VMs on a tenant network need to talk to the external network and the Internet, the tenant needs a router. A router has two kinds of arms: a gateway arm and interface arms. There is exactly one gateway arm, which connects to the external network. There can be several interface arms, each connecting to a subnet of a tenant network. With this plan in place, the steps to implement it are:

  1. The administrator obtains a block of Internet-routable IP addresses and creates an external network and subnet
  2. A tenant creates a network and subnet
  3. The tenant creates a router and connects it to the tenant subnet and the external network
  4. The tenant launches virtual machines

Working with networks in Horizon

Now that the network is planned, we can carry it out in Horizon.

The administrator creates the external network

As mentioned, the external network must be created by the administrator. Suppose the external network we have been given is 20.0.2.0/24, the usable range is 20.0.2.10-20.0.2.20, and the gateway is 20.0.2.1. Logged in as the administrator, we now create the external network and its subnet in Horizon:

  1. Log in as the administrator, open the Admin panel, and click "Networks" to show the current list of networks.
  2. Click the "Create Network" button to open the "Create Network" dialog. Enter the network name ExternalNet, check "External Network", and click "Create Network".
  3. The newly created network now appears in the list.
  4. Click "ExternalNet" to view the network's details.
  5. On the network detail page, click "Create Subnet" to open the dialog for creating the external network's subnet.

This is a tabbed dialog. On the "Subnet" tab, fill in the subnet name and the subnet CIDR (Network Address). On the "Subnet Detail" tab, fill in the allocation pool (the address range).

Note: since no VMs attach directly to the external network, "Enable DHCP" can be left unchecked. Finally, click "Create" in the subnet dialog to finish creating the subnet.
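For readers who prefer the command line, roughly the same result can be obtained with the classic neutron client; a minimal sketch (not part of the original Horizon walkthrough, names are examples):

neutron net-create ExternalNet --router:external=True
neutron subnet-create ExternalNet 20.0.2.0/24 --name ext-subnet \
  --allocation-pool start=20.0.2.10,end=20.0.2.20 \
  --gateway 20.0.2.1 --disable-dhcp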

Tenant demo creates a tenant network

Now log in to Horizon as an ordinary tenant and click "Networks":

Click "Create Network" to open the tenant network creation dialog:

This is clearly a multi-tab dialog, different from the one the administrator used earlier. Here we can create the network and its subnet in one go: fill in the network name on the "Network" tab and the CIDR (Network Address) on the "Subnet" tab:

The "Subnet Detail" tab lets us specify an IP address pool; by default Neutron uses the whole range defined by the CIDR, so we can ignore this tab. Click "Create" to finish creating the tenant network and subnet.
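The equivalent tenant-side CLI, as a rough sketch (the network name, subnet name, and CIDR are placeholders chosen to match this example):

neutron net-create TenantNet
neutron subnet-create TenantNet 10.0.1.0/24 --name tenant-subnet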

The tenant creates a router

Select "Routers"; the system shows the router list page:

  1. Create the router

Click "Create Router" on the right to open the router creation dialog:

The dialog is extremely simple: just enter a name for the router and click "Create Router".

  2. Configure the gateway arm

Once created, the router appears in the router list:

Click "Set Gateway" to configure the router's gateway arm:

As shown above, select the external network "ExternalNet" created by the administrator, then click "Set Gateway".

  3. Configure the interface arms

The figure below shows the router the tenant just created, together with its gateway arm:

Now click the router's name to view its details:

To add an interface arm, click the "Add Interface" button to open the "Add Interface" dialog:

In this dialog, simply select the subnet to connect and click "Add Interface".

The tenant's router is now connected to both the external network and the tenant subnet. Next, the tenant can launch virtual machines, which can then reach the external network and even the Internet through this router.
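The same three router operations can be sketched with the neutron CLI (the router name is hypothetical; the network and subnet names follow the sketches above):

neutron router-create demo-router
neutron router-gateway-set demo-router ExternalNet
neutron router-interface-add demo-router tenant-subnet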

The tenant launches a virtual machine

We will not walk through the whole VM creation process here; we only look at how the network is chosen during it:

As the figure shows, the networks the tenant can choose from include the network the tenant just created. Click the "+" next to a network, or drag it into the "Selected Networks" box, and the VM being launched will use that network.
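On the command line the network is selected with the --nic option of nova boot; a sketch (flavor, image, VM name, and the network UUID are placeholders):

nova boot --flavor m1.tiny --image cirros --nic net-id=<TenantNet-uuid> demo-vm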

The tenant views the network topology

Finally, I want to show the network topology view that the Horizon engineers put so much effort into:

The topology shows the result of everything we did above:

  1. In green, the external network created by the administrator
  2. Spanning the external and tenant networks, the router with its gateway arm and interface arm
  3. In blue, the tenant network and its subnet
  4. Finally, the VM created by the tenant

Clicking the VM icon in the topology opens a VNC console for the VM, as shown below:

We can see that this VM obtained the IP address 10.0.1.3 from the tenant network, and that it can reach the Internet through the tenant's own router.

Summary

Does Neutron make network engineers for the physical network unnecessary? No. The traffic of the virtual networks still has to be carried by the physical network; Neutron merely pushes part of traditional network management down to the tenants, who, as shown above, can create their own virtual networks, subnets, routers, and so on. With these virtual networking features, the underlying physical network can offer additional services, for example letting a tenant build a private virtual network that resembles a data-center network.

Neutron provides a fairly complete virtual network model and API for multi-tenant environments. Just as with a physical network, building virtual networks with Neutron requires some basic planning and design. This article presented a typical Neutron network layout and the corresponding workflow in Horizon. In follow-up posts I will cover other Neutron features such as load balancing, firewalls, and VPN.


Analysis of the VLAN network mode in OpenStack

OpenStack Neutron defines four network modes:


# tenant_network_type = local


# tenant_network_type = vlan


# Example: tenant_network_type = gre


# Example: tenant_network_type = vxlan


This article analyzes OpenStack's network modes in detail, mainly using vlan as the example and comparing it with local.


1. local mode


This mode is mainly used for testing and only works in single-node (all-in-one) deployments, because in this mode traffic cannot leave through a real physical NIC: Neutron's integration bridge is not mapped to any physical NIC, so only VMs on the same host can reach each other. See the RDO and Neutron configuration files below.
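This can be checked directly on the host; a quick sketch, assuming the standard Open vSwitch tools are installed:

ovs-vsctl show                # in local mode br-int is not patched to any br-ethX or physical NIC
ovs-vsctl list-ports br-int   # only the tap/qvo ports of local VMs show up, no phy-* peer ports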


(1) RDO configuration file (answer.conf)


The key options are the two below, which default to empty.


CONFIG_NEUTRON_OVS_BRIDGE_MAPPINGS


This controls which bridge Open vSwitch's default bridge, br-int, is mapped to. Precisely because br-int is not mapped to any bridge or interface, only the VMs attached to the same br-int can reach one another.


CONFIG_NEUTRON_OVS_BRIDGE_IFACES


This option determines which physical NIC traffic finally leaves through.


# Type of network to allocate for tenant networks (eg. vlan, local,


# gre)


CONFIG_NEUTRON_OVS_TENANT_NETWORK_TYPE=local


# A comma separated list of VLAN ranges for the Neutron openvswitch


# plugin (eg. physnet1:1:4094,physnet2,physnet3:3000:3999)


CONFIG_NEUTRON_OVS_VLAN_RANGES=


# A comma separated list of bridge mappings for the Neutron


# openvswitch plugin (eg. physnet1:br-eth1,physnet2:br-eth2,physnet3


# :br-eth3)


CONFIG_NEUTRON_OVS_BRIDGE_MAPPINGS=


# A comma separated list of colon-separated OVS bridge:interface


# pairs. The interface will be added to the associated bridge.


CONFIG_NEUTRON_OVS_BRIDGE_IFACES=


(2) Neutron configuration file (/etc/neutron/plugins/openvswitch/ovs_neutron_plugin.ini)


[ovs]


# (StrOpt) Type of network to allocate for tenant networks. The


# default value 'local' is useful only for single-box testing and


# provides no connectivity between hosts. You MUST either change this


# to 'vlan' and configure network_vlan_ranges below or change this to


# 'gre' or 'vxlan' and configure tunnel_id_ranges below in order for


# tenant networks to provide connectivity between hosts. Set to 'none'


# to disable creation of tenant networks.


#


tenant_network_type = local


Based on the local setting in answer.conf, RDO sets the corresponding option in Neutron's Open vSwitch configuration file to local.


2. vlan mode


VLANs should be familiar to most readers, so we skip the introduction and go straight to the RDO and Neutron configuration files.


(1) RDO configuration file


# Type of network to allocate for tenant networks (eg. vlan, local,


# gre)


CONFIG_NEUTRON_OVS_TENANT_NETWORK_TYPE=vlan // set the tenant network type to vlan


# A comma separated list of VLAN ranges for the Neutron openvswitch


# plugin (eg. physnet1:1:4094,physnet2,physnet3:3000:3999)


CONFIG_NEUTRON_OVS_VLAN_RANGES=physnet1:100:200 // allow VLAN IDs 100-200


# A comma separated list of bridge mappings for the Neutron


# openvswitch plugin (eg. physnet1:br-eth1,physnet2:br-eth2,physnet3


# :br-eth3)


CONFIG_NEUTRON_OVS_BRIDGE_MAPPINGS=physnet1:br-eth1 // map br-int to the bridge br-eth1 (the veth pair phy-br-eth1/int-br-eth1 is created automatically to connect br-int and br-eth1)


# A comma separated list of colon-separated OVS bridge:interface


# pairs. The interface will be added to the associated bridge.


CONFIG_NEUTRON_OVS_BRIDGE_IFACES=br-eth1:eth1 // attach eth1 to br-eth1, so traffic finally leaves through eth1 (ovs-vsctl add-port br-eth1 eth1 is run automatically)


The bridge-to-bridge and bridge-to-NIC mappings and connections described by this configuration are best understood together with Figure 1 (network device topology on a compute node in vlan mode) and Figure 2 (network device topology on the network node in vlan mode).


A common situation: the physical machine has only one NIC, or it has two NICs but only one of them has a cable plugged in.


In that case the following configurations can be used.


(2) Single NIC:


CONFIG_NEUTRON_OVS_BRIDGE_MAPPINGS=physnet1:br-eth0 // map br-int to the bridge br-eth0


# A comma separated list of colon-separated OVS bridge:interface


# pairs. The interface will be added to the associated bridge.


CONFIG_NEUTRON_OVS_BRIDGE_IFACES= // left empty


This configuration maps br-int to br-eth0, but br-eth0 is not bound to a real physical NIC. You therefore have to create br-eth0 yourself on every compute node (and network node) beforehand, add eth0 to br-eth0, and configure the host IP on br-eth0. During installation RDO then only has to set up the link between br-int and br-eth0, and the whole path is connected.
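A sketch of the manual preparation described above (the interface name and IP address are examples; run this on each node before the RDO install):

ovs-vsctl add-br br-eth0
ovs-vsctl add-port br-eth0 eth0
# move the host IP from eth0 to br-eth0 (example address)
ip addr flush dev eth0
ip addr add 192.168.1.10/24 dev br-eth0
ip link set br-eth0 up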


Note that if the network node also has only one NIC, the floating IP feature may no longer be usable.


(3) Two NICs, one cable


CONFIG_NEUTRON_OVS_BRIDGE_MAPPINGS=physnet1:br-eth1 // map br-int to the bridge br-eth1


# A comma separated list of colon-separated OVS bridge:interface


# pairs. The interface will be added to the associated bridge.


CONFIG_NEUTRON_OVS_BRIDGE_IFACES=br-eth1:eth1 // keep the default and attach eth1 to br-eth1


Everything is still configured on eth1 as usual; the traffic on eth1 is then forwarded to eth0 with iptables (I have not tested this, so I am not sure it works).


3. vlan mode in detail






Figure 1: network device topology on a compute node in vlan mode


Let us first look at the topology of the virtual network devices on a compute node in vlan mode.


(1) The qbrXXX devices


As mentioned earlier, these Linux bridges exist mainly because network ACL rules cannot be applied directly on the tap device vnet0.


(2) The qvbXXX/qvoXXX devices


These are a veth pair used to connect the Linux bridge to the OVS switch. Guessing from the names: q = quantum, v = veth, b = bridge, o = open vswitch (a leftover from the Quantum era).


(3) int-br-eth1 and phy-br-eth1


This is also a veth pair, connecting br-int and br-eth1. VLAN ID translation happens here as well: for example, packets coming in through int-br-eth1 with vlan id 101 are rewritten to vlan id 1, and likewise packets going out through phy-br-eth1 have their vlan id rewritten from 1 to 101.


(4) br-eth1 and eth1


For packets to reach the physical network they ultimately have to go through the real physical NIC eth1, so eth1 is added to br-eth1 to complete the path.




Figure 2: network device topology on the network node in vlan mode


Compared with a compute node, the network node additionally has the external network, the L3 agent, and the DHCP agent.


(1) Network namespaces


Each L3 router corresponds to a private network. How can overlapping private IP address ranges coexist without interfering with each other? This is where the Linux kernel's network namespaces come in.


(2) The qr-YYY and qg-VVV devices (q = quantum, r = router, g = gateway)


qr-YYY carries an internal IP and qg-VVV carries an external IP; NAT between them is implemented with iptables rules.
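A quick way to look at this on the network node (the router UUID below is a placeholder):

ip netns list                                    # shows the qrouter-<uuid> and qdhcp-<uuid> namespaces
ip netns exec qrouter-<uuid> ip addr             # the qr-YYY and qg-VVV interfaces live inside the router namespace
ip netns exec qrouter-<uuid> iptables -t nat -S  # the SNAT/DNAT rules for the gateway and floating IPs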


Question: what are phy-br-ex and int-br-ex for?


Keep in mind that "packets only get anywhere over an actual link": even though NAT maps qr-YYY to qg-VVV, the traffic still has to travel over a real path, and phy-br-ex and int-br-ex provide exactly that link between br-int and br-ex.


Original article (Chinese): http://blog.csdn.net/ustc_dylan/article/details/17224943#1536434-tsina-1-92100-66a1f5d8f89e9ad52626f6f40fdeadaa





Openstack Neutron using VXLAN


The networking part of Openstack provides several models for the L2 networks to be used as L2 tenant broadcast domains. An overlay network can be used to separate the MAC addresses and “vlans” of the tenant networks from the transport layer network.

Openstack neutron in Havana supports two overlay network technologies, GRE and VXLAN. VXLAN is my preferred solution, because it provides more entropy on the receiving NIC, which results in higher performance because multiple CPU cores can be used to process ingress packets.

In this article I’ll show the implementation of VXLAN on three nodes. Two nodes are used as compute hosts, one node is used as the network node. On the network node (NN) several Neutron agents are running:

  • L3 agent: responsible for building tenant routers using Linux network namespaces
  • DHCP agent: responsible for building DHCP servers for tenant networks using Linux network namespaces
  • L2 (OVS) agent: configures and provisions the OVS

The Neutron metadata service is also deployed to provide cloud-init support for launched VMs.

On the compute nodes (CN), only the L2 (OVS) agent is necessary.

A typical Openstack deployment uses one OVS instance, br-int, as the attachment point for all VMs, DHCP servers, and the "non default gateway" side of all routers. br-int uses classic VLANs to separate the broadcast domains.

br-tun is a second OVS instance, used to provide the VXLAN function. It is connected to br-int via an internal link. This link is a trunk port that uses dot1q tagging to transport vlan ids.

When configuring Openstack Neutron (Havana without ML2), I recommend changing the tunnel id range in the Neutron config to:

tunnel_id_ranges = 65537:69999

Changing this from the default values, which are below 4096, has the great advantage that vlan ids are easy to distinguish from vxlan tunnel ids. This helps when reading the Openflow rules Neutron provisions on br-tun. Why use 65537 as the first tunnel id? 65537 in hex is 0x10001, and the OVS shows tunnel ids as hex values, so they are easy to read.
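A quick shell check of the hex form:

printf '0x%x\n' 65537    # prints 0x10001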

When using Openstack Icehouse on Ubuntu with ML2, the openvswitch plugin is no longer used. Any openvswitch configuration must be placed in the ML2 plugin config file, in the [ovs] section.

/etc/neutron/plugins/ml2/ml2_conf.ini

[ml2]

type_drivers = vxlan,local

tenant_network_types = vxlan

mechanism_drivers = openvswitch

[ml2_type_vxlan]

vni_ranges = 65537:69999

[ml2_type_gre]

tunnel_id_ranges = 32769:34000

[ovs]

local_ip = <set IP for the tunnelinterface>

tunnel_type = vxlan

tunnel_bridge = br-tun

integration_bridge = br-int

tunnel_id_ranges = 65537:69999

tenant_network_type = vxlan

enable_tunneling = true

[agent]

root_helper = sudo neutron-rootwrap /etc/neutron/rootwrap.conf

tunnel_types = vxlan

vxlan_udp_port = 4789

l2_population = False

[securitygroup]

enable_security_group = True

firewall_driver = neutron.agent.linux.iptables_firewall.OVSHybridIptablesFirewallDriver

My recommendation is to use distinct ranges for vlan ids, vxlan tunnel ids (VNIs), and gre ids.

This leads to the setup shown in the drawing below (the second compute node is omitted), which uses eth1 as the IP interface transporting the VXLAN traffic and br-ex to attach the routers to the public address pool 198.18.1.0/24.

Openstack Basic Setup

On the VXLAN transport network it is very important to use jumbo frames. If you choose an MTU of 1600 instead of the default 1500, you are on the safe side. The VXLAN overlay adds an outer UDP/IP header on top of a full-sized IP packet. Without jumbo frames the throughput of GRE and VXLAN overlays drops to less than 100 MBit/s, instead of 10 GBit/s and more. This drop is caused by the packet fragmentation and reassembly the CPUs on the sending and receiving side of the tunnel have to perform.
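A minimal way to raise the MTU on the VXLAN transport interface (the interface name is an example; make the setting persistent via your distribution's network configuration):

ip link set dev eth1 mtu 1600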

br-tun Openflow rules

Now we take a look at br-tun. First we list its ports using the command ovs-ofctl show br-tun, followed by a grep to filter out unneeded output.

ovs-ofctl show br-tun | grep -v -E "REPLY|n_tables|capabilities:|actions:|config:|state:"

1(patch-int): addr:6e:72:61:71:d9:02

speed: 0 Mbps now, 0 Mbps max

2(vxlan-2): addr:ee:bb:80:d1:90:0a

speed: 0 Mbps now, 0 Mbps max

3(vxlan-3): addr:f2:be:77:6f:66:e6

speed: 0 Mbps now, 0 Mbps max

LOCAL(br-tun): addr:1e:8c:fc:76:b2:4b

speed: 0 Mbps now, 0 Mbps max

br-tun has three ports:

  • port 1 is the connection to br-int
  • port 2 is the connection to compute node 1
  • port 3 is the connection to compute node 2
  • port LOCAL is an internal OVS port

The openflow rules can be shown using the command ovs-ofctl dump-flows br-tun:

ovs-ofctl dump-flows br-tun

NXST_FLOW reply (xid=0x4):

cookie=0x0, duration=1806.214s, table=0, n_packets=0, n_bytes=0, idle_age=1806, priority=1,in_port=1 actions=resubmit(,1)

cookie=0x0, duration=1044.587s, table=0, n_packets=0, n_bytes=0, idle_age=1044, priority=1,in_port=3 actions=resubmit(,3)

cookie=0x0, duration=1320.063s, table=0, n_packets=0, n_bytes=0, idle_age=1320, priority=1,in_port=2 actions=resubmit(,3)

cookie=0x0, duration=1806.18s, table=0, n_packets=0, n_bytes=0, idle_age=1806, priority=0 actions=drop

cookie=0x0, duration=1806.114s, table=1, n_packets=0, n_bytes=0, idle_age=1806, priority=0,dl_dst=01:00:00:00:00:00/01:00:00:00:00:00 actions=resubmit(,21)

cookie=0x0, duration=1806.146s, table=1, n_packets=0, n_bytes=0, idle_age=1806, priority=0,dl_dst=00:00:00:00:00:00/01:00:00:00:00:00 actions=resubmit(,20)

cookie=0x0, duration=1806.082s, table=2, n_packets=0, n_bytes=0, idle_age=1806, priority=0 actions=drop

cookie=0x0, duration=1806.049s, table=3, n_packets=0, n_bytes=0, idle_age=1806, priority=0 actions=drop

cookie=0x0, duration=1806.017s, table=10, n_packets=0, n_bytes=0, idle_age=1806, priority=1 actions=learn(table=20,hard_timeout=300,priority=1,NXM_OF_VLAN_TCI[0..11],NXM_OF_ETH_DST[]=NXM_OF_ETH_SRC[],load:0->NXM_OF_VLAN_TCI[],load:NXM_NX_TUN_ID[]->NXM_NX_TUN_ID[],output:NXM_OF_IN_PORT[]),output:1

cookie=0x0, duration=1805.985s, table=20, n_packets=0, n_bytes=0, idle_age=1805, priority=0 actions=resubmit(,21)

cookie=0x0, duration=1805.953s, table=21, n_packets=0, n_bytes=0, idle_age=1805, priority=0 actions=drop

The Openflow rules shown use several tables. Table 0 is processed first. Each table contains rules with different priorities, and the rules with the highest priority are checked first. At this stage the rules do not do much yet.

The output below shows the checks done for a unicast packet entering from port=1.

Figure: Openflow rules of br-tun after the installation

As we have seen above, port 1 is the port to br-int. Any packet entering port 1 is resubmitted to table 1. The first rule of table 1 checks whether the packet is a multicast or broadcast ethernet packet. As we have a unicast packet, the rule does not match. The next rule is true for all unicast ethernet packets, so the packet is resubmitted to table 20. At this stage there is only one rule in table 20, and its action is resubmit to table 21. In table 21 the packet is dropped, because the action is drop.

At this stage, br-tun drops all traffic entering on one of the three ports. This is fine, as we did not configure anything as a tenant within Openstack.

Add a router, one network and one subnet.

Now it's time to add networking from the tenant's perspective within Openstack. The tasks are (a rough CLI sketch follows the list):

  • Add a router
  • Connect the router to the public network by setting a default gateway
  • Add a network as a broadcast domain
  • Add a subnet to the network. Within Openstack a subnet is an IP network.
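A sketch of these steps with the neutron CLI (names and the CIDR are placeholders; "public" stands for the external network, and the last command attaches the subnet to the router as described in the next paragraph):

neutron router-create router1
neutron router-gateway-set router1 public
neutron net-create net1
neutron subnet-create net1 10.0.0.0/24 --name subnet1
neutron router-interface-add router1 subnet1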

The router is deployed as a Linux network namespace on the network node as soon as an interface is connected to it. Do not use any of the agents without network namespaces enabled. The first interface we added to the router in our example is the default gateway.

VXLAN to Vlan mapping

After creating the tenant network, the Openflow rules for br-tun on the nodes look like this:

Figure: Openflow rules on br-tun after creating one tenant network

Two rules have been added:

  • In Openflow table 3 a rule has been added to map the globally unique tunnel_id to the vlan_id, which has only node local scope. Any traffic which matches this rule is forwarded to Openflow table 10. The action of the first rule of Openflow table 10 has two actions. The first one performs MAC learning and inserts the learned MAC addresses together with a vlan_id to tunnel_id mapping and the egress vlxlan port to Openflow table 20. Openflow table 20 is used for egress traffic. The second action is to output the traffic to Openflow port 1, which is the port to br-int.
  • In Openflow table 21 a rule has been added to map the local vlan_id to the global tunnel_id. This rule is only used for broadcast, multicast or unknown unicast traffic. Unknown unicast traffic is flooded by default. These packets are sent by default to ALL nodes in the vxlan mesh. This is a very bad behaviour. The network implementation does not support VLAN pruning.
Start a VM on compute node 1

After starting a VM on compute node 1, the Openflow table on br-tun looks like this:

Figure: Openflow rules on the compute node after learning the MAC addresses of the DHCP server and the router

Openflow table 20 contains two new rules, which have been inserted by the learning action of the Openflow rule in table 10. The two new rules contain the MAC addresses of the router and the DHCP server. Port 2 is the VXLAN tunnel to the network node.

On the network node, a similar learned Openflow rule appears in table 20. This rule contains the MAC address of the VM.
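The learned entries can be inspected directly; a sketch:

ovs-ofctl dump-flows br-tun table=20   # shows the MAC-to-tunnel entries created by the learn() action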

Unicast packet transmission from the VM on the compute node

When the started VM on the compute node transmits a packet with the destination “router MAC address”, the walk through the Openflow rules on br-tun looks like:

Figure: Openflow rule walkthrough when the VM transmits a unicast packet to the router MAC

  • The packet enters br-tun on port 1, which is the connection to br-int. The first prio=1 rule of table 0 matches and the packet is resubmitted to table 1.
  • In table 1 the second rule matches, because the packet is a unicast packet. The packet is resubmitted to table 20, which is the table for learned MAC addresses.
  • In table 20 the first rule with priority=1 matches because of the destination MAC address. The action sets the tunnel_id to 0x10001, sets the vlan_id to zero (which removes the vlan dot1q tag) and transmits the packet through port 2, which is the vxlan virtual port to the network node.

Any packet entering table 20 that does NOT match one of the priority=1 rules is processed by the priority=0 rule of table 20 and resubmitted to table 21. The prio=1 rule in table 21 "floods" the packet to ALL nodes in the mesh using unicast.

Packet sent to the VM

When the started VM on the compute node receives a packet, the walk through the Openflow rules on br-tun looks like:

Figure: Openflow rule walkthrough when a packet is transmitted to the VM

  • The packet enters br-tun on port 2, which is the vxlan connection to the network node. The third prio=1 rule of table 0 matches and the packet is resubmitted to table 3.
  • In table 3 the first rule matches, because the tunnel_id is 0x10001. The vlan_id of the packet is set to 1 (the local vlan_id for this global tunnel_id). The packet is resubmitted to table 10.
  • In table 10 the only rule is processed. This rule has two actions. Action one performs MAC learning: if the sender's MAC address is not yet in the MAC learning Openflow table 20, it is added there together with the tunnel_id to vlan_id mapping. The second action is to transmit the packet out of port 1, which is the port to br-int.
Conclusion

The implementation of the vxlan mesh in Openstack Havana is straightforward. It provides working L2 connectivity when running IPv4. However, it misses features that are delivered even by cheap hardware switches. These are:

  • no Vlan pruning. It is quite simple to overflow the whole infrastructure using broadcast traffic (or multicast traffic). Traffic replication for target systems reachable only via vxlan tunnels is done on the egress Openvswitch. This does not scale in large environments (>> 30 nodes)
  • no IGMP and MLDv2 snooping. Together with the missing Vlan pruning, this makes it impossible to run applications which require IP multicast.

To optimize traffic handling, other features must be implemented to allow scaling to large node counts.

  • MAC preprovisioning (addressed by ML2 and/or Openstack Icehouse) to avoid flooding of traffic
  • ARP preprovisioning to avoid flooding of traffic
  • preprovisioned IPV6 neighbor entries to avoid flooding of traffic
  • use a multicast address to flood traffic (addressed by Openstack Icehouse), but still no Vlan pruning
  • support IGMP and MLDv2 snooping on the OVS
  • support the mapping of multicast traffic in the overlay network to PIM-SSM in the underlay network

Original: http://www.cnblogs.com/popsuper1982/p/3894986.html
