The physical network
A cloud needs a "physical" network using real switching routers to transport data between the physical nodes.
The physical network has the following layout:
Each floating pool is implemented as one vlan on the physical network infrastructure. Each flat network also requires one vlan, and the VXLAN transport uses one additional vlan on the physical network.
VXLAN adds additional headers (outer IP and UDP) to every packet. You must therefore increase the MTU of the VXLAN transport vlan to a higher value; we take 1600 here. Do not follow the "recommendations" to reduce the MTU of the VMs using the DHCP server. Such a "recommendation" only leads to big problems in a production environment. Do not use a reduced MTU!
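To see why 1600 is sufficient: VXLAN encapsulation adds an outer IP header (20 bytes for IPv4), a UDP header (8 bytes), the VXLAN header (8 bytes) and the encapsulated inner Ethernet header (14 bytes) on top of the VM payload. A quick back-of-the-envelope check (a sketch, assuming the default VM MTU of 1500 and IPv4 transport):

# overhead added by VXLAN encapsulation on top of the VM MTU
OUTER_IP=20; UDP=8; VXLAN=8; INNER_ETH=14
VM_MTU=1500
echo $(( VM_MTU + OUTER_IP + UDP + VXLAN + INNER_ETH ))
# prints 1550 - an MTU of 1600 on the transport vlan leaves comfortable headroom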
The configuration of the physical switching router could look like this (in a pseudo configuration language):
vlan 100
  name Floating-Pool-1
vlan 101
  name Floating-Pool-2
vlan 200
  name Flat-Net-1
vlan 201
  name Flat-Net-2
# set the mtu to 1600 for vlan 4000
vlan 4000
  name vxlan
  mtu 1600
# do not use vlan 1 for untagged packets
vlan 4090
  name native-vlan
# ############
#
interface vlan 100
  description Floating-Network-1
  ip address 198.18.0.1/20
interface vlan 101
  description Floating-Network-2
  ip address 198.18.16.1/20
interface vlan 200
  description Flat-Network-1
  ip address 198.19.1.1/24
interface vlan 201
  description Flat-Network-2
  ip address 198.19.2.1/24
# a L3 interface for the vxlan vlan may be added
# #############
#
# the ports to the nodes (network and compute)
# they use the same config!
# just one link to each node - multiple links using LACP may also be used
interface port1
  description to-network-node
  mode trunk
  trunk native vlan 4090
  trunk vlan 100,101,200,201,4000,4090
  mtu 1600
interface port2
  description to-compute-node
  mode trunk
  trunk native vlan 4090
  trunk vlan 100,101,200,201,4000,4090
  mtu 1600
The network setup of the nodes
The basic network setup of compute and network nodes is exactly the same. On each node, a basic network configuration of the operating system must be provided by the user. This configuration connects the VM transport network on the node to the physical network. The network setup of a node is shown in the drawing below:
The cool part of this network setup is that only ONE physical interface (or one LACP bundle) is used to transport VM traffic. This interface is connected to br-uplink. If you read the Openstack manuals and believe everything written there, you would end up using FOUR physical interfaces; the list below shows why the extra interfaces are not needed, and a verification sketch follows the list:
- one physical interface for VXLAN transport. This is not true: VXLAN transport only requires an IP address in the default network namespace of the operating system. We choose an Openvswitch internal port on the Openvswitch bridge br-uplink and attach the IP address to it. The MTU of this interface must be set to a higher value; we use 1600.
- one link to the physical network via br-vlan (or br-eth1 in the Openstack documentation) using a physical interface for the Openstack ML2 "vlan" type driver. This is not true: we use an Openvswitch patch port as the uplink. This patch port connects br-vlan, which is used by the Openstack ML2 mechanism driver, to br-uplink.
- two interfaces to map the two floating pools to two L3 agents, each using an Openvswitch bridge connected to the physical network through a physical ethernet interface. This is not true: L3 agents are clever enough to avoid any additional bridges, and one L3 agent may manage multiple floating pools (or external networks, in Openstack terminology).
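To double-check the ONE-interface claim on a running node: after the setup script shown further below has been applied, br-uplink should carry exactly three ports (with the names used in this article):

ovs-vsctl list-ports br-uplink
# expected output with the names used in this article:
#   eth1            the single physical uplink
#   l3vxlan         the internal port holding the VXLAN transport IP
#   patch-to-vlan   the patch port towards br-vlan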
A part of the network configuration must be provided by the admin of the nodes at system startup. It must be placed in the network configuration of the operating system; the details depend on the Linux distribution.
The shell code for the network setup is (the management network is omitted):
#
# the bridge which connects the nodes to the transport network
ovs-vsctl add-br br-uplink
# the bridge used by Openstack Neutron to connect vlans and flatdhcp networks
ovs-vsctl add-br br-vlan
# the integration bridge used by Openstack
ovs-vsctl add-br br-int
#
# add the uplink (carrying the dot1q tags 100,101,200,201,4000)
# we assume that eth1 is the uplink interface
ip link set dev eth1 up
# set the mtu of the physical uplink to the switch
ip link set dev eth1 mtu 1600
#
# disable gro and lro on the uplink !!
ethtool -K eth1 gro off
ethtool -K eth1 lro off
#
# for intel NICs: enable udp port hashing to distribute traffic to different queues
ethtool -N eth1 rx-flow-hash udp4 sdfn
#
ovs-vsctl add-port br-uplink eth1 -- set port eth1 vlan_mode=trunk trunks=100,101,200,201,4000
#
# patch ports between br-uplink and br-vlan
ovs-vsctl add-port br-vlan patch-to-uplink -- set Interface patch-to-uplink type=patch options:peer=patch-to-vlan
ovs-vsctl add-port br-uplink patch-to-vlan -- set Interface patch-to-vlan type=patch options:peer=patch-to-uplink
#
# !! on br-uplink the vlan tags allowed on the patch port from br-vlan must be filtered using Openflow rules
# !! if this is not done, there is a risk that vlans from the infrastructure get mixed with the local vlans
# !! of br-int, if the neutron Openvswitch agent fails to set up the vlan mapping on br-vlan or br-int
# TBD - one possible realization is sketched after this block
###
# create the Linux IP interface required for VXLAN transport
# this interface is attached to vlan 4000 of br-uplink
# XXX = last octet of the VXLAN interface ip address of the node
ovs-vsctl add-port br-uplink l3vxlan tag=4000 -- set Interface l3vxlan type=internal
ip addr add 10.255.255.XXX/24 dev l3vxlan
ip link set dev l3vxlan up
# set the mtu of the logical vxlan interface
ip link set dev l3vxlan mtu 1600
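One possible realization of the filtering marked TBD above, sketched with the port names used in this article (an assumption to adapt and test, not a drop-in ruleset):

# look up the Openflow port number of the patch port on br-uplink
PATCH_PORT=$(ovs-vsctl get Interface patch-to-vlan ofport)
# pass only the vlans Neutron is allowed to use towards the physical network
for v in 100 101 200 201; do
  ovs-ofctl add-flow br-uplink "priority=10,in_port=${PATCH_PORT},dl_vlan=${v},actions=NORMAL"
done
# drop everything else arriving from br-vlan, including vlan 4000 and untagged frames
ovs-ofctl add-flow br-uplink "priority=5,in_port=${PATCH_PORT},actions=drop"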
Do not enable ip_forwarding on any of the nodes. This is not necessary, even if the Openstack documentation declares this as a prerequisite.
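A quick way to check that a distribution default has not enabled it (a sketch using the standard sysctl tool):

# expect "net.ipv4.ip_forward = 0" on compute and network nodes
sysctl net.ipv4.ip_forward
# switch it off explicitly if a distribution default enabled it
sysctl -w net.ipv4.ip_forward=0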
Neutron in Openstack Liberty does not require br-ex to get a functional L3 agent providing router services. The default setup in the Openstack documentation uses br-ex, but there are other ways to implement networking, e.g. the one shown in this article series.
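The option behind this is external_network_bridge in l3_agent.ini: left empty, the L3 agent attaches external networks through the integration bridge instead of br-ex, and together with an empty gateway_external_network_id one agent can serve multiple external networks. A minimal sketch, assuming the option names of the Liberty release (part 3 covers the full Neutron setup):

# /etc/neutron/l3_agent.ini (sketch)
[DEFAULT]
# no dedicated external bridge such as br-ex
external_network_bridge =
# empty value allows this agent to serve multiple external networks
gateway_external_network_id =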
Continue reading in part 3: the Openstack Neutron setup