VXLAN
This page provides some information and examples for the FreeBSD VXLAN pseudo device.
Specification
The VXLAN specification is published as RFC 7348.
Repository
The original code is in the projects/vxlan branch of the FreeBSD Subversion repository. It was merged to -CURRENT in r273331.
Examples
Each example below consists of two hosts, vxlan1 and vxlan2, both running FreeBSD, with an em(4) interface on the same subnet.
root@vxlan1:~ # ifconfig em0
em0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
        options=209b<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,VLAN_HWCSUM,WOL_MAGIC>
        ether 52:54:00:ff:06:25
        inet 192.168.100.1 netmask 0xffffff00 broadcast 192.168.100.255
        inet6 fe80::5054:ff:feff:625%em0 prefixlen 64 scopeid 0x2
        nd6 options=29<PERFORMNUD,IFDISABLED,AUTO_LINKLOCAL>
        media: Ethernet autoselect (1000baseT <full-duplex>)
        status: active

root@vxlan2:~ # ifconfig em0
em0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
        options=209b<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,VLAN_HWCSUM,WOL_MAGIC>
        ether 52:54:00:4c:d0:fb
        inet 192.168.100.2 netmask 0xffffff00 broadcast 192.168.100.255
        inet6 fe80::5054:ff:fe4c:d0fb%em0 prefixlen 64 scopeid 0x2
        nd6 options=29<PERFORMNUD,IFDISABLED,AUTO_LINKLOCAL>
        media: Ethernet autoselect (1000baseT <full-duplex>)
        status: active
Since VXLAN encapsulates an existing Ethernet frame in Ethernet/IP/UDP/VXLAN headers, the resulting frame may exceed the standard Ethernet MTU of 1500 bytes. The specification recommends that the physical network be configured to use jumbo frames. Alternatively, the MTU on the vxlan pseudo device can be reduced to accommodate the encapsulation.
In these examples, the em(4) interfaces are configured to use jumbo frames.
root@vxlan1:~ # ifconfig em0 mtu 9000
root@vxlan2:~ # ifconfig em0 mtu 9000
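For the alternative of reducing the vxlan MTU instead, the following is a rough sketch of the arithmetic. It assumes an IPv4 outer header without options and an untagged inner frame. The variable and function names are made up for illustration.

```shell
#!/bin/sh
# Bytes placed in front of the inner payload that count against the
# physical interface MTU. The outer Ethernet header is not part of the
# MTU, so it is left out. Assumes IPv4 outer header, no IP options.
OUTER_IPV4=20   # outer IPv4 header
OUTER_UDP=8     # outer UDP header
VXLAN_HDR=8     # VXLAN header (RFC 7348)
INNER_ETHER=14  # encapsulated Ethernet header, untagged

# vxlan_mtu PHYS_MTU
# Print the largest inner MTU that fits in PHYS_MTU without fragmentation.
vxlan_mtu() {
    echo $(( $1 - OUTER_IPV4 - OUTER_UDP - VXLAN_HDR - INNER_ETHER ))
}

vxlan_mtu 1500   # prints 1450 (standard Ethernet)
vxlan_mtu 9000   # prints 8950 (jumbo frames as configured above)
```

On a physical network left at 1500 bytes, the first result could be applied with something like ifconfig vxlan0 mtu 1450.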
Point to Point Example
VXLAN can be configured to create a tunnel between two hosts. In this example, we will create a 10.10.99/24 network on top of the physical 192.168.100/24 network.
Create the vxlan device on the first host.
root@vxlan1:~ # ifconfig vxlan create vxlanid 42 vxlanlocal 192.168.100.1 vxlanremote 192.168.100.2 inet 10.10.99.1/24
vxlan0
root@vxlan1:~ # ifconfig vxlan0
vxlan0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
        ether 5a:19:09:5e:2b:e3
        inet 10.10.99.1 netmask 0xffffff00 broadcast 10.10.99.255
        inet6 fe80::5819:9ff:fe5e:2be3%vxlan0 prefixlen 64 scopeid 0x4
        nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
        vxlan 42 local 192.168.100.1:4789 remote 192.168.100.2:4789
And then create the vxlan device on the second host.
root@vxlan2:~ # ifconfig vxlan create vxlanid 42 vxlanlocal 192.168.100.2 vxlanremote 192.168.100.1 inet 10.10.99.2/24
vxlan0
root@vxlan2:~ # ifconfig vxlan0
vxlan0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
        ether e6:a3:8a:20:72:30
        inet 10.10.99.2 netmask 0xffffff00 broadcast 10.10.99.255
        inet6 fe80::e4a3:8aff:fe20:7230%vxlan0 prefixlen 64 scopeid 0x4
        nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
        vxlan 42 local 192.168.100.2:4789 remote 192.168.100.1:4789
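The tunnel created by hand above does not survive a reboot. A possible way to make it persistent is an rc.conf(5) fragment along these lines. It is a sketch for host vxlan1 only, and assumes the standard cloned_interfaces and create_args_<interface> variables behave for vxlan(4) as they do for other cloned interfaces. The addresses are the ones used in this example.

```shell
# /etc/rc.conf fragment for host vxlan1 (sketch; swap the two
# 192.168.100.x addresses and use 10.10.99.2/24 on vxlan2)
cloned_interfaces="vxlan0"
create_args_vxlan0="vxlanid 42 vxlanlocal 192.168.100.1 vxlanremote 192.168.100.2"
ifconfig_vxlan0="inet 10.10.99.1/24"
```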
We can use sockstat to see the UDP listening socket created.
root@vxlan1:~ # sockstat -4l
USER     COMMAND    PID   FD PROTO  LOCAL ADDRESS         FOREIGN ADDRESS
?        ?          ?     ?  udp4   192.168.100.1:4789    *:*
We can run tcpdump on both the em(4) and vxlan(4) interfaces. From the first host, we will ping the other endpoint of the tunnel.
root@vxlan1:~ # ping -c 2 10.10.99.2
PING 10.10.99.2 (10.10.99.2): 56 data bytes
64 bytes from 10.10.99.2: icmp_seq=0 ttl=64 time=1.523 ms
64 bytes from 10.10.99.2: icmp_seq=1 ttl=64 time=1.087 ms

--- 10.10.99.2 ping statistics ---
2 packets transmitted, 2 packets received, 0.0% packet loss
round-trip min/avg/max/stddev = 1.087/1.305/1.523/0.218 ms

root@vxlan1:~ # tcpdump -i vxlan0
00:45:19.350517 ARP, Request who-has 10.10.99.2 tell 10.10.99.1, length 28
00:45:19.351275 ARP, Reply 10.10.99.2 is-at e6:a3:8a:20:72:30 (oui Unknown), length 28
00:45:19.351302 IP 10.10.99.1 > 10.10.99.2: ICMP echo request, id 23555, seq 0, length 64
00:45:19.351971 IP 10.10.99.2 > 10.10.99.1: ICMP echo reply, id 23555, seq 0, length 64
00:45:20.393181 IP 10.10.99.1 > 10.10.99.2: ICMP echo request, id 23555, seq 1, length 64
00:45:20.394217 IP 10.10.99.2 > 10.10.99.1: ICMP echo reply, id 23555, seq 1, length 64

root@vxlan1:~ # tcpdump -i em0
00:45:19.350880 IP 192.168.100.1.13489 > 192.168.100.2.4789: UDP, length 50
00:45:19.351220 IP 192.168.100.2.61867 > 192.168.100.1.4789: UDP, length 50
00:45:19.351343 IP 192.168.100.1.20721 > 192.168.100.2.4789: UDP, length 106
00:45:19.351951 IP 192.168.100.2.26259 > 192.168.100.1.4789: UDP, length 106
00:45:20.393388 IP 192.168.100.1.11733 > 192.168.100.2.4789: UDP, length 106
00:45:20.394197 IP 192.168.100.2.23364 > 192.168.100.1.4789: UDP, length 106

root@vxlan2:~ # tcpdump -i em0
01:45:19.428100 IP 192.168.100.1.13489 > 192.168.100.2.4789: UDP, length 50
01:45:19.428642 IP 192.168.100.2.61867 > 192.168.100.1.4789: UDP, length 50
01:45:19.428722 IP 192.168.100.1.20721 > 192.168.100.2.4789: UDP, length 106
01:45:19.429049 IP 192.168.100.2.26259 > 192.168.100.1.4789: UDP, length 106
01:45:20.470829 IP 192.168.100.1.11733 > 192.168.100.2.4789: UDP, length 106
01:45:20.471277 IP 192.168.100.2.23364 > 192.168.100.1.4789: UDP, length 106

root@vxlan2:~ # tcpdump -i vxlan0
01:45:19.428261 ARP, Request who-has 10.10.99.2 tell 10.10.99.1, length 28
01:45:19.428284 ARP, Reply 10.10.99.2 is-at e6:a3:8a:20:72:30 (oui Unknown), length 28
01:45:19.428849 IP 10.10.99.1 > 10.10.99.2: ICMP echo request, id 23555, seq 0, length 64
01:45:19.428864 IP 10.10.99.2 > 10.10.99.1: ICMP echo reply, id 23555, seq 0, length 64
01:45:20.471072 IP 10.10.99.1 > 10.10.99.2: ICMP echo request, id 23555, seq 1, length 64
01:45:20.471100 IP 10.10.99.2 > 10.10.99.1: ICMP echo reply, id 23555, seq 1, length 64
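The UDP lengths seen on em0 can be related to the inner packets seen on vxlan0. Each UDP payload carries the 8-byte VXLAN header and the 14-byte inner Ethernet header in front of the inner packet. The sketch below redoes that arithmetic, assuming untagged inner frames and an inner IPv4 header without options. The function name is made up for illustration.

```shell
#!/bin/sh
VXLAN_HDR=8     # VXLAN header (RFC 7348)
INNER_ETHER=14  # encapsulated Ethernet header, untagged

# udp_payload_len INNER_LEN
# Print the UDP payload length expected on the physical interface for an
# inner frame carrying INNER_LEN bytes after its Ethernet header.
udp_payload_len() {
    echo $(( VXLAN_HDR + INNER_ETHER + $1 ))
}

udp_payload_len 28               # ARP request or reply: prints 50
udp_payload_len $(( 20 + 64 ))   # IPv4 header + 64-byte ICMP message: prints 106
```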
Multicast Example
A more common configuration is to use an IP multicast group. In this example, we will create a 10.10.99/24 network using a 224/8 group address.
In lieu of a more complex multicast setup, I'll just manually add the appropriate routes to the em0 interfaces.
root@vxlan1:~ # route add -net 224/8 -interface em0
add net 224: gateway em0
root@vxlan2:~ # route add -net 224/8 -interface em0
add net 224: gateway em0
Create the vxlan device on the first host.
root@vxlan1:~ # ifconfig vxlan create vxlanid 42 vxlanlocal 192.168.100.1 vxlangroup 224.0.2.6 vxlandev em0 inet 10.10.99.1/24
vxlan0
root@vxlan1:~ # ifconfig vxlan0
vxlan0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
        ether 92:ef:01:c0:6d:c5
        inet 10.10.99.1 netmask 0xffffff00 broadcast 10.10.99.255
        inet6 fe80::90ef:1ff:fec0:6dc5%vxlan0 prefixlen 64 scopeid 0x4
        nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
        vxlan 42 local 192.168.100.1:4789 group 224.0.2.6:4789
And then create the vxlan device on the second host.
root@vxlan2:~ # ifconfig vxlan create vxlanid 42 vxlanlocal 192.168.100.2 vxlangroup 224.0.2.6 vxlandev em0 inet 10.10.99.2/24 up
vxlan0
root@vxlan2:~ # ifconfig vxlan0
vxlan0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
        ether ea:df:bb:35:37:28
        inet6 fe80::e8df:bbff:fe35:3728%vxlan0 prefixlen 64 scopeid 0x4
        inet 10.10.99.2 netmask 0xffffff00 broadcast 10.10.99.255
        nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
        vxlan 42 local 192.168.100.2:4789 group 224.0.2.6:4789
Again, we can use sockstat to see the UDP listening socket.
root@vxlan1:~ # sockstat -4l
USER     COMMAND    PID   FD PROTO  LOCAL ADDRESS         FOREIGN ADDRESS
?        ?          ?     ?  udp4   *:4789                *:*
And we can use ifmcstat to view the group membership on the em0 interface.
root@vxlan1:~ # ifmcstat -i em0 -f inet
em0:
        inet 192.168.100.1
        igmpv3 rv 2 qi 125 qri 10 uri 3
                group 224.0.0.1 mode exclude
                        mcast-macaddr 01:00:5e:00:00:01
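The mcast-macaddr value in the ifmcstat output follows the usual IPv4 multicast mapping from RFC 1112: the low 23 bits of the group address are placed under the 01:00:5e prefix. As a rough sketch, the same mapping can be computed for any group, including the 224.0.2.6 group used in this example. The function name is made up for illustration.

```shell
#!/bin/sh
# mcast_mac GROUP
# Print the Ethernet multicast address for an IPv4 multicast group,
# keeping only the low 23 bits of the group address (RFC 1112).
mcast_mac() {
    old_ifs=$IFS
    IFS=.
    set -- $1            # split the dotted quad into $1 $2 $3 $4
    IFS=$old_ifs
    printf '01:00:5e:%02x:%02x:%02x\n' $(( $2 & 127 )) "$3" "$4"
}

mcast_mac 224.0.0.1   # prints 01:00:5e:00:00:01, as in the output above
mcast_mac 224.0.2.6   # prints 01:00:5e:00:02:06
```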
Future Work
- VIMAGE support.
- IPv6 support. Most of the code for this already exists, but FreeBSD support for RFC 6935 still needs to be added.
- Checksum offload of the encapsulated (inner) packet.
- Hardware offload.
- Software segmentation and batching. This is not specific to vxlan, but it can push at most 1500 bytes through the stack, and has to do so essentially twice. Also see if this RFC draft gets any traction.