Overlay networking makes it possible to build tunnels that interconnect networks defined inside a host (such as Docker/Podman private networks): for example, flannel-based Kubernetes uses VxLAN to interconnect the Minions' private networks. VxLAN is only one of the available technologies, though: others, such as GENEVE, STT and NVGRE, are available as well.

In this post we set up a GENEVE tunnel with OpenVSwitch and Podman. The described setup goes beyond the simple interconnection of layer 3 network segments: it interconnects two Podman private networks configured with the same IP subnet (so they share the same broadcast domain), and the layer 2 data is exchanged between the OpenVSwitch bridges on the two hosts through the GENEVE tunnel.

Overlay Networks

Overlay networking is a technology that interconnects network segments by encapsulating their traffic (from layer 2 to the top of the network stack) into layer 4 packets of an existing TCP/IP network (the underlay network). This means that the two network segments interconnected by the overlay networking technology share the same broadcast domain.

As you can infer, since the underlay networks are layer 3, their traffic can be routed: this makes it possible to implement network segments (sharing the same broadcast domain) scattered even across different data centers interconnected by long-distance links. This of course opens up network scenarios that were previously impossible to implement, but it also brings some security risks: since these protocols provide neither authentication nor confidentiality nor integrity protection, you must absolutely avoid routing the underlay networks that transport them outside the data center. Be wary that even routing inside the data center can be risky, so make sure to protect everything with strict firewalling.
Another benefit of overlay networks is that they overcome the limit of 4096 networks imposed by the 802.1Q (VLAN) standard.

The most broadly used overlay network technologies are:


VxLAN

Mostly sponsored by Cisco, VMware, Citrix, Red Hat, Arista and Broadcom, it relies on the UDP protocol, using port 4789. It lets you define up to 16 million virtual networks, each identified by its own VNID (Virtual Network Identifier). A VxLAN distributed vSwitch is one big distributed switch that spans several switches: all of the switches that compose it are connected through VxLAN tunnels that link the Virtual Tunnel EndPoints (VTEPs) to each other. This is achieved by subscribing to the same IGMP multicast group. It is worth noting that it does not require the Spanning Tree Protocol (STP), since it implements loop prevention by itself.
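
The gap between the 802.1Q limit and the VxLAN identifier space follows directly from the width of the two ID fields: 12 bits for the VLAN ID and 24 bits for the VNID. The following is only a small arithmetic sketch; the "id_space" function name is made up for this illustration.

```shell
#!/bin/sh
# Number of distinct identifiers an ID field of the given bit width can encode.
id_space() {
    echo $((1 << $1))
}

echo "802.1Q VLAN ID (12 bits): $(id_space 12) identifiers"
echo "VxLAN VNID (24 bits):     $(id_space 24) identifiers"
```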

Mind that the switching tables of the switches that are part of the distributed vSwitch are not shared: each element of the distributed switch has its own. In other words, no control plane is implemented.

A switching table entry looks like:


A VxLAN switch behaves exactly like a traditional switch: when a frame with an unknown source MAC address is received, the switch records that address as reachable through the port the frame came from. When the destination MAC address is unknown, the switch floods the frame out of every port except the one it was received on.
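
To make the flood-and-learn behavior more concrete, here is a toy model of it written as a shell script. It is only a sketch under simplifying assumptions: ports are plain numbers, the forwarding table lives in a shell variable, entries never age out, and all the function and variable names are invented for this example (they are not part of any VxLAN implementation).

```shell
#!/bin/sh
# Toy flood-and-learn switch. The forwarding database (FDB) is kept in a
# shell variable holding one "MAC PORT" pair per line.
fdb=""

# Record that MAC $1 is reachable through port $2, replacing any older entry.
fdb_learn() {
    fdb=$(printf '%s\n' "$fdb" | awk -v mac="$1" 'NF == 2 && $1 != mac')
    fdb=$(printf '%s\n%s %s' "$fdb" "$1" "$2")
}

# Print the port learned for MAC $1; print nothing when the MAC is unknown.
fdb_lookup() {
    printf '%s\n' "$fdb" | awk -v mac="$1" '$1 == mac { print $2; exit }'
}

# Handle a frame received on port $1 with source MAC $2 and destination MAC $3.
# The forwarding decision is stored in the global variable "decision".
switch_receive() {
    fdb_learn "$2" "$1"
    out_port=$(fdb_lookup "$3")
    if [ -n "$out_port" ]; then
        decision="forward to port $out_port"
    else
        decision="flood to every port except $1"
    fi
}

# First frame: the destination is still unknown, so the frame is flooded.
switch_receive 1 aa:aa:aa:aa:aa:01 aa:aa:aa:aa:aa:02; echo "$decision"
# The reply: the first host was learned on port 1, so the frame is forwarded.
switch_receive 2 aa:aa:aa:aa:aa:02 aa:aa:aa:aa:aa:01; echo "$decision"
```

The decision is returned through a variable rather than printed by the function, so that the learned entries survive from one call to the next (a command substitution would run the function in a subshell and lose them).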

Be wary that, unless you enable jumbo frames in the underlay network (which raises the MTU to 9000), you must reduce the MTU of the overlay network to 1450, since part of the standard 1500-byte Ethernet MTU is consumed by the encapsulation overhead.
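
The MTU budget can be worked out with simple arithmetic. The sketch below assumes an IPv4 underlay and no optional headers; the variable and function names are made up for this example.

```shell
#!/bin/sh
# Per-packet encapsulation overhead, in bytes, for VxLAN over an IPv4 underlay:
# outer IPv4 header (20) + outer UDP header (8) + VxLAN header (8)
# + the inner Ethernet header carried inside the tunnel (14).
VXLAN_IPV4_OVERHEAD=$((20 + 8 + 8 + 14))

# Largest MTU the overlay can use on top of an underlay with MTU $1.
overlay_mtu() {
    echo $(($1 - VXLAN_IPV4_OVERHEAD))
}

echo "standard underlay (1500): overlay MTU $(overlay_mtu 1500)"
echo "jumbo underlay (9000):    overlay MTU $(overlay_mtu 9000)"
```

A GENEVE header without options is also 8 bytes long, so under the same assumptions the figure for a standard underlay is the 1450 bytes used for the Podman network later in this post.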


NVGRE

Network Virtualization using Generic Routing Encapsulation is mostly sponsored by Microsoft, Arista Networks, Intel, Dell, Hewlett-Packard, Broadcom and Emulex. It relies on the GRE protocol. In the same way as VxLAN, it allows up to 16 million virtual networks, each identified by its own TNI (Tenant Network Identifier) and each with its own GRE tunnel.

Unlike VxLAN, NVGRE supports MTU discovery to dynamically reduce the size of intra-virtual-network packets.

In addition, unlike VxLAN, NVGRE does not rely on flood-and-learn behavior over IP multicast, which makes it a more scalable solution for broadcasts. This is a double-edged sword, though, since it makes NVGRE hardware/vendor dependent.

The main disadvantage of NVGRE compared to VxLAN is probably that, in order to provide flow-level granularity (needed to take advantage of all the available bandwidth), the transport network (for example the routers) has to look up the Key ID in the GRE header. This is why, to improve load balancing, the draft suggests using multiple IP addresses per NVGRE host, which allows more flows to be load balanced; this is difficult to implement.


STT

Stateless Transport Tunneling has been supported by VMware and offers good performance. Since this post is about Linux, discussing this technology would be off topic.

VxLAN is probably the most broadly used protocol: besides Linux and VMware, it has been implemented in several routers and switches by Arista, Brocade, Cisco, Cumulus, DELL, HP, Huawei, Juniper, OpenvSwitch and Pica8, and there are several Network Interface Cards (NICs) that implement TCP Segmentation Offload, such as Broadcom, Intel (Fulcrum), HPE Emulex (be2net), Mellanox (mlx4_en and mlx5_core) and Qlogic (qlcnic).

With TCP Segmentation Offload (TSO), it is the NIC that takes care of segmenting and encapsulating the packets, sparing the operating system this task: this dramatically reduces context switches, and so interrupts too. This lets the operating system send packets of up to 64 KB that are directly handled by the NIC.


GENEVE

To address the perceived limitations of VxLAN and NVGRE, VMware, Microsoft, Red Hat and Intel proposed the Generic Network Virtualization Encapsulation (GENEVE): it has been designed taking a cue from many mature and long-lived protocols such as BGP, IS-IS and LLDP.

The outcome is an extensible protocol (thanks to the variable-length Options field) that can transport any protocol (thanks to the Protocol Type field), so that it can fit future needs.

Note that parsing the Options field is mandatory, since OAM information and Critical flags are stored there. The header format has been carefully designed to let NICs perform TSO, although only a few of them have implemented it for GENEVE so far.

NICs that support TSO for VxLAN also play nicely with GENEVE: some commonly used offload capabilities are not actually VxLAN specific, "tx-udp_tnl-segmentation" being an example. This means that such a NIC can improve performance a lot with GENEVE too.
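
A quick way to see whether a NIC exposes this capability is to look for the feature in the output of "ethtool -k". The helper below is a minimal sketch: the function name is made up, "eth0" is a placeholder interface name, and the parsing assumes the usual "feature: state" layout of the ethtool output.

```shell
#!/bin/sh
# Read "ethtool -k <interface>" output on stdin and print the state of the
# tx-udp_tnl-segmentation feature ("on", "off", "off [fixed]", ...).
# Prints "unknown" when the feature is not listed at all.
udp_tnl_offload_state() {
    awk -F': ' '
        $1 ~ /tx-udp_tnl-segmentation$/ { print $2; found = 1 }
        END { if (!found) print "unknown" }
    '
}

# Usage on a real host (eth0 is a placeholder interface name):
#   ethtool -k eth0 | udp_tnl_offload_state
```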

GENEVE uses the IANA-registered port UDP/6081. In the same way as VxLAN, it identifies each virtual network by a Virtual Network Identifier (VNI).

It is worth mentioning that Wireshark dissectors are already available.

The following example shows the statements necessary to set up a GENEVE Virtual Network that spans two different hosts, as-ca-ut1a001 and as-ca-ut1a002, using OpenVSwitch; on each host, the remote_ip option is set to the IP address of the other host:

On the as-ca-ut1a001 host:

ovs-vsctl add-port ovs_net1 to_as-ca-ut1a002 -- set interface to_as-ca-ut1a002 \
type=geneve options:remote_ip=

On the as-ca-ut1a002 host:

ovs-vsctl add-port ovs_net1 to_as-ca-ut1a001  -- set interface to_as-ca-ut1a001 \
type=geneve options:remote_ip=

We will see the above statements in action in the Lab we are about to set up.
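
Since the two statements are symmetrical, they can be generated from the bridge name, the peer's name and the peer's IP address. The following is just a sketch that prints the statements instead of running them: the "geneve_port_cmd" function is made up for this example, and 192.0.2.11 and 192.0.2.12 are placeholder addresses, not the ones used in the lab.

```shell
#!/bin/sh
# Print (without running it) the ovs-vsctl statement that adds a GENEVE port.
#   $1 = OVS bridge, $2 = name of the peer host, $3 = IP address of the peer host
geneve_port_cmd() {
    printf 'ovs-vsctl add-port %s to_%s -- set interface to_%s type=geneve options:remote_ip=%s\n' \
        "$1" "$2" "$2" "$3"
}

# 192.0.2.11 and 192.0.2.12 are placeholder addresses (RFC 5737 documentation
# range), not the lab's real ones.
geneve_port_cmd ovs_net1 as-ca-ut1a002 192.0.2.12   # to be run on as-ca-ut1a001
geneve_port_cmd ovs_net1 as-ca-ut1a001 192.0.2.11   # to be run on as-ca-ut1a002
```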

Provision The Lab

The following table summarizes the networks we are about to set up:


Name: Management Network
Subnet CIDR: depends on your setup
Description: We use the default network the VMs get attached to as a fictional management network; the actual configuration of this network depends on the hypervisor you are using.

Name: Core Network Testing Security Tier 1
Description: This is a trunked network used to transport the Testing VLANs: Vagrant will set it up as "", but the VM will use it only as a network segment to transport VLANs.

Name: Application Servers Network Testing Security Tier 1
Description: This network is used for attaching the Application Servers VMs of the Testing environment.

In real life, having a dedicated management network provides several security and availability benefits: it gives you a trusted network you can always use to reach your hosts, either physical or virtual, so that you can operate them using SSH and Datacenter Automation tools, PXE boot them or run backups using dedicated networking policies (for example traffic shaping) as well as security policies and even dedicated firewalls. Mind that a dedicated management network is necessary for each Security Tier; if security is a concern, don't forget to have a couple of jump hosts for each management network. In my personal experience, using dedicated management networks is absolutely a best practice.

This table summarizes the VM's homing on the above networks:


as-ca-ut1a001: the first Test Security Tier 1 environment's Application Server. In this post we only install Podman on it and set up a Podman private network. We then set up a GENEVE tunnel to link this private network to Podman's private network on the as-ca-ut1a002 host.

as-ca-ut1a002: the second Test Security Tier 1 environment's Application Server. Here too, we only install Podman on it and set up a Podman private network. We then set up a GENEVE tunnel to link this private network to Podman's private network on the as-ca-ut1a001 host.

When dealing with multi-homed host scenarios, the best practice is to register every FQDN in the DNS - for example "as-ca-ut1a001.as-t1.carcano.local" and "as-ca-ut1a001.mgmt-t1.carcano.local".

Deploying Using Vagrant

In order to add the two VMs above, it is necessary to extend the Vagrantfile shown in the previous post by adding the "as-ca-ut1a002" VM to the "host_vms" list of dictionaries. This can be accomplished by adding the following snippet:

  {
    :hostname => "as-ca-ut1a002",
    :domain => "netdevs.carcano.local",
    :core_net_temporary_ip => "",
    :services_net_ip => "",
    :services_net_mask => "24",
    :services_net_vlan => "100",
    :summary_route => "",
    :box => "grimoire/ol92",
    :ram => 2048,
    :cpu => 2,
    :service_class => "ws"
  },

finally provision the VMs by simply running:

vagrant up as-ca-ut1a001 as-ca-ut1a002

Update Everything

As suggested by best practices, it is always best to provision systems that are as current as possible.

SSH connect to the "as-ca-ut1a001" VM as follows:

vagrant ssh as-ca-ut1a001

then switch to the "root" user:

sudo su -

update the system using DNF

dnf -y update

reboot the VM

shutdown -r now
SSH connect to the as-ca-ut1a002 host and repeat all the above steps.

Install The OpenVSwitch (OVS) Kernel Module

Since the Vagrant box provided by Oracle is missing the OpenVSwitch kernel module, we must install it - SSH connect to the "as-ca-ut1a001" VM as follows:

vagrant ssh as-ca-ut1a001

then switch to the "root" user again:

sudo su -

the OpenVSwitch kernel module is provided by two different RPM packages, depending on whether you are using the Unbreakable Enterprise Kernel (UEK) or the Red Hat Compatible Kernel (RHCK), so first we have to check the flavor of the kernel currently in use:

uname -r  |grep --color 'el[a-z0-9_]*'

if the outcome string, as the following one, contains the "uek" word, then it is an Unbreakable Enterprise Kernel (UEK):


in this case, install the "kernel-uek-modules" RPM package as follows:

dnf install -y kernel-uek-modules

otherwise, if the outcome string, like the following one, does not contain the "uek" word, then it is a Red Hat Compatible Kernel (RHCK):


in this case, install the "kernel-modules" RPM package as follows:

dnf install -y kernel-modules
SSH connect to the as-ca-ut1a002 host and repeat all the above steps.
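
The check on the kernel flavor and the choice of the package can also be folded into a small helper, which is handy when repeating the steps on several hosts. This is only a sketch: the "kernel_modules_pkg" function name is made up for this example, and it relies on nothing more than the "uek" marker in the kernel release string described above.

```shell
#!/bin/sh
# Print the RPM package that provides the extra kernel modules for the kernel
# release string given as $1 (defaults to the running kernel).
kernel_modules_pkg() {
    case "${1:-$(uname -r)}" in
        *uek*) echo "kernel-uek-modules" ;;
        *)     echo "kernel-modules" ;;
    esac
}

# Usage on the VM:
#   dnf install -y "$(kernel_modules_pkg)"
```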

Install The Software

OpenVSwitch is not shipped with Oracle Linux, but since Oracle Linux is binary compatible with Red Hat Enterprise Linux, and the same goes for CentOS Linux, we can download the pre-built RPM packages freely shipped by CentOS.

Mind that you can browse the available packages on cbs.centos.org.

SSH connect to the "as-ca-ut1a001" VM as follows:

vagrant ssh as-ca-ut1a001

then switch to the "root" user again:

sudo su -

to be tidy, we create a directory tree to store the RPM packages we are about to install:

mkdir -m 755 /opt/rpms /opt/rpms/3rdpart
cd /opt/rpms/3rdpart

let's start by downloading every OpenVSwitch RPM package of our desired version and build:

ARCH=$(uname -i)
Of course we actually need only the openvswitch3.1 RPM package, but it is always best to have all of them at hand, so as to be ready for future needs.

the OpenVSwitch RPM package depends on "openvswitch-selinux-extra-policy" RPM package, so let's download it as well:


we can now install the software as follows:


start OpenVSwitch and enable it at boot:

systemctl enable --now openvswitch

we also need to restart NetworkManager, so that it loads the OpenVSwitch (OVS) module we just installed:

systemctl restart NetworkManager

some of the packages we are about to install are provided by the EPEL repo - enable it as follows:

dnf -y install oracle-epel-release-el9

now let's install Podman, along with the bridge-utils and jq RPM packages:

dnf install -y podman bridge-utils jq

since Podman delays the creation of the bridges used by its networks until at least one container is started, we need bridge-utils to manually create that bridge in advance, so that we can link it to the OpenVSwitch bridge that holds the GENEVE port used to reach the other host.

SSH connect to the as-ca-ut1a002 host and repeat all the above installation steps.

Configure The Private Networks

Podman's Net1 Private Network

On both the application servers, we create a Podman private network called "net1": the subnet must be the same on both hosts, but we must set different IP ranges to assign to the containers, so as to avoid IP collisions. In addition to that, packets must be able to fit the GENEVE tunnel's MTU: in this example we are not using jumbo frames, so we lower the MTU to 1450.

Connect to the "as-ca-ut1a001" host and, as the root user, type:

podman network create --internal \
--subnet \
--ip-range=  \
--opt mtu=1450 \
net1

the above statement created the "net1" private network - Let's spawn an Alpine Linux container on it:

podman run -ti --rm --net=net1 alpine sh

let's check the container's network configuration:

ip -4 -o addr show dev eth0

the outcome is as follows:

2: eth0    inet brd scope global eth0\       valid_lft forever preferred_lft forever

leave the container open and, from another terminal, connect to the "as-ca-ut1a002" host.

Once logged on, switch to the root user and type:

podman network create --internal \
--subnet \
--ip-range=  \
--opt mtu=1450 \
net1

then spawn an Alpine Linux container also on this host:

podman run -ti --rm --net=net1 alpine sh

and also here, let's check the container's network configuration:

ip -4 -o addr show dev eth0

the outcome is as follows:

2: eth0    inet brd scope global eth0\       valid_lft forever preferred_lft forever

as expected, both containers have an IP address from the network.

Now, from the current container (the one we launched on the "as-ca-ut1a001" application server), let's try to ping the container on the "as-ca-ut1a002" application server:

ping -c 1

the outcome is:

PING ( 56 data bytes

--- ping statistics ---
1 packets transmitted, 0 packets received, 100% packet loss

this should not surprise us: although we assigned the same subnet to Podman's "net1" network on both the application servers, they are actually two distinct networks, running on two distinct hosts.
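
To recap the addressing scheme of the two "net1" networks (same subnet, distinct IP ranges, MTU lowered to 1450), here is a sketch that prints the two "podman network create" statements side by side. The "podman_net1_cmd" function is made up for this example, and the subnet and ranges are placeholders: one possible choice, assumed here, is to split a /24 into two /25 halves, one per host.

```shell
#!/bin/sh
# Print (without running it) the "podman network create" statement for net1.
#   $1 = subnet shared by both hosts, $2 = IP range reserved to this host
podman_net1_cmd() {
    printf 'podman network create --internal --subnet %s --ip-range=%s --opt mtu=1450 net1\n' \
        "$1" "$2"
}

# Placeholder values: 192.0.2.0/24 (RFC 5737 documentation range) split in two halves.
podman_net1_cmd 192.0.2.0/24 192.0.2.0/25     # to be run on as-ca-ut1a001
podman_net1_cmd 192.0.2.0/24 192.0.2.128/25   # to be run on as-ca-ut1a002
```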

Exit both containers:

exit


We are about to interconnect both the Podman’s net1 private networks using a GENEVE tunnel.

First, on both the application servers, create the ethernet bridge used by the Podman's "net1" private network as follows:

PODMAN_BRIDGE=$(podman network inspect net1 | jq -r ".[]|.network_interface")
brctl addbr ${PODMAN_BRIDGE}
ip link set ${PODMAN_BRIDGE} up

VEths Used to Connect Podman's Bridge to the OVS Bridge

We are going to create a GENEVE tunnel using OpenVSwitch: this of course requires creating an OVS bridge and interconnecting it with the bridge we just created to back Podman's "net1" private network.

To interconnect these two bridges, we need a pair of VEth interfaces; let's create it on both the application servers as follows:

nmcli connection add type veth con-name veth0 ifname veth0  veth.peer veth1
Beware that some old Red Hat based releases are affected by "Bug 1915284 - veth device profiles activation is not reboot persistent", which prevents the veths from being recreated at system boot.

on both the application servers, we can now link the veth0  interface to the Podman's bridge:

brctl addif ${PODMAN_BRIDGE} veth0

OpenVSwitch Bridge

It is now necessary to create the "ovs_net1" OpenVSwitch bridge - on both the application servers type the following statements:

ovs-vsctl add-br ovs_net1

and link the "veth1" interface to it:

ovs-vsctl add-port ovs_net1 veth1

last but not least, bring both the VEth interfaces up:

ip link set veth1 up
ip link set veth0 up

GENEVE Tunnels

We are finally ready to set up the GENEVE tunnels, but before that, on both the application servers, we need to create the related firewalld service and enable it, so as to permit the GENEVE traffic between the hosts of the services subnet.

Create the "/etc/firewalld/services/geneve.xml" file with the following contents:

<?xml version="1.0" encoding="utf-8"?>
<service>
  <description>Enable GENEVE incoming traffic</description>
  <port protocol="udp" port="6081"/>
</service>

then create the firewall rule that accepts GENEVE traffic only from the "" network:

firewall-cmd --permanent --zone=services1 --add-rich-rule='rule family="ipv4" source address="" service name="geneve" accept'

finally reload firewalld to apply the configuration:

firewall-cmd --reload

let's make sure that it actually loaded it:

firewall-cmd --list-all --zone=services1

the outcome must be as follows:

services1 (active)
  target: default
  icmp-block-inversion: no
  interfaces: enp0s6.100
  forward: no
  masquerade: no
  rich rules: 
	rule family="ipv4" source address="" service name="geneve" accept

we can now create the GENEVE tunnels, let's start from the "as-ca-ut1a001" host - connect to it, and as the root user, type:

ovs-vsctl add-port ovs_net1 to_as-ca-ut1a002 -- set interface to_as-ca-ut1a002 \
type=geneve options:remote_ip=

then connect to the "as-ca-ut1a002" host and, as the root user, type:

ovs-vsctl add-port ovs_net1 to_as-ca-ut1a001  -- set interface to_as-ca-ut1a001 \
type=geneve options:remote_ip=

since the GENEVE endpoint uses port UDP/6081, we can easily make sure the above statements actually took effect - just type:

ss -lnup |grep 6081

the outcome must be as follows:

UNCONN 0      0                      *                                             
UNCONN 0      0                                   [::]:6081         [::]:*  
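
The same check can be scripted, for example to use it in a health check. The helper below is a minimal sketch: the function name is made up, and it assumes the column layout shown above, with the local address in the fourth column.

```shell
#!/bin/sh
# Read "ss -lnu" output on stdin and succeed when some socket is bound to the
# UDP port given as $1 (the local address is the fourth column).
udp_port_is_bound() {
    awk -v port="$1" '
        $4 ~ (":" port "$") { found = 1 }
        END { exit(found ? 0 : 1) }
    '
}

# Usage on a real host:
#   ss -lnu | udp_port_is_bound 6081 && echo "GENEVE endpoint is listening"
```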

we are almost ready to launch a container again and perform a connectivity check, but first we must bring the "ovs_net1" bridge up:

ip link set ovs_net1 up

then, on both the application servers, launch an instance of the Alpine container:

podman run -ti --net=net1 alpine sh

now, on both containers, print the IP address: 

ip -4 -o addr show dev eth0

on my system, the one running on the "as-ca-ut1a001" host shows:

2: eth0    inet brd scope global eth0\       valid_lft forever preferred_lft forever

whereas the one running on the "as-ca-ut1a002" host shows:

2: eth0    inet brd scope global eth0\       valid_lft forever preferred_lft forever

let's try to ping the container running on the "as-ca-ut1a002" host from the container running on the "as-ca-ut1a001" host - on the container running on the "as-ca-ut1a001" host type:

ping -c 1

this time the outcome must be as follows:

PING ( 56 data bytes
64 bytes from seq=0 ttl=42 time=1.982 ms

--- ping statistics ---
1 packets transmitted, 1 packets received, 0% packet loss
round-trip min/avg/max = 1.982/1.982/1.982 ms

it is now working thanks to the GENEVE tunnel that is now interconnecting the two Podman's private networks.

Exit the container:

exit


Persisting The Networking Setup

Like every nice thing, even this setup won't last long: at the first reboot, most of it will be gone.

We can however make it persistent by creating a script and a Systemd unit that triggers it at boot.

Create the directory tree for storing the script:

mkdir -m 755 /opt/grimoire /opt/grimoire/bin

then create the "/opt/grimoire/bin/podman-ovs.sh" script with the following contents:

#!/bin/bash
PODMAN_BRIDGE=$(podman network inspect net1 | jq -r ".[]|.network_interface")
brctl addbr ${PODMAN_BRIDGE}
sleep 2
ip link set ${PODMAN_BRIDGE} up
brctl addif ${PODMAN_BRIDGE} veth0
ip link set veth1 up
ip link set veth0 up
ip link set ovs_net1 up

If your system is affected by "Bug 1915284 - veth device profiles activation is not reboot persistent", which prevents the veths from being recreated at system boot, you need to add the following statement at line 6, right before the "brctl addif ${PODMAN_BRIDGE} veth0" statement:

ip link add veth0 type veth peer name veth1

and set it executable:

chmod 755 /opt/grimoire/bin/podman-ovs.sh

create the "/etc/systemd/system/podman-ovs.service" Systemd unit we use to trigger the script at boot time:

[Unit]
Description=Link Podman To OpenVSwitch
After=network.target network.service openvswitch.service podman.service
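
Besides the Description and After directives, a unit that runs a script at boot also needs a section that tells Systemd what to execute and one that tells it when to start it, otherwise "systemctl enable" has nothing to hook the unit to. The sketch below writes one possible complete unit: Description, After and the script path come from this post, whereas "Type=oneshot", "RemainAfterExit=yes" and "WantedBy=multi-user.target" are assumptions, and the "write_podman_ovs_unit" function name is made up for this example.

```shell
#!/bin/sh
# Write a complete podman-ovs.service unit to the path given as $1.
# Type, RemainAfterExit and WantedBy are assumptions; Description, After and
# the script path come from the post.
write_podman_ovs_unit() {
    cat > "$1" <<'EOF'
[Unit]
Description=Link Podman To OpenVSwitch
After=network.target network.service openvswitch.service podman.service

[Service]
Type=oneshot
RemainAfterExit=yes
ExecStart=/opt/grimoire/bin/podman-ovs.sh

[Install]
WantedBy=multi-user.target
EOF
}

# Usage on the VM:
#   write_podman_ovs_unit /etc/systemd/system/podman-ovs.service
```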



reload Systemd to make it aware of the new "podman-ovs.service" unit:

systemctl daemon-reload

and enable the "podman-ovs.service" unit to start at boot:

systemctl enable podman-ovs.service

Reboot the VM:

shutdown -r now

Test: download a file from a web service

Now that everything is up and running we can mock a web service and test.

On the "as-ca-ut1a001" application server, launch a container with the official Python container image:

podman run -ti --net=net1 python bash

Since neither the "ip" nor the "ifconfig" command line tool is available in this container image, we use Python to detect the IP configuration - launch Python as follows:

python3


then run the following code snippet to get the IP address:

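import socket
hostname = socket.gethostname()
# resolve the container's own hostname to its IP address and print it
print(socket.gethostbyname(hostname))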

on my system it detects "".

Exit back to the shell:

exit()


then change to the "/etc" directory:

cd /etc

and launch the Python HTTP server:

python3 -m http.server 8080

on the "as-ca-ut1a002" application server, start an instance of the Alpine container again:

podman run -ti --net=net1 alpine sh

then download the "motd" file from the python container running on the "as-ca-ut1a001" host:

wget -q -O -

the contents are printed right to the standard output.

While we are at it, ... just to demonstrate once again the importance of security, type the following statement:

wget -q -O -

again, the contents are printed right to the standard output.

But even worse:

wget -q -O -

so we are getting the contents of the "/etc/shadow" file: this is a lab and we are inside a container, so no worries, but I wanted to show how easy it is to weaken security if you don't take enough care of everything.
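
One simple way to avoid this kind of exposure is to never serve a system directory, and to publish a dedicated directory that contains only the files meant to be downloaded. The following is just a sketch of the idea: the "make_docroot" function and the sample file contents are made up for this example.

```shell
#!/bin/sh
# Create a dedicated document root that contains only the file we mean to
# publish, and print its path.
make_docroot() {
    docroot=$(mktemp -d) || return 1
    printf 'hello from the lab\n' > "$docroot/motd"   # stand-in for the real file
    echo "$docroot"
}

# Usage inside the Python container (8080 is the port used above):
#   python3 -m http.server 8080 --directory "$(make_docroot)"
```

The "--directory" option of "http.server" is available starting from Python 3.7; with it, requests for anything outside the dedicated directory, "shadow" included, are answered with a 404 error.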


This is the end of this post dedicated to GENEVE and Podman: I hope that seeing everything in action helped you get a good understanding of how overlay networks work. A good understanding of overlay networking is a really valuable skill, since overlay networks are not just bricks, but pillars of modern and resilient infrastructures. I hope the content shown in this post is enough to let you continue exploring this amazing topic by yourself.

I hope you enjoyed it, and if you liked it please share this post on LinkedIn: if I see it arouses enough interest, we can continue on this topic with a post explaining how to set OpenFlow rules.

