Battle of the Bare Metal Routers

Continuing from my previous post, I wanted to see how a variety of software routing and firewall platforms performed when put to the test on a level playing field.


As before, these tests were all run on my Xeon D-1521. The server has 10Gb links to the rest of my network via the onboard Intel X550/X552 chipset, 32GB of RAM, and an SSD boot drive.

If this is all TL;DR, the final results are HERE.

Here is the server dangling halfway out of the rack while I was doing these tests:


Network Layout

The network layout is identical to the virtual routing tests, though in this case I just used DHCP to get the “WAN” address, since I wanted to test NAT performance too.
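
I’m not rehashing the full test methodology here, but the throughput numbers below were measured host-to-host through the router. A minimal sketch of that kind of test, assuming iperf3 (the addresses are placeholders for my test hosts):

# On the host behind the "WAN" interface, start a listener
iperf3 -s

# On a client in one of the LAN vlans, run a 30-second test
# with 4 parallel streams (10.100.100.10 is a placeholder)
iperf3 -c 10.100.100.10 -t 30 -P 4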


The Victims

For these tests, I wanted to avoid wasting time on duplicated results, so for example I’ll just be testing pfSense rather than both pfSense and OPNsense, and just VyOS rather than Debian. Also, based on feedback on my prior post, some people felt I missed a few platforms. So for these tests, the platforms will be:

  • VyOS 1.2.3
  • pfSense 2.4.4-p3
  • Sophos XG SFOS v17.5.8 (not fully sure how the versioning works, but that was the ISO)
  • Untangle 14.2.2

The goal of these tests is to show how these platforms perform when installed bare metal on this D-1521. So they will be configured mostly out of the box in a basic WAN/LAN firewall/router setup.


VyOS

I use a lot of VyOS, and I’m pretty familiar with how it performs. It’s worth mentioning that I’m also a VyOS maintainer now.

Basic Config:

firewall {
    name OUTSIDE-IN {
        default-action drop
        rule 10 {
            action accept
            state {
                established enable
                related enable
            }
        }
    }
    name OUTSIDE-LOCAL {
        default-action drop
        rule 10 {
            action accept
            state {
                established enable
                related enable
            }
        }
        rule 20 {
            action accept
            icmp {
                type-name echo-request
            }
            protocol icmp
            state {
                new enable
            }
        }
        rule 31 {
            action accept
            destination {
                port 22
            }
            protocol tcp
            state {
                new enable
            }
        }
    }
}
interfaces {
    ethernet eth0 {
        address dhcp
        firewall {
            in {
                name OUTSIDE-IN
            }
            local {
                name OUTSIDE-LOCAL
            }
        }
    }
    ethernet eth1 {
        vif 2222 {
            address 10.222.222.1/24
        }
        vif 2223 {
            address 10.223.223.1/24
        }
    }
    loopback lo {
    }
}
nat {
    source {
        rule 10 {
            outbound-interface eth0
            source {
                address 10.0.0.0/8
            }
            translation {
                address masquerade
            }
        }
    }
}
service {
    dhcp-server {
        shared-network-name vlan2222 {
            authoritative
            subnet 10.222.222.0/24 {
                default-router 10.222.222.1
                dns-server 10.53.53.53
                lease 1200
                range 0 {
                    start 10.222.222.10
                    stop 10.222.222.100
                }
            }
        }
        shared-network-name vlan2223 {
            authoritative
            subnet 10.223.223.0/24 {
                default-router 10.223.223.1
                dns-server 10.53.53.53
                lease 1200
                range 0 {
                    start 10.223.223.10
                    stop 10.223.223.100
                }
            }
        }
    }
    ssh {
        port 22
    }
}
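
For anyone following along at the VyOS CLI, the config tree above is entered as set commands. As a quick sketch, the NAT section corresponds to something like:

configure
set nat source rule 10 outbound-interface 'eth0'
set nat source rule 10 source address '10.0.0.0/8'
set nat source rule 10 translation address 'masquerade'
commit
save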

Results

In this setup, the results were “basically 10Gb”, whether for inter-vlan or LAN->WAN with NAT.

Inter-vlan (9.19 Gbps):

NAT (9.17 Gbps):

While I was testing, I was curious whether a simple firewall performed any differently than a zone-based firewall.

So adding the zone config:

zone LAN {
    default-action drop
    from LOCAL {
        firewall {
            name LOCAL-LAN
        }
    }
    from WAN {
        firewall {
            name WAN-LAN
        }
    }
    interface eth1.2222
    interface eth1.2223
}
zone LOCAL {
    from LAN {
        firewall {
            name LAN-LOCAL
        }
    }
    from WAN {
        firewall {
            name WAN-LOCAL
        }
    }
    local-zone
}
zone WAN {
    from LAN {
        firewall {
            name LAN-WAN
        }
    }
    from LOCAL {
        firewall {
            name LOCAL-WAN
        }
    }
    interface eth0
}
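
Note that the zone policies only reference firewall rule sets by name; the rule sets themselves (LAN-WAN, WAN-LOCAL, and so on) still have to be defined separately. I’m not reproducing all six here, but as a sketch, a minimal LAN-WAN rule set mirroring the simple firewall above would look something like:

name LAN-WAN {
    default-action drop
    rule 10 {
        action accept
        state {
            established enable
            related enable
        }
    }
    rule 20 {
        action accept
        state {
            new enable
        }
    }
}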

The results were pretty much identical.

NAT, zone firewall (9.12 Gbps):


pfSense

After my previous post, I caught a surprising amount of flak from people saying my review of pfSense was somehow unfair and biased, despite feeling like I was OVERLY fair. Hopefully the results here will vindicate me.

With pfSense, I didn’t really make any changes beyond the stock install. I added an OPT interface for the second LAN, and opened up the firewall rules for that interface.

Results

First, I tested some inter-vlan routing with all hardware offload disabled. This is a common troubleshooting step, as virtual pfSense and a lot of network cards don’t properly support the functionality under FreeBSD.

Inter-vlan, disabled hardware offload (7.58 Gbps):

But guess what: all those hardware offloading checkboxes exist for a reason, and enabling them can have some dramatic results if the NICs actually support the features (and the drivers aren’t broken).
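
The toggles live under System > Advanced > Networking in the pfSense GUI. If you want to confirm what your NICs actually claim to support, you can also check from a shell; a quick sketch for FreeBSD (the interface name ix0 is a placeholder for whatever your 10Gb NIC shows up as):

# Show enabled options and supported capabilities for the NIC
ifconfig ix0

# Look for flags like TXCSUM, RXCSUM, TSO4, and LRO
# in the options= line of the output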

Inter-vlan, hardware offload (9.19 Gbps):

pfSense does slightly worse when it is NATing, though it’s still pretty close to 10Gb.

NAT, hardware offload (8.71 Gbps):

Now this is where I need to eat some crow.

For literally YEARS I’ve been preaching that pfSense won’t do 10 gigabit. As these results show, that was way wrong. In my defense, this was also a sentiment echoed by the pfSense devs themselves. But I’ll take my medicine and admit that I was wrong.

I think there are a few reasons that, in all my extensive testing, I never saw 10Gb.

  • When I was running pfSense on bare metal (2.3.x and 2.4.0 beta days), the Xeon Ds were REALLY new. To make a long story short, they were not very stable and I don’t think the drivers were fully compatible. As the results show, enabling the hardware offload features can make a fairly large difference.
  • Ever since, every pfSense install I’ve done has generally been virtualized, either on KVM or ESXi. As my prior post shows, it just doesn’t do well virtualized.

So yes, pfSense will do 10Gb assuming you have NICs that support hardware offload correctly.


Sophos XG

After my last post, a few people specifically mentioned that I should have done tests with Sophos XG.

I’m not going to lie: Sophos XG is actually quite fancy looking. Between all the options and my lack of familiarity, my head spun a bit trying to set it up.

My first impression wasn’t very good, though. Sophos makes it VERY difficult to position the WAN/LAN where I wanted them; it wanted the first NIC port to be LAN, or something like that. I think I eventually had to just swap the VLANs on the switch the server is plugged into.

Results

I imagine if I used Sophos more, I would become more comfortable with it. But for the few hours I had it spun up for testing, I felt a little like:

https://knowyourmeme.com/photos/234739-i-have-no-idea-what-im-doing

I’m pretty sure I got all the security features configured, and the results were pretty straightforward: full 10 gigabit.

Inter-vlan (9.27 Gbps):

NAT (9.24 Gbps):

On paper, the results were marginally better than pfSense or VyOS, but I’m not sure I would want to use Sophos on a regular basis. It was just confusing and unintuitive.


Untangle

This was another suggestion in response to my last post. As with Sophos, it’s another platform that I have basically no familiarity with.

It didn’t take much to get installed, and unlike Sophos, the initial interface configuration was a breeze.

I’m not going to lie: even though I’m a complete CLI junkie, using Untangle is NICE. Everything was intuitive, and the various graphs and reports make complete sense from the perspective of a new user.

Results

Unfortunately, the performance was abysmal compared to the other platforms here. The NAT and inter-vlan results were identical.

NAT/inter-vlan (2.70 Gbps):

I have to believe there was some performance tuning I could have done to get better numbers, but I wasn’t able to track down anything specific.
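
If I do revisit it, the first thing I’d check is the NIC offload state, since Untangle is Debian-based under the hood. A sketch of what that would look like from a root shell (eth0 is a placeholder for the internal NIC name):

# Show which offload features are currently enabled
ethtool -k eth0

# Example: turn generic receive offload on if it is off
ethtool -K eth0 gro on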

Despite the lackluster performance, I really enjoyed setting up and playing with Untangle, and it’s a platform I might use again in the future.


Final Results

In conclusion, most platforms came out fairly even. The outlier was Untangle, but I suspect with some tuning it could be brought up to the level of the other platforms.

Router      Inter-vlan                         NAT
VyOS        9.19 Gbps                          9.17 Gbps (regular firewall)
                                               9.12 Gbps (zone firewall)
pfSense     9.19 Gbps (hardware offload)       8.71 Gbps
            7.58 Gbps (no hardware offload)
Sophos XG   9.27 Gbps                          9.24 Gbps
Untangle    2.70 Gbps                          2.70 Gbps

The takeaway here is that almost any platform can be an extremely performant firewall and router. With pfSense, the right hardware selection could impact results if you are interested in higher speeds, whereas with VyOS (and probably the other two) hardware selection matters a bit less due to better driver support.

As far as Untangle is concerned, I might take another crack at it and see if I can figure out why the performance was so low. Given that the NAT and inter-vlan results were IDENTICAL, there was almost certainly some artificial limitation occurring somewhere.
