SD-WAN Design – Static or Dynamic Routing in the Underlay

Introduction

In traditional networks, dynamic routing is often preferred over static routing because it helps to:

- Reduce maintenance: when dealing with numerous routes, maintaining static routes can be a burden.
- Prevent blackhole scenarios: avoid situations where a port remains active and routes stay in the routing table even though the next hop is unreachable. It also lets a router react when a Customer Edge (CE) device loses its peering with the provider network, which would otherwise result in traffic blackholes.
  - With static routes, these challenges typically require custom tracking or scripting, adding operational complexity. (BGP can face a similar issue when static routes are advertised unconditionally, but this can be resolved by configuring them as ‘conditional routes.’)
- Implicitly monitor link state and improve convergence time by using mechanisms like keepalives between nodes.

But what about an SD-WAN network? What are the differences between using static and dynamic routes on the underlay?

Static routing on the underlay

Unlike in traditional networks, static routes are well suited to SD-WAN networks, because the underlay's primary role is merely to reach the other endpoints and, potentially, the controllers, so that the devices can be managed and the tunnels established.

The SD-WAN solution, typically built on protocols like IPsec or GRE, often incorporates keepalive mechanisms and SLA monitoring (assessing the performance of a link based on latency, jitter, and packet loss), covering both outright failure and degraded performance. In certain SD-WAN solutions, a routing protocol may also operate inside the tunnel. Lastly, some SD-WAN solutions implement gateway tracking.

These features allow failure detection along the path, effectively addressing the limitations of static routing.
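To make the SLA monitoring idea concrete, here is a minimal sketch of how a link could be judged from probe results. It is not any vendor's algorithm: the names `SlaThresholds`, `link_metrics`, and `meets_sla`, the threshold values, and the jitter definition (mean difference between consecutive replies) are all assumptions chosen for illustration.

```python
from dataclasses import dataclass
from statistics import mean
from typing import Optional, Sequence


@dataclass
class SlaThresholds:
    # Hypothetical thresholds; real values depend on the application class.
    max_latency_ms: float = 150.0
    max_jitter_ms: float = 30.0
    max_loss_pct: float = 2.0


def link_metrics(rtts_ms: Sequence[Optional[float]]) -> dict:
    """Summarize probe results; None marks a probe that got no reply."""
    if not rtts_ms:
        raise ValueError("at least one probe result is required")
    replies = [r for r in rtts_ms if r is not None]
    loss_pct = 100.0 * (len(rtts_ms) - len(replies)) / len(rtts_ms)
    if not replies:
        return {"latency_ms": None, "jitter_ms": None, "loss_pct": loss_pct}
    # Assumed jitter definition: mean absolute difference between
    # consecutive replies.
    deltas = [abs(b - a) for a, b in zip(replies, replies[1:])]
    return {
        "latency_ms": mean(replies),
        "jitter_ms": mean(deltas) if deltas else 0.0,
        "loss_pct": loss_pct,
    }


def meets_sla(metrics: dict, sla: SlaThresholds) -> bool:
    """A link with no replies at all is treated as failed."""
    if metrics["latency_ms"] is None:
        return False
    return (
        metrics["latency_ms"] <= sla.max_latency_ms
        and metrics["jitter_ms"] <= sla.max_jitter_ms
        and metrics["loss_pct"] <= sla.max_loss_pct
    )
```

The same check covers both cases mentioned above: a dead path (no replies) and a path that is up but performing poorly, which a static route alone would never reveal.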

Additionally, in SD-WAN, we typically receive a simple default route from the underlay provider. Depending on the configuration, SD-WAN devices attempt to establish tunnels through each available default route. As a result, a single default route per transport is often sufficient.

So the question is: why would someone choose dynamic routing if static routing is enough?

Why run dynamic routing on the underlay?

1- Announce Loopback

Dynamic routing in the underlay is important when IPsec tunnels are established from loopback interfaces, whose addresses might need to be redistributed into BGP, for example. While this setup is uncommon, it can be essential for certain specific requirements.

2- Announce Inter-SD-WAN link (transport extension)

At sites with two SD-WAN routers, a transport extension is the logical sharing of transport connections (such as Internet or MPLS circuits) between the devices, enabling one device to use the transport links of the other. In Cisco terminology, this is known as a ‘TLOC Extension’.

The goal here is to advertise the transport extension IP prefix, enabling other SD-WAN routers to reach it and establish their IPsec tunnels by using the neighboring router as a transit point.

3- Convergence

In an SD-WAN network, failure detection can occur in several ways:

- Physical link state
- Dynamic routing on the underlay (this is where dynamic routing can add value)
- IPsec DPD/Idle or GRE tunnel keepalives
- SLA performance health checks (the SD-WAN monitoring the quality of each transport link)

Then, after detection, the SD-WAN control-plane protocol (which may operate inside or outside the tunnel, depending on the solution) is notified and propagates the information across the entire fabric.

So, with dynamic routing, and assuming all SD-WAN routers receive the full set of specific endpoint IP prefixes from each other rather than just a default route, routers that see a failure through the routing protocol perform the switchover immediately, even before receiving the information from the control plane.
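The dependence on specific prefixes can be illustrated with a small longest-prefix-match lookup. This is only a sketch: the `lookup` helper, the documentation-range addresses, and the `mpls-ce` next-hop label are assumptions, not part of any SD-WAN product.

```python
import ipaddress
from typing import Dict, Optional


def lookup(rib: Dict[str, str], destination: str) -> Optional[str]:
    """Return the next hop of the longest matching prefix, or None."""
    addr = ipaddress.ip_address(destination)
    best_len = -1
    best_hop = None
    for prefix, next_hop in rib.items():
        net = ipaddress.ip_network(prefix)
        if addr in net and net.prefixlen > best_len:
            best_len, best_hop = net.prefixlen, next_hop
    return best_hop


# Hypothetical MPLS underlay tables on a remote SD-WAN router.
rib_specific = {"192.0.2.1/32": "mpls-ce", "192.0.2.3/32": "mpls-ce"}
rib_default_only = {"0.0.0.0/0": "mpls-ce"}

# The site owning tunnel endpoint 192.0.2.1 loses its MPLS underlay: the
# specific prefix is withdrawn, while a default route is unaffected.
rib_specific.pop("192.0.2.1/32")

print(lookup(rib_specific, "192.0.2.1"))      # None
print(lookup(rib_default_only, "192.0.2.1"))  # mpls-ce
```

With specific prefixes, the withdrawal makes the tunnel endpoint unreachable in the routing table right away. With only a default route, the lookup still succeeds, so the router has to wait for tunnel keepalives or SLA probes to notice the failure.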

Suppose we have two SD-WAN sites, each with two underlays: MPLS and Internet, both establishing tunnels with one another.

When an underlay failure occurs at site 1, the CE detects it dynamically (e.g., the BGP/BFD peering goes down) and propagates the route withdrawal across the underlay. If the underlay routing timers are faster than the SD-WAN SLA health monitoring, all SD-WAN devices that use dynamic routing on that underlay and receive the affected prefix automatically switch to the available links, even before the SD-WAN control plane updates them.
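Whether the underlay really wins this race is a matter of timers. The sketch below compares rough worst-case detection times; the function names and the timer values are hypothetical, and the model ignores route propagation delay, so treat it as back-of-the-envelope arithmetic rather than a convergence calculation.

```python
from typing import Dict


def bfd_detection_ms(tx_interval_ms: int, multiplier: int) -> int:
    """BFD declares the session down after `multiplier` missed intervals."""
    return tx_interval_ms * multiplier


def probe_detection_ms(probe_interval_ms: int, failures_to_down: int) -> int:
    """SLA probing declares a path down after N consecutive lost probes."""
    return probe_interval_ms * failures_to_down


def first_detector(detection_ms: Dict[str, int]) -> str:
    """Name of the mechanism with the shortest detection time."""
    return min(detection_ms, key=detection_ms.get)


# Hypothetical timer values, for illustration only.
timers = {
    "underlay BGP + BFD": bfd_detection_ms(tx_interval_ms=300, multiplier=3),
    "SD-WAN SLA probes": probe_detection_ms(probe_interval_ms=1000,
                                            failures_to_down=5),
}

print(timers)
print(first_detector(timers))
```

With these example values the underlay detects the failure first. If the SLA probes were tuned more aggressively than the underlay timers, the result would flip and dynamic routing would bring no convergence gain.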

What are the convergence benefits of implementing dynamic routing on the hub in a private MPLS network?

- To quickly detect direct failures between the Hub and the CE underlay (using BFD on the underlay), particularly when there is no direct physical connection between the SD-WAN router and the underlay CE.
- To quickly detect remote site failures on a private transport underlay and propagate information faster by receiving all remote end IP prefixes from the private underlay (e.g., MPLS or others).
  - This allows the Hub to identify route withdrawals and send this information immediately.
  - Remember, this is only relevant on private networks, as on the Internet, it is less likely that a failure of an SD-WAN node’s underlay will be noticeable via the underlay routing protocol.

Once detected, the Hub can then inform the Fabric (directly or via the controllers depending on the solution) to ensure that all SD-WAN hops process the information and switch over to an available link.

Is there an impact on local breakout?

With local breakout, one might assume that dynamic routing is preferable for automatically removing unreachable Internet service IPs. However, relying on the SD-WAN vendor’s SLA monitoring is typically more reliable, as it enables seamless switching between available underlays like MPLS or alternative ISP links. This is especially true since SD-WAN routers usually only receive a default route from their ISP and probably won’t detect changes when there is a failure toward the Breakout IP prefixes.
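The difference between the two approaches can be sketched as follows. The `Transport` structure, the two selection functions, and the transport names are assumptions made for this example; real SD-WAN policies are richer than a first-match list.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Transport:
    name: str
    has_default_route: bool  # what the underlay routing table shows
    probe_ok: bool           # SLA probe result toward the breakout service


def select_by_routing(transports: List[Transport]) -> Optional[str]:
    """Pick the first transport that still has a default route."""
    for t in transports:
        if t.has_default_route:
            return t.name
    return None


def select_by_sla(transports: List[Transport]) -> Optional[str]:
    """Pick the first transport that has a route AND passes its SLA probe."""
    for t in transports:
        if t.has_default_route and t.probe_ok:
            return t.name
    return None


# Hypothetical state, in order of preference: isp1 still advertises its
# default route, but the breakout service is unreachable through it.
transports = [
    Transport("isp1", has_default_route=True, probe_ok=False),
    Transport("isp2", has_default_route=True, probe_ok=True),
    Transport("mpls", has_default_route=True, probe_ok=True),
]

print(select_by_routing(transports))  # isp1
print(select_by_sla(transports))      # isp2
```

Routing alone keeps sending breakout traffic to isp1 because its default route never changes, while the SLA-based choice moves to isp2.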

In conclusion

Using static routing for the underlay on SD-WAN devices is simpler than implementing dynamic routing with the service provider CE router, especially at branch locations where faster convergence times are not critical. This approach is particularly attractive when the SLA health check timers on the SD-WAN can be tuned to detect failures more quickly, even if these timers may not match the convergence speed of dynamic routing protocols such as an IGP or BGP.

When implementing transport extension (using a neighboring SD-WAN router as a transit to access its overlay) or leveraging a loopback interface to establish IPsec tunnels, dynamic routing is essential to eliminate the complexity and maintenance challenges associated with static routes.

It may also be beneficial to enable dynamic routing on the private underlay (e.g., MPLS) at the Hubs, which can slightly improve convergence (but do you really need to reduce convergence time by just a few seconds?).
