You might be considering building a LAG port (using LACP, manual configuration, etc.), or planning a maintenance window in which you shut down one link within a LAG port, assuming the routing protocol won’t be affected because LAG convergence is typically fast. Well, it might be wise to think twice!

Consider the following two use cases:

  • Direct BFD sessions between two routers using a LAG port.
  • BFD sessions established between two routers, with an L2 switch between them forming a LAG port.

BFD control packets are carried over UDP (Layer 4), and a given session always uses the same pair of IP addresses and UDP ports. Therefore, when a router initiates a session over a LAG, the load-balancing algorithm hashes that session onto a single member link.
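To illustrate why one BFD session ends up pinned to one member link, here is a minimal sketch of flow-based load balancing. The hash (CRC32 over the 5-tuple), the function name `pick_member_link`, the link names and the addresses are all assumptions made for illustration; real platforms use their own, vendor-specific hash inputs and algorithms. The only protocol facts used are that single-hop BFD control packets go to UDP port 3784 with a source port in the 49152-65535 range (RFC 5881).

```python
import zlib

def pick_member_link(src_ip, dst_ip, src_port, dst_port, members, proto=17):
    """Pick a LAG member from a flow's 5-tuple.

    Toy hash for illustration only, NOT a vendor algorithm.
    proto=17 is UDP.
    """
    key = f"{src_ip}|{dst_ip}|{proto}|{src_port}|{dst_port}".encode()
    return members[zlib.crc32(key) % len(members)]

# Hypothetical LAG with four member links
members = ["link1", "link2", "link3", "link4"]

# Hypothetical addresses (documentation range); 3784 is the BFD control port
flow = ("192.0.2.1", "192.0.2.2", 49152, 3784)

chosen = pick_member_link(*flow, members)

# Every packet of the session carries the same 5-tuple,
# so the hash result (and therefore the member link) never changes.
assert all(pick_member_link(*flow, members) == chosen for _ in range(1000))
print(f"BFD session pinned to {chosen}")
```

Calling the same function with the addresses and ports of the return direction models the peer's choice, which is computed independently and may land on another member.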

To make things more interesting, each router may select a different link for its own direction of the session (in “non-echo”/asynchronous mode):

You have probably begun to guess where the issue lies ^^

BFD is generally configured with more aggressive timers than LACP, enabling quicker detection of link abnormalities. Thus, if there’s an issue on link “3”, BFD might detect it and time out before LACP detects the problem. Consequently, the BFD client protocol could report its adjacency down, even though the other member links are still healthy.
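To put rough numbers on that timer gap, the sketch below compares detection times. The BFD values (300 ms interval, multiplier 3) are assumed for illustration, and the calculation assumes both peers use the same timers; in practice the detection time is negotiated between them. The LACP figures follow IEEE 802.1AX: LACPDUs every 1 s (fast) or 30 s (slow), with partner information expiring after three missed periods.

```python
def bfd_detection_time_ms(interval_ms: int, multiplier: int) -> int:
    """Simplified BFD detection time, assuming symmetric timers on both peers."""
    return interval_ms * multiplier

def lacp_timeout_s(fast: bool) -> int:
    """LACP timeout: 3 x LACPDU period (1 s fast, 30 s slow)."""
    period_s = 1 if fast else 30
    return 3 * period_s

# Assumed BFD settings for illustration: 300 ms x 3
bfd_ms = bfd_detection_time_ms(300, 3)

print(f"BFD detection : {bfd_ms} ms")
print(f"LACP fast     : {lacp_timeout_s(fast=True) * 1000} ms")
print(f"LACP slow     : {lacp_timeout_s(fast=False) * 1000} ms")
```

With these assumed values, BFD expires well before even fast LACP removes the faulty member from the bundle, which is the window in which the adjacency can flap.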

Also, a BFD session running across the aggregate with no visibility into the member links cannot guarantee detection of a physical member link failure, especially if the LACP timer is long (30s for example). The objective is to confirm link continuity for each member link.

Several options are available:

  1. Simply accept it as it is; this may not be a significant issue for you, especially if the backup link has enough capacity to absorb the rerouted traffic
  2. Remove BFD if you don’t need it
  3. Increase the capacity of a single interface if feasible, and remove the LAG bundle
  4. Use ECMP, i.e. multiple distinct interconnections instead of a single LACP link. However, depending on the scale and requirements of your network, a LAG bundle may be more advantageous than ECMP: it may help mitigate platform-specific scaling challenges and avoids unnecessary interference with your IGP
  5. Configure a more aggressive LACP timer (i.e. fast or short LACP) for faster periodic updates, while allowing the BFD timers to be slightly longer; however, this is sometimes unavailable or not tunable enough, depending on the router vendor/OS

Micro-BFD: run BFD on every LAG member link. This requires the BFD implementation to be LAG-aware, so that it can verify connectivity on each constituent link independently. Whether this functionality is available depends on the vendor and its OS. (See RFC 7130 for additional information.)

As long as at least one micro-BFD session is Up, the port-channel session is reported as Up.
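The state logic can be sketched as follows. This is a simplified model, not any vendor's implementation: the names `State`, `lag_bfd_state` and `forwarding_members` are hypothetical. It reflects the rule above (the aggregate session is Up if any micro session is Up) together with the RFC 7130 behavior of taking a member out of load balancing when its own micro-BFD session is down.

```python
from enum import Enum

class State(Enum):
    UP = "Up"
    DOWN = "Down"

def lag_bfd_state(micro_sessions: dict) -> State:
    """Aggregate state reported to BFD clients: Up if any member session is Up."""
    if any(s is State.UP for s in micro_sessions.values()):
        return State.UP
    return State.DOWN

def forwarding_members(micro_sessions: dict) -> list:
    """Members kept in the load-balancing table: only those whose session is Up."""
    return [link for link, s in micro_sessions.items() if s is State.UP]

# Hypothetical 3-member LAG where the micro-BFD session on link3 has failed
sessions = {"link1": State.UP, "link2": State.UP, "link3": State.DOWN}

print(lag_bfd_state(sessions).value)   # aggregate session stays Up
print(forwarding_members(sessions))    # link3 no longer carries traffic
```

The failed member is isolated quickly, while the routing protocol relying on the aggregate session sees no state change.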

Some vendors support RFC 7130:

  • Juniper : https://www.juniper.net/documentation/us/en/software/junos/high-availability/topics/task/bfd-protocol-configuring-micro-bfd-for-lag.html
  • Cisco NX-OS and XE : https://www.cisco.com/c/en/us/td/docs/routers/ios/config/17-x/ip-routing/b-ip-routing/m-micro-bfd-sessions.html
  • Cisco XR (named BLB & BOB) : https://www.cisco.com/c/en/us/td/docs/routers/asr9000/software/asr9k-r6-9/routing/configuration/guide/b-routing-cg-asr9000-69x/implementing-bfd.html
  • Arista : https://www.arista.com/en/um-eos/eos-bidirectional-forwarding-detection

Others still do not:

  • Palo Alto Firewalls : https://docs.paloaltonetworks.com/pan-os/11-1/pan-os-networking-admin/bfd/bfd-overview/non-supported-rfc-components-of-bfd
  • Fortinet Firewalls : not supported

Configuring BFD is simple, and configuring a LAG bundle is simple, but configuring both together requires some anticipation of how they interact. The more protocols you combine in your design, the more complexity and unexpected results you may encounter.

So, regarding micro-BFD: depending on your specific requirements, it may not be necessary, and simplicity could be prioritized, since enabling micro-BFD may come with certain limitations that vary by vendor (make sure to refer to the vendor configuration guide for specific details). Ultimately, as always, you have to consider the required resilience level and your SLA objectives.

Are you considering tuning your network, or do you have any questions about it? We can assist you. You can reach out to us through the CONTACT-US page or by emailing us at [email protected] for advice and assistance.

Mehdi SFAR (CCDE 2021:3, CCIE #51583)