Symptoms
You pull up show bgp evpn and the route table is sparse or completely empty. Remote hosts aren't reachable across the fabric. Maybe the MAC/IP routes from a remote VTEP never showed up. Maybe the Type-5 IP prefix routes are missing from the L3 VRF. The underlay BGP sessions look fine, pings to loopbacks work, but the overlay is dark.
I've seen this surface in several ways depending on how far the control plane got before failing:
- show vxlan address-table shows only local MACs, with no remote entries at all
- Traffic between VTEPs drops silently with no ICMP unreachable, just black holes
- show bgp evpn route-type mac-ip returns nothing for a VNI that should have remote learnings
- Type-2 or Type-5 routes appear in show bgp evpn but aren't installed in the RIB or hardware table
The frustrating part about EVPN troubleshooting is that BGP sessions can look perfectly healthy — all neighbors established, prefixes counted — and the overlay still doesn't work. Route targets, VNIs, RDs, and EVPN address family activation all have to line up exactly. Miss one piece and you get silence.
Let me walk through the causes I hit most often, in roughly the order I check them.
Root Cause 1: BGP EVPN Not Enabled
Why It Happens
BGP EVPN requires explicit activation of the L2VPN EVPN address family on each neighbor. A BGP session established for the underlay (IPv4 unicast) carries zero EVPN NLRIs unless you explicitly activate it. If the EVPN address family isn't turned on, the BGP session exists and looks healthy, but no EVPN routes are ever exchanged. This catches people who set up the underlay, confirm BGP is up, and assume EVPN is running when it isn't.
How to Identify It
Run show bgp evpn summary and compare it against show bgp summary. If a neighbor shows Established in the IPv4 table but Idle or Active in the EVPN table, the address family isn't activated.
sw-infrarunbook-01# show bgp evpn summary
BGP summary information for VRF default
Router identifier 10.0.0.1, local AS number 65001
Neighbor Status Codes: m - Under maintenance
Description Neighbor V AS MsgRcvd MsgSent InQ OutQ Up/Down State PfxRcd PfxAcc
sw-infrarunbook-02 10.0.0.2 4 65002 0 0 0 0 00:05:23 Idle 0 0
Then look at the BGP config to confirm the EVPN address family stanza is absent:
sw-infrarunbook-01# show running-config section router bgp
router bgp 65001
router-id 10.0.0.1
neighbor 10.0.0.2 remote-as 65002
!
address-family ipv4
neighbor 10.0.0.2 activate
No address-family evpn block. That's the problem right there.
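The comparison between the two summaries is easy to script once you have the neighbor states in hand. The following is a minimal sketch, assuming you have already collected each summary into a dict of neighbor IP to state string; the function name and the dict shape are illustrative, not part of any Arista tooling.

```python
def evpn_inactive_neighbors(ipv4_states, evpn_states):
    """Return neighbors that are Established for IPv4 unicast but not for EVPN.

    Both arguments are hypothetical dicts mapping neighbor IP -> state string,
    for example scraped from the two summary commands.
    """
    suspects = []
    for neighbor, state in sorted(ipv4_states.items()):
        evpn_state = evpn_states.get(neighbor, "missing")
        # The summary output abbreviates Established as "Estab".
        if state.startswith("Estab") and not evpn_state.startswith("Estab"):
            suspects.append((neighbor, evpn_state))
    return suspects


ipv4 = {"10.0.0.2": "Estab", "10.0.0.3": "Estab"}
evpn = {"10.0.0.2": "Idle", "10.0.0.3": "Estab"}
print(evpn_inactive_neighbors(ipv4, evpn))
```

Any neighbor this returns is a candidate for a missing EVPN activation; the running config is still the place to confirm it.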
How to Fix It
Activate the EVPN address family on the neighbor and make sure extended communities are being sent — EVPN route targets ride as BGP extended communities, and without that knob they get stripped in transit.
sw-infrarunbook-01(config)# router bgp 65001
sw-infrarunbook-01(config-router-bgp)# neighbor 10.0.0.2 send-community extended
sw-infrarunbook-01(config-router-bgp)# address-family evpn
sw-infrarunbook-01(config-router-bgp-af)# neighbor 10.0.0.2 activate
After committing, confirm the session comes up and prefixes begin flowing:
sw-infrarunbook-01# show bgp evpn summary
Description Neighbor V AS MsgRcvd MsgSent InQ OutQ Up/Down State PfxRcd PfxAcc
sw-infrarunbook-02 10.0.0.2 4 65002 142 138 0 0 00:01:04 Estab 24 24
In my experience, using a BGP peer group for all VTEP neighbors makes this consistent. Define the peer group once with send-community extended and EVPN activated, and every new VTEP you add inherits it automatically.
Root Cause 2: Route Target Mismatch
Why It Happens
EVPN uses BGP extended communities called route targets to control which routes get imported into which VRFs and VNIs. The export RT on the sending VTEP must match the import RT on the receiving VTEP. If they don't match, routes get advertised and received correctly at the BGP level — but they're never imported. They sit in the BGP table marked as not best or rejected by import policy. This is the most common cause I encounter in manually configured fabrics or environments that evolved over time from multiple engineers touching the config.
How to Identify It
Compare the EVPN instance configuration on each VTEP side by side:
sw-infrarunbook-01# show bgp evpn instance vlan 10
EVPN instance: VLAN 10
Route distinguisher: 10.0.0.1:10010
Route target import: Route-Target-AS:65001:10010
Route target export: Route-Target-AS:65001:10010
Service interface: VLAN-based
Local VXLAN IP address: 10.0.0.1
VNI: 10010
sw-infrarunbook-02# show bgp evpn instance vlan 10
EVPN instance: VLAN 10
Route distinguisher: 10.0.0.2:10010
Route target import: Route-Target-AS:65001:1010
Route target export: Route-Target-AS:65001:1010
sw-infrarunbook-01 exports 65001:10010 but sw-infrarunbook-02 is only importing 65001:1010, which is missing a zero. They'll never match. You can confirm by looking at the detailed route entry on the receiving side:
sw-infrarunbook-02# show bgp evpn route-type mac-ip detail
BGP routing table entry for mac-ip 10010 5000.0003.0001, Route Distinguisher: 10.0.0.1:10010
Paths: 1 available
Path from: 10.0.0.1 (65001)
Extended community: Route-Target-AS:65001:10010 TunnelEncap:tunnelTypeVxlan
Not selected: extended community not matching import policy
That Not selected: extended community not matching import policy line is the definitive tell.
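Eyeballing route targets side by side gets error-prone once a fabric has more than a couple of VTEPs. Here is a minimal sketch of the same comparison in Python, assuming you have gathered each switch's import and export RTs for one VLAN into a dict; the data structure and function name are hypothetical.

```python
from itertools import permutations


def rt_mismatches(instances):
    """Find (exporter, importer) pairs where no exported RT is imported.

    `instances` is a hypothetical dict for a single VLAN/VNI:
    VTEP name -> {"import": set of RTs, "export": set of RTs}.
    """
    problems = []
    for exporter, importer in permutations(sorted(instances), 2):
        exported = instances[exporter]["export"]
        imported = instances[importer]["import"]
        # An empty intersection means the importer drops every route.
        if not exported & imported:
            problems.append((exporter, importer, sorted(exported), sorted(imported)))
    return problems


vlan10 = {
    "sw-infrarunbook-01": {"import": {"65001:10010"}, "export": {"65001:10010"}},
    "sw-infrarunbook-02": {"import": {"65001:1010"}, "export": {"65001:1010"}},
}
for exporter, importer, exported, imported in rt_mismatches(vlan10):
    print(f"{exporter} exports {exported}, but {importer} imports {imported}")
```

With the values from the output above, the check reports the mismatch in both directions, which matches the symptom: neither VTEP imports the other's routes.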
How to Fix It
Align the route targets across the fabric. On Arista, this is set under the per-VLAN BGP EVPN stanza:
sw-infrarunbook-02(config)# router bgp 65002
sw-infrarunbook-02(config-router-bgp)# vlan 10
sw-infrarunbook-02(config-macvrf-10)# route-target both 65001:10010
For VRF-based Type-5 routing, fix it under the VRF:
sw-infrarunbook-02(config-router-bgp)# vrf PROD
sw-infrarunbook-02(config-router-bgp-vrf)# route-target import evpn 65001:50000
sw-infrarunbook-02(config-router-bgp-vrf)# route-target export evpn 65001:50000
After the fix, show bgp evpn route-type mac-ip should show routes moving from not best to valid, active.
Root Cause 3: VNI Not Advertised
Why It Happens
Even with BGP EVPN activated and route targets aligned, a VLAN's MAC/IP routes won't be advertised unless the VLAN is explicitly included in the BGP EVPN process. On Arista, this means the VLAN needs a vlan <id> stanza under router bgp (or membership in a vlan-aware-bundle) with redistribute learned configured. If that stanza is missing, the switch learns MACs on that VLAN locally but never puts them into BGP. From the perspective of the rest of the fabric, that VLAN is silent.
How to Identify It
Start by verifying the VXLAN VNI mapping exists on the interface:
sw-infrarunbook-01# show vxlan vni
VNI to VLAN Mapping for Vxlan1
VNI        VLAN       Source       Interface       802.1Q Tag
---------- ---------- ------------ --------------- ----------
10010      10         static       Ethernet5       untagged
                                   Vxlan1          10
The VNI-to-VLAN mapping exists. Now check whether BGP knows about it:
sw-infrarunbook-01# show bgp evpn instance
EVPN instance: VLAN 20
Route distinguisher: 10.0.0.1:10020
Route target import: Route-Target-AS:65001:10020
Route target export: Route-Target-AS:65001:10020
VNI: 10020
EVPN instance: VLAN 30
Route distinguisher: 10.0.0.1:10030
...
VLAN 10 is completely absent from the EVPN instance list despite having a VNI mapping. Check the BGP running config:
sw-infrarunbook-01# show running-config section router bgp
router bgp 65001
vlan-aware-bundle PROD
vlan 20-30
route-target both 65001:20000
redistribute learned
VLAN 10 isn't in any bundle or standalone stanza. The VNI is mapped at the interface level but BGP has no instructions to advertise it.
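The gap between "has a VNI mapping" and "is known to BGP" is a set difference, so it is easy to audit in bulk. A minimal sketch, assuming you have the VNI-mapped VLAN IDs as a list and the VLAN specs from the BGP config (standalone vlan stanzas or bundle ranges) as strings; both helper names are hypothetical.

```python
def expand_vlan_ranges(spec):
    """Expand a VLAN spec such as "20-30,40" into a set of VLAN IDs."""
    vlans = set()
    for part in spec.split(","):
        part = part.strip()
        if not part:
            continue
        if "-" in part:
            low, high = part.split("-", 1)
            vlans.update(range(int(low), int(high) + 1))
        else:
            vlans.add(int(part))
    return vlans


def unadvertised_vlans(vni_mapped_vlans, bgp_vlan_specs):
    """Return VLANs that have a VNI mapping but no BGP EVPN stanza or bundle."""
    advertised = set()
    for spec in bgp_vlan_specs:
        advertised |= expand_vlan_ranges(spec)
    return sorted(set(vni_mapped_vlans) - advertised)


# VLAN 10 is mapped to a VNI, but the only BGP entry is the bundle "vlan 20-30".
print(unadvertised_vlans([10, 20, 30], ["20-30"]))
```

Run against the example above, the only VLAN left over is 10, the one missing from the EVPN instance list.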
How to Fix It
Add the VLAN to the BGP EVPN configuration. If you're using per-VLAN mode:
sw-infrarunbook-01(config)# router bgp 65001
sw-infrarunbook-01(config-router-bgp)# vlan 10
sw-infrarunbook-01(config-macvrf-10)# rd 10.0.0.1:10010
sw-infrarunbook-01(config-macvrf-10)# route-target both 65001:10010
sw-infrarunbook-01(config-macvrf-10)# redistribute learned
If you're using vlan-aware-bundle, add the VLAN to the existing bundle:
sw-infrarunbook-01(config-router-bgp)# vlan-aware-bundle PROD
sw-infrarunbook-01(config-macvrf-PROD)# vlan add 10
Verify routes begin propagating within a few seconds:
sw-infrarunbook-01# show bgp evpn route-type mac-ip vni 10010
Network Next Hop Metric LocPref Weight Path
* > RD: 10.0.0.1:10010 mac-ip 5000.0003.0001
- - - 0 i
* > RD: 10.0.0.2:10010 mac-ip 5000.0004.0001
10.0.0.2 - 100 0 65002 i
Root Cause 4: MAC Mobility Issue
Why It Happens
EVPN Type-2 routes carry a MAC mobility extended community that includes a sequence number. This is the mechanism EVPN uses to handle host moves — when a host migrates from one VTEP to another, the new VTEP advertises the MAC with a higher sequence number and all other VTEPs should prefer it and withdraw the stale remote entry. Problems arise when a MAC is being actively learned at two different VTEPs simultaneously — either because of a loop, a misconfigured dual-homing setup, or a VM migration where the hypervisor brought up the new instance before the old NIC went fully silent. When VTEPs see conflicting advertisements with equal sequence numbers, some routes end up stuck or oscillating.
How to Identify It
Look at the MAC mobility sequence numbers in the detailed EVPN route output:
sw-infrarunbook-01# show bgp evpn route-type mac-ip detail
BGP routing table entry for mac-ip 10010 5000.0005.0001 10.1.10.50, Route Distinguisher: 10.0.0.1:10010
Paths: 2 available
Path from: 10.0.0.1 (65001)
Extended community: Route-Target-AS:65001:10010 TunnelEncap:tunnelTypeVxlan EvpnMacMobility:Seq:5
Origin IGP, valid, redistributed, best, ECMP head
Path from: 10.0.0.2 (65002)
Extended community: Route-Target-AS:65001:10010 TunnelEncap:tunnelTypeVxlan EvpnMacMobility:Seq:3
Origin IGP, valid, not best
Here sw-infrarunbook-01 holds seq 5 and sw-infrarunbook-02 holds seq 3, so sw-infrarunbook-01 wins, which is correct behavior after a move. But if you see both VTEPs advertising the same sequence number, or if syslog shows the mobility conflict message repeating rapidly, you have a genuine conflict:
sw-infrarunbook-01# show logging | grep "5000.0005.0001"
Apr 14 22:01:03 sw-infrarunbook-01 Bgp: %BGP-5-NOTIFICATION: MAC mobility conflict for MAC 5000.0005.0001 VNI 10010, seq 7 from 10.0.0.2 equals local seq 7
Apr 14 22:01:04 sw-infrarunbook-01 Bgp: %BGP-5-NOTIFICATION: MAC mobility conflict for MAC 5000.0005.0001 VNI 10010, seq 7 from 10.0.0.2 equals local seq 7
Rapid repetition of that log line means the MAC is actively oscillating between two VTEPs without either winning.
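The "same sequence number from two VTEPs" condition can be checked across every MAC at once rather than one route at a time. A minimal sketch, assuming you have reduced the detailed mac-ip output to (MAC, VTEP IP, sequence) tuples; the tuple shape and function name are assumptions for illustration.

```python
from collections import defaultdict


def mobility_conflicts(paths):
    """Flag MACs whose highest sequence number is advertised by more than one VTEP.

    `paths` is a hypothetical list of (mac, vtep_ip, sequence) tuples taken
    from the detailed mac-ip route output.
    """
    by_mac = defaultdict(list)
    for mac, vtep, seq in paths:
        by_mac[mac].append((vtep, seq))

    conflicts = {}
    for mac, entries in by_mac.items():
        top = max(seq for _, seq in entries)
        holders = sorted({vtep for vtep, seq in entries if seq == top})
        # More than one holder of the top sequence means no clear winner.
        if len(holders) > 1:
            conflicts[mac] = (top, holders)
    return conflicts


paths = [
    ("5000.0005.0001", "10.0.0.1", 7),
    ("5000.0005.0001", "10.0.0.2", 7),
    ("5000.0006.0001", "10.0.0.1", 5),
    ("5000.0006.0001", "10.0.0.2", 3),
]
print(mobility_conflicts(paths))
```

The second MAC in the sample is a normal post-move state (5 beats 3) and is not flagged; only the tied entry is.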
How to Fix It
First, track down where the MAC is physically learned. On each VTEP involved:
sw-infrarunbook-01# show mac address-table address 5000.0005.0001
MAC Address Type Vlan Interface
5000.0005.0001 Dynamic 10 Ethernet3
If the same MAC is showing as locally learned on two different VTEPs simultaneously, you have a loop or a dual-active forwarding problem, not a control plane issue. Trace the physical topology and confirm the host has a single active uplink or is properly dual-homed, either through MLAG or through EVPN multihoming with a consistent ESI. If the host genuinely migrated and the old VTEP just hasn't timed out yet, flush it manually:
sw-infrarunbook-02(config)# clear mac address-table dynamic address 5000.0005.0001
For MLAG-based dual-homing, both MLAG peers should advertise under a shared MLAG VTEP IP rather than their individual loopbacks. Advertising from two separate VTEP IPs with different RDs for the same MAC is what triggers the mobility conflict in the first place. Confirm the MLAG shared IP is configured under interface Vxlan1:
sw-infrarunbook-01# show running-config section interface Vxlan1
interface Vxlan1
vxlan source-interface Loopback0
vxlan virtual-router encapsulation mac-address mlag-system-id
vxlan udp-port 4789
vxlan vlan 10 vni 10010
vxlan mlag vtep source-interface Loopback1
The vxlan mlag vtep source-interface Loopback1 line tells the switch to use the MLAG shared loopback (Loopback1) for VTEP advertisements, ensuring both peers appear as a single VTEP to the rest of the fabric.
Root Cause 5: RD Conflict
Why It Happens
The Route Distinguisher makes EVPN routes globally unique within the BGP control plane. Arista auto-generates RDs in the format <loopback-ip>:<vni> or <loopback-ip>:<vlan-id> based on the switch's Loopback0 address. If two VTEPs share the same loopback IP, which I've seen happen when switches are cloned from a golden config template and someone forgets to update the loopback, their auto-generated RDs will be identical. BGP then treats routes from both VTEPs as if they came from the same source. One set of routes overwrites the other, or a route reflector starts rejecting duplicates, and you get sporadic or total loss of EVPN routes from one VTEP.
How to Identify It
In show bgp evpn route-type mac-ip detail, look at which RDs are present and which BGP peer they originated from:
sw-infrarunbook-01# show bgp evpn route-type mac-ip detail
BGP routing table entry for mac-ip 10010 5000.0006.0001, Route Distinguisher: 10.0.0.1:10010
Paths: 1 available
Path from: 10.0.0.2 (65002)
Route Distinguisher: 10.0.0.1:10010
Extended community: Route-Target-AS:65001:10010 TunnelEncap:tunnelTypeVxlan
The path arrived from peer 10.0.0.2 (sw-infrarunbook-02 based on its BGP router-id) but is carrying RD 10.0.0.1:10010. That's the collision: sw-infrarunbook-02 is generating RDs as if its loopback is 10.0.0.1. Confirm by SSHing to the offending switch:
sw-infrarunbook-02# show interfaces Loopback0
Loopback0 is up, line protocol is up (connected)
Hardware is Loopback
Internet address is 10.0.0.1/32
Duplicate loopback. The cloned config was never updated.
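The "RD says one address, the path came from another" check can be roughed out in a few lines. This is a heuristic sketch only, assuming RDs follow the <loopback-ip>:<number> format and that VTEPs peer directly with each other; behind a route reflector every route would look mismatched, so it would need the originating VTEP rather than the peer. The tuple shape and function name are hypothetical.

```python
def rd_peer_mismatches(paths):
    """Return paths whose RD address field differs from the advertising peer.

    `paths` is a hypothetical list of (route_distinguisher, peer_ip) tuples.
    Assumes <loopback-ip>:<number> RDs and direct VTEP-to-VTEP peering.
    """
    suspects = []
    for rd, peer in paths:
        # Split on the last colon: "10.0.0.1:10010" -> "10.0.0.1".
        admin, _, _ = rd.rpartition(":")
        if admin != peer:
            suspects.append((rd, peer))
    return suspects


paths = [
    ("10.0.0.1:10010", "10.0.0.2"),  # RD built from another switch's loopback
    ("10.0.0.3:10010", "10.0.0.3"),  # RD matches the advertising peer
]
print(rd_peer_mismatches(paths))
```

A hit here is a lead, not a verdict; checking Loopback0 on the suspect switch is still what confirms the duplicate.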
How to Fix It
Assign the correct unique loopback to sw-infrarunbook-02 and update the BGP router-id to match:
sw-infrarunbook-02(config)# interface Loopback0
sw-infrarunbook-02(config-if-Lo0)# ip address 10.0.0.2/32
sw-infrarunbook-02(config)# router bgp 65002
sw-infrarunbook-02(config-router-bgp)# router-id 10.0.0.2
If RDs were manually configured rather than auto-generated, update each one:
sw-infrarunbook-02(config-router-bgp)# vlan 10
sw-infrarunbook-02(config-macvrf-10)# rd 10.0.0.2:10010
Then reset the EVPN sessions to force RD re-advertisement with the corrected values:
sw-infrarunbook-02# clear bgp evpn all
After recovery, verify the RDs are now unique across the fabric:
sw-infrarunbook-01# show bgp evpn route-type mac-ip detail | grep "Route Distinguisher"
BGP routing table entry for mac-ip 10010 5000.0006.0001, Route Distinguisher: 10.0.0.1:10010
BGP routing table entry for mac-ip 10010 5000.0007.0001, Route Distinguisher: 10.0.0.2:10010
Each VTEP now owns a distinct RD namespace. Routes from both VTEPs coexist cleanly in the BGP table.
Root Cause 6: VTEP Underlay Reachability
Why It Happens
This one is subtle because it looks like a control plane success but a data plane failure. EVPN routes can be correctly exchanged in BGP, show as best, and even get installed in the RIB — but traffic still black-holes if the VTEP loopback addresses aren't reachable via the underlay IGP or BGP. VXLAN encapsulation requires actual IP reachability between VTEP source IPs. If the underlay isn't advertising a VTEP's loopback, the remote VTEP tunnel destination is unreachable and the route doesn't get programmed into the forwarding ASIC even if it looks fine in software.
How to Identify It
sw-infrarunbook-01# show vxlan vtep
Remote VTEPS for Vxlan1:
VTEP Tunnel Type(s)
--------------- --------------
Empty VTEP table despite having EVPN routes is the signature. Confirm the underlay has a gap:
sw-infrarunbook-01# show ip route 10.0.0.2
% Network not in table
Fix the underlay routing protocol to ensure all loopbacks are reachable before troubleshooting the EVPN overlay further. These two layers must be solved independently and in order: underlay first, overlay second.
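Checking every EVPN next hop against the underlay table by hand gets tedious on a large fabric. A minimal sketch of that cross-check, assuming you have the EVPN next-hop addresses and the underlay route prefixes as plain lists of strings; the function name and inputs are illustrative.

```python
import ipaddress


def unreachable_vteps(evpn_next_hops, underlay_prefixes):
    """Return EVPN next hops that no underlay route covers."""
    networks = [ipaddress.ip_network(prefix) for prefix in underlay_prefixes]
    missing = []
    for hop in sorted(set(evpn_next_hops), key=ipaddress.ip_address):
        addr = ipaddress.ip_address(hop)
        if not any(addr in net for net in networks):
            missing.append(hop)
    return missing


next_hops = ["10.0.0.2", "10.0.0.3"]
underlay = ["10.0.0.1/32", "10.0.0.3/32", "172.16.0.0/31"]
print(unreachable_vteps(next_hops, underlay))
```

One caveat with this approach: a default route in the underlay list would cover every address, so it is worth filtering that out before trusting a clean result.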
Prevention
Most of these problems share a common thread: they're configuration inconsistencies that slip in during manual provisioning. The best prevention is reducing the surface area where humans can introduce those inconsistencies.
Automate RD and RT assignment. Don't let engineers manually type route distinguishers and route targets during provisioning. Build a consistent scheme, <loopback>:<vni> for RDs and <ASN>:<vni> for RTs, into your configuration templates and enforce it through CVP (CloudVision Portal) or a provisioning pipeline. Manually entered values are where typos and duplicates live.
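As an illustration of what "build the scheme into templates" could look like, here is a minimal sketch that derives both values from inputs you already have. The function name and the rt_asn parameter are hypothetical. One constraint worth encoding: an IP-based RD has only a 2-byte number after the address, so this scheme only works for VNIs up to 65535.

```python
import ipaddress


def evpn_identifiers(loopback, rt_asn, vni):
    """Derive (rd, rt) strings using <loopback>:<vni> and <ASN>:<vni>."""
    ipaddress.IPv4Address(loopback)  # raises ValueError on a malformed address
    # An IP-based RD leaves a 2-byte field for the number after the colon.
    if not 0 < vni <= 65535:
        raise ValueError(f"VNI {vni} does not fit an <ip>:<number> RD")
    return f"{loopback}:{vni}", f"{rt_asn}:{vni}"


switches = [("sw-infrarunbook-01", "10.0.0.1"), ("sw-infrarunbook-02", "10.0.0.2")]
for name, loopback in switches:
    rd, rt = evpn_identifiers(loopback, 65001, 10010)
    print(name, rd, rt)
```

Because the RT depends only on the shared ASN and the VNI, every VTEP ends up with the same import/export value for a given VNI, while the RD stays unique as long as the loopbacks are.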
Use BGP peer groups for EVPN neighbors. Define a single peer group that has send-community extended and EVPN address family activation baked in. Every new VTEP added to that group inherits the correct EVPN config automatically, rather than depending on someone remembering to configure four separate lines.
sw-infrarunbook-01(config)# router bgp 65001
sw-infrarunbook-01(config-router-bgp)# neighbor VTEP-PEERS peer group
sw-infrarunbook-01(config-router-bgp)# neighbor VTEP-PEERS send-community extended
sw-infrarunbook-01(config-router-bgp)# address-family evpn
sw-infrarunbook-01(config-router-bgp-af)# neighbor VTEP-PEERS activate
Validate loopback uniqueness before deploying. Before any EVPN config is applied to a new switch, confirm its Loopback0 is unique across the fabric and reachable from every existing VTEP. A simple pre-flight check (ping each VTEP loopback from the management network or via the underlay) catches duplicate IPs before they cause RD collisions. Starting EVPN configuration before the underlay is solid is a reliable way to end up chasing ghosts.
Monitor MAC mobility counters with telemetry. Set up streaming telemetry or periodic polling on MAC mobility sequence numbers. A MAC whose sequence number is incrementing rapidly is a host that's flapping between VTEPs — either a loop, a dual-active problem, or a hypervisor bug. Catching this early through monitoring prevents it from escalating into a fabric-wide forwarding incident.
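What "incrementing rapidly" means is a local judgment call, but the detection logic itself is simple. A minimal sketch over polled samples, assuming each sample is a (timestamp in seconds, MAC, sequence number) tuple; the window and threshold defaults are arbitrary placeholders, not recommended values.

```python
def flapping_macs(samples, window_seconds=60, max_increase=3):
    """Flag MACs whose mobility sequence rose by more than max_increase in the window.

    `samples` is a hypothetical list of (timestamp_seconds, mac, sequence)
    tuples from polling or streaming telemetry.
    """
    history = {}
    flagged = set()
    for ts, mac, seq in sorted(samples):
        points = history.setdefault(mac, [])
        points.append((ts, seq))
        # Drop samples that have aged out of the window.
        while ts - points[0][0] > window_seconds:
            points.pop(0)
        if seq - points[0][1] > max_increase:
            flagged.add(mac)
    return sorted(flagged)


samples = [
    (0, "5000.0005.0001", 3), (20, "5000.0005.0001", 6), (40, "5000.0005.0001", 9),
    (0, "5000.0006.0001", 5), (40, "5000.0006.0001", 6),
]
print(flapping_macs(samples))
```

In the sample data, the first MAC climbs by six within 40 seconds and is flagged, while the second moves once and is left alone.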
Validate VNI changes in a staging environment first. Additions or modifications to VNI-to-VLAN mappings and route target changes can silently drop traffic on a production fabric if applied incorrectly. Arista's CloudVision change control workflow is genuinely useful here: it captures the config diff, stages the change, and can roll back automatically if health checks fail post-change. Don't push VNI config changes to production switches directly from a terminal window.
Make show bgp evpn summary, show bgp evpn instance, and show vxlan vtep your baseline trio. These three commands give you the full picture: whether the BGP EVPN control plane is active, whether EVPN instances are correctly defined, and whether the data-plane VTEP tunnels are resolving. Run them before and after any fabric change as a sanity check. If all three look right, you're in good shape. If any one of them is off, you know exactly which layer to dig into.
