Symptoms
You've built your VXLAN fabric, BGP sessions are up, the topology looks clean — and then traffic just doesn't flow. Hosts on the same VLAN sitting behind different leaf switches can't reach each other. Pings time out. The underlay looks healthy, OSPF or BGP shows all loopbacks reachable, but the overlay is a black hole.
I've seen this exact scenario more times than I'd like to admit. The frustrating part is that VXLAN failures are usually silent. There's no loud syslog error pointing you at the broken piece. The fabric looks fine from a control plane perspective, but the data plane quietly drops everything.
Common symptoms include:
- VMs or hosts on the same VLAN across different leaf switches cannot ping each other
- Traffic works fine when both hosts are behind the same leaf (local switching), but fails across the fabric
- BGP EVPN sessions show as Established but zero MAC/IP routes are being exchanged
- The MAC address table on a leaf shows no remote entries
- Intermittent packet loss on flows that cross the VXLAN fabric
- ARP requests go unanswered across the overlay
- Large frames fail while small pings succeed — the classic MTU tell
Let's go through the causes one by one. Most of these take under five minutes to confirm once you know where to look.
Root Cause 1: VNI Not Configured or Mismatched
Why It Happens
VXLAN maps each VLAN to a VXLAN Network Identifier (VNI). If that mapping is missing on one VTEP, or if the VNI numbers don't match between the local and remote leaf switches, traffic won't traverse the tunnel. This is probably the most common day-one mistake in VXLAN deployments. You configure ten VLANs, forget to add the VNI mapping for one of them, and then spend an hour wondering why that specific VLAN doesn't work while everything else does. The other direction might work fine too — which makes the failure asymmetric and doubly confusing.
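This cross-check is easy to automate across a fleet. Here's a minimal Python sketch of the comparison, assuming you've already collected each leaf's VLAN-to-VNI table (via eAPI or parsed "show vxlan vni" output); the switch names and mappings below are illustrative, mirroring the scenario in this section:

```python
# Sketch: cross-check VLAN-to-VNI mappings across leaves. The input shape
# {switch_name: {vlan: vni}} is an assumption for illustration, not an
# exact eAPI schema.

def find_vni_mismatches(mappings):
    """mappings: {switch_name: {vlan: vni}} -> list of human-readable findings."""
    findings = []
    all_vlans = set()
    for vlan_map in mappings.values():
        all_vlans.update(vlan_map)
    for vlan in sorted(all_vlans):
        vnis = {sw: m.get(vlan) for sw, m in mappings.items()}
        present = {v for v in vnis.values() if v is not None}
        if None in vnis.values():
            # VLAN exists somewhere in the fabric but has no VNI on these leaves
            missing = [sw for sw, v in vnis.items() if v is None]
            findings.append(f"VLAN {vlan}: missing on {', '.join(sorted(missing))}")
        elif len(present) > 1:
            # Same VLAN mapped to different VNIs on different leaves
            findings.append(f"VLAN {vlan}: VNI mismatch {vnis}")
    return findings

fabric = {
    "sw-infrarunbook-01": {100: 10100, 200: 10200},
    "sw-infrarunbook-02": {100: 10100, 200: 10200, 300: 10300},
}
for finding in find_vni_mismatches(fabric):
    print(finding)  # prints: VLAN 300: missing on sw-infrarunbook-01
```

Run this against every leaf in the fabric and asymmetric VNI gaps surface in seconds instead of after an hour of side-by-side CLI comparison.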
How to Identify It
On sw-infrarunbook-01, check the VNI-to-VLAN mappings:
sw-infrarunbook-01# show vxlan vni
VNI to VLAN Mapping for Vxlan1
VNI VLAN Source Interface 802.1Q Tag
----------- ---------- ------------ --------------- ----------
10100 100 static Vxlan1 10100
10200 200 static Vxlan1 10200
If VLAN 300 should be in the fabric but VNI 10300 isn't in this table, that's your problem. Cross-check with the remote leaf:
sw-infrarunbook-02# show vxlan vni
VNI to VLAN Mapping for Vxlan1
VNI VLAN Source Interface 802.1Q Tag
----------- ---------- ------------ --------------- ----------
10100 100 static Vxlan1 10100
10200 200 static Vxlan1 10200
10300 300 static Vxlan1 10300
Here sw-infrarunbook-02 has VNI 10300 but sw-infrarunbook-01 does not. Traffic will fail from one side only.
How to Fix It
Add the missing VNI mapping on sw-infrarunbook-01:
sw-infrarunbook-01(config)# interface Vxlan1
sw-infrarunbook-01(config-if-Vx1)# vxlan vlan 300 vni 10300
Verify the mapping is now present:
sw-infrarunbook-01# show vxlan vni
VNI to VLAN Mapping for Vxlan1
VNI VLAN Source Interface 802.1Q Tag
----------- ---------- ------------ --------------- ----------
10100 100 static Vxlan1 10100
10200 200 static Vxlan1 10200
10300 300 static Vxlan1 10300
Root Cause 2: VTEP Unreachable in the Underlay
Why It Happens
Every VTEP needs to reach the other VTEPs' loopback addresses through the underlay routing protocol — whether that's OSPF, IS-IS, or an underlay BGP session. If a remote VTEP's loopback isn't in the routing table, no VXLAN tunnel can form to it. In my experience, this usually follows a link failure that wasn't properly reflected in the routing protocol, or a misconfigured redistribute connected statement that omitted the loopback subnet. Sometimes it's as simple as a typo in the OSPF network statement.
How to Identify It
First, check what VTEPs the local switch knows about:
sw-infrarunbook-01# show vxlan vtep
Remote VTEPS for Vxlan1:
VTEP Tunnel Type(s)
---------------- --------------
10.0.0.2 unicast, flood
10.0.0.3 unicast, flood
Now verify the underlay can actually reach those IPs:
sw-infrarunbook-01# ping 10.0.0.3 source 10.0.0.1 repeat 5
PING 10.0.0.3 (10.0.0.3) from 10.0.0.1 : 72(100) bytes of data.
Request timeout for icmp_seq 1
Request timeout for icmp_seq 2
Request timeout for icmp_seq 3
If ping fails, check the routing table:
sw-infrarunbook-01# show ip route 10.0.0.3
% Network not in table
That's the smoking gun. The loopback of the remote VTEP isn't being advertised through the underlay protocol.
How to Fix It
Check the OSPF configuration on the remote node and ensure the loopback is included in the correct area:
sw-infrarunbook-03(config)# router ospf 1
sw-infrarunbook-03(config-router-ospf)# network 10.0.0.3/32 area 0.0.0.0
sw-infrarunbook-03(config-router-ospf)# network 192.168.100.0/30 area 0.0.0.0
If you're using BGP as the underlay, make sure the loopback is being advertised:
sw-infrarunbook-03(config)# router bgp 65003
sw-infrarunbook-03(config-router-bgp)# network 10.0.0.3/32
After fixing, ping should succeed and the VTEP tunnel should come up. Confirm with show vxlan vtep on both sides.
Root Cause 3: BGP EVPN Not Advertising Routes
Why It Happens
BGP EVPN is the control plane for VXLAN in modern Arista deployments. Even when underlay BGP or OSPF sessions are healthy, the EVPN address family might not be exchanging MAC/IP routes. The typical culprits are a missing neighbor activate statement in the EVPN address family, an incorrect Route Distinguisher (RD) or Route Target (RT) configuration, or a missing redistribute learned under the MAC-VRF. The BGP session will show as Established, which is why this trips people up — everything looks fine at the session level, but zero EVPN prefixes are moving.
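The "Established but zero prefixes" condition is easy to flag programmatically once you collect summaries from your leaves. A minimal sketch, assuming the summary has been parsed into simple dicts (the field names here are my own for illustration, not an exact eAPI schema):

```python
# Sketch: flag EVPN peers that are Established but exchanging zero prefixes.
# 'neighbor', 'state', and 'pfx_rcvd' are illustrative field names.

def silent_evpn_peers(peers):
    """Return neighbors whose session is up but which sent no EVPN routes."""
    return [p["neighbor"] for p in peers
            if p["state"] == "Established" and p["pfx_rcvd"] == 0]

peers = [
    {"neighbor": "10.0.0.10", "state": "Established", "pfx_rcvd": 0},
    {"neighbor": "10.0.0.11", "state": "Established", "pfx_rcvd": 42},
]
print(silent_evpn_peers(peers))  # prints ['10.0.0.10'] — the silent peer
```

An empty result across the fabric is what "healthy" looks like; any neighbor in the list deserves the address-family inspection shown below.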
How to Identify It
Check the BGP EVPN summary first:
sw-infrarunbook-01# show bgp evpn summary
BGP summary information for VRF default
Router identifier 10.0.0.1, local AS number 65001
Neighbor AS MsgRcvd MsgSent InQ OutQ Up/Down State PfxRcd PfxAcc
10.0.0.10 65100 1432 1219 0 0 1d02h Estab 0 0
Established session but zero prefixes received — that's a red flag. Check if any EVPN routes exist locally:
sw-infrarunbook-01# show bgp evpn route-type mac-ip
BGP routing table information for VRF default
Router identifier 10.0.0.1, local AS number 65001
(no routes found)
Now inspect the running config to see if the EVPN address family is properly activating the neighbor:
sw-infrarunbook-01# show running-config section router bgp
router bgp 65001
router-id 10.0.0.1
neighbor 10.0.0.10 remote-as 65100
neighbor 10.0.0.10 update-source Loopback0
!
address-family evpn
no neighbor 10.0.0.10 activate
There it is — no neighbor 10.0.0.10 activate in the EVPN address family. The session formed under the base BGP config, but EVPN route exchange is explicitly disabled.
How to Fix It
sw-infrarunbook-01(config)# router bgp 65001
sw-infrarunbook-01(config-router-bgp)# address-family evpn
sw-infrarunbook-01(config-router-bgp-af)# neighbor 10.0.0.10 activate
Also verify the MAC-VRF has a correct RD, RT, and redistribution enabled:
sw-infrarunbook-01(config)# router bgp 65001
sw-infrarunbook-01(config-router-bgp)# vlan 100
sw-infrarunbook-01(config-router-bgp-vlan-100)# rd 10.0.0.1:100
sw-infrarunbook-01(config-router-bgp-vlan-100)# route-target both 65001:10100
sw-infrarunbook-01(config-router-bgp-vlan-100)# redistribute learned
After this change, BGP should start exchanging MAC/IP Type-2 routes. Confirm:
sw-infrarunbook-01# show bgp evpn route-type mac-ip
BGP routing table information for VRF default
Router identifier 10.0.0.1, local AS number 65001
Route status codes: * - valid, > - active, S - Stale
Network Next Hop Metric LocPref Weight Path
* > RD: 10.0.0.2:100 mac-ip aabb.cc00.0200
10.0.0.2 - 100 0 65100 65002 i
Root Cause 4: MAC Address Not Learned
Why It Happens
Even with a properly configured BGP EVPN control plane, you can end up with no MAC entries in the forwarding table if the host hasn't generated any traffic, if the flood VTEP list is empty so BUM traffic never reaches the remote leaf, or if ARP suppression is holding back ARP requests that would otherwise trigger MAC learning. Without a MAC entry, the switch doesn't know where to send unicast traffic for a given destination and either floods it (if BUM forwarding works) or drops it entirely.
How to Identify It
Check the MAC address table for the VLAN in question:
sw-infrarunbook-01# show mac address-table vlan 100
Mac Address Table
------------------------------------------------------------------
Vlan Mac Address Type Ports Moves Last Move
---- ----------- ---- ----- ----- ---------
100 aabb.cc00.0100 DYNAMIC Et3 1 0:05:21 ago
Total Mac Addresses for this criterion: 1
Only one MAC entry — the local host. The remote host's MAC (aabb.cc00.0200) is missing. Check if it arrived via BGP EVPN:
sw-infrarunbook-01# show bgp evpn route-type mac-ip detail
(no routes found)
If BGP EVPN appears configured but routes aren't arriving, check the flood VTEP list — BUM traffic needs somewhere to go before initial MAC learning can occur:
sw-infrarunbook-01# show vxlan flood vtep
VXLAN Flood VTEP Table
--------------------------------------------------------------------------------
VLANS Ip Address
----------------------------- ------------------------------------------------
An empty flood list in ingress replication mode means ARP broadcasts and unknown unicast frames never leave this leaf. No ARP reaching the remote side means no ARP reply, which means no traffic.
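The flood-list check itself is just a set difference. A small sketch, assuming you have the expected remote VTEP list from your design documentation or from show vxlan vtep, and the configured flood list from show vxlan flood vtep:

```python
# Sketch: confirm every expected remote VTEP appears in the flood list.
# Input lists are illustrative; in practice they'd come from parsed show output.

def missing_flood_vteps(expected_vteps, flood_list):
    """Remote VTEPs that will never receive BUM traffic from this leaf."""
    return sorted(set(expected_vteps) - set(flood_list))

# With an empty flood list, every remote VTEP is unreachable for BUM traffic:
print(missing_flood_vteps(["10.0.0.2", "10.0.0.3"], []))
# prints ['10.0.0.2', '10.0.0.3']
```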
How to Fix It
For static ingress replication, add the remote VTEPs to the flood list:
sw-infrarunbook-01(config)# interface Vxlan1
sw-infrarunbook-01(config-if-Vx1)# vxlan flood vtep 10.0.0.2
sw-infrarunbook-01(config-if-Vx1)# vxlan flood vtep 10.0.0.3
If you're relying on BGP EVPN Type-3 IMET routes for BUM replication instead of static flood lists, verify those routes are being exchanged:
sw-infrarunbook-01# show bgp evpn route-type imet
BGP routing table information for VRF default
Router identifier 10.0.0.1, local AS number 65001
Network Next Hop Metric LocPref Weight Path
* > RD: 10.0.0.2:100 imet 10.0.0.2
10.0.0.2 - 100 0 65100 65002 i
Once BUM traffic can flow, MAC addresses will be learned and the table will populate. You can trigger this by having the remote host send a ping or gratuitous ARP.
Root Cause 5: MTU Mismatch on the Underlay
Why It Happens
This one bites people constantly, and I'd argue it's the hardest to catch because symptoms are misleading. VXLAN encapsulation adds overhead: 8 bytes for the VXLAN header, 8 bytes for UDP, 20 bytes for the outer IP header, and 14 bytes for the outer Ethernet header — 50 bytes minimum. A standard 1500-byte inner frame becomes 1550 bytes on the wire. If your underlay interfaces are still at the default 1500-byte MTU, those frames get silently dropped at the first congested or fragmentation-unaware hop.
The insidious thing about MTU issues is that small pings work perfectly — they fit within the limit — but larger frames die silently. You'll ping successfully and think the fabric is healthy, then watch your application connections fail or perform terribly. TCP tends to negotiate down its MSS eventually, so you might even see intermittent success that masks the underlying problem.
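The arithmetic is worth internalizing. Here's the overhead math from the paragraph above as a quick sketch, using this article's convention of counting the outer Ethernet header in the 50 bytes:

```python
# Sketch: VXLAN encapsulation overhead, per the breakdown in the text.

VXLAN_HDR = 8    # VXLAN header
UDP_HDR   = 8    # outer UDP header
OUTER_IP  = 20   # outer IPv4 header (no options)
OUTER_ETH = 14   # outer Ethernet header
OVERHEAD = VXLAN_HDR + UDP_HDR + OUTER_IP + OUTER_ETH  # 50 bytes

def required_underlay_mtu(inner_frame_bytes):
    """Underlay MTU needed to carry an inner frame of this size without loss."""
    return inner_frame_bytes + OVERHEAD

def max_inner_frame(underlay_mtu):
    """Largest inner frame that survives encapsulation at this underlay MTU."""
    return underlay_mtu - OVERHEAD

print(required_underlay_mtu(1500))  # 1550 — why default-MTU underlays fail
print(max_inner_frame(9214))        # 9164 — headroom to spare with jumbo MTU
```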
How to Identify It
First check the MTU on uplink interfaces:
sw-infrarunbook-01# show interfaces Ethernet1
Ethernet1 is up, line protocol is up (connected)
Hardware is Ethernet, address is 001c.7300.0101
Description: uplink-to-spine
MTU 1500 bytes, BW 10000000 kbit/s, DLY 10 usec
...
MTU of 1500 on a VXLAN uplink is a problem. Confirm the failure with a sized ping using the df-bit set — this prevents fragmentation so you can pinpoint exactly where frames are getting dropped:
sw-infrarunbook-01# ping 10.0.0.2 source 10.0.0.1 size 1472 df-bit repeat 5
PING 10.0.0.2 (10.0.0.2) from 10.0.0.1 : 1472(1500) bytes of data.
1480 bytes from 10.0.0.2: icmp_seq=1 ttl=63 time=1.43 ms
1480 bytes from 10.0.0.2: icmp_seq=2 ttl=63 time=1.31 ms
This passes, but it proves less than it seems to: a loopback-to-loopback ping travels the underlay natively, with no VXLAN encapsulation, so the 1500-byte packet fits the 1500-byte MTU exactly. Now try an overlay ping from a host on VLAN 100 to the remote host with a large frame:
sw-infrarunbook-01# ping vrf PROD 10.10.100.20 source 10.10.100.10 size 1450 df-bit repeat 5
PING 10.10.100.20 (10.10.100.20) from 10.10.100.10 : 1450(1478) bytes of data.
Request timeout for icmp_seq 1
Request timeout for icmp_seq 2
Request timeout for icmp_seq 3
This fails because the 1478-byte inner IP packet picks up the inner Ethernet, VXLAN, UDP, and outer IP headers on encapsulation — a 1528-byte outer packet, well over the 1500-byte underlay MTU. The difference in behavior between underlay and overlay pings is the tell.
How to Fix It
Increase the MTU on all underlay-facing interfaces. For most deployments, 9214 bytes (jumbo) is the right answer and eliminates this problem class permanently:
sw-infrarunbook-01(config)# interface Ethernet1
sw-infrarunbook-01(config-if-Et1)# mtu 9214
sw-infrarunbook-01(config)# interface Ethernet2
sw-infrarunbook-01(config-if-Et2)# mtu 9214
Do this on every device in the underlay path — spine switches included. A single 1500-byte hop anywhere in the path kills jumbo frame forwarding across the entire fabric. After the change:
sw-infrarunbook-01# show interfaces Ethernet1 | grep MTU
MTU 9214 bytes, BW 10000000 kbit/s, DLY 10 usec
Re-run the large frame overlay ping. It should now succeed.
Root Cause 6: ARP Suppression Enabled Without a Populated Binding Table
Why It Happens
Arista's ARP suppression feature, enabled with vxlan arp-proxy, caches ARP mappings locally and responds on behalf of remote hosts, reducing BUM traffic in the fabric. It's a great optimization when it's working. When it's not, it actively breaks things. If ARP suppression is enabled and the EVPN control plane hasn't yet learned a remote host's MAC/IP binding, the local switch has no answer for ARP requests targeting that host. It won't flood the ARP either — that's the whole point of suppression — so the requesting host never gets a reply and connectivity stalls entirely.
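A toy model makes the failure mode concrete. This is a deliberate simplification — exact EOS behavior for an unmatched request can vary by platform and configuration — but it captures the worst case described above:

```python
# Toy model of ARP suppression: with arp-proxy enabled, the leaf answers from
# its binding table instead of flooding, so a missing binding can mean the
# request simply dies. A simplified sketch, not actual EOS logic.

def handle_arp_request(target_ip, binding_table, arp_proxy_enabled):
    """Return what the leaf does with a host's ARP request for target_ip."""
    if target_ip in binding_table:
        # Known binding: proxy replies locally; without proxy, broadcast floods
        return f"proxy-reply {binding_table[target_ip]}" if arp_proxy_enabled else "flood"
    if arp_proxy_enabled:
        return "drop"   # no binding, no flood: the request goes unanswered
    return "flood"      # without suppression, BUM flooding reaches the remote host

bindings = {"10.10.100.10": "aabb.cc00.0100"}  # remote 10.10.100.20 not yet learned
print(handle_arp_request("10.10.100.20", bindings, arp_proxy_enabled=True))   # drop
print(handle_arp_request("10.10.100.20", bindings, arp_proxy_enabled=False))  # flood
```

The second call is exactly why temporarily disabling arp-proxy (shown below) restores connectivity during bring-up: flooding gives the remote host a chance to answer.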
How to Identify It
Check if ARP suppression is configured and whether the binding table has the expected entries:
sw-infrarunbook-01# show vxlan arp-suppression-vtep
Vxlan1 ARP Suppression Table
--------------------------------------------------------------------------------
VNI Interface IP Address MAC Address VTEP Age
-----+------------+---------------+---------------+---------------+-------
10100 Vxlan1 10.10.100.10 aabb.cc00.0100 local 00:12:04
Only the local host is in the table. The remote host at 10.10.100.20 is missing. Any ARP request for that address from the local host will go unanswered because ARP suppression intercepts it but has nothing to reply with. Check the running config to confirm the feature is active:
sw-infrarunbook-01# show running-config section interface Vxlan1
interface Vxlan1
vxlan source-interface Loopback0
vxlan udp-port 4789
vxlan vlan 100 vni 10100
vxlan arp-proxy
How to Fix It
If BGP EVPN Type-2 MAC/IP routes are flowing correctly, the binding table should populate as hosts generate traffic. The fix is usually to make sure EVPN redistribution is working (see Root Cause 3), then trigger the remote host to send traffic. Alternatively, if you need things working immediately during initial bring-up, temporarily disable ARP proxy and let ARP flood through:
sw-infrarunbook-01(config)# interface Vxlan1
sw-infrarunbook-01(config-if-Vx1)# no vxlan arp-proxy
For a permanent fix, don't enable ARP suppression until your EVPN control plane is fully operational and the binding table has had time to populate via Type-2 routes. The redistribute learned command in the MAC-VRF configuration is what drives those bindings into BGP and therefore into the ARP suppression table on remote leaves.
Root Cause 7: VTEP Source Interface Misconfigured or Down
Why It Happens
The VTEP source interface — typically Loopback0 — is used as the outer source IP for all VXLAN encapsulated packets. If this loopback doesn't exist, has no IP address, isn't advertised into the underlay, or if the Vxlan1 interface references the wrong loopback, no tunnels will form. I've seen engineers create Loopback1 for VTEP use but reference Loopback0 in the Vxlan1 config. The result is a VTEP source of 0.0.0.0 — and nothing works.
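That sanity check is straightforward to script. A sketch, assuming you've parsed the relevant interface state into dicts (the data shapes and field names are illustrative):

```python
# Sketch: verify the interface referenced as the VTEP source exists, is up,
# and actually has an IP address. Input shapes are illustrative.

def check_vtep_source(source_intf, interfaces):
    """Return 'ok' or a description of why the VTEP source is unusable."""
    intf = interfaces.get(source_intf)
    if intf is None:
        return f"{source_intf} referenced by Vxlan1 but not configured"
    if not intf.get("ip"):
        return f"{source_intf} has no IP address"
    if intf.get("status") != "up":
        return f"{source_intf} is {intf.get('status')}"
    return "ok"

# The unaddressed-loopback case from the show output above:
print(check_vtep_source("Loopback0", {"Loopback0": {"ip": None, "status": "up"}}))
# prints: Loopback0 has no IP address
```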
How to Identify It
sw-infrarunbook-01# show vxlan interface
Vxlan1 Interface Information
Tunnel Source : 0.0.0.0 (Loopback0 is down or has no IP)
Tunnel Destination : N/A
Source interface : Loopback0 (DOWN)
Or the loopback is up but unaddressed:
sw-infrarunbook-01# show interfaces Loopback0
Loopback0 is up, line protocol is up (connected)
Hardware is Loopback
MTU 65535 bytes, BW 8000000 kbit/s, DLY 5000 usec
Internet address is unassigned
How to Fix It
sw-infrarunbook-01(config)# interface Loopback0
sw-infrarunbook-01(config-if-Lo0)# ip address 10.0.0.1/32
Then ensure this address is redistributed or explicitly advertised in your underlay routing protocol. After committing the change, show vxlan interface should reflect the correct source IP, and the tunnel state should come up cleanly.
Prevention
Most VXLAN forwarding issues are configuration problems that surface during initial deployment or after a change window. A few habits will eliminate most of this pain before it starts.
Always set jumbo MTU (9214) on underlay-facing interfaces as part of your base switch build. Make this a requirement in your provisioning template, not an afterthought. Apply it to spine switches too — a single standard-MTU hop anywhere in the path breaks jumbo frame forwarding across the entire fabric. Include a large-frame df-bit ping as a mandatory post-change validation step so MTU regressions are caught immediately.
Use a configuration validation tool or CloudVision to enforce VNI-to-VLAN consistency across all leaves. A single switch with a missing VNI entry causes asymmetric failures that are hard to diagnose without cross-comparing multiple switches. CVP configuration templates make this class of inconsistency nearly impossible if you're managing your fleet properly.
Keep your BGP EVPN neighbor configuration in a peer group. All leaves pointing to the same route reflector or spine should inherit EVPN address-family settings from a shared peer group, so a missing neighbor activate becomes structurally impossible rather than a per-device risk. The same principle applies to RD and RT configuration: template it, don't type it by hand each time.
After any VXLAN-related change, run a quick validation sequence before closing the change window:
- show vxlan vtep to confirm all expected VTEPs are present
- show bgp evpn summary to confirm prefix counts look healthy and non-zero
- show mac address-table to confirm remote MACs are being learned
This takes about 30 seconds and catches the majority of issues before anyone notices traffic is broken.
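Those checks collapse naturally into a single pass/fail script you can run at the end of every change window. A sketch, assuming the show outputs have already been collected and parsed (the data shapes here are illustrative):

```python
# Sketch: post-change fabric validation as one function. Inputs would come
# from parsed show output; shapes and names are illustrative assumptions.

def validate_fabric(vteps_seen, expected_vteps, evpn_pfx_rcvd, remote_macs):
    """Return a list of problems; an empty list means the checks all passed."""
    problems = []
    missing = set(expected_vteps) - set(vteps_seen)
    if missing:
        problems.append(f"missing VTEPs: {sorted(missing)}")
    if evpn_pfx_rcvd == 0:
        problems.append("EVPN session up but zero prefixes received")
    if not remote_macs:
        problems.append("no remote MACs learned")
    return problems

# A healthy leaf: every expected VTEP seen, prefixes flowing, remote MACs present
print(validate_fabric(["10.0.0.2", "10.0.0.3"], ["10.0.0.2", "10.0.0.3"],
                      evpn_pfx_rcvd=57, remote_macs=["aabb.cc00.0200"]))  # prints []
```

Wire something like this into your change checklist and the root causes in this article get caught minutes after the change, not at 2am.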
Document your VNI allocation scheme and keep it current. When VNI 10100 maps to VLAN 100 on sw-infrarunbook-01 but VNI 10300 maps to VLAN 100 on sw-infrarunbook-02, you'll never find it without a reference. A simple spreadsheet or a network source-of-truth like NetBox eliminates this class of problem entirely. The few minutes spent documenting saves hours of fabric-wide comparison during a 2am outage.
