What is BGP?
Border Gateway Protocol (BGP) is the routing protocol that holds the internet together. Standardized in RFC 4271, BGP is an Exterior Gateway Protocol (EGP) engineered to exchange routing and reachability information between Autonomous Systems (ASes). An Autonomous System is a collection of IP networks and routers under the control of a single organization, presenting a unified routing policy to the outside world.
Unlike Interior Gateway Protocols such as OSPF or EIGRP — which focus on finding the lowest-cost path within a single organization — BGP is built for policy-based routing across administrative boundaries. BGP is a path-vector protocol: instead of advertising a distance metric, it advertises the complete sequence of AS numbers a route must traverse to reach a destination. This design makes loop prevention straightforward — if a router sees its own AS number already present in an incoming AS_PATH, it discards the route immediately.
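The AS_PATH loop check described above can be sketched in a few lines. This is a minimal illustration, not router code: the function names `accept_route` and `advertise_to_ebgp` are hypothetical, and the AS numbers are example values.

```python
def accept_route(local_asn: int, as_path: list[int]) -> bool:
    """Return False when our own ASN already appears in the received AS_PATH."""
    return local_asn not in as_path


def advertise_to_ebgp(local_asn: int, as_path: list[int]) -> list[int]:
    """Prepend the local ASN, as happens when a route leaves the AS over eBGP."""
    return [local_asn, *as_path]


# AS 64500 re-advertises a route it learned from AS 64510.
path = advertise_to_ebgp(64500, [64510])
print(path)
print(accept_route(65001, path))  # AS 65001 is not in the path, so it accepts
print(accept_route(64510, path))  # AS 64510 sees itself and rejects the route
```

Because every AS adds itself before passing a route on, the receiving router only needs a membership test to detect a loop.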
BGP operates over TCP port 179, inheriting TCP's reliability, sequencing, and flow control without needing to implement them independently. All BGP relationships are explicitly configured — BGP never auto-discovers neighbors the way OSPF uses multicast hellos on a segment.
eBGP vs. iBGP
BGP operates in two distinct modes depending on whether the peering session crosses an AS boundary:
- External BGP (eBGP) — sessions between routers in different autonomous systems. eBGP peers are typically directly connected. Routes learned via eBGP are advertised to both eBGP and iBGP peers by default, and the next-hop attribute is set to the advertising router's own IP address.
- Internal BGP (iBGP) — sessions between routers within the same autonomous system. iBGP peers do not need to be directly connected; they commonly peer using loopback addresses for resilience. A fundamental iBGP rule is the split-horizon rule: a route learned from one iBGP peer is never re-advertised to another iBGP peer. This prevents loops but requires either a full mesh of iBGP sessions or the use of Route Reflectors or BGP Confederations to propagate routes throughout the AS. Additionally, iBGP does not modify the next-hop attribute by default, so internal routers need either IGP reachability to the original eBGP next-hop or the use of next-hop-self.
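The default advertisement rules for the two session types can be summarized as a small decision function. This is a simplified sketch that assumes no Route Reflectors or Confederations are in use; the function name `may_advertise` and the string labels are hypothetical.

```python
def may_advertise(learned_from: str, send_to: str) -> bool:
    """Default BGP advertisement rule without route reflection.

    learned_from: 'ebgp', 'ibgp', or 'local' (locally originated)
    send_to:      'ebgp' or 'ibgp'
    """
    if learned_from not in {"ebgp", "ibgp", "local"}:
        raise ValueError(f"unknown source type: {learned_from}")
    if send_to not in {"ebgp", "ibgp"}:
        raise ValueError(f"unknown peer type: {send_to}")
    # iBGP split horizon: never pass an iBGP-learned route to another iBGP peer.
    return not (learned_from == "ibgp" and send_to == "ibgp")


for source in ("ebgp", "ibgp", "local"):
    for target in ("ebgp", "ibgp"):
        print(f"{source:>5} -> {target}: {may_advertise(source, target)}")
```

The single `False` case (iBGP to iBGP) is the reason a full mesh or a reflector is needed inside the AS.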
BGP Message Types
BGP uses four message types to establish neighbor relationships and exchange routing state:
- OPEN — sent immediately after the TCP session is established. Negotiates BGP version, AS number, hold timer, and BGP Identifier (Router ID). Both peers must successfully exchange OPEN messages before the session moves to the Established state.
- UPDATE — the workhorse of BGP. Carries newly advertised prefixes (called NLRI — Network Layer Reachability Information) along with their path attributes, and simultaneously withdraws previously advertised prefixes that are no longer valid.
- KEEPALIVE — sent periodically (default every 60 seconds on Cisco IOS-XE) to maintain the session and confirm peer reachability. A KEEPALIVE is also sent in response to an OPEN message as an acknowledgment.
- NOTIFICATION — sent when an error is detected (malformed message, hold timer expiry, unsupported capability). Sending a NOTIFICATION immediately terminates the BGP session.
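To make the message types concrete, the sketch below parses the fixed 19-byte BGP header defined in RFC 4271: a 16-byte marker of all ones, a 2-byte length, and a 1-byte type code. The function name `parse_header` is hypothetical, and the example only inspects the header, not the message body.

```python
import struct

# Type codes for the four message types defined in RFC 4271.
BGP_MESSAGE_TYPES = {1: "OPEN", 2: "UPDATE", 3: "NOTIFICATION", 4: "KEEPALIVE"}
HEADER_LEN = 19      # 16-byte marker + 2-byte length + 1-byte type
MAX_MESSAGE_LEN = 4096  # upper bound in RFC 4271


def parse_header(data: bytes) -> tuple[int, str]:
    """Return (message length, message type name) from a raw BGP header."""
    if len(data) < HEADER_LEN:
        raise ValueError("a BGP header needs at least 19 bytes")
    marker, length, msg_type = struct.unpack("!16sHB", data[:HEADER_LEN])
    if marker != b"\xff" * 16:
        raise ValueError("marker field must be all ones")
    if not HEADER_LEN <= length <= MAX_MESSAGE_LEN:
        raise ValueError(f"invalid message length: {length}")
    return length, BGP_MESSAGE_TYPES.get(msg_type, f"UNKNOWN({msg_type})")


# A KEEPALIVE is just the header: length 19, type 4.
keepalive = b"\xff" * 16 + struct.pack("!HB", 19, 4)
print(parse_header(keepalive))
```

A KEEPALIVE carries no body at all, which is why it is the cheapest way to prove the peer is still alive.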
BGP Tables
A BGP router maintains three distinct information stores:
- Neighbor Table — tracks all configured BGP peers and their current finite state machine (FSM) state: Idle, Connect, Active, OpenSent, OpenConfirm, or Established.
- BGP RIB (Routing Information Base) — stores every path received from every BGP peer for every prefix, before any selection algorithm is applied. This is sometimes called the Adj-RIB-In.
- IP Routing Table — receives only the single best BGP path for each prefix after the path selection algorithm completes. Only the best path is installed here and used for forwarding.
How BGP Path Selection Works
When multiple paths to the same destination prefix are present in the BGP RIB, the router must select exactly one best path to install in the routing table and advertise to peers. Cisco's BGP implementation evaluates eleven attributes in strict sequential order — the algorithm stops at the first attribute where one path is clearly preferred over the others.
A useful mnemonic for the order is W-L-L-A-O-M-E-I-O-R-N, but most experienced engineers simply memorize the list through repeated exposure to production troubleshooting scenarios.
Step 1 — Weight (Highest Wins)
Weight is a Cisco-proprietary attribute that is entirely local to the router: it is never communicated to any BGP peer. Routes learned from a neighbor have a default weight of 0. Routes originated locally (via the network command or redistribution) have a default weight of 32768. Weight is the first and most powerful tiebreaker precisely because it is local: it lets a single router override global BGP policy without affecting any other device in the network.
! Prefer all routes from 172.16.10.1 over 172.16.20.1
router bgp 65001
neighbor 172.16.10.1 weight 200
neighbor 172.16.20.1 weight 100
Step 2 — Local Preference (Highest Wins)
Local Preference is shared among all iBGP routers within the same AS. Its default value is 100. A higher Local Preference indicates a more preferred exit point from the AS. When your AS has two upstream providers and you want every router in the AS to agree on which one to use for outbound traffic, Local Preference is the correct tool. It is set inbound on the receiving edge router and propagated to all iBGP peers.
route-map ISP-A-IN permit 10
description Mark ISP-A as preferred exit
set local-preference 200
!
route-map ISP-B-IN permit 10
description Mark ISP-B as backup exit
set local-preference 100
!
router bgp 65001
neighbor 172.16.10.1 route-map ISP-A-IN in
neighbor 172.16.20.1 route-map ISP-B-IN in
Step 3 — Locally Originated Routes (Prefer Local)
Routes originated by the local router (injected via network, redistribution, or aggregation) are preferred over routes learned from any BGP peer. This step ensures a router always trusts its own origination over external advertisements for the same prefix.
Step 4 — AS Path Length (Shortest Wins)
BGP prefers the route with the fewest autonomous system hops in its AS_PATH attribute. Each time a route crosses an AS boundary, the traversed AS number is prepended to the path. AS Path prepending is a widely used traffic engineering technique: by artificially prepending your own AS number multiple times when advertising to a specific provider, you make that path appear longer and therefore less attractive to remote networks — steering inbound traffic toward your preferred ingress link.
! Make routes via ISP-B appear 3 hops longer
route-map PREPEND-TO-ISP-B permit 10
set as-path prepend 65001 65001 65001
!
router bgp 65001
address-family ipv4
neighbor 172.16.20.1 route-map PREPEND-TO-ISP-B out
Step 5 — Origin Code (Lowest Wins)
The Origin attribute describes how a prefix entered BGP. In order of preference:
- IGP (i) — injected via the network command. Most preferred.
- EGP (e) — learned from the now-obsolete Exterior Gateway Protocol. Extremely rare in modern networks.
- Incomplete (?) — injected via redistribution from another protocol. Least preferred.
In practice, Origin rarely acts as a tiebreaker in well-designed networks, but it can cause unexpected path selection when routes are redistributed rather than originated cleanly.
Step 6 — MED / Metric (Lowest Wins)
The Multi-Exit Discriminator (MED) is advertised to external peers to suggest the preferred entry point into your AS when multiple physical connections exist between the same two ASes. Unlike Local Preference, which controls outbound traffic, MED influences inbound traffic direction. By default, Cisco IOS-XE only compares MED values between routes received from the same neighboring AS. Enabling bgp always-compare-med forces cross-AS MED comparison, but this is considered unsafe in most environments as it can produce non-deterministic routing.
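The "same neighboring AS" condition is easy to misread, so here is a small sketch of it. The dictionary layout, the helper names `neighbor_as` and `prefer_by_med`, and the `always_compare_med` flag are illustrative assumptions that mirror the behavior described above rather than any vendor implementation.

```python
from typing import Optional


def neighbor_as(as_path: list[int]) -> Optional[int]:
    """The neighboring AS is the first (leftmost) AS in the AS_PATH."""
    return as_path[0] if as_path else None


def prefer_by_med(a: dict, b: dict, always_compare_med: bool = False) -> Optional[dict]:
    """Return the path preferred by MED, or None when MED does not decide."""
    same_neighbor = neighbor_as(a["as_path"]) == neighbor_as(b["as_path"])
    if not (same_neighbor or always_compare_med):
        return None  # MED values from different neighboring ASes are skipped
    if a["med"] == b["med"]:
        return None  # tie, fall through to the next step
    return a if a["med"] < b["med"] else b


link1 = {"name": "link-1", "as_path": [64500, 64999], "med": 50}
link2 = {"name": "link-2", "as_path": [64500, 64999], "med": 200}
other = {"name": "other-as", "as_path": [64501, 64999], "med": 10}

print(prefer_by_med(link1, link2)["name"])  # same neighbor AS, lower MED wins
print(prefer_by_med(link1, other))          # different neighbor AS, MED ignored
```

With the flag set to `True`, the second comparison would pick `other-as`, even though its MED was chosen by a different provider using an unrelated scale.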
! Advertise lower MED on preferred ingress link
route-map SET-MED-LOW permit 10
set metric 50
!
route-map SET-MED-HIGH permit 10
set metric 200
!
router bgp 65001
neighbor 172.16.10.1 route-map SET-MED-LOW out
neighbor 172.16.20.1 route-map SET-MED-HIGH out
Step 7 — eBGP over iBGP
If all previous attributes are equal, BGP prefers a route learned from an external BGP peer over a route learned from an internal BGP peer. This reflects the principle that an external AS's direct advertisement of a prefix is more authoritative than a route that has been reflected through internal BGP infrastructure.
Step 8 — Lowest IGP Metric to Next-Hop
When two eBGP paths (or two iBGP paths) remain tied after all previous steps, BGP selects the path whose BGP next-hop is reachable via the lowest IGP cost. This step ties BGP path selection directly to your underlying IGP — OSPF, EIGRP, or IS-IS metric tuning becomes another lever for BGP traffic engineering in multi-homed data center or campus designs.
Step 9 — Oldest eBGP Route
To maximize routing stability, if two external BGP paths remain equivalent, BGP prefers the path that has been present in the BGP table the longest. This prevents unnecessary traffic shifts every time a new equivalent eBGP path is learned, prioritizing stability over churn.
Step 10 — Lowest BGP Router ID
The BGP Router ID is a 32-bit value displayed in dotted-decimal notation used to uniquely identify a BGP speaker. On Cisco IOS-XE it defaults to the highest loopback IP or the highest physical interface IP if no loopback exists, but should always be set explicitly. When paths are still tied at this step, the route from the peer with the numerically lowest Router ID is selected.
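"Numerically lowest" means the Router ID is compared as a 32-bit integer, not as text. The short sketch below uses Python's standard `ipaddress` module to show the difference; the helper name `router_id_value` and the sample IDs are hypothetical.

```python
import ipaddress


def router_id_value(router_id: str) -> int:
    """Convert a dotted-decimal Router ID to its 32-bit integer value."""
    return int(ipaddress.IPv4Address(router_id))


router_ids = ["10.255.0.1", "9.255.255.254", "10.2.0.1"]

lowest_numeric = min(router_ids, key=router_id_value)
lowest_as_text = min(router_ids)  # plain string comparison, shown for contrast

print(lowest_numeric)
print(lowest_as_text)
```

The two results differ, which is a common source of confusion when reading tie-break outcomes by eye.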
router bgp 65001
bgp router-id 10.255.0.1
Step 11 — Lowest Neighbor IP Address
The final tiebreaker. If two paths still cannot be distinguished — possible in certain confederation or multi-session scenarios where peers share the same Router ID — BGP selects the path from the neighbor with the lowest IP address. Reaching this step is rare and typically indicates a design issue worth investigating.
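The eleven steps can be approximated as a single sort key, where earlier tuple elements dominate later ones. This is a teaching sketch under stated assumptions, not Cisco's implementation: the `Path` class and its field names are hypothetical, MED is compared across all paths (as if always-compare-med were enabled), and the "oldest path" preference is applied to every path rather than only to eBGP paths.

```python
import ipaddress
from dataclasses import dataclass

ORIGIN_RANK = {"igp": 0, "egp": 1, "incomplete": 2}


@dataclass
class Path:
    neighbor_ip: str
    router_id: str
    as_path: tuple
    weight: int = 0
    local_pref: int = 100
    locally_originated: bool = False
    origin: str = "igp"
    med: int = 0
    is_ebgp: bool = True
    igp_metric: int = 0
    age_seconds: int = 0


def selection_key(p: Path) -> tuple:
    """Smaller key means more preferred; 'highest wins' values are negated."""
    return (
        -p.weight,                                   # 1. highest weight
        -p.local_pref,                               # 2. highest local preference
        0 if p.locally_originated else 1,            # 3. locally originated
        len(p.as_path),                              # 4. shortest AS path
        ORIGIN_RANK[p.origin],                       # 5. lowest origin code
        p.med,                                       # 6. lowest MED (simplified)
        0 if p.is_ebgp else 1,                       # 7. eBGP over iBGP
        p.igp_metric,                                # 8. lowest IGP metric to next hop
        -p.age_seconds,                              # 9. oldest path (simplified)
        int(ipaddress.IPv4Address(p.router_id)),     # 10. lowest Router ID
        int(ipaddress.IPv4Address(p.neighbor_ip)),   # 11. lowest neighbor IP
    )


def best_path(paths: list) -> Path:
    return min(paths, key=selection_key)


isp_a = Path("172.16.10.1", "172.16.10.1", (64500,), local_pref=200)
isp_b = Path("172.16.20.1", "172.16.20.1", (64501,), local_pref=100)
print(best_path([isp_a, isp_b]).neighbor_ip)
```

Tuple comparison stops at the first element that differs, which matches the "stop at the first decisive attribute" behavior of the algorithm.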
Why BGP Matters Beyond the Internet
BGP is not exclusively an internet protocol. It has become the routing protocol of choice for large-scale enterprise and data center environments. Cisco's EVPN/VXLAN fabric designs use BGP as the control plane to distribute MAC and IP reachability information across leaf and spine switches. SD-WAN overlays use BGP to exchange reachability between branch sites and regional hubs. Cloud interconnect solutions — AWS Direct Connect, Azure ExpressRoute, Google Cloud Interconnect — all require BGP sessions to exchange routes between enterprise networks and cloud providers.
For organizations with multiple ISP connections, BGP provides the policy controls to:
- Prefer one provider for all outbound traffic while maintaining a second as a hot standby, using Local Preference or Weight
- Influence which provider's network carries inbound traffic for specific prefixes, using MED or AS Path prepending
- Apply granular routing policies with prefix-lists and route-maps on a per-neighbor or per-prefix basis
- Achieve automatic failover when a BGP session drops, without manual intervention
Real-World Dual-Homed BGP Configuration
The following complete configuration demonstrates a dual-homed BGP deployment on sw-infrarunbook-01, a Cisco IOS-XE edge router connected to two upstream providers. ISP-A (AS 64500) is the preferred path. ISP-B (AS 64501) is the warm standby.
! sw-infrarunbook-01 — Dual-homed BGP edge configuration
! Local AS: 65001 | BGP Router-ID / Loopback0: 10.255.0.1/32
! Advertised prefix: 192.168.100.0/24
ip prefix-list DEFAULT-ROUTE-ONLY seq 5 permit 0.0.0.0/0
route-map ISP-A-IN permit 10
description ISP-A is preferred exit — set high local-pref
set local-preference 200
!
route-map ISP-B-IN permit 10
description ISP-B is backup exit — set low local-pref
set local-preference 100
!
route-map PREPEND-TO-ISP-B permit 10
description Make our prefix less attractive via ISP-B
set as-path prepend 65001 65001 65001
!
route-map PREPEND-TO-ISP-B permit 20
description Allow all other prefixes unchanged
!
router bgp 65001
bgp router-id 10.255.0.1
bgp log-neighbor-changes
no bgp default ipv4-unicast
!
neighbor 172.16.10.1 remote-as 64500
neighbor 172.16.10.1 description ISP-A-PEER
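! Note: sourcing an eBGP session from Loopback0 only works if the provider peers with 10.255.0.1
! and multihop (or disable-connected-check) is in place; omit update-source for a directly connected peering.
! The same applies to the ISP-B neighbor below.
neighbor 172.16.10.1 update-source Loopback0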
!
neighbor 172.16.20.1 remote-as 64501
neighbor 172.16.20.1 description ISP-B-PEER
neighbor 172.16.20.1 update-source Loopback0
!
address-family ipv4
network 192.168.100.0 mask 255.255.255.0
!
neighbor 172.16.10.1 activate
neighbor 172.16.10.1 prefix-list DEFAULT-ROUTE-ONLY in
neighbor 172.16.10.1 route-map ISP-A-IN in
neighbor 172.16.10.1 send-community
!
neighbor 172.16.20.1 activate
neighbor 172.16.20.1 prefix-list DEFAULT-ROUTE-ONLY in
neighbor 172.16.20.1 route-map ISP-B-IN in
neighbor 172.16.20.1 route-map PREPEND-TO-ISP-B out
neighbor 172.16.20.1 send-community
exit-address-family
Verifying BGP Session State and Path Selection
sw-infrarunbook-01# show bgp summary
BGP router identifier 10.255.0.1, local AS number 65001
BGP table version is 14, main routing table version 14
2 network entries using 288 bytes of memory
Neighbor V AS MsgRcvd MsgSent TblVer InQ OutQ Up/Down State/PfxRcd
172.16.10.1 4 64500 512 510 14 0 0 05:12:44 1
172.16.20.1 4 64501 509 508 14 0 0 05:11:19 1
sw-infrarunbook-01# show bgp ipv4 unicast 0.0.0.0
BGP routing table entry for 0.0.0.0/0, version 4
Paths: (2 available, best #1, table default)
Advertised to update-groups:
1
64500
172.16.10.1 from 172.16.10.1 (172.16.10.1)
Origin IGP, metric 0, localpref 200, valid, external, best
Last update: 05:12:44 ago
64501
172.16.20.1 from 172.16.20.1 (172.16.20.1)
Origin IGP, metric 0, localpref 100, valid, external
Last update: 05:11:19 ago
The output confirms the default route via ISP-A (172.16.10.1) is marked best due to its Local Preference of 200 versus ISP-B's 100. The ISP-B path remains in the BGP table as a standby and is promoted automatically if the ISP-A session drops: BGP removes the ISP-A path, re-runs best-path selection, and installs the ISP-B route. Re-selection is fast once the failure is detected, but detection itself can take up to the hold time (180 seconds by default) unless the interface goes down or BFD is configured alongside BGP.
Common BGP Misconceptions
Misconception 1: BGP is only relevant for ISPs.
Reality: BGP is the control-plane protocol of choice for modern data center fabrics (EVPN/VXLAN), SD-WAN overlays, cloud interconnects, and multi-homed enterprises. Network engineers encounter BGP far outside ISP environments today.
Misconception 2: A shorter AS Path always means a better physical path.
Reality: AS Path length counts administrative hops, not physical links or latency. A two-hop AS path could traverse slow or congested inter-AS links while a four-hop path uses high-capacity, low-latency infrastructure. BGP has no inherent awareness of link quality below the AS level.
Misconception 3: MED controls your outbound traffic.
Reality: MED is advertised to neighboring ASes to influence their routing decisions about how to send traffic into your AS. It controls inbound traffic. Local Preference is what controls outbound traffic selection within your own AS.
Misconception 4: BGP automatically fails over as quickly as OSPF.
Reality: BGP is deliberately conservative about convergence. Default hold timers are 180 seconds with keepalives at 60 seconds. BGP prioritizes stability over speed. Fast failure detection requires explicit configuration of BFD (Bidirectional Forwarding Detection) alongside BGP sessions.
Misconception 5: iBGP and eBGP handle next-hop the same way.
Reality: eBGP changes the next-hop to the advertising router's own address. iBGP does not modify the next-hop by default; it preserves the original eBGP peer's address. iBGP peers must reach that external next-hop via the IGP, or the edge router must configure next-hop-self to replace it with its own reachable address.
Frequently Asked Questions
Q: What TCP port does BGP use and why does it matter for firewall policy?
A: BGP uses TCP port 179. When configuring firewall rules or ACLs between BGP peers, you must permit TCP port 179 in both directions — one peer initiates the connection (destination port 179) and the other responds (source port 179). Blocking this port is one of the most common reasons BGP sessions fail to establish in new deployments.
Q: What is a BGP Autonomous System Number and how are they obtained?
A: An ASN is a unique identifier assigned to an autonomous system. ASNs were originally 16-bit (1–65535) and were extended to 32 bits by RFC 4893, which has since been replaced by RFC 6793. IANA allocates ASN ranges to Regional Internet Registries (ARIN, RIPE, APNIC, etc.), which distribute them to organizations. Private ASNs (64512 through 65534 for 16-bit, and 4200000000 through 4294967294 for 32-bit, as reserved by RFC 6996) are used for internal BGP deployments and should be stripped before advertising to public peers.
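A quick way to apply the private ranges quoted above is a membership check. The helper name `is_private_asn` is hypothetical; the ranges are the ones stated in the answer.

```python
# Private-use ASN ranges (end values are inclusive in the text, so the
# range() stop is one past the last private ASN).
PRIVATE_ASN_RANGES = (
    range(64512, 65535),            # 16-bit private range: 64512-65534
    range(4200000000, 4294967295),  # 32-bit private range: 4200000000-4294967294
)


def is_private_asn(asn: int) -> bool:
    """Return True when the ASN falls inside a private-use range."""
    return any(asn in r for r in PRIVATE_ASN_RANGES)


for asn in (64500, 65001, 65535, 4200000000):
    print(asn, is_private_asn(asn))
```

A check like this is useful in pre-deployment validation scripts that confirm no private ASN leaks into an outbound AS_PATH.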
Q: What is the difference between Local Preference and MED, and which one should I use for traffic engineering?
A: Use Local Preference to control which link your AS uses for outbound traffic — it is shared among all iBGP peers within your AS and a higher value wins. Use MED to influence how neighboring ASes send traffic inbound into your AS — it is advertised externally and a lower value wins. For dual-homed enterprise scenarios, Local Preference handles outbound path selection and AS Path prepending or MED handles inbound traffic steering.
Q: Why does iBGP require a full mesh or route reflectors, and what breaks without it?
A: The iBGP split-horizon rule prevents a router from re-advertising a route it learned from one iBGP peer to another iBGP peer. Without a full mesh, some routers in the AS never receive certain BGP prefixes, which leads to missing routes or black-holed traffic. Route Reflectors solve this at scale by designating one or more routers to reflect iBGP routes to their clients, relaxing the split-horizon rule only for the reflector.
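The scaling problem is easiest to see by counting sessions. The sketch below compares a full mesh with a route-reflector design; the function names are hypothetical, and the reflector model assumes every client peers with every reflector and the reflectors are fully meshed with each other.

```python
def full_mesh_sessions(routers: int) -> int:
    """Number of iBGP sessions when every router peers with every other router."""
    return routers * (routers - 1) // 2


def route_reflector_sessions(clients: int, reflectors: int = 1) -> int:
    """Sessions when each client peers only with the reflectors (assumed design)."""
    return clients * reflectors + full_mesh_sessions(reflectors)


for n in (5, 10, 50):
    print(n, "routers, full mesh:", full_mesh_sessions(n))

# 50 routers split into 48 clients and 2 redundant reflectors
print("with 2 reflectors:", route_reflector_sessions(48, reflectors=2))
```

Full-mesh growth is quadratic, while the reflector design grows linearly with the number of clients.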
Q: What is next-hop-self and when is it required?
A: When an iBGP router advertises a route learned from an eBGP peer, it preserves the original external next-hop IP. If internal routers cannot reach that external IP via the IGP, they cannot forward traffic for those prefixes. Configuring neighbor X.X.X.X next-hop-self on the edge router causes it to replace the external next-hop with its own loopback or interface IP, which internal routers can reach. This is almost always required unless you redistribute external next-hops into your IGP.
Q: What are BGP communities and how do enterprises use them?
A: BGP communities are 32-bit tags attached to route advertisements. They allow networks to signal routing policy without complex per-prefix configurations. Well-known communities include NO_EXPORT (do not advertise this route outside the AS) and NO_ADVERTISE (do not advertise to any peer). ISPs offer customer-defined communities that trigger actions like adjusting Local Preference on the ISP's routers, enabling enterprises to influence inbound traffic engineering without managing the ISP's infrastructure directly.
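A standard community is conventionally written as ASN:value, with each half occupying 16 bits of the 32-bit tag. The sketch below shows that encoding along with the two well-known values mentioned above; the helper names `encode_community` and `decode_community` are hypothetical.

```python
# Well-known community values from RFC 1997.
NO_EXPORT = 0xFFFFFF01
NO_ADVERTISE = 0xFFFFFF02


def encode_community(asn: int, value: int) -> int:
    """Pack ASN:value into a single 32-bit standard community."""
    if not (0 <= asn <= 0xFFFF and 0 <= value <= 0xFFFF):
        raise ValueError("both halves of a standard community must fit in 16 bits")
    return (asn << 16) | value


def decode_community(community: int) -> tuple[int, int]:
    """Split a 32-bit community back into its (ASN, value) halves."""
    return community >> 16, community & 0xFFFF


tag = encode_community(64500, 200)
print(hex(tag), decode_community(tag))
print(decode_community(NO_EXPORT))
```

Because only 16 bits are available for the ASN half, networks with 32-bit ASNs typically rely on large or extended communities instead.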
Q: How do I enable BGP ECMP for load balancing across multiple paths?
A: By default, BGP installs only one best path in the routing table. The maximum-paths command enables equal-cost multipath load balancing. For iBGP paths, use maximum-paths ibgp. Paths must be equal across all BGP attributes evaluated before the ECMP decision point to qualify. In data center spine-leaf designs, the value is usually raised to cover every available uplink; the supported maximum depends on the platform and software release.
router bgp 65001
address-family ipv4
maximum-paths 8
maximum-paths ibgp 8
Q: What is BGP route dampening and should I use it?
A: Route dampening (RFC 2439) penalizes unstable BGP routes by suppressing them after repeated flaps. Each withdrawal increments a penalty; when the penalty exceeds the suppress-limit, the route is hidden. The penalty decays exponentially and the route returns when it falls below the reuse-limit. Dampening with the original default parameters is widely discouraged on internet-facing BGP sessions because it can delay legitimate route recovery significantly; RFC 7196 recommends a much higher suppress threshold so that only persistently flapping prefixes are suppressed. Dampening remains useful on customer-facing edges where flapping customer routes are isolated.
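The penalty arithmetic can be worked through numerically. The sketch below models exponential decay with a half-life; the constants (1000 penalty per flap, suppress at 2000, reuse at 750, 15-minute half-life) are assumptions based on commonly documented Cisco defaults and should be checked against your platform. The function names are hypothetical.

```python
import math

# Assumed values, modeled on commonly documented Cisco defaults.
PENALTY_PER_FLAP = 1000
SUPPRESS_LIMIT = 2000
REUSE_LIMIT = 750
HALF_LIFE_MIN = 15.0


def decayed_penalty(penalty: float, minutes: float, half_life: float = HALF_LIFE_MIN) -> float:
    """Penalty remaining after 'minutes' of exponential decay."""
    return penalty * 0.5 ** (minutes / half_life)


def minutes_until_reuse(penalty: float, half_life: float = HALF_LIFE_MIN) -> float:
    """Time until the penalty decays to the reuse limit (0 if already below)."""
    if penalty <= REUSE_LIMIT:
        return 0.0
    return half_life * math.log2(penalty / REUSE_LIMIT)


penalty = 3 * PENALTY_PER_FLAP  # three flaps in quick succession
print("suppressed:", penalty > SUPPRESS_LIMIT)
print("minutes until reuse:", minutes_until_reuse(penalty))
print("penalty after 30 minutes:", decayed_penalty(penalty, 30))
```

Under these assumed values, three quick flaps keep a prefix suppressed for two half-lives, which illustrates why short bursts of instability can cause long outages.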
Q: What happens when the BGP hold timer expires?
A: If a BGP router does not receive a KEEPALIVE or UPDATE from its peer within the hold timer interval (default 180 seconds on Cisco IOS-XE), it sends a NOTIFICATION message with error code Hold Timer Expired, tears down the TCP session, and withdraws all routes learned from that peer. Until the timer expires, traffic that still follows those routes toward an unreachable peer may be black-holed; after the withdrawal, BGP reconverges onto any remaining paths. To shorten this window, deploy BFD alongside BGP for sub-second link failure detection, allowing BGP to tear down the session immediately rather than waiting for the hold timer.
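The hold time actually used on a session is negotiated: per RFC 4271, each side proposes a value in its OPEN message and the lower one wins, and a value of zero disables keepalives. The sketch below models that rule; the function name is hypothetical, and deriving the keepalive interval as one third of the hold time follows the RFC's suggestion rather than any specific vendor's behavior.

```python
def negotiated_timers(local_hold: int, peer_hold: int) -> tuple[int, int]:
    """Return (hold_time, keepalive_interval) in seconds for a session."""
    for proposed in (local_hold, peer_hold):
        # RFC 4271: a proposed hold time must be zero or at least three seconds.
        if proposed != 0 and proposed < 3:
            raise ValueError(f"invalid hold time: {proposed}")
    hold = min(local_hold, peer_hold)
    if hold == 0:
        return 0, 0  # timers disabled, no keepalives are sent
    return hold, hold // 3  # keepalive suggested as one third of the hold time


print(negotiated_timers(180, 180))  # both sides use the default
print(negotiated_timers(180, 90))   # the peer proposes a shorter hold time
print(negotiated_timers(30, 180))   # aggressive local timers win
```

Lowering timers on one side is therefore enough to speed up detection, although BFD remains the better tool for sub-second failover.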
Q: How does BGP prevent routing loops?
A: BGP's primary loop prevention mechanism is the AS_PATH attribute. When a router receives a BGP UPDATE, it inspects the AS_PATH list. If its own AS number appears anywhere in the path, the route is silently discarded — this indicates the route has already traversed this AS and accepting it would create a loop. For iBGP, the split-horizon rule prevents routes from being re-advertised between iBGP peers. These two mechanisms together ensure loop-free routing both within an AS and across AS boundaries.
Q: What is the difference between a BGP hard reset and a soft reset, and which should I use in production?
A: A hard reset (clear ip bgp 172.16.10.1) tears down the TCP session, drops all routes, and forces a full re-establishment, causing a brief traffic outage for paths using those routes. A soft reset (clear ip bgp 172.16.10.1 soft) re-applies route maps and prefix lists without dropping the session or causing traffic loss. Inbound soft resets require either the BGP Route Refresh capability (RFC 2918, supported by all modern Cisco IOS-XE versions) or neighbor soft-reconfiguration inbound configured beforehand. Always use soft resets in production. Reserve hard resets for situations where the session itself is in a broken state and will not recover on its own.
Q: How do I verify which BGP path is selected and why a specific path won?
A: Use show bgp ipv4 unicast [prefix] to see all paths for a prefix and which one is marked best. The output includes all relevant attributes (localpref, weight, AS path, origin, MED), allowing you to trace exactly which step in the selection algorithm determined the winner. For deeper troubleshooting, show bgp ipv4 unicast [prefix] bestpath-reason on newer IOS-XE releases explicitly states the reason the best path was selected. Pair this with debug ip bgp [neighbor] updates, used cautiously in production, to trace incoming and outgoing UPDATE messages.
