Wednesday, January 25, 2012

BGP Split-Horizon and Full-Mesh IBGP Neighbors

BGP was originally intended to run along the borders of an AS, with the routers in the middle of the AS ignorant of the details of BGP, hence the name “Border Gateway Protocol”. A transit AS is an AS that routes traffic from one external AS to another external AS. Typically, transit ASes are ISPs. All routers in a transit AS must have complete knowledge of external routes. One way to achieve this goal is to redistribute BGP routes into an IGP at the edge routers; however, this approach introduces many problems.

In 1994 the size of the Internet routing table was only about 4 to 8 MB, so BGP could be redistributed into the local IGP, such as EIGRP or OSPF. The edge routers running BGP would hold the full Internet routing table, and the routers in the middle, running only the IGP, would not incur the overhead of running BGP but would still know about all the external routes.

As the current Internet routing table is very large, redistributing all the BGP routes into an IGP is not a scalable way for the interior routers within an AS to learn about the external networks. Running full-mesh IBGP within the AS is a viable alternative.

The BGP split-horizon rule governs route advertisements between IBGP peers: routes learned via IBGP are never propagated to other IBGP peers.
Note: The BGP split-horizon rule is slightly different from the split-horizon rule in the distance-vector routing protocols.
Note: The regular split-horizon rule still governs route advertisements between EBGP peers: a route is not advertised back to the EBGP peer from which it was received.

BGP Split-Horizon Rule

The BGP split-horizon rule prevents RT2 from propagating to its IBGP peer RT3 the routes that it learned from its IBGP peer RT1. Similar to the split-horizon rule in the distance-vector routing protocols, BGP split-horizon is necessary to ensure that routing loops do not form within an AS. As a result, full-mesh IBGP peering is required within an AS for all the routers within the AS to learn about the BGP routes.
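The two advertisement rules described above can be written as a small decision function. This is only a minimal sketch in Python, not how any router implements it: the `Route` class, the `may_advertise` function, the prefix 192.0.2.0/24, and the EBGP peer name RT9 are all made up for illustration, and the sketch ignores everything else a real BGP speaker checks before sending an update.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Route:
    prefix: str
    received_from: str  # name of the peer that sent us this route
    learned_via: str    # "ibgp" or "ebgp"


def may_advertise(route: Route, peer_name: str, peer_type: str) -> bool:
    """Return True if `route` may be advertised to the given peer."""
    # BGP split-horizon: a route learned via IBGP is never
    # propagated to another IBGP peer.
    if route.learned_via == "ibgp" and peer_type == "ibgp":
        return False
    # Regular split-horizon: never send a route back to the peer
    # it was received from.
    if peer_name == route.received_from:
        return False
    return True


# RT2 learned this (hypothetical) prefix from RT1 over IBGP.
route = Route("192.0.2.0/24", received_from="RT1", learned_via="ibgp")

print(may_advertise(route, "RT3", "ibgp"))  # False: IBGP-learned, IBGP peer
print(may_advertise(route, "RT9", "ebgp"))  # True: RT9 is a hypothetical EBGP peer
```

Because RT2 answers "no" for every IBGP peer, RT3 can only learn the route if it peers with RT1 directly, which is why the full mesh is needed.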

IGPs form adjacencies and exchange routing information with directly connected neighbors. IGPs use broadcasts or multicasts to propagate topology changes across an AS. All IGP routers within an AS must be running the same routing protocol to handle the routing updates and maintain the same information for consistent routing operation.

BGP does not work in the same manner as IGPs. As the designers of BGP could not guarantee that an AS would run BGP on all its routers, a method had to be developed to ensure that IBGP routers could pass routing updates between them. When all IBGP neighbors are fully meshed and an update is received from an external AS, the EBGP router that interfaces with the external AS is responsible for directly informing all of its IBGP neighbors of the change. IBGP neighbors that receive this update do not propagate it to any other IBGP neighbor, as they assume that the IBGP neighbor that originated the update is fully-meshed with all other IBGP neighbors and has sent the update directly to every IBGP neighbor.

The main reason that an AS needs to fully mesh its IBGP neighbors is the BGP split-horizon rule: because IBGP neighbors never relay updates to each other, only a full mesh guarantees that every neighbor receives every update. If the IBGP router that received the original BGP update from the external AS is not fully-meshed with every IBGP neighbor, the neighbors that do not peer with it will have different IP routing tables than the neighbors that do. The inconsistent routing tables can cause routing loops or routing black holes.

BGP runs over TCP. TCP sessions cannot be multicast or broadcast because TCP has to ensure reliable delivery of packets to each recipient. Since TCP cannot use broadcasting, BGP cannot use it either; therefore BGP has to set up fully-meshed TCP sessions among the IBGP neighbors.
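To get a feel for what a full mesh costs, note that one TCP session per pair of routers means n(n-1)/2 sessions for n IBGP routers. The short Python sketch below simply does that arithmetic; the function name `full_mesh_sessions` and the router names R1 to R4 are made up for illustration.

```python
from itertools import combinations


def full_mesh_sessions(n: int) -> int:
    """Number of IBGP sessions needed to fully mesh n routers."""
    if n < 0:
        raise ValueError("number of routers cannot be negative")
    # One session per unordered pair of routers.
    return n * (n - 1) // 2


# Enumerate the sessions for four hypothetical routers.
routers = ["R1", "R2", "R3", "R4"]
sessions = list(combinations(routers, 2))
print(len(sessions))  # 6

for n in (4, 10, 100):
    print(n, full_mesh_sessions(n))
# 4 6
# 10 45
# 100 4950
```

The session count grows with the square of the number of routers, so every router added to the AS must be configured as a neighbor on every existing IBGP router.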

Partial-Mesh IBGP and Full-Mesh IBGP

Figure 13-6A shows IBGP update behavior in a partially-meshed neighbor environment. RT2 receives a BGP update from RT1. RT2 has two IBGP neighbors, RT3 and RT4, but does not have an IBGP neighbor session with RT5. RT3 and RT4 are able to learn about the networks that were added and withdrawn on RT1. Even if RT3 and RT4 have IBGP neighbor sessions with RT5, they assume that the AS is fully-meshed for IBGP and therefore do not propagate the update to RT5 due to the BGP split-horizon rule. Sending IBGP updates to RT5 is the responsibility of RT2, as it is the router that obtains firsthand knowledge about the networks in and beyond AS 65001. RT5 does not learn of any networks through RT2 and therefore does not use RT2 to reach any networks in AS 65001 and other ASes behind AS 65001.

In Figure 13-6B, IBGP is fully-meshed between BGP routers in AS 65002. When a router receives an update from an EBGP neighbor, it sends the update to every IBGP neighbor in the AS. The update is sent only once to each IBGP neighbor and is not replicated by any other IBGP neighbor.
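The difference between the two figures can be reproduced with a few lines of Python. This is a rough sketch under stated assumptions: the function `ibgp_learners` is made up, and the RT3-RT5, RT4-RT5, and RT3-RT4 sessions are assumed for illustration, since the text only says what happens "even if" such sessions exist. Because of the BGP split-horizon rule, only the routers that peer directly with RT2 ever hear the update.

```python
def ibgp_learners(sessions, origin):
    """Routers that learn a route which `origin` received via EBGP.

    `sessions` is a set of (router_a, router_b) IBGP sessions. Receivers
    never re-advertise to other IBGP peers (BGP split-horizon), so only
    the direct IBGP neighbors of `origin` learn the route.
    """
    learners = set()
    for a, b in sessions:
        if a == origin:
            learners.add(b)
        elif b == origin:
            learners.add(a)
    return learners


# Partial mesh: RT2 has no session with RT5 (the other sessions are assumed).
partial_mesh = {("RT2", "RT3"), ("RT2", "RT4"), ("RT3", "RT5"), ("RT4", "RT5")}

# Full mesh: every pair of the four routers has a session.
full_mesh = partial_mesh | {("RT2", "RT5"), ("RT3", "RT4")}

print(sorted(ibgp_learners(partial_mesh, "RT2")))  # ['RT3', 'RT4']
print(sorted(ibgp_learners(full_mesh, "RT2")))     # ['RT3', 'RT4', 'RT5']
```

In the partial mesh RT5 is missing from the result even though it has sessions with RT3 and RT4, because neither of them passes the update on.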

Each IBGP neighbor needs to know all of the other IBGP neighbors in the same AS so that it can have complete knowledge of how to exit the AS. When all BGP routers in an AS are fully-meshed and have the same database with a consistent routing policy, they apply the same path-selection process and reach uniform results across the AS, which means that there are no routing loops and that there is a consistent policy for exiting and entering the AS.

Routing Loop without Full-Mesh IBGP

RT1, RT2, RT5, and RT6 are the only routers running BGP. An IBGP session has been established between RT2 and RT5, and EBGP sessions have been established between RT1 and RT2 and between RT5 and RT6. RT3 and RT4 are not running BGP. RT2, RT3, RT4, and RT5 are also running OSPF as their IGP.

Network 172.16.0.0/16 is owned by AS 65001 and is advertised by RT1 to RT2 via EBGP. RT2 advertises it to RT5 via IBGP. RT3 and RT4 never learn about this network, as it is not being redistributed into the local routing protocol, OSPF, and RT3 and RT4 are not running BGP. If RT5 advertises this network to RT6 in AS 65003, and RT6 starts forwarding packets to 172.16.0.0/16 through AS 65002, where will RT5 forward the packets to reach RT2?
If RT5 forwards packets with the destination address of 172.16.1.1 to either RT3 or RT4, they do not have an entry for 172.16.0.0/16 in their routing tables and therefore discard the packets.
If RT3 and RT4 have a default route towards the exit points of the AS, RT2 and RT5, there is a high possibility that when RT5 sends a packet destined to 172.16.0.0/16 to RT3 or RT4, they might send it back to RT5, which forwards it again to RT3 or RT4, causing a routing loop.
If RT3 and RT4 also run BGP, IBGP is fully-meshed, and RT3 and RT4 learn network 172.16.0.0/16 from RT2, this problem does not occur.
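The three cases above can be traced hop by hop with a toy forwarding model in Python. This is only a sketch under simplifying assumptions: next hops are router names rather than IP addresses, RT5 is assumed to reach RT2 through RT3, and the helper functions `lookup` and `trace` are made up for illustration.

```python
import ipaddress


def lookup(table, dst):
    """Longest-prefix match; returns the next-hop router name or None."""
    addr = ipaddress.ip_address(dst)
    matches = []
    for prefix, next_hop in table.items():
        network = ipaddress.ip_network(prefix)
        if addr in network:
            matches.append((network.prefixlen, next_hop))
    if not matches:
        return None
    return max(matches)[1]


def trace(tables, start, dst, exit_router):
    """Follow a packet from `start` until it exits, is dropped, or loops."""
    path = [start]
    current = start
    while current != exit_router:
        next_hop = lookup(tables.get(current, {}), dst)
        if next_hop is None:
            return path, "dropped"
        if next_hop in path:
            return path + [next_hop], "loop"
        path.append(next_hop)
        current = next_hop
    return path, "delivered"


# RT2 and RT5 know the BGP route; RT5 forwards via RT3 (assumed topology).
base = {
    "RT2": {"172.16.0.0/16": "RT1"},
    "RT5": {"172.16.0.0/16": "RT3"},
}

no_route = {**base, "RT3": {}}                           # RT3 has no entry
default_route = {**base, "RT3": {"0.0.0.0/0": "RT5"}}    # default points at RT5
full_mesh = {**base, "RT3": {"172.16.0.0/16": "RT2"}}    # RT3 learned it from RT2

print(trace(no_route, "RT5", "172.16.1.1", "RT1"))
# (['RT5', 'RT3'], 'dropped')
print(trace(default_route, "RT5", "172.16.1.1", "RT1"))
# (['RT5', 'RT3', 'RT5'], 'loop')
print(trace(full_mesh, "RT5", "172.16.1.1", "RT1"))
# (['RT5', 'RT3', 'RT2', 'RT1'], 'delivered')
```

Only in the last case, where RT3 holds the BGP route itself, does the packet make it across AS 65002 to RT1.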

AS 65002 in Figure 13-7 is responsible for moving packets between AS 65001 and AS 65003, much as an ISP would. AS 65002 (and any ISP network) is a transit AS, responsible for passing packets from one AS to another. Many ASes have multiple connections to the Internet but do not use their bandwidth to transport packets of other ASes; these ASes are called stub ASes. Most enterprise ASes connected to the Internet are stub ASes. An ISP must be configured as a transit AS by running BGP on all of its routers and fully meshing the IBGP sessions so that packets transiting the AS can reach networks and other ASes on the other side of the transit AS.
