Re: BGP Path Selection weirdness regarding next hops

From: Marko Milivojevic <markom_at_ipexpert.com>
Date: Fri, 30 Nov 2012 19:02:16 -0800

I knew it was a good guess. That's one of my favorites with BGP. It
catches people unawares all the time :-).

Now, I think Cisco is well within their rights not to touch that part
of the documentation. The next-hop is *usually* reachable via IGP.
There are very rare circumstances when the next-hop is reachable via
BGP *and* is valid for more than hold-down. It seems like you hit one
of those :-)
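
For anyone following along, here is a minimal Python sketch of the
behavior described in this thread. It is a toy model, not IOS code. It
assumes the "metric to next hop" step simply uses the metric of whatever
route resolves the next hop (OSPF cost, or MED for a BGP-learned route),
and it ignores every earlier best-path step. The names Route, BgpPath,
resolve and best_by_next_hop_metric, and all metric values, are made up
for illustration.

```python
import ipaddress
from dataclasses import dataclass


@dataclass(frozen=True)
class Route:
    prefix: str    # e.g. "4.4.4.4/32"
    protocol: str  # "ospf" or "bgp"; informational only in this model
    metric: int    # OSPF cost, or MED for a BGP-learned route (assumption)


@dataclass(frozen=True)
class BgpPath:
    name: str
    next_hop: str


def resolve(rib, next_hop):
    """Longest-match lookup of next_hop in rib. Returns a Route or None."""
    addr = ipaddress.ip_address(next_hop)
    matches = [r for r in rib if addr in ipaddress.ip_network(r.prefix)]
    if not matches:
        return None
    return max(matches, key=lambda r: ipaddress.ip_network(r.prefix).prefixlen)


def best_by_next_hop_metric(rib, paths):
    """Pick the path whose next hop resolves with the lowest metric.

    The protocol of the resolving route is deliberately not consulted:
    only the raw metric numbers are compared.
    """
    usable = []
    for path in paths:
        route = resolve(rib, path.next_hop)
        if route is not None:  # unresolvable next hop: path is not usable
            usable.append((route.metric, path))
    if not usable:
        return None
    return min(usable, key=lambda item: item[0])[1]


paths = [BgpPath("via B", "4.4.4.4"), BgpPath("via C", "5.5.5.5")]

# Both loopbacks known via OSPF (hypothetical costs 20 and 30).
rib = [Route("4.4.4.4/32", "ospf", 20),
       Route("5.5.5.5/32", "ospf", 30),
       Route("0.0.0.0/0", "bgp", 0)]
print(best_by_next_hop_metric(rib, paths).name)  # via B

# Router C's loopback disappears; 5.5.5.5 now resolves via the BGP default.
rib = [Route("4.4.4.4/32", "ospf", 20),
       Route("0.0.0.0/0", "bgp", 0)]
print(best_by_next_hop_metric(rib, paths).name)  # via C

# Raise the default route's MED above the OSPF cost to 4.4.4.4.
rib = [Route("4.4.4.4/32", "ospf", 20),
       Route("0.0.0.0/0", "bgp", 100)]
print(best_by_next_hop_metric(rib, paths).name)  # via B
```

The last case mirrors John's route-map test: once the metric of the
default route is numerically higher than the OSPF cost to 4.4.4.4, the
path via Router B wins again.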

Fun.

--
Marko Milivojevic - CCIE #18427 (SP R&S)
Senior CCIE Instructor - IPexpert

On Fri, Nov 30, 2012 at 6:55 PM, John Neiberger <jneiberger_at_gmail.com> wrote:
> You are correct! I just did a test by creating a route map to bump up the
> MED of the prefix in question and it changed the behavior. That proved that
> even though one path now doesn't have an IGP metric to compare, it's still
> being compared. Maybe Cisco needs to change their documentation to say that
> one of the steps is to compare the metrics, not just "IGP metrics".  :-)
>
> Thanks!
> John
>
>
> On Fri, Nov 30, 2012 at 7:37 PM, Marko Milivojevic <markom_at_ipexpert.com>
> wrote:
>>
>> Without going any deeper (some topology information is missing and my
>> pod is otherwise too busy to try this, no matter how FUN it sounds),
>> I'd venture a guess that yes, "igp" metric is compared.
>>
>> The "igp metric" in this sense is really "the metric of the route used
>> to reach the next-hop, no matter what protocol that route came from".
>> In your case, one of these protocols happens to be BGP. You may want to
>> test this hypothesis by tweaking the BGP MED value for the default
>> route to make it numerically higher than the OSPF cost to reach the
>> next-hop of the other route.
>>
>> Funnily enough, this is one of the few places where numerical metric
>> values of different protocols are directly compared, regardless of the
>> AD and/or longest-match.
>>
>> --
>> Marko Milivojevic - CCIE #18427 (SP R&S)
>> Senior CCIE Instructor - IPexpert
>>
>> On Fri, Nov 30, 2012 at 6:21 PM, John Neiberger <jneiberger_at_gmail.com>
>> wrote:
>> > I posted this question to the Cisco NSP list and I've also talked to a
>> > couple of guys from Cisco Advanced Services and I'm still stumped about
>> > something. I'll try my best to phrase it in a way that makes sense.
>> >
>> > Router A is learning about a prefix from two route reflector clients. In
>> > both cases, the next hop for the prefix is the loopback address of the
>> > advertising routers. Their loopback addresses are being advertised into
>> > OSPF.
>> >
>> > So, from the perspective of Router A, its BGP table for this prefix has
>> > two paths:
>> >
>> > 1. 4.4.4.4 (loopback address of Router B, learned via OSPF) * winner
>> >    due to lower IGP metric
>> > 2. 5.5.5.5 (loopback address of Router C, learned via OSPF)
>> >
>> > Now the weirdness begins. A network event occurs that causes the
>> > loopback address of Router C to go away. This shouldn't affect Router A
>> > because it is already selecting the shortest path to the network via
>> > Router B (4.4.4.4).
>> >
>> > However, Router A is also learning a default route via BGP. That means
>> > that even though 5.5.5.5 (loopback of Router C) disappeared and is
>> > unreachable, the router does a recursive lookup and keeps the path in
>> > the BGP table; it thinks 5.5.5.5 is still reachable via the default
>> > route.
>> >
>> > The weird thing is that this causes Router A to start using the wrong
>> > path! It seems to be preferring a path with a next hop learned via BGP
>> > to a path with a next hop learned via OSPF. Why would it do this? I see
>> > no documentation that would explain why a BGP-learned next hop is
>> > preferred over an IGP-learned next hop.
>> >
>> > Is the router still comparing IGP metrics even though the "wrong" path
>> > now has no IGP metric?
>> >
>> > It's not changing due to router ID, cluster list length, or neighbor IP
>> > address. I checked. So, why is it switching?
>> >
>> > As soon as the BGP session from Router A to Router C times out, the
>> > extraneous path gets removed from the BGP table and the router goes back
>> > to using the correct path it should have been using all along.
>> >
>> > So, is a BGP-learned next hop preferred over an IGP-learned next hop? If
>> > so, why? If not, any idea why my router switches paths? I've turned on
>> > BGP debugging and IP routing debugging and haven't found a suitable
>> > explanation for the switch.
>> >
>> > John
>> >
>> >
>> > Blogs and organic groups at http://www.ccie.net
>> >
>> > _______________________________________________________________________
>> > Subscription information may be found at:
>> > http://www.groupstudy.com/list/CCIELab.html
Received on Fri Nov 30 2012 - 19:02:16 ART

This archive was generated by hypermail 2.2.0 : Tue Jan 01 2013 - 09:36:52 ART