Re: BGP Path Selection weirdness regarding next hops

From: Marko Milivojevic <markom_at_ipexpert.com>
Date: Fri, 30 Nov 2012 19:11:48 -0800

It's fun, isn't it? :-)

--
Marko Milivojevic - CCIE #18427 (SP R&S)
Senior CCIE Instructor - IPexpert
On Fri, Nov 30, 2012 at 7:07 PM, Yuri Bank <yuribank_at_gmail.com> wrote:
> So you increased the MED of the default route you're receiving? I find it
> interesting that it's the actual metric of each protocol being compared,
> regardless of the prefix-length or AD.
>
> -Yuri
>
>
> On Fri, Nov 30, 2012 at 7:02 PM, Marko Milivojevic <markom_at_ipexpert.com>
> wrote:
>>
>> I knew it was a good guess. That's one of my favorites with BGP. It
>> gets people unawares all the time :-).
>>
>> Now, I think Cisco is well within their rights not to touch that part
>> of the documentation. The next-hop is *usually* reachable via IGP.
>> There are very rare circumstances when the next-hop is reachable via
>> BGP *and* is valid for more than hold-down. It seems like you hit one
>> of those :-)
>>
>> Fun.
>>
>> --
>> Marko Milivojevic - CCIE #18427 (SP R&S)
>> Senior CCIE Instructor - IPexpert
>>
>> On Fri, Nov 30, 2012 at 6:55 PM, John Neiberger <jneiberger_at_gmail.com>
>> wrote:
>> > You are correct! I just did a test by creating a route map to bump up the
>> > MED of the prefix in question and it changed the behavior. That proved
>> > that even though one path now doesn't have an IGP metric to compare, it's
>> > still being compared. Maybe Cisco needs to change their documentation to
>> > say that one of the steps is to compare the metrics, not just "IGP
>> > metrics".  :-)
>> >
>> > Thanks!
>> > John
>> >
>> >
>> > On Fri, Nov 30, 2012 at 7:37 PM, Marko Milivojevic <markom_at_ipexpert.com>
>> > wrote:
>> >>
>> >> Without going any deeper (some topology information is missing and my
>> >> pod is otherwise too busy to try this, no matter how FUN it sounds), I'd
>> >> venture a guess that yes, the "igp" metric is compared.
>> >>
>> >> The "igp metric" in this sense is really "the metric of the route used
>> >> to reach the next hop, no matter what protocol that route came from". In
>> >> your case, one of these protocols happens to be BGP. You may want to test
>> >> this hypothesis by tweaking the BGP MED value for the default route to
>> >> make it numerically higher than the OSPF cost to reach the next hop of
>> >> the other route.
>> >>
>> >> Funnily enough, this is one of the few places where numerical metric
>> >> values of different protocols are directly compared, regardless of the
>> >> AD and/or longest-match.
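
A minimal sketch of the comparison described above, assuming the tie-break
simply picks the path whose next hop resolves through the route with the
lowest numeric metric, whatever protocol installed that route. The class,
function names, and the OSPF cost of 20 are all hypothetical; this models the
idea in the thread, not the actual IOS implementation.

```python
from dataclasses import dataclass


@dataclass
class Path:
    """One BGP path as seen at the metric-to-next-hop tie-break step."""
    name: str
    next_hop: str
    resolving_protocol: str  # protocol of the RIB route covering the next hop
    next_hop_metric: int     # metric of that RIB route (OSPF cost or BGP MED)


def prefer_by_next_hop_metric(paths):
    # Raw numeric comparison only: AD and prefix length are not consulted.
    # On a tie, min() returns the first path; a real router would move on
    # to later tie-break steps instead.
    return min(paths, key=lambda p: p.next_hop_metric)


def scenario(default_route_med, ospf_cost=20):
    """Return the winning path name for a given MED on the BGP default."""
    via_b = Path("via B", "4.4.4.4", "ospf", ospf_cost)
    # Router C's loopback is gone from OSPF, so 5.5.5.5 resolves through the
    # BGP-learned default route and inherits its MED as the metric.
    via_c = Path("via C", "5.5.5.5", "bgp", default_route_med)
    return prefer_by_next_hop_metric([via_b, via_c]).name


if __name__ == "__main__":
    print(scenario(default_route_med=0))    # via C
    print(scenario(default_route_med=100))  # via B
```

Raising the assumed MED above the assumed OSPF cost flips the result back to
Router B, which mirrors the route-map test suggested in this message.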
>> >>
>> >> --
>> >> Marko Milivojevic - CCIE #18427 (SP R&S)
>> >> Senior CCIE Instructor - IPexpert
>> >>
>> >> On Fri, Nov 30, 2012 at 6:21 PM, John Neiberger <jneiberger_at_gmail.com>
>> >> wrote:
>> >> > I posted this question to the Cisco NSP list and I've also talked to a
>> >> > couple of guys from Cisco Advanced Services and I'm still stumped about
>> >> > something. I'll try my best to phrase it in a way that makes sense.
>> >> >
>> >> > Router A is learning about a prefix from two route reflector clients.
>> >> > In both cases, the next hop for the prefix is the loopback address of
>> >> > the advertising routers. Their loopback addresses are being advertised
>> >> > into OSPF.
>> >> >
>> >> > So, from the perspective of Router A, its BGP table for this prefix
>> >> > has two paths:
>> >> >
>> >> > 1. 4.4.4.4 (loopback address of Router B, learned via OSPF) * winner
>> >> >    due to lower IGP metric
>> >> > 2. 5.5.5.5 (loopback address of Router C, learned via OSPF)
>> >> >
>> >> > Now for the weirdness to begin. A network event occurs that causes the
>> >> > loopback address of Router C to go away. This shouldn't affect Router A
>> >> > because it is already selecting the shortest path to the network via
>> >> > Router B (4.4.4.4).
>> >> >
>> >> > However, Router A is also learning a default via BGP. That means that
>> >> > even though 5.5.5.5 (loopback of Router C) disappeared and is
>> >> > unreachable, the router is doing a recursive lookup and keeps the path
>> >> > in the BGP table; 5.5.5.5 is still reachable, it thinks, by using the
>> >> > default route.
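
A small sketch of the recursive lookup described above, assuming a plain
longest-prefix match against the routing table. The table contents, metrics,
and the `resolve` helper are hypothetical and only illustrate why 5.5.5.5
still looks reachable once its /32 is gone.

```python
import ipaddress

# Hypothetical RIB on Router A after Router C's loopback leaves OSPF.
# Each entry is (prefix, source protocol, metric); values are illustrative.
RIB = [
    (ipaddress.ip_network("4.4.4.4/32"), "ospf", 20),
    (ipaddress.ip_network("0.0.0.0/0"), "bgp", 0),
]


def resolve(next_hop, rib):
    """Return the longest-match route covering next_hop, or None."""
    addr = ipaddress.ip_address(next_hop)
    matches = [route for route in rib if addr in route[0]]
    if not matches:
        return None
    return max(matches, key=lambda route: route[0].prefixlen)


if __name__ == "__main__":
    for nh in ("4.4.4.4", "5.5.5.5"):
        prefix, protocol, metric = resolve(nh, RIB)
        print(nh, "resolves via", prefix, protocol, "metric", metric)
```

With the default route present, 5.5.5.5 resolves through 0.0.0.0/0 and the
path stays valid; with only the OSPF /32 in the table, `resolve` returns None
and the path would be treated as having an unreachable next hop.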
>> >> >
>> >> > The weird thing is that this causes Router A to start using the wrong
>> >> > path! It seems to be preferring a path with a next hop learned via BGP
>> >> > to a path with a next hop learned via OSPF. Why would it do this? I see
>> >> > no documentation that would explain why a BGP-learned next hop is
>> >> > preferred over an IGP-learned next hop.
>> >> >
>> >> > Is the router still comparing IGP metrics even though the "wrong" path
>> >> > now has no IGP metric?
>> >> >
>> >> > It's not changing due to router ID, cluster length, or neighbor IP
>> >> > address. I checked. So, why is it switching?
>> >> >
>> >> > As soon as the BGP session from Router A to Router C times out, the
>> >> > extraneous path gets removed from the BGP table and the router goes
>> >> > back to using the correct path it should have been using all along.
>> >> >
>> >> > So, is a BGP-learned next hop preferred over an IGP-learned next hop?
>> >> > If so, why? If not, any idea why my router switches paths? I've turned
>> >> > on BGP debugging and IP routing debugging and haven't found a suitable
>> >> > explanation for the switch.
>> >> >
>> >> > John
>> >> >
>> >> >
>> >> > Blogs and organic groups at http://www.ccie.net
>> >> >
>> >> >
>> >> > _______________________________________________________________________
>> >> > Subscription information may be found at:
>> >> > http://www.groupstudy.com/list/CCIELab.html
Received on Fri Nov 30 2012 - 19:11:48 ART

This archive was generated by hypermail 2.2.0 : Tue Jan 01 2013 - 09:36:52 ART