Great write up Pavel. Thank you for clearing these details up!
Joe
From: Pavel Bykov [mailto:slidersv_at_gmail.com]
Sent: Monday, October 31, 2011 05:21 AM
To: Sajjad Najafizadeh <najafizadeh_at_gmail.com>
Cc: Yuri Bank <yuribank_at_gmail.com>; Radioactive Frog <pbhatkoti_at_gmail.com>; Joseph L. Brunner; Cisco certification <ccielab_at_groupstudy.com>
Subject: Re: High CPU load on 7609
Hi.
PBR is supported in hardware on the 7600/6500, but only with a very specific configuration. Any other configuration that is not stated in the documentation will result in packets being punted to the CPU, as was mentioned.
It is really important to make sure that you do as much as possible in hardware on those boxes, because it is easy to get carried away, think of the platform as more versatile than it is, and overload your weak 600MHz SR71000 CPU.
In this case it was pretty straightforward: you knew what the cause of the problem was.
In cases where it's not that straightforward, you can tell that it's not the process-level load that is the problem, but the interrupt-level load (73% of the CPU load in your "sh proc cpu" output came from interrupts). Interrupt-level CPU load is caused by security breaches, malfunctions (e.g. a bad IP checksum from another device), or configuration issues.
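As a rough illustration of reading that header line, here is a minimal Python sketch that splits the "75%/73%" five-second figure into total, interrupt-level and process-level load. The function name split_cpu_load is made up for this example and is not part of any Cisco tool; the header string is the one posted in this thread.

```python
import re

def split_cpu_load(header: str) -> dict:
    """Split the 'NN%/MM%' five-second field of a 'show proc cpu' header.

    NN is the total CPU load, MM is the part spent at interrupt level;
    the difference is the process-level load.
    (split_cpu_load is a hypothetical helper name for this sketch.)
    """
    match = re.search(r"five seconds:\s*(\d+)%/(\d+)%", header)
    if match is None:
        raise ValueError("no 'five seconds: NN%/MM%' field found")
    total = int(match.group(1))
    interrupt = int(match.group(2))
    return {"total": total, "interrupt": interrupt, "process": total - interrupt}

# Header line taken from the output posted in this thread.
header = ("CPU utilization for five seconds: 75%/73%; "
          "one minute: 77%; five minutes: 77%")
load = split_cpu_load(header)
print(load)  # {'total': 75, 'interrupt': 73, 'process': 2}
```

With only 2% at process level, no single process in the sorted list can account for the load, which matches the per-process figures all being below 2%.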
To protect the CPU from excessive load, you can use control plane protection in the form of the 10 available hardware policers, which protect the 1G pipes leading to the RP and SP CPUs. Software CoPP does not really do the trick in this case, as it is better suited to process-level intensive operations (e.g. SNMP, BGP, Telnet, FTP). For floods, however, CoPP is a software policer and puts as much load on the CPU as dispatching the packet would, which defeats the purpose of CoPP.
In any case, if you really want to know what your box is using the CPU for, you can easily SPAN the in-band pipe to the RP and SP CPUs, so you'll know exactly what your CPU is working on, and based on that information decide how you can protect it.
What you did in the end to reduce the CPU workload doesn't seem bad, so I'm not sure why you're disappointed. The 6500/7600 is not a software platform, so its functionality is largely limited to the PFC hardware capabilities. As I said, PBR is possible in hardware on these platforms, but only with a very specific command set.
P.S.: CPU on 6500/7600 will never be able to handle more than 1G of packets, regardless of optimizations, as that is the speed of CPU interface on PORT-ASIC.
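To put the 1G figure in perspective, here is a back-of-the-envelope Python sketch of what a 1 Gbps pipe can carry and how much of a 2 Gbps offered load (the rate mentioned in the original question) could not fit through it if everything were punted. The 20 bytes of per-frame overhead is an assumption borrowed from standard Ethernet (preamble plus inter-frame gap) and may not match the internal in-band channel. The function names are made up for this example. This is only the ceiling of the pipe, not a claim about how many packets the CPU can actually process.

```python
def max_frames_per_second(link_bps: float, frame_bytes: int,
                          overhead_bytes: int = 20) -> float:
    """Upper bound on frames per second that fit through a link.

    overhead_bytes=20 assumes standard Ethernet preamble + inter-frame gap;
    this is an assumption, not a documented property of the CPU interface.
    """
    return link_bps / ((frame_bytes + overhead_bytes) * 8)

def excess_fraction(offered_bps: float, link_bps: float) -> float:
    """Fraction of the offered load that cannot fit through the link."""
    if offered_bps <= link_bps:
        return 0.0
    return (offered_bps - link_bps) / offered_bps

CPU_PIPE_BPS = 1e9   # 1G CPU interface, as stated in the P.S.
OFFERED_BPS = 2e9    # about 2 Gbps, from the original question

print(int(max_frames_per_second(CPU_PIPE_BPS, 64)))   # 1488095
print(excess_fraction(OFFERED_BPS, CPU_PIPE_BPS))      # 0.5
```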
On Thu, Oct 27, 2011 at 7:36 PM, Sajjad Najafizadeh <najafizadeh_at_gmail.com> wrote:
Friends,
I have traffic from another network (about 2 Gbps) that needs to be sent to a
packet analyzer, and the packet analyzer then sends it back to the same
router (7609). First I used PBR, but the CPU load went very high. I
thought VRF might help, so I changed PBR to VRF: same issue, 80%+ CPU
load. I dropped the traffic down to 1 Gbps and downgraded the IOS from
12.2(33) to 12.2(18): no luck, same thing...
The only idea that worked for me was to remove the VRF, put the IP of CRF on
the packet analyzer, and make an L2 link from the other network to the packet
analyzer through the 7600. That reduced the CPU load from 80%+ to 30-40%.
Really disappointed...
Regards
On Thu, Oct 27, 2011 at 8:52 PM, Yuri Bank <yuribank_at_gmail.com> wrote:
> So is this a result of using PBR in VRFs? Or will using PBR period, cause
> all packets ( on that interface ) to hit the CPU?
>
> On Thu, Oct 27, 2011 at 5:16 AM, Radioactive Frog <pbhatkoti_at_gmail.com> wrote:
>
>> >>>I can't believe, given the state of the world, that Cisco is still selling
>> products where a feature that can be configured can slow down the
>> system that much.
>>
>> Totally agreed Joseph.
>> I'd expect it to at least be captured somewhere in the 'show proc cpu'
>> output so that u can see what is causing it. Once it's PBR'd (L3), all
>> packets are gonna be punted to the CPU.
>> Maybe it's documented somewhere and we don't know that secret location on
>> CCO yet :)
>>
>> Sajjad -
>> >>>The only workaround that i've think of was to make it L2 toward
>> next-hop
>> and removing VRF in configuration.
>>
>> Do u mean u turned it into L2 switch? If so what a waste of 760X! :(
>>
>>
>> On Thu, Oct 27, 2011 at 1:39 AM, Joseph L. Brunner <joe_at_affirmedsystems.com> wrote:
>>
>> > Very good to know that Sajjad.
>> >
>> > Thanks for posting back what did it.
>> >
>> > I can't believe, given the state of the world, that Cisco is still selling
>> > products where a feature that can be configured can slow down the
>> > system that much.
>> >
>> >
>> >
>> > -----Original Message-----
>> > From: nobody_at_groupstudy.com [mailto:nobody_at_groupstudy.com] On Behalf Of
>> > Sajjad Najafizadeh
>> > Sent: Wednesday, October 26, 2011 10:08 AM
>> > To: Radioactive Frog
>> > Cc: Cisco certification
>> > Subject: Re: High CPU load on 7609
>> >
>> > Hi all
>> >
>> > First of all, I've changed the IOS to 12.2(18), but the issue still exists.
>> > I used VRF-lite to send traffic to the next hop, as PBR killed the CPU
>> > before.
>> > The only workaround that I could think of was to make it L2 toward the
>> > next hop and remove the VRF from the configuration.
>> > That solved the issue.
>> > I can't believe a 7600 router cannot handle VRF and BGP with some PBR at
>> > the same time with max traffic of 4 Gbps.
>> >
>> > Thanks again to all for support.
>> >
>> > REgards
>> >
>> > On Wed, Oct 26, 2011 at 1:21 PM, Radioactive Frog <pbhatkoti_at_gmail.com> wrote:
>> >
>> > > Last week I had the same issue on a 6509. The weird thing is that nothing
>> > > shows up in the 'show proc cpu sorted' output.
>> > >
>> > > The root cause of my issue was that someone had added a route-map
>> > > (matching an ACL and setting the next hop). There were about 8000+ users!
>> > > University environment. The core 6509 was running like a dog!
>> > >
>> > > The fix: implement VRFs. After removing the route-maps, the CPU was back
>> > > to a normal 40-55% (it was at 95-100% constantly for 10 days).
>> > >
>> > >
>> > > HTH
>> > >
>> > > Frog
>> > >
>> > >
>> > > On Wed, Oct 26, 2011 at 6:39 PM, Sajjad Najafizadeh <najafizadeh_at_gmail.com> wrote:
>> > >
>> > >> Hi all
>> > >>
>> > >> We have high CPU load on a 7609 router.
>> > >> There is no ip policy and no NAT on this router, but here is the CPU load.
>> > >> Could anyone suggest what to do?
>> > >>
>> > >> *Output of sho ip proc cpu sorted :*
>> > >>
>> > >>
>> > >> CPU utilization for five seconds: 75%/73%; one minute: 77%; five minutes: 77%
>> > >>  PID Runtime(ms)  Invoked  uSecs   5Sec   1Min   5Min TTY Process
>> > >>  220      949800  3116922    304  1.59%  1.58%  1.43%   0 IP Input
>> > >>  256        2544  3638244      0  0.15%  0.16%  0.15%   0 Ethernet Msec Ti
>> > >>   83       80164    61809   1296  0.15%  0.02%  0.05%   2 Virtual Exec
>> > >>    2       30380     5976   5083  0.07%  0.05%  0.06%   0 Load Meter
>> > >>  219        1132   916608      1  0.07%  0.03%  0.02%   0 IP ARP Retry Age
>> > >>  372         576    43514     13  0.07%  0.00%  0.00%   0 FM core
>> > >>  162         632    29926     21  0.07%  0.02%  0.02%   0 Per-Second Jobs
>> > >>  191          52    29866      1  0.07%  0.00%  0.00%   0 CWAN CHOCX PROCE
>> > >>  260        1164   916616      1  0.07%  0.04%  0.05%   0 IPAM Manager
>> > >>   27        1120    29181     38  0.07%  0.02%  0.02%   0 IPC Periodic Tim
>> > >>  326        1292   126588     10  0.07%  0.03%  0.02%   0 TCP Timer
>> > >>  555       89680   494908    181  0.07%  0.11%  0.11%   0 SNMP ENGINE
>> > >>  379          72    29805      2  0.07%  0.00%  0.00%   0 PfR BR Learn
>> > >>   15           0        2      0  0.00%  0.00%  0.00%   0 ATM Idle Timer
>> > >>   14         316    31552     10  0.00%  0.00%  0.00%   0 ARP Background
>> > >>   17           0        1      0  0.00%  0.00%  0.00%   0 AAA_SERVER_DEADT
>> > >>   13       25684    36715    699  0.00%  0.04%  0.05%   0 ARP Input
>> > >>   12        3748    30473    122  0.00%  0.00%  0.00%   0 WATCH_AFS
>> > >>   16           0        1      0  0.00%  0.00%  0.00%   0 ATM ASYNC PROC
>> > >>   18           0        1      0  0.00%  0.00%  0.00%   0 Policy Manager
>> > >>   22          12     6010      1  0.00%  0.00%  0.00%   0 IPC Event Notifi
>> > >>   23          64    29182      2  0.00%  0.00%  0.00%   0 IPC Mcast Pendin
>> > >>   24           0      500      0  0.00%  0.00%  0.00%   0 IPC Dynamic Cach
>> > >>   11           0        2      0  0.00%  0.00%  0.00%   0 Timers
>> > >>   26          16      107    149  0.00%  0.00%  0.00%   0 PF_Split Sync Pr
>> > >>
>> > >> Regards
>> > >>
>> > >>
>> > >> Blogs and organic groups at http://www.ccie.net
>> > >>
>> > >>
>> _______________________________________________________________________
>> > >> Subscription information may be found at:
>> > >> http://www.groupstudy.com/list/CCIELab.html
Received on Mon Oct 31 2011 - 09:54:02 ART
This archive was generated by hypermail 2.2.0 : Tue Nov 15 2011 - 13:10:29 ART