Re: WFQ

From: Narbik Kocharians (narbikk@gmail.com)
Date: Sun Jul 13 2008 - 08:05:08 ART


Ha ha ha ha.
Well John, you may be sorry for asking the question, but let's go way back
and cover the tools you should use for the calculations before we get to
the simple WFQ that not many use. The tool used in the old days was called
an *abacus*, also known as a *counting frame*: a calculating tool used
primarily in Asia for performing arithmetic. Today, abaci are often
constructed as a wooden frame with beads sliding on wires, but originally
they were beads or stones moved in grooves in sand or on tablets of wood,
stone, or metal. The abacus was in use centuries before the adoption of the
modern written numeral system and is still widely used by merchants and
clerks in China, Japan, Africa, India and elsewhere.
So please add this to what was said, and now you have a complete background
and info on everything.

The user of an abacus is called an abacist; he or she slides the beads of
the abacus by hand.

On Sun, Jul 13, 2008 at 12:31 AM, Petr Lapukhov <petr@internetworkexpert.com>
wrote:

> Ooops, a typo in my large post :) Instead of
> "And if the interface bandwidth is 256K the resulting values are 16Kbps for
> each precedence 1 flow, 32 Kbps for each precedence 2 flow and <90Kbps> for
> precedence 5 flow."
>
> read
>
> "And if the interface bandwidth is 256K the resulting values are 16Kbps for
> each precedence 1 flow, 32 Kbps for each precedence 2 flow and <80Kbps> for
> precedence 5 flow."
>
> Effectively 5*16Kbps + 3*32Kbps + 1*80Kbps = 256Kbps
>
> --
> Petr Lapukhov, CCIE #16379 (R&S/Security/SP/Voice)
> petr@internetworkexpert.com
>
> Internetwork Expert, Inc.
> http://www.InternetworkExpert.com
>
> 2008/7/13 Petr Lapukhov <petr@internetworkexpert.com>:
>
> > John,
> >
> > WFQ may look a little bit tricky, especially if you dig into the
> > implementation details right away :) However, let's start from the very
> > basics, looking at the original roots of the WFQ/FQ family of schedulers.
> >
> > First, it all started with an abstract algorithm called GPS (Generalized
> > Processor Sharing), which describes an ideal procedure to share one
> > resource between many entities in a *statistically fair* and efficient
> > manner. Here "fair" means so-called "max-min" fairness, which maximizes
> > the minimal share of any of the active entities; that is, it tries to
> > give as much as possible to the most starving entity. Additionally, you
> > may assign "weights" to entities to scale their importance. (Fairness,
> > in general, is a very complicated topic, rooted in game theory, which
> > has raised a lot of heated discussion over the years. "Max-min" fairness
> > is just one proposed scheme.)
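The max-min idea described above can be sketched in a few lines of Python. This is a toy "waterfilling" allocator, purely illustrative and not anything from IOS: it repeatedly splits the leftover capacity equally among flows that still want more, which is exactly what maximizing the minimal share amounts to.

```python
def max_min_fair(capacity, demands):
    """Toy max-min fair allocation: repeatedly give every unsatisfied
    flow an equal share of the remaining capacity, capped at its demand."""
    alloc = [0.0] * len(demands)
    active = set(range(len(demands)))
    remaining = float(capacity)
    while active and remaining > 1e-9:
        share = remaining / len(active)
        satisfied = set()
        for i in list(active):
            give = min(share, demands[i] - alloc[i])
            alloc[i] += give
            remaining -= give
            if alloc[i] >= demands[i] - 1e-9:
                satisfied.add(i)
        active -= satisfied
        if not satisfied:
            break  # nobody was capped by demand; capacity is exhausted
    return alloc

# Capacity 10 shared by flows demanding 2, 4 and 10:
# the small flows are fully satisfied, the greedy one gets the rest.
print(max_min_fair(10, [2, 4, 10]))  # -> approximately [2.0, 4.0, 4.0]
```

Note how the greedy flow cannot squeeze the small ones: their demands are met first, and only then does the leftover go to the most starving entity.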
> >
> > However, GPS is an abstract algorithm that cannot be implemented in the
> > real world, because it treats the "resource shares" (packets, when
> > applied to networking) as infinitesimally divisible rather than as
> > quanta. Therefore, an "approximation" scheme is needed to make it work
> > in the real world. A large number of implementations simulating GPS
> > behavior exist, including: Fair Queuing (FQ), Weighted Fair Queuing
> > (WFQ), Round Robin (RR), Weighted Round Robin (WRR), Deficit Round
> > Robin (DRR), Modified Deficit Round Robin (MDRR) and so on. All of them
> > represent more or less accurate "packet-based" approximations of GPS.
> >
> > First, look at Fair Queuing (FQ). The algorithm was proposed back in
> > the 80s by the famous John Nagle, who is also responsible for the
> > *service nagle* command in IOS ;) The central concept of FQ is the
> > packet flow (the equivalent of a process in GPS), which can be defined
> > in various ways. For example, you may define a flow as the set of
> > packets sharing the same source/destination IP addresses. Cisco decided
> > to define a flow based on source/destination IPs, source/destination
> > ports (where available), the IP protocol value and the least
> > significant 5 bits of the TOS byte (ouch, this is complicated). They
> > use a special hash function that runs over all these values and returns
> > the number of the queue to put the packet in. (It can easily be shown
> > that basing flows on port numbers may allow one user with a lot of
> > flows to claim more bandwidth than other users can get. This is very
> > true of P2P applications, and that is why people at the IETF are trying
> > to invent a new fairness scheme that works well in the P2P sharing
> > world.)
> >
> > The second concept of FQ is the per-flow queue: each flow has its own
> > FIFO queue in the memory space used by the FQ scheduler. Therefore, the
> > number of active queues for FQ is dynamic and changes with the number
> > of active flows, but all queues share the same buffer space assigned to
> > the Fair Queue scheduler. Note that the structure of the hash function
> > used by Cisco dictates that the number of queues is always a power of
> > 2: 256, 512, 1024, 4096 etc.
> >
> > To decide which of the FIFO queues to serve first, the FQ scheduler
> > calculates a so-called "virtual finishing time" for the packet sitting
> > at the head of every FIFO queue. This value is based on the packet size
> > and the packet arrival time. The packet with the minimum expected
> > "virtual finishing time" is de-queued first, and this is how a flow
> > with small packets is able to get the same share as a flow with large
> > packets. The ultimate goal of FQ is to divide the interface bandwidth
> > equally between flows, i.e. for N flows and interface rate R, every
> > flow will statistically obtain bandwidth no less than R/N (of course a
> > flow may use less bandwidth, but R/N is guaranteed in case a flow
> > speeds up). Note that the formula is only valid for sustained flows
> > (statistically reliable) and you may get less accurate results with
> > short, dynamic flows.
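A minimal sketch of the virtual-finishing-time idea, under simplifying assumptions: all packets are already queued, arrival times and weights are ignored, and virtual time simply advances by packet size within each flow's queue.

```python
def fq_order(flows):
    """Order packets by a simplified FQ 'virtual finishing time'.
    `flows` maps a flow id to a list of packet sizes (bytes); each
    packet's finish time is the previous finish time in its own queue
    plus its own size, so small packets from one flow can overtake
    large packets queued earlier by another flow."""
    entries = []
    for fid, sizes in flows.items():
        finish = 0
        for seq, size in enumerate(sizes):
            finish += size
            entries.append((finish, fid, seq))
    entries.sort()  # smallest virtual finishing time is served first
    return [(fid, seq) for _, fid, seq in entries]

order = fq_order({"big": [1500, 1500], "small": [100, 100, 100]})
# all three small packets are served before the first big one
```

This is exactly why a telnet-style flow of tiny packets is not starved behind an FTP-style flow of MTU-sized packets: its packets "finish" earlier in virtual time.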
> >
> > Next, WFQ (Weighted Fair Queuing) is a more general technique, which
> > assigns weights to packet flows and shares the bandwidth in proportion
> > to the weights. Where FQ provides an R/N share of rate R to each of N
> > flows, WFQ will provide a wX*R/(w1+w2+...+wN) [formula 1] share of the
> > bandwidth to flow X. Cisco's implementation assigns weights
> > automatically, based on a flow's IP precedence value (this is why the
> > top 3 bits of the TOS byte are not taken into account by the WFQ
> > hashing function). You may consider w1, w2, ..., wN to equal "IP
> > Precedence + 1" for use with the above-mentioned "formula 1" in the
> > actual computation of bandwidth shares. Obviously, the formula means
> > that if you have just one flow of each precedence value, the bandwidth
> > is shared in proportions of "IP Precedence + 1" (e.g. for one IP
> > precedence 0 flow and one IP precedence 1 flow you will get a 1:2
> > share). For a more complicated example, consider an interface of
> > 256Kbps with five precedence 1 flows, plus three precedence 2 flows and
> > one precedence 5 flow on the same link. The bandwidth shares will be as
> > follows.
> >
> > The total number of flows (N) equals 5+3+1=9.
> >
> > Weights:
> >
> > w1=1, w2=1, w3=1, w4=1, w5=1
> > w6=2, w7=2, w8=2
> > w9=5
> >
> > Shares per formula 1:
> >
> > for each precedence 1 flow: 1/(5*1+3*2+1*5)=1/16
> > for each precedence 2 flow: 2/(5*1+3*2+1*5)=1/8
> > for the precedence 5 flow: 5/(5*1+3*2+1*5)=5/16
> >
> > And if the interface bandwidth is 256K, the resulting values are 16Kbps
> > for each precedence 1 flow, 32Kbps for each precedence 2 flow and
> > 80Kbps for the precedence 5 flow.
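The worked example above can be reproduced directly from "formula 1". The weights here are the ones listed in the example (1, 2 and 5):

```python
def wfq_shares(link_kbps, weights):
    """Per-flow bandwidth from 'formula 1': share_X = wX * R / (w1+...+wN)."""
    total = sum(weights)
    return [link_kbps * w / total for w in weights]

# Five flows of weight 1, three of weight 2 and one of weight 5
# on a 256 Kbps link, as in the example above.
print(wfq_shares(256, [1]*5 + [2]*3 + [5]))
# -> 16.0 Kbps (x5), 32.0 Kbps (x3), 80.0 Kbps
```

The shares always sum to the link rate, which is a quick sanity check: 5*16 + 3*32 + 80 = 256.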
> >
> > However, in the actual implementation Cisco uses a special "scale
> > factor" to multiply the "virtual finishing time" of each flow and thus
> > achieve the weighted fairness. Prior to IOS 12.0(5)T, the formula for
> > the "scale factor" was 4096/(IP Precedence + 1); modern IOSes use
> > 32768/(IP Precedence + 1). As you can see, this scale factor is larger
> > for smaller IP precedences, so a flow with a bigger IP precedence will
> > have its "virtual finishing time" scaled less than a flow with a
> > smaller IP precedence value. However, those are only implementation
> > details, and what's important to note is that you may always use
> > "formula 1" to get the bandwidth shares.
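The two scale-factor formulas quoted above, as a quick sketch (integer division, since the IOS weights are integers):

```python
def wfq_scale_factor(ip_precedence, modern=True):
    """WFQ 'scale factor' applied to a flow's virtual finishing time:
    32768/(prec+1) on modern IOS, 4096/(prec+1) before 12.0(5)T."""
    base = 32768 if modern else 4096
    return base // (ip_precedence + 1)

print([wfq_scale_factor(p) for p in range(8)])
# -> [32768, 16384, 10922, 8192, 6553, 5461, 4681, 4096]
```

A precedence 7 flow's finishing times grow eight times more slowly than a precedence 0 flow's, which is precisely the 8:1 bandwidth ratio "formula 1" predicts.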
> >
> > A few words on RSVP interaction with WFQ. WFQ was designed to be
> > RSVP-aware, and every RSVP reservation gets a separate flow (FIFO
> > queue) served by the WFQ scheduler. The "scaling factors" for RSVP
> > flows are very small compared to the values calculated for the regular
> > flows. For example, the RSVP reservation with the largest bandwidth
> > demand gets a scaling factor of 6, and every other reservation gets a
> > higher scaling factor based on its bandwidth demand, but still a very
> > small one. This is how RSVP flows always get the bandwidth they are
> > asking for (provided that they are admitted by the RSVP process).
> >
> > The next thing, which is specific to Cisco's implementation of WFQ, is
> > the so-called Congestive Discard Threshold. As we remember, WFQ shares
> > a single buffer between dynamic flows. Each queue is FIFO and may
> > eventually eat all the available buffer space. To avoid this condition,
> > CDT is used in the following manner. When one of the queues reaches CDT
> > size, a packet is dropped, but possibly not from the oversized queue.
> > The queue that actually has a packet dropped is the most "active" queue
> > of them all: the one that has accumulated the most bytes in packets and
> > has the biggest total virtual finishing time. This way (hoping that the
> > flow was TCP) the most active flow is implicitly signaled to slow down
> > its sending rate. This is not a part of the original FQ or WFQ
> > algorithm, but Cisco's extension to make it more adaptive. Here is how
> > the WFQ settings may look on an IOS router:
> > "fair-queue <CDT> <NumDynamicQueues> <RSVP Queues>". The total buffer
> > space available to WFQ is set with the "hold-queue" command.
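A sketch of the CDT drop decision. The "heaviest" queue is approximated here by queued bytes alone; the real selection also considers the virtual finishing time, so treat this only as an illustration of the key point that the victim may not be the queue that hit the threshold:

```python
def cdt_drop_candidate(queues, cdt):
    """Toy Congestive Discard Threshold logic. `queues` maps a flow id
    to a list of queued packet sizes. When any flow queue reaches CDT
    packets, a drop happens from the flow with the most queued bytes,
    not necessarily from the queue that crossed the threshold.
    Returns the flow id to drop from, or None if no queue reached CDT."""
    if not any(len(q) >= cdt for q in queues.values()):
        return None
    return max(queues, key=lambda fid: sum(queues[fid]))

# Flow "a" hits CDT with 64 tiny packets, but the drop is taken from
# flow "b", which holds far more bytes in fewer, larger packets.
victim = cdt_drop_candidate({"a": [100]*64, "b": [1500]*10}, cdt=64)
print(victim)  # -> "b"
```

This matches the text above: the threshold is the trigger, while the punishment lands on the most aggressive flow, nudging it (if it is TCP) to back off.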
> >
> > Another two extensions to WFQ were IP RTP Reserve and IP RTP Priority
> > (the latter is also known as PQ/WFQ). Those two specifically aim to
> > make WFQ work with voice traffic, which requires expedited priority
> > treatment for voice bearer packets. IP RTP Reserve used a special flow
> > with a very small scaling factor (similar to RSVP queues and also based
> > on the bandwidth provided) to queue all RTP packets (actually, packets
> > with UDP ports in the range you specify), but was deprecated in favor
> > of IP RTP Priority. The latter provides a true priority queue (a
> > separate FIFO flow queue) for RTP traffic on an interface with WFQ
> > enabled. RTP packets are always serviced first, but limited in rate by
> > an explicit policer (note that the burst size for this special policer
> > equals the specified rate multiplied by one second). Example commands
> > for IP RTP Reserve and IP RTP Priority are "ip rtp reserve 16384 16383
> > 256" and "ip rtp priority 16384 16383 256".
> >
> > The last thing about WFQ is that it reserves a special number of flow
> > queues for network management traffic, such as link-layer keepalives,
> > routing protocol traffic, NetFlow packets etc. You can't see those
> > special flow queues in show command output, but they do exist and have
> > special high weights to guarantee timely delivery of critical network
> > traffic.
> >
> > Of course, all WFQ features were superseded by CBWFQ, which not only
> > allows you to specify any criteria for traffic flows and assign weights
> > manually, but also integrates tightly with other QoS features using the
> > common MQC syntax. At this time, I'm working on a more detailed post
> > for our blog, describing WFQ mechanics in depth. So if you are looking
> > for more information on WFQ and CBWFQ, stay tuned :)
> >
> >
> > --
> >
> > Petr Lapukhov, CCIE #16379 (R&S/Security/SP/Voice)
> > petr@internetworkexpert.com
> >
> > Internetwork Expert, Inc.
> > http://www.InternetworkExpert.com
> >
> > 2008/7/13 John <jgarrison1@austin.rr.com>:
> >
> >> I don't know why I have a hard time getting this, but let's see if I
> >> have this right.
> >>
> >> WFQ will give low-volume traffic "better treatment" than high-volume
> >> traffic. It does this through some mysterious algorithm that I can't
> >> seem to find. WFQ is the default for all interfaces under E1 speeds.
> >> WFQ works with IP precedence and RSVP. WFQ is flow based. I cannot
> >> change the behaviour of WFQ through configuration. There are some QoS
> >> features I cannot configure on an interface with WFQ.
> >>
> >> That's what I get. I would appreciate comments on anything I missed or
> >> got wrong.
> >>
> >>
> >> _______________________________________________________________________
> >> Subscription information may be found at:
> >> http://www.groupstudy.com/list/CCIELab.html
>
>

-- 
Narbik Kocharians
CCSI#30832, CCIE# 12410 (R&S, SP, Security)
www.Net-Workbooks.com
Sr. Technical Instructor


This archive was generated by hypermail 2.1.4 : Mon Aug 04 2008 - 06:11:54 ART