RE: What's wrong with this design?

From: asadovnikov (asadovnikov@comcast.net)
Date: Wed Sep 10 2003 - 22:42:11 GMT-3


Sorry for the long reply :)

I actually like the design they have.

Let me go through this design in a little more detail and then move on to a
review of how it works.

To begin with, a really important feature of this design is that a single
VLAN runs on only a single access switch. In such a design an access switch
can run multiple VLANs, which can be trunked over the uplinks, but the
critical point of this (unusual for you) spanning tree manipulation is that
the same VLAN is never on 2 access switches.

Having a single VLAN can be administratively easier, especially if the size
of the access box lends itself naturally to a single subnet (say you may have
4006 switches with 196 ports each and use a /24 subnet on each one). From now
on let me assume that it is just a single VLAN per switch; as mentioned
earlier, having multiple VLANs just adds trunking and requires almost no
extra explanation.
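
As a quick sanity check of that sizing, here is a minimal Python sketch. The
subnet 10.1.25.0/24 and the helper name usable_hosts are assumptions made up
for illustration; only the 196 ports and the /24 come from the text above.

```python
import ipaddress

def usable_hosts(prefix: str) -> int:
    """Number of assignable host addresses in an IPv4 subnet."""
    net = ipaddress.ip_network(prefix)
    # Exclude the network and broadcast addresses.
    return net.num_addresses - 2

ports_per_switch = 196          # from the example in the text
subnet = "10.1.25.0/24"         # hypothetical addressing, one subnet per switch

# Addresses left over for router interfaces, HSRP virtual IPs, etc.
spare = usable_hosts(subnet) - ports_per_switch
print(f"{subnet}: {usable_hosts(subnet)} usable, {spare} spare")
```

So a fully populated switch still leaves some room in the /24 for the core
router interfaces and the virtual gateway addresses.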

Further, the core/distribution boxes terminate the VLAN at L3 and it is
routed from there.

Depending on the implementation, the 2 core boxes may or may not have the
VLAN trunked between them at L2. I personally do not like such an L2 link as
it does not add much to redundancy, but I have seen it argued both ways. You
did not mention whether the L2 trunk between the core boxes is present (as
per the Cisco recommendation) or not. I have to say that if the trunk between
the cores is present, the placement of the root on the access level switch
becomes a lot more important.

So here is a picture (with the L2 trunk shown); I know it is not a Visio
but it will have to do:

   Core A      Core B
   router      router
      \--------/
       \      /
        \    /
         \  /
          \/
        Access
        Switch

For STP, the root is on the access switch.

Now let us look at how it works... STP will be blocking on the trunk between
the 2 core switches, which is a good place for it to block, so both uplinks
will be utilized all the time. If the trunk is not present it is not a big
deal either.
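
To see why the block lands on the core trunk, here is a minimal Python sketch
of the spanning tree port role rules applied to the triangle in the picture.
It is a simplified model, not a full 802.1D implementation: it assumes one
link between any two bridges, equal link costs, and made-up bridge IDs (lower
ID wins the root election). The names blocked_links, TRIANGLE and the IDs are
all hypothetical.

```python
import heapq

def blocked_links(bridge_ids, links):
    """Return the links that end up with a blocking port.

    bridge_ids: dict of bridge name -> numeric bridge ID (lowest becomes root)
    links: dict of frozenset({a, b}) -> link cost
    """
    root = min(bridge_ids, key=bridge_ids.get)

    # Build an adjacency list from the link table.
    adj = {name: [] for name in bridge_ids}
    for link, cost in links.items():
        a, b = tuple(link)
        adj[a].append((b, cost))
        adj[b].append((a, cost))

    # Root path cost of every bridge (Dijkstra from the root).
    rpc = {root: 0}
    heap = [(0, root)]
    while heap:
        dist, node = heapq.heappop(heap)
        if dist > rpc.get(node, float("inf")):
            continue
        for nbr, cost in adj[node]:
            if dist + cost < rpc.get(nbr, float("inf")):
                rpc[nbr] = dist + cost
                heapq.heappush(heap, (dist + cost, nbr))

    # Root port: the port with the cheapest path to the root,
    # ties broken by the lowest neighbor bridge ID.
    root_ports = set()
    for bridge in bridge_ids:
        if bridge == root:
            continue
        nbr, _ = min(adj[bridge],
                     key=lambda nc: (rpc[nc[0]] + nc[1], bridge_ids[nc[0]]))
        root_ports.add((bridge, frozenset({bridge, nbr})))

    # Designated bridge per link: lowest root path cost, then lowest ID.
    # The other end blocks unless that port is its root port.
    blocked = set()
    for link in links:
        a, b = tuple(link)
        designated = min((a, b), key=lambda n: (rpc[n], bridge_ids[n]))
        other = b if designated == a else a
        if (other, link) not in root_ports:
            blocked.add(link)
    return blocked

TRIANGLE = {
    frozenset({"access", "coreA"}): 4,
    frozenset({"access", "coreB"}): 4,
    frozenset({"coreA", "coreB"}): 4,
}

# Root on the access switch (the design discussed here).
root_on_access = blocked_links({"access": 1, "coreA": 2, "coreB": 3}, TRIANGLE)
# Root on core A with core B as secondary (the usual recommendation).
root_on_core = blocked_links({"coreA": 1, "coreB": 2, "access": 3}, TRIANGLE)

print("root on access, blocked:", [sorted(l) for l in root_on_access])
print("root on core A, blocked:", [sorted(l) for l in root_on_core])
```

With the root on the access switch, the only blocked link in this model is the
core-to-core trunk; with the root on core A, it is the access switch's uplink
to core B that blocks.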

If an uplink fails, then the core trunk makes a difference. If it is present
then the link failure is transparent to L3, but if it is not, L3 will take
care of the failure just fine. UDLD on the uplinks will help to make sure
there are no half-failed links. Depending on which link failed and on the
presence of the cross link, convergence time will be the same as (or better
than) in the Cisco collapsed core design. In most cases replacing the
cross-core trunk with cross-core L3 links will improve failover time, and
then it will be better than the usual Cisco way of putting the STP root on
the core.

Load balancing... In this scenario, as you rightly say, the load balancing
is controlled by L3. If upstream load balancing is required (which it often
is not, because traffic from workstation to server is a lot lighter than in
the reverse direction), multigroup HSRP combined with 2 DHCP scopes can be
used to load balance the traffic. For downstream traffic, each core router
will receive about 50% of it due to routing protocol load balancing, and
forward it right to the access switch.
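
The multigroup HSRP idea can be sketched as a small Python model. This is
only an illustration under assumptions: the virtual gateway addresses, the
router names, and the even split of clients between the 2 DHCP scopes are all
made up, and real HSRP behavior (timers, preemption, tracking) is not
modeled.

```python
# Two HSRP groups on the same VLAN, each with its own virtual gateway IP.
# Each group prefers a different core router as active (hypothetical values).
HSRP_GROUPS = {
    "10.1.25.1": ["coreA", "coreB"],   # group 1: coreA active, coreB standby
    "10.1.25.2": ["coreB", "coreA"],   # group 2: coreB active, coreA standby
}

def active_router(preference, alive):
    """First router in preference order that is still up, else None."""
    for router in preference:
        if router in alive:
            return router
    return None

def upstream_split(hosts, alive):
    """Count how many hosts send their upstream traffic to each core router.

    Assumes the 2 DHCP scopes hand out the two gateways to alternating hosts.
    """
    gateways = sorted(HSRP_GROUPS)
    load = {}
    for i in range(hosts):
        gateway = gateways[i % 2]
        router = active_router(HSRP_GROUPS[gateway], alive)
        load[router] = load.get(router, 0) + 1
    return load

both_up = upstream_split(196, {"coreA", "coreB"})
core_a_down = upstream_split(196, {"coreB"})
print("both cores up:", both_up)
print("core A failed:", core_a_down)
```

With both cores up the hosts split evenly between the two routers; when one
core fails the surviving router takes over both virtual gateways, so no host
loses its default gateway.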

If you think carefully about it, the core->access load balancing will be
really close to 50% as it relies upon the true load balancing of the L3
protocol (especially when combined with CEF and when the server farms are L3
as well). Note that STP load balances by making different VLANs block on
different uplinks. The STP way load balances whole VLANs, which may be of
very different sizes, so achieving load distribution as good as the L3 way
is almost impossible.
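
A rough Python sketch of that comparison follows. Everything in it is an
assumption for illustration: the VLAN sizes are invented, every host is
assumed to generate the same amount of traffic, and a SHA-256 hash stands in
for whatever per-flow hashing the routers really use (it is not the CEF
algorithm).

```python
import hashlib

# Hypothetical number of hosts per VLAN on one access switch.
VLAN_HOSTS = {10: 196, 20: 48, 30: 150, 40: 24}

def stp_share(vlan_hosts):
    """Fraction of traffic on uplink A when whole VLANs alternate uplinks."""
    ordered = sorted(vlan_hosts.items())
    on_a = sum(hosts for i, (_, hosts) in enumerate(ordered) if i % 2 == 0)
    return on_a / sum(vlan_hosts.values())

def l3_share(vlan_hosts, servers=20):
    """Fraction of flows sent via core A when each flow is hashed to a path."""
    on_a = 0
    total = 0
    for vlan, hosts in vlan_hosts.items():
        for host in range(hosts):
            for server in range(servers):
                key = f"server{server}->vlan{vlan}.host{host}".encode()
                # Lowest bit of the hash picks one of two equal-cost paths.
                if hashlib.sha256(key).digest()[0] & 1 == 0:
                    on_a += 1
                total += 1
    return on_a / total

stp = stp_share(VLAN_HOSTS)
l3 = l3_share(VLAN_HOSTS)
print(f"per-VLAN STP split: {stp:.1%} on uplink A")
print(f"per-flow L3 split:  {l3:.1%} via core A")
```

With these made-up sizes the per-VLAN split puts 346 of 418 hosts on one
uplink, while the per-flow split stays near one half regardless of how uneven
the VLANs are.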

So for both load balancing and failover this design is better than the
widely used Cisco way of putting the STP root on the core. And I think it is
easier to implement/operate than the more traditional design.

Again, only the assumption that a VLAN never spans multiple access switches
makes such a design possible, and I am sure members of the "flat earth
society" would not be happy with such an assumption. But these days only in
extremely rare conditions is such an assumption any issue at all, and in a
very short time L3 will run on access switches (it already does on many) and
spanning VLANs across multiple access switches will be completely gone.

I do not think you will find anything wrong with it, provided nothing big
was missed in the description, and it is better than the design
traditionally recommended by Cisco.

Unfortunately the Cisco design guidelines are still based upon the old model
where many devices ran L2 switching, VLANs spanned multiple access switches,
routers were slow... Since then nobody has really done a good re-thinking to
build new designs for today's networks. The map to a certain extent still
describes a "flat earth" even though everybody knows it is not flat.

I trust whoever designed this had very high design skills, of a level which
is rarely seen these days (or he may have got confused and just did it
without much thinking), and I am glad to give him credit.

Best regards,
Alexei

-----Original Message-----
From: nobody@groupstudy.com [mailto:nobody@groupstudy.com] On Behalf Of Mike
Taylor
Sent: Wednesday, September 10, 2003 12:04 PM
To: ccielab@groupstudy.com
Subject: OT: What's wrong with this design?

Hi all. I'm hoping you'll take a look at this interesting campus network
design scenario, and apologize in advance for the length of this message.

We're working with a customer who has implemented a campus network design
quite unlike that which we've seen elsewhere. Their network design is very
close to that of a standard collapsed-core model (see figure 4 of the
following link if you need to know just what that is - beware of wrap):

http://www.cisco.com/en/US/netsol/ns110/ns146/ns147/ns17/networking_solutions_white_paper09186a00800a3e16.shtml

Normally with this design, you'd configure the core devices to be the
spanning-tree roots for all VLANs. You would configure uplink-fast on the
access-layer switches to improve spanning-tree convergence times. You would
tune the HSRP active router to be the same as the spanning-tree root per
VLAN, etc, etc.

The network in question, however, has been configured such that the
spanning-tree roots for each VLAN are found on the access-layer devices.
Uplink-fast configuration has been moved to the core switches. Routing and
HSRP still happen on the core devices. Additionally, access-layer devices
(for the most part) support only a single unique VLAN (i.e. all client ports
on access-layer device X would be configured as access-links to VLAN25, and
device X would be the spanning-tree root for VLAN25).

Why was the network designed this way? Since the access-layer devices
support a single VLAN, there would be no opportunity to load-balance
multiple VLANs across separate trunks (as you would in a "normal" design).
They moved the spanning-tree roots for their VLANs to the edge so that both
uplinks to the core would be forwarding (and in their minds, load
balancing). What really happens is not true load balancing - all traffic
sourced from devices connected to the access-layer switch (and destined for
some other broadcast domain) follows the uplink towards the current HSRP
active router (their default gateway). Return traffic might come back to the
access-layer switch
via either (forwarding) trunk. While this creates asymmetric traffic
patterns, we haven't found this to cause any issues and it does create a
pseudo load-balancing situation. In our lab, we tested convergence times
for various failover situations and found them to be in line with those of a
"normal" collapsed-core design.

Can anyone find gotchas with this design? We're having a hard time proving
that this design is worse or less efficient than a "normal" collapsed-core
model, even though the theories behind the two designs are nearly opposite.
What do you big-brained people think?

Thanks!

Mike Taylor - CCIE #9658, MCSE, CNE
Network Engineer - Network Solutions, Inc.


