Monday, July 23, 2018

Follow the PACKET!!!


In the last blog we discussed how a frame is processed at layer 2, from the ingress switch port to the egress switch port. In this blog let us look at layer 3, i.e. a packet being processed by an L3 or multilayer switch. Look at the number of steps a layer 3 switch has to go through to process a frame, and then imagine the speed of the box :)

The path a layer 3 packet follows through a multilayer switch is similar to that of a layer 2 switch. Each packet is pulled off an ingress queue and inspected for both its layer 2 and layer 3 destination addresses. The decision on where to forward the packet is based on two address tables.



Layer 2 forwarding table (CAM):
The frame's destination MAC address is used as an index, or key, into the CAM (content addressable memory) table. If the frame contains a layer 3 packet to be forwarded, the destination MAC address is that of a layer 3 port on the switch. In this case, the CAM table results are used only to decide that the frame should be processed at layer 3.

L3 forwarding table (FIB):
The FIB (forwarding information base) is consulted using the destination IP address as an index. The longest match in the table is found and the resulting next-hop layer 3 address is obtained. The FIB also contains each next-hop entry's layer 2 MAC address and the egress switch port (and VLAN ID), so that further table lookups are not necessary.

Always remember that, irrespective of the routing protocol, administrative distance or metric, the longest match is preferred in routing. For example, if a switch learns 10.1.1.0/25 from EIGRP and 10.1.1.0/24 from a static route, the EIGRP route is preferred over the static one for destinations inside 10.1.1.0/25, because it is the most exact (longest) match, i.e. /25.
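To make the longest-match rule concrete, here is a minimal Python sketch. The route list, the next-hop addresses and the function name are made up for illustration, and a real FIB is a hardware lookup structure rather than a Python list, so treat this only as a picture of the selection rule.

```python
import ipaddress

# Hypothetical routing entries: (prefix, source protocol, next hop).
ROUTES = [
    (ipaddress.ip_network("10.1.1.0/24"), "static", "192.0.2.1"),
    (ipaddress.ip_network("10.1.1.0/25"), "EIGRP", "192.0.2.2"),
    (ipaddress.ip_network("0.0.0.0/0"), "static", "192.0.2.254"),
]


def longest_match(destination):
    """Return the route whose prefix contains the destination and is longest."""
    address = ipaddress.ip_address(destination)
    candidates = [route for route in ROUTES if address in route[0]]
    # Protocol, administrative distance and metric play no part here:
    # only the prefix length decides between overlapping prefixes.
    return max(candidates, key=lambda route: route[0].prefixlen)


print(longest_match("10.1.1.10"))   # inside both the /24 and the /25
print(longest_match("10.1.1.200"))  # inside the /24 only
```

The first lookup picks the EIGRP /25 even though a static route covers the same address, while the second falls back to the /24 because the /25 only covers 10.1.1.0 to 10.1.1.127.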

Ternary content addressable memory (TCAM):
  *Security ACLs - ACLs can be used to identify frames according to their MAC addresses, protocol types, IP addresses and layer 4 port numbers. The TCAM holds ACLs in compiled form so that the decision on whether to forward a frame can be made in a single table lookup.

  *QoS ACLs - Other ACLs can classify incoming frames according to quality of service (QoS) parameters, control the rate of traffic flows, and mark QoS parameters in outbound frames.
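The "ternary" part of TCAM means each stored bit can be 0, 1 or "don't care". A rough Python sketch of that idea follows. The entries and names are hypothetical, only a source IP address is matched for brevity, and the loop stands in for hardware that compares all entries in parallel.

```python
import ipaddress


def ip(text):
    """Convert dotted-decimal text to a 32-bit integer."""
    return int(ipaddress.ip_address(text))


# Each entry is (value, mask, action). A mask bit of 1 means "must match",
# a mask bit of 0 means "don't care". Entries are hypothetical.
TCAM = [
    (ip("10.1.1.0"), ip("255.255.255.0"), "permit"),
    (ip("10.0.0.0"), ip("255.0.0.0"), "deny"),
    (0, 0, "deny"),  # matches anything, like the implicit deny of an ACL
]


def tcam_lookup(source_ip):
    """Return the action of the first entry matching the key."""
    key = ip(source_ip)
    for value, mask, action in TCAM:
        if (key & mask) == (value & mask):
            return action
    return None


print(tcam_lookup("10.1.1.5"))     # first entry matches
print(tcam_lookup("10.2.3.4"))     # second entry matches
print(tcam_lookup("192.168.1.1"))  # only the catch-all matches
```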

 
As with layer 2 switching, the packet must finally be placed in the appropriate egress queue on the appropriate egress switch port. The layer 3 lookup identified the next hop and found its layer 2 address. The next hop's layer 2 address must be put into the frame in place of the original destination MAC address, and the frame's layer 2 source address must be changed to that of the multilayer switch (L3 packet rewrite). Because the contents of the packet and the frame have changed, the checksums are recalculated.
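The rewrite step can be sketched in a few lines of Python. This is an illustration under assumptions, not how switch hardware is built: the frame is a plain dict, the MAC addresses and the sample header are invented, the TTL decrement is the standard router behaviour that changes the packet, and the frame checksum (FCS) is left out.

```python
import struct


def ipv4_checksum(header):
    """One's-complement sum of the header's 16-bit words, inverted."""
    total = 0
    for (word,) in struct.iter_unpack("!H", header):
        total += word
    while total > 0xFFFF:               # fold the carries back in
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def rewrite(frame, next_hop_mac, switch_mac):
    """Return a rewritten copy of the frame (assumes the TTL is above 1)."""
    header = bytearray(frame["ip_header"])
    header[8] -= 1                      # TTL lives in byte 8 of the IPv4 header
    header[10:12] = b"\x00\x00"         # zero the checksum field before summing
    header[10:12] = struct.pack("!H", ipv4_checksum(bytes(header)))
    return {
        "dst_mac": next_hop_mac,        # next hop replaces the old destination
        "src_mac": switch_mac,          # the switch becomes the layer 2 source
        "ip_header": bytes(header),
    }


# Hypothetical 20-byte IPv4 header: TTL 64, TCP, 10.1.1.10 -> 10.2.2.20.
sample = struct.pack("!BBHHHBBH4s4s", 0x45, 0, 20, 0, 0, 64, 6, 0,
                     bytes([10, 1, 1, 10]), bytes([10, 2, 2, 20]))
frame = {"dst_mac": "00:00:0c:11:22:33", "src_mac": "00:1a:2b:3c:4d:5e",
         "ip_header": sample}

out = rewrite(frame, "00:00:0c:aa:bb:cc", "00:00:0c:11:22:33")
print(out["dst_mac"], out["src_mac"], "TTL", out["ip_header"][8])
```

A quick sanity check on the result: running `ipv4_checksum` over a header that already carries a valid checksum gives 0.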



Sunday, July 22, 2018

Follow the FRAME!!!


As frames arrive on switch ports, the source MAC addresses are learned and recorded in the CAM table, along with the port of arrival, the VLAN, and a timestamp. If a MAC address learned on one port has moved to another port, the MAC address and timestamp are recorded for the most recent port, and the previous entry is deleted.


When a frame arrives at a switch port, it is placed into one of the port's ingress queues. Each queue can contain frames to be forwarded, with each queue having a different priority or service level. Critical data loss can be avoided by fine-tuning the switch port so that important frames get processed and forwarded first. As the ingress queues are serviced and a frame is pulled off, the switch must figure out not only where to forward the frame but also whether it should be forwarded, and how.




Layer 2 forwarding table (CAM):
The frame's destination MAC address is used as an index, or key, into the CAM (content addressable memory) table. The CAM table has three main columns - MAC address, egress port and VLAN. If the address is found, the egress switch port and the appropriate VLAN ID are read from the table.

Ternary content addressable memory (TCAM):
  *Security ACLs - ACLs can be used to identify frames according to their MAC addresses, protocol types, IP addresses and layer 4 port numbers. The TCAM holds ACLs in compiled form so that the decision on whether to forward a frame can be made in a single table lookup.

  *QoS ACLs - Other ACLs can classify incoming frames according to quality of service (QoS) parameters, control the rate of traffic flows, and mark QoS parameters in outbound frames.


In the next blog let's follow a packet :)

Switching Basics


It took a year and a half to start writing again !! Time just flies :)

I will follow the OSI reference model and start with the data link layer (Layer 2). What comes to mind when we talk about layer 2? Switches, hubs, bridges, frames etc.

Let us first try to understand what a collision domain and a broadcast domain are.

Collision Domain: Any part of a network where a collision can occur. When more than one host tries to talk at the same time, a collision occurs and everyone has to back off and wait before talking again. This forces every host to operate in half duplex (either send or receive).

Broadcast Domain: The set of all devices within a network that a broadcast message reaches at the data link layer.


Hub : Every port on a hub belongs to the same collision domain and the same broadcast domain.
Bridge : A bridge breaks up collision domains per port, but all ports stay in the same broadcast domain.
Switch : Each port of a switch is in a different collision domain but in the same broadcast domain; add another switch and it is also in the same broadcast domain.
Router : Each port belongs to a different collision domain and a different broadcast domain.

A Media Access Control (MAC) address is a 48-bit address used for communication between two hosts in an Ethernet environment. A MAC address has two parts: the first half is the OUI (organizationally unique identifier), assigned to hardware vendors by the IEEE, and the second half is assigned by the vendor. The address is written as 12 hexadecimal digits.
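As a small illustration of the two halves, the Python sketch below splits a MAC address into its OUI and the remaining half. The function name and the sample address are made up for the example.

```python
def split_mac(mac):
    """Split a 48-bit MAC address into (OUI, remaining half) as hex strings."""
    digits = "".join(ch for ch in mac if ch not in ":-.").lower()
    if len(digits) != 12 or any(ch not in "0123456789abcdef" for ch in digits):
        raise ValueError(f"not a 48-bit MAC address: {mac!r}")
    # 12 hex digits = 48 bits; the first 6 digits (24 bits) are the OUI.
    return digits[:6], digits[6:]


oui, device = split_mac("00:1A:2B:3C:4D:5E")
print("OUI:", oui, "device part:", device)
```

The same function also accepts the dotted notation seen in Cisco output, for example `split_mac("001a.2b3c.4d5e")`.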

                


An Ethernet switch operates at layer 2 of the OSI reference model, making decisions about forwarding frames based on the destination MAC address found within the frame. Each port can operate in full duplex, and because a switch has a collision domain per port, bandwidth is not shared. A switch must either be told explicitly where hosts are located or learn this information itself. You can configure MAC addresses statically, but this gets out of control as more hosts are added to the network.

The switch dynamically learns MAC addresses by listening to incoming frames and keeps a table of this information. When a frame is received on a switch port, the switch inspects the source MAC address and adds it to the table if it does not already have an entry. Incoming frames also include the destination MAC address, and the switch looks up an entry for it. If the address is not found in the table, the switch floods the frame to all the switch ports assigned to the VLAN (except the one it arrived on). This is known as unknown unicast flooding.
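The learn-then-forward-or-flood behaviour can be sketched as a tiny Python class. The class name, the port numbers and the shortened MAC strings are illustrative only, and aging timers are left out.

```python
class LearningSwitch:
    """Toy model of MAC learning and unknown unicast flooding."""

    def __init__(self, vlan_ports):
        self.vlan_ports = vlan_ports  # {vlan: set of ports in that VLAN}
        self.cam = {}                 # {(vlan, mac): port}

    def receive(self, port, vlan, src_mac, dst_mac):
        """Learn the source, then return the list of egress ports."""
        self.cam[(vlan, src_mac)] = port            # learn or refresh the source
        egress = self.cam.get((vlan, dst_mac))
        if egress is None:
            # Unknown unicast: flood to every port in the VLAN except ingress.
            return sorted(self.vlan_ports[vlan] - {port})
        if egress == port:
            return []                               # destination is on the ingress port
        return [egress]


switch = LearningSwitch({10: {1, 2, 3}})
print(switch.receive(1, 10, "aa", "bb"))  # "bb" unknown, so flood
print(switch.receive(2, 10, "bb", "aa"))  # "aa" was learned on port 1
print(switch.receive(1, 10, "aa", "bb"))  # "bb" is now known on port 2
```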


Further reading :
https://learningnetwork.cisco.com/docs/DOC-30227



Tuesday, October 27, 2015

Switch-module Configuration Protocol (SCP) and line cards powered down


How many times have we seen modules on a switch go down after a reboot? This can happen for a couple of reasons. In this article I'll talk about modules being powered down due to SCP communication failures.

SCP, the Switch-module Configuration Protocol, is used for communication between the switch processor and the line cards over the Ethernet out-of-band channel (EOBC). The EOBC is a bus on the chassis, and line cards communicate with the supervisor only through this bus. Cisco 65xx series switches have a single EOBC, which operates in half duplex.

Whenever you see a module powered down, check for SCP or keep-alive polling failures. These indicate a communication issue between the supervisor and the line cards.

Error message:
 Oct 21 09:43:44.121 utc: %ONLINE-SP-6-REGN_TIMER: Module 3, Proc. 0. Failed to bring online because of registration timer event
Oct 21 09:43:44.121 utc: %C6KPWR-SP-4-DISABLED: power to module in slot 3 set off (Module  Failed SCP dnld)
 
Oct 24 09:43:44.121 utc:%C6KPWR-SP-4-DISABLED: power to module in slot 2 set off (Module not responding to Keep Alive polling)

Troubleshooting:

1. Check the SCP counters. If error-related counters such as retransmitted packets keep increasing, the EOBC is congested.


Switch1#remote command switch show scp counters
received packets            = 6398025
transmitted packets         = 1801147
retransmitted packets       = 110
fast retransmitted packets  = 0
loop back packets           = 4956282
transmit failures           = 0
recv pkts not for me        = 0
recv pkts to dead process   = 0
recv pkts not enqueuable    = 0
response has wrong opcode   = 0
response has wrong seqnum   = 0
response is not an ack      = 0
response is too big         = 38975
received expedited packets  = 0
transmitted expedited pkts  = 0
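Since what matters is whether the counters grow between two runs of the command, a small helper can do the comparison. The Python sketch below parses the "name = value" lines and reports the counters that increased. The second snapshot is invented for the example, and which counters deserve attention is a judgment call.

```python
def parse_counters(text):
    """Turn 'name = value' lines of the SCP counters output into a dict."""
    counters = {}
    for line in text.splitlines():
        name, sep, value = line.partition("=")
        if sep and value.strip().isdigit():
            counters[name.strip()] = int(value)
    return counters


def increased(before, after):
    """Return {counter name: delta} for counters that grew between snapshots."""
    return {name: after[name] - before[name]
            for name in after
            if name in before and after[name] > before[name]}


first = """retransmitted packets       = 110
transmit failures           = 0"""
second = """retransmitted packets       = 145
transmit failures           = 0"""   # hypothetical later snapshot

print(increased(parse_counters(first), parse_counters(second)))
```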

2. Check the per-module SCP receive/transmit counters and look for incrementing SCP retries.

Switch1#remote command switch show scp status
Rx 6411004,  Tx 1804882,  scp_my_addr 0x4
Id Sap      Channel name    current/peak/retry/dropped/total  time(queue/process/ack)
-- ---- ------------------- --------------------------------  ----------------------
0  11   SCP Unsolicited:11      0/    0/    0/      0/    0      0/   0/   0
1  20   SCP Unsolicited:20      0/    0/    0/      0/    0      0/   0/   0
2  0    SCP Unsolicited:0       0/    3/    0/      0/2447294      0/   0/8244
3  2    SCP Unsolicited:2       0/    4/    0/      0/2516140      0/   0/   0
4  21   SCP Unsolicited:21      0/    0/    0/      0/    0      0/   0/   0
5  16   SCP Unsolicited:16      0/    0/    0/      0/    0      0/   0/   0
6  1    SCP Unsolicited:1       0/    4/    0/      0/18962      0/   0/ 236
7  18   SCP Unsolicited:18      0/    0/    0/      0/    0      0/   0/   0
8  17   SCP Unsolicited:17      0/    0/    0/      0/    0      0/   0/   0
9  33   SCP async: LCP#5        0/   39/    0/      0/652887    152/  40/   8
10 32   SCP async: LCP#1        0/  150/    0/      0/128237    456/ 232/ 228
11 36   SCP async: LCP#8        0/  150/    0/      0/99088    444/ 228/ 228
12 35   SCP async: LCP#9        0/  150/    0/      0/98919    816/ 228/ 228
13 37   SCP async: LCP#2        0/  150/    0/      0/126100    828/ 228/ 228
14 41   SCP async: LCP#7        0/   17/    0/      0/86316    204/ 228/ 228

3. SCP ping from supervisor to module

Switch1#remote command switch test scp ping 3
pinging addr 3(0x3)
assigned sap 0x28
no response from addr 3(0x3)     //communication between supervisor and line card is having issue

Switch1#remote command switch test scp ping 1
pinging addr 1(0x1)
assigned sap 0x28
addr 1(0x1) is alive                 //communication between supervisor and line card is good


4. Change diagnostic level to complete and reseat the module

Switch1(config)#diagnostic level complete

Switch1#show diagnostic result module 2 | inc Diagnostic
  Overall Diagnostic Result for Module 2 : PASS
  Diagnostic level at card bootup: complete


 



Further reading:
EOBC interface





Tuesday, October 20, 2015

Etherchannel load balancing decision


EtherChannel, as most of you know, allows multiple physical Ethernet links to be combined into one logical channel. The advantage is not only load sharing of traffic among the links in the channel but also redundancy in the event that one or more links in the channel fail.




How traffic flows over the bundled physical links is the key topic. EtherChannel uses a hash algorithm to achieve this. The hash algorithm computes a value in the range 0 to 7, and this result is called the Result Bundle Hash (RBH). A particular port is chosen based only on this value. The hashing algorithm is deterministic: if you use the same addresses and session information, you always hash to the same port in the channel. This prevents out-of-order packet delivery.

The table below shows how the eight hash values are shared among the ports for a given number of ports. A maximum of 8 ports can be bundled.


Number of Ports in the EtherChannel    Load Balancing
8                                      1:1:1:1:1:1:1:1
7                                      2:1:1:1:1:1:1
6                                      2:2:1:1:1:1
5                                      2:2:2:1:1
4                                      2:2:2:2
3                                      3:3:2
2                                      4:4
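The ratios in the table are simply the eight RBH values divided as evenly as possible among the ports. A short Python sketch reproduces them. The function names are made up, and which RBH value lands on which port is platform specific, so the mapping in `pick_port` is only an assumption for illustration.

```python
def bucket_ratio(ports, buckets=8):
    """Share out the hash buckets: the first ports take one extra if uneven."""
    base, extra = divmod(buckets, ports)
    return [base + 1 if index < extra else base for index in range(ports)]


def pick_port(rbh, ports):
    """Map an RBH value (0-7) to a port index by walking the ratio in order."""
    owners = [port
              for port, share in enumerate(bucket_ratio(ports))
              for _ in range(share)]
    return owners[rbh]


for ports in range(8, 1, -1):
    print(ports, ":".join(str(share) for share in bucket_ratio(ports)))
```

With 3 ports, for instance, the first two ports each own three RBH values and the third owns two, so the load can never be split perfectly evenly.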


EtherChannel load balancing can use MAC addresses, IP addresses, or Layer 4 port numbers, and you should be very careful when choosing one. Once applied, the method applies to all the EtherChannels on the switch. The command below can be used to check the load balancing method used on the switch.

sh etherchannel load-balance

Let's take a few examples, analyzing the traffic and which method suits best for load balancing to work effectively.

1. Users are sending files to the same file share.
   If the destination MAC address is used, the same link in the channel is chosen each time.
   It is better to use the source MAC or IP address for better load sharing.

2. Communication is between two hosts with different port numbers.
    In this scenario, using the IP address or MAC address will overload one link in the bundle. Load balancing can be achieved only when port numbers are used.
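The first scenario is easy to simulate. In the Python sketch below, the hash is a stand-in (XOR of the inputs, keeping the low 3 bits) because the real algorithm is platform specific, and the MAC addresses, the link count and the modulo mapping to links are all invented for the example.

```python
from collections import Counter


def rbh(*fields):
    """Illustrative hash: XOR the inputs and keep the low 3 bits (0-7)."""
    value = 0
    for field in fields:
        value ^= field
    return value & 0b111


FILE_SERVER = 0x00000C000001                          # hypothetical server MAC
USERS = [0x00000C000100 + n for n in range(16)]       # hypothetical user MACs
LINKS = 4

# Destination-based: every user hashes on the same server address.
by_dst = Counter(rbh(FILE_SERVER) % LINKS for _ in USERS)
# Source-based: each user hashes on its own address.
by_src = Counter(rbh(user) % LINKS for user in USERS)

print("dst-mac:", dict(by_dst))
print("src-mac:", dict(by_src))
```

Hashing on the destination puts all sixteen flows on one link, while hashing on the source spreads them over all four.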

Identifying the physical interface in a bundle that a particular flow uses is very useful when troubleshooting issues in a switched environment. A few days back I came across an issue that shows the importance of finding the physical interface in a bundle. An EtherChannel was built between two 65xx switches, with each physical interface terminated on a different module (2 and 3). A layer 3 VLAN was configured on the switches with HSRP. A few host machines on the same segment became unreachable from the standby switch. Upon further troubleshooting we found that hosts whose traffic flowed over the physical interface in module 2 of the bundle were unreachable, whereas the ones flowing over the interface in module 3 were reachable. We tried replacing the GBIC, cable etc., but the result remained the same until we reseated module 2.


The command to check which physical interface in the bundle the algorithm selects is model specific and is listed in the URL mentioned under further reading.





Further Reading:
http://www.cisco.com/c/en/us/support/docs/lan-switching/etherchannel/116385-technote-etherchannel-00.html

Thursday, October 8, 2015

Cisco Router/Switch performance by platform/Model


Routers and switches from multiple vendors are available in the market, but the most commonly used are from Cisco. There are multiple reasons why one selects Cisco products over the others, like stability, performance, support etc. Extra care needs to be taken while selecting a network device.

Network architects plan and design the network based on customer needs. They decide which device to use, which features should be enabled, which licenses are required, capacity management etc. As the business expands, device throughput is one key factor that sometimes gets overlooked and might cause a major outage or performance issues.

How does this happen? For instance, a site was set up with a Cisco 1841 router and a 30 Mbps link. Over time the number of users at the site increased, and a few servers were set up that are accessed by external users. The bandwidth is no longer sufficient, so they decide on an upgrade (45 Mbps). After the upgrade, users complain of intermittent connectivity issues, slowness, drops etc. The service provider tests the circuit and confirms there is no issue.

Where is the problem then? Yes, throughput now comes into the picture. The Cisco 1841 platform supports a throughput of about 38 Mbps and the link bandwidth is 45 Mbps, which means the router can receive more traffic than it can handle. It's like a person who can juggle 5 glasses: start throwing additional glasses at him and he will not be able to handle them and will start dropping. It's the same with Cisco devices. If a device receives more traffic than it can handle, packets will be dropped. You will also notice that the CPU utilization of the device goes high, usually with the IP Input process or interrupts consuming most of the CPU.

When calculating the throughput of a device, Cisco uses IP only and a 64-byte packet size. Switching performance is given in packets per second (PPS), and the Mbps figure follows from PPS x 64 bytes x 8 bits. If additional features like access lists, encryption etc. are added, throughput might decrease.
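The conversion between the PPS and Mbps columns of the tables below is simple arithmetic on 64-byte packets. A short Python sketch, with hypothetical function names, makes it explicit and applies it to the 1841 example.

```python
def pps_to_mbps(pps, packet_bytes=64):
    """Convert packets per second to Mbps for a fixed packet size."""
    return pps * packet_bytes * 8 / 1_000_000


def link_exceeds_router(link_mbps, router_pps):
    """True when the link can deliver more than the router can switch."""
    return link_mbps > pps_to_mbps(router_pps)


print(pps_to_mbps(75_000))              # Cisco 1841, Fast/CEF switching
print(link_exceeds_router(30, 75_000))  # the original 30 Mbps link
print(link_exceeds_router(45, 75_000))  # after the upgrade to 45 Mbps
```

The 1841's 75,000 PPS works out to 38.4 Mbps, which is above the original 30 Mbps link but below the upgraded 45 Mbps one. Real traffic has larger average packets, so these figures are a conservative baseline rather than an exact limit.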

1. Cisco Router performance Matrix
Platform                Process Switching       Fast/CEF Switching
                        PPS         Mbps        PPS               Mbps
14xx                    600         0.3072      4,000             2.05
160x(-R)                600         0.3072      4,000             2.05
1701                    1,700       0.8704      12,000            6.14
1710                    1,300       0.6656      7,000             3.58
1711-1712               1,700       0.8704      13,500            6.91
1720                    1,400       0.7168      8,500             4.35
1721                    1,700       0.8704      12,000            6.14
1750                    1,400       0.7168      8,500             4.35
1751                    1,500       0.768       12,000            6.14
1760                    1,700       0.8704      16,000            8.19
1801-1812               -           -           70,000            35.84
1841                    -           -           75,000            38.4
1861                    -           -           146,142           74.82
1941                    -           -           299,000           153.08
2500                    800         0.4096      4,400             2.25
261x                    1,500       0.768       15,000            7.68
262x                    1,500       0.768       30,000            15.36
265x                    2,000       1.024       40,000            20.48
2691                    7,400       3.7888      70,000            35.84
2801                    3,000       1.536       90,000            46.08
2811                    3,000       1.536       120,000           61.44
2821                    11,500      5.888       170,000           87.04
2851                    15,000      7.68        220,000           112.64
3620                    2,000       1.024       20,000-40,000     10-20
2901                    -           -           327,000           167.42
2911                    -           -           353,000           180.73
2921                    -           -           480,000           245.76
2951                    -           -           580,000           296.96
3640                    4,000       2.048       50-70,000         25.6-36
3660                    12,000      6.144       100-120,000       51.2-61.4
3631                    4,000       2.048       50-70,000         25.6-36
3725                    -           -           100-120,000       51.2-61.4
3745                    -           -           225-250,000       25.6-36
3810                    2,000       1.024       8,000             4.1
3810-V3                 3,000       1.536       15,000            7.68
3825                    25,000      12.8        350,000           179.2
3845                    35,000      17.92       500,000           256
3925                    -           -           833,000           426.49
3945                    -           -           982,000           502.78
4000                    1,800       0.9216      14,000            7.17
7120                    13,000      6.656       175,000           89.6
7140                    20,000      10.24       300,000           153.6
7200-NPE100             7,000       3.584       100,000           51.2
7200-NPE150             10,000      5.12        150,000           76.8
7200-NPE175             9,000       4.608       177,848           91.06
7200-NPE200             13,000      6.656       200,000           102.4
7200-NPE225             13,000      6.656       233,170           119.38
7200-NPE300             20,000      10.24       353,000           180.74
7300-MPLS-Experts-100   -           -           3,500,000(PXF)    1,792
7600-MSFC2              20,000      10.24       30,000,000        1,792
ASR1000-PRE4            -           -           10,000,000        5,120
12000(Engine 6)         -           -           50,000,000        20,000
CRS-1 LC                -           -           80,000,000        40,960



 2. Cisco Switch performance Matrix
Platform                      Switch Performance (pps)  Switch Fabric (Gbps)  L2/L3
1900/2800                     550,000                   1                     L2
2900-XL                       3,000,000                 3.2                   L2
2900-LRE                      3,000,000                 5                     L2
2940-8(TT,TF)                 2,700,000                 3.2                   L2
2950-ST8-LRE                  3,200,000                 8.8                   L2
2950-ST24-LRE                 3,500,000                 8.8                   L2
2950(12,24)                   6,600,000                 8.8                   L2
2950-48                       10,100,000                13.6                  L2
2950G(12,24)                  10,100,000                8.8                   L2
2950G-48                      10,100,000                13.6                  L2
2955-12(C,S)                  2,000,000                 13.6                  L2
2955-12(T)                    4,800,000                 13.6                  L2
2960-8                        2,700,000                 16                    L2
2960-24                       6,500,000                 32                    L2
2960-48                       10,100,000                32                    L2
2960G-8                       11,900,000                32                    L2
2960G-24                      35,700,000                32                    L2
2960G-48                      39,000,000                32                    L2
2970G-24(T)                   35,700,000                24                    L2
2970G-24(TS)                  38,700,000                28                    L2
3500-XL                       8,000,000                 10.8                  L2
3550-24                       6,600,000                 8.8                   L3
3550-48                       10,100,000                13.6                  L3
3550-12T/G                    17,000,000                24                    L3
3560-24TS/PS                  6,600,000                 8.8                   L3
3560-48TS/PS                  13,100,000                17.6                  L3
3560G-24TS/PS                 38,700,000                32                    L3
3560G-48TS/PS                 38,700,000                32                    L3
3560E-24TD/PD                 65,500,000                128                   L3
3560E-48TD/PD                 101,200,000               128                   L3
3750-24TS/PS                  6,500,000                 32                    L3
3750-48TS/PS                  13,100,000                32                    L3
3750G-24TS/PS                 38,700,000                32                    L3
3750G-48TS/PS                 38,700,000                32                    L3
3750G-24T                     35,700,000                32                    L3
3750G-12S                     17,800,000                32                    L3
3750G-16TD                    35,700,000                32                    L3
3750E-24TD/PD                 65,500,000                128                   L3
3750E-48TD/PD                 101,200,000               128                   L3
2948G/2980G/4912              18,000,000                24                    L2
2948G-L3                      10,000,000                22                    L3
2948G-GE-TX                   29,760,000                20                    L2
4003w/Sup I                   18,000,000                24                    L2
4006 w/Sup II                 18,000,000                24x3                  L2
4006 w/SupIII or IV           48,000,000                64                    L3
4503 w/Sup II                 18,000,000                24                    L2
4503 w/Sup III/IV             21,000,000                28                    L3
4503 w/Sup II-Plus-TS         48,000,000                64                    L3
4506 w/Sup II                 18,000,000                24                    L3
4506 w/Sup III/IV             48,000,000                64                    L3
4X06/4503/4507-R              48,000,000                64                    L3
4507 w/SupIV                  48,000,000                64                    L3
45XX/4006 w/SupV              72,000,000                96                    L3
45XX w/SupV-10GE              102,000,000               136                   L3
4908G-L3(L2/L3)               11,000,000                22                    L3
4948                          72,000,000                96                    L3
5500-SupI/II                  1,000,000                 1.2                   L3
5500-SupIII                   2,200,000                 3.6                   L3
600X,65XX Sup1/1A             15,000,000                32                    L2
600X,65XX Sup1/1A-MSFC        15,000,000                32                    L3
65XX/76XX Sup32               15,000,000                32                    L3
65XX/76XX Sup2                30,000,000                256                   L2
65XX/76XX Sup2-MSFC           30,000,000                256                   L3
65XX/76XX-w/Sup 720           400,000,000               720                   L3
8510(IP/IPX/IPMC/Bridging)    6,000,000                 10                    L3
8540(IP/IPX/IPMC/Bridging)    24,000,000                40                    L3