Tuesday, October 27, 2015

Switch-module Configuration Protocol (SCP) and line cards powered down

Most of us have seen modules on a switch go down after a reboot. This can happen for a couple of reasons. In this article I'll talk about modules being powered down due to SCP communication failures.

SCP here is the Switch-module Configuration Protocol (not to be confused with Secure Copy Protocol). It is used for communication between the switch processor and the line cards over the Ethernet Out-of-Band Channel (EOBC). The EOBC is a bus on the chassis, and line cards communicate with the supervisor only through this bus. Cisco 65xx series switches have a single EOBC, which operates in half duplex.

Whenever you see a module powered down, check for SCP or keep-alive polling failures. They indicate a problem in communication between the supervisor and the line cards.

Error messages:
Oct 21 09:43:44.121 utc: %ONLINE-SP-6-REGN_TIMER: Module 3, Proc. 0. Failed to bring online because of registration timer event
Oct 21 09:43:44.121 utc: %C6KPWR-SP-4-DISABLED: power to module in slot 3 set off (Module  Failed SCP dnld)

Oct 24 09:43:44.121 utc: %C6KPWR-SP-4-DISABLED: power to module in slot 2 set off (Module not responding to Keep Alive polling)
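
When many such lines are buried in a large syslog file, it can help to pull out the slot number and the reason automatically. The following is a minimal sketch, assuming the power-off message always has the shape shown above; the function names are hypothetical and only the two reasons from the sample logs are treated as SCP related.

```python
import re

# Matches the C6KPWR "power ... set off" message from the logs above.
# Assumption: the slot number and the parenthesised reason always look like this.
POWER_OFF_RE = re.compile(
    r"%C6KPWR-SP-4-DISABLED: power to module in slot (?P<slot>\d+) set off "
    r"\((?P<reason>[^)]*)\)"
)


def parse_power_off(line):
    """Return (slot, reason) for a power-off message, or None if the line does not match."""
    match = POWER_OFF_RE.search(line)
    if match is None:
        return None
    # Collapse repeated spaces, e.g. "Module  Failed SCP dnld".
    reason = " ".join(match.group("reason").split())
    return int(match.group("slot")), reason


def is_scp_related(reason):
    """Rough check: SCP download failure or keep-alive polling failure."""
    text = reason.lower()
    return "scp" in text or "keep alive" in text


if __name__ == "__main__":
    sample = (
        "Oct 21 09:43:44.121 utc: %C6KPWR-SP-4-DISABLED: power to module "
        "in slot 3 set off (Module  Failed SCP dnld)"
    )
    print(parse_power_off(sample))
```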

Troubleshooting:

1. Check the SCP counters. Run the command a few times; if the counters marked in bold keep increasing, the EOBC is congested.


Switch1#remote command switch show scp counters
received packets            = 6398025
transmitted packets         = 1801147
retransmitted packets       = 110
fast retransmitted packets  = 0
loop back packets           = 4956282
transmit failures           = 0
recv pkts not for me        = 0
recv pkts to dead process   = 0
recv pkts not enqueuable    = 0
response has wrong opcode   = 0
response has wrong seqnum   = 0
response is not an ack      = 0
response is too big         = 38975
received expedited packets  = 0
transmitted expedited pkts  = 0
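
Since what matters is whether counters grow between runs, two snapshots of this output can be compared with a small script. This is only a sketch: the function names are hypothetical, and the list of counters worth watching is passed in by the caller because it is an assumption, not something the output itself tells you.

```python
def parse_scp_counters(output):
    """Parse 'name = value' lines of 'show scp counters' into a dict."""
    counters = {}
    for line in output.splitlines():
        name, sep, value = line.partition("=")
        value = value.strip()
        # Skip the prompt line and anything that is not 'name = <digits>'.
        if sep and value.isdigit():
            counters[name.strip()] = int(value)
    return counters


def increasing_counters(before, after, names):
    """Return {name: delta} for the watched counters that grew between two snapshots."""
    deltas = {}
    for name in names:
        delta = after.get(name, 0) - before.get(name, 0)
        if delta > 0:
            deltas[name] = delta
    return deltas


if __name__ == "__main__":
    first = parse_scp_counters("retransmitted packets       = 110\n")
    second = parse_scp_counters("retransmitted packets       = 150\n")
    # The watch list below is an illustrative assumption.
    print(increasing_counters(first, second, ["retransmitted packets"]))
```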

2. Check the per-module SCP receive/transmit counters and look for incrementing SCP retries.

Switch1#remote command switch show scp status
Rx 6411004,  Tx 1804882,  scp_my_addr 0x4
Id Sap      Channel name    current/peak/retry/dropped/total  time(queue/process/ack)
-- ---- ------------------- --------------------------------  ----------------------
0  11   SCP Unsolicited:11      0/    0/    0/      0/    0      0/   0/   0
1  20   SCP Unsolicited:20      0/    0/    0/      0/    0      0/   0/   0
2  0    SCP Unsolicited:0       0/    3/    0/      0/2447294      0/   0/8244
3  2    SCP Unsolicited:2       0/    4/    0/      0/2516140      0/   0/   0
4  21   SCP Unsolicited:21      0/    0/    0/      0/    0      0/   0/   0
5  16   SCP Unsolicited:16      0/    0/    0/      0/    0      0/   0/   0
6  1    SCP Unsolicited:1       0/    4/    0/      0/18962      0/   0/ 236
7  18   SCP Unsolicited:18      0/    0/    0/      0/    0      0/   0/   0
8  17   SCP Unsolicited:17      0/    0/    0/      0/    0      0/   0/   0
9  33   SCP async: LCP#5        0/   39/    0/      0/652887    152/  40/   8
10 32   SCP async: LCP#1        0/  150/    0/      0/128237    456/ 232/ 228
11 36   SCP async: LCP#8        0/  150/    0/      0/99088    444/ 228/ 228
12 35   SCP async: LCP#9        0/  150/    0/      0/98919    816/ 228/ 228
13 37   SCP async: LCP#2        0/  150/    0/      0/126100    828/ 228/ 228
14 41   SCP async: LCP#7        0/   17/    0/      0/86316    204/ 228/ 228
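
The retry and dropped columns are easy to miss in this dense output. Below is a rough sketch of a parser for the channel rows, assuming the column layout shown above (current/peak/retry/dropped/total followed by queue/process/ack); the function names are hypothetical.

```python
import re

# One channel row: Id, Sap, channel name, five '/'-separated counters, three timers.
ROW_RE = re.compile(
    r"^\s*(\d+)\s+(\d+)\s+(.+?)\s+"
    r"(\d+)/\s*(\d+)/\s*(\d+)/\s*(\d+)/\s*(\d+)\s+"
    r"(\d+)/\s*(\d+)/\s*(\d+)\s*$"
)
FIELDS = ("id", "sap", "name", "current", "peak", "retry", "dropped",
          "total", "queue", "process", "ack")


def parse_scp_status(output):
    """Return one dict per channel row; header and separator lines are skipped."""
    rows = []
    for line in output.splitlines():
        match = ROW_RE.match(line)
        if match is None:
            continue
        row = dict(zip(FIELDS, match.groups()))
        # Everything except the channel name is numeric.
        for key in FIELDS:
            if key != "name":
                row[key] = int(row[key])
        rows.append(row)
    return rows


def channels_with_errors(rows):
    """Names of channels showing retries or drops."""
    return [row["name"] for row in rows if row["retry"] > 0 or row["dropped"] > 0]
```

Running this on two snapshots and comparing the retry values per channel shows which line card's retries are incrementing.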

3. Run an SCP ping from the supervisor to the module.

Switch1#remote command switch test scp ping 3
pinging addr 3(0x3)
assigned sap 0x28
no response from addr 3(0x3)     // communication between the supervisor and the line card is failing

Switch1#remote command switch test scp ping 1
pinging addr 1(0x1)
assigned sap 0x28
addr 1(0x1) is alive                 // communication between the supervisor and the line card is good
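
If the ping is run against every slot from a script, the two outcomes above can be told apart with a simple text match. A small sketch, assuming the output always contains one of the two phrases shown; the function name is hypothetical.

```python
import re


def scp_ping_alive(output):
    """True if the module answered, False if it did not, None if neither phrase is present."""
    if re.search(r"addr \d+\(0x[0-9a-fA-F]+\) is alive", output):
        return True
    if re.search(r"no response from addr \d+\(0x[0-9a-fA-F]+\)", output):
        return False
    return None


if __name__ == "__main__":
    print(scp_ping_alive("pinging addr 1(0x1)\nassigned sap 0x28\naddr 1(0x1) is alive"))
```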


4. Change diagnostic level to complete and reseat the module

Switch1(config)#diagnostic level complete

Switch1#show diagnostic result module 2 | inc Diagnostic
  Overall Diagnostic Result for Module 2 : PASS
  Diagnostic level at card bootup: complete
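
To collect this summary for several modules, the two lines can be parsed as follows. This is a sketch that assumes the wording of the filtered output shown above; the function name is hypothetical.

```python
import re

RESULT_RE = re.compile(r"Overall Diagnostic Result for Module (\d+)\s*:\s*(\S+)")
LEVEL_RE = re.compile(r"Diagnostic level at card bootup:\s*(\S+)")


def parse_diag_summary(output):
    """Extract module number, overall result and bootup diagnostic level."""
    result = RESULT_RE.search(output)
    level = LEVEL_RE.search(output)
    return {
        "module": int(result.group(1)) if result else None,
        "result": result.group(2) if result else None,
        "bootup_level": level.group(1) if level else None,
    }
```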


 



Further reading:
EOBC interface





Tuesday, October 20, 2015

Etherchannel load balancing decision


EtherChannel, as most of you know, combines multiple physical Ethernet links into one logical channel. This not only allows load sharing of traffic among the links in the channel but also provides redundancy in the event that one or more links in the channel fail.




How the traffic flows over the bundled physical links is the key topic. EtherChannel uses a hash algorithm to achieve this. The hash algorithm computes a value in the range 0 to 7, and this result is called the Result Bundle Hash (RBH). The port is chosen based on this value alone. The hashing algorithm is deterministic: if you use the same addresses and session information, you always hash to the same port in the channel. This prevents out-of-order packet delivery.
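
To make the idea of a deterministic 3-bit hash concrete, here is a toy sketch. It is not the algorithm any particular Cisco platform uses; it simply XORs the last octets of two addresses and keeps the low 3 bits, which is enough to show that the same inputs always give the same value in the range 0 to 7. The function name is hypothetical.

```python
def toy_rbh(src_last_octet, dst_last_octet):
    """Toy stand-in for the Result Bundle Hash: XOR the octets, keep the low 3 bits."""
    return (src_last_octet ^ dst_last_octet) & 0x7


if __name__ == "__main__":
    # The same address pair always maps to the same value.
    print(toy_rbh(0x1A, 0x2F), toy_rbh(0x1A, 0x2F))
```

Because the result depends only on the inputs, every packet of a given flow takes the same link, which is what keeps packets in order.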

The table below shows the share of the eight hash values that each port takes. The maximum number of ports that can be bundled is 8.


Number of ports in the EtherChannel    Load balancing
8                                      1:1:1:1:1:1:1:1
7                                      2:1:1:1:1:1:1
6                                      2:2:1:1:1:1
5                                      2:2:2:1:1
4                                      2:2:2:2
3                                      3:3:2
2                                      4:4
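
The ratios in the table are what you get when eight hash values are spread as evenly as possible over the ports. A short sketch that reproduces them; the function name is hypothetical, and which port ends up with the extra value is an assumption of the sketch.

```python
def bucket_ratio(num_ports, buckets=8):
    """Spread `buckets` hash values over `num_ports` ports as evenly as possible."""
    if not 1 <= num_ports <= buckets:
        raise ValueError("num_ports must be between 1 and the number of buckets")
    base, extra = divmod(buckets, num_ports)
    # The first `extra` ports each take one additional hash value.
    return [base + 1 if i < extra else base for i in range(num_ports)]


if __name__ == "__main__":
    for ports in range(8, 1, -1):
        print(ports, ":".join(str(n) for n in bucket_ratio(ports)))
```

Only bundles of 2, 4 or 8 ports divide the eight values evenly, which is why those sizes give the most balanced result.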


EtherChannel load balancing can use MAC addresses, IP addresses, or Layer 4 port numbers, and you should choose carefully: once applied, the method applies to all the EtherChannels on the switch. The command below shows the load-balancing method in use on the switch.

sh etherchannel load-balance

Let's take a few examples, analyzing the traffic and which method suits it best for load balancing to work effectively.

1. Users are sending files to the same file share.
   If the destination MAC address is used, the same link in the channel is chosen each time.
   It is better to use the source MAC or IP address for better load sharing.

2. Communication is between two hosts using different port numbers.
   In this scenario, using the IP or MAC address will overload one link in the bundle. Load balancing can be achieved only when port numbers are used.
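
The two scenarios above can be simulated with a toy hash. This is an illustration only: the XOR-and-mask hash, the address bytes and the port numbers are all made-up assumptions, not the hash a real switch uses.

```python
def bucket(*fields):
    """Toy hash: XOR the given fields and keep the low 3 bits (a value from 0 to 7)."""
    value = 0
    for field in fields:
        value ^= field
    return value & 0x7


# Scenario 1: eight clients (last MAC byte 0x10..0x17) send to one file server (0xA0).
server = 0xA0
clients = range(0x10, 0x18)
by_dst_mac = {bucket(server) for _ in clients}       # hash on destination only
by_src_mac = {bucket(client) for client in clients}  # hash on source only

# Scenario 2: two hosts exchange eight sessions that differ only in source port.
host_a, host_b = 0x0A, 0x14
src_ports = range(50000, 50008)
by_ip = {bucket(host_a, host_b) for _ in src_ports}  # hash on the address pair
by_port = {bucket(port, 445) for port in src_ports}  # hash on the port pair

if __name__ == "__main__":
    print("scenario 1:", len(by_dst_mac), "link by destination,", len(by_src_mac), "by source")
    print("scenario 2:", len(by_ip), "link by address,", len(by_port), "by port")
```

In both scenarios the key that varies between flows is the one that spreads traffic, so the choice of method should follow the traffic pattern on the switch.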

Identifying the physical interface that a particular flow uses within a bundle is very useful when troubleshooting issues in a switched environment. A few days back I came across an issue that shows why. An EtherChannel was built between two 65xx switches, with each physical interface terminated on a different module (2 and 3). A Layer 3 VLAN interface was configured on the switches with HSRP. A few host machines on the same segment became unreachable from the standby switch. Upon further troubleshooting we found that any traffic flowing over the physical interface on module 2 of the bundle failed, whereas traffic flowing over the interface on module 3 got through. We tried replacing the GBIC, the cable and so on, but the result remained the same until we reseated module 2.


The command to check which physical interface in the bundle the algorithm selects is model specific and is listed at the URL under Further Reading.





Further Reading:
http://www.cisco.com/c/en/us/support/docs/lan-switching/etherchannel/116385-technote-etherchannel-00.html

Thursday, October 8, 2015

Cisco Router/Switch performance by platform/Model


Routers and switches from multiple vendors are available in the market, but the most commonly used are Cisco's. There are multiple reasons why one selects Cisco products over the others, such as stability, performance and support. Extra care needs to be taken while selecting a network device.

Network architects plan and design the network based on customer needs. They decide which device to use, which features should be enabled, which licenses are required, how capacity is managed and so on. As the business expands, device throughput is one key factor that sometimes gets overlooked and can cause a major outage or performance issues.

How does this happen? For instance, a site was set up with a Cisco 1841 router and a 30 Mbps link. Over time the number of users at the site increased, and a few servers were set up that were accessed by external users. The bandwidth is no longer sufficient, so they decide on an upgrade (45 Mbps). After the upgrade, users complain of intermittent connectivity issues, slowness, drops and so on. The service provider tests the circuit and confirms there is no issue.

Where is the problem, then? This is where throughput comes into the picture. The Cisco 1841 platform supports a throughput of about 38 Mbps and the link bandwidth is 45 Mbps, which means the router can receive more traffic than it can handle. It's like a person who can juggle 5 glasses: if you start throwing additional glasses at him, he cannot handle them and will start dropping them. It's the same with Cisco devices. If a device receives more traffic than it can handle, packets will be dropped. You will also notice that the CPU utilization of the device goes high, usually with the IP Input process or interrupts consuming most of the CPU.
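
The check that was missed in this story is a one-line comparison. A minimal sketch, using the 38.4 Mbps figure listed for the 1841 in the router matrix of this post; the function name is hypothetical.

```python
def is_router_bottleneck(platform_mbps, link_mbps):
    """True if the link can deliver more traffic than the platform can forward."""
    return link_mbps > platform_mbps


if __name__ == "__main__":
    cisco_1841_mbps = 38.4  # Fast/CEF figure for the 1841 from the matrix in this post
    print("30 Mbps link:", is_router_bottleneck(cisco_1841_mbps, 30))
    print("45 Mbps link:", is_router_bottleneck(cisco_1841_mbps, 45))
```

With the original 30 Mbps link the router had headroom; after the upgrade to 45 Mbps the router itself became the limit.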

When calculating the throughput of a device, Cisco uses IP-only traffic and a 64-byte packet size. Switching performance is given in packets per second. If additional features such as access lists or encryption are enabled, throughput may decrease.
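
The Mbps figures in the matrix follow from the packets-per-second figures and the 64-byte packet size: Mbps = pps x 64 bytes x 8 bits / 1,000,000. A small sketch of the conversion in both directions; the function names are hypothetical.

```python
def pps_to_mbps(pps, packet_bytes=64):
    """Convert packets per second to megabits per second for a given packet size."""
    return pps * packet_bytes * 8 / 1_000_000


def mbps_to_pps(mbps, packet_bytes=64):
    """Packets per second needed to fill a link of `mbps` with packets of this size."""
    return mbps * 1_000_000 / (packet_bytes * 8)


if __name__ == "__main__":
    print(pps_to_mbps(75_000))  # Cisco 1841 Fast/CEF figure
    print(mbps_to_pps(45))      # pps needed to fill a 45 Mbps link with 64-byte packets
```

For the 1841, 75,000 pps works out to 38.4 Mbps, while a 45 Mbps link filled with 64-byte packets would need roughly 87,900 pps, which is more than the platform is rated for.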

1. Cisco Router performance Matrix
Platform                 Process switching        Fast/CEF switching
                         PPS         Mbps         PPS                 Mbps
14xx                     600         0.3072       4,000               2.05
160x(-R)                 600         0.3072       4,000               2.05
1701                     1,700       0.8704       12,000              6.14
1710                     1,300       0.6656       7,000               3.58
1711-1712                1,700       0.8704       13,500              6.91
1720                     1,400       0.7168       8,500               4.35
1721                     1,700       0.8704       12,000              6.14
1750                     1,400       0.7168       8,500               4.35
1751                     1,500       0.768        12,000              6.14
1760                     1,700       0.8704       16,000              8.19
1801-1812                -           -            70,000              35.84
1841                     -           -            75,000              38.4
1861                     -           -            146,142             74.82
1941                     -           -            299,000             153.08
2500                     800         0.4096       4,400               2.25
261x                     1,500       0.768        15,000              7.68
262x                     1,500       0.768        30,000              15.36
265x                     2,000       1.024        40,000              20.48
2691                     7,400       3.7888       70,000              35.84
2801                     3,000       1.536        90,000              46.08
2811                     3,000       1.536        120,000             61.44
2821                     11,500      5.888        170,000             87.04
2851                     15,000      7.68         220,000             112.64
3620                     2,000       1.024        20,000-40,000       10-20
2901                     -           -            327,000             167.42
2911                     -           -            353,000             180.73
2921                     -           -            480,000             245.76
2951                     -           -            580,000             296.96
3640                     4,000       2.048        50,000-70,000       25.6-36
3660                     12,000      6.144        100,000-120,000     51.2-61.4
3631                     4,000       2.048        50,000-70,000       25.6-36
3725                     -           -            100,000-120,000     51.2-61.4
3745                     -           -            225,000-250,000     115.2-128
3810                     2,000       1.024        8,000               4.1
3810-V3                  3,000       1.536        15,000              7.68
3825                     25,000      12.8         350,000             179.2
3845                     35,000      17.92        500,000             256
3925                     -           -            833,000             426.49
3945                     -           -            982,000             502.78
4000                     1,800       0.9216       14,000              7.17
7120                     13,000      6.656        175,000             89.6
7140                     20,000      10.24        300,000             153.6
7200-NPE100              7,000       3.584        100,000             51.2
7200-NPE150              10,000      5.12         150,000             76.8
7200-NPE175              9,000       4.608        177,848             91.06
7200-NPE200              13,000      6.656        200,000             102.4
7200-NPE225              13,000      6.656        233,170             119.38
7200-NPE300              20,000      10.24        353,000             180.74
7300-MPLS-Experts-100    -           -            3,500,000 (PXF)     1,792
7600-MSFC2               20,000      10.24        30,000,000          15,360
ASR1000-PRE4             -           -            10,000,000          5,120
12000(Engine 6)          -           -            50,000,000          20,000
CRS-1 LC                 -           -            80,000,000          40,960



 2. Cisco Switch performance Matrix
Platform                      Switch performance (pps)  Switch fabric (Gbps)  L2/L3
1900/2800                     550,000                   1                     L2
2900-XL                       3,000,000                 3.2                   L2
2900-LRE                      3,000,000                 5                     L2
2940-8(TT,TF)                 2,700,000                 3.2                   L2
2950-ST8-LRE                  3,200,000                 8.8                   L2
2950-ST24-LRE                 3,500,000                 8.8                   L2
2950(12,24)                   6,600,000                 8.8                   L2
2950-48                       10,100,000                13.6                  L2
2950G(12,24)                  10,100,000                8.8                   L2
2950G-48                      10,100,000                13.6                  L2
2955-12(C,S)                  2,000,000                 13.6                  L2
2955-12(T)                    4,800,000                 13.6                  L2
2960-8                        2,700,000                 16                    L2
2960-24                       6,500,000                 32                    L2
2960-48                       10,100,000                32                    L2
2960G-8                       11,900,000                32                    L2
2960G-24                      35,700,000                32                    L2
2960G-48                      39,000,000                32                    L2
2970G-24(T)                   35,700,000                24                    L2
2970G-24(TS)                  38,700,000                28                    L2
3500-XL                       8,000,000                 10.8                  L2
3550-24                       6,600,000                 8.8                   L3
3550-48                       10,100,000                13.6                  L3
3550-12T/G                    17,000,000                24                    L3
3560-24TS/PS                  6,600,000                 8.8                   L3
3560-48TS/PS                  13,100,000                17.6                  L3
3560G-24TS/PS                 38,700,000                32                    L3
3560G-48TS/PS                 38,700,000                32                    L3
3560E-24TD/PD                 65,500,000                128                   L3
3560E-48TD/PD                 101,200,000               128                   L3
3750-24TS/PS                  6,500,000                 32                    L3
3750-48TS/PS                  13,100,000                32                    L3
3750G-24TS/PS                 38,700,000                32                    L3
3750G-48TS/PS                 38,700,000                32                    L3
3750G-24T                     35,700,000                32                    L3
3750G-12S                     17,800,000                32                    L3
3750G-16TD                    35,700,000                32                    L3
3750E-24TD/PD                 65,500,000                128                   L3
3750E-48TD/PD                 101,200,000               128                   L3
2948G/2980G/4912              18,000,000                24                    L2
2948G-L3                      10,000,000                22                    L3
2948G-GE-TX                   29,760,000                20                    L2
4003 w/Sup I                  18,000,000                24                    L2
4006 w/Sup II                 18,000,000                24x3                  L2
4006 w/SupIII or IV           48,000,000                64                    L3
4503 w/Sup II                 18,000,000                24                    L2
4503 w/Sup III/IV             21,000,000                28                    L3
4503 w/Sup II-Plus-TS         48,000,000                64                    L3
4506 w/Sup II                 18,000,000                24                    L3
4506 w/Sup III/IV             48,000,000                64                    L3
4X06/4503/4507-R              48,000,000                64                    L3
4507 w/SupIV                  48,000,000                64                    L3
45XX/4006 w/SupV              72,000,000                96                    L3
45XX w/SupV-10GE              102,000,000               136                   L3
4908G-L3(L2/L3)               11,000,000                22                    L3
4948                          72,000,000                96                    L3
5500-SupI/II                  1,000,000                 1.2                   L3
5500-SupIII                   2,200,000                 3.6                   L3
600X,65XX Sup1/1A             15,000,000                32                    L2
600X,65XX Sup1/1A-MSFC        15,000,000                32                    L3
65XX/76XX Sup32               15,000,000                32                    L3
65XX/76XX Sup2                30,000,000                256                   L2
65XX/76XX Sup2-MSFC           30,000,000                256                   L3
65XX/76XX-w/Sup 720           400,000,000               720                   L3
8510(IP/IPX/IPMC/Bridging)    6,000,000                 10                    L3
8540(IP/IPX/IPMC/Bridging)    24,000,000                40                    L3