Proactively Troubleshooting

PCQ Bureau

07 Apr 2004 05:39 IST

Prevention is better than cure. If you can anticipate the problems your network may face and fix them in advance, both you and your users will be better off than if you have to take corrective action after trouble has struck. To monitor your network infrastructure proactively, you need not only tools with diagnostic capabilities but also an understanding of your business needs, because the service-level requirements of the network arise from those of the business. A stock-broking firm, for example, cannot afford even a few seconds' delay in placing orders during peak trading hours.

You may well ask why, if you have a well-placed monitoring and alerting system, you need to seek out problems and solutions proactively. Suppose you have configured your system to alert you when the load on your router reaches a threshold of 80 percent. What if the router has been running at 79 percent all week? Since the system is not configured to raise an alarm below 80 percent, you will not even know of this potentially dangerous anomaly until trouble actually strikes. A proactive stance is to not let such a situation arise. One way out is to set multiple alarm thresholds: one at the critical point and another before it, say 10 percent below the critical point. Even then, you need to check whether usage levels are nearing the thresholds and whether the settings are right. What if the critical level is being crossed but no alarm is raised because of a wrong setting?
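The two-threshold idea can be sketched in a few lines of Python. This is only an illustration, not the behaviour of any particular management tool: the function names, the reading of "10 percent before" as 10 percentage points, and the 90 percent `min_fraction` cut-off are all assumptions made for the example.

```python
from typing import Iterable


def classify_load(load_pct: float, critical: float = 80.0, margin: float = 10.0) -> str:
    """Return 'critical', 'warning' or 'ok' for a single load reading.

    The warning level sits `margin` percentage points below the critical
    level (an assumed interpretation of the article's example).
    """
    if load_pct >= critical:
        return "critical"
    if load_pct >= critical - margin:
        return "warning"
    return "ok"


def sustained_warning(samples: Iterable[float], critical: float = 80.0,
                      margin: float = 10.0, min_fraction: float = 0.9) -> bool:
    """True when at least `min_fraction` of the readings are not 'ok'."""
    readings = list(samples)
    if not readings:
        return False
    elevated = sum(1 for r in readings if classify_load(r, critical, margin) != "ok")
    return elevated / len(readings) >= min_fraction


# Hypothetical daily readings: a router sitting just under the critical level.
week = [79.0] * 7
print(classify_load(79.0))      # warning
print(sustained_warning(week))  # True
```

With a single 80 percent alarm, the week of 79 percent readings would pass silently. Here each reading lands in the warning band, and the sustained check flags the pattern as a whole.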

Based on your knowledge of your network, you must always reconfigure the default settings on your management tools. For instance, a tool's default settings may say that an error rate of up to 4 percent can be ignored, while anything over 10 percent must be reported. But your experience with your network (and your needs) may tell you that error rates above 2 percent can slow response times, which matters in cases like the stock-broker example above. This is an ongoing process. As usage patterns on your network change, which they will with business conditions, you need to readjust the threshold levels.
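As a rough sketch of what overriding such defaults might look like, the Python below models the thresholds as a small immutable settings object. The class, its field names and the intermediate "watch" band are hypothetical. The 4, 10 and 2 percent figures come from the example above, while the tuned `ignore_up_to` value of 1 percent is an assumption added so the tuned settings stay consistent.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class ErrorRateThresholds:
    """Hypothetical error-rate settings, in percent."""
    ignore_up_to: float = 4.0    # default: rates up to this are ignored
    report_above: float = 10.0   # default: rates above this are reported

    def __post_init__(self) -> None:
        # Guard against a wrong setting that would silently suppress alarms.
        if not 0 <= self.ignore_up_to <= self.report_above <= 100:
            raise ValueError("need 0 <= ignore_up_to <= report_above <= 100")

    def classify(self, error_rate_pct: float) -> str:
        if error_rate_pct > self.report_above:
            return "report"
        if error_rate_pct <= self.ignore_up_to:
            return "ignore"
        return "watch"


DEFAULTS = ErrorRateThresholds()
# Tuned for a latency-sensitive network: report anything above 2 percent.
# The 1 percent ignore level is an assumption for this example.
TUNED = replace(DEFAULTS, ignore_up_to=1.0, report_above=2.0)

print(DEFAULTS.classify(3.0))  # ignore
print(TUNED.classify(3.0))     # report
```

Keeping the settings in one validated object makes it easier to revisit them as usage patterns change, and the range check rejects an inconsistent pair of thresholds before it can mask a problem.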

Another problem that you can anticipate before it occurs is network-connectivity trouble related to OS upgrades. Your network adapter may work fine on your current OS but fail when you upgrade the OS or shift to a different one, or even when you apply a simple-sounding upgrade or patch. This is dangerous on, say, a software firewall and less so on a file and print server. Needless to say, life will be far simpler for you and your users if you can anticipate and correct such problems before they strike. But if problems slip past all the 'traps' that you have set, as some inevitably will, only proactive monitoring will save the day.

In today's multi-vendor, heterogeneous networks, automation of services, including network monitoring and management, is not always the end of the story. Your network-management tools may not work for all devices, services, OSs and applications: some may be standards compliant and robustly tested, while others may have been developed in-house or custom developed for you. Here again, the solution is pre-emptive management rather than fault finding or troubleshooting after an outage.

with Kunal Dua
