April 7, 2004

When we decided to do this story, we asked our CIO to ‘give’ us our company’s network for a few weeks so that we could manage it first hand. He agreed, provided we did not bring any machines down and told him all that we found out. And thus we got our hands on some 400 hosts across three subnets. (A host is any device with an IP address. This could be a server, a PC, a router, a networked MFD, etc.)

We followed a tried-and-tested approach to network management: map, monitor, alert and proactively troubleshoot (see diagram on left). As a first step, we mapped our network to find out what devices were connected and where. In short, we made an inventory of our assets. The next step was to monitor these devices for various parameters, such as availability, hard-disk space utilized, changes in performance levels, memory usage, system load and traffic patterns. We used a popular and free tool called Nagios for this. It told us whether our hosts and the services they offered (such as HTTP and FTP) were up or not.
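The up-or-down check described above can be pictured as a simple TCP connection attempt to the port a service listens on. The sketch below is only an illustration in Python, not a description of how Nagios works internally. The function name `service_is_up`, the default timeout and the injectable `connect` parameter are assumptions made for this example.

```python
import socket


def service_is_up(host, port, timeout=3.0, connect=socket.create_connection):
    """Return True if a TCP connection to host:port succeeds within timeout.

    `connect` is an assumption of this sketch: it defaults to the standard
    library call but can be swapped out to try the logic without a network.
    """
    try:
        conn = connect((host, port), timeout)
    except OSError:
        # Connection refused, host unreachable or timed out: treat as down.
        return False
    conn.close()
    return True
```

Calling `service_is_up("192.0.2.10", 80)` would then test HTTP on a (hypothetical) host, and port 21 would test FTP. A successful connection only shows that something is listening on the port, not that the service behind it is healthy.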

We configured our switches and routers such that they could send traps to the monitoring station if they encountered abnormal behavior, such as a machine sending excessive broadcast packets or the router dropping too many packets. The network-monitoring station could then receive those traps and generate alerts.

Map Your Network
A network map is like a road map, which tells you how you can reach various landmarks. Similarly, a network map tells you where the hosts on your network are and how they are connected.

Monitor And Alert For Anomalies
Monitor your network to know what is going wrong with it, where and when, so that you can fix the problem before it gets out of control.

SNMP Monitoring and Alerting
With SNMP traps and an SNMP collection program, you can easily know of the problems in the network.

Proactively Troubleshooting
Anticipate and fix problems before they strike your network.

Arising out of monitoring is alerting. You should configure what you need to be alerted about, the method of alerting (SMS or e-mail?) and the escalation process (if a problem is not solved within a certain time, who else should be alerted?).
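An escalation process of this kind can be written down as a small table of delays, contacts and methods. The following Python sketch is one possible way to express it. The contact names, the delays and the helper `who_to_alert` are all hypothetical and are not taken from our setup.

```python
from dataclasses import dataclass


@dataclass
class EscalationStep:
    after_minutes: int  # age of the unresolved problem at which this step starts
    contact: str        # who is alerted
    method: str         # "e-mail" or "SMS"


# Hypothetical policy: names and delays are illustrative only.
POLICY = [
    EscalationStep(0, "network-admin", "e-mail"),
    EscalationStep(30, "network-admin", "SMS"),
    EscalationStep(120, "it-manager", "SMS"),
]


def who_to_alert(minutes_unresolved, policy=POLICY):
    """Return every step whose delay has elapsed for an unresolved problem."""
    return [step for step in policy if minutes_unresolved >= step.after_minutes]
```

With this policy, a problem that has been open for 45 minutes has reached the first two steps, and the manager is brought in only after two hours.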

All through these three stages, we tried to anticipate problems and find solutions for them. But not everything had a straightforward answer. For example, when we discovered that the CPU utilization of one server was quite high, showing a continuous 60 to 70 percent usage, we found that a client infected with the Welchia worm was sending a continuous stream of pings to an Internet host, and this was causing the hike in that server’s utilization levels!

If you can anticipate such problems for your network, you can put preventive measures in place. In this case, it could be two things. One, run Nagios agents (software installed on the local host that reports performance levels) that will send an alarm if CPU utilization goes beyond a limit. Two, configure the switch to report if a port generates excessive traffic. A combination of these will tell you what the anomaly is.
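The first measure, an alarm when CPU utilization crosses a limit, boils down to comparing a reading against thresholds. The Python sketch below follows the status codes that Nagios plugins conventionally return (0 for OK, 1 for WARNING, 2 for CRITICAL, 3 for UNKNOWN). The function name `classify_cpu` and the 60 and 85 percent thresholds are assumptions for illustration, and the limits you pick should reflect your own servers.

```python
# Status codes that Nagios plugins conventionally return.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3


def classify_cpu(percent, warn=60.0, crit=85.0):
    """Map a CPU utilization percentage to a (status_code, message) pair.

    The warn and crit defaults are hypothetical thresholds for this sketch.
    """
    if percent is None or not 0.0 <= percent <= 100.0:
        return UNKNOWN, "CPU UNKNOWN - no valid reading"
    if percent >= crit:
        return CRITICAL, f"CPU CRITICAL - {percent:.0f}% used"
    if percent >= warn:
        return WARNING, f"CPU WARNING - {percent:.0f}% used"
    return OK, f"CPU OK - {percent:.0f}% used"
```

A wrapper script would obtain the reading from the operating system, print the message and exit with the status code, so that the monitoring station can act on it.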

We have explained how proper network management can help you anticipate and resolve problems before something nasty happens.

We have used certain software for our specific tasks. You could choose from a wider list, but remember that the tools you choose, the parameters that you monitor and the values that you set as triggers will depend on your business needs.

Bottom line: If you are responsible for a reasonably sized network (100 clients plus), then your life would be a lot easier if you were to invest time, resources and effort in creating a network-management setup. Also, network management is not a one-time activity. Map, monitor, alert, rectify, back to map, monitor….

By Anoop Mangla, Juhi Bhambal, Krishna kumar, Pallavi Sharda and Sanjay Majumder
