July 14, 2001

Here, we discuss some techniques that let you distribute your Web server’s load among multiple machines to achieve better performance and scalability.

Like everything else in life, your Web server has a limit on the number of pages it can serve simultaneously to clients (usually browsers). You’ll especially appreciate this if you serve dynamic content generated in response to user input (CGI scripts, for example), since this depends heavily on database access and other server-side resources, or if you run a website whose popularity has outgrown its current means of serving.

One way of extending this limit is ‘load balancing’.

Web servers serve pages to clients as and when requests are made. Many servers hand each incoming request to a child process or thread, which handles that particular request, so most Web servers run in multi-processing or multi-threading environments. However, even such an environment puts a limit on the number of Web pages that can be served concurrently, largely because of two factors: the bandwidth available and the Web server itself. Assuming that you have sufficient bandwidth, the performance of your Web server becomes the critical factor.

Your Web server’s performance is determined to a large extent by the underlying hardware resources available to it. The number of requests it can handle at once is higher when the content delivered is static, like images or text, and considerably lower when the content is dynamic. Load balancing involves spreading the ‘load’ among multiple machines, or sometimes even among multiple sites, thereby increasing the resources available. Load balancing in its crudest form would, for example, involve placing all HTML files on one host, all images on another and all CGI scripts on a third. Real-life load balancing, however, involves carefully examining the access patterns of the various files on the website, keeping identical copies of the site on several Web servers and distributing the load among them.
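The crude split described above can be sketched in a few lines of Python. This is only an illustration: the host names, the list of image suffixes and the /cgi-bin/ prefix are assumptions made for the example, not details of any particular site.

```python
from pathlib import PurePosixPath

# Hypothetical host names, one per content type (assumptions for this sketch).
HTML_HOST = "www.example.com"
IMAGE_HOST = "images.example.com"
CGI_HOST = "cgi.example.com"

# File suffixes treated as images in this sketch.
IMAGE_SUFFIXES = {".gif", ".jpg", ".jpeg", ".png"}


def host_for(path):
    """Return the host that serves this URL path under the crude split."""
    # CGI scripts are assumed to live under /cgi-bin/.
    if path.startswith("/cgi-bin/"):
        return CGI_HOST
    # Images are recognised by suffix, ignoring case.
    if PurePosixPath(path).suffix.lower() in IMAGE_SUFFIXES:
        return IMAGE_HOST
    # Everything else (HTML pages and the like) goes to the main host.
    return HTML_HOST
```

Note that this split depends only on the type of file requested and takes no account of how busy each host is, which is why real-life load balancing relies on identical copies instead.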

Load-balancing techniques

Load balancing can be done through hardware- or software-based techniques. One technique, called DNS load balancing, involves maintaining identical copies of the site on physically separate servers. The DNS entry for the site is then set to return multiple IP addresses, one for each copy of the site. The DNS server returns a different IP address for each request it receives, cycling through the list. This method gives you a very basic implementation of load balancing. However, since DNS entries are cached by clients and by other DNS servers, a client continues to use the same copy throughout a session. This can be a serious drawback: the load is spread by client rather than by request, so heavy users of the website stay on whichever IP address is cached on their client or DNS server, while less-frequent users get another. Heavy users could therefore experience a slowdown, even though the site as a whole has resources available in abundance.
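The effect of caching on DNS load balancing can be seen in a small simulation. The sketch below is a toy model in Python, not a real name server: the class names, the addresses and the request counts are all made up for illustration.

```python
from itertools import cycle


class RoundRobinDNS:
    """Toy name server that cycles through the site's IP addresses."""

    def __init__(self, addresses):
        self._rotation = cycle(addresses)

    def resolve(self):
        # Each lookup returns the next address in the rotation.
        return next(self._rotation)


class CachingClient:
    """Toy client that looks the site up once and then reuses the answer."""

    def __init__(self, dns):
        self._dns = dns
        self._cached = None

    def address(self):
        if self._cached is None:
            self._cached = self._dns.resolve()
        return self._cached


def simulate(requests_per_client, addresses):
    """Count the requests each address receives.

    requests_per_client holds one entry per client: the number of
    requests that client makes during its session.
    """
    dns = RoundRobinDNS(addresses)
    hits = {address: 0 for address in addresses}
    for request_count in requests_per_client:
        client = CachingClient(dns)
        for _ in range(request_count):
            hits[client.address()] += 1
    return hits
```

With one heavy client making 100 requests and two light clients making one request each, simulate([100, 1, 1], ["10.0.0.1", "10.0.0.2", "10.0.0.3"]) puts all 100 of the heavy client’s requests on the first address: the rotation balances lookups, not requests.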

Another load-balancing technique involves mapping the site name to a single IP address, which belongs to a machine that is set up to intercept HTTP requests and distribute them among multiple copies of the Web server. This can be done in either hardware or software. Hardware solutions, even though expensive, are often chosen for their stability. This method is preferred over the DNS approach because the load is balanced request by request rather than client by client. Also, these load balancers can see if a particular machine is down, and divert the traffic to another machine dynamically. This is in contrast to the DNS method, where a client is stuck with the address of the dead machine until it can request a new one.
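The selection logic of such a dispatching machine can be sketched as follows. This is a minimal Python illustration under stated assumptions: it uses plain round-robin selection, the Dispatcher class and its method names are hypothetical, and it omits the actual forwarding of HTTP requests and the probing that would decide when a machine is down.

```python
class Dispatcher:
    """Round-robin selection over backends, skipping ones marked as down."""

    def __init__(self, backends):
        self._backends = list(backends)
        self._healthy = set(self._backends)
        self._next = 0  # index of the backend to try first on the next pick

    def mark_down(self, backend):
        # Called when a health check (not shown) finds the machine dead.
        self._healthy.discard(backend)

    def mark_up(self, backend):
        # Called when the machine answers health checks again.
        if backend in self._backends:
            self._healthy.add(backend)

    def pick(self):
        """Return the backend that should handle the next request."""
        # Try each backend at most once, starting from the rotation point.
        for _ in range(len(self._backends)):
            backend = self._backends[self._next]
            self._next = (self._next + 1) % len(self._backends)
            if backend in self._healthy:
                return backend
        raise RuntimeError("no healthy backend available")
```

Because every request passes through pick(), a dead machine stops receiving traffic as soon as it is marked down, without waiting for any client’s cached DNS entry to expire.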

Another technique, reverse proxying, involves setting up a reverse proxy that receives requests from the clients, proxies them to the Web server and caches the response on its way back to the client. This means that the proxy server can serve static content from its own cache when the request is repeated. This in turn leaves the Web server free to focus on delivering dynamic content. Dynamic content cannot generally be cached, as it is generated in real time. Reverse proxying can be used in conjunction with the simple load-balancing techniques discussed earlier: static and dynamic content can be split across different servers, with reverse proxying used for the static-content Web server only.
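The caching behaviour of a reverse proxy can be sketched like this. It is a simplified Python model, assuming that static content can be recognised by file suffix and that cached copies never expire; the class name, the suffix list and the origin callable (standing in for the real Web server) are all hypothetical.

```python
# Suffixes treated as static content in this sketch (an assumption).
STATIC_SUFFIXES = (".html", ".gif", ".jpg", ".png", ".css")


def is_static_path(path):
    """Guess from the suffix whether a path names static content."""
    return path.lower().endswith(STATIC_SUFFIXES)


class CachingReverseProxy:
    """Sits in front of an origin server and caches static responses."""

    def __init__(self, origin, is_static=is_static_path):
        self._origin = origin        # callable: path -> response body
        self._is_static = is_static  # callable: path -> bool
        self._cache = {}

    def get(self, path):
        # Repeated requests for static content are answered from the cache.
        if path in self._cache:
            return self._cache[path]
        # Everything else is proxied to the origin server.
        body = self._origin(path)
        # Only static responses are kept; dynamic ones are generated
        # afresh for every request and so are never cached here.
        if self._is_static(path):
            self._cache[path] = body
        return body
```

A real proxy would also have to decide when a cached copy has gone stale; this sketch leaves that out.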

Load balancing allows Web servers to be scalable, that is, capacity can be scaled up or scaled down. For example, if the load on your website is balanced across three identical servers and it experiences a sudden flood of users beyond its current capacity, you can set up additional servers identical to the ones already running and modify the DNS entry of your website to include pointers to these additional hosts. This is called scaling up. Conversely, if you know that your website is particularly low on traffic at certain times, on Sundays, for example, you can remove one or more servers and put their resources to better use. This is scaling down. Both require prior knowledge of access patterns and a little bit of foresight.
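The arithmetic behind scaling up and down can be sketched as follows. This is a rough Python illustration: the function names are hypothetical, and it assumes that every server is identical and that you have measured how many requests per second one server can sustain.

```python
import math


def servers_needed(expected_rps, capacity_rps_per_server, minimum=1):
    """Return how many identical servers the expected load calls for.

    expected_rps is the anticipated number of requests per second, and
    capacity_rps_per_server is what one server can sustain; both are
    figures you would have to measure for your own site.
    """
    if capacity_rps_per_server <= 0:
        raise ValueError("per-server capacity must be positive")
    needed = math.ceil(expected_rps / capacity_rps_per_server)
    return max(minimum, needed)


def addresses_to_publish(pool, expected_rps, capacity_rps_per_server):
    """Pick which addresses from the pool the site's DNS entry should list."""
    count = servers_needed(expected_rps, capacity_rps_per_server)
    if count > len(pool):
        raise ValueError("not enough servers in the pool for this load")
    return pool[:count]
```

For instance, if one server sustains 150 requests per second, an expected load of 450 calls for three servers, a flood of 700 calls for five, and a quiet period of 100 can be handled by one.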

These techniques apart, it is important to get the server’s basics right too: fast CPUs, plenty of RAM, faster (and fatter) disks, etc. Check that your operating system supports symmetric multiprocessing (SMP); otherwise, having multiple processors is a waste. Turning off reverse DNS lookups (looking up the names of the addresses accessing your site), if they are not needed, can also improve performance considerably.

Kunal Dua
