In the article ‘Setting up Web Servers’ (page 106) in the PCQuest June 2001 issue, we talked about how to install the Apache Web server on a system running PCQ Red Hat Linux 6.2. (PCQ Linux 7.1, based on Red Hat 7.1, is on our July 2001 CDs.) The installation procedure for the Apache Web server remains the same. However, if you need the Apache development libraries and the Apache manuals, you can find their RPMs (apache-devel-1.3.19-5.i386.rpm and apache-manual-1.3.19-5.i386.rpm respectively) in the directory RedHat/RPMS on the second July 2001 CD (CD 2). Note that the default document root has changed from /home/httpd/html to /var/www/html on PCQ Linux 7.1.
The most important configuration file for the Apache Web server is ‘httpd.conf’. By understanding and editing this file, you can control and customize the server to your needs. In this article we’ll explore a part of this configuration file, and in subsequent issues, talk about it in detail.
Assuming that you have installed the Web server, you can locate the file httpd.conf in the directory /etc/httpd/conf. Open this file in a Linux text editor to follow along. The configuration in this file is divided into three sections. Section 1 consists of configuration parameters, called directives, which control the server’s child processes (to understand the concept of child processes, read the article ‘How Web Servers Work’, page 101, PCQuest June 2001). It also declares global directives, which are applicable to all sections. The default directives in this section work fine for most scenarios. However, we will explain some of the directives that you may want to tweak to your needs. Section 2 comprises directives that configure the Apache server, its document root, other Web directories, CGI directories, language, MIME types, etc. Section 3 is dedicated to virtual hosting, or hosting multiple websites on the same server. We will discuss this in the next issue. Lines preceded by a hash (#) mark are comments and not a part of the configuration. Comments briefly explain each directive.
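As a rough sketch of this layout, the skeleton below shows how the three sections and their comments might appear in httpd.conf. The section headings follow the stock Apache 1.3 file; the comment text and the single directive picked for each section are only illustrative, and your own file will contain many more lines.

```apache
### Section 1: Global Environment
# Directives here affect the overall operation of the server.
Timeout 300

### Section 2: 'Main' server configuration
# Directives here set up the main server, such as its document root.
DocumentRoot "/var/www/html"

### Section 3: Virtual Hosts
# Virtual host definitions go here (none are shown in this sketch).
```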
Let’s start with some important directives in Section 1. Look at the following line in Section 1:
Timeout 300
When you key in the URL of a site in a client (Web browser), the client first connects to the server (Web server). After connecting, the Web browser issues a GET request to get a file, generally an HTML page. The ‘Timeout’ directive specifies the maximum time allowed between the connection and the request. In this case, if the time between the connection and the request crosses 300 seconds, the browser receives a timeout error and the connection closes. The timeout also applies to the PUT and POST methods. This directive thus ensures judicious utilization of a connection (to be more precise, connection resources).
Once a request is processed, the connection between the Web server and the browser is closed. Thus, for each subsequent request a new connection must be established. Consider an HTML page that contains graphic images and applets. Fetching it means establishing a separate connection for the page and for each graphics file and applet, which may produce a noticeable latency. The directive ‘KeepAlive On’ makes a connection persistent across several requests. The directive that follows it, ‘MaxKeepAliveRequests’, specifies the number of requests to allow during a persistent connection, and the directive ‘KeepAliveTimeout’ specifies a timeout in seconds between requests during a persistent connection. We suggest that you keep the ‘KeepAlive’ directive ‘on’ (also the default) and tweak the ‘MaxKeepAliveRequests’ directive according to the user experience on your site.
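Put together, the three persistent-connection directives might look like the fragment below. The numeric values are only illustrative starting points, not recommendations; check the values already present in your own httpd.conf before changing them.

```apache
# Allow several requests over one connection.
KeepAlive On

# Maximum number of requests served over one persistent connection
# (illustrative value).
MaxKeepAliveRequests 100

# Seconds to wait for the next request on a persistent connection
# before closing it (illustrative value).
KeepAliveTimeout 15
```

A page with many images may benefit from a higher ‘MaxKeepAliveRequests’, while a short ‘KeepAliveTimeout’ stops idle browsers from holding on to a server process for long.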
As explained in the article ‘How Web Servers Work’, the Web server handles the client’s request by forking or spawning child processes. Spawning a new child process is a time-consuming task, which involves setting up a new environment (variables) for the new process, allocating the process a slot in the operating system’s process table, allocating memory segments, etc. Thus, instead of waiting for a request before creating a child process, Apache spawns a number of child processes in advance, specified by the directive ‘StartServers’. The ‘MinSpareServers’ directive specifies the minimum number of idle processes that are kept waiting for a request, and the ‘MaxSpareServers’ directive specifies the maximum number; when there are more idle processes than this, the surplus ones are killed. You may like to tweak these two directives according to the load on your site and the hardware resources of your server. The ‘MaxClients’ directive specifies the maximum number of clients that can connect to the server simultaneously, and again must be set according to the same considerations. We will explain the ‘Listen’ and ‘BindAddress’ directives (related to virtual hosting) when we talk about Section 3.
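The process-pool directives described above could be set along the following lines. The numbers are assumptions chosen for illustration only; suitable values depend on the load on your site and the memory available on your server.

```apache
# Child processes created when the server starts (illustrative value).
StartServers 8

# Keep at least this many idle children waiting for requests.
MinSpareServers 5

# Kill surplus idle children once more than this many are idle.
MaxSpareServers 20

# Upper limit on clients served simultaneously (illustrative value).
MaxClients 150
```

Keeping ‘MinSpareServers’ well below ‘MaxSpareServers’ avoids the server repeatedly creating and killing processes as the load fluctuates.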
Let’s look at Section 2. Here, the directive ‘ServerAdmin’ must be set to a valid e-mail address (conventionally webmaster@your-domain). Usually the ‘root’ account is given an e-mail alias of ‘webmaster’. In case of an error at the server side, a page is displayed that contains this e-mail address, so that the error can be reported to the right person. The directive ‘DocumentRoot’ specifies the Web directory from which files are served and is preset to /var/www/html. The next few directives set permissions for the ‘DocumentRoot’ Web directory. Don’t mistake these permissions for the usual read/write permissions given to file system directories. These permissions allow or disallow some features and client access to the Web directory. The features allowed for each Web directory are specified using the directive named ‘Options’. The directives in this section thus cover things like inserting the webmaster’s e-mail address in an error page, defining the root directory of a website, and setting its permissions, which are different from ordinary file system permissions. We’ll talk more about what these options are and what they do in the next issue, with some examples that you can try out.
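As a minimal sketch of the Section 2 directives discussed above, the fragment below sets the administrator address, the document root and the permissions for that directory. The e-mail address is the placeholder used in the text, and the particular ‘Options’ and access settings are assumptions for illustration; the ones in your own httpd.conf may differ.

```apache
# Address shown on server-generated error pages (placeholder).
ServerAdmin webmaster@your-domain

# Web directory from which files are served.
DocumentRoot "/var/www/html"

# Permissions for the document root (illustrative settings).
<Directory "/var/www/html">
    # Features allowed in this Web directory.
    Options Indexes FollowSymLinks
    # Ignore per-directory override files.
    AllowOverride None
    # Client access: allow requests from everyone.
    Order allow,deny
    Allow from all
</Directory>
```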
Shekhar Govindrajan