
Backups and Disaster Recovery

PCQ Bureau

Computers have come to represent ever greater compactness, growing power and high
availability. Yet a computer system still relies on rotating mechanical devices such as
hard disk drives, floppy disk drives and cooling fans. Disk drives are used for long-term
storage but, being mechanical in nature, they have a lower mean time between failures
(MTBF) than static storage components such as RAM and NVRAM. Added to this, dear Murphy
provides an accurate sense of timing for such failures: a blue-chip company invites you
for a demo of your product, you land up ahead of time, unpack your laptop, and boom! The
laptop reports a hard disk drive error. Needless to say, redundancy has become the name of
the game and backup devices are big business.


If you think that because you use Linux you don’t need any of this, you are wrong.
There’s precious little Linux can do to prevent your hard disk from failing. If the disk
goes down, Linux goes down with it and, depending upon the situation, Linux would perhaps
care to tell you that something is drastically wrong. So with the need for backups firmly
established, let’s look at what Linux provides in terms of support for backing up and
recovering data from backups. Note that any situation that implies loss of data is termed
a "disaster".

backup.weekly



#!/bin/sh
# Back up all files under /export/home, compressed, to the remote tape
# drive on dumpyard.
DEVICE=backups@dumpyard:/dev/st0
FILES="*"
#
cd /export/home
echo "Insert Cartridge labelled WEEKLY"
read x
tar -zcvf $DEVICE $FILES >/tmp/backup.log.`date '+%b%d%Y'`









Linux, like every other flavor of Unix, provides the basic data archiving utilities tar,
cpio and dd, which are used for backing up data to tape. Linux also provides an
implementation of software RAID that supports levels up to 5. Armed with this information,
let’s see how we can plan for backups and disaster recovery.

First, what is your environment? Is the MTTR (Mean Time To Repair) so critical that it
must be kept as low as possible? If so, you would choose to implement hardware RAID with
at least levels 0 and 1 (striping and mirroring; only the mirroring provides redundancy).
Hardware RAIDs come with hot disk swap capability; such a solution is expensive but gives
you virtually zero downtime. An alternative to such a system is a software RAID
implementation, which functionally eliminates service interruptions due to disk failures
by falling back on the mirrored disk and keeping the services going. However, without hot
swap capability, the system has to be shut down, the disk replaced and synchronized with
the mirror disk, and the services started up again. This downtime could be scheduled
during lean service hours. The least expensive solution, with the longest MTTR, is
restoring the data from tape onto a new disk.

It’s important to make a clear demarcation of the data on the system into system data
(data used for configuring system services like printers, the network, the kernel, etc)
and user data (data generated by users). Typically, system data, with the exception of
system logs, does not change very often, whereas user data changes daily. So one would
handle this by backing up the user area frequently, possibly every day, and the system
data less frequently. Backing up itself is done in two possible ways: complete (or full)
backups and incremental backups. As the name suggests, incremental backups record only the
changes that have occurred since an earlier backup, such as the last full backup.

A short note about the backup devices. Linux, by default, has support for SCSI tapes. The
device /dev/st0 is the first such tape drive attached to your system. If you use mirroring
across disks, you’ll see devices like /dev/md0, /dev/md1, etc, which are logical devices,
each backed by two physical disk partitions that mirror each other.


Now, let’s look at what utilities to use when you do not use mirroring. I’d advocate the
use of cpio to copy files from one disk to another. A single command does it all,
elegantly. Here’s a typical command that you’d use to copy the directory structure under
/export/users to /spare/backup/users:

backup.daily



#!/bin/sh
# Dump thingie
# Do incremental backups either to the tape that is already in the remote
# tape drive or onto a spare disk that is mounted locally; it is assumed that
# these archives will be written to tape later.
#
# Exactly one argument is expected: "disk", or anything else for tape.
if [ $# -ne 1 ]; then
    echo "Usage: $0 disk|tape"
    exit 1
fi
#
BASE=/usr/local/bin
now=`date '+%b %d %T %Y'`
nowfile=`date '+%b%d%Y' | tr -d '\040*'`
# Date of the previous dump, written at the end of the last run.
then=`cat $BASE/date.last.dump`
if [ "$1" = "disk" ]; then
    DEVICE=/mnt/backups/$nowfile.tar.gz
    echo "Backing up to $DEVICE; check that the backup partition is mounted in /mnt"
    echo "Enter to continue; ^C to abort"
    # The mount check for the partition can be automated too; for the present,
    # do it one at a time, abort and restart.
    read x
else
    DEVICE=root@dumpyard:/dev/st0
    echo "Backing up to $DEVICE; assuming tape is loaded"
    read x
fi
#
HOME=/export/home
#
cd $HOME
#
FILES="*"
# Update FILES as required
#
# Archive only files changed since the previous dump and label the volume.
/usr/sbin/tar -z -c -v \
    -f $DEVICE \
    -N "$then" \
    -V "Dump from $then to $now" \
    $FILES >/tmp/backup.log.$nowfile
# Record this run's date for the next incremental dump.
echo "$now" > $BASE/date.last.dump







































# cd /export/users; find . -depth -print | cpio -pldmuv /spare/backup/users/


You could similarly create a cpio archive on tape by running cpio in copy-out mode (-o)
and directing its output to a tape device instead of a destination directory.
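
As a minimal sketch of the archive form, the function below writes a cpio archive of a directory tree to whatever file or device is given as its second argument. The function name cpio_archive and the paths in the usage comment are illustrative assumptions, not part of the original scripts.

```shell
#!/bin/sh
# cpio_archive SRC_DIR ARCHIVE
# Hypothetical helper: archive everything under SRC_DIR into ARCHIVE,
# which may be a regular file or a tape device such as /dev/st0.
cpio_archive() {
    src=$1
    archive=$2
    # find -depth lists a directory's contents before the directory itself;
    # cpio -o (copy-out mode) reads file names on stdin and writes the
    # archive to stdout, which is redirected to ARCHIVE.
    (cd "$src" && find . -depth -print | cpio -o) > "$archive"
}

# Example: cpio_archive /export/users /dev/st0
# List the contents afterwards with: cpio -it < /dev/st0
```

Because the redirection happens outside the subshell, a relative ARCHIVE name is resolved against the directory the function was called from, not against SRC_DIR.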

A variation of the same using (GNU) tar is:

# cd /export/users; tar zcvf /dev/st0 * >/tmp/backup.log.DDMMYY


dd is typically used in conjunction with cpio or tar.

# cd /export/users; tar zcvf - * | dd of=/dev/st0

is equivalent to the earlier command. To back up to a remote system, use:


# cd /export/users; tar zcvf backups@remote.host.name:/dev/st0 *

(This backs up onto the tape device attached to the host remote.host.name, on which the
userid "backups" must permit remote command execution by userid root on your machine.)

How you back up also depends on how your file systems are laid out. The simplest case is
where the root file system has all the system areas (/var, /usr, /sbin, /etc, /dev, /lib,
/boot and /bin), and the other file system is the user file system (typically named
/export, /home or /users). In such a situation, the contents of the user file system will
be backed up more frequently than the root file system. When you back up the root file
system, you may have non-critical data, like the cache area of a web/ftp proxy cache,
which you may want to exclude from the backup.

I use tar effectively and would advise you to do so too. tar (short for tape archiver) is
used for archiving on tape. Here’s an example of creating a compressed tar archive of
/usr/doc/HOWTO, excluding the "mini" directory:


# cd /usr/doc/HOWTO; tar --exclude mini -zcvf /tmp/x.tar.gz .

Note that the source file list is specified as the current directory ".". A common mistake
users make here is to use "*": the shell expands it into an explicit list of names that
includes "mini", which contradicts the exclusion and can override it. The destination file
can be written to tape by replacing /tmp/x.tar.gz with the local or remote device name.
Note that the archive will not contain the leading path names, since we have descended
into that directory. To record them, replace "mini" and "." with the pathnames
"/usr/doc/HOWTO/mini" and "/usr/doc/HOWTO", respectively. Even then, tar will strip the
leading "/" from the pathnames in the archive; use the -P option of tar to retain the
leading "/".

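One way to confirm that an exclusion took effect is to list the archive after creating it. The sketch below wraps the exclude example in a function; the name archive_without is a hypothetical helper, and it uses tar's -C option to change into the source directory instead of a separate cd.

```shell
#!/bin/sh
# archive_without SRC_DIR EXCLUDED_NAME ARCHIVE
# Hypothetical helper: create a gzip-compressed tar archive of SRC_DIR,
# leaving out any entry whose name matches EXCLUDED_NAME.
archive_without() {
    src=$1
    excluded=$2
    archive=$3
    # "." is passed as the file list so that tar itself walks the tree
    # and applies the exclusion; -C changes into SRC_DIR first.
    tar --exclude="$excluded" -zcf "$archive" -C "$src" .
}

# Example: archive_without /usr/doc/HOWTO mini /tmp/x.tar.gz
# Check the result with: tar -ztf /tmp/x.tar.gz
```
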
Here’s how to back up files that have changed after a particular date, using the -N option
of tar. You would use this option to back up config file changes in the system areas or to
perform a daily backup of user areas. The example shows how to back up changes on the user
file system since Valentine’s Day.

# cd /export/users/; tar -zcv -N "Feb 14 17:30:32 1999" \
? -V "Incremental backup from Feb 14 17:30:32 1999 to Mar 1 12:00:00 1999" \
? -f /dev/st0 *

(The "\" at the end of a line, typed as a "\" followed by Enter, continues the command on
the next line. The "?" is the secondary prompt that the shell prints to signal a
continuation; some shells print ">" instead. You could also type the whole command on one
line, without these continuations.)

The -V option allows you to tag a volume label to the tape archive.
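
A date-based incremental backup can be tried out safely on a scratch directory before pointing it at a tape. The following is a minimal sketch, assuming GNU tar; the function name incremental_since is hypothetical. It uses --newer-mtime rather than -N because -N also selects files whose status change time (ctime) is later than the date, which makes it awkward to demonstrate with back-dated test files.

```shell
#!/bin/sh
# incremental_since DATE SRC_DIR ARCHIVE
# Hypothetical helper: archive only the files under SRC_DIR whose contents
# were modified after DATE (any date string GNU tar understands).
incremental_since() {
    since=$1
    src=$2
    archive=$3
    # --newer-mtime compares only the data modification time; -N (--newer)
    # would also pick up files whose status (ctime) changed after DATE.
    # -V attaches a volume label describing the increment.
    tar -zcf "$archive" --newer-mtime="$since" \
        -V "Incremental backup since $since" \
        -C "$src" .
}

# Example: incremental_since "1999-02-14 17:30:32" /export/users /tmp/inc.tar.gz
```

Directories are still recorded in the archive even when none of their files qualify, so the listing of an "empty" increment is not itself empty.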

Having seen the various options, let’s now construct a backup strategy. We’ll assume that
you rarely tinker with your system areas. As soon as you have installed the system and
configured and customized it to your requirements, it’s time to take an incremental backup
of the system area. You may or may not want to take a complete backup of the system area
just after installation; you have it on your distribution media anyway. You could repeat
such incremental backups either after each change you make to the system config files
(recommended), or periodically, say every fortnight. You could make the increment either
from the date of install or from the date of the last incremental backup. The former does
not record a history of changes, but it is faster when it comes to restoring the state
after a crash: one extract brings the system back to the state before the crash.

As for the user file system, it’s only fair that incremental backups are done daily, and a
full backup every week. The high frequency is, again, so that the latest state of the file
system can be restored as soon as possible. To sum up: you would do an incremental backup
of the system areas as soon as your system is ready after installation, and again each
time a change is made. Your user areas would be backed up fully at the start,
incrementally on a daily basis, and fully again every weekend. To guard against tape
errors, it’s good practice to retain the previous week’s set of backups: one full backup
and that week’s incremental backups. If the current set fails, you’d be set back in time
when you extract the file system, but not by too much.
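
The weekly rhythm for the user areas can be expressed as a small decision that a scheduled job makes each night. This is only a sketch: the function name backup_kind is hypothetical, and picking Sunday as the day for the weekend full backup is an assumption.

```shell
#!/bin/sh
# backup_kind DAY_OF_WEEK
# Hypothetical helper: DAY_OF_WEEK is 1 (Monday) to 7 (Sunday), as printed
# by `date +%u`. Prints "full" on Sunday (assumed to be the weekend run)
# and "incremental" on every other day.
backup_kind() {
    if [ "$1" -eq 7 ]; then
        echo full
    else
        echo incremental
    fi
}

# Decide what tonight's run should do.
kind=$(backup_kind "$(date +%u)")
echo "Tonight's backup: $kind"
```

A wrapper could then call the weekly script for "full" and the daily script for "incremental".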

In case of a crash, you would first install the bare system, perhaps after replacing the
disk, and then extract the file system states in the same sequence in which they were
backed up, so that the latest backup is extracted last. For the system backup, you’d need
just one extract if you made each incremental backup from the date of install. For the
user file systems, you would extract the full backup first and then the incremental
backups for that week.
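
The restore sequence for the user file system can be sketched as a loop over archives, oldest first. The function name restore_in_order and the archive names in the usage comment are hypothetical; the archives are assumed to be gzip-compressed tar files created relative to the directory being restored.

```shell
#!/bin/sh
# restore_in_order TARGET_DIR FULL_ARCHIVE [INCREMENTAL_ARCHIVE...]
# Hypothetical helper: extract the full backup first, then each incremental
# in the order given (oldest first), so newer copies overwrite older ones.
restore_in_order() {
    target=$1
    shift
    for archive in "$@"; do
        echo "Extracting $archive"
        # Stop at the first archive that fails to extract.
        tar -zxf "$archive" -C "$target" || return 1
    done
}

# Example:
# restore_in_order /export/home full.tar.gz mon.tar.gz tue.tar.gz
```

Date-based incrementals do not record deletions, so files removed between backups reappear after such a restore.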

Extracting all files from an archive is simple. Assuming you have archived the information
with leading pathnames (using -P), pass -P again when extracting so that the files go back
to their original absolute locations:

# tar zxvPf /dev/st0

If only a particular directory is to be extracted, specify that

particular directory as the last argument to tar.
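
A minimal sketch of such a partial extract follows. The function name extract_member and the paths in the usage comment are hypothetical; the member must be named exactly as it appears in the archive listing.

```shell
#!/bin/sh
# extract_member ARCHIVE MEMBER TARGET_DIR
# Hypothetical helper: extract only MEMBER (a file or directory, named
# exactly as it appears in `tar -ztf ARCHIVE`) into TARGET_DIR.
extract_member() {
    tar -zxf "$1" -C "$3" "$2"
}

# Example: extract_member /tmp/x.tar.gz ./Keyboard /tmp/restore
```
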

There are other options in tar: -M allows an archive to be split across tapes, and -L lets
you specify the length of the tape, after which tar asks for a fresh tape.

Typically, the data cartridges used are DDS-I, 90 meter tapes that have a capacity of
2 GB. Depending on the tape drive, compression can be enabled by using the "mt" command in
Linux. Try "mt -f /dev/st0 densities" to get a listing of the densities of various drives
and formats. DDS-II tapes give higher densities and hence more capacity.

Linux also has a driver for tape drives that attach to the floppy controller and use the
floppy drive interrupt (IRQ 6); the device is /dev/ftape. It’s not commonly used as of
now.

A commercial package available to do backups with a GUI front-end is BRU (Backup and
Restore Utility) (www.estnic.com/). One interesting option that BRU provides is to
estimate the backup size, the number of tapes required, etc. An option for a data
consistency check is also available. The big difference with BRU is its capability to
archive raw disk partitions. For example, one could use "bru -options -r /dev/sda1" to
back up the root partition of your SCSI disk. BRU is an interesting utility to use, given
that it bunches together the capabilities of tar, dd and dump (a command available to
archive file systems on Solaris (TM)).

Before I sign off, a few tips for those of you who handle sensitive data and require low
downtime. In my experience, downtime is largely due to disk crashes or power supply
failures. Disk crashes occur when you least want them to: December 31 is my favorite; it
has happened three times so far! Build backing up into your routine; automate backups when
possible. Do not automate full backups if you can avoid it; have someone around while they
are running. After backing up, ensure that the data is backed up properly by actually
extracting a file from that backup. This is still not foolproof. You might want to add a
file to the archive and then extract that file. In the process, the entire tape is
scanned, since the added file is at the end of the archive. Finally, as a rule of thumb, I
discard tapes that I use for full backups after about 60 writes; at a frequency of once a
week, they last me a good year.

What I have not mentioned here is how to manage the tapes themselves, and the details of
the tape drives. The commands shown are not optimal; they are kept simple and free of
geekish flourishes. The boxes include sample programs that do weekly and daily backups,
adapted from the GNU tar information files.

Happy archiving.
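
The verification tip above (extract a file from the fresh backup and check it against the original) can be sketched as follows. The function name verify_member is hypothetical, and the archive is assumed to be a gzip-compressed tar file or device.

```shell
#!/bin/sh
# verify_member ARCHIVE MEMBER ORIGINAL
# Hypothetical helper: extract MEMBER from ARCHIVE to standard output and
# compare it byte for byte with ORIGINAL. Returns 0 when they are identical.
verify_member() {
    # -O sends the extracted data to stdout instead of writing a file;
    # cmp -s compares silently and reports the result in its exit status.
    tar -zxOf "$1" "$2" | cmp -s - "$3"
}

# Example:
# verify_member /mnt/backups/Mar011999.tar.gz ./report.txt /export/home/report.txt \
#     && echo "backup copy matches" || echo "MISMATCH"
```
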

Linux user groups in India



There is an all-India user group, Linux-India (www.linux-india.org), with chapters in
various cities.

To subscribe to the Linux-India mailing list, send mail to
majordomo@grandteton.cs.uiuc.edu with "subscribe Linux-India" in the body of the message.



Besides, there are active user groups in many cities, including Chennai, Bangalore,
Mumbai, Ahmedabad, Delhi, and Trivandrum.


Linux software vendors




This is an incomplete list of vendors stocking Linux software. Today, you can find Linux
software on the shelves of most bookstalls, including Computer Bookshop in Mumbai,

Gangarams in Bangalore, and Ebony in Delhi.




Bangalore



G T Enterprises
913, 14th Main, 4th Cross, Maruthi Circle, Hanumantha Nagar, Bangalore
Tel: 80-6606093, 6671407
E-mail: gtintblr@blr.vsnl.net.in
Web: http://www.gtcdrom.com (prices at www.gtcdrom.com/plist.txt)

Genisys
#2, MIG, 2nd Stage, KHB Colony, Basaveswaranagar, Bangalore 560079
Tel: 80-3481443, 3481315
E-mail: kvs@blr.vsnl.net.in

Delhi

G Bhattacherjee
Tel: 11-6855711
E-mail: bgana@ndb.vsnl.net.in

Prabhakar Singh
Tel: 11-91297478
