"Linux Gazette...making Linux just a little more fun!"


Backup for the Home Network

By JC Pollman and Bill Mote


Everyone has a backup plan. Unfortunately, most of us use the "No Backup" plan.

Disclaimer: This article provides information we have gleaned from reading books, the HOWTOs, man pages, Usenet newsgroups, and countless hours banging on the keyboard. It is not meant to be an all-inclusive, exhaustive study of the topic, but rather a stepping stone from the novice to the intermediate user. All the examples are taken directly from our home networks, so we know they work.

How to use this guide:

Prerequisites: If you have Linux installed, you will have everything you need.

Backup Plan: For the home network, you have to have some sort of backup plan. Although hard drives do crash, the real value of backups is in restoring accidentally deleted or changed files. Sooner or later you will delete or change something important, and without a backup you could render your computer unbootable. I am embarrassed to admit this, but I actually deleted /root on one occasion. Note: backups should be considered compromised if you have been cracked. Backup plans need to be simple to implement or they will not get done - especially at home. A backup plan for home should cover two areas: how much you are going to back up, and how you are going to do it with the least amount of effort.

How much to backup: I try to minimize the amount I back up because storage space costs money. I only back up directories, not the entire file system. Most of /usr and /opt are on the install CD-ROM, so if the hard drive crashes, they will be reinstalled by default with a new install. /etc and /home are the most important, as they contain the configuration files and custom settings. Your backup plan should include full backups of the selected directories every so often, and then daily backups of just the changes (incremental backups).

How to backup: tape drives are usually too expensive for the home network, and floppies are impractical. (Note: I gave up on floppies when the disk count went over 132!) We believe the best compromise is using a spare hard drive. Notice we said hard drive and not partition! Every time I have had problems with hard drives, the entire drive died or became corrupted, not just a partition. Hard drives are so cheap that using one solely for backups is the most cost-efficient method. It is not the most secure way to save your files, as a cracker can get to them, but there are limits to how far we are willing to go to make home backups.
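For example, a spare drive could be formatted once, mounted, and then left in place for the backup script to use. The device name /dev/hdb1 and the mount point /backup below are only assumptions - substitute your own:

mke2fs /dev/hdb1                 # create an ext2 file system on the spare drive (erases it!)
mkdir /backup                    # create a mount point
mount /dev/hdb1 /backup          # mount the backup drive
# add a line like this to /etc/fstab so the drive is mounted at every boot:
#   /dev/hdb1   /backup   ext2   defaults   1 2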

Backup Programs: There are three common programs used for backups that come with almost all un*x distributions: tar, cpio, and dump.  Each has its strengths and weaknesses.

TAR: tar is the most commonly used backup program for small networks. It has been around a long time and will likely remain for quite some time. Most people do not know, however, that although tar was designed to put files on tape, it was not designed for backups: its purpose is to put files on tape so they can be installed on other computers. As such, its incremental backup function is weak.
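For example (the archive names and the date here are only illustrative), a full backup of /etc and /home, and an incremental backup of everything changed since a given date, could be made like this:

tar -zcvf /backup/myserver-full.tgz /etc /home                   # full backup
tar -zcvf /backup/myserver-incr.tgz --newer "26 Sep" /etc /home  # only files changed since 26 Sep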

CPIO: cpio is similar to tar in that it does not have an incremental backup function. In fact, it does not even have a "file list" function: you have to feed it the names of the files you want to archive by piping them from the find program. cpio has two advantages over tar: it creates a smaller uncompressed archive, and it does not die if part of the archive is corrupted.
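A short example (the archive path is an assumption): find supplies the file list on standard input, and cpio writes the archive to standard output:

find /etc -depth -print | cpio -ov > /backup/etc.cpio   # archive everything under /etc
cpio -ivd < /backup/etc.cpio                            # restore it later; -d recreates directories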

DUMP: dump is completely different from tar or cpio. It backs up the entire file system - not individual files. dump does not care what file system is on the hard drive, or even if there are files in the file system. It dumps one file system at a time, quickly and efficiently, and it supports 9 levels of incremental backups. Unfortunately, it does not do individual directories, and so it eats up a great deal more storage space than tar or cpio.
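For instance (the device and file names are assumptions), a level 0 (full) dump of the file system holding /home, followed later by a level 1 dump of everything changed since that level 0, might look like:

dump 0uf /backup/home-full.dump /dev/hda6   # level 0 (full) dump; u records the date in /etc/dumpdates
dump 1uf /backup/home-incr.dump /dev/hda6   # level 1 dump: changes since the last lower-level dump
restore -rf /backup/home-full.dump          # restore the whole dump into the current directory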

Our Backup Solution: Click here to see our backup script - named run-backup. Save it to your hard drive and then make it executable by typing:

chmod 777 run-backup [Enter]
What part of the script you need to modify: This script is designed to run on any computer by changing only four variables: COMPUTER, DIRECTORIES, BACKUPDIR, and TIMEDIR. Currently we are running it on two Linux boxes and two Solaris boxes. The BACKUPDIR is NFS-mounted on our machines, but it could be another hard drive on the computer. We suggest that you set this script up and run it for a month before making major changes.
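As an illustration only - the variable names are the ones the script uses, but the values below are assumptions you would replace with your own:

COMPUTER=myserver                               # used in the backup file names
DIRECTORIES="/etc /home /root /var/spool/mail"  # what to back up
BACKUPDIR=/backup                               # where the backups go (could be NFS-mounted)
TIMEDIR=/backup/last-full                       # where the date of the last full backup is kept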

What the script does: when the script is run, it first looks to see if today is the first day of the month. If it is, it makes a full backup of the files listed in the variable DIRECTORIES, names the tarball after the computer and the date, e.g. myserver-01Nov.tgz, and puts it in the BACKUPDIR directory. Since this is a unique file name, it will stay in the BACKUPDIR until you delete it. Next, if today is not the first of the month but it is Sunday, the script makes a full backup of the DIRECTORIES and overwrites the Sunday file in BACKUPDIR. In other words, there is only one Sunday file in the backup directory, and it is overwritten every Sunday. That way we do not waste much space on the hard drive but still have a full backup that is at most one week old. The script also puts Sunday's date in the TIMEDIR directory. If today is not the first or a Sunday, the script makes an incremental backup of all the files that have changed since Sunday's full backup. As such, each day's backup after Sunday should get larger than the last.

This is the trade-off: you could do an incremental backup of just the files that changed in the last 24 hours and keep each day's backup quite small, but if your hard drive goes south on Friday, you would have to restore Sunday's, Monday's, Tuesday's, Wednesday's and Thursday's backups. By doing an incremental backup from Sunday each day, the backups are larger, but you only have to restore Sunday's and Thursday's. Here is an abbreviated look at the backup directory (a sketch of the script's branching logic follows the listing):

root   828717 Oct  1 16:19 myserver-01Oct.tgz
root    14834 Oct 22 01:45 myserver-Fri.tgz
root     5568 Oct 18 01:45 myserver-Mon.tgz
root    14999 Oct 23 01:44 myserver-Sat.tgz
root  1552152 Oct 24 01:45 myserver-Sun.tgz
root     5569 Oct 21 01:45 myserver-Thu.tgz
root     5570 Oct 19 01:45 myserver-Tue.tgz
root     5569 Oct 20 01:45 myserver-Wed.tgz

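A minimal sketch of that branching logic, assuming GNU date and tar - the variable names match the script, but everything else below is illustrative rather than the script itself:

DOM=`date +%d`       # day of month, e.g. 01
DOW=`date +%a`       # day of week, e.g. Sun
DM=`date +%d%b`      # date stamp, e.g. 01Nov

if [ $DOM = "01" ]; then                  # first of the month: permanent full backup
  tar -zcf $BACKUPDIR/$COMPUTER-$DM.tgz $DIRECTORIES
elif [ $DOW = "Sun" ]; then               # Sunday: weekly full backup, save its date for the incrementals
  date +%d-%b > $TIMEDIR/$COMPUTER-full-date
  tar -zcf $BACKUPDIR/$COMPUTER-Sun.tgz $DIRECTORIES
else                                      # any other day: everything changed since Sunday's full backup
  NEWER=`cat $TIMEDIR/$COMPUTER-full-date`
  tar -zcf $BACKUPDIR/$COMPUTER-$DOW.tgz --newer "$NEWER" $DIRECTORIES
fi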

How to run the script: We run this script as a cron job at one o'clock in the morning every day (a sample crontab entry is shown below). If you need help with cron, click here. Note: the incremental backups need the time of the Sunday backup. If you start in the middle of the week, you need to create the time file in the TIMEDIR. Using the script above as an example, the file's name is myserver-full-date, and it consists of a single line:

26-Sep
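The cron entry itself is one line in root's crontab (edit it with crontab -e). The path to run-backup below is an assumption - use wherever you saved the script:

# run the backup script at 1:00 a.m. every day
0 1 * * * /root/run-backup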

Restoring: Restoring is relatively easy, with only one thing to remember: tar does not include the leading / on file names. So, if you wanted to restore /etc/passwd, you would first have to cd to /, and then type:

tar -zxvf {wherever_file_is}/myserver-Sun.tgz  etc/passwd
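If you are not sure of the exact path stored in the tarball, you can list its contents first and search for the file - the grep pattern here is just an example:

tar -ztvf {wherever_file_is}/myserver-Sun.tgz | grep passwd
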
Next month we will be discussing DHCP.


Copyright © 1999, JC Pollman and Bill Mote
Published in Issue 47 of Linux Gazette, November 1999

