# vim: set ft=text80:
# # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # #
#
#              C E D A R
#          S O L U T I O N S       "Software done right."
#           S O F T W A R E
#
# # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # #
#
# Author   : Kenneth J. Pronovici <pronovic@ieee.org>
# Project  : Cedar Backup
# Revision : $Id: INSTALL,v 1.5 2002/12/10 19:37:49 pronovic Exp $
# Purpose  : Installation instructions
#
# # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # #

#
# Background
##

The initial process of configuring the Cedar Backup package is
unfortunately somewhat complicated, and it is also difficult to script
from an installation standpoint.  (Configuration is no more difficult
than similar tasks, such as setting up a VPN, but that may or may not
be of comfort to you.)  The good news is that once you get through the
initial configuration, you'll hardly ever have to change anything.
Even better, the most typical kinds of changes (e.g. adding and
removing directories from a backup) are easy.

This file describes how to install the Cedar Backup source package,
and then also describes all of the steps that you'll need to take
after installing the package.



#
# Installing package dependencies
## 

The Cedar Backup package relies on a number of outside packages.  Before
installing Cedar Backup, you must make sure that certain dependencies
are met.

First, Cedar Backup is written in and depends on Python
(http://www.python.org), version 2.2.  Most Linux distributions already
have Python pre-packaged.  Cedar Backup also depends on the Python
4Suite XML package.  These must be installed before you install the
Cedar Backup package itself.

Client machines require:

   du 
   mount
   umount
   GNU tar
   gzip                 (optional)
   compress             (optional)
   bzip2                (optional)
   ssh/scp or rsh/rcp   (not required in a pool of one)

Master machines have the same requirements as client machines, and then
also require:

   cdrecord
   eject
   mkisofs

If you do not currently have one of these packages installed, you should
be able to find a link to it at freshmeat (http://www.freshmeat.net).
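
As a quick check before installing, the following shell sketch lists
any required command that is missing from the PATH.  The function and
variable names are illustrative and not part of Cedar Backup.  The
check only looks for a command called 'tar' and does not verify that
it is GNU tar.  Optional tools (gzip, compress, bzip2, ssh) are left
out.

```shell
# List every named command that cannot be found on the current PATH.
# Prints one missing command per line; prints nothing if all are found.
missing_commands() {
    for cmd in "$@"; do
        command -v "$cmd" >/dev/null 2>&1 || printf '%s\n' "$cmd"
    done
}

# Commands every client needs, taken from the list above.
CLIENT_COMMANDS="du mount umount tar"
# Additional commands a master needs.
MASTER_COMMANDS="cdrecord eject mkisofs"
```

On a master, for example, run:

   missing_commands $CLIENT_COMMANDS $MASTER_COMMANDS

No output means that every listed command was found.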


#
# Installing the package
##

Cedar Backup is distributed as a standard Python distutils
source distribution.  When you un-tar the Cedar Backup tarfile, you
will find a directory containing (among other things) a file called
setup.py.  Simply use the command:

   python setup.py install

to install Cedar Backup.  Note: in the command, 'python' must be
the Python version 2.2 interpreter.


#
# Determining the mode of installation
## 

Before proceeding any further, read the manpage for cback(1) and skim the
manpage for cback.conf(5).  Both manpages are included with the source
distribution.  The manpage for cback(1) details all of the options
available to the cback script (the main Cedar Backup front-end), and
also provides some information on how backups are structured. The
manpage for cback.conf(5) describes all of the different Cedar Backup
configuration options.

After you have read the manpages, decide what role the machine you're
configuring will play in a Cedar Backup pool:

   o single machine (pool of one)
   o master of a pool
   o client in a pool

You must take slightly different actions depending on the role your machine
will be taking.  


#
# Configuring a single machine (a pool of one)
##

Cedar Backup has been designed to back up many machines in a backup
pool. However, it will just as easily work for a single machine (i.e. a
backup pool of one).

All of these configuration steps should be run as the root user, except
when indicated.


Step 1: Configure your CD-R or CD-RW drive.  

   Your CD-R or CD-RW drive must either be a SCSI device or must be
   configured to act like a Linux SCSI device (for instance, IDE devices
   can use the Linux IDE-SCSI interface). For more information on how to
   configure your CD-R/CD-RW device, check out the Linux CDROM HOWTO:

      http://www.tldp.org/HOWTO/CDROM-HOWTO/

   Regardless of what type of device you have, make sure you know your
   device's SCSI address (scsibus, target, lun) and Linux device name
   (e.g. /dev/cdrw).
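
   If you are unsure of the address, 'cdrecord -scanbus' lists the
   devices that cdrecord can see.  The sketch below shows how the
   three numbers combine into the dev= argument that cdrecord itself
   accepts.  The function name is illustrative and not part of Cedar
   Backup, and how the address is written in cback.conf is described
   in cback.conf(5), not here.

```shell
# Build the dev= argument that cdrecord accepts from a SCSI address.
# Arguments: scsibus target lun (each a non-negative integer).
cdrecord_dev_arg() {
    for part in "$1" "$2" "$3"; do
        case $part in
            ''|*[!0-9]*) echo "not a number: '$part'" >&2; return 1 ;;
        esac
    done
    printf 'dev=%s,%s,%s\n' "$1" "$2" "$3"
}
```

   For a drive on bus 0, target 1, lun 0, 'cdrecord_dev_arg 0 1 0'
   prints dev=0,1,0.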


Step 2: Configure your backup user.

   You should create a user to be used for backups.  Some Linux
   distributions may come with a "ready made" backup user.  For other
   distributions, you may have to create a user yourself.  You may
   choose any id you like, but a descriptive 'backup' or 'cback'
   name works well.  

   See your distribution's documentation for information on how to
   add a user.


Step 3: Create your backup tree.

   Cedar Backup requires a backup directory tree on disk. This directory
   tree must be roughly three times as big as the amount of data that
   will be backed up on a nightly basis, to allow for the data to be
   collected, staged, and then placed into an ISO CD-ROM image on disk
   (this is one disadvantage to using Cedar Backup in single-machine
   pools, but in this day of really large hard drives, it might not be an
   issue). Note that if you elect not to purge the staging directory
   every night, you will need even more space.
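
   To get a rough feel for that number, the sketch below totals the
   current size of the directories you plan to back up and multiplies
   the result by three.  It is only an estimate: it assumes a full
   backup of each directory and ignores compression.  The function
   name is illustrative and not part of Cedar Backup.

```shell
# Estimate, in kilobytes, how large the backup tree should be on a
# single-machine pool: three times the total size of the directories
# that will be backed up.
backup_tree_kb() {
    total=0
    for dir in "$@"; do
        # du -sk prints "<kilobytes><TAB><path>".
        kb=$(du -sk "$dir") || return 1
        # Keep only the leading digits (the size in kilobytes).
        kb=${kb%%[!0-9]*}
        total=$((total + kb))
    done
    echo $((total * 3))
}
```

   For example, 'backup_tree_kb /etc /home' prints the suggested size
   in kilobytes.  Run it as root so that du can read every directory.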

   You should create a collect directory, a staging directory and a
   working (temporary) directory.  One recommended layout is this:

      /opt/
           backup/
                  collect/
                  stage/
                  working/

   If you will be backing up sensitive information (e.g. password
   files), it is recommended that these directories be owned by the
   backup user (whatever you named it), with permissions 700.
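
   The layout and permissions above could be created with a small
   shell function like the one below.  This is a sketch: the function
   name is illustrative, and the owner argument is optional because
   changing ownership normally requires root.

```shell
# Create collect/, stage/ and working/ under the given root directory
# and restrict each one to its owner (mode 700).  If a second argument
# is given, it is used as the owning user.
make_backup_tree() {
    root=$1
    owner=${2:-}
    for sub in collect stage working; do
        mkdir -p "$root/$sub" || return 1
        chmod 700 "$root/$sub" || return 1
        if [ -n "$owner" ]; then
            chown "$owner" "$root/$sub" || return 1
        fi
    done
}
```

   For the recommended layout and a backup user named 'backup', the
   call would be:

      make_backup_tree /opt/backup backup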


Step 4: Modify the backup cron jobs.

   There are four parts to a Cedar Backup run: collect, stage, store
   and purge.   The usual way of setting off these steps is through
   a cron job.  For more information on using cron, see the manpage
   for crontab(5).

   Since Cedar Backup should be run as root, you should either add
   a line such as this:

      30 00 * * * root  cback --all

   to your /etc/crontab file, or else create an executable script 
   containing just:
   
      #!/bin/sh
      cback --all

   and place that file in the /etc/cron.daily directory.

   Backing up large directories and creating ISO CD-ROM images can
   be rather intensive operations, and could slow your computer down
   significantly. Choose a backup time that will not interfere with
   normal use of your computer (3:00am is usually a reasonable time).
   Also, be careful to choose your backup time so that a backup will not
   start executing before midnight and finish after midnight. This could
   cause inconsistencies with the directory structure on disk.
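
   The midnight rule can be checked with a little arithmetic.  The
   sketch below assumes that you can estimate how long a backup run
   takes on your machine; the function name is illustrative and not
   part of Cedar Backup.

```shell
# Succeed (status 0) if a job that starts at HH:MM and runs for the
# given number of minutes finishes before midnight of the same day.
finishes_before_midnight() {
    hh=${1%%:*}
    mm=${1##*:}
    # Strip one leading zero so that 08 and 09 are not read as octal.
    hh=${hh#0}; hh=${hh:-0}
    mm=${mm#0}; mm=${mm:-0}
    [ $((hh * 60 + mm + $2)) -lt 1440 ]
}
```

   For example, a run starting at 00:30 that takes about three hours
   passes the check, while one starting at 23:30 that takes an hour
   does not:

      finishes_before_midnight 00:30 180 && echo "schedule is safe"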


Step 5: Create the Cedar Backup configuration file.

   System-wide Cedar Backup configuration is generally controlled via
   the file:

      /etc/cback.conf

   You may change the location of the configuration file using the
   --config option to the cback script.

   The configuration file is written in XML.  If you are unfamiliar 
   with XML and you would like an overview of how it works, this 
   link might be useful:

      http://www.xml.com/pub/a/98/10/guide0.html

   A sample cback.conf file is distributed with the Cedar Backup
   source package.  This file provides an example of a "real" Cedar
   Backup configuration.

   Update the configuration file using the items you configured in steps
   1-4, using the sample as a starting point and using the manpage for
   cback.conf(5) as a reference.  You will need to configure all four
   sections: collect, stage, store and purge.


Step 6: Validate your configuration file.

   Use the command:

      cback --valid

   to validate your configuration file.  This command checks that the
   configuration file can be found and parsed, and also checks for
   typical configuration problems, such as invalid CD-R/CD-RW device
   entries.

   The most common cause of configuration problems is in not closing
   XML tags properly.  Any XML tag that is "opened" must be "closed"
   appropriately. 
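
   If you want to look for unclosed tags on their own, a generic XML
   checker can help.  The sketch below uses xmllint from libxml2,
   which is an assumption: it is not one of Cedar Backup's
   dependencies and must be installed separately.  It only checks
   that the file is well-formed and does not replace 'cback --valid'.

```shell
# Succeed (status 0) if the given XML file is well-formed, that is,
# if every tag that is opened is also closed properly.
xml_wellformed() {
    xmllint --noout "$1" >/dev/null 2>&1
}
```

   For example:

      xml_wellformed /etc/cback.conf || echo "cback.conf is malformed"

   Running 'xmllint --noout /etc/cback.conf' directly prints the
   parser's error messages for the offending lines.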


Step 7: Test your backup.

   Place a valid CD-R or CD-RW disc in your drive, and then use:

      cback --all --full

   If the command completes with no output, then the backup was run
   successfully.  Just to be sure, check the log:

      /var/log/cback.log

   and also mount the CD-R or CD-RW disc to be sure that you can read
   it.

   IF CEDAR BACKUP EVER COMPLETES "NORMALLY" BUT THE DISC THAT
   IS CREATED IS NOT USABLE, PLEASE REPORT THIS AS A BUG. TO BE
   SAFE, ALWAYS ENABLE THE CONSISTENCY CHECK OPTION IN THE <store>
   CONFIGURATION SECTION.


From this point forward, your backups will run as scheduled out of cron.
Any errors that occur will be reported in daily emails to your root user
(or whichever other user receives root's email). If you don't receive any
emails, then you know your backup worked.


#
# Configuring the master of a pool
##

Cedar Backup has been designed to back up entire "pools" of machines. In
any given pool, there is one master and some number of clients. Most of
the work takes place on the master, so configuring a master is somewhat
more complicated than configuring a client.

Backups are designed to take place over an RSH or SSH connection.
Because RSH is generally considered insecure, you are highly encouraged
to use SSH rather than RSH. This document only describes how to
configure a pool to use SSH; if you want to use RSH, you're on your
own.

All of these configuration steps should be run as the root user, except
when indicated.


Step 1: Configure your CD-R or CD-RW drive.  

   Your CD-R or CD-RW drive must either be a SCSI device or must be
   configured to act like a Linux SCSI device (for instance, IDE devices
   can use the Linux IDE-SCSI interface). For more information on how to
   configure your CD-R/CD-RW device, check out the Linux CDROM HOWTO:

      http://www.tldp.org/HOWTO/CDROM-HOWTO/

   Regardless of what type of device you have, make sure you know your
   device's SCSI address (scsibus, target, lun) and Linux device name
   (e.g. /dev/cdrw).


Step 2: Configure your backup user.

   You should create a user to be used for backups.  Some Linux
   distributions may come with a "ready made" backup user.  For other
   distributions, you may have to create a user yourself.  You may
   choose any id you like, but a descriptive 'backup' or 'cback'
   name works well.  

   See your distribution's documentation for information on how to
   add a user.

   Once you have created your backup user, you must create an SSH
   keypair for it.  Log in as your backup user, and then run the
   command:

      ssh-keygen -N "" -f ~/.ssh/identity

   Make sure that ~/.ssh is owned by your backup user and has
   permissions 700, and that the files ~/.ssh/identity and
   ~/.ssh/identity.pub are owned by your backup user and have
   permissions 600.
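
   The permissions described above can be applied in one step with a
   small shell function.  This is a sketch: the function name is
   illustrative, and it does not change ownership, which you can
   confirm separately with 'ls -l'.

```shell
# Apply the permissions described above to an SSH directory holding
# the identity keypair.  The directory defaults to ~/.ssh.
secure_ssh_dir() {
    dir=${1:-$HOME/.ssh}
    chmod 700 "$dir" &&
    chmod 600 "$dir/identity" "$dir/identity.pub"
}
```

   Run 'secure_ssh_dir' as the backup user after generating the
   keypair.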


Step 3: Create your backup tree.

   Cedar Backup requires a backup directory tree on disk. This directory
   tree must be large enough to hold roughly twice as much data as will
   be backed up from the entire pool on a given night, plus space for
   whatever is collected on the master itself. This will allow for all
   three operations - collect, stage and store - to have enough space to
   complete. Note that if you elect not to purge the staging directory
   every night, you will need even more space.

   You should create a collect directory, a staging directory and a
   working (temporary) directory.  One recommended layout is this:

      /opt/
           backup/
                  collect/
                  stage/
                  working/

   If you will be backing up sensitive information (e.g. password
   files), it is recommended that these directories be owned by the
   backup user (whatever you named it), with permissions 700.


Step 4: Modify the backup cron jobs.

   There are four parts to a Cedar Backup run: collect, stage, store
   and purge.   The usual way of setting off these steps is through
   a cron job.  For more information on using cron, see the manpage
   for crontab(5).

   Since Cedar Backup should be run as root, you should add a set of
   lines such as this:

      30 00 * * * root  cback --collect
      30 02 * * * root  cback --stage
      30 04 * * * root  cback --store
      30 06 * * * root  cback --purge

   to your /etc/crontab file.

   Backing up large directories and creating ISO CD-ROM images can
   be rather intensive operations, and could slow your computer down
   significantly. Choose a backup time that will not interfere with
   normal use of your computer.  Also, be careful to choose your backup
   time so that a backup will not start executing before midnight and
   finish after midnight. This could cause inconsistencies with the
   directory structure on disk.

   You will need to coordinate the collect and purge actions on clients
   so that their collect actions complete before the master attempts to
   stage, and so that their purge actions do not begin until after the
   master has completed staging.


Step 5: Create the Cedar Backup configuration file.

   System-wide Cedar Backup configuration is generally controlled via
   the file:

      /etc/cback.conf

   You may change the location of the configuration file using the
   --config option to the cback script.

   The configuration file is written in XML.  If you are unfamiliar 
   with XML and you would like an overview of how it works, this 
   link might be useful:

      http://www.xml.com/pub/a/98/10/guide0.html

   A sample cback.conf file is distributed with the Cedar Backup
   source package.  This file provides an example of a "real" Cedar
   Backup configuration.

   Update the configuration file using the items you configured in steps
   1-4, using the sample as a starting point and using the manpage for
   cback.conf(5) as a reference.  You will need to configure all four
   sections: collect, stage, store and purge.


Step 6: Validate your configuration file.

   Use the command:

      cback --valid

   to validate your configuration file.  This command checks that the
   configuration file can be found and parsed, and also checks for
   typical configuration problems, such as invalid CD-R/CD-RW device
   entries.

   The most common cause of configuration problems is in not closing
   XML tags properly.  Any XML tag that is "opened" must be "closed"
   appropriately. 


Step 7: Test connectivity to client machines.

   This step must wait until after your client machines have been at
   least partly configured. Once the backup user(s) have been configured
   on the client machine(s) in a pool, attempt an SSH connection to each
   client. Log in as the backup user on the master, and then use the
   command:

      ssh user@machine

   where 'user' is replaced with the name of the backup user *on the
   client machine* and 'machine' is replaced with the name of the client
   machine.

   If you are able to log in successfully without entering a password,
   then things have been configured properly. Otherwise, double-check
   that you followed the user setup instructions for the master and the
   clients.
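
   With several clients, the same test can be run in a loop.  The
   sketch below assumes OpenSSH, whose BatchMode option makes ssh fail
   instead of prompting for a password.  It also assumes that the
   backup user has the same name on every client.  The function name
   and the SSH_CMD variable are illustrative and not part of Cedar
   Backup.

```shell
# Try a non-interactive SSH login to each client and report the result.
# Usage: check_clients user host [host ...]
# Returns 0 if every login worked and 1 if at least one failed.
# SSH_CMD is a hypothetical override that lets the function be tried
# out without a real network; it defaults to ssh.
check_clients() {
    user=$1; shift
    failed=0
    for host in "$@"; do
        # Run the harmless 'true' command on the client.
        if ${SSH_CMD:-ssh} -o BatchMode=yes "$user@$host" true </dev/null
        then
            echo "ok: $host"
        else
            echo "FAILED: $host"
            failed=1
        fi
    done
    return $failed
}
```

   Run it as the backup user on the master, replacing the names with
   your own backup user and client machines:

      check_clients user machine1 machine2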

   
Step 8: Test your backup.

   Make sure that you have configured all of the clients in your backup
   pool.  On all of the clients, execute:

      cback --collect --full

   (You will probably already have tested this command on each of the
   clients, so it should succeed).

   When all of the collect actions have completed, place a valid CD-R or
   CD-RW disc in the master's drive, and use this command on the master:

      cback --all --full

   If the command completes with no output, then the backup was run
   successfully.  Just to be sure, check the log:

      /var/log/cback.log

   and also mount the CD-R or CD-RW disc to be sure that you can read
   it and that it contains data for all of the clients in the pool.

   IF CEDAR BACKUP EVER COMPLETES "NORMALLY" BUT THE DISC THAT
   IS CREATED IS NOT USABLE, PLEASE REPORT THIS AS A BUG. TO BE
   SAFE, ALWAYS ENABLE THE CONSISTENCY CHECK OPTION IN THE <store>
   CONFIGURATION SECTION.


From this point forward, your backups will run as scheduled out of cron.
Any errors that occur will be reported in daily emails to your root user
(or whichever other user receives root's email). If you don't receive any
emails, then you know your backup worked.


#
# Configuring a client in a pool
##

Cedar Backup has been designed to back up entire "pools" of machines. In
any given pool, there is one master and some number of clients. Most of
the work takes place on the master, so configuring a client is a little
simpler than configuring a master.

Backups are designed to take place over an RSH or SSH connection.
Because RSH is generally considered insecure, you are highly encouraged
to use SSH rather than RSH. This document will only describe how to
configure a client to use SSH; if you want to use RSH, you're on your
own.

All of these configuration steps should be run as the root user, except
when indicated.

Step 1: Configure the master in your backup pool.

   You will not be able to complete the client configuration until at
   least Step 2 of the master's configuration has been completed. In
   particular, you will need to know the master's public SSH identity to
   go any further.

   To find the master's public SSH identity, log in as your backup
   user on the master machine, and use:

      cat ~/.ssh/identity.pub


Step 2: Configure your backup user.

   You should create a user to be used for backups.  Some Linux
   distributions may come with a "ready made" backup user.  For other
   distributions, you may have to create a user yourself.  You may
   choose any id you like, but a descriptive 'backup' or 'cback'
   name works well.  

   See your distribution's documentation for information on how to
   add a user.

   Once you have created your backup user, you must create an SSH
   keypair for it.  Log in as your backup user, and then run the
   command:

      ssh-keygen -N "" -f ~/.ssh/identity

   Make sure that ~/.ssh is owned by your backup user and has
   permissions 700, and that the files ~/.ssh/identity and
   ~/.ssh/identity.pub are owned by your backup user and have
   permissions 600.

   Finally, take the master's public SSH identity, and cut-and-paste it
   into the file

      ~/.ssh/authorized_keys

   Make sure the identity value is pasted in all on one line, and
   that the authorized keys file is owned by your backup user and has
   permissions 600.
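
   If you have copied the master's identity.pub file to the client,
   the key can be appended without cut-and-paste, which avoids
   accidental line wrapping.  The sketch below is one way to do it;
   the function name is illustrative and not part of Cedar Backup.

```shell
# Append a public key to an authorized_keys file unless that exact
# line is already present, then restrict the file to mode 600.
# Usage: authorize_key PUBLIC_KEY_FILE AUTHORIZED_KEYS_FILE
authorize_key() {
    key=$(cat "$1") || return 1
    # The key must be exactly one non-empty line.
    [ -n "$key" ] || return 1
    [ "$(printf '%s\n' "$key" | wc -l | tr -d ' ')" = 1 ] || return 1
    touch "$2" || return 1
    # -x matches whole lines only, -F treats the key as a fixed string.
    grep -qxF -e "$key" "$2" || printf '%s\n' "$key" >> "$2"
    chmod 600 "$2"
}
```

   Run it as the backup user on the client.  The path to the copied
   key file below is hypothetical:

      authorize_key /tmp/master-identity.pub ~/.ssh/authorized_keys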

   
Step 3: Create your backup tree.

   Cedar Backup requires a backup directory tree on disk. This directory
   tree must be roughly as big as the amount of data that will be backed
   up on a nightly basis (more if you elect not to purge it all every
   night).

   You should create a collect directory and a working (temporary)
   directory. One recommended layout is this:

      /opt/
           backup/
                  collect/
                  working/

   If you will be backing up sensitive information (e.g. password
   files), it is recommended that these directories be owned by the
   backup user (whatever you named it), with permissions 700.


Step 4: Modify the backup cron jobs.

   There are four parts to a Cedar Backup run: collect, stage, store
   and purge.  Clients run just the collect and purge actions.  The
   usual way of setting off these steps is through a cron job.  For more
   information on using cron, see the manpage for crontab(5).

   Since Cedar Backup should be run as root, you should add a set of
   lines like this:

      30 00 * * * root  cback --collect
      30 04 * * * root  cback --purge

   to your /etc/crontab file.

   Backing up large directories can be a rather intensive operation,
   and could slow your computer down significantly. Choose a backup time
   that will not interfere with normal use of your computer (3:00am is
   usually a reasonable time). Also, be careful to choose your backup
   time so that a backup will not start executing before midnight and
   finish after midnight. This could cause inconsistencies with the
   directory structure on disk.

   You will need to coordinate the collect and purge actions on the
   client so that the collect action completes before the master
   attempts to stage, and so the purge action does not begin until after
   the master has completed staging (putting several hours between the
   two actions is usually sufficient).


Step 5: Create the system configuration file.

   System-wide Cedar Backup configuration is generally controlled via
   the file:

      /etc/cback.conf

   You may change the location of the configuration file using the
   --config option to the cback script.

   The configuration file is written in XML.  If you are unfamiliar 
   with XML and you would like an overview of how it works, this 
   link might be useful:

      http://www.xml.com/pub/a/98/10/guide0.html

   A sample cback.conf file is distributed with the Cedar Backup
   source package.  This file provides an example of a "real" Cedar
   Backup configuration.

   Update the global configuration file using the items you configured
   in steps 1-4, using the sample as a starting point and using the
   manpage for cback.conf(5) as a reference. You will need to configure
   just the collect and purge sections.


Step 6: Validate your configuration file.

   Use the command:

      cback --valid

   to validate your configuration file.  This command checks that the
   configuration file can be found and parsed, and also checks for
   some typical configuration problems.

   The most common cause of configuration problems is in not closing
   XML tags properly.  Any XML tag that is "opened" must be "closed"
   appropriately. 


Step 7: Test your backup.

   Use the command:

      cback --collect --purge --full

   If the command completes with no output, then the backup was run
   successfully.  Just to be sure, check the log:

      /var/log/cback.log


From this point forward, your backups will run as scheduled out of cron.
Any errors that occur will be reported in daily emails to your root user
(or whichever other user receives root's email). If you don't receive any
emails, then you know your backup worked.
