1. Overview
2. Setup
2.1 Installation
2.2 Configuration
3. Manual Backups
3.1 Initial Backup
3.2 Pruning Old Backups
3.3 Failed Backups
3.4 Retired Volumes
3.5 Status Reports
4. Automatic Backups
5. Snapshots
6. Device Management
6.1 Lost Devices
6.2 Upgrading Devices
7. Restoring
7.1 Manual Restoration
7.2 Restoring With rsync
8. Links
rsbackup backs up your computer(s) to removable
hard disks. The backup is an ordinary filesystem tree, and hard
links between repeated backups are used to save space. Old
backups are automatically pruned after a set period of
time.
This guide describes how to set up and manage rsbackup.
See the man page for detailed reference information.
The systems you want to back up are called clients.
The system that has the backup hard disk(s) attached to it is
called the server, and it is on this system that the
rsbackup program runs. The server can itself be a client.
Each client must have an SSH server and rsync installed. For Debian and Ubuntu systems it should be sufficient to install them as follows (if you don’t have them already):
apt-get install openssh-server rsync
The server requires rsync and an SSH client. Again for Debian:
apt-get install openssh-client rsync
For other platforms, consult their documentation or install them from source.
The server’s root login needs to be able to SSH to each of the clients’ root logins without having to enter a password or confirm a key hash. You should consult the SSH documentation to set this up, but the general procedure, assuming you use RSA keys and OpenSSH, is as follows. If you are sufficiently familiar with SSH to do this without further documentation, skip to the next section.
On the server create an SSH key with:
sudo ssh-keygen
When asked for a passphrase, just hit return (but see
below). Then copy ~root/.ssh/id_rsa.pub to each
of the clients and append it to their
~root/.ssh/authorized_keys. At the same time,
retrieve the clients’ host key hashes with:
ssh-keygen -l -f /etc/ssh/ssh_host_rsa_key.pub
As root on the server, ssh to each of the clients
and verify their host key hashes.
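Once the keys are in place, it can be worth confirming that the server can log in to every client without interaction. The following is a minimal sketch, not part of rsbackup: the function name check_clients and the client names you pass to it are placeholders. It relies on OpenSSH's BatchMode option, which makes ssh fail rather than prompt for a password or a host key confirmation.

```shell
# check_clients CLIENT...
# Confirm that key-based root login works for each client.
# BatchMode=yes makes ssh fail instead of prompting for a password
# or asking to confirm an unknown host key.
# The client names are placeholders supplied by the caller.
check_clients() {
    cc_failed=0
    for cc_client in "$@"; do
        if ssh -o BatchMode=yes "root@$cc_client" true; then
            echo "ok: $cc_client"
        else
            echo "FAILED: $cc_client"
            cc_failed=1
        fi
    done
    return "$cc_failed"
}
```

Run it as root on the server, passing the names of your clients as arguments. A FAILED line means that client would still prompt for something, so unattended backups of it would not work yet.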
rsbackup
To install rsbackup, go to www.greenend.org.uk/rjk/rsbackup
and download the source code:
wget http://www.greenend.org.uk/rjk/rsbackup/rsbackup-1.1.tar.gz
tar xf rsbackup-1.1.tar.gz
cd rsbackup-1.1
make
sudo make install
(You will probably need to change the version number.)
On Debian systems you can use the pre-built .deb files:
wget http://www.greenend.org.uk/rjk/rsbackup/rsbackup_1.1_amd64.deb
sudo dpkg -i rsbackup_1.1_amd64.deb
(You will probably need to change the version number and perhaps the architecture.)
At this point it should be possible to read the man page, which contains reference information:
man rsbackup
What If The Server Is Also A Client?
If you want to back up the backup server itself, you don’t need to set up the server to be able to SSH to itself. See below for how to configure this.
What If I’m Not The Superuser?
rsbackup does not actually depend on being run by
the superuser, although of course its functionality will
be limited if it isn’t. However, you could for instance
use it to back up your home directory to your portable USB
disk. The setup is the same except that you do it for
your personal login rather than root and therefore don’t
use sudo.
What If I Don’t Like Empty Passphrases?
In this case you will have to find some other way of making the server’s private SSH key available when backups run. This is outside the scope of this document.
rsbackup reads a configuration file from
/etc/rsbackup/config. (Use the --config option to
override this, if you prefer another location.) You will
need to enter some global settings and then describe the
backup clients.
First you should define where backups will be stored. This guide will assume you use removable hard disks, but you can use permanently online backups too.
For each distinct backup device you need to define two
things. The first is the mount point that the device will
appear at. For example, if you have two backup disks and the
mount points are /backup1 and /backup2
you would write the following:
store /backup1
store /backup2
The second is to define device names. Device names
correspond to the contents of a single-line file
called device-id in the root of the backup device.
For instance, if you called your devices backup1
and backup2 you would write the following:
device backup1
device backup2
Of course, you must also create these files! For example:
echo backup1 > /backup1/device-id
echo backup2 > /backup2/device-id
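If you label several disks, a small helper that writes the file and reads it back can catch typos. This is only a sketch: label_device is a hypothetical name, not an rsbackup command, and it assumes the device is already mounted at the given mount point.

```shell
# label_device MOUNTPOINT NAME
# Write NAME as the single line of MOUNTPOINT/device-id, then read it
# back to confirm the contents. Hypothetical helper, not part of rsbackup.
label_device() {
    ld_mount=$1
    ld_name=$2
    if [ ! -d "$ld_mount" ]; then
        echo "label_device: $ld_mount is not a directory" >&2
        return 1
    fi
    printf '%s\n' "$ld_name" > "$ld_mount/device-id" || return 1
    # Succeed only if the file now holds exactly the expected name.
    [ "$(cat "$ld_mount/device-id")" = "$ld_name" ]
}
```

For the two disks above you would call label_device /backup1 backup1 and label_device /backup2 backup2. Do this only while the device is mounted; otherwise the file is written to the empty directory underneath the mount point rather than to the disk.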
rsbackup does not mind if the devices share a
mount point (and only one is present at a time); any device
may use any mount point as far as it’s concerned. You will
probably find it more convenient to give them separate mount
points though. If more than one device is mounted when you
make a backup, backups will be made to all of them.
You can use the --store option to select just one.
Although it would of course be convenient for users to be able
to access backups of their files directly, it would also mean
that they could go “back in time” past permission changes or
deletions of private files belonging to one another.
Therefore, the top directory of your backup devices should
(usually) be owned by root and mode 0700 (i.e. drwx------).
By default, rsbackup will insist on this, although you can use
the public option to change this behaviour.
chmod 700 /backup1 /backup2
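To check the result, you could compare the owner and mode of each store directory against what is expected. The sketch below is an illustration only: store_is_private is a hypothetical name, and it assumes GNU stat (the -c option), which BSD and macOS do not provide in this form.

```shell
# store_is_private DIR [UID]
# Succeed only if DIR is a directory owned by UID (default 0, i.e. root)
# with permission bits exactly 700.
# Assumes GNU stat; BSD/macOS stat takes different options.
store_is_private() {
    sp_dir=$1
    sp_uid=${2:-0}
    [ -d "$sp_dir" ] || return 1
    [ "$(stat -c '%u %a' "$sp_dir")" = "$sp_uid 700" ]
}
```

For example, store_is_private /backup1 && echo ok prints ok only when /backup1 is owned by root and not accessible to anyone else. The optional second argument is there for the non-superuser setup described earlier, where the store belongs to your own login.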
Remember to update /etc/updatedb.conf to exclude your backup
devices. Otherwise updatedb will spend ever-increasing
amounts of time indexing your backups.
Next you may want to define some global ageing parameters. These can be overridden for each volume you back up, so by defining global ones you are only setting defaults.
The first one you might want to set is the maximum age of the most recent backup. If any volume’s most recent backup is older than this many days then it will show up as red in the backup status report. The default is 3 days. To reduce it to (for instance) 1 day:
max-age 1
The second parameter to choose is the age at which backups are automatically pruned, i.e. deleted. The default is 366 days, ensuring that you will be able to “go back” up to a year. If you only wanted to go back a month you could reduce it as follows:
prune-age 31
Remember, these are defaults and can be overridden on a per-host or per-volume basis.
There are a few other global settings described in the man page. They will not be covered here.
The rest of the configuration file will define what to back up (and what to exclude).
There are two ways to organise your configuration file.
You can put all the configuration for all the hosts in the main configuration file.
You can put each host’s definitions in a file of its own and include them all. To do this, put a line at the end of your configuration file as follows:
include /etc/rsbackup/hosts.d
Then for each host create a file named after the host in this directory and use it to store the host’s configuration, as described below.
The Debian packaging of rsbackup uses this approach.
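As an illustration of the split layout, the sketch below writes a main configuration file and one per-host file into a scratch directory, using only directives described in this guide. The function name make_split_config is hypothetical, and the host sfere and its volumes are just sample values; adjust everything to your own setup.

```shell
# make_split_config DIR
# Write DIR/config (global settings plus an include line) and
# DIR/hosts.d/sfere (one host stanza). Hypothetical helper for
# experimenting with the layout; not part of rsbackup.
make_split_config() {
    mc_dir=$1
    mkdir -p "$mc_dir/hosts.d" || return 1
    cat > "$mc_dir/config" <<EOF
store /backup1
store /backup2
device backup1
device backup2
max-age 1
prune-age 31
include $mc_dir/hosts.d
EOF
    cat > "$mc_dir/hosts.d/sfere" <<'EOF'
host sfere
  volume root /
  volume boot /boot
EOF
}
```

After running make_split_config on a scratch directory, you can point rsbackup at the generated file with the --config option and do a dry run (rsbackup --config DIR/config --backup --dry-run) before copying anything into /etc/rsbackup.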
For each host to back up, you should write a host stanza. This will contain some host-level settings and then a volume stanza for each part of the host’s filesystem to back up.
Here is an example host stanza:
host sfere
  volume root /
  volume boot /boot
  volume home /home
    prune-age 366
    exclude /*/Desktop/Trash/*
    exclude /*/.local/share/Trash/*
    exclude /*/.mozilla/*/Cache/*
    exclude /*/.ccache
  volume var /var
    exclude /tmp/*
The meaning of this is as follows:
The first line contains the name of the host. This would normally be its DNS name (see below for an example of where it is not).
Each of the volume lines contains the name of
a volume on the host and the path to that volume. By
default, rsbackup will assume that each volume
corresponds to a (mounted) filesystem and will therefore
not back up files from other filesystems.
In this case there are four volumes. root and boot
are quite simple: all the files in them will be backed up.
home, however, is more complex. Firstly, it
has a prune-age setting to ensure that it is kept
for longer than the default lifetime. Secondly, it excludes
various trash and cache directories.
In the first three cases, it does this by backing up the
directory but not its contents; they will be empty on the backup
device. In the fourth case, it does not even back up the
directory. Note that the exclusion patterns are rooted at the
path to the volume; they are not absolute path names.
(Consult the rsync documentation for --exclude for
more information about these patterns.)
/var is similar to home in that
a temporary directory is excluded.
An important note: the indentation is not
significant to rsbackup, only to the reader.
Anything that comes after a host directive and
before the next host or volume directive is
considered part of that host. Similarly, anything
that comes after a volume directive and before the
next host or volume directive is
considered part of that volume.
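To see how that grouping rule plays out on a real file, the following sketch lists every volume together with the host it belongs to. It is a rough approximation written with awk, not rsbackup's own parser: list_volumes is a hypothetical name, and the sketch ignores include directives and anything else beyond host and volume lines.

```shell
# list_volumes CONFIG
# Print HOST:VOLUME for every volume directive in CONFIG. A volume
# belongs to the most recent host directive above it, regardless of
# indentation. Approximation only; include lines are not followed.
list_volumes() {
    awk '
        $1 == "host"   { host = $2 }
        $1 == "volume" { print host ":" $2 }
    ' "$1"
}
```

Running list_volumes on a file containing the sfere stanza above prints sfere:root, sfere:boot, sfere:home and sfere:var, one per line, whether or not the volume lines are indented.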
This example shows how to back up a host where the name
differs from the DNS name. The important part is the
hostname directive:
host lith
  hostname chymax
  volume lith /Volumes/Lith
    exclude "Temporary Internet Files"
    exclude /RECYCLER
    exclude /pagefile.sys
    exclude "/Documents and Settings/*/Local Settings/Temp"
    exclude "/System Volume Information/_restore*"
What is actually going on here is that lith is
really the Windows partition on chymax. The
computer usually runs Unix, with the Windows partition mounted
for convenience. So to get lith’s files, it is
necessary to ssh to chymax.
This example shows how to back up without using SSH at all:
host araminta
  hostname localhost
  volume root /
  volume boot /boot
  volume home /home
    prune-age 366
    exclude /*/Desktop/Trash/*
    exclude /*/.local/share/Trash/*
    exclude /*/.mozilla/*/Cache/*
    exclude /*/.ccache
  volume var /var
    exclude /tmp/*
  volume news /var/spool/news
    prune-age 14
In this case, araminta is actually the backup
server, so using SSH would mean SSHing to itself. The hostname
localhost is special-cased to avoid using SSH at all.
Here is an example of backing up a laptop:
host kakajou
  max-age 7
  volume users /Users
    prune-age 366
    exclude /*/.Trash/*
    exclude /*/Library/Caches
    exclude /*/Library/VirtualBox/Machines/*/Snapshots
    exclude /*/Library/VirtualBox/**.vdi
  volume local /local
    prune-age 366
  volume etc /etc
  volume sw /sw
This host is usually asleep or not even in the house, so
opportunities to back it up are rare. Therefore it has a
host-wide max-age setting.
Before you actually make a backup, you should do a “dry run”
to verify that rsbackup does what you expect.
rsbackup --backup --dry-run
This will print out the commands that it would run without actually executing them. It’s also a good way of verifying that the syntax of the configuration is correct.
Once you’re happy with the output, you can try making an initial backup:
rsbackup --backup --verbose
The --verbose option makes rsbackup
report what it is doing as it progresses. In normal use you
would omit it, but it’s useful when setting up and crucial when
debugging.
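If you want to chain the two steps, a small wrapper can refuse to start the real backup when the dry run fails. This is a sketch under one assumption that this guide does not confirm: that rsbackup exits with a non-zero status when the dry run hits an error such as a configuration syntax problem. The name checked_backup is hypothetical. It does not replace reading the dry-run output yourself.

```shell
# checked_backup
# Do a dry run first and only start the real backup if it succeeds.
# Assumption: rsbackup returns a non-zero exit status when the dry run
# fails (for example on a configuration error).
checked_backup() {
    if rsbackup --backup --dry-run; then
        rsbackup --backup --verbose
    else
        echo "dry run failed; not starting backup" >&2
        return 1
    fi
}
```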
Depending on how much data you have (and how fast your disks and network are) the initial backup may take a very long time. I did my initial backups inside an instance of screen so that they couldn’t be affected by logging out, etc.
For each backup of each volume, a file will be created in
/var/log/backup detailing the command executed and
any error output.
Pruning refers to deleting a volume’s old backups. During
pruning, a backup will be deleted if it is older than the
volume’s prune-age setting, with the proviso that the
most recent min-backups backups on each store will
not be deleted.
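The pruning rule can be illustrated with a short calculation for a single store. The sketch below is not how rsbackup implements pruning; it only applies the rule as stated, using the prune-age limit from the configuration section. The name prune_candidates is hypothetical, "older than" is taken to mean strictly more than the limit in whole days (the exact boundary is an assumption), and GNU date is assumed for the -d option.

```shell
# prune_candidates TODAY PRUNE_AGE MIN_BACKUPS DATE...
# Print the backup dates (YYYY-MM-DD, newest first) that the rule would
# delete on one store: older than PRUNE_AGE days, but never any of the
# MIN_BACKUPS most recent backups. Assumes GNU date (-d).
prune_candidates() {
    pc_today=$1
    pc_age=$2
    pc_min=$3
    shift 3
    [ "$#" -gt 0 ] || return 0
    pc_today_s=$(date -u -d "$pc_today" +%s)
    # Sort newest first, then skip the MIN_BACKUPS newest entries.
    printf '%s\n' "$@" | sort -r | tail -n +"$((pc_min + 1))" |
    while read -r pc_d; do
        pc_days=$(( (pc_today_s - $(date -u -d "$pc_d" +%s)) / 86400 ))
        if [ "$pc_days" -gt "$pc_age" ]; then
            echo "$pc_d"
        fi
    done
}
```

For example, prune_candidates 2010-04-01 31 1 2010-01-01 2010-03-15 2010-03-31 prints only 2010-01-01: the newest backup is protected by min-backups, and the one from 15 March is only 17 days old. With a min-backups value of 3, nothing is printed, because all three backups are protected.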
To prune any backups that are due to be deleted:
rsbackup --prune --verbose
The details of what is pruned are logged to files
in /var/log/backup.
If a backup fails then it will be left in an incomplete
state. You can tell rsbackup to pick up where it
left off simply by running it again on the same day; if however
you leave it until another day then that backup will never be
completed. To delete any incomplete backups, therefore:
rsbackup --prune-incomplete --verbose
Backups are not pruned immediately because even if they are incomplete, the portion that succeeded may be useful to reduce the amount of data copied when you retry.
When you take a volume or host out of service, you need to
tell rsbackup about this. The first part of this
is to remove the corresponding volume or
host sections in the configuration file. If you
don’t do this then rsbackup will keep on trying
to back up the obsolete volume or host.
The second part is to delete old backups for the volume and
their corresponding logfiles. If you don’t do this then the
report will complain about them. This can be done with the
--retire option:
rsbackup --retire HOST:VOLUME
...or:
rsbackup --retire HOST
All backups on available devices and their corresponding logfiles will be deleted (possibly a slow process).
If you want to keep (say) the last backup for the volume
then you should at least rename it aside, otherwise
--retire will delete it; you may prefer to tar it
up and compress it.
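Archiving a final backup before retiring a volume might look like the sketch below. It assumes the store layout STORE/HOST/VOLUME/DATE that appears in the restore example later in this guide; archive_backup is a hypothetical name, and the --numeric-owner option (available in GNU tar) is used so that the archive records numeric user and group IDs rather than names looked up on the server.

```shell
# archive_backup VOLUME_DIR DATE ARCHIVE
# Pack the backup VOLUME_DIR/DATE into the gzip-compressed tar file
# ARCHIVE, recording numeric owner IDs. Hypothetical helper; assumes
# the layout STORE/HOST/VOLUME/DATE and a tar with --numeric-owner.
archive_backup() {
    ab_volume_dir=$1
    ab_date=$2
    ab_archive=$3
    if [ ! -d "$ab_volume_dir/$ab_date" ]; then
        echo "archive_backup: no backup $ab_volume_dir/$ab_date" >&2
        return 1
    fi
    tar -C "$ab_volume_dir" --numeric-owner -czf "$ab_archive" "$ab_date"
}
```

Write the archive somewhere outside the volume's backup directory, and run the helper as root so that every file can be read. The archive holds a full copy of the files in that one backup, so it can be much larger than the space the backup appeared to occupy alongside its hard-linked neighbours.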
You can generate an HTML status report to a disk file:
rsbackup --html status.html
This will show:
[...] max-age setting for the volume and the latter is
red if there are no backups of the volume on the device.
[...] --logs option to make the logfile section
more verbose.
If you prefer plain text you can use --text instead of --html.
You can also request that the report be sent by email, with
the --email option. This is intended for use
when automating backups. The email contains both the plain
text and the HTML versions; most mail clients should be able
to pick whichever they are best able to display.
If you installed from a .deb then have a look at
the Debian-specific documentation.
Manual backups might be perfectly adequate for you. However, computers are often better at remembering to perform scheduled tasks than humans are, so it may be better to run your backups as a cron job. For example, to run your backups, with pruning and a report, at 1am every day you might use the following crontab line:
0 1 * * * rsbackup --backup --prune --email root
This will automatically do a backup every night, prune any out-of-date backups, and mail a report to the system administrator.
You might want to add (for example) a weekly prune of incomplete backups:
0 8 * * 0 rsbackup --prune-incomplete
If you use Linux LVM then you may prefer to snapshot filesystems while they are being backed up, so that there is no possibility of files changing half way through the backup. This can be achieved by adding the following lines to your configuration file:
pre-backup-hook rsbackup-snapshot-hook
post-backup-hook rsbackup-snapshot-hook
Then for each volume that is to be snapshotted, create
/snap/VOLUME on the target host. The volume
name must be the one used by rsbackup, for instance
root, boot, home or var in the first example
above. Before each volume is backed up, a snapshot will
automatically be created; it will be removed again after the
backup is complete or on failure.
There is a man page for rsbackup-snapshot-hook detailing what it does and the options it can take.
Hard disks get lost or stolen, and fail. In this
case rsbackup needs to be told that one of its
devices has gone away. The first part of this is to delete the
corresponding device directive in the configuration
file (and the store directive, if that’s unique to
the device).
The second part is to delete logfiles for backups to the
device. If you don’t do this then the report will complain
about them. This can be done with the
--retire-device option:
rsbackup --retire-device NAME
You can do these steps in either order, but if you delete
the logfiles first, you will be asked if you are sure. You can
override this with the --force option.
--retire-device will never delete actual
backups, only logfiles.
If a backup device gets full you have several options:
Reduce the number of backups to be kept on it by
lowering prune-age. But sooner or later you
may reach the point where you just cannot keep backups as
long as you like.
Introduce an entirely new, bigger device and take the old device out of service, either keeping it against a rainy day or destroying it as described above.
Copy all its contents to a new, bigger device,
keeping the same device name. Remember to delete the old
device-id file, or confusion may follow!
Backups aren’t worth anything if you can’t restore, of course.
The backup has the same layout, permissions etc as the original system, so it’s perfectly possible to simply copy files from a backup directory to their proper location. This is the most convenient way if you want to rescue only a small number of files.
Be careful to get file ownership right. The backup is stored with the same numeric user and group ID as the original system used. Put another way, the relationship between usernames and user IDs (and group names and group IDs) on the backup disk reflects the client, not the server (or any other machine the disk might be attached to).
Until a backup is completed, a corresponding
.incomplete file will exist. Check for such a
file before restoring any given backup.
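That check is easy to script. The sketch below rests on an assumption this guide does not spell out: that the marker sits next to the backup directory and is named after it with .incomplete appended (for example 2010-04-01.incomplete beside 2010-04-01). Verify this on your own store before relying on it. The name backup_is_complete is hypothetical.

```shell
# backup_is_complete BACKUP_DIR
# Succeed if BACKUP_DIR exists and has no marker file beside it.
# Assumption: the marker is named BACKUP_DIR.incomplete; check your
# store to confirm the naming before trusting the result.
backup_is_complete() {
    bc_dir=${1%/}
    [ -d "$bc_dir" ] && [ ! -e "$bc_dir.incomplete" ]
}
```

You might use it as a guard, for instance backup_is_complete /store/chymax/users/2010-04-01 || echo "do not restore from this backup".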
rsync
You can do bulk restores with rsync. For
example, supposing that host chymax has a volume
called users in which user home directories are
backed up, and user rjk wants their entire home
directory to be restored:
rsync -aSHz --numeric-ids /store/chymax/users/2010-04-01/rjk/. chymax:~rjk/.
-a means recurse into directories and preserve
symlinks, permissions, modification times, groups, owners,
device files and special files. -S means handle
sparse files efficiently. -H means preserve hard
links. -z means compress data for transfer; you
might want to omit this if your CPU is slow.
--numeric-ids is important as backups are
stored with the same numeric user and group IDs as the
original system; no translation via name is performed.
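Before a bulk restore overwrites anything, you may want to preview what rsync would change. The sketch below wraps the same options as the command above and adds rsync's --dry-run and --itemize-changes options, so that nothing is written and each affected file is listed. The name preview_restore is hypothetical.

```shell
# preview_restore SOURCE DEST
# List what a restore from SOURCE to DEST would change without
# transferring anything. Same options as the real restore, plus
# --dry-run (make no changes) and --itemize-changes (list each change).
preview_restore() {
    rsync -aSHz --numeric-ids --dry-run --itemize-changes "$1" "$2"
}
```

For the example above you would run preview_restore /store/chymax/users/2010-04-01/rjk/. chymax:~rjk/. and, once the listing looks right, repeat the original rsync command without the two extra options.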
Configuration and cronjob structure for rsbackup.deb.
Please report bugs via Github.