From the manpage of rsync(1):
rsync is a program that behaves in much the same way that rcp does, but has many more options and uses the rsync remote-update protocol to greatly speed up file transfers when the destination file is being updated. The rsync remote-update protocol allows rsync to transfer just the differences between two sets of files across the network connection, using an efficient checksum-search algorithm described in the technical report that accompanies this package.
rsync is not just any file transfer program. It is an intelligent one, widely used to mirror websites, directories, and entire filesystems. What sets rsync apart from other file transfer programs, like rcp and scp, is its ability to efficiently work out the differences between the source and destination copies of a file and to transfer only what has changed.
At work, I do all my development inside a virtual machine running Slackware. Any code I write at work and, more importantly, as part of work, I keep in a separate directory cleverly named “work” inside the /home directory. With no backup server in place yet, and fearing the day the host OS would crash or the virtual machine image would get corrupted, I brought my home laptop to work to mirror, at the very least, the work directory. I used EverythingLinux.org’s simple-to-follow tutorial to set up the rsync daemon on the virtual machine, and I call the rsync client from the laptop to back up the work directory.
While I strongly suggest reading both the manpage and the referenced tutorial thoroughly, I will, nonetheless, list the steps needed to quickly get rsync up and transferring files.
The environment I’m using is laid out like this: I want to copy my work directory, /home/work/, from a Slackware box bound to the IP 192.168.1.10 over to my laptop, also running Slackware and bound to 192.168.1.247. rsync is installed on both systems. First, I need to set up the rsync daemon on 192.168.1.10 by creating rsyncd.conf, the file from which the daemon reads its configuration settings. The rsyncd.conf(5) manpage thoroughly documents all the configuration parameters and also contains useful examples. I set up /etc/rsyncd.conf to look like this:
motd file = /etc/rsyncd.motd
log file = /var/log/rsyncd.log
pid file = /var/run/rsyncd.pid
lock file = /var/run/rsyncd.lock
[work]
path = /home/work
comment = Code repository from Work
uid = ayaz
gid = ayaz
read only = yes
list = yes
auth users = ayaz
secrets file = /etc/rsyncd.scrt
A brief explanation is in order. “path” points to the directory to be mirrored. “uid” and “gid” are the user and group IDs, respectively, under which file transfers will take place. I don’t want /home/work to be altered by any client through rsync, so I have set the module to be read only. I have also set “ayaz” as the only user allowed to connect to this module. A user:pass pair for each allowed user goes, in plain text, in the “secrets file”. If anonymous rsync is desired, the “auth users” and, consequently, “secrets file” directives should be removed.
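As a minimal sketch of the secrets file step: the file holds one user:password pair per line, and rsyncd, with its default “strict modes” setting, refuses to use a secrets file that other users can read. The password below is a placeholder, and CONF_DIR is a hypothetical variable that defaults to a scratch directory so the commands can be tried without root; on the real server it would be /etc.

```shell
#!/bin/sh
# Sketch: create the file named by "secrets file" in rsyncd.conf.
# CONF_DIR is an assumption for illustration; on the server, run as root
# with CONF_DIR=/etc so the file lands at /etc/rsyncd.scrt.
CONF_DIR="${CONF_DIR:-$(mktemp -d)}"
SECRETS="$CONF_DIR/rsyncd.scrt"

# Make sure the file is never created with group/other permissions
umask 077

# One "user:password" pair per line; CHANGE_ME is a placeholder password
printf '%s:%s\n' "ayaz" "CHANGE_ME" > "$SECRETS"

# Owner read/write only, so rsyncd's strict modes check accepts the file
chmod 600 "$SECRETS"

ls -l "$SECRETS"
```

The user name here only has to match “auth users” in the module; it is an rsync-level credential and, as far as the daemon is concerned, need not be a system account.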
The options above are a small subset of what rsyncd supports. The manpage, rsyncd.conf(5), documents every option in detail. Finally, to start the rsync daemon, I call rsync with the “--daemon” flag. The daemon runs in the background and listens on port 873 by default.
Now, over to the client side, the laptop. I create a directory, /home/work, and modify its user and group ownership to match the “uid” and “gid” set in rsync daemon’s config file. Running the rsync client is simple. Before running it, I’d suggest a thorough read of rsync’s manpage, again, rsync(1), to understand the various switches it supports and the different ways in which it can be used.
I call rsync like this:
$ rsync -avrzogtp --rsh=ssh --exclude "*~" --exclude "linux/" 192.168.1.10:/home/work/ /home/work/
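Here’s a quick description of the various switches used. “-a” turns on archive mode, which is shorthand for “-rlptgoD”, so the individual switches that follow are spelled out only for clarity. “-v” turns on verbosity in output. “-r” tells rsync to recurse into directories. “-z” enables compression. “-p”, “-o”, and “-g” preserve, in that order, the permissions, owner, and group information of the files and directories being copied. “-t” preserves file and directory timestamps. I don’t wish to send data in the clear over the wire, so I tell rsync to use “ssh” as its transport with the “--rsh” switch. Like tar, rsync supports an “--exclude” switch. I tell it to exclude any files with a trailing “~” character in their names (Vim rather stubbornly creates those) and to exclude the entire “linux/” directory. Finally, I specify the source host and source directory, followed by the local directory to which the data should be copied. One caveat: with “--rsh=ssh” and a single colon after the host, rsync logs in over SSH and starts its own rsync process on the server, so this particular command does not go through the daemon configured earlier. To address the daemon’s “work” module instead, drop “--rsh” and use the double-colon form, 192.168.1.10::work, keeping in mind that a plain daemon connection is not encrypted.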
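To see how the two exclude rules described above behave without touching a real server, here is a small sketch that runs rsync between two throwaway local directories. DEMO, SRC, DST, and the file names are all hypothetical and exist only for this demonstration; the switches are the same ones discussed above, trimmed to “-av” since archive mode already covers the rest.

```shell
#!/bin/sh
# Sketch: try the exclude patterns on scratch directories.
# DEMO, SRC and DST are hypothetical paths, not the real /home/work.
DEMO="$(mktemp -d)"
SRC="$DEMO/src"
DST="$DEMO/dst"
mkdir -p "$SRC/linux" "$SRC/project" "$DST"

echo 'int main(void) { return 0; }' > "$SRC/project/main.c"
echo 'older copy of main.c' > "$SRC/project/main.c~"   # Vim-style backup file
echo 'kernel tree placeholder' > "$SRC/linux/README"

if command -v rsync >/dev/null 2>&1; then
    # '*~' skips any name ending in "~"; 'linux/' skips directories named linux
    rsync -av --exclude '*~' --exclude 'linux/' "$SRC/" "$DST/"

    # Show what actually arrived in the destination
    find "$DST" -type f
else
    echo "rsync is not installed; skipping the demo" >&2
fi
```

The trailing slash on the source matters: “$SRC/” copies the contents of the directory, while “$SRC” without the slash would create a “src” directory inside the destination.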
And that’s it. The first time around, I was nervous running rsync. Just to be sure it wasn’t going to do anything crappy to my laptop’s filesystem (which, shamefully, isn’t backed up itself yet), I ran rsync on my laptop with the “-n” switch. It performs a dry run only: it generates a harmless list of the files it would copy from the source system and quits. Again, read the manpage for many more options.
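For anyone equally nervous, the dry-run behaviour is easy to check on scratch directories first. The sketch below assumes nothing about the real setup: DEMO, SRC, DST, and notes.txt are hypothetical names. It counts the files in the destination after a run with “-n” and again after the same command without it.

```shell
#!/bin/sh
# Sketch: a dry run lists the planned transfer but writes nothing.
# DEMO, SRC and DST are hypothetical scratch paths used only here.
DEMO="$(mktemp -d)"
SRC="$DEMO/src"
DST="$DEMO/dst"
mkdir -p "$SRC" "$DST"
echo 'hello' > "$SRC/notes.txt"

if command -v rsync >/dev/null 2>&1; then
    # -n (--dry-run): report what would be copied, change nothing
    rsync -avn "$SRC/" "$DST/"
    AFTER_DRY="$(find "$DST" -type f | wc -l | tr -d ' ')"
    echo "files in destination after dry run: $AFTER_DRY"

    # Same command without -n: the copy actually happens
    rsync -av "$SRC/" "$DST/"
    AFTER_REAL="$(find "$DST" -type f | wc -l | tr -d ' ')"
    echo "files in destination after real run: $AFTER_REAL"
else
    echo "rsync is not installed; skipping the demo" >&2
fi
```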