Philip Guo (Phil Guo, Philip J. Guo, Philip Jia Guo, pgbovine)

Lightweight File Versioning and Synchronization with Git and Unison

Today I found a lightweight and elegant solution to the problem of maintaining my personal website on the MIT webserver. I wanted a solution that met two requirements: I could edit files locally on my computer, and I would have version histories of my files, both for archival purposes and to protect against accidental mistakes.

My solution involves two programs, Unison and Git:

  • Editing HTML and other website files locally on my laptop and using Unison to synchronize them onto the MIT webserver.

    My love for Unison is no secret; I've written an extensive tutorial on its usage.

  • Using Git on my laptop to provide version histories for the necessary files within my website; the histories are then synced to the MIT webserver using Unison for backup.

    Git is the best version control system I've tried so far. It's scalable enough to use on the Linux kernel (it was originally written by the man himself, Linus Torvalds), but most impressively to me, it's simple enough that I set up a repository in less than 5 minutes on my first try by following this tutorial (which I couldn't dream of doing with CVS or SVN).
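
The two steps above (commit locally with Git, then sync with Unison) can be sketched as a short Python helper that only builds the command lines and prints them. This is a minimal sketch, not the exact commands used for this site: the local path, the remote root, and the function name publish_commands are hypothetical, and it assumes the files to track have already been added to the repository once.

```python
import shlex


def publish_commands(site_dir, remote_root, message):
    """Return the argv lists for one local commit followed by one Unison sync."""
    return [
        # Stage modifications to files that Git already tracks.
        ["git", "-C", site_dir, "add", "--update"],
        # Record the change in the local repository (no network involved).
        ["git", "-C", site_dir, "commit", "-m", message],
        # Sync the whole directory, including .git/, to the server over ssh.
        # -batch makes Unison ask no questions; conflicting files are skipped.
        ["unison", site_dir, remote_root, "-batch"],
    ]


if __name__ == "__main__":
    commands = publish_commands(
        "/home/me/website",                    # hypothetical local directory
        "ssh://me@webserver.example.edu/www",  # hypothetical remote root
        "Update home page",
    )
    for argv in commands:
        print(shlex.join(argv))
```

Printing the commands rather than running them keeps the sketch safe to try; passing each list to subprocess.run would execute it, provided Git and Unison are installed on both ends.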

I really like this solution because it's fast, conserves disk space, and is portable:

  • Fast: I do all of the editing on my own computer, so that's much faster than ssh-ing into a remote server to edit. Also, Git is blazingly fast, especially when running on my own computer rather than remotely via ssh.

  • Conserves disk space: Photographs take up the bulk of the disk space in my website, and those never change, so I don't need to place them under version control. The only files I need to add to my Git repository are HTML files (and other small text files) that don't take up much space.

  • Portable: I really like Git because it keeps all of its file metadata and version histories inside the directory tree it tracks, in a .git/ sub-directory at the top of that tree. This means that whenever my website directory is backed up or synced, the version history automatically tags along for the ride. There is no need to back up the version histories separately. If I want to edit my website on another computer, all I need to do is copy the entire directory over and make sure that I have Git installed on that machine, and I'm ready to go!
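
The disk-space point above comes down to choosing which files go into the repository. One way to sketch that choice in Python is to walk the website directory and keep only small text files, skipping photographs and Git's own metadata. The extension list and the function name files_to_version are assumptions for illustration; the original setup may have picked files by hand.

```python
from pathlib import Path

# Assumed set of "small text file" extensions; adjust to match the site.
TEXT_SUFFIXES = {".html", ".htm", ".css", ".txt", ".js"}


def files_to_version(site_dir):
    """Return sorted relative paths of text files worth tracking in Git."""
    root = Path(site_dir)
    selected = []
    for path in root.rglob("*"):
        rel = path.relative_to(root)
        if ".git" in rel.parts:
            continue  # skip Git's own metadata directory
        if path.is_file() and path.suffix.lower() in TEXT_SUFFIXES:
            selected.append(rel.as_posix())
    return sorted(selected)
```

The returned paths could then be handed to git add, while the photographs stay outside version control and are still carried along by the Unison sync.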

Contrast with my former CVS setup

For the past 2 years, I've had my entire website inside a CVS repository housed on the MIT webserver. I did a checkout on my laptop (or on any other machine where I wanted to work), and whenever I wanted to push my updates, I would do a cvs commit to the server. But that step alone wasn't adequate for those changes to be seen publicly. I also had another checkout of the repository on the MIT webserver located in my www/ sub-directory. After committing my changes from my laptop to the repository on the server, I had to then log in and run a cvs update to pull the latest changes from the repository to the www/ directory so that they could be publicly visible.

I always had to do 2 remote CVS operations whenever I wanted to update my website, and those were always soooo slow since I naively had all my files in my repository (including the large image files that don't ever change). Also, there were essentially two copies of my website on the MIT webserver (the publicly visible copy and the 'copy' that's actually the contents of the CVS repository), which took up extra disk space that counted against my quota.

I finally got fed up with how slow it was for me to update my own website using CVS and decided to switch over to the Git and Unison solution. Now I can edit and commit as many changes as I want locally, and then just run Unison once over ssh to sync my changes to the www/ sub-directory on the MIT webserver. Running Unison to sync the files is far faster than running a cvs commit over my entire website directory tree.

Created: 2007-09-14
Last modified: 2007-09-15