Crazy Backup Solution
by Steven Noonan
I am thinking about attempting to install FreeBSD on my Mac again. The last time I tried, though, it wiped my GUID Partition Table (GPT) and my Master Boot Record (MBR). So basically, it trashed the system and I had to reformat. It wasn’t lovely. So this time, I decided to back up the important things first. My source code is already backed up, thanks to Git. My preferences, bookmarks, etc. are all backed up by .Mac (oh, excuse me, “MobileMe”), so all that’s really left is my Applications folder and a few miscellaneous documents.
Normally, when backing up my Applications folder, the process is fairly straightforward:
Unfortunately, this is prone to error, especially in the third step. It’s possible to click too quickly when multi-selecting, and open dozens of applications at once. I don’t much like that. So I hacked together a fairly simple (but quite nifty) solution to deal with backing up my applications.
Let’s start with the problem: there are several applications that I shouldn’t back up, because they’re provided by the Mac OS X installer (iTunes, Front Row, etc.). In fact, I can’t think of a single application made by Apple that doesn’t either come with the OS or need an installer in order to operate correctly (e.g. Aperture, Logic, Final Cut, iLife, iWork). So these applications need to be excluded while backing up.
So I started with a simple script that filters through the applications I’ve got and finds ones that need to be backed up:
filter.sh
#!/bin/bash

BANLIST="$(cat filter-blacklist)"
WHITELIST="$(cat filter-whitelist)"

# We don't want spaces to muck up the paths.
export IFS=$'\n'

rm -f backup-queue

for a in $(find /Applications -maxdepth 1 -depth 1 -type d | sed 's/\/Applications\///g' | sort -f); do

    # Simple exclusion rule for Apple-provided apps.
    EXCLUDE=$(cat "$a/Contents/Info.plist" 2> /dev/null | grep -A 1 CFBundleIdentifier | grep com.apple)

    # We also run this through a blacklist and a whitelist, just
    # in case there's something that we _did_ want to be banned
    # or vice versa.
    if [ "$EXCLUDE" == "" ]; then
        BANNED=0
        for b in $BANLIST; do
            if [ "$a" == "$b" ]; then
                BANNED=1
                break
            fi
        done
    else
        BANNED=1
        for b in $WHITELIST; do
            if [ "$a" == "$b" ]; then
                BANNED=0
                break
            fi
        done
    fi

    # JUDGEMENT TIME
    if [ $BANNED -eq 0 ]; then
        echo $a OK
        echo $a >> backup-queue
    else
        echo $a BANNED
    fi
done
The script requires two files. A whitelist (applications that should be backed up, but are excluded by the filter) and a blacklist (applications that shouldn’t be backed up, but pass the filter). Here are mine:
filter-blacklist
Adobe Bridge CS4
Adobe Device Central CS4
Adobe Dreamweaver CS4
Adobe Drive CS4
Adobe Extension Manager CS4
Adobe Flash CS4
Adobe Illustrator CS4
Adobe Media Encoder CS4
Adobe Media Player.app
Adobe Photoshop CS4
AppleScript
Microsoft Office 2008
NetBeans.app
Utilities
VirtualBox.app
VMware Fusion.app
filter-whitelist
Plasma Pong.app
A bit of explanation on my blacklist and whitelist… Plasma Pong.app is on the whitelist because the author specified that the CFBundleIdentifier is ‘com.apple.plasmapong’, which gets caught by the anti-Apple app filter (a CFBundleIdentifier of ‘com.plasmapong.plasmapong’ would be better). The folders/applications listed on the blacklist are ones which are either provided with the OS (‘AppleScript’, ‘Utilities’), or require an installer to function (‘NetBeans’, ‘VirtualBox’, ‘VMware Fusion’).
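To make the rule above a little more concrete, here is a rough sketch of the same decision written as two small functions. It is not what filter.sh actually does internally: the names is_apple_bundle and should_back_up are made up for illustration, it assumes the Info.plist is stored as XML (just as the grep in filter.sh does), and it assumes both list files sit in the current directory.

```shell
#!/bin/bash
# Sketch only: is_apple_bundle and should_back_up are hypothetical helper
# names, not functions that exist in filter.sh.

# Succeeds if the bundle's Info.plist declares a com.apple.* identifier.
# Assumes an XML (not binary) plist, like the grep in filter.sh.
is_apple_bundle() {
    grep -A 1 'CFBundleIdentifier' "$1/Contents/Info.plist" 2> /dev/null \
        | grep -q 'com\.apple\.'
}

# Succeeds if the app named $1 (found inside directory $2) should be
# backed up. Assumes filter-blacklist and filter-whitelist are in the
# current directory, one exact name per line.
should_back_up() {
    local name="$1" dir="$2"
    if is_apple_bundle "$dir/$name"; then
        # Apple-looking apps are skipped unless whitelisted.
        grep -Fxq -- "$name" filter-whitelist 2> /dev/null
    else
        # Everything else is kept unless blacklisted.
        ! grep -Fxq -- "$name" filter-blacklist 2> /dev/null
    fi
}
```

Something like `should_back_up "Plasma Pong.app" /Applications && echo OK` would then give the per-app verdict. Using `grep -Fxq` matches whole lines literally, so a name with spaces or dots in it needs no special handling.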
So anyway, the script uses these two files and its filter to figure out what apps should be included or excluded. Let’s look at what the script outputs in my case:
Alcarin:Applications steven$ ./filter.sh
0xED.app OK
Address Book.app BANNED
Adium.app OK
Adobe Bridge CS4 BANNED
Adobe Device Central CS4 BANNED
Adobe Dreamweaver CS4 BANNED
Adobe Extension Manager CS4 BANNED
Adobe Flash CS4 BANNED
Adobe Illustrator CS4 BANNED
Adobe Media Encoder CS4 BANNED
Adobe Media Player.app BANNED
Adobe Photoshop CS4 BANNED
Angband.app OK
Anxiety.app OK
Aperture.app BANNED
AppFresh.app OK
AppleScript BANNED
AppZapper.app OK
Arora.app OK
Audacity.app OK
Automator.app BANNED
BetterZip.app OK
blender 2.48a OK
Braid.app OK
Calculator.app BANNED
Canary.app OK
CandyBar.app OK
Chess.app BANNED
Chmox.app OK
coconutBattery.app OK
coconutIdentityCard.app OK
Coda OK
Colloquy.app OK
CrossOver Games.app OK
CrossOver.app OK
Cyberduck.app OK
DAA Converter.app OK
Darwinia.app OK
Dashboard.app BANNED
Dictionary.app BANNED
Disco.app OK
Dock Library.app OK
DOSBox.app OK
Doukutsu.app OK
DropCopy.app OK
DVD Player.app BANNED
Dwarf Fortress 0.28.181.40d OK
Expose.app BANNED
Firefox.app OK
Flip4Mac OK
Flock.app OK
Font Book.app BANNED
Freeciv.app OK
Front Row.app BANNED
FrostWire.app OK
Geekbench (64-bit).app OK
Geekbench (Rosetta).app OK
Geekbench.app OK
Gimp.app OK
GrandPerspective.app OK
GridWarsOSX OK
Hacker Evolution Untold.app OK
HandBrake.app OK
iCal.app BANNED
iChat.app BANNED
Image Capture.app BANNED
Inkscape.app OK
iPodDisk.app OK
iStumbler.app OK
iSync.app BANNED
iTunes.app BANNED
Jaikoz.app OK
Leopard Cache Cleaner.app OK
LiquidMac.app OK
Little Snitch Configuration.app OK
MacHeist Chat.app OK
MacPorts OK
Mactracker.app OK
Mail.app BANNED
Microsoft Office 2008 BANNED
Multiwinia.app OK
NetBeans OK
NetNewsWire.app OK
Nocturne.app OK
Opera 10 beta.app OK
Opera.app OK
Photo Booth.app BANNED
Picasa.app OK
Picturesque.app OK
Pixelmator.app OK
Plasma Pong.app OK
Preview.app BANNED
Privateer - Ascii Sector OK
Quicksilver.app OK
QuickTime Player.app BANNED
RealPlayer.app OK
Safari.app BANNED
Scribus.app OK
SeaMonkey.app OK
Senuti.app OK
Sequel Pro.app OK
Shimo.app OK
SiteSucker.app OK
SketchFighter 4000 Alpha OK
Skitch.app OK
Skype.app OK
smcFanControl.app OK
Smultron.app OK
Spaces.app BANNED
Speed Download 5 OK
Stickies.app BANNED
SubEthaEdit.app OK
System Preferences.app BANNED
TextEdit.app BANNED
The Unarchiver.app OK
Time Machine.app BANNED
Transmission.app OK
TrueCrypt.app OK
Twitterrific.app OK
Unangband.app OK
Utilities BANNED
Ventrilo.app OK
VLC.app OK
VMware Fusion.app BANNED
Vuze.app OK
WebKit.app OK
World of Goo.app OK
X-Chat Aqua.app OK
Xbench.app OK
Xslimmer.app OK
Yep.app OK
ZAngband.app OK
Alcarin:Applications steven$
Alright, looks good. filter.sh also generates a file called ‘backup-queue’ which contains all the apps which passed the filter. So now, step two is a two-part process:
- Archive the apps (gzipped tarball works for this)
- Copy the tarballs to a network share
So I created a couple scripts for this (they can be run in parallel, which is fantastic for a couple reasons which I’ll outline below).
backup.sh
#!/bin/bash

cd /Applications

QUEUE="$(cat backup-queue 2> /dev/null)"

# We don't want spaces to muck up the paths.
export IFS=$'\n'

echo "Running backup queue..."

rm -iv *.tar.gz.lock *.tar.gz

for a in $QUEUE; do
    echo -n "Creating $a.tar.gz... "

    # Create a lock so that the move script won't touch
    # the file until we're done with it here.
    touch "$a.tar.gz.lock"

    # Do the actual work.
    tar -czpf "$a.tar.gz" "$a"

    # Mark this tarball complete.
    rm -f "$a.tar.gz.lock"

    echo "OK"
done

echo "All done."
move.sh
#!/bin/bash

DESTINATION="$1"

# The user doesn't know what he's doing, clearly.
if [ "$DESTINATION" == "" ]; then
    echo "Please provide a destination directory."
    exit 1
fi

# I spoke too soon. _NOW_ the user has no idea what they're doing.
if [[ ( ! -d "$DESTINATION" ) || ( ! -w "$DESTINATION" ) ]]; then
    echo "Destination specified is not a writable directory."
    exit 1
fi

# We don't want spaces to muck up the paths.
export IFS=$'\n'

cd /Applications

while true; do

    # Stays zero unless something actually gets moved.
    DIDWORK=0

    QUEUE="$(ls | grep tar.gz$)"

    for a in $QUEUE; do

        # If the lock file for the tarball doesn't exist,
        # we assume that backup.sh has finished its work.
        if [ ! -f "$a.lock" ]; then
            DIDWORK=1
            echo "Moving $a to $DESTINATION/..."

            # Finally move it to the backup storage
            # directory.
            mv "$a" $DESTINATION/ &> /dev/null
        fi
    done

    # We sleep if no work is done because otherwise
    # this infinite loop would waste quite a few CPU
    # cycles.
    if [ $DIDWORK -eq 0 ]; then
        echo "Nothing to do. Sleeping..."
        sleep 2.5s
    fi
done
You might wonder why I didn’t just have it tarball directly to the remote server. The reason is fairly simple: if the network is slow enough, the archive can’t be written across it fast enough, and the archiving process grinds to a halt while waiting for the network to catch up. And if the bottleneck happens to be your archiving speed, running the two scripts side by side doesn’t slow it down, either. This parallel method keeps both the archiver and the network as busy as possible.
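The only thing tying backup.sh and move.sh together is the lock file, so here is a small sketch of that handshake on its own. It is only an illustration: archive_one and move_ready are hypothetical names, the two temporary directories stand in for /Applications and the network share, and the "in progress" archive is faked with touch.

```shell
#!/bin/bash
# Sketch only: archive_one and move_ready are hypothetical names, and the
# temporary directories stand in for /Applications and the network share.

# Archive directory $1 into $1.tar.gz, holding a lock file while tar runs.
archive_one() {
    touch "$1.tar.gz.lock"
    tar -czpf "$1.tar.gz" "$1"
    rm -f "$1.tar.gz.lock"
}

# Move every finished tarball in the current directory to directory $1.
# A tarball counts as finished once its .lock file is gone.
move_ready() {
    local t
    for t in *.tar.gz; do
        [ -e "$t" ] || continue        # the glob matched nothing
        [ -f "$t.lock" ] && continue   # still being written, leave it
        mv "$t" "$1/"
    done
}

work="$(mktemp -d)"
share="$(mktemp -d)"
cd "$work" || exit 1
mkdir "Some App.app" "Other App.app"

# One archive that has finished...
archive_one "Some App.app"
# ...and one that we pretend is still in progress.
touch "Other App.app.tar.gz" "Other App.app.tar.gz.lock"

move_ready "$share"
ls "$share"
```

Only the finished tarball should end up in the share; the one that still has a lock stays put until a later pass. Because the lock is created before tar writes its first byte, the mover should never see a half-written tarball without a lock next to it.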


