31 July 2014

Find Bug!

No-one is perfect. Sometimes you'll write code and think it works fine; you've tested all the edge-cases you can think of and nothing seems amiss. It won't be until months later that you're using that code and see it stumble over something it shouldn't. Something that couldn't possibly go wrong just did, right before your eyes.

In my case, the vigil.pl program I wrote in Integrating Integrity stumbled on some directories with Unicode in their names. It's a good thing I noticed it, because I'd started to rely on my little program more and more recently. Let's figure out what happened and fix the bug.

20 July 2014

Get S.M.A.R.T.

As followers of this blog may know, I've been having a cacophony of hardware problems lately. Most of them revolve around that one inevitability of packing more and more data into tinier and tinier spaces: hard disk corruption. I've been busy moving my vital data onto an older machine of mine and setting it up to host all my source code, so now is a great time to get paranoid about disk integrity.

10 July 2014

Oh no not again

So, I was doing a routine upgrade of my (very) old laptop the other day. It no longer has a working battery and doesn't quite have the power I want for modern day-to-day stuff, but it's served me very well as an SSH gateway, Subversion server and a place to keep my IRC session idling. Old laptops can make really nice servers since they're typically quiet, draw little power, and come with their own keyboard and monitor.

But I digress.

I was upgrading the packages, and one of those was a kernel update, so I rebooted for the first time in months and...

Remember how I had hard drive problems recently? Yeah.

30 June 2014

Decimating Directories

Whenever you set up some automated system that produces files, there's always that nagging fear that you'll forget about it and it will run rampant, filling up your hard drive with clutter. One good example is using motion as a security system - you want to keep the most recent video clips in case you need to refer back to them, but there's little point in keeping the oldest ones.

Keeping only the most recent n videos and deleting the rest could be problematic, because the individual files could be large. Keeping anything younger than a certain number of days is no good, because there could be a burst of activity that creates a lot of files. So we want to make a script that will trim a directory of files down to a specific size.
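To make the idea concrete, here is a minimal sketch of the "trim to a size budget" approach: sort the files oldest-first and delete until the total fits. It is written in Python purely for illustration and is not the script from the post; the names `plan_trim` and `trim_directory` and the `dry_run` flag are my own assumptions.

```python
from pathlib import Path


def plan_trim(files, max_bytes):
    """Pick which files to delete so the total size fits in max_bytes.

    files is an iterable of (path, size_in_bytes, mtime) tuples.
    Returns the paths to delete, oldest first.
    """
    entries = sorted(files, key=lambda entry: entry[2])  # oldest first
    total = sum(size for _, size, _ in entries)
    doomed = []
    for path, size, _mtime in entries:
        if total <= max_bytes:
            break  # already within budget; keep everything newer
        doomed.append(path)
        total -= size
    return doomed


def trim_directory(directory, max_bytes, dry_run=True):
    """Trim the regular files directly inside directory (hypothetical helper).

    With dry_run=True nothing is deleted; the would-be victims are
    only returned so they can be inspected first.
    """
    files = []
    for path in Path(directory).iterdir():
        if path.is_file():
            info = path.stat()
            files.append((path, info.st_size, info.st_mtime))
    doomed = plan_trim(files, max_bytes)
    if not dry_run:
        for path in doomed:
            path.unlink()
    return doomed
```

Keeping the decision (`plan_trim`) separate from the deletion makes it easy to check what would go before anything is actually removed, which seems wise for a script whose whole job is deleting files.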

30 May 2014

Integrating Integrity, part 2

Last post, I made a Perl script to generate MD5 checksums for me, while displaying a progress bar. Now I want to expand its functionality to generate a .md5sum file listing the MD5 for everything in a given directory, or check all the files in the list to see if their actual MD5 matches the 'correct' one. I will also set things up so that any checksum mismatches or other errors are reported at the end of the run so that they aren't pushed off the terminal's scrollback buffer when working with a large list of files.
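As a rough illustration of the generate-then-check workflow described above, here is a small Python sketch. It is not the Perl script from the post: the function names are hypothetical, there is no progress bar, and I am assuming the list uses the same "checksum, two spaces, relative path" layout that md5sum prints. Problems are collected in a list and returned, so the caller can print them all at the end of the run.

```python
import hashlib
from pathlib import Path


def md5_of(path, chunk_size=1 << 20):
    """Return the hex MD5 of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def generate(directory, list_name=".md5sum"):
    """Write a checksum list for every file under directory.

    Returns the number of files listed. The list file itself is skipped.
    """
    root = Path(directory)
    lines = []
    for path in sorted(root.rglob("*")):
        if path.is_file() and path.name != list_name:
            relative = path.relative_to(root).as_posix()
            lines.append(f"{md5_of(path)}  {relative}")
    (root / list_name).write_text("\n".join(lines) + "\n", encoding="utf-8")
    return len(lines)


def check(directory, list_name=".md5sum"):
    """Verify every entry in the list; return problems instead of printing.

    An empty result means every listed file matched its recorded checksum.
    """
    root = Path(directory)
    problems = []
    listing = (root / list_name).read_text(encoding="utf-8")
    for line in listing.splitlines():
        if not line.strip():
            continue
        expected, _, name = line.partition("  ")
        try:
            actual = md5_of(root / name)
        except OSError as exc:
            problems.append(f"{name}: {exc.strerror}")
            continue
        if actual != expected:
            problems.append(f"{name}: checksum mismatch")
    return problems
```

This sketch does not handle file names containing newlines, and it only notices files that are in the list; files added to the directory later are silently ignored.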

02 May 2014

Integrating Integrity, part 1

One of the most important parts of any backup solution is being able to identify when files have become corrupt due to a failing disk. Ideally, we'd be able to identify impending failure before the excrement hits the rotational cooling device, but we don't always have that luxury. I intend to cover things like S.M.A.R.T. disk checks in a later post; for today, I want to address per-file integrity checking. Because the only thing worse than having no backup is having a backup of the already-corrupt data.

md5sum has been my go-to tool for this in the past. The checksums it generates are slightly better than the old CRC32 method, and it's ubiquitous. While it is important to realise it is not a cryptographically secure checksum and cannot protect against malicious tampering, it is a very effective way to check a file for damage. However, while the venerable md5sum command works perfectly fine, I really want to make my own version with a few improvements that I find myself wanting.

04 April 2014

Rescue me!

So, I had hard drive problems recently. And then when I went to check out the backup I'd made - lucky me that I even had a semi-recent backup! - the drive that held the backup was also failing and corrupting the backup data.

The moral of this story is: BACKUPS! OMG BACKUPS! BACKUP YOUR SHIT RIGHT NOW! AND THEN BACKUP THE BACKUP! which is of course common sense and you don't need me to tell you that backups are important, gee la, you're not an idiot, you're one of my super-smart well-informed readers whom I respect 100%. Still, though, ... when was that last backup you did, exactly? Can you verify the integrity of the backup? Does it cover every single file you might want to restore in the event of a disksplosion?

I guess there is one upside to this: it's got me in a blogging mood again (finally!) so you can expect a series of posts dealing with:

  • Preparing for the inevitable failures
  • Watching for signs of early failure
  • Creating the One True Backup System that can finally give you peace of mind

Today? Let's get the ball rolling with a small post showing you how to set up a GRML rescue system embedded in your Ubuntu or Debian install.

18 November 2013

Naming Names, Part 2

Remember how, in the last post, I invited you all to join me "next week" for part 2? What a funny joke that was! In the meantime I've been busy being sick, getting sucked back into WoW thanks to a "gift" from a "friend", and have been dealing with hilariously bad hardware failures. But I'm finally back to finish what I started, because I owe you all that much.

In Part 1, we converted a crufty old shell script of mine to its Perl 5 equivalent, and then built it up a little to be smarter about how it renames files.

In Part 2, we go nuts with feature-creep. Come along for the ride.