Handling data recovery

I took a strange phone call from the field today, asking for advice about creating policies and procedures on data recovery.

There’s no easy answer.
I don’t like trying to create those types of policies and procedures, because there are certain things that a professional should know how to do. Things like making backups, restoring backups, and replacing disks are pretty basic system administration tasks. It’s like worrying about whether an electrician knows what gauge of wire to use. If I don’t already know the questions to ask to determine whether an electrician is qualified or competent, a five-minute phone call isn’t going to give me those questions. And if an electrician does something obviously incompetent, you don’t need policies and procedures–you need a new electrician.

There are really two issues at play here: recovering the data, and finding qualified sysadmins so you won’t have to recover the data another time. Let’s tackle the two issues separately.

If you’re in a bind and you have to use an outside firm, you call Ontrack. Period. Yes, you can probably find somebody closer and cheaper, but when it comes to data recovery, close and cheap aren’t what you want.

The average black-market value of a business laptop is $50,000. The laptop is worth a few hundred bucks. The data is worth the rest.

So the data on your server is worth more than that. The money you’ll pay Ontrack to get back what they can–which could be a few hundred or a few thousand dollars–isn’t much compared to that. It’s not worth messing around with the guy who says he’ll see what he can do for $200. The fewer people who mess with it, the more data Ontrack will be able to get back for you.

That leaves the question of how you find competent IT professionals. I’ve seen several approaches to that, but the one I like best is something I read in a trade magazine a few years ago. The columnist (and I forget who it was) related how hospitals hire management. First and foremost, the person running the hospital is a practicing doctor. And the only people who can assess a doctor’s competence are other doctors. So part of the interview is sitting the candidate down with several respected doctors, letting them talk shop, then asking those doctors what they think.

The same approach works in IT. I’ve been working in IT professionally for 17 years now (I know because my boss just asked me last week). Let me talk shop for a few minutes, and I know a competent IT professional when I’m talking to one. I also know an incompetent one when I’m talking to one.

The first shop I worked in discovered this accidentally. For lack of any other way to interview people, they’d sit candidates down with me and whoever else was available and just let us talk shop for a few minutes. If two candidates had roughly equal qualifications on paper, they hired the guy who made me frown less in the interview.

What kinds of questions did I ask? It really depends, but basically we’d end up talking war stories. Tell me about a problem you fixed. And I’m a whole lot less concerned about how long it took to fix the problem than I am about the methodology. Because if the candidate tried 25 different things, and those are 25 things that usually work, that’s good. Knowing to try those 25 different things gives you good coverage in the day-to-day operations, because those problems that take a week working around the clock to solve are exceptionally rare. Hire someone who knows how to solve 25 different problems, and chances are there’ll be a lot of problems that never show up on the radar because that person solved them before they turned into big problems.

Everyone gets in over their head sometimes. Hearing what they did to get out is an easy way to get a good idea in a few minutes how deep the guy’s knowledge is.

I’ve also interviewed for jobs that hammer you with questions. Questions like what’s the difference between RAID 0 and RAID 1 (RAID 0 is striping with no redundancy; RAID 1 is mirroring) and what’s the only thing root can’t do (write to a read-only device–yes, it’s a terrible question). That tests someone’s book knowledge, and it gives you some idea of what the person is like under stress, but it really doesn’t tell you much about how the person solves everyday problems. I learn a lot more when I hear about the 25 things the interviewee tried when a database server died suddenly in the middle of the night.
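For anyone rusty on the RAID answer, here is a toy sketch of the difference in Python. It is only an illustration, not how a real controller works: the "disks" are in-memory byte arrays, a failed disk is represented by `None`, and the function names and the 4-byte stripe size are my own assumptions.

```python
def raid0_write(data, disks=2, stripe=4):
    """RAID 0 (striping): deal fixed-size chunks across the disks round-robin."""
    out = [bytearray() for _ in range(disks)]
    for i in range(0, len(data), stripe):
        out[(i // stripe) % disks] += data[i:i + stripe]
    return out


def raid0_read(disks, stripe=4):
    """Reassemble the chunks in order. Any missing disk means the data is gone."""
    if any(d is None for d in disks):
        raise OSError("RAID 0: a disk failed and there is no redundancy")
    out = bytearray()
    offsets = [0] * len(disks)
    turn = 0
    while any(offsets[k] < len(disks[k]) for k in range(len(disks))):
        k = turn % len(disks)
        out += disks[k][offsets[k]:offsets[k] + stripe]
        offsets[k] += stripe
        turn += 1
    return bytes(out)


def raid1_write(data, disks=2):
    """RAID 1 (mirroring): every disk holds a complete copy."""
    return [bytearray(data) for _ in range(disks)]


def raid1_read(disks):
    """Read from the first surviving mirror."""
    for d in disks:
        if d is not None:
            return bytes(d)
    raise OSError("RAID 1: every mirror failed")


if __name__ == "__main__":
    data = b"ABCDEFGHIJ"

    mirrored = raid1_write(data)
    mirrored[0] = None                      # lose one disk
    print(raid1_read(mirrored))             # the surviving copy is still readable

    striped = raid0_write(data)
    striped[1] = None                       # lose one disk
    try:
        raid0_read(striped)
    except OSError as err:
        print(err)
```

Lose one disk from the mirror and you can still read everything. Lose one disk from the stripe set and you are on the phone with a recovery firm.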

So how do I recognize whether those 25 things would actually do any good? From spending 17 years in the trenches of IT and getting in over my head myself. You won’t find it in a book anywhere. If you don’t have in-the-trenches experience yourself, you’ll have to rely on someone who does for part of the interview process.

It’s one reason I’m doing my current project. I could have bought a refurbished PC and loaded a server appliance on it, but the hands-on experience helps keep me current. In the long run, it’s better for me to flail around a few days trying to get Nginx and PHP talking to MySQL, then getting my data into it.

That’s how you find good people. And then the very best thing you can do is stay out of their way and let them do their job.

I don’t micromanage the electrician and the plumber when they work on one of my houses, and I don’t micromanage my accountant when he does my taxes. I tell them what I need done, give them my phone number in case there’s something they need to tell me, and I rely on their professional judgment and experience. The same approach works with technicians and sysadmins.
