Why you need to guard your Backup Exec servers

If you have a Windows domain, there’s a fairly good chance you have Backup Exec servers, because you probably want to take backups. Because you need them. (As a security guy, I no longer care how you get backups; just that you’re getting them somehow.) Backup Exec is a popular solution for that. But there’s a problem.

A security problem, that is. The quality of Backup Exec as a product hasn’t been my problem since 2005. The problem I have with it now is that Backup Exec stores its passwords in a database. The passwords are encrypted, but it’s possible to decrypt the backup copy, if you’re determined enough.


Microsoft getting into the backup business?

I take issue with this Register story, which says Veritas has a better name in the storage arena than Microsoft.

Enron has a better name in the storage arena than Veritas. Ditto BALCO and FEMA and Michael Jackson and Martha Stewart.

So Microsoft wants to get into the backup business? Good.

I gave three of the best years of my life to the shrink-wrapped stool sample that is Backup Exec. I believed, wrongly, that the Constitution protects sysadmins like me from that piece of software in the clause that mentions cruel and unusual punishment.

After that last job put me out with Thursday night’s garbage, one question I always asked on job interviews was what they used for tape backups. Had anyone said Backup Exec, I would have walked out of the room immediately.

Nobody did. That was good. There are still some smart people in the world. My confidence in humanity was somewhat restored.

Microsoft’s offering will no doubt have problems, but when batch files and Zip drives are more reliable than your competition, who cares? Backup software is one area that desperately needs some competition. Microsoft entering with its usual less-than-mediocre offering will force everyone else with their less-than-mediocre offerings to either improve or die, because Microsoft’s offering will be cheaper, and there will be people who will assume that Microsoft’s offering will work better with Windows because nobody knows Windows better than Microsoft. (In this case, that assumption might actually be true.)

What’s wrong with Backup Exec? Ask your friendly neighborhood Veritas sales rep what they’ve done about these issues:

If a Backup Exec job backing up to disk contains both disk and system state data and it’s the second job to run on a given night, it will fail just as certainly as the sun coming up the next morning. Unless they finally managed to fix that bug, but I doubt it. I sure reported it enough times.

Remote backups happening over second-tier switches (D-Link, Linksys, Netgear, and other brands you find in consumer electronics stores) usually fail. Not every time. But more than half the time.

Those are just the problems I remember clearly. There were others. I remember the Oracle agent liked to die a horrible death for weeks at a time. I’d do everything Veritas support told me to do and it’d make no difference. Eventually it’d right itself and inexplicably run fine for a few months.

Maybe competition will fix what support contracts wouldn’t. And if it doesn’t, maybe Backup Exec will die.

And if Backup Exec must die, I want to be part of that execution squad. Remember that scene in Office Space with the laser printer and the baseball bat?

I never thought I’d say this, but now I’m saying it.

Welcome, Microsoft.

Fixing Backup Exec with Hisecweb installed

If you run your web servers on Windows under IIS, you’d better install the Hisecweb security template unless you want to find yourself hosting a warez site.

But Hisecweb breaks Backup Exec. So what do you do when upgrading to Apache and Linux isn’t a solution?

The problem is that Hisecweb makes the system state (shadow copy components in Windows 2003) and SQL Server not show up in the selection list. Not only do they not show up in the selection list, Backup Exec cannot find the resources at all. So backups fail, and if you have to restore from them, you won’t have the registry or a number of system files, which vastly reduces the value of your backup.

The solution is to tell Backup Exec not to use null sessions on those components, which seem to be one of the many things disabled by Hisecweb. On the server being backed up, go into Services and stop your Backup Exec Remote Agent. Now, fire up Regedit. Navigate to HKLM\Software\Veritas\Backup Exec\Engine\NTFS and locate the value called Restrict Anonymous Support. Set this value to 1. Close the registry editor and start the Backup Exec Remote Agent service again.

SQL Server and the system state or shadow copy components should now show up in the selection list for the server you just changed.

This registry hack can also fix visibility problems when the two machines are on different sides of a firewall.
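If you have to make this change on more than one server, the registry edit described above could be captured in a .reg file instead of clicking through Regedit each time. Here is a minimal sketch in Python that only builds the file text. The function name is made up for illustration, and it assumes the value is a DWORD, which the steps above don't actually state, so check the existing value's type in Regedit first.

```python
def restrict_anonymous_reg(enabled: bool = True) -> str:
    """Build .reg file text that sets 'Restrict Anonymous Support'.

    Hypothetical helper: it only produces text and does not touch the registry.
    """
    key = r"HKEY_LOCAL_MACHINE\SOFTWARE\Veritas\Backup Exec\Engine\NTFS"
    value = 1 if enabled else 0
    lines = [
        "Windows Registry Editor Version 5.00",
        "",
        f"[{key}]",
        # Assumption: the value is a DWORD; the write-up only says "set it to 1".
        f'"Restrict Anonymous Support"=dword:{value:08x}',
        "",
    ]
    # .reg files conventionally use Windows line endings.
    return "\r\n".join(lines)


if __name__ == "__main__":
    print(restrict_anonymous_reg())
```

Save the output to a file and import it on the server while the Remote Agent service is stopped, then start the service again, same as in the manual steps.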

Moral Dilemma

I saw the following in one of my Backup Exec failure logs (directory names changed slightly to protect the client’s name, and me):

Directory F:\ITWEB\Flash Stuff\Welcome Page Animations was not found, or could not be accessed.
None of the files or subdirectories contained within will be backed up.

Hmm. Flash animations.

I’m torn. My duty to the client who is paying me, of course, is to fix the problem so the file is backed up.

But they’re blinky, annoying Flash animations. Flash, of course, is the third worst thing to ever happen to the Internet, behind popups and spam. OK, it’s the fourth worst thing. I’ll put it behind spam. But I’ll even put it ahead of Microsoft Internet Exploiter.

So an opportunity to snuff out some blinky Flash animations that have been foisted on the world is a great temptation.

Or am I the only one who feels this way about Flash?

Incidentally, I turn off animated GIFs too–I find a Web without animated GIFs and Flash is a much more pleasant place. I don’t know if that makes me boring and extremist or what.

I’ve been messing around with Backup Exec 10

Veritas is trying mightily to unseat Microsoft as my least-favorite software company. I do believe Backup Exec to be the worst piece of software of any kind on the market. In fact, babysitting Backup Exec is the reason I haven’t been around much.

I’m looking to version 10 for some relief (and the much-needed 1.0 quality that Microsoft usually delivers around version 3–when Veritas will deliver it probably is an interesting Calculus problem).

The downside to version 10: I’m told there’s no more Windows NT 4.0 support. Can’t back ’em up. I haven’t actually tried installing the remote agent on an NT4 box to see if it’s unsupported as in we-won’t-help-when-it-breaks or unsupported as in no-can-do. Smart businesses hocked their NT4 servers a couple of years ago. I won’t say anything else, except that not every business is smart.

More downside: If a tape fills up and you can’t change it because the server is offsite and/or behind locked doors that require approval from 14 middle managers and a note from your mother to get to, under some circumstances Backup Exec 10 will hang indefinitely while cancelling the job. Version 9 had the same problem. Bouncing the services will usually relieve the hang, but sometimes you have to reboot.

It’s tempting to put Backup Exec and your tape drive on your biggest file server to get faster backups. But trust me, if you put it on a server that’s dedicated to backups–its day job can be as a domain controller or some other task that’s shared by multiple, redundant machines–you’ll thank yourself. It’s very nice to be able to reboot your Backup Exec server without giving your seven bosses something else besides the cover sheet on your TPS reports to grumble about.

If you must put Backup Exec on your file server, set up DFS and mirror the file shares to another server. It doesn’t have to be anything fancy–just something that can prop things up while the server’s rebooting. And run Windows 2003, because it boots fast.

The upside: I can make Backup Exec 9.1 die every time by creating a direct-to-tape job and running it concurrently with a disk-to-disk-to-tape job. The tape portion of the second job will bomb every time. Veritas technical support tells me that bug was fixed in 9.1SP1. It wasn’t. But it’s fixed in 10.

There are some other features in 10, like synthetic backups, that promise to speed backups along. That would be very nice. It would also be nice if it would be reliable.

I’m not going to put it in production yet–when I first deployed 9, it fixed a lot of problems but it made a whole bunch of new ones–but maybe, just maybe, Backup Exec 10 will do what it’s supposed to do well enough that I can work something close to regular hours again.

Otherwise I’ll look forward to Backup Exec 11 and hope that it features more changes than just a new Symantec black-and-gold color scheme and wizards featuring Peter Norton. We’ll see.

Wake up your Backup Exec remote agent

Usually when a Backup Exec remote agent refuses to respond and stopping and starting the service does no good (verifiable by creating a new job and attempting to connect to the remote server, only to find the drive selection boxes greyed out), the solution is to reboot.

There’s a last-resort method more appropriate for production servers.

Telnet to the remote server on port 10000. As in:

telnet 192.168.1.2 10000

When I did it, I got a bunch of garbage characters. I closed the window, then tried to connect again. This time, the agent was awake.

I have no idea if Veritas sanctions this or not, but it worked for me, and I like the answer a lot better than rebooting.
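If you'd rather script the poke than open a telnet window, the same thing can be approximated with a plain TCP connection. This is a minimal sketch in Python, not anything Veritas documents: the function name is made up, and it assumes that simply connecting to port 10000, reading whatever comes back, and disconnecting is what did the trick in the telnet session above.

```python
import socket


def poke_agent(host: str, port: int = 10000, timeout: float = 5.0) -> bytes:
    """Connect to the remote agent's port, read its first bytes, and hang up.

    Hypothetical helper mimicking the manual telnet session; returns whatever
    the agent sent (likely unreadable binary), or b"" if nothing arrived
    before the timeout.
    """
    with socket.create_connection((host, port), timeout=timeout) as conn:
        try:
            return conn.recv(256)
        except socket.timeout:
            return b""


if __name__ == "__main__":
    # Same address as the telnet example above; substitute your own server.
    print(poke_agent("192.168.1.2"))
```

After it returns, try connecting from the Backup Exec console again and see whether the selection boxes are still greyed out.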

Resolving an issue with slow Windows XP network printing

There is a little-known issue with Windows XP and network printing that does not seem to have been completely resolved. It’s a bit elusive and hard to track down. Here are my notes and suggestions, after chasing the problem for a couple of weeks.

The symptoms are that printing occurs very slowly, if at all. Bringing up the properties for the printer likewise happens very slowly, if at all. An otherwise identical Windows 2000 system will not exhibit the same behavior.

The first idea that came into my head was disabling QoS in the network properties, just because that’s solved other odd problems for me. It didn’t help me but it might help you.

Hard-coding the speed of the NIC rather than using autonegotiate sometimes helps with odd networking issues. Try 10 Mbps/half duplex first, since it’s the least common denominator.

Some people have claimed using PCL instead of PostScript, or vice versa, cleared up the issue. It didn’t help us. PCL is usually faster than PostScript since it’s a more compact language. Changing printer languages may or may not be an option for you anyway.

Some people say installing SP2 helps. Others say it makes the problem worse.

The only reliable answer I have found, which makes no sense to me whatsoever, is network equipment. People who are plugged in to switches don’t have this problem. People who are plugged into hubs often have this problem, but not always.

The first thing to try is plugging the user into a different hub port, if possible. Sometimes ports go bad, and XP seems to be more sensitive to a deteriorating port than previous versions of Windows.

In the environment where I have observed this problem, the XP users who are plugged into relatively new (less than 5 years old) Cisco 10/100 switches do not have this problem at all.

This observation makes me believe that Windows XP may also like aging consumer-grade switches, like D-Link, Belkin, Linksys, and the like, a lot less than newer and/or professional grade, uber-expensive switches from companies like Cisco. I have never tried Windows XP with old, inexpensive switches. I say this only because I have observed Veritas Backup Exec, which is very network intensive, break on a six-year-old D-Link switch but work fine on a Cisco.

I do not have the resources to conduct a truly scientific experiment, but these are my observations based on the behavior of about a dozen machines using two different 3Com 10-megabit hubs and about three different Cisco 10/100 switches.

Undocumented Backup Exec error

I got an odd Backup Exec error message on Thursday night that I wasn’t able to find in Veritas’ knowledge base.

The error code is 0x3a18 – 0x3a18 (14872). Since it seems otherwise undocumented, I might as well document what I know about it.

In my case at least, the cause of the error seems to have been insufficient disk space. The drive where Backup Exec was storing its catalogs was filling up, and this cryptic error message was the result. When I reran the job that failed, I got an "Insufficient disk space to write catalogs" error in the popup, but not in the system log. That doesn’t help you if you happen to not be logged in at the time of the error. Seeing as this error happened at 12:30 AM, I wasn’t.
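The hex and decimal numbers in that message are the same code written two ways, which is handy to know when searching logs or a knowledge base for either form. A tiny Python sketch (the function name is just for illustration):

```python
def error_code_forms(code: int) -> tuple[str, str]:
    """Return the hex and decimal spellings of an error code."""
    return (f"0x{code:x}", str(code))


if __name__ == "__main__":
    # 0x3a18 in hex is 14872 in decimal.
    print(error_code_forms(0x3A18))
```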

This error was especially nasty because it caused Backup Exec to not recognize that the tape was allocated, so it overwrote the three good jobs it had completed that night with two bad jobs. If there’s anything more enraging than a failed backup, it’s a failed backup that took a bunch of others down with it.

Many other Backup Exec errors are caused by low disk space. This is so simple that it ought to be the first thing I check, but more often than not I forget. I need to remind myself.
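Since I keep forgetting to check, a scheduled script could do the remembering. Here is a minimal Python sketch using the standard library's shutil.disk_usage. The function name, the example path, and the 2 GB threshold are all placeholders I made up; point it at whatever drive holds your catalogs and pick a threshold that suits your environment.

```python
import shutil


def free_space_ok(path: str, min_free_bytes: int) -> bool:
    """Return True if the volume holding `path` has at least min_free_bytes free."""
    usage = shutil.disk_usage(path)
    return usage.free >= min_free_bytes


if __name__ == "__main__":
    # Placeholder values: check the current drive against a 2 GB floor.
    if not free_space_ok(".", 2 * 1024**3):
        print("Low disk space: check this before blaming the backup job.")
```

Run it shortly before the backup window so there's time to clear space before the jobs start.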

How frequently you run out of disk space on your system drive, of course, increases exponentially with each person who has admin rights on the server.

Backup Exec misadventures

(Subtitle: My coworkers’ favorite new Dave Farquhar quote)

If your product isn’t suitable for use on production servers, then why didn’t you tell us that up front and save us all a lot of wasted time?

(To a Veritas Backup Exec support engineer when he insisted that I reboot four production web servers to see if that cleared up a backup problem.)

When I refused to reboot my production web servers, he actually gave me a bit of useful information. Since Veritas doesn’t tell you this anywhere on their Web site, I don’t feel bad at all about giving that information here.

When backing up through a firewall, you have to tell Backup Exec what ports to use. It defaults to ports in the 10,000 range. That’s changeable, but changing it through the user interface (Tools, Options, Network) doesn’t do it. It takes an act of Congress to get that information out of Veritas.

What Veritas doesn’t tell you is that the media server (the server with the tape drive) should talk on a different range of ports than the remote servers you’re backing up. While it can still work if you don’t, chances are you’ll get a conflict.

The other thing Veritas doesn’t tell you is that you need a minimum of two, and an ideal of four, ports per resource being backed up. So if the server has four drives and a system registry, which isn’t unusual, it takes a minimum of 10 TCP ports to back it up, and 40 is safer.
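The port arithmetic is easy to get wrong when you're filling out a firewall request, so here is a small Python sketch of it. The function names and the example port ranges are made up for illustration. Five resources at two ports each gives the minimum of 10 quoted above; at four each it comes to 20, so the 40 figure presumably leaves extra headroom on top of that. The second function checks that the media server's range and a remote server's range don't collide.

```python
def ports_needed(resources: int, per_resource: int = 2) -> int:
    """Ports to open for one server: resources times ports per resource."""
    return resources * per_resource


def ranges_overlap(a: range, b: range) -> bool:
    """True if two contiguous port ranges share at least one port."""
    return a.start < b.stop and b.start < a.stop


if __name__ == "__main__":
    resources = 5  # four drives plus the system registry
    print(ports_needed(resources))       # minimum: two ports per resource
    print(ports_needed(resources, 4))    # ideal: four ports per resource

    # Hypothetical ranges; the media server and remote servers should differ.
    media_server = range(10000, 10020)
    remote_server = range(10020, 10040)
    print(ranges_overlap(media_server, remote_server))
```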

Oh, and one other thing: If anyone is using any other product to back up Windows servers, I would love to hear about it.

Maintaining a healthy distance

Yesterday we managed to back up our 40 or so NT servers without incident for the first time in years. OK, months. It seemed like years. It wasn’t that long ago that I nearly woke up my neighbors after receiving my fourth 2-am backup failure phone call that week. As I walked through the hallway to fire up the laptop and log in, I pounded the wall in frustration and screamed, “Just once, let me sleep through the night without bothering me. Just once!”

Microsoft is my least favorite software company and has been for years. But once I had to deal with Backup Exec on a daily–who am I kidding?–nightly basis, Veritas quickly rocketed past Network Associates and Adobe to get the #2 spot.

To anyone else struggling with Backup Exec, I offer this bit of advice: Tell the first PHB who comes around that you’d be working on it if you weren’t busy talking to him or her, then take your phone off the hook and deal with the problem one backup job at a time. Better yet, narrow it down to one directory at a time. Keep in mind that Backup Exec seems to subscribe to the domino theory–one failure causes eight. OK, two or three. And if Backup Exec is flagging jobs as failures because it can’t back up the DHCP database, then exclude the DHCP database. If you have to do a restore and that file’s gone, the OS will regenerate it. It’s easier to explain that to the PHBs than it is to explain repeated failures. If they insist on 100% identical hot backups, tell them they’re going to have to swallow hard and buy you a SAN with snapshot capability. If they don’t have $50,000 lying around, I can come up with creative ways to get it–eliminating a layer or two of management would probably pay for several SANs–but I don’t know of a tactful way to say that.

If I seem a bit disconnected these days, I am. A few weeks ago I realized I was letting a Microsoft lackey from Utah with all the class of that thing you find behind a horse’s tail set my agenda. And I decided I wouldn’t let him set my agenda, or anyone else, for that matter. And I quit looking at my site statistics. And I haven’t even looked at comments since Saturday.

Daily hits are nice, and they’re great for the ego. But prime time for writing used to start around 9 pm. That also happens to be the time when my girlfriend can call me for free. So guess what budged? I’ll adjust eventually, but that’s not all that’s changed. A year ago, I’d ask myself several times a week what I was going to write about the next day. I never ask myself that question anymore. Nowadays I sit down and write when I’ve found something interesting, or I do what I’m doing now–force myself to sit down and write something, anything.

And of course, on the nights when she comes over or we go out, I don’t write anything.

So I’m not writing my best-ever content these days, but it’s because I have other priorities. That includes keeping the girlfriend happy, but truth be told, I’m at least as happy writing a Wikipedia entry as I’ve ever been writing stuff here. So a lot of energy that would normally go here goes elsewhere. Cracking the upper ranks of Technorati or another blogging community just isn’t high on my priority list anymore, if it ever was.

But I’m still in my 20s, and I’m still just as moody as I’ve ever been. Everything’s subject to change with as little notice as St. Louis weather patterns.

I know this will be interpreted as me saying I quit, so let me make one thing clear: I don’t quit. I may or may not write something tomorrow (I probably will). But if I don’t, I’ll be back later in the week. And I might even read comments that time.