Fixing Backup Exec with Hisecweb installed

If you run your web servers on Windows under IIS, you’d better install the Hisecweb security template unless you want to find yourself hosting a warez site.

But Hisecweb breaks Backup Exec. So what do you do when upgrading to Apache and Linux isn’t an option? The problem is that Hisecweb keeps the system state (shadow copy components in Windows 2003) and SQL Server from showing up in the selection list. Not only do they not show up in the selection list, Backup Exec can’t find the resources at all. So backups fail, and if you have to restore from them, you won’t have the registry or a number of system files, which vastly reduces the value of your backup.

The solution is to tell Backup Exec not to use null sessions on those components; null sessions seem to be one of the many things Hisecweb disables. On the server being backed up, go into Services and stop the Backup Exec Remote Agent. Now, fire up Regedit. Navigate to HKLM\Software\Veritas\Backup Exec\Engine\NTFS and locate the value named Restrict Anonymous Support. Set it to 1. Close the registry editor and restart the Backup Exec Remote Agent service.
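
If you have a lot of servers to touch, the same change scripts easily from a command prompt. A sketch–reg.exe is standard on Windows Server 2003 and comes with the Support Tools on Windows 2000, and the service name below is the display name on my servers, so verify yours in Services first:

net stop "Backup Exec Remote Agent for Windows Servers"
reg add "HKLM\SOFTWARE\VERITAS\Backup Exec\Engine\NTFS" /v "Restrict Anonymous Support" /t REG_DWORD /d 1 /f
net start "Backup Exec Remote Agent for Windows Servers"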

SQL Server and the system state or shadow copy components should now show up in the selection list for the server you just changed.

This registry hack can also fix visibility problems when the two machines are on different sides of a firewall.

Optimizing dynamic Linux webservers

Linux + Apache + MySQL + PHP (LAMP) provides an outstanding foundation for building a web server for, essentially, the price of your time. And the advantages over static pages are fairly obvious: Just look at this web site. Users can log in and post comments without me doing anything, and content on any page can change programmatically. In my site’s case, links to my most popular pages appear on the front page, and as their popularity changes, the links change.

The downside? Remember the days when people bragged about how their 66 MHz 486 was a perfectly good web server? Kiss those goodbye. For that matter, your old Pentium-120 or even your Pentium II-450 may not be good enough either. Unless you know these secrets…

First, the simple stuff. About a year and a half ago, I talked about programs that optimize HTML by removing extraneous tags and even give you a leg up on translating to cascading style sheets (CSS). That’s a starting point.

Graphics are another problem. People want lots of them, and digital cameras tend to add some extraneous bloat to them. Edit them in Photoshop or another popular image editor–which you undoubtedly will–and you’ll likely add another layer of bloat to them. I talked about Optimizing web graphics back in May 2002.

But what can you do on the server itself?

First, regardless of what you’re using, you should be running mod_gzip to compress your web server’s output. It works with virtually all modern web browsers, and those browsers that don’t work with it negotiate with the server to get uncompressed output. My 45K front page becomes 6K when compressed, which is better than a seven-fold reduction. Suddenly my 128-kilobit uplink performs like more than half of a T1.

I’ve read in several places that it takes less CPU time to compress content and send it than it does to send uncompressed content. On my P2-450, that definitely seems to be the case.

Unfortunately, mod_gzip is one of the most poorly documented Unix programs I’ve ever seen. I complained about this nearly three years ago, and the situation seems little improved.

A simple apt-get install libapache-mod-gzip in Debian doesn’t do the trick. You have to search /etc/apache/httpd.conf for the line that begins LoadModule gzip_module and uncomment it, and then you have to add a few more lines. The lines I used to enable mod_gzip on TurboLinux didn’t save me this time–for one thing, they didn’t handle PHP output. For another, they didn’t seem to do anything at all on my Debian box.
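
For reference, the line to uncomment looks like this on my box (the module path may differ on yours):

LoadModule gzip_module /usr/lib/apache/1.3/mod_gzip.so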

Charlie Sebold to the rescue. He provided the following lines that worked for him on his Debian box, and they also worked for me:

# mod_gzip settings

# Basics: skip anything under 400 bytes (the overhead isn't worth it),
# no upper size limit, compress in memory up to 100K, spill to /tmp beyond
mod_gzip_on Yes
mod_gzip_can_negotiate Yes
mod_gzip_add_header_count Yes
mod_gzip_minimum_file_size 400
mod_gzip_maximum_file_size 0
mod_gzip_temp_dir /tmp
mod_gzip_keep_workfiles No
mod_gzip_maximum_inmem_size 100000
mod_gzip_dechunk Yes

# Compress output from proxied requests and CGI scripts
mod_gzip_item_include handler proxy-server
mod_gzip_item_include handler cgi-script

# Compress text and common application types; leave images alone
# (they're already compressed) and exclude Javascript and CSS (see below)
mod_gzip_item_include mime ^text/.*
mod_gzip_item_include mime ^application/postscript$
mod_gzip_item_include mime ^application/ms.*$
mod_gzip_item_include mime ^application/vnd.*$
mod_gzip_item_exclude mime ^application/x-javascript$
mod_gzip_item_exclude mime ^image/.*$
mod_gzip_item_include mime httpd/unix-directory
mod_gzip_item_include file .htm$
mod_gzip_item_include file .html$
mod_gzip_item_include file .php$
mod_gzip_item_include file .phtml$
mod_gzip_item_exclude file .css$

Gzipping anything below 400 bytes is pointless because of the overhead, and gzipping CSS and Javascript files breaks Netscape 4 part of the time–hence the 400-byte minimum and the CSS and Javascript excludes above.

Most of the examples I found online didn’t work for me. Charlie said he had to fiddle a long time to come up with those. They may or may not work for you. I hope they do. Of course, there may be room for tweaking, depending on the nature of your site, but if they work, they’re a good starting point.
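
One quick way to check whether mod_gzip is actually kicking in: request a page while offering to accept gzip, and watch the response headers. A sketch with wget, though any client that shows headers will do:

wget -S -O /dev/null --header="Accept-Encoding: gzip" http://localhost/

If it’s working, a Content-Encoding: gzip header comes back.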

Second, you can use a PHP accelerator. PHP is an interpreted language, which means that every time you run a PHP script, your server first has to parse and compile the source code before executing it–and that step can take longer than producing the output itself. PHP accelerators serve as a just-in-time compiler: they compile the script and hold a copy in memory, so the next time someone accesses the page, the pre-compiled script runs. The result can sometimes be a tenfold increase in speed.

There are lots of them out there, but I settled on Ion Cube PHP Accelerator (phpa) because installation is a matter of downloading the appropriate pre-compiled binary, dumping it somewhere (I chose /usr/local/lib but you can put it anywhere you want), and adding a line to php.ini (in /etc/php4/apache on my Debian box):

zend_extension="/usr/local/lib/php_accelerator_1.3.3r2.so"

Restart Apache, and suddenly PHP scripts execute up to 10 times faster.
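
On my Debian box, that restart is just:

/etc/init.d/apache restart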

PHPA isn’t open source and it isn’t Free Software. Turck MMCache is, so if you prefer GPL, you can use it.

With mod_gzip and phpa in place and working, my web server’s CPU usage rarely goes above 25 percent. Without them, three simultaneous requests from the outside world could saturate my CPU.

With them, my site still isn’t quite as fast as it was in 2000 when it was just serving up static HTML, but it’s awfully close. And it’s doing a lot more work.

Using your logs to help track down spammers and trolls

It seems like lately we’ve been talking more on this site about trolls and spam and other troublemakers than about anything else. I might as well document how I went about tracking down two recent incidents to see if they were related.

WordPress and b2 store the IP address the comment came from, as well as the comment and other information. The fastest way to get the IP address, assuming you haven’t already deleted the offensive comment(s), is to go straight to your SQL database.

mysql -p
[enter the root password]
use b2database;
select * from b2comments where comment_post_id = 819;

Substitute the number of your post for 819, of course. The poster’s IP address is the sixth field.
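
And if your b2 tables use the column names WordPress later inherited–an assumption about your schema, so check with describe b2comments; first–you can pull just the address instead of counting fields:

select comment_author_IP from b2comments where comment_post_id = 819;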

If your blogging software records little other than the date and time of the message, you’ll have to rely on your Apache logs. On my server, the logs are at /var/log/apache, stored in files with names like access.log, access.log.1, and access.log.2.gz. They are archived weekly, with anything older than two weeks compressed using gzip.

All of b2’s comments are posted using a file called b2comments.post.php. So one command can turn up all the comments posted on my blog in the past week:

cat /var/log/apache/access.log | grep b2comments.post.php

You can narrow it down by piping it through grep a bit more. For instance, I knew the offending comment was posted on 10 November at 7:38 pm.

cat /var/log/apache/access.log | grep b2comments.post.php | grep 10/Nov/2003

Here’s one of my recent troublemakers:

24.26.166.154 – – [10/Nov/2003:19:38:28 -0600] “POST /b2comments.post.php HTTP/1.1” 302 5 “https://dfarq.homeip.net/index.php?p=819&c=1” “Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.5) Gecko/20031007 Firebird/0.7”

This line reveals quite a bit: Besides his IP address, it also tells you his operating system and web browser.

Armed with his IP address, you can hunt around and see what else your troublemaker’s been up to.

cat /var/log/apache/access.log | grep 24.26.166.154
zcat /var/log/apache/access.log.2.gz | grep 24.26.166.154
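
To sweep the current log and all the archives in one pass, zgrep is handy–it reads compressed and uncompressed files alike:

zgrep 24.26.166.154 /var/log/apache/access.log*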

The earliest entry you can find for a particular IP address will tell where the person came from. In one recent case, the person started off with an MSN search looking for information about an exotic airplane. In another, it was a Google search looking for the words “Microsoft Works low memory.”

You can infer a few things from where a user originally came from and the operating system and web browser the person is using. Someone running the most recent Mozilla Firebird on Linux and searching with Google is likely a more sophisticated computer user than someone running a common version of Windows and the version of IE that was supplied with it and searching with MSN.

You can find out other things about individual IP addresses, aside from the clues in your logs. Visit ARIN to find out who owns the IP address. Most ARIN records include contact information, if you need to file a complaint.
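
From a shell, whois will usually pull the same ARIN record:

whois 24.26.166.154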

Visit Geobytes.com IP Locator to map the IP address to a geographic region. I used the IP locator to determine that the guy looking for the airplane was in Brooklyn, and the Microsoft guy was in Minneapolis.

Also according to my Apache logs, the guy in Brooklyn was running IE 6 on Windows XP. The guy in Minneapolis was running Mozilla Firebird 0.7 on Linux. (Ironic, considering he was looking for Microsoft information.) It won’t hold up in a court of law, but the geographic distance and differing usage habits give at least some indication it’s two different people.

A coupla MP3 jukebox solutions

I’ve been playing with MP3 jukebox solutions. Grind! looks perfect, except for the life of me I can’t get it to work, which puts a bit of a damper on things. It acts like it’s playing, but the sound never comes out of the sound card. The sound card works fine. The www-data account (Apache’s user) has access to the sound card. The MP3 player software runs as www-data. It works fine when I log in and su into the www-data account. But when I hit the web page to control it, the music never plays.

So I’m about to give up for a while and give Gina a look. Gina’s got lots of cool features, but I’d rather have a computer that plays the music rather than streaming it–I want to hook up a headless computer to my stereo. I suppose I could put an MP3 server in my basement, put a headless computer on my stereo, and control it remotely using remote X or VNC or something.

Gina doesn’t do scoring of music the way Grind! does, but I think I can hack that in. You know, create another database of songs and assign a score to each. Then, when it picks a track, pick two instead: discard any track whose score is zero, play the higher-scored track, and put the other one back in the queue. I think I can code that. And that way I’ll hear U2’s “I Will Follow” a lot more frequently than U2’s “Mysterious Ways” (which, don’t get me wrong, was a good song… THE FIRST 3 BILLION TIMES I HEARD IT).

And hey, maybe I can figure out how to hack Gina to play the song instead of streaming it. Because it does lots of other cool stuff. Click the link, check it out.

I wrote up a bunch of stuff today but technical difficulties prevent me from posting it. I’ll post tomorrow.

Optimizing a web server

Promises of better Apache performance have me lusting after lingerd, a very obscure utility that increases performance for dynamic content. It’s been used on a handful of little sites you might have heard of: Slashdot, Newsforge, and LiveJournal.

Unfortunately there’s no Debian package, which means compiling it myself, which means compiling Apache myself, which also means compiling PHP and MySQL, which means a big ol’ pain, but potentially better performance since I could go crazy on the GCC optimization flags. Hello, -O3 -march=i686!
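
I haven’t tried it yet, but the builds themselves would follow the usual source routine with the optimization flags dropped in–and, as I understand it, lingerd works by patching the Apache source tree, so that patch would go on first:

CFLAGS="-O3 -march=i686" ./configure
make
make install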

And if I’m going to compile all that myself, I figure I might as well compile it all myself and get the high performance across the board and get GCC 3.2x into the picture for even better performance. The easy way to do that is with lfs-install, which builds a system based on Linux From Scratch. For workstations I’d rather use something along the lines of Gentoo, but for servers, LFS is small, mature, and reasonably conservative.

Supposedly metalog offers improved performance over the more traditional syslogd or sysklogd. The good news is that those who are saner than me and are sticking with Debian for everything can take advantage of a Debian package (at least in unstable) and just apt-get away.
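
With an unstable line in your sources.list, that amounts to:

apt-get install metalog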

If I have any sanity left, I’ll think about minit to replace System V init and save about 400K of memory in a process that’s always running, and fgetty to save a little more. I’ve tried fgetty in the past without success; it turns out fgetty requires DJB’s checkpassword in order to work.

Keep in mind I haven’t tried any of this yet. But the plan sounds so good in my current sleep-deprived state I couldn’t help but share it.

SQLSlammer takes its toll on the ‘Net

If the ‘Net was slow today, it was because of a new worm, called SQLSlammer, that infected vulnerable Windows servers running Microsoft’s SQL database.

The exploit it used was old, but the worm was made possible because Microsoft’s cumulative hotfixes weren’t truly cumulative: one patch they didn’t include would, if applied afterward, revert the server to its vulnerable state. The documentation for the hotfixes didn’t mention this clearly. Probably Microsoft didn’t know–until it was too late.

But in some cases it’s not Microsoft’s fault. Try getting a pointy-haired boss to give you 15 minutes’ downtime per server so you can roll necessary security patches across your enterprise. Since many of the people who ultimately make IT decisions have never actually administered a Windows server in their careers, a lot of bad decisions get made, and servers stay unpatched as a matter of policy–either out of fear that a patch that closes a security hole might create a new bug, or out of fear that some remote VPN user in Kenya might be trying to work during the proposed maintenance window.

Linux got a bad rap in the security press last year because it allegedly had more security vulnerabilities than Windows–never mind that a vulnerability in, say, BIND gets counted multiple times because BIND ships with every Linux distribution. So whereas a vulnerability in IIS gets counted once against Windows’ total, a vulnerability in BIND might get counted eight times against Linux.

We’ll ignore that. Fine. Linux has a larger number of security problems and vulnerabilities than Windows does. Fact. Undeniable. Fine. Answer this question then: Has any worm affecting Linux ever had the devastating effect that SQLSlammer had? That Nimda had? The most notorious worm that affected Linux was called Slapper. Do you remember it? More than 60% of the servers on the ‘Net run on Apache. A worm affecting Apache should have been huge. It wasn’t.

Statistics are, well, statistics. Just because I can find you a set of numbers that suggests the sky is pink doesn’t make it any less blue.

Why anyone, anywhere, has a Windows server on the ‘Net with anything more than port 80 exposed is beyond me.

Trustworthy Computing? Nice buzzwords. Billy Gates has yet to put any meaning into them.

And incompetence rises. Managers didn’t learn from Nimda, so they won’t learn from this either.

Great combination. What does it mean? History will repeat itself. Something like this will happen again. Probably sooner rather than later.

The low-end server

Here’s a good question: What should a small operation do when it gets fed up with its network and is tempted to just chuck it all and start over?

Well, my advice is to start over. But I don’t agree that starting over requires one to chuck everything.

We’ll start with the server. Chances are, these days, you need one. If you’re doing Web and e-mail, you absolutely need one. But to a lot of people, servers are a mystical black box that costs more money than a desktop PC but runs a similar operating system. And that’s all they know.

Here’s what you need to know: A corporate server is built to stricter tolerances than a desktop PC and sometimes uses higher-quality parts (common examples are ServerWorks chipsets instead of Intel chipsets, SCSI instead of IDE, and error-correcting memory instead of the cheap nonparity stuff). You also often get niceties like hot-swap drive cages, which allow you to add or replace hard drives without powering down or opening the case.

They’re generally also better tested, and you can get a support contract on them. If you’re running an enterprise with hundreds or thousands of people relying on your server, you should buy server-grade stuff, and building your own server or repurposing a desktop PC as a server ought to be grounds for dismissal. The money you save isn’t worth it–you’ll pay more in downtime.

But a dozen people won’t hit a server very hard. This Web site runs on a Dell OptiPlex Pentium II/450 workstation. A workstation is a notch above a desktop PC but a notch below a server, in the pecking order. The biggest difference between my Optiplex and the PC that was probably sitting on your desk at work a year or two ago is that my Optiplex has a SCSI hard drive in it and it has a 3Com NIC onboard.

A small office can very safely and comfortably take a reasonably powerful name-brand PC that’s no longer optimal for someone’s desk (due to an aging CPU) and turn it into a server. A Pentium II-350 or faster, outfitted with 256 MB of RAM, a SCSI host adapter and a nice SCSI hard drive, and a 3Com or Intel 100-megabit Ethernet card will make a fine server for a couple of dozen people. (My employer still has a handful of 200 MHz Pentium Pro servers on its network, serving a couple hundred people in some cases.)

This server gets hit about as hard as a typical small business or church office server would. So far this month I’ve been getting between 500 and 550 visitors per day. I’ve served about 600 megabytes’ worth of data. My average CPU usage over that time period is in the single digits. The biggest bottleneck in this server is its 7200-rpm SCSI disk. A second disk dedicated to its database could potentially speed it up. But it’s tolerable.

Hot swappable hard drives are nice to have, but with an office of a dozen people, the 5-10 minutes it takes to power down, open the case, swap drives, and close the case back up and boot again probably doesn’t justify the cost.

A business or church office that wanted to be overly cautious could buy the least expensive server it can find from a reputable manufacturer (HP/Compaq, Dell, IBM). But when you do that, you’re paying for a lot of power that’s going to sit there unused most of the time. The 450 MHz CPU in this box is really more than I need.

Jeremy Hendrickson e-mailed me asking whether his church should buy a new server, and whether it really needed two or three servers, since he was talking about setting up a Samba server for file serving, Apache for Web serving, and a mail server. Running file and Web services on the same box won’t be much of a problem; a dozen people just won’t hit the server that hard. Just make sure you buy a lot of disk space, most of which will go to file serving. The database that holds all of the content on this site is only a few megabytes in size. Compressed, it fits on a floppy disk with lots of room to spare. Yes, I could realistically do nightly backups of my Web server on floppies. If floppies were at all reliable, that is.
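
For the record, dumping and compressing a MySQL-backed site like this one is a one-liner (b2database here is just an example name; substitute your own):

mysqldump -p b2database | gzip > site-backup.sql.gz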

I flip-flop on whether e-mail belongs on the same server. The security vulnerabilities of Web servers and mail servers are a bit different and it would be nice to isolate them. But I’m a lot more comfortable about a Linux box running both being exposed on the ‘Net than I am a Windows box running one or the other. If I had two boxes, and could afford to be paranoid, I’d use two.

Jeremy said his church had a P3-733 and a P2-450, both Dells, due for retirement. I’d make the P3 into a file/print/Web server and the P2 into a mail server and spend the money budgeted for a new server or servers to buy lots of disk space and a nice tape backup drive, since they’d get lots of use out of both of those. A new $1200 server would just buy lots of CPU power that’ll sit idle most of the time and you’d still have to buy disks.

As far as concerns about the reliability of reusing older systems go, the things that tend to wear out on older PCs are the hard drive and the operating system. Windows deteriorates over time. Server operating systems tend not to have this problem, and Linux is even more immune to it than Microsoft’s server operating systems. So that’s not really a concern.

Hard disks do wear out. I read a suggestion not long ago that IDE hard disks should be replaced every 3 years whether they seem to need it or not. That’s a little extreme, but I’ve found it’s hard to coax much more than four years out of an IDE disk. Dropping a new SCSI disk or two or three into an old workstation before turning it into a server should be considered mandatory. SCSI disks give better performance in multiuser situations, and are generally designed to run for five years. In most cases, the rest of the PC also has several years left in it.

Later this week, we’ll talk about Internet connectivity and workstations.

A b2 user looks longingly at Movable Type

This web site is in crisis mode.

I’ve been talking the past few days with a lot of people about blogging systems. I’ve worked with a lot of them. Since 1999, I’ve gone from static pages to Manila to Greymatter to b2, and now I’m thinking about another move, this time to Movable Type.

At the time I made each move, each of the solutions I chose made sense.

I really liked Manila’s calendar and I really liked having something take care of the content management for me. I moved to Greymatter from Manila after editthispage.com had one too many service outages. (I didn’t like its slow speed either. But for what I was paying for it, I couldn’t exactly complain.) Greymatter did everything Manila would do for me, and it usually did it faster and better.

Greymatter was abandoned right around the time I started using it. But at the time it was the market leader, as far as blogs you ran on your own servers went. I kept on using it for a good while because it was certainly good enough for what I wanted to do, and because it was super-easy to set up. I was too chicken at the time to try anything that would require PHP and MySQL, because at the time, setting up Apache, PHP and MySQL wasn’t exactly child’s play. (It’s still not quite child’s play but it’s a whole lot easier now than it used to be.)
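
For what it’s worth, on Debian these days the basic stack comes down to something like this (the package names are what my box uses; other distributions differ):

apt-get install apache php4 mysql-server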

Greymatter remained good enough until one of my posts here got a hundred or so responses. Posting comments to that post became unbearably slow.

So I switched to b2. Fundamentally, b2 was pretty good. Since it wasn’t serving up static pages it wasn’t as fast as Greymatter, but when it came to handling comments, it processed the 219th comment just as quickly as it processed the first. And having a database backend opened up all sorts of new possibilities, like the Top 10 lists on the sidebar (courtesy of Steve DeLassus). And b2 had all the basics right (and still does).

When I switched to b2, a handful of people were using a new package called Movable Type. But b2 had the ability to import a Greymatter site. And Movable Type was written in Perl, like Greymatter, and didn’t appear to use a database backend, so it didn’t look like a solution to my problem.

Today, Movable Type does use a MySQL backend. And Movable Type can do all sorts of cool stuff, like pingbacks, and referrer autolinks. Those are cool. If someone writes about something I write and they link to it, as soon as someone follows the link, the link appears at the bottom of my entry. Sure, comments accomplish much the same thing, but this builds community and it gives prolific blogs lots of Googlejuice.

And there’s a six-part series that tells how to use Movable Type to implement absolutely every good idea I’ve ever had about a Weblog but usually couldn’t figure out how to do. There are also some ideas there I never conceived of.

In some cases, b2 just doesn’t have the functionality. In some cases (like the linkbacks), it’s so easy to add to b2 even I can do it. In other cases, like assigning multiple categories to a post, it’s difficult. I don’t doubt b2 will eventually get most of this functionality. But when someone else has the momentum, what to do? Do I want to forever be playing catch-up?

And that’s my struggle. Changing tools is always at least a little bit painful, because links and bookmarks go dead. So I do it only when it’s overwhelmingly worthwhile.

Movable Type will allow you to put in links to related entries automatically. Movable Type will help you build meaningful meta tags so search engines know what to do with you (MSN had no idea what to do with me for the longest time–I re-coded my page design a couple of weeks ago just to accommodate them). MT will allow you to tell it how much to put into your RSS feed (which I’m sure will draw cheers from the poor folks who are currently pulling down the entire story all the time).

MT doesn’t have karma voting, like Greymatter did (and I had Steve add to b2). I like it but I can live without it. I can probably get the same functionality from page reads. Or I can just code up a “best of” page by hand, using page reads, feedback, and gut feeling as my criteria.

The skinny: I’m torn on whether I should migrate. I stand to gain an awful lot. The main reason I have to stay with what I have is Steve’s custom code, which he worked awfully hard to produce, and some of it gives functionality that MT doesn’t currently have. Then again, for all I know it might not be all that hard to adapt his code to work with MT.

I know Charlie thought long and hard about switching. He’s one of the people I’ve been talking with. And I suspected he would be the first to switch. The biggest surprise to me when he did was that it took him until past 3 p.m. today to do it.

And I can tell you this. If I were starting from scratch, I’d use Movable Type. I doubt I’d even look at anything else.

Patch your Linux distros

There’s a nasty vulnerability in recent SSL libraries that an Apache-based worm is currently exploiting. The patch is obviously most critical on machines running secure Apache sites. But if you don’t like vulnerabilities–and you shouldn’t–go get your distribution’s latest updates.

This is why I like Debian; a simple apt-get update && apt-get upgrade brings me right up to speed.

CERT pointed out that Apache installations that contain the ServerTokens ProductOnly directive in their httpd.conf file aren’t affected. (I added it under the ServerName directive in my file–it’s not present at all in Debian by default.) This will hurt Linux’s standings in Netcraft, but are you more interested in security or advocacy? Increasingly, I’m more interested in security. No point in bragging that you’re more secure than Windows. Someone might make you prove it. I’d rather let someone else prove it.
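
If it saves you a search, the directive is a single line in httpd.conf:

ServerTokens ProductOnly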

While you’re making Apache volunteer as little information as possible, you might as well make the rest of your OS as quiet as possible too. You can find some information on that in an earlier post here.

So you think Linux is unproven?

I’ve had arguments at work with one of the managers as to whether Linux is up to the task of running an enterprise-class Web server. When I mention my record with Linux running this site, the manager dismisses it, never mind that this site gets more traffic than a lot of the sites we run at work. So I went looking this afternoon for some sites that run on Linux, Apache, and PHP, like this one does.

I found a bunch of small-timers.