Cleaning the Windows registry – and optimizing it

Cleaning the Windows registry is a popular and controversial topic. Many pundits tell you never to do it. When I wrote a book about Windows back in 1999, I dedicated most of one chapter to the topic. But today the pundits have a point. Most registry cleaning utilities do much more harm than good. I don’t recommend you clean your registry, per se, but I do recommend you maintain it.

I don’t want to dismiss the concept completely out of hand. There’s a difference between a bad idea and a bad implementation. Registry cleaning and maintenance is a victim of bad implementation. But that doesn’t mean it was a bad idea. So let’s talk about how to get the benefit while minimizing the drawbacks.


How to increase the capacity of a Log Logic appliance by 45%

My 9-5 gig revolves primarily around Tibco LogLogic (I’ll write it as Log Logic going forward, as I write in English, not C++), which is a centralized logging product. The appliances collect logs from a variety of dissimilar systems and present you with a unified, web-based interface to search them. When something goes wrong, having all of the logs in one place is invaluable for figuring it out.

That value comes at a price. I don’t know exactly what these appliances cost, but generally speaking, $100,000 is a good starting point for an estimate. So what if I told you that you could store 45% more data on these expensive appliances, and increase their performance very modestly (2-5 percent) in the process? Read on.


How I changed servers midstream

When upgrading this site, I replaced the underlying hardware as well. The old server was just a dead end in too many regards to be worth upgrading in place, and besides, being able to run new and old side by side for a time is helpful.

This type of maneuver is routine work for a professional sysadmin. But it’s been at least two years since I’ve done a similar maneuver at all, and at least five years since I did it with Linux.

When I built the new machine, I gave it a unique IP address. Turnkey Linux makes getting an operational LAMP stack trivial, and depending on what you want to run on that stack, you may even be able to get that installed for you too.

Unfortunately for me, the Geeklog migration tool doesn’t seem to work with WordPress 3.0.1. So I had to get WordPress running on my old hardware in order to migrate. I chose WordPress 2.0.11 because the 2.0 branch appeared to be the current branch when Justdave wrote his migration tool, and 2.0.11 ran without complaint on the dated versions of PHP and MySQL that were on my old server.

After importing the content, I used mysqldump to export my databases. Specifically:

mysqldump --opt -u [mysql username] -p [database name, probably wordpress] > wordpress.sql

I should have gzipped the file, but I didn’t.

gzip wordpress.sql

I then connected to the old server via FTP and transferred the file. Use your favorite file transfer method; I happened to have FTP set up for my internal network.

Uncompress the file if you compressed it:

gunzip wordpress.sql.gz
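If the target database doesn’t exist on the new server yet, create an empty one first. A minimal sketch, assuming the stock mysqladmin client that ships with MySQL:

mysqladmin -u [mysql username] -p create [database name]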

Then restore the file:

mysql -u [mysql username] -p [database name] < wordpress.sql

Or, if the database already exists, as it did in my case, the same mysql command works; just point it at the existing database. (mysqlimport isn’t the tool for this job; it expects tab-delimited data files named after tables, not SQL dumps, so it won’t restore mysqldump output.)

Then I connected to the webserver via my web browser. WordPress 3.0.1 saw the WordPress 2.0.11 database and informed me that it needed to be upgraded. So I let it do its thing, and a few minutes later, I had a functioning WordPress site with 10 years’ worth of legacy entries.

I messed around with it for a while. Finally, I decided to go live. And at this point, I should have physically moved the new server into its permanent home. I didn’t do that, so now when I decide to move the server, I’m going to have some downtime.

To flip the IP addresses, you need to know where your Linux box stores its IP address. Debian and Ubuntu both store it in /etc/network/interfaces. As far as I can tell, Red Hat and derivatives like CentOS store it in /etc/sysconfig/network-scripts/ifcfg-eth0, but I haven’t used Red Hat or a derivative in a long time, probably not since 2003.

If worse comes to worst, try something like this to determine where it’s stored:

grep -r [ip address] /etc/

I edited the appropriate file on both boxes, changing the IP address while leaving all of the other parameters unchanged.
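For reference, here’s roughly what the relevant stanza in /etc/network/interfaces looks like on a Debian box with a static address; the addresses are placeholders, and only the address line needs to change:

auto eth0
iface eth0 inet static
    address 192.168.1.20
    netmask 255.255.255.0
    gateway 192.168.1.1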

I then issued the command ifdown eth0 on both machines.

On my new production server, I then issued the command ifup eth0. Depending on the Linux distribution, it might also be necessary to re-issue a default route command. I didn’t have to do that.
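If you do have to re-add the route by hand, the classic form looks something like this, with the gateway address as a placeholder:

route add default gw [gateway address]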

Depending on how much Linux/Unix cred you have at stake, you could just do it the Windows way and reboot the box. Or both of them.

Once I was satisfied everything was working, I powered down the old server and celebrated.

Optimizing dynamic Linux webservers

Linux + Apache + MySQL + PHP (LAMP) provides an outstanding foundation for building a web server for essentially nothing more than the value of your time. And the advantages over static pages are fairly obvious: Just look at this web site. Users can log in and post comments without me doing anything, and content on any page can change programmatically. In my site’s case, links to my most popular pages appear on the front page, and as their popularity changes, the links change.

The downside? Remember the days when people bragged about how their 66 MHz 486 was a perfectly good web server? Kiss those goodbye. For that matter, your old Pentium-120 or even your Pentium II-450 may not be good enough either. Unless you know these secrets…

First, the simple stuff. About a year and a half ago, I talked about programs that optimize HTML by removing some extraneous tags and even give you a leg up on translating to cascading style sheets (CSS). That’s a starting point.

Graphics are another problem. People want lots of them, and digital cameras tend to add some extraneous bloat to them. Edit them in Photoshop or another popular image editor–which you undoubtedly will–and you’ll likely add another layer of bloat to them. I talked about Optimizing web graphics back in May 2002.

But what can you do on the server itself?

First, regardless of what you’re using, you should be running mod_gzip to compress your web server’s output. It works with virtually all modern web browsers, and browsers that can’t handle it negotiate with the server to get uncompressed output. My 45K front page becomes 6K when compressed, which is better than a seven-fold reduction. Suddenly my 128-kilobit uplink effectively becomes more than half of a T1.

I’ve read in several places that it takes less CPU time to compress content and send it than it does to send uncompressed content. On my P2-450, that definitely seems to be the case.

Unfortunately, mod_gzip is one of the most poorly documented Unix programs I’ve ever seen. I complained about this nearly three years ago, and the situation seems little improved.

A simple apt-get install libapache-mod-gzip in Debian doesn’t do the trick. You have to search /etc/apache/httpd.conf for the line that begins LoadModule gzip_module and uncomment it, then you have to add a few more lines. The lines to enable mod_gzip on TurboLinux didn’t save me this time–for one thing, it didn’t handle PHP output. For another, it didn’t seem to do anything at all on my Debian box.

Charlie Sebold to the rescue. He provided the following lines that worked for him on his Debian box, and they also worked for me:

# mod_gzip settings

mod_gzip_on Yes
mod_gzip_can_negotiate Yes
mod_gzip_add_header_count Yes
mod_gzip_minimum_file_size 400
mod_gzip_maximum_file_size 0
mod_gzip_temp_dir /tmp
mod_gzip_keep_workfiles No
mod_gzip_maximum_inmem_size 100000
mod_gzip_dechunk Yes

mod_gzip_item_include handler proxy-server
mod_gzip_item_include handler cgi-script

mod_gzip_item_include mime ^text/.*
mod_gzip_item_include mime ^application/postscript$
mod_gzip_item_include mime ^application/ms.*$
mod_gzip_item_include mime ^application/vnd.*$
mod_gzip_item_exclude mime ^application/x-javascript$
mod_gzip_item_exclude mime ^image/.*$
mod_gzip_item_include mime httpd/unix-directory
mod_gzip_item_include file .htm$
mod_gzip_item_include file .html$
mod_gzip_item_include file .php$
mod_gzip_item_include file .phtml$
mod_gzip_item_exclude file .css$

Gzipping anything below 400 bytes is pointless because of overhead, and gzipping CSS and Javascript files breaks Netscape 4 part of the time.

Most of the examples I found online didn’t work for me. Charlie said he had to fiddle a long time to come up with those. They may or may not work for you. I hope they do. Of course, there may be room for tweaking, depending on the nature of your site, but if they work, they’re a good starting point.

Second, you can use a PHP accelerator. PHP is an interpreted language, which means that every time a PHP script runs, the server first has to translate the source code into executable form before running it. That translation can take longer than actually running the script. PHP accelerators act as a just-in-time compiler: they compile the script once and hold a copy in memory, so the next time someone accesses the page, the pre-compiled script runs. The result can sometimes be a tenfold increase in speed.

There are lots of them out there, but I settled on Ion Cube PHP Accelerator (phpa) because installation is a matter of downloading the appropriate pre-compiled binary, dumping it somewhere (I chose /usr/local/lib but you can put it anywhere you want), and adding a line to php.ini (in /etc/php4/apache on my Debian box):

zend_extension="/usr/local/lib/php_accelerator_1.3.3r2.so"

Restart Apache, and suddenly PHP scripts execute up to 10 times faster.
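To double-check that the extension actually loaded, one option (assuming the command-line PHP binary is installed; on older Debian systems it may be a separate package and named php4) is to grep PHP’s configuration dump for the accelerator:

php -i | grep -i accelerator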

PHPA isn’t open source and it isn’t Free Software. Turck MMCache is, so if you prefer GPL, you can use it.

With mod_gzip and phpa in place and working, my web server’s CPU usage rarely goes above 25 percent. Without them, three simultaneous requests from the outside world could saturate my CPU.

With them, my site still isn’t quite as fast as it was in 2000 when it was just serving up static HTML, but it’s awfully close. And it’s doing a lot more work.

 

Using your logs to help track down spammers and trolls

It seems like lately we’ve been talking more on this site about trolls and spam and other troublemakers than about anything else. I might as well document how I went about tracking down two recent incidents to see if they were related.
WordPress and b2 store the IP address the comment came from, as well as the comment and other information. The fastest way to get the IP address, assuming you haven’t already deleted the offensive comment(s), is to go straight to your SQL database.

mysql -p
[enter the root password]
use b2database;
select * from b2comments where comment_post_id = 819;

Substitute the number of your post for 819, of course. The poster’s IP address is the sixth field.
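If you’d rather not count fields, you can also pull just the address column by name; in b2, and in WordPress after it, that column is called comment_author_IP, though it’s worth verifying against your own schema:

select comment_author_IP from b2comments where comment_post_id = 819;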

If your blogging software records little other than the date and time of the message, you’ll have to rely on your Apache logs. On my server, the logs are at /var/log/apache, stored in files with names like access.log, access.log.1, and access.log.2.gz. They are archived weekly, with anything older than two weeks compressed using gzip.

All of b2’s comments are posted using a file called b2comments.post.php. So one command can turn up all the comments posted on my blog in the past week:

cat /var/log/apache/access.log | grep b2comments.post.php

You can narrow it down by piping it through grep a bit more. For instance, I knew the offending comment was posted on 10 November at 7:38 pm.

cat /var/log/apache/access.log | grep b2comments.post.php | grep 10/Nov/2003
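Since the Apache timestamp includes the time of day, you can fold the hour and minute into the pattern to cut the noise further:

cat /var/log/apache/access.log | grep b2comments.post.php | grep 10/Nov/2003:19:38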

Here’s one of my recent troublemakers:

24.26.166.154 – – [10/Nov/2003:19:38:28 -0600] “POST /b2comments.post.php HTTP/1.1” 302 5 “https://dfarq.homeip.net/index.php?p=819&c=1” “Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.5) Gecko/20031007 Firebird/0.7”

This line reveals quite a bit: Besides his IP address, it also tells his operating system and web browser.

Armed with his IP address, you can hunt around and see what else your troublemaker’s been up to.

cat /var/log/apache/access.log | grep 24.26.166.154
zcat /var/log/apache/access.log.2.gz | grep 24.26.166.154
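If your system has zgrep, which ships with gzip on most Linux distributions, one shortcut searches the current and archived logs in a single pass, compressed or not:

zgrep 24.26.166.154 /var/log/apache/access.log*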

The earliest entry you can find for a particular IP address will tell where the person came from. In one recent case, the person started off with an MSN search looking for information about an exotic airplane. In another, it was a Google search looking for the words “Microsoft Works low memory.”

You can infer a few things from where a user originally came from and the operating system and web browser the person is using. Someone running the most recent Mozilla Firebird on Linux and searching with Google is likely a more sophisticated computer user than someone running a common version of Windows and the version of IE that was supplied with it and searching with MSN.

You can find out other things about individual IP addresses, aside from the clues in your logs. Visit ARIN to find out who owns the IP address. Most ARIN records include contact information, if you need to file a complaint.
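You can also do the lookup from the command line; the standard whois client queries the same registry data:

whois 24.26.166.154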

Visit Geobytes.com IP Locator to map the IP address to a geographic region. I used the IP locator to determine that the guy looking for the airplane was in Brooklyn, and the Microsoft guy was in Minneapolis.

Also according to my Apache logs, the guy in Brooklyn was running IE 6 on Windows XP. The guy in Minneapolis was running Mozilla Firebird 0.7 on Linux. (Ironic, considering he was looking for Microsoft information.) It won’t hold up in a court of law, but the geographic distance and differing usage habits give at least some indication it’s two different people.

Trolling the web for nothing in particular

Yes, Brian, baseball will soon return. I hate the things Major League Baseball does (Bob Costas once likened choosing sides between the players and the owners to choosing sides between Iran and Iraq), but we’ve chosen to stay together for the kids. I’m sure everyone who cares (and some who don’t) can guess what I think of Bud Selig, but I’ll tell you anyway, soon enough.
In the meantime, I look like ArsTechnica today. Oh well. I don’t do this very often.

Blogging. Wired News had its take on the phenomenon, and threw out some interesting stats.


In January alone, at least 41,000 people created new blogs using Blogger, and that number is always increasing, [Blogger founder Evan] Williams said. Some have put the total number of weblogs at more than 500,000.

Alongside the boom, however, there have recently been a few faint signs of backlash. As increasing hordes take on the task of trying to keep new sites looking nice, sounding original and free from banalities, more hordes just seem to fail.

Blog critic Dave Linabury offered a recipe for success:


“It really can take a lot of time,” he said. “I spend two hours a day on my weblog. Many people don’t realize this, they think it’s a quick way to get popular. And after awhile they get really discouraged and say, ‘he got 2,300 hits today, I got four.’ The bulk of people out there get less than two dozen hits.”

“I don’t want to be elitist,” Linabury added, “but all these people out there with popular weblogs, they’ve been doing it longer and they stick to their guns.”

I can attest to that. The people who get more traffic than I get almost all have been doing this longer. But I can tell you one thing: It’s never enough. Back when I was getting 80 visits a day I wanted 150. When I was getting 150 visits a day, I wanted 250. Now that I get about 500 visits a day, I’m awfully distressed to see people are getting 2,300. And by the time I reach 2,300, I’m sure there will be people getting 5,000 or even 10,000. (Note that visits are the number of unique visitors; hits are the number of files served up. Hit count is deceptive. I get 500 visits per day but closer to 1,000 or even 1,500 hits per day, due to people visiting, reading comments, and then often reading something from a previous week. And if they do a search, that’s at least two additional hits.)

Link

Another feather in Internet Explorer’s cap. To my knowledge, no new security vulnerabilities have been reported in Internet Explorer this week, but the newest security patch, released last week, contains a bug: a VBScript directive that previously worked now crashes the browser.

Microsoft says Webmasters need to modify their pages not to use the directive.

That’s nice (I don’t use VBscript on this site) but there are embedded devices, such as HP’s JetDirect card, that use the directive. So early adopters of this patch may find themselves unable to do their jobs.

Better webmaster recommendation: Don’t use VBscript or ActiveX or other Microsoft-owned languages in your Web pages at all. Better end-user recommendation: Use Mozilla or a derivative instead of Internet Explorer.

Link

Recompiling Debian for your hardware. This thread comes up every so often, and with the popularity of Linux From Scratch and Gentoo, the appeal of a compiled-from-scratch Debian is undeniable. But does the small speed improvement offset the increased difficulty and time in upgrading?

The consensus seems to be that recompiling gzip, bzip2, and gnupg with aggressive options makes sense, as does recompiling your kernel. Recompiling XFree86 may also make some sense. But expending time and energy on perfectly optimized versions of ls and more is foolhardy. (Especially since speed demons can just get assembly language versions of them from www.linuxassembly.org.)
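For what it’s worth, a rough sketch of rebuilding one of those packages from source on Debian looks like this. It assumes deb-src lines in sources.list and the fakeroot and build-dependency packages installed; how you inject more aggressive compiler flags varies from package to package:

apt-get source gzip
cd gzip-*/
dpkg-buildpackage -rfakeroot -b
dpkg -i ../gzip_*.deb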

Link

A Guide to Debian. This is a guide, still incomplete, that gives a number of tips for someone who’s just installed Debian. The tips are applicable to many other Linux (and even Unix) flavors as well.

Link

Spam. A coworker walked into my cube today and asked me how he could keep web robots from harvesting e-mail addresses from his web site. I found myself referring once again to the definitive piece on the subject, from Brett Glass (who gets my nomination for the greatest computer columnist of all time, for what that’s worth).

Link

The RULE project. A project has emerged to bring Red Hat Linux back to its roots, and allow it to run on older, less-powerful hardware.

From their site:


This install option is meant to benefit primarily two classes of users:

* GNU/Linux newbies who cannot afford modern computers, but still need, to get started more easily, an up to date, well documented distribution.
* System administrators and power users who have no interest in eye candy, and want to run updated software on whatever hardware is available, to minimize costs, or just because it feels like the right thing to do.

I love their FAQ. Check this out:


1.0 Hardware is so cheap today, why bother?

1. This is a very limited and egoistic attitude. Eighty per cent of the world population still has to work many months or years to afford a computer that can run decently the majority of modern, apparently “Free” software.
2. Many people who could afford a new computer every two years rightly prefer to buy something else, like vacations, for example…. Hardware should be changed only when it breaks, or when the user’s needs increase a lot (for example when one starts to do video editing). Not because “Free” Software requires more and more expensive hardware every year.

These guys have the right idea. I can only hope their work will influence other Linux distributions as well.

Link

Linux uptime. (Sure, a little original content.) When I was rearranging things months ago, I unplugged the keyboard and monitor from my webserver, then I never got around to plugging them back in because I didn’t have to do anything with it.

The other day, I had occasion to plug a keyboard and mouse back into it. I went in, did what I wanted to do, then out of curiosity I typed the uptime command. 255 days, it told me. In other words, I haven’t rebooted since last May, which, as I recall, was about when I put the machine into production.

Monday, 2 July 2001

Some lucky people get a five-day weekend this week. Not me. I’m off Wednesday for Independence Day. About 30 years ago, my dad and his med school buddies used to go to the Missouri River and shoot bottle rockets at barges to celebrate. I’m not sure what I’ll get to do yet. Last year I had to work the 4th. That was a very nice paycheck, since I worked 60 hours that week anyway, on top of 8 hours’ holiday pay.
I found a use for absurdly high-speed CPUs this weekend. My Duron-750 can simulate a 30-team, 162-game baseball season in its entirety in roughly 3 minutes. Of course a faster CPU will do it even faster. Baseball simulation is very CPU-intensive and very disk-intensive. This 750 has a SCSI disk subsystem in it too. It’s old, but I suspect SCSI’s ability to re-order disk requests for speed helps. I haven’t swapped in an IDE drive to see if it makes a difference. So if you’re a statistical baseball junkie, you can actually justify an insanely fast CPU. It feels strange to call the cheapest CPU on the market today insanely fast, but for most things, the Duron-750 really is.

The other use I’ve found for these CPUs is emulating a 50 MHz 68060-based Amiga at full speed. A Duron-750 isn’t quite up to that task.

I talked about PartImage last week. I used it over the weekend to clone 7 PCs. My church’s sister congregation bought 8 Compaq Deskpro EXs earlier this year and just finished the room they’re going in. So I went in, set one of them up (and tweaked it out, of course–the first reaction of one of the members: “Wow, that sure boots fast!”).

Sadly, many companies seem to use non-profit organizations as a way to just get rid of their junk. Here are some of the jewels this church has been “blessed” with: two 386sx laptops with dead batteries and no power adapters, two XTs, two 286s, a pile of 386sxs, and three 486s. Two of the 486s are old Compaq ProSignia servers with big SCSI hard drives, so I can slap in an ISA NIC and install Linux on one of them and make it a file server. The only thing remotely useful that anyone’s ever given them is a pair of Pentium-75s. But one of the 75s had a 40-meg hard drive in it. That’s the better of the two, though. The other had no hard drive, no memory, and no CMOS battery.

Oh, and I shouldn’t forget the large quantities of busted monitors. They’ve got a room full of monitors. About three of them work. What’s anyone going to do with a bunch of monitors that don’t work? Legally, the church can’t throw any of this stuff away (and shouldn’t) because of all the lead content, which makes them hazardous waste. But the church can hardly afford to pay someone to take it away and dispose of it properly either. We’re talking an inner-city church here. Can you say, “blaxploitation?” I knew you could.

The Pentiums did at least come in standard AT cases though, and nice ones at that. They look like Enlights, but they had Sparkle power supplies in them. Whatever the make, they’re nice and thick so they don’t slice you, there’s lots of wide open space inside, and they have 7 drive bays. So I grabbed the diskless Pentium to make into a router/Squid server/content filter. I ripped out the P75 board and dropped in an AT Soyo Socket 370 board with a Celeron-366 on it. It’ll be fabulous.

The best I can do with most of these systems is to try to make X terminals out of them, assuming I can find a machine beefy enough to host StarOffice for a half-dozen systems. It may not be worth the bother.

One of the 386s had a 420-meg hard drive in it for some reason, so I pulled that drive, hooked it up to the first of the Compaqs, and used PartImage to dump it. I used 480 megs on the drive, so with Gzip compression, the image left just 12 megs free on the drive. Tight fit, but we were OK. Then I just ran around to each of the others, hooked up the drive, and pulled the image. I took the drive home with me so I could burn a CD from it.
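For anyone wanting to repeat the trick, the partimage invocations look roughly like this; the device name and image path are placeholders, and partimage walks you through compression and other options interactively anyway:

partimage save /dev/hda1 /mnt/transfer/deskpro.img.gz
partimage restore /dev/hda1 /mnt/transfer/deskpro.img.gz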

That’s good use of free software.

What can I say about Tuesday…?

Photography. Tom sent me links to the pictures he took on the roof of Gentry’s Landing a couple of weeks ago. He’s got a shot of downtown, the dome, and the warehouse district, flanked by I-70 on the west and the Mississippi River on the east.
I’m tired. I spent yesterday fighting Mac OS X for a couple of hours. It still feels like beta software. I installed it on a new dual-processor G4/533 with 384 MB RAM, and it took four installation attempts to get one that worked right. Two attempts just flat-out failed, and the installation said so. A third attempt appeared successful, but it felt like Windows 95 on a 16-MHz 386SX with 4 megs of RAM. We’re talking a boot time measured in minutes here. The final attempt was successful and it booted in a reasonable time frame–not as fast as Windows 2000 on similar hardware and nowhere near the 22 seconds I can make Win9x boot in, but faster, I think, than OS 9.1 would boot on the same hardware–and the software ran, but it was sluggish. All the eye candy certainly wasn’t helping. Scrolling around was really fast, but window-resizing was really clunky, and the zooming windows and the menus that literally did drop down from somewhere really got on my nerves.

All told, I’m pretty sure my dual Celeron-500 running Linux would feel faster. Well, I know it’d be faster because I’d put a minimalist GUI on it and I’d run a lot of text apps. But I suspect even if I used a hog of a user interface like Enlightenment, it would still fare reasonably well in comparison.

I will grant that the onscreen display is gorgeous. I’m not talking the eye candy and transparency effects, I’m talking the fonts. They’re all exceptionally crisp, like you’d expect on paper. Windows, even with font smoothing, can’t match it. I haven’t seen Linux with font smoothing. But Linux’s font handling up until recently was hideous.

It’s promising, but definitely not ready for prime time. There are few enough native apps for it that it probably doesn’t matter much anyway.

Admittedly, I had low expectations. About a year ago, someone said something to me about OS X, half in jest, and I muttered back, “If anyone can ruin Unix, it’s Apple.” Well, “ruin” is an awfully harsh word, because it does work, but I suspect a lot of people won’t have the patience to stick with it long enough to get it working, and they may not be willing to take the extreme measures I ultimately took, which was to completely reformat the drive to give it a totally clean slate to work from.

OS X may prove yet to be worth the wait, but anyone who thinks the long wait is over is smoking crack.

Frankly, I don’t know why they didn’t just compile NeXTStep on PowerPC, slap in a Mac OS classic emulation layer, leave the user interface alone (what they have now is an odd hybrid of the NeXT and Mac interfaces that just feels really weird, even to someone like me who’s spent a fair amount of time using both), and release it three years ago.

But there are a lot of things I don’t know.

I spent the rest of the day fighting Linux boot disks. I wanted the Linux equivalent of a DOS boot disk with Ghost on it. Creating one from scratch proved almost impossible for me, so I opted instead to modify an existing one. The disks provided at partimage.org were adequate except they lacked sfdisk for dumping and recreating partition tables. (See Friday if you don’t have the foggiest idea what I’m talking about right about now, funk soul brother.) I dumped the root filesystem to the HD by booting off the two-disk set, mounting the hard drive (mount -t ext2 /dev/hda1 /mnt) and copying each directory (cp -a [directory name] [destination]). Then I made modifications. But nothing would fit, until I discovered the -a switch. The vanilla cp command had been expanding out all the symlinks, bloating the filesystem to a wretched 10 megs. It should have been closer to 4 uncompressed, 1.4 megs compressed. Finally I got what I needed in there and copied it to a ramdisk in preparation for dumping it to a floppy. (You’ve gotta compress it first and make sure it’ll fit.) I think the command was dd if=/dev/ram0 bs=1k | gzip -v9 > [temporary file]. The size was 1.41 MB. Excellent. Dump it to floppy: dd if=[same temporary file from before] of=/dev/fd0 bs=1k
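To keep those weird commands straight, here’s the sequence again in one place, with the placeholders from the text left as placeholders:

mount -t ext2 /dev/hda1 /mnt
cp -a [directory name] [destination]
dd if=/dev/ram0 bs=1k | gzip -v9 > [temporary file]
dd if=[temporary file] of=/dev/fd0 bs=1k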

And that’s why my mind feels fried right now. Hours of keeping weird commands like that straight will do it to you. I understand the principles, but the important thing is getting the specifics right.

The new server is live.

The new server is live. And much faster, I might add. We had a dead day in the forums, so I figure this was probably as good a time as any.
So, anyway, I hope everyone enjoys the improved speed that a faster CPU and mod_gzip provide. I know I can definitely get used to this. I notice the difference here.

I’ve extracted the text of the old forums here. At the moment I can’t get a search engine working on it. Sorry. I’ll work on it tomorrow.

How to get mod_gzip working on your Linux/Apache server

My research yesterday found that Mandrake, in an effort to get an edge on performance, used a bunch of controversial Apache patches that originated at SGI. The enhancements didn’t work on very many Unixes (presumably they were tested on Linux and Irix) and were rejected by the Apache group. SGI has since axed the project, and it appears that only performance-oriented Mandrake is using them.
I don’t have any problem with that, of course, except that Mod_Gzip seems to be incompatible with these patches. And Mod_Gzip has a lot of appeal to people like me–what it does is intercept Apache requests, check for HTTP 1.1 compliance, then compress content for sending to browsers that can handle compressed data (which includes just about every browser made since 1999). Gzip generally compresses HTML data by about 80 percent, so suddenly a DSL line has a whole lot more bandwidth–three times as much.

Well, trying to make all of this work by recompiling Apache had no appeal to me (I didn’t install any compilers on my server), so I went looking through my pile-o’-CDs for something less exotic. But I couldn’t find a recent non-Mandrake distro, other than TurboLinux 6.0.2. So I dropped it in, and now I remember why I like Turbo. It’s a no-frills server-oriented distro. Want to make an old machine with a smallish drive into a firewall? The firewall installation goes in 98 megs. (Yes, there are single-floppy firewalls but TurboLinux will be more versatile if you’re up to its requirements.)

So I installed Apache and all the other webserver components, along with mtools and Samba for convenience (I’m behind a firewall so only Apache is exposed to the world). Total footprint: 300 megs. So I’ve got tons of room to grow on my $50 20-gig HD.

Even better, I tested Apache with the command lynx http://127.0.0.1 and I saw the Apache demo page, so I knew it was working. Very nice. Installation time: 10 minutes. Then I tarred up my site, transferred it over via HTTP, untarred it, made a couple of changes to the Apache configuration file, and was up and going, sort of.

I still like Mandrake for workstations, but I think Turbo is going to get the nod the next few times I need to make Linux servers. I can much more quickly and easily tailor Turbo to my precise requirements.

Now, speaking of Mod_Gzip… My biggest complaint about Linux is the “you figure it out” attitude of a lot of the documentation out there, and Mod_Gzip may be the worst offender I’ve ever seen. The program includes no documentation. If you dig around on the web site, you find the basic installation instructions.

Sounds easy, right? Well, except that’s not all you have to do. Dig around some more, and you find the directives to turn on Mod_Gzip:

# [ mod_gzip sample configuration ]

mod_gzip_on Yes

mod_gzip_item_include file .htm$
mod_gzip_item_include file .html$
mod_gzip_item_include mime text/.*
mod_gzip_item_include mime httpd/unix-directory

mod_gzip_dechunk yes
mod_gzip_temp_dir /tmp
mod_gzip_keep_workfiles No

# [End of mod_gzip sample config]

Then, according to the documentation, you restart Apache. When you do, Apache bombs out with a nice, pleasant error message–“What’s this mod_gzip_on business? I don’t know what that means!” Now your server’s down for the count.

After a few hours of messing around, I figured out you’ve gotta add another line, at the end of the AddModule section of httpd.conf:

AddModule mod_gzip.c
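If Apache instead complains that it can’t find the module at all, the matching LoadModule line may be missing or commented out (Debian, for instance, ships it commented out); it looks something like this, with the module path as a placeholder:

LoadModule gzip_module [path to mod_gzip.so]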

After adding that line, I restarted Apache, and it didn’t complain. But I still didn’t know if Mod_Gzip was actually doing anything because the status URLs didn’t work. Finally I added the directive mod_gzip_keep_workfiles yes to httpd.conf and watched the contents of /tmp while I accessed the page. Well, now something was dumping files there. The timestamps matched entries in /var/log/httpd/access_log, so I at least had circumstantial evidence that Mod_Gzip was running.
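Another way to double-check, assuming wget is installed: request the page with compression explicitly allowed and look for a Content-Encoding: gzip line in the response headers.

wget -S --header="Accept-Encoding: gzip" -O /dev/null http://127.0.0.1/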
