Optimizing Web graphics

Gatermann told me about a piece of freeware he found on one of my favorite sites, tinyapps.org, called JPG Cleaner. It strips out the thumbnails and other metadata that editing programs and digital cameras put in your graphics that isn’t necessary for your Web browser to render them. Sometimes it saves you 20K, and sometimes it saves you 16 bytes. Still, it’s worth doing, because more often than not it saves you something halfway significant.
That’s great but I don’t want to be tied to Windows, so I went looking for a similar Linux program. There isn’t much. All I was able to find was a command-line program, written in 1996, called jpegoptim. I downloaded the source, but didn’t have the headers to compile it. I went digging and found that someone built an RPM for it back in 1997, but Red Hat never officially adopted it. I guess it’s just too special-purpose. The RPM is floating around, I found it on a Japanese site. If that ever goes away, just do a Google search for jpegoptim-1.1-0.i386.rpm.

I used the Debian utility alien to convert the RPM to a Debian package. It’s just a 12K binary, so there’s nothing to installing it. So if you prefer SuSE or TurboLinux or Mandrake or Caldera, it’ll install just fine for you. And Debian users can convert it, no problem.

Jpegoptim actually goes a step further than JPG Cleaner. Aside from discarding all that metadata in the header, its main claim is that it optimizes the Huffman tables that make up the image data itself, reducing the image in size without affecting its quality at all. The difference varies; I ran it on several megabytes’ worth of graphics, and found that on images that still had all those headers, it frequently shaved 20-35K from their size. On images that didn’t have all the extra baggage (including some that I’d optimized with JPG Cleaner), it reduced the file size by another 1.5-3 percent. That’s not a huge amount, but on a 3K image, that’s 40-50 bytes. On a Web page that has lots of small images, those bytes add up. Your modem-based users will notice it.

And Jpegoptim will also let you do the standard JPEG optimization, where you set the file quality to a numeric value between 1 and 100, the higher being the truest to the original. Some image editors don’t let you adjust the quality in a very fine-grained manner. I’ve found that a level of 70 is almost always perfectly acceptable.

So, to try to get something for nothing, change into an image directory and type this:

jpegoptim -t *

And the program will see what it can save you. Don’t worry if you get a negative number; if the “optimized” file ends up actually being bigger, it’ll discard the results.

To lower the quality and potentially save even more, do this:

jpegoptim -m70 -t *

And once again, it’ll tell you what it saves you. (The program always optimizes the Huffman tables, so there’s no need to do multiple steps.) Be sure to eyeball the results if you play with quality, and back up the originals.

Commercial programs that claim to do what these programs do cost anywhere from $50 to $100. This program may be obscure, but that’s criminal. Go get it and take advantage of it.

Also, don’t forget the general rule of file formats. GIF is the most backward-compatible, but it’s encumbered by patents and it’s limited to 256-color images. It’s good for line drawings and cartoons, because it’s a lossless format (it only compresses the data, it doesn’t change it).

PNG is the successor to GIF, sporting better compression and support for 24-color images. Like GIF, it’s lossless, so it’s good for line drawings, cartoons, and photographs that require every detail to be preserved. Unfortunately, not all browsers support PNG.

JPEG has the best compression, because it’s lossy. That means it looks for details that it can discard to make the image compress better. The problem with this is that when you edit JPEGs, especially if you convert them between formats, you’ll run into generation loss. Since JPEG is lossy, line drawings and cartoons generally look really bad in JPEG format. Photographs, which usually have a lot of subtle detail, survive JPEG’s onslaught much better. The advantage of JPEG is the file sizes are much smaller. But you should always examine a JPEG before putting it on the Web; blindly compressing your pictures with high compression settings can lead to hideous results. There’s not much point in squeezing an image down to 1.5K when the result is something no one wants to look at.

Recovery time.

Taxes. I think I’ve actually filed my taxes on time twice in my adult life. This year isn’t one of them. I filed Form 4868, so Tax Day for me is actually Aug. 15, 2002.
In theory Uncle Sam owes me money this year, so I shouldn’t owe any interest. I’ll have a professional accountant test that theory soon. Make that fairly soon, because it’d be nice to have that money, seeing as I expect to make the biggest purchase of my still-fairly-short life this year.

Some people believe filing a 4868 is advantageous. The thinking is this: Let the IRS meet its quota for audits, then file. That way, the only way you’re going to get audited is if you truly raise red flags, which I shouldn’t because I’m having a professional (and an awfully conservative one at that) figure the forms. That’s good. I’d rather not have to send a big care package off to the IRS to prove I’m not stealing from them.

Adventure. Steve DeLassus and I dove headlong into an adventure on Sunday, an adventure consisting of barbecue and Linux. I think at one point both of us were about ready to put a computer on that barbie.

We’ll talk about the barbecue first. Here’s a trick I learned from Steve: Pound your boneless chicken flat, then throw it in a bag containing 1 quart of water and 1 cup each of sugar and salt. Stick the whole shebang in the fridge while the fire’s getting ready. When the fire’s ready, take the chicken out of the bag and dry thoroughly. Since Steve’s not a Kansas Citian, he doesn’t believe in dousing the chicken in BBQ sauce before throwing it on the grill. But it was good anyway. Really good in fact.

Oh, I forgot. He did spray some olive oil on the chicken first. Whether that helps it brown or locks in moisture or both, I’m not quite sure. But olive oil contains good fats, so it’s not a health concern.

Now, Linux on cantankerous 486s may be a health concern. I replaced the motherboard in Steve’s router Sunday night, because it was a cranky 486SX/20. I was tired of dealing with the lack of a math coprocessor, and the system was just plain slow. I replaced it with a very late model 486DX2/66 board. I know a DX2/66 doesn’t have three times the performance of an SX/20, but the system sure seemed three times faster. Its math coprocessor, L2 cache, faster chipset, and much better BIOS helped. It took the new board slightly longer to boot Linux than it took the old one to finish counting and testing 8 MB of RAM.

But Debian wasn’t too impressed with Steve’s Creative 2X CD-ROM and its proprietary Panasonic interface. So we kludged in Steve’s DVD-ROM drive for the installation, and laughed at the irony. Debian installed, but the lack of memory (I scraped up 8 megs; Steve’s old memory wouldn’t work) slowed down the install considerably. But once Debian was up and running, it was fine, and in text mode, it was surprisingly peppy. We didn’t install XFree86.

It was fine until we tried to get it to act as a dialup router, that is. We never really did figure out how to get it to work reliably. It worked once or twice, then quit entirely.

This machine was once a broadband router based on Red Hat 6.1, but Red Hat installed way too much bloat so it was slow whenever we did have to log into it. And Steve moved into the boonies, where broadband isn’t available yet, so it was back to 56K dialup for him. Now we know that dialup routers seem to be much trickier to set up than dual-NIC routers.

After fighting it for nearly 8 hours, we gave up and booted it back into Freesco, which works reliably. It has the occasional glitch, but it’s certainly livable. Of course we want (or at least Steve wants) more features than Freesco can give you easily. But it looks like he’ll be living with Freesco for a while, since neither of us is looking forward to another marathon Debian session.

Nostalgia. A couple of articles on Slashdot got me thinking about the good old days, so I downloaded VICE, a program that can emulate almost every computer Commodore ever built. Then I played around with a few Commodore disk images. It’s odd what I do and don’t remember. I kind of remember the keyboard layout. I remembered LOAD “*”,8,1 loads most games (and I know why that works too, and why the harder-to-type LOAD “0:*”,8,1 is safer), but I couldn’t remember where the Commodore keyboard layout put the *.

I sure wish I could remember people’s names half as well as I remember this mesozoic computer information.

It stands on shaky legal ground, but you can go to c64.com and grab images for just about any Commodore game ever created. The stuff there is still covered by copyright law, but in many cases the copyright holder has gone out of business and/or been bought out several times over, so there’s a good possibility the true copyright holder doesn’t even realize it anymore. Some copyright holders may care. Others don’t. Others have probably placed the work in the public domain. Of course, if you own the original disks for any of the software there, there’s no problem in downloading it. There’s a good possibility you can’t read your originals anyway.

I downloaded M.U.L.E., one of the greatest games of all time. I have friends who swear I was once an ace M.U.L.E. player, something of an addict. I have absolutely no recollection of that. I started figuring out the controls after I loaded it, but nothing seemed familiar, that’s for sure. I took to it pretty quickly. The strategy is simple to learn, but difficult to master. The user interface isn’t intuitive, but in those days they rarely were. And in those days, not many people cared.

Dave installs Windows XP

We needed an XP box at work for testing. Duty to do the dirty deed fell to me. So after ghosting the Windows 2000 station several of us share, I pulled out an XP CD. It installed surprisingly quickly–less than half an hour. The system is a P3-667 with 128 MB RAM and an IBM hard drive (I don’t know the model).
It found the network and had drivers for all the hardware in the box. That doesn’t happen very often with Microsoft OSs, so it was nice.

I booted into XP, to be greeted by a hillside that was just begging to be overrun by tanks, but instead of tanks, there was this humongo start menu. I right-clicked on the Start button, hit Properties, and picked Classic view. There. I had a Win95-like Start menu. While I was at it, I went back and picked small icons. I don’t like humongous Start menus.

I also don’t like training wheels and big, bubbly title bars. The system was dog slow, so I right-clicked on the desktop to see what I could find to turn off. I replaced the Windows XP theme with the Classic theme. Then I turned off that annoying fade effect.

Still, the system dragged. I went into Control Panel, System, Performance. Bingo. I could pick settings for best appearance (whose choices are certainly debatable–I guess they look good if you like bright colors and have a huge monitor) or best performance. Guess which I picked? Much better.

Next, I went into Networking. I saw some QoS thing. I did a search. It’s intended to improve the quality of your network, at the price of 20% of your bandwidth. Forget that. I killed it.

After I did all that stuff, XP was reasonably peppy. It logs on and off quickly. I installed Office 2000 and it worked fine. The apps loaded quickly–just a couple of seconds. That’s how it should be. If I went in and edited the shortcuts in the Start menu to turn off the splash screens, they’d load instantly.

WinXP brings up a bunch of popups that I don’t like. If I wanted unexpected popup windows, I’d run a Web browser. I couldn’t quickly figure out how to disable those.

I couldn’t run Windows Update. It froze every time I tried.

I found a Windows XP tuning guide at ExtremeTech. I suspect turning off the eye candy will help more than most of the suggestions in that article. I suspect if I dug around I’d find other things. We’ll see if I get some time.

XP isn’t as bad as I expected, I guess. But I’m still not going to buy it.

This, on the other hand, is worth a second look. And a third. You can now run MS Office on Linux. No need to wait for Lindows, no need to abandon your current fave distro (at least if your fave distro is Red Hat, SuSE, Mandrake, Debian, or Caldera).

It’s 55 bucks. It’s available today. It brings Office 97/2000 and Lotus Notes r5 to your Linux desktop. Other Windows apps work, but their functionality isn’t guaranteed.

You can get some screenshots at CodeWeavers. It even makes the apps look like native Linux apps.

Linux reliability

Linux reliability. Steve Mahaffey brought up a good point yesterday, while I was off on a consulting gig, where I learned one of the secrets of the universe–but since it’ll bore a lot of people to tears, I’ll save that for the end.
I’ve found that text-based apps and servers in Linux are extremely reliable. As David Huff’s tagline reads, “Linux: Because reboots are for upgrades.” If you’re running a server, that’s pretty much true. Unless you have to upgrade the kernel or install hardware that requires you to open the case, you can go for months or years without upgrading it.

The problem with Linux workstations is that up until very recently, the GUI apps people want to run the most have been in beta. The developers made no bones about their quality, but companies like Red Hat and Mandrake and SuSE have been shipping development versions of these apps anyway. On one hand, I don’t blame them. People want programs that will do what they’re used to doing in Windows. They want word processors that look like Word and mail clients that look like Outlook, and if they’re good enough–that is, they don’t crash much more than their Windows equivalents and they provide nearly as much functionality, or, in some cases, one or two things MS didn’t think of–they’ll put up with it. Because, let’s face it, for 50 bucks (or for nothing if you just download it off the ‘net) you’re getting something that’s capable of doing the job of Microsoft packages that would set you back at least $1,000. Even if you just use it for e-mail and Web access, you come out ahead.

The bigger bone I have to pick with Red Hat and Mandrake and, to some extent, even SuSE is where they put experimental code. I don’t mind experimental desktop apps–I’ve been running Galeon since around version 0.8 or so. But when you start using bleeding-edge versions of really low-level stuff like the C compiler and system libraries just to try to eke out some more performance, that really bothers me. There are better ways to improve performance than using experimental compilers. Not turning on every possible daemon (server) is a good start.

Compile beta-quality apps with a compiler that’s beta quality itself, and throw in every other bleeding-edge feature you can think of, and you’ll end up with a system that has the potential to rival Windows’ instability. Absolutely.

That’s one reason I like Debian. Debian releases seem to take as long as the Linux kernel does, and that’s frustrating, but reassuring. You can install the current stable Debian package, then add one or more of the more desirable apps from either the testing or unstable tree (despite the name, Debian unstable’s stability seems comparable to Mandrake) and have the best of all worlds. And when a .01 release of something comes out (which it always seems to do, and quickly) it’s two commands to upgrade to it.

It’ll be interesting to see how Lycoris (formerly Redmond Linux) pans out. Lycoris appears to take a more conservative approach, at least with the number of apps they install. If that conservatism extends to the versions of those packages they install, it’ll go a long way towards extending server Linux’s reliability to the desktop.

Debian is intimidating. I find it less intimidating than Slackware, but it does zero handholding during installation. So generally I recommend someone start with SuSE or Mandrake or Red Hat, get comfortable with how things are laid out, and get familiar with PC hardware if not already, and then, once feeling brave, tackle Debian. Debian is hard to install, but its quality is pristine and it’s exceptionally easy to maintain. Debian developers try to justify the difficulty of installing it by saying no one ever has to install it twice on the same PC, and they’re right about the second part. Eventually I expect they’ll take the installer from another distro that’s based on Debian to make it easier, but it won’t be in Debian 3.0 and it may not make it into 3.1 either.

The secret of consulting. My employer sent me off on a consulting gig yesterday. The main reason for it, I suspect, is because of my training as a journalist. It means I can ask questions, keep track of the answers, and make a PowerPoint presentation that looks decent.

Consultants get a bad rap because they’re notorious for not knowing anything. You pay lots of money to have someone who knows nothing about you and potentially nothing about your problem come in and ask questions, then come back later and give you a dog-and-pony show featuring sugar-coated versions of your answers and little else.

I won’t say who my client is, nor will I say who my employer is. What I will say is that my partner in this endeavor knows a whole lot more about the subject matter than I do. I’ll also say that the two of us are good researchers and can learn very quickly. Our regular job titles attest to that. We both have liberal arts degrees but we primarily work as systems administrators. We didn’t learn this stuff in school.

Up until Monday, I knew nothing about our client. Absolutely nothing. Up until yesterday afternoon, I knew nothing meaningful about the client. I knew its name and what its logo looked like, the name of one person who worked there, and I had a vague notion what they wanted to know.

I think that was an advantage. We both asked a lot of questions. I wrote down the answers quickly, along with whatever other information I could gleen. We left three hours later. I had six pages of typewritten notes and enough documents from them to fill a standard manilla file folder. We knew what they didn’t want, and we knew they were willing to throw money at the problem.

There’s such thing as knowing too much. One of the solutions they’re considering is overkill. The other is underkill. The difference in price between them is about 3 times our consulting fee. It took me another hour’s worth of research to find something that will give them the bare minimum of what they need for about $500 worth of additional equipment on top of the low-ball figure. When you’re talking the high-ball figure costing in excess of $40,000, that’s nothing. I found another approach that basically combines the two that will double the cost of the low-ball figure, but still save them enough to more than justify our fee.

I don’t know their internal politics or their priorities on the nice-to-have features. My job isn’t to tell them what to buy. Nor is it my job to give them my opinion on what they should buy. My job is to give them their options, based on the bare, basic facts. Whatever they buy, my feelings won’t be hurt, and there’s every possibility I’ll never see them again. They’ll make a better-informed decision than they would have if they’d never met me, and that’s the important thing to all involved.

I never thought I’d be able to justify a role as a high-priced expert on nothing relevant. But in this case at least, being an expert on absolutely nothing relevant is probably the best thing I could have brought to the table.

And since we haven’t done a whole lot of this kind of consulting before, I’ll get to establish some precedents and blaze a trail for future projects. That’s cool.

That other thing. There’s a lot of talk about the current scandal in Roman Catholicism. It’s not a new scandal; it’s been a dirty little — and not very well-kept — secret for years. There’s more to the issue than we’re reading in the papers. I’ll talk about that tomorrow. I come neither to defend nor condemn the Roman Catholic church. Its problems aren’t unique to Catholicism and they’re not unique to the Christianity either. Just ask my former Scoutmaster, whose filthy deeds earned him some hefty jail time a decade and a half ago.

Stay tuned.

A couple of quick things

Scanners in Windows 2000. While those two pompous, arrogant gits were out romping about and insulting one another, I was helping Gatermann put together an all-SCSI Windows 2000 system. I talked about that earlier this week. After much wrestling, we got the system booting and working, but his expensive Canon film scanner, which was the reason for all of this adventuring in the first place–his eclectic mix of Ultra160 and SCSI-2 and internal and external components was too much for his old card to handle–wouldn’t work under 2000. It worked fine in Win98, however. But if you’re scanning film, you’re pretty serious about your work, and 2000’s lack of stability is bad enough, while Win98’s lack of stability is enraging.
Side note: His scanner worked just fine in Linux with SANE and GIMP. The SANE driver was alpha-quality, but once he figured out the mislabeled buttons, it worked. Though flawed, it was no worse than a lot of drivers people ship for Windows, and it wasn’t any harder to set up either. Not bad, especially considering what he paid for it.

Gatermann, being a resourceful sort, did a search on Google groups and found a suggestion that he update his ASPI drivers. Since he had an Adaptec card, he could freely download and use Adaptec’s ASPI layer. He did, and the scanner started working.

It’s been a long time since I’ve had to do that to get a scanner going, but it’s been a long time since I’ve set up a SCSI scanner too.

Debian. At work on Friday, I booted the computer on my desk into Linux out of protest (more on that later… a lot more) and I figured while I was in Linux reading and responding to e-mail and keeping up with the usual news sources (I wasn’t having to do any NT administration at the time, which was why I was able to protest), I’d run apt-get update and apt-get upgrade. I run Debian Unstable at work, because Debian Unstable, though it’s considered alpha, is still every bit as stable as the stuff Mandrake and Red Hat have been pushing out the door the past 18 months. It’s also about as close to cutting-edge as I want to live on. Well, it had been a while since I did an update, and I was pleasantly surprised to find I suddenly had antialiased text in Galeon. That’s been my only gripe about Galeon until recently; the fonts looked OK, but they looked a whole lot better in Windows or on a Mac. The quality of the antialiasing still isn’t as good as in Windows, which in turn isn’t as good as on the Mac, but it’s better than none at all.

Galeon was already faster than any Windows-based browser I’d seen, but a recent Galeon build combined with the 0.99 build of Mozilla seemed even faster, and Web sites that previously didn’t render quite right (like Dan’s Data now rendered the same way as in that big, ugly browser from that monopolist in Redmond.

I expect with these last couple of updates, I’ll be spending even more time in Linux from here on out. I already have a full-time Linux station, but I use it about half the time and my Windows 2000 station about half the time. I may limit the Windows 2000 station to video editing very soon. And with some of the cool video programs out there for Linux now, it may share time. I suspect I’ll be doing editing on the Windows box, post-production on the Linux box, and then outputting the results to tape on the Windows box.

Adventure in SCSI

Gatermann called me last night. He’d gotten his new Adaptec 19160, but Windows 2000 wouldn’t recognize it. Unfortunately, he’d reformatted his main drive too, so there was no going back and cheating by installing both his old and new SCSI cards side by side, then installing the driver, then pulling the old card and moving the drive.
We tried a couple of things over the phone. No go. I suggested he try installing Linux just to make sure the card was good.

By the time I arrived, he had a working Red Hat 7.2 configuration. Put that in your pipe and smoke it, Microsoft.

We downloaded the latest Adaptec PCI drivers, using his other Linux box. Windows 2000 didn’t like them. We downloaded the previous version. That one, unlike the other, had a dedicated Adaptec 19160 driver. We installed that and it actually worked.

Forty minutes later, we had a working all-SCSI Windows configuration.

I like Linux more and more every day.

Full disclosure and integrity

I feel like I owe it to my readers to disclose a few things, due to the events of recent weeks raising a few questions in some people’s minds.

Read more

Trolling the web for nothing in particular

Yes, Brian, baseball will soon return. I hate the things Major League Baseball does (Bob Costas once likened choosing sides between the players and the owners to choosing sides between Iran and Iraq), but we’ve chosen to stay together for the kids. I’m sure everyone who cares (and some who don’t) can guess what I think of Bud Selig, but I’ll tell you anyway, soon enough.
In the meantime, I look like ArsTechnica today. Oh well. I don’t do this very often.

Blogging. Wired News had its take on the phenomenon, and threw out some interesting stats.


In January alone, at least 41,000 people created new blogs using Blogger, and that number is always increasing, [Blogger founder Evan] Williams said. Some have put the total number of weblogs at more than 500,000.

Alongside the boom, however, there have recently been a few faint signs of backlash. As increasing hordes take on the task of trying to keep new sites looking nice, sounding original and free from banalities, more hordes just seem to fail.

Blog critic Dave Linabury offered a recipe for success:


“It really can take a lot of time,” he said. “I spend two hours a day on my weblog. Many people don’t realize this, they think it’s a quick way to get popular. And after awhile they get really discouraged and say, ‘he got 2,300 hits today, I got four.’ The bulk of people out there get less than two dozen hits.”

“I don’t want to be elitist,” Linabury added, “but all these people out there with popular weblogs, they’ve been doing it longer and they stick to their guns.”

I can attest to that. The people who get more traffic than I get almost all have been doing this longer. But I can tell you one thing: It’s never enough. Back when I was getting 80 visits a day I wanted 150. When I was getting 150 visits a day, I wanted 250. Now that I get about 500 visits a day, I’m awfully distressed to see people are getting 2,300. And by the time I reach 2,300, I’m sure there will be people getting 5,000 or even 10,000. (Note that visits are the number of unique visitors; hits are the number of files served up. Hit count is deceptive. I get 500 visits per day but closer to 1,000 or even 1,500 hits per day, due to people visiting, reading comments, and then often reading something from a previous week. And if they do a search, that’s at least two additional hits.)

Link

Another feather in Internet Explorer’s cap. To my knowledge, no new security vulnerabilities have been reported in Internet Explorer this week, but the newest security patch, released last week, contains a bug that can cause a VBscript directive that previously worked to crash the browser.

Microsoft says Webmasters need to modify their pages not to use the directive.

That’s nice (I don’t use VBscript on this site) but there are embedded devices, such as HP’s JetDirect card, that use the directive. So early adopters of this patch may find themselves unable to do their jobs.

Better webmaster recommendation: Don’t use VBscript or ActiveX or other Microsoft-owned languages in your Web pages at all. Better end-user recommendation: Use Mozilla or a derivative instead of Internet Explorer.

Link

Recompiling Debian for your hardware. This thread comes up every so often, and with the popularity of Linux From Scratch and Gentoo, the appeal of a compiled-from-scratch Debian is undeniable. But does the small speed improvement offset the increased difficulty and time in upgrading?

The consensus seems to be that recompiling gzip, bzip2, and gnupg with aggressive options makes sense, as does recompiling your kernel. Recompiling XFree86 may also make some sense. But expending time and energy in the perfectly optimized versions of ls and more is foolhardy. (Especially seeing as speed demons can just get assembly language versions of them from www.linuxassembly.org.)

Link

A Guide to Debian. This is a guide, still incomplete, that gives a number of tips for someone who’s just installed Debian. The tips are applicable to other many other Linux (and even Unix) flavors as well.

Link

Spam. A coworker walked into my cube today and asked me how he could keep web robots from harvesting e-mail addresses from his web site. I found myself referring once again to the definitive piece on the subject, from Brett Glass (who gets my nomination for the greatest computer columnist of all time, for what that’s worth).

Link

The RULE project. A project has emerged to bring Red Hat Linux back to its roots, and allow it to run on older, less-powerful hardware.

From their site:


This install option is meant to benefit primarily two classes of users:

* GNU/Linux newbies who cannot afford modern computers, but still need, to get started more easily, an up to date, well documented distribution.
* System administrators and power users who have no interest in eye candy, and want to run updated software on whatever hardware is available, to minimize costs, or just because it feels like the right thing to do.

I love their FAQ. Check this out:


1.0 Hardware is so cheap today, why bother?

1. This is a very limited and egoistic attitude. Eigthy per cent of the world population still has to work many months or years to afford a computer that can run decently the majority of modern, apparently “Free” software.
2. Many people who could afford a new computer every two years rightly prefer to buy something else, like vacations, for example…. Hardware should be changed only when it breaks, or when the user’s needs increase a lot (for example when one starts to do video editing). Not because “Free” Software requires more and more expensive hardware every year.

These guys have the right idea. I can only hope their work will influence other Linux distributions as well.

Link

Linux uptime. (Sure, a little original content.) When I was rearranging things months ago, I unplugged the keyboard and monitor from my webserver, then I never got around to plugging them back in because I didn’t have to do anything with it.

The other day, I had occasion to plug a keyboard and mouse back into it. I went in, did what I wanted to do, then out of curiosity I typed the uptime command. 255 days, it told me. In other words, I haven’t rebooted since last May, which, as I recall, was about when I put the machine into production.

Linkfest Friday…

Let’s start things off with some links. Web development’s been on my mind the last few days. There’s a whole other world I’ve been wanting to explore for a couple of years, and I’ve finally collected the information that’ll let me do it.
Redirecting virus attacks — Your neighbor’s got Nimda? Here’s how to get his IIS server to quit harassing your Apache server. (Suggests redirecting to a bogus address; I’m inclined to redirect either to 127.0.0.1 or www.microsoft.com, personally.)

DJG’s help setting up MySQL. Apache, MySQL and PHP are a fabulous combination, but bootstrapping it can be a painful process. People talk about writing a sendmail.cf file as their loss of innocence, but I’ve written one of those and I’ve tried to set up the LAMP quartet. The sendmail.cf file was easier because there’s a whole lot more written about it.

Short version: Use Debian. Forget all the other distributions, because they’ll install the pieces, but rarely do they put the conduits in place for the three pieces to talk. It’s much easier to just download and compile the source. If that doesn’t sound like fun to you, use Debian and save some heartache. If you’re stuck with the distro you have, download ApacheToolbox and use it. You’ll probably have to configure your C/C++ compiler and development libraries. That’s not as bad as it sounds, but I’m biased. I’ve compiled entire distributions by hand–to the point that I’ve taken Linux From Scratch, decided I didn’t like some of the components they used because they were too bloated for me, and replaced them with slimmer alternatives. (The result mostly worked. Mostly.) You’ve gotta be a bit of a gearhead to take that approach.

Debian’s easier. Let’s follow that. Use this command sequence:

apt-get install apache
apt-get install php4-mysql
apt-get install mysql-server

Next, edit /etc/apache/httpd.conf. There’s a commented-out line in there that loads the php4 module. Uncomment that. Just search for php. It’ll be the third or fourth instance. Also, search for index.html. To that line, add the argument index.php. If you make index.php the first argument, access to PHP pages will be slightly faster. Pull out any filetypes you’re not using–if you’ll never make an index page called anything but index.html or index.php, pull the others and Apache will perform better.

Got that? Apache’s configured. Yes, the php installation could make those changes for you. It doesn’t. I’m not sure why. But trust me, this is a whole lot less painful than it is under Red Hat.

But you’re not ready to go just yet. If you try to go now, MySQL will just deny everything. Read this to get you the rest of the way.

Once you’ve got that in place, there are literally thousands of PHP and PHP/MySQL apps and applets out there. If you can imagine it, you can build it. If HTML is a 2D world, PHP and MySQL are the third and fourth dimension.

Am I going to be playing in that world? You’d better believe it. How soon? It depends on how quickly I can get my content whipped into shape for importing.

This is the holy grail. My first editing job was doing markup for the Digital Missourian, which the faculty at the University of Missouri School of Journalism believe was the first electronic newspaper (it came into being in 1986 or so). By the time I was working there in the late summer of 1995, it had been on the ‘Net for several years. About eight of us sat in a room that was originally a big storage closet, hunched in front of 486s, pulling stories off the copydesk, adding HTML markup, and FTPing them to a big Unix cluster on the MU campus. We ran a programmable word processor called DeScribe, and we worked out some macros to help speed along the markup.

No big operation works that way anymore. There aren’t enough college students in the world. You feed your content to a database, be it Oracle or IBM DB2 or Microsoft SQL Server or MySQL or PostgresSQL. Rather than coding in straight HTML, you use a scripting language–be it PHP or ASP–that queries the database, pulls the content, applies a template, and generates the HTML on the fly. The story goes from the copy editor’s desk to the Web with no human intervention.

There are distinct advantages to this approach even for a small-time operation like me. Putting the content in a database gives you much more versatility. Some people want overdesigned Web sites. Some want something middle-ground, like this one. Others want black text on a gray background like we had in 1994. You can offer selectable formats to them. You can offer printer-friendly pages. You can even generate PDFs on the fly if you want–something some sites are doing now in an effort to gain revenue. If you have content from various sources, you can slice and dice and combine it in any imaginable way.

I can’t wait.

Optimizing a Linux box in-place

Here’s the Linux bit I promised yesterday. I wrote it much earlier, so I might as well throw it out there.
Our test firewall at work is an old Pentium-200 running Red Hat Linux and a commercial firewall app. (No, I won’t disclose which one. Security, you know.) It’s a bit slow. A P200 is severe overkill for the firewall built into the Linux kernel (Steve DeLassus and I made a firewall out of the first PC he ever bought, a 486SX/20 of 1992 vintage, which, save the loss of the original power supply in an electrical storm, has never required any service), but this commercial package does a lot more than the simple firewalls built into Unixish kernels do.

It had 72 megs of RAM in it and swapped mercilessly. Its speed seemed to be OK once it was booted, but seeing as this is a testbed, it tends to get rebooted an awful lot. I needed to do something for it.

So I trekked into the PC graveyard to see what I could dig up. I found a Compaq 386DX/20. I left that alone. That’ll be useful if I ever need to pillage a pair of Compaq drive rails, which has happened before. Unfortunately those rails are worth more than the rest of the computer. I also spotted a Mac SE. That’ll be handy if I ever need a doorstop. Then I found a Pentium-75 and another Pentium of unknown speed. I opened them up. The 75 had a pair of 16-meg sticks. I opened up the unknown Pentium and looked inside. Ugh. Socket 4. That meant it was a Pentium-60, or, at best, a Pentium-66. It had a pair of 8-meg sticks.

I pulled the memory sticks out of the 75. The 60 didn’t have anything usable in it, save a pair of hard drives, both 540 megs, one a Quantum and the other a Seagate. I took the Seagate because it was easier to unbolt. I don’t have any way of knowing at this late date which of those drives was the better performer, and it probably doesn’t make much difference anymore.

The idea was to add some memory, and put in a second hard drive dedicated to virtual memory. Since the likelihood of the machine needing to read data from a drive and simultaneously hit virtual memory was fairly high, I wanted the virtual memory on its own drive. Furthermore, Linux’s partition-read
mechanism isn’t terribly efficient. This doesn’t matter for SCSI drives, which re-order I/O events, but for IDE drives it matters a lot. So getting the swap partition onto a dedicated drive was likely to improve performance a fair bit. (If this were a production system, it would probably have a SCSI
drive in it.)

So I swapped in the 16s for the 4s and found an empty bay to hold the 540, which I put on the second IDE channel as master (another performance trick), and booted Linux. The next trick is to use your favorite disk partitioning tool (I like cfdisk, but I can navigate plain old fdisk) to blow away whatever partition is on the new drive (this one was /dev/hdc) and create a single partition. I just made it the size of the drive, since 2.4 can deal with large swap partitions and Linux is smart enough to use whatever virtual memory it needs, not just automatically use all it has available. Then I set
it to type 82. Linux can do swapfiles, but a filesystemless dedicated swap partition gives better performance.

Next, I edited /etc/fstab. I found an entry for the swap partition pointing at /dev/hda2. I changed that to /dev/hdc1. That means I now have a small swap partition just sitting on the first drive unused, but that’s not a big deal to me. The system’s not using the disk space it has. While I was there, I noticed the CD-ROM drive was pointing at /dev/cdrom. I asked Charlie, our Unix/Linux guru, if Red Hat had some intelligence I didn’t know about. He said /dev/cdrom was just a symlink. I changed the entry to read /dev/hdd, which is where the CD-ROM drive ended up after my shuffle. Better to just code things directly than try to track symlinks, in my estimation.

Next, I issued the command mkswap /dev/hdc1 to initialize the swap partition. Then I rebooted and listened.

Indeed, during boot, the second drive was getting activity. I logged in and ran top, then hit shift-M to have a look at memory usage. The firewalling software was eating up a lot. But swap usage was down.

I decided to try cutting memory usage down a little more. I loaded /etc/inittab into vi. Red Hat by default gives you six virtual consoles. This machine has little need for more than two. Pulling the extras saves you a couple of megs. Near the end of the file you’ll see several lines that look something like this:

1:2345:respawn:/sbin/mingetty 38400 tty1

I commented out the last four of those. Hit the i key to put vi in insert mode, scroll down to those lines, add a # to the beginning of them, then hit ESC, then hit ZZ (shift-Z twice) to rapidly save the file, no questions asked. (I know, vi ain’t friendly, but it’s there.)

Then I had a look at /etc/rc3.d to see what daemons were running. I found apmd, sendmail, and gpm running. That was a waste of a couple megs, not to mention a possible security risk. I vaguely remember all three of them having had security issues in the past, and sendmail is one of those programs that should never be running unless you need it. Yes, this machine’s just practice, but Hall of Fame catcher Johnny Bench found that if he got sloppy and just let wild pitches go while he was warming up pitchers, he wasn’t as sharp at blocking potential wild pitches during the game when
it counted. So he worked just as hard during practice as he did during the game. Now he’s considered the greatest catcher of all time.

So I applied the Johnny Bench principle and disabled them with the following command sequence:

mv /etc/rc3.d/S26apmd /etc/rc3.d/K26apmd
mv /etc/rc3.d/S80sendmail /etc/rc3.d/K80sendmail
mv /etc/rc3.d/S85gpm /etc/rc3.d/K85gpm

I rebooted to find memory usage down by about 4 megs and the system booted a little faster. It was also more secure.

Total downtime: About 45 minutes.

That was time well spent. I may end up having to just bite the bullet and get some memory, but the system will perform better with these changes no matter how much memory is in it. And, more importantly, performing this exercise made me notice something I hadn’t noticed before. It let me tighten up security.

Had I blindly just ordered some memory to put in the system, or a new PC, like some people unfortunately advocate, I wouldn’t have necessarily noticed that as quickly.


Speaking of Linux, I did finally get Apache, PHP, and MySQL all talking together on my church’s 486. I used phpWeblog, which is an awfully nice package. Pages load in an acceptable two seconds. I notice the machine is paging, so a little more memory will probably help that. It’s amazing that people are throwing away Pentium-class machines when even a 486 has enough power to be a decent intranet server.

Not everyone’s so fortunate as you and me. Give ’em to someone who can use them if you don’t want them.