Optimizing Web graphics

Gatermann told me about a piece of freeware he found on one of my favorite sites, tinyapps.org, called JPG Cleaner. It strips out the thumbnails and other metadata that editing programs and digital cameras put in your graphics but that your Web browser doesn’t need to render them. Sometimes it saves you 20K, and sometimes it saves you 16 bytes. Still, it’s worth doing, because more often than not it saves you something halfway significant.

That’s great, but I don’t want to be tied to Windows, so I went looking for a similar Linux program. There isn’t much. All I was able to find was a command-line program, written in 1996, called jpegoptim. I downloaded the source, but didn’t have the headers to compile it. I went digging and found that someone built an RPM for it back in 1997, but Red Hat never officially adopted it. I guess it’s just too special-purpose. The RPM is floating around; I found it on a Japanese site. If that ever goes away, just do a Google search for jpegoptim-1.1-0.i386.rpm.

The RPM contains just a 12K binary, so there’s nothing to installing it. If you prefer an RPM-based distribution such as SuSE, TurboLinux, Mandrake, or Caldera, it’ll install just fine for you. Debian users can convert it with no problem; I used the Debian utility alien to turn the RPM into a Debian package.

Jpegoptim actually goes a step further than JPG Cleaner. Aside from discarding all that metadata in the header, its main claim is that it optimizes the Huffman tables used to encode the image data itself, reducing the file size without affecting image quality at all. The difference varies; I ran it on several megabytes’ worth of graphics, and found that on images that still had all those headers, it frequently shaved 20-35K from their size. On images that didn’t have all the extra baggage (including some that I’d optimized with JPG Cleaner), it reduced the file size by another 1.5-3 percent. That’s not a huge amount, but on a 3K image, that’s roughly 45-90 bytes. On a Web page that has lots of small images, those bytes add up. Your modem-based users will notice it.
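To put numbers on those percentages, here is a quick sketch in plain shell arithmetic. The function name bytes_saved is made up for illustration, and the percentage is passed in tenths of a percent only because the shell does integer math.

```shell
# Bytes saved when a file of SIZE bytes shrinks by TENTHS tenths of a percent.
# bytes_saved is a hypothetical helper; integer math rounds the result down.
bytes_saved() {
    size=$1
    tenths=$2
    echo $(( size * tenths / 1000 ))
}

bytes_saved 3072 15    # 1.5 percent of a 3K image: prints 46
bytes_saved 3072 30    # 3 percent of a 3K image: prints 92
```

On a hypothetical page with 40 such images, that works out to somewhere between roughly 1.8K and 3.6K less for a modem to pull down.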

Jpegoptim will also let you do the standard JPEG optimization, where you set the image quality to a numeric value between 1 and 100, with higher values staying truer to the original. Some image editors don’t let you adjust the quality in a very fine-grained manner. I’ve found that a level of 70 is almost always perfectly acceptable.

So, to try to get something for nothing, change into an image directory and type this:

jpegoptim -t *

And the program will see what it can save you. Don’t worry if you get a negative number; if the “optimized” file ends up actually being bigger, it’ll discard the results.
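The keep-it-only-if-it’s-smaller rule is easy to picture in shell. This is only a sketch of the idea, not jpegoptim’s actual code; the function name keep_if_smaller and its two-file interface are assumptions made for illustration.

```shell
# Replace ORIGINAL with CANDIDATE only when CANDIDATE is strictly smaller;
# otherwise throw CANDIDATE away. Hypothetical helper, not part of jpegoptim.
keep_if_smaller() {
    original=$1
    candidate=$2
    old_size=$(wc -c < "$original") || return 1
    new_size=$(wc -c < "$candidate") || return 1
    # Normalize to plain integers (some wc versions pad with spaces).
    old_size=$(( old_size ))
    new_size=$(( new_size ))
    if [ "$new_size" -lt "$old_size" ]; then
        mv "$candidate" "$original"
        echo "saved $(( old_size - new_size )) bytes"
    else
        rm -f "$candidate"
        echo "no savings; kept original"
    fi
}
```

You would call it as keep_if_smaller photo.jpg photo.tmp after writing an optimized copy to photo.tmp by whatever means.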

To lower the quality and potentially save even more, do this:

jpegoptim -m70 -t *

And once again, it’ll tell you what it saves you. (The program always optimizes the Huffman tables, so there’s no need to do multiple steps.) Be sure to eyeball the results if you play with quality, and back up the originals.
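Since lowering the quality throws data away for good, it’s worth scripting the backup so you never forget it. Here is a minimal sketch, assuming your images end in .jpg and that a subdirectory named originals is an acceptable place for the copies; the function names are mine, and the jpegoptim line uses only the -m70 and -t switches shown above.

```shell
# Copy every .jpg in DIR into DIR/originals before touching anything.
# The "originals" directory name is an assumption made for this sketch.
backup_jpegs() {
    dir=$1
    mkdir -p "$dir/originals" || return 1
    for f in "$dir"/*.jpg; do
        [ -f "$f" ] || continue          # skip when the glob matched nothing
        cp -p "$f" "$dir/originals/" || return 1
    done
}

# Back up first, then run jpegoptim at quality 70 if it is installed.
optimize_dir() {
    dir=$1
    backup_jpegs "$dir" || return 1
    set -- "$dir"/*.jpg
    [ -f "$1" ] || { echo "no .jpg files in $dir" >&2; return 0; }
    if command -v jpegoptim >/dev/null 2>&1; then
        jpegoptim -m70 -t "$@"
    else
        echo "jpegoptim not installed; originals copied, nothing changed" >&2
    fi
}
```

Running optimize_dir . from inside an image directory does the whole job; if you don’t like what you see, copy the files back out of originals.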

Commercial programs that claim to do what these programs do cost anywhere from $50 to $100. This program may be obscure, but its obscurity is criminal. Go get it and take advantage of it.

Also, don’t forget the general rule of file formats. GIF is the most backward-compatible, but it’s encumbered by patents and it’s limited to 256-color images. It’s good for line drawings and cartoons, because it’s a lossless format (it only compresses the data; it doesn’t change it).

PNG is the successor to GIF, sporting better compression and support for 24-bit color images. Like GIF, it’s lossless, so it’s good for line drawings, cartoons, and photographs that require every detail to be preserved. Unfortunately, not all browsers support PNG.
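The gap between the two formats is easier to appreciate as arithmetic. GIF’s 256 colors come from an 8-bit palette, while PNG’s truecolor mode stores 24 bits per pixel. A quick sketch, with a function name invented for illustration:

```shell
# Number of distinct colors representable with BITS bits per pixel.
# colors_for_bits is a hypothetical helper used only for this illustration.
colors_for_bits() {
    echo $(( 1 << $1 ))
}

colors_for_bits 8     # GIF palette: prints 256
colors_for_bits 24    # PNG truecolor: prints 16777216
```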

JPEG has the best compression, because it’s lossy. That means it looks for details that it can discard to make the image compress better. The problem with this is that when you edit JPEGs, especially if you convert them between formats, you’ll run into generation loss. Since JPEG is lossy, line drawings and cartoons generally look really bad in JPEG format. Photographs, which usually have a lot of subtle detail, survive JPEG’s onslaught much better. The advantage of JPEG is the file sizes are much smaller. But you should always examine a JPEG before putting it on the Web; blindly compressing your pictures with high compression settings can lead to hideous results. There’s not much point in squeezing an image down to 1.5K when the result is something no one wants to look at.

Linkfest.

I felt downright awful yesterday, but it’s my own fault. I remember now why I don’t take vitamins with breakfast. Very bad things happen.

So I’m whupped, and I’m not going to post anything original today. Just some stuff I’ve found lately and haven’t gotten around to posting anywhere.

But first, something to keep in the back of your mind: If The Good News Players, a drama troupe from the Concordia University system, is ever visiting a Lutheran church near you, be sure to go check it out. They are amazing. I put myself together enough to catch them at my church last night and I didn’t regret it in the least. They tell Bible stories in the form of mini-musicals; they’re easy to understand, professional, and just plain funny.

Linux OCR. This is huge. It’s not quite production-quality yet, but then again, neither is the cheap OCR software shipped with most cheap scanners. Check it out at claraocr.org.

It would seem to me that this is the missing link for a lot of small offices to dump Windows. Linux has always been a good network OS, providing fileshares, mail and Web services. Put Zope on your Web server and you can update your company’s site without needing anything like FrontPage. WordPerfect for Linux is available, and secretaries generally love WordPerfect, as do lawyers. ClaraOCR provides an OCR package. SANE enables a large number of scanners. GIMP is available for graphics work. And we’re close to getting a good e-mail client. And the whole shebang costs less than Windows Me.

Linux VMs, without VMware. This is just plain cool. If, for security reasons, you want one service per server, but you don’t have the budget or space for 47 servers in your server room, you can use the User-Mode Linux kernel. (The load on most Linux servers is awfully light anyway, assuming recent hardware.) This Linux Magazine article describes the process. I could see this being killer for firewalls. On one machine, create several firewalls, each using a slightly different distribution and ruleset, and route them around. “Screw you, l337 h4x0r5! You are in a maze of twisty passages, all alike!”

And a tip. I find things by typing dir /s [whatever I’m looking for] from a DOS prompt. I’m old-fashioned that way. There’s no equivalent syntax for Unix’s ls command. But Unix provides find. Here’s how you use it:

find [subdirectory] -name [filename]

So if I log in as root and my Web browser goes nuts and saves a file somewhere it shouldn’t have and I can’t find it, I can use:

find / -name "obnoxious_iso_image_I'd_rather_not_download_again.iso"

Or if I put a file somewhere in my Web hierarchy and lose it:

find /var/www -name dave.jpg
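A couple of refinements come in handy once you start leaning on find. This sketch wraps them in two small functions whose names are made up for illustration. It quotes wildcard patterns so the shell doesn’t expand them first, throws away “Permission denied” noise, and uses -iname for case-insensitive matching, which GNU and BSD find both support.

```shell
# Search DIR and everything under it for names matching PATTERN.
# Errors such as "Permission denied" are discarded.
find_by_name() {
    dir=$1
    pattern=$2
    find "$dir" -name "$pattern" -print 2>/dev/null
}

# Same search, ignoring upper/lower case (-iname: GNU and BSD find).
find_by_iname() {
    dir=$1
    pattern=$2
    find "$dir" -iname "$pattern" -print 2>/dev/null
}
```

So find_by_name /var/www 'dave*' turns up dave.jpg along with anything else starting with dave, and find_by_iname /var/www '*.jpg' catches files a digital camera named in all caps.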

Windows XP activation cracked. Here’s good news, courtesy of David Huff:

Seems that the staff of Germany’s Tecchannel has demonstrated that WinXP’s product activation scheme is full of (gaping) holes:

WinXP product activation cracked: totally, horribly, fatally and
Windows Product Activation compromised (English version)