Data compression, 1980s-style, and why PKZIP won

My employer has me doing some very gray-hat work that I don’t want to describe in detail, because the information has tremendous potential for misuse. Suffice it to say I’ve been trying to send data places the data shouldn’t go, and I tried to do it by going all 1987 on it: I compressed the data with obsolete compression programs. Ever heard of security by obscurity? I was trying to bypass security by using obscurity. In the process, I learned why PKZIP won the compression wars.

Tricks like this are a very good reason to deploy 64-bit Windows 7 in your enterprise, because these obsolete 16-bit compression programs won’t run under 64-bit Windows. I found that out the hard way once I got the data through to the other side and tried to decompress it. Oops. But tell me, what’s the legitimate business need to run 16-bit DOS applications in 2014? Maybe in a sizable company, one or two people have that need. Find some other way to accommodate them, and make life difficult for attackers, OK?

I say this because I was able to get the data where I wanted it to go. But once it was there, and I had moved it to a machine that could run my 16-bit decompression program (back then the compressor and decompressor were often different programs), the data was corrupted more often than not.

Of course, in my BBSing days, it sure seemed like a lot of my downloads wouldn’t decompress correctly, or they’d decompress but the program wouldn’t run. I always blamed my modem and line noise, the bane of BBSing in days of yore. But, for some reason, after PKZIP came along and became popular, downloads worked a lot better. Then along came some other programs like LHARC and its cousins, and they were perfectly reliable too, and tended to compress better than PKZIP did. Naturally, I became a fan. If it’s better and doomed to fail, I always like it. PKZIP of course was the first one to be really reliable, so it quickly became entrenched, and its format won. You don’t see .LZH or .LHA files in the English-speaking world anymore.

So I guess I owe my modems an apology. In an environment free of line noise, those early, finicky, boy band-loving compression programs still failed too often for me to do what I wanted to do.

On a semi-related note, the algorithm behind the ZIP format could sometimes compress better than PKZIP, the original program, managed to. Here’s some info on alternative ZIP utilities that compress better.

The ultimate command-line ZIP utility

I accidentally find Ken Silverman’s utility page from time to time and can never find it again when I want it, so if you need the ultimate command-line ZIP utility (KZIP), or the ultimate PNG optimizer (PNGOUT), to squeeze just as many bytes as possible out of your recompressed archives or your images while maintaining 100% compatibility, save this link. You’ll thank me later when you need it badly, like when you’re e-mailing an archive and it’s a few dozen bytes larger than your e-mail system allows.

Also check out his clever ZIPMIX utility.

What makes his approach to ZIP archiving special is that he emphasizes file size over speed. His software is built to take a few extra seconds to save a few bytes, if it’s possible to do so. Mainstream Zip/Unzip programs will still decompress his archives just fine; they just won’t match it for compression ratio most of the time. And in the rare event that they do, his ZIPMIX utility will take advantage of that. Just zip up the same files with both programs, then run ZIPMIX on the two archives. So Ken Silverman’s utilities win even when he loses.
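To make the "zip it twice, keep the best" idea concrete, here is a minimal Python sketch. It is not how ZIPMIX is implemented, and it doesn't write a merged archive. It only reads two ZIP files and reports, file by file, which one stores each entry in fewer compressed bytes. The function names smaller_entries and make_archive are hypothetical helpers I made up for illustration, and I'm assuming the comparison happens per file.

```python
import io
import zipfile


def smaller_entries(archive_a, archive_b):
    """For each file name found in both archives, report which archive
    ("a" or "b") stores it in fewer compressed bytes, and that size."""
    with zipfile.ZipFile(archive_a) as za, zipfile.ZipFile(archive_b) as zb:
        sizes_a = {info.filename: info.compress_size for info in za.infolist()}
        sizes_b = {info.filename: info.compress_size for info in zb.infolist()}

    winners = {}
    for name in sizes_a.keys() & sizes_b.keys():
        if sizes_a[name] <= sizes_b[name]:
            winners[name] = ("a", sizes_a[name])
        else:
            winners[name] = ("b", sizes_b[name])
    return winners


def make_archive(files, level):
    """Build an in-memory ZIP at the given Deflate level.

    Hypothetical helper, standing in for "zip the same files with two
    different programs".
    """
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", zipfile.ZIP_DEFLATED, compresslevel=level) as zf:
        for name, data in files.items():
            zf.writestr(name, data)
    buf.seek(0)
    return buf


if __name__ == "__main__":
    files = {
        "readme.txt": b"hello world " * 500,
        "data.bin": bytes(range(256)) * 40,
    }
    fast = make_archive(files, 1)   # stand-in for the first archiver
    best = make_archive(files, 9)   # stand-in for the second archiver
    for name, (source, size) in sorted(smaller_entries(fast, best).items()):
        print(name, "-> archive", source, "at", size, "bytes")
```

With real archives, you would pass the two file paths instead of the in-memory buffers. A tool that actually merges the winners has to copy the compressed data across, which this sketch doesn't attempt.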

I first noticed this phenomenon when using Info-Zip, when I found its -9 option produced smaller archives than PKZIP’s -max option. The first thing I did was make sure PKZIP could uncompress the Info-Zip archive I’d created. It did, so I never used PKZIP to create an archive again. And every once in a while I find another tool that does better than the last best one I found. Right now Ken Silverman’s utilities are it.
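If you want to see the effort-versus-size tradeoff for yourself without installing anything, Python's zlib module exposes Deflate compression levels 1 through 9. I'm using those levels as a stand-in for options like -9. How any particular Zip program maps its options onto them is an assumption on my part, and deflate_sizes is a hypothetical helper name.

```python
import zlib


def deflate_sizes(data, levels=(1, 6, 9)):
    """Compress the same bytes at several Deflate levels and return
    a dict of {level: compressed size in bytes}."""
    sizes = {}
    for level in levels:
        packed = zlib.compress(data, level)
        # Whatever the level, the output must decompress to the same bytes.
        if zlib.decompress(packed) != data:
            raise ValueError("round trip failed at level %d" % level)
        sizes[level] = len(packed)
    return sizes


if __name__ == "__main__":
    sample = b"The quick brown fox jumps over the lazy dog.\n" * 200
    print("original:", len(sample), "bytes")
    for level, size in deflate_sizes(sample).items():
        print("level", level, "->", size, "bytes")
```

Every level produces output that any Deflate decompressor can read, which is the same reason a more aggressive Zip tool can stay compatible with everyone else's unzip program.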

I have an unusual appreciation of smaller archives. That’s because I’m old enough to have downloaded files over a 300-baud modem (but also young enough to remember having done so). Ken Silverman practices a lost art, and maybe there aren’t a lot of people left who appreciate that, but I still do.

A Free, Open-Source alternative to WinZip

Free graphical Zip/Unzip programs for Windows have come and gone. I’m always looking for one because I don’t use a graphical one all that often, preferring the command-line utilities from Info-Zip that I’ve been using since 1991.

But sometimes the graphical interface makes things easier. Info-Zip has a GUI front-end, but it’s difficult to install, at least compared to the typical Windows program. Power Archiver used to be free, but it’s slow, and now it’s shareware, and frankly, I don’t think it offers much of anything that WinZip or PKZIP for Windows doesn’t.

Enter 7-Zip. It’s easy, it’s GPL, it handles all the common file formats, and it’s reasonably fast. Enough said.

It also introduces a new file format. The “7z” format compressed some of my stuff about 80% more than Zip. It also compressed better than CAB or RAR. You can do people a favor and make your 7z files self-extracting, so they don’t have to download yet another archiver (my big beef with RAR).
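For a rough feel of how an LZMA-family compressor compares with the Deflate method used in Zip files, here is a small Python sketch using only the standard library. One loud caveat: Python's lzma module writes the .xz container, not .7z, so this is a comparison of related algorithms, not of the 7z format itself. The function name compare_deflate_lzma is hypothetical, and the numbers depend entirely on your data.

```python
import lzma
import zlib


def compare_deflate_lzma(data):
    """Compress the same bytes with Deflate (zlib) and with LZMA (xz
    container) and return the compressed sizes in bytes."""
    deflated = zlib.compress(data, 9)
    xz = lzma.compress(data, preset=9)
    # Both results must decompress back to the original bytes.
    if zlib.decompress(deflated) != data or lzma.decompress(xz) != data:
        raise ValueError("round trip failed")
    return {"deflate": len(deflated), "lzma": len(xz)}


if __name__ == "__main__":
    # Synthetic, log-like sample data; real files will behave differently.
    sample = b"".join(b"line %d of a log file\n" % i for i in range(5000))
    print("original:", len(sample), "bytes")
    for method, size in compare_deflate_lzma(sample).items():
        print(method, "->", size, "bytes")
```

Try it on a few of your own files before drawing conclusions. Already-compressed data such as JPEGs won't shrink much with either method.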

It’s not only free, it’s better. Go get it.

And while we’re on the topic of Zip utilities, I would be remiss to not mention Ken Silverman’s excellent Zip tools. If you’re not afraid of the command line, they are a must-have.
