Adventure in SCSI

Gatermann called me last night. He’d gotten his new Adaptec 19160, but Windows 2000 wouldn’t recognize it. Unfortunately, he’d reformatted his main drive too, so there was no going back and cheating by installing both his old and new SCSI cards side by side, then installing the driver, then pulling the old card and moving the drive.

We tried a couple of things over the phone. No go. I suggested he try installing Linux just to make sure the card was good.

By the time I arrived, he had a working Red Hat 7.2 configuration. Put that in your pipe and smoke it, Microsoft.

We downloaded the latest Adaptec PCI drivers, using his other Linux box. Windows 2000 didn’t like them. We downloaded the previous version. That one, unlike the other, had a dedicated Adaptec 19160 driver. We installed that and it actually worked.

Forty minutes later, we had a working all-SCSI Windows configuration.

I like Linux more and more every day.

How to build a reliable PC.

We touched on the topic of reliability last week. I figure I might as well give a more thorough discussion of what makes a PC reliable.
1. Power supply. I see more power supply failures than any other single component. Good power supplies fail without a whimper and don’t damage the rest of your equipment. Bad power supplies take other stuff with ’em when they die. Antec and Sparkle are examples of good basic power supplies. The power supplies that come in InWin and other brand-name cases tend to be fine as well. A notch above that is Enermax, maker of the ultimate in show-off power supplies, with plated finger guards and odd colors. Top-tier is PC Power and Cooling. If I wanted to build a computer and have absolute assurance it would still work in five years, I’d start with a PCP&C or at the very least, an Enermax.

Buy more wattage than you think you need. The power supply will run cooler and last longer if you do. Besides, you never know what you’ll want to stick in the case down the road.

2. Memory. Last time I checked, you could get 64-meg PC133 sticks for under $5. I wouldn’t trust ’em with my archenemy’s work though. Cheap memory may be untested, the PCB may not be a good design, or even worse, it may have chips that were tested and deemed unsuitable for use in PCs (but fine in other less-demanding devices). Unscrupulous makers sometimes buy up these chips and take their chances. It may seem foolhardy to pay $100 for a 256-meg stick from Crucial, but I haven’t just heard horror stories about commodity memory. I’ve seen it with my own eyes. I’ve had more than 1,000 brand-name modules cross my desk. Three were defective. I’ve had fewer than 50 commodity modules cross my desk. More than half proved defective. Some wouldn’t even work–the system would just beep at you. The worse ones appeared to work for a while, but the system was always crashing. Don’t take chances on your memory. I tend to buy my memory over-spec as well. Even if a motherboard takes PC100 memory, I go ahead and buy PC133 CAS2 memory. The chips will run just fine at a lower speed, so I have an overengineered system for a while, and if I ever upgrade I’m more likely to be able to take the memory with me.

3. Motherboards. Buy brand-name boards. I’ve never had an Asus board fail. (Watch one fail next week now that I’ve said that. But I’m happy with the reliability and longevity of Asus boards.) I’ve done well with other brands too, like AOpen, Abit, FIC, and Tyan. I know MSI boards are popular but I don’t have any personal experience with them. Asus has impressed me with their farsighted engineering–in my experience, you’re more likely to be able to upgrade an Asus board in three or four years than others.

Most people know to check the hardware enthusiast sites when researching a board. I urge you to also check the Usenet newsgroups. You’ll find some good advice. Finding very little on a board can be a good sign too: it’s an indication that a board doesn’t have many problems. Years ago, I was researching the Asus SP97V motherboard, because it was dirt cheap, but it was an Asus. I searched on Usenet and found very little about it–maybe a half-dozen messages. Most of it was just idle chatter. One message discussing various boards included the offhand comment, “The SP97V is a good board for the money, BTW. I’ve used three of them.” That clinched it. Nobody was talking bad about the thing. I had one positive, and very little talk overall, which generally indicates satisfaction. Satisfied people rarely talk about stuff unless its quality blows them away.

4. CPU fans. Never go cheap on CPU fans. Dan’s Data has a humongous roundup of currently available fans. Get a heavy-duty fan, even if you don’t overclock. Remember, the CPU you’re protecting is a lot more valuable than the fan. A good fan will keep your CPU well within its specified operating temperature range, and I’d like to think that the pricier fans will have a longer life. Get a ball-bearing fan rather than a sleeve-bearing fan; a cheap sleeve-bearing fan is quieter but it’s also likely to conk out on you in a couple of years if you leave your systems on 24/7.

Bookmark that site, by the way. Dan’s one of the better technology writers out there today, and he doesn’t take himself too seriously. He’s an entertaining read, explains things well, knows what he’s doing (and he’s pretty open about his methodology), and he’s probably a certifiable genius, but he’s not pretentious. In fact, he seems to enjoy making people think he’s not quite sane. I make sure I pay that site a visit at least twice a week.

5. Case fans. It’s a good idea to put a supplemental fan in the machine. Two is usually overkill unless you’ve got some really hot hard drives, and it’ll make your computer louder. You can quiet them by manipulating the voltage. Dan’s Data talks a lot about them too, including how to slow them down. For typical users, a simple ball-bearing case fan is sufficient.

6. Hard drives. IBM currently recommends you not run their drives more than 8 hours a day. So that eliminates IBM from the running. That’s a shame, because they used to make spectacular drives. (I still like their laptop drives better than any others I’ve seen though, and I’m not the only one.) I’ve seen fewer dead Quantum and Maxtor drives than any other brand, although Samsung really has surprised me with their reliability, and the drives are cheap. Seagate has a good reputation but I have very limited experience with their recent drives. Maxtor’s a safe choice at the mid range and high end, while Samsung is tough to beat for the low end.

7. Cabling. The cables that come with brand-name PC motherboards seem to be of good quality, as are the cables I’ve seen bundled in Maxtor retail kits. If an IDE cable looks flimsy, don’t buy it. Problematic cables slow you down due to the need to retransmit data. Also, never buy an IDE cable that’s longer than 18 inches. Longer cables are available, but IDE specs state 18 inches as the maximum. Longer cables may work, but it’s questionable. If you have to reach the top bays in a tall tower case, you’ll have to go SCSI. Sorry.

Rounded cables will improve airflow, but be careful. Rounding shortens cables, so the wires inside a long rounded cable are even longer than stated. Rounding is a relatively new practice on the desktop, but I saw rounded SCSI cables in IBM servers and workstations as long ago as 1995.

Linux Performance Tuning

I found a very superficial Linux Journal article on performance tuning linked from LinuxToday this week. I read the article because I’m a performance junkie and I hoped to maybe find something I hadn’t heard before.

The article recommended a kernel recompile, which many people don’t consider critical anymore. It’s still something I do, especially on laptops, since a kernel tuned to a machine’s particular hardware boots up faster–often much faster. The memory you save by compiling your own kernel isn’t huge, and it mattered much more back when a typical computer had 8 MB of RAM. But since Linux’s memory management is good, I like to give it as much to work with as possible. Plus, I’m of the belief that a simple system is a more secure system. The probability of a remote root exploit through the parallel port driver is so low as to be laughable, but when my boss’ boss’ boss walks into my cube and asks me if I’ve closed all possible doors that are practical to close, I want to be able to look him in the eye and say yes.

The same goes for virtual consoles. If a system runs X most of the time, it doesn’t need more than about three consoles. A server needs at most three consoles, since the only time the sysadmin will be sitting at the keyboard is likely to be during setup. The memory savings isn’t always substantial, depending on what version of getty the system is running. But since Linux manages available memory well, why not give it everything you can to work with?
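On a SysV-style system, trimming virtual consoles usually means commenting out getty lines in /etc/inittab. Here is a minimal Python sketch of that edit. It is an illustration only: the sample entries and the helper name limit_gettys are assumptions rather than anything from the article, and the exact getty command differs between distributions, so check the real file before changing it.

```python
def limit_gettys(inittab_text, keep=3):
    """Comment out getty entries beyond the first `keep` (hypothetical helper)."""
    result = []
    seen = 0
    for line in inittab_text.splitlines():
        # inittab entries have the form id:runlevels:action:process
        fields = line.split(":", 3)
        is_getty = (
            not line.lstrip().startswith("#")
            and len(fields) == 4
            and "getty" in fields[3]
        )
        if is_getty:
            seen += 1
            if seen > keep:
                line = "#" + line  # disable this console
        result.append(line)
    return "\n".join(result) + "\n"


# Illustrative sample only; real files differ between distributions.
SAMPLE_INITTAB = """\
id:2:initdefault:
1:2345:respawn:/sbin/getty 38400 tty1
2:23:respawn:/sbin/getty 38400 tty2
3:23:respawn:/sbin/getty 38400 tty3
4:23:respawn:/sbin/getty 38400 tty4
5:23:respawn:/sbin/getty 38400 tty5
6:23:respawn:/sbin/getty 38400 tty6
"""

if __name__ == "__main__":
    print(limit_gettys(SAMPLE_INITTAB, keep=3), end="")
```

After changing the real file, init has to reread it (telinit q on most SysV systems) before the extra consoles go away.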

The best advice the article gave was to look at alternative window managers besides the ubiquitous KDE and Gnome. I’ve found the best thing I’ve ever done from a performance standpoint was to switch to IceWM. KDE and Gnome binaries will still run as long as the libraries are present. But since KDE and Gnome seem to suffer from the same feature bloat that has turned Windows XP and Mac OS X into slow pigs, using another window manager speeds things along nicely, even on high-powered machines.

I take issue with one piece of advice in the article. Partitioning, when done well, reduces fragmentation, improves reliability, and allows you to tune each filesystem for its specific needs. For example, if you have a separate partition for /usr or /bin, which hold executable files, large block sizes (the equivalent of cluster sizes in Windows) will improve performance. But for /home, you’ll want small block sizes for efficiency.

The problem is that kernel I/O is done sequentially. If a task requires reading from /usr, then /home, then back to /usr, the disk will move around a lot. A SCSI disk will reorder the requests and execute them in optimal order, but an IDE disk will not, so partitioning IDE disks can actually slow things down. With an IDE disk, I generally make the first partition a small /boot partition so I’m guaranteed not to have BIOS issues with booting. This partition can be as small as 5 megs since it only has to hold a kernel and configuration files. I usually make it 20 so I can hold several kernels. I can pay for 20 megs of disk space these days with the change under my couch cushions. Next, I’ll make a swap partition. Size varies; Linus Torvalds himself uses a gig. For people who don’t spend the bulk of their time in software development, 256-512 megs should be plenty. Then I make one big root partition out of the rest.
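The single-IDE-disk layout above comes down to simple subtraction. A minimal Python sketch, using the 20-meg /boot and the upper 512-meg swap figure from the text as defaults; the function name and the 40,000 MB example disk are hypothetical.

```python
def plan_ide_layout(disk_mb, boot_mb=20, swap_mb=512):
    """Return (mount point, size in MB) tuples: /boot, swap, then one big root."""
    if disk_mb <= boot_mb + swap_mb:
        raise ValueError("disk is too small for this layout")
    root_mb = disk_mb - boot_mb - swap_mb  # everything left over goes to /
    return [("/boot", boot_mb), ("swap", swap_mb), ("/", root_mb)]


if __name__ == "__main__":
    # 40,000 MB is an arbitrary example size, not a recommendation.
    for mount, size in plan_ide_layout(40_000):
        print(f"{mount:<6}{size:>8} MB")
```

Keeping /boot first on the disk matches the BIOS concern mentioned above; the order of the returned list is the order in which the partitions would be created.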

With a multi-drive system, /home should be on a separate disk from the rest. That way, if a drive fails, you’ve halved your recovery time because you’ll either only have to install the OS on a replacement drive, or restore your data from backups on a replacement drive. Ideally, swap should be on a separate disk from the binaries (it can be on the same disk as /home unless you deal with huge data files). The reason should be obvious: If the system is going to use swap, it will probably be while it’s loading binaries.

Still, I’m very glad I read this article. Buried in the comments for this article, I found a gem of a link I’ve never seen referenced anywhere else before: Linux Performance Tuning. This site attempts to gather all the important information about tuning Linux to specific tasks. The pros know a lot of this stuff, but this is the first time I’ve seen this much information gathered in one place. If you build Linux servers, bookmark that page. You’ll find yourself referring back to it frequently. Contributors to the site include kernel hackers Rik van Riel and Dave Jones.

The penguins are coming!

The penguins are coming! Word came down from the corner office (the really big corner office) that he wants us to get really serious about Linux. He sees Linux as a cheap and reliable solution to some of the problems some outside clients are having. This is good. Really good.

My boss asked if it would be a capable answer to our needs, namely, for ISP-style e-mail and for Web caching. But of course. Then he asked if I was interested in pursuing it. Now that’s a silly question.

Now it could be that FreeBSD would be even better, but I know Linux. I don’t know FreeBSD all that well. I’ve installed it once and I was able to find my way around it, but I can fix Linux much more quickly. The two of us who are likely to be asked to administer this stuff both have much more Linux experience than we have BSD experience. Plus you can buy Linux support; I don’t know if you can buy FreeBSD support. I doubt we will, but in my experience, clients want to know (or at least think) that some big company is standing behind us. They’re more comfortable if we can buy support from IBM.

So maybe my days of Linux being a skunkworks project are over. The skunkworks Linux boxes were really cleverly disguised too–they were Macintoshes. They’re still useful for something I’m sure. I expect I’ll draft one of them for proof-of-concept duty, which will save us from having to pull a Compaq server from other duty.

I spent a good portion of the day installing Debian 3.0 on an old Micron Trek 2 laptop. It’s a Pentium II-300 with 64 megs of RAM. It boots fast, but current pigware apps tend to chew up the available memory pretty fast. I recompiled the kernel for the hardware actually in the machine and it helped some. It’s definitely useful for learning Linux, which is its intended use.

I’ve noticed a lot of people interested in Linux lately. One of our NT admins has been browsing my bookshelf, asking about books, and he borrowed one the other day. Our other NT admin wants to borrow it when he’s done with it. The Trek 2 I installed today is for our senior VMS admin, who wants a machine to learn with. My boss, who’s been experimenting with Linux for a couple of years, has been pushing it aggressively of late.

I don’t know if this situation is unique, but it means something.

I spent a good part of the evening at the batting cages. I messed my timing up something fierce. I hit the first few pitches to the opposite field, some of them weakly, but soon I was hitting everything–and I mean everything–to the third-base side. So my bat speed came back pretty fast, and I was getting way out in front of a lot of the pitches. So I started waiting on the ball longer, hoping to start hitting the ball where it’s pitched. The end result was missing about a quarter of the time, slashing it foul to the third-base side a quarter of the time, hitting it weakly where it was pitched a quarter of the time, and hitting it solidly where it was pitched a quarter of the time. Good thing the season doesn’t start until June–I’ve got some work to do.

Afterward, I drove to my old high school, hoping to be able to run a lap or two around the track. I was hoping for two; realistically I knew I’d probably be doing well to manage one. There was something going on there, and I couldn’t tell if the track was in use or not, so I kept driving. Eventually I ended up at a park near my apartment. I parked my car, found a bit of straightaway, and ran back and forth until I was winded. It didn’t take long.

I can still run about as fast as I could when I was a teenager, but my endurance is gone. I’m hoping I can pick that back up a little bit. I was a catcher last season, filling in occasionally at first base and in left field. In the league I play in, we usually play girls at second and third base, and we’ve got a couple of guys who can really play shortstop, so I’ll probably never play short. When I was young I played mostly left field and second. I’d like to roam left field again. Not that I mind catching, but there’s a certain nostalgia about going back to my old position.

Repairing corrupt IE installations

After church last night I looked at one of the computers in the office. Internet Explorer wouldn’t run on it. Or, specifically, the Internet Connection Wizard wouldn’t run, so IE wouldn’t run. Unnecessary parts strike again. Whatever happened to Web browsers that just used whatever TCP/IP connection was available? Oh yeah. If you do that, then you don’t get a chance to bother people about signing up for an MSN account. I forgot. Of course, no one actually does that. They call their favorite computer expert and ask why Internet Explorer doesn’t run. Sometimes you can make things so easy you make them hard. And the grizzled veteran always either clicks through it, or if they don’t have an account, signs them up for Earthlink out of spite.

But, as always, I digress.

Anyway, I ran SFC.exe. It found changed/corrupted files, but it didn’t make a difference. I went to Add/Remove Programs, picked Internet Explorer, and told it to repair IE. That didn’t help either. I reinstalled IE. That didn’t help either–some DLLs related to the ICW failed to register.

At times I’ve used IEradicator to remove a severely corrupted IE, then I’ve reinstalled it. So I downloaded IEradicator, ran it, rebooted, let it do its thing, and was surprised to see IE on the desktop when it returned. I clicked on the IE icon. It worked perfectly.

I guess at some point someone upgraded IE, but the new version clashed with the old one. IEradicator removed that top layer, but not what was lying beneath it. I’ve seen some strange things before, but this ranks right up there.

When faced with severe IE problems, the best tools you can have in your arsenal are IEradicator, to remove the old version whether Control Panel will let you remove it or not, and OffByOne, a tiny standalone Web browser that fits on a single floppy, which you can use to go download the current version of a more complete browser. Download them now and squirrel a few copies away someplace safe, before you need them.

The behavior I saw last night was completely unexpected, and completely unlike anything I’ve ever seen before. But I didn’t complain. At least it works now.

Microsoft’s temper tantrum

Microsoft is throwing a temper tantrum, claiming that if the states’ current proposal goes through, the company will be forced to withdraw Windows from the market.

Pay no attention, move along, there’s nothing to see here.

Remember, this is the company that didn’t sign an agreement with IBM for a Windows 95 OEM license until the day it was launched. At one point during the negotiations, Microsoft told IBM it could buy it at retail. As hard as it might be to remember now, at the time, IBM was still one of the top 5 players in the U.S. PC retail market.

This is a company that plays hardball. It says unreasonable things to get its way. And it’s used to getting its way. And even when it doesn’t get its way, it still says stupid things. Remember, in 1994 Steve Ballmer said a court’s decision against Microsoft in Stac’s favor would be reversed as soon as they found a judge with actual brains.

Reality check: Microsoft can very easily comply with the states’ demands. Or reach a compromise that will benefit everybody. Once upon a time, long long ago, when you installed Windows, you could tell it what you wanted. If you didn’t have any use for Calculator, you could click a little checkbox next to it, Windows wouldn’t install it, and you’d save about 200K of disk space. Hey, back when people were trying to run Windows on 40-meg hard drives, it was nice to have that ability. Or, if you already had a third-party calculator app that put Microsoft’s to shame and thus had no need for the one that came with Windows, you didn’t have to install it.

The same was true of DriveSpace and all the other bundled stuff. I mean, let’s get serious here: Is there any reason whatsoever to install Space Cadet Pinball on your domain controller?

But with Windows 95, Microsoft started to get unreasonable. Yes, you could uncheck that little box next to MSN, but when you did it, Windows didn’t actually seem to do anything. Regardless of whether you checked that box, when Windows was finished, you had an MSN icon on your desktop. If AOL continued to exist, Microsoft’s very existence was threatened. In order for Microsoft to survive, AOL had to die. So you got MSN whether you used it or not. (Some idiot with a journalism degree figured out how to remove it a couple of years later.)

With Windows 95B, things got more sinister. Netscape replaced AOL as the imminent threat to Microsoft’s very survival, so you got Internet Explorer whether you wanted it or not. This time, Microsoft didn’t even bother putting in a checkbox for Windows to ignore. You just got it. With Windows 95 OSR2.1 and 98, Internet Explorer became increasingly entrenched.

Once it was evident that AOL would never die and Netscape would never rise again, RealPlayer and QuickTime became threats to Microsoft’s existence. So, with Windows 98, we got Microsoft Media Player, whether we wanted it or not. Never mind that the basic Real and QuickTime players are free, that both companies would have loved for Microsoft to deliver them with Windows, and that bundling them would have saved Microsoft development costs.

Microsoft could go a long, long way towards appeasing the states if they’d just put in little checkboxes that let you decide whether Internet Explorer or MediaPlayer was installed, just like Calculator. There’s no need for 8,000 different versions of Windows, like Steve “The Embalmer” Ballmer wants people to believe. Let the consumer decide what pieces he or she wants. Does a deaf person need MediaPlayer? It’s questionable. Does a file server really need Internet Explorer? Absolutely not.

And while there are magazines and book authors who want you to believe otherwise, thousands of people have removed Internet Explorer from Windows. And guess what? The sun didn’t quit rising. The world failed to fall apart. The stock market didn’t crash. Their computers didn’t fall over. The applications they needed to run still ran. In fact, the applications ran better once they got the unnecessary machinery gone. Imagine that, a basic engineering principle applying to computers!

Microsoft execs have complained about a double standard, because Apple, IBM, and Be all shipped Web browsers with their OSs. Of course, there was a big difference. In the case of MacOS, BeOS, and OS/2, you could tell the OS not to install the browser, and it didn’t do it. The same for their other components. In the case of OS/2, you could even remove the entire Windows subsystem. You lost the ability to run Windows 3.1 programs, but you gained speed and stability. I knew people who did that. I’ve done minimalist Mac OS installations that took up less than 20 megs and were completely useless because they lacked the drivers needed to install other software. But if I want to be stupid enough to install a completely crippled OS that can’t do anything besides boot a computer and let me look at its empty hard drive, Apple’s not going to stop me.

The overwhelming majority of people will just leave things alone. But the people who like to get into the nuts and bolts of things want (and deserve) the opportunity to change how their computers work. They want Microsoft to fight its battles in the marketplace, not in the memory and CPUs of their computers. I don’t blame them in the least. Of course, I’m spoiled. IBM and Commodore let me have it my way, back when I was buying my operating systems from them.

So, Microsoft has a history of threats, and a history of following through with them, even when the reasoning behind them is totally ludicrous. But in the case of IBM, they ultimately budged, albeit 45 minutes into the 11th hour, and they didn’t budge much. But you don’t just shut out the #3 or #4 PC maker in the country. At the time, Microsoft still needed IBM, and IBM needed Microsoft, as much as both companies hated to admit it.

This is no different. Microsoft can’t just pull Windows off the market. Windows is still its main source of revenue, and Windows runs on more than 90 percent of the computers on the market. Microsoft isn’t going to just give that away. Sure, they make some Mac products, but the Mac is 5 percent of the market on a good day. The cheapest and easiest replacement for Windows, in the unlikely event Microsoft pulled out, is Linux, where Microsoft is a non-player. Microsoft could still sell Windows software to the existing installed base. But it’s ludicrous. Pulling Windows off the market is corporate suicide.

I really don’t think Microsoft would have made IBM buy its copies of Windows 95 at retail. Not everyone remembers it now, but there was some resistance to Windows 95 initially, and a company the size of IBM not shipping Windows 95 on its new computers would have given way too much credence to the naysayers. Microsoft was counting on Windows 95 being big, and it wasn’t going to take any chances. It had spent way too much money on research, development, and hype. Microsoft made that threat to see just how far IBM would go. And that’s what Microsoft is doing now. It’s trying to see how much the states are going to budge.

And that’s all there is to Ballmer’s rhetoric. Nothing more. And nothing less.

Encryption on the cheap

Disspam cruises along. It’s not often that I gush about a program, let alone a 4.5K Perl script, but Disspam continues to make my life easier. Granted, it simply takes advantage of existing network resources, but they’re resources that were previously (to my knowledge) limited to the mail administrator. Literally half my e-mail at home today was spam. Disspam caught every last piece.

A little scripting of my own. I’ve got a client at work who wants absolute privacy guaranteed. He and his assistant have some files they don’t want anyone else to be able to read, period. Well, there’s no way to guarantee that under NT, Unix, or VMS. Under NT, we can take away anyone else’s rights to read the file, but an administrator can give himself rights to read the file once again. We can make it set off all kinds of sirens if he does it, but that security isn’t good enough.

Well, the only way we can guarantee what they want is with encryption. But we’re nervous about making files that one and only one person can read, because last year, one of our executives went on vacation in Florida, fell ill, and died. We don’t want to be in a situation where critical information that a successor would need can’t be unlocked under any circumstance. So we need to encrypt in such a fashion that two people can unlock it, but only two. So the client’s backup is his assistant, and the assistant’s backup is the client. That way, if something ever happens to one of them, the other can unlock the files.

Password-protected Zip files are inadequate, because any computer manufactured within the past couple of years is more than fast enough to break the password through brute force in minutes, if not seconds. The same goes for password-protected Word and Excel documents. Windows 2000’s encryption makes it painfully easy to lock yourself out of your own files.
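To put rough numbers on the brute-force claim, here is a small worked calculation in Python. It only counts candidate passwords and divides by a guess rate. The rate of one million guesses per second is purely an assumption for illustration, not a measurement of any particular cracking tool or machine, and the function names are hypothetical.

```python
def keyspace(alphabet_size, max_length):
    """Number of candidate passwords of length 1 through max_length."""
    return sum(alphabet_size ** n for n in range(1, max_length + 1))


def worst_case_seconds(alphabet_size, max_length, guesses_per_second):
    """Time needed to try every candidate at a fixed guess rate."""
    return keyspace(alphabet_size, max_length) / guesses_per_second


if __name__ == "__main__":
    ASSUMED_RATE = 1_000_000  # guesses per second; an assumption, not a benchmark
    for length in (6, 8, 10):
        seconds = worst_case_seconds(26, length, ASSUMED_RATE)
        print(f"lowercase, up to {length} chars: {seconds / 60:,.1f} minutes")
```

Under that assumed rate, every lowercase password of six letters or fewer can be tried in a little over five minutes, and each added character multiplies the work by 26.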

So I spent some time this afternoon trying to perfect a batch file that’ll take a directory, Zip it up with Info-Zip, then encrypt it with GnuPG. I chose those two programs because they’re platform-independent and open source, so there’s likely to always be some kind of support available for them, and this way we’re not subject to the whims of companies like NAI and PKWare. We’d be willing to pay for this capability, but this combination plus a little skullwork on my part is a better solution. For one, the results are compressed and encrypted, which commercial solutions usually aren’t. Since they may sometimes transfer the encrypted package over a dialup connection, the compression is important.

Plus, it’s really nice to not have to bother with procurement and license tracking. If 40 people decide they want this, we can just give it to them.

The biggest problem I ran into was that not all of the tools I had to use interpreted long filenames properly. Life would have been much easier if Windows 2000 had move and deltree commands as well. Essentially, here’s the algorithm I came up with:

Encrypt:
Zip up Private Documents subdirectory on user’s desktop
Encrypt resulting Zip file, dump file into My Documents
Back up My Documents to a network share

Decrypt and Restore:
Decrypt Zip file
Unzip file to C:\Temp (I couldn’t get Unzip to go to %temp% properly)
Move files into Restored subdirectory on user’s desktop
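
The steps above can be sketched in Python as a pair of functions that build the Info-ZIP and GnuPG command lines. This is not the batch file described here, only an illustration under stated assumptions: the zip, unzip, and gpg programs are on the PATH, both people's public keys are already in the keyring, and every path and recipient address shown is hypothetical. It leaves out the network backup step and unzips straight into the restore folder rather than going through C:\Temp.

```python
import subprocess


def encrypt_commands(source_dir, zip_path, gpg_path, recipients):
    """Commands to zip a folder, then encrypt the archive to every recipient."""
    zip_cmd = ["zip", "-r", str(zip_path), str(source_dir)]
    gpg_cmd = ["gpg", "--output", str(gpg_path)]
    for recipient in recipients:
        # Any one of the listed recipients can decrypt the result.
        gpg_cmd += ["--recipient", recipient]
    gpg_cmd += ["--encrypt", str(zip_path)]
    return [zip_cmd, gpg_cmd]


def decrypt_commands(gpg_path, zip_path, restore_dir):
    """Commands to decrypt the archive, then unzip it into restore_dir."""
    return [
        ["gpg", "--output", str(zip_path), "--decrypt", str(gpg_path)],
        ["unzip", str(zip_path), "-d", str(restore_dir)],
    ]


def run_all(commands, dry_run=True):
    """Print the commands, or execute them when dry_run is False."""
    for command in commands:
        if dry_run:
            print(" ".join(command))
        else:
            subprocess.run(command, check=True)


if __name__ == "__main__":
    # Hypothetical names and addresses, shown as a dry run only.
    run_all(encrypt_commands(
        "Private Documents", "private.zip", "private.zip.gpg",
        ["client@example.com", "assistant@example.com"],
    ))
    run_all(decrypt_commands("private.zip.gpg", "private.zip", "Restored"))
```

Naming both people as recipients is what lets either one unlock the file. Passing each argument as a separate list item also means a folder name with spaces, like Private Documents, needs no extra quoting when the commands are actually run.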

I’m not presenting the batch files here yet because I’m not completely certain they work the right way every time.

They don’t quite have absolute security with this setup, but that’s where NTFS encryption comes in. If these guys are going to run this script every night to back the documents up, it’s no problem if they accidentally lock themselves out of those files. If their laptops get stolen, all local copies of the documents are encrypted so the thief won’t be able to read them. And the other user will be able to decrypt the copy stored on the server or on a backup tape. Or, I can be really slick and copy their GPG keys up onto the same network drive.

This job would be much easier with Linux and shell scripts–the language is far less clunky, and file naming is far less kludgy–but I have to make do. I guess in a pinch I could install the NT version of bash and the GNU utilities to give myself a Unixish environment to run the job, but that’s a lot more junk to install for a single purpose. That goes against my anti-bloat philosophy. I don’t believe in planning obsolescence. Besides, doing that would severely limit who could support this, and I don’t have to try to plant job security. I always get suspicious when people do things like that.

Stopping spam.

Forget what I wrote yesterday. I was going to post the stuff I wrote in Ohio when I realized it isn’t all that good, it’s definitely not useful, and the people who annoy me the most are the people who can’t get over themselves. No one cares what I ate for breakfast, and the only people who care what went on in Ohio already know.

So here’s something useful instead. It’s the coolest thing I’ve found all year. Maybe all decade, for that matter.

Spam begone. I hate spam. It wastes my time and my bandwidth and, ultimately, my money. I’ve seen some estimates that spam costs ISPs as much as $5 per month per account. You’d better believe they’re passing those losses on to you.

There are tons and tons of anti-spam solutions out there, but most of them run on the mailserver side, so for an end-user to use them, they have to set up a mail server and either use it for mail or run fetchmail to pull the mail in from their ISP’s mail servers. I’ve done that, but it’s convoluted. And that’s trivial compared to setting up the anti-spam kits.

I was cruising along, vaguely happy, when my local mailserver developed bad sectors on its hard drive. One day when I went to read my mail, I heard clunking noises. I turned around, flipped on the power switch to the server’s attached monitor, and saw read errors. Hmm. I hope that mail wasn’t important…

Eventually I shut down my mail server and put up with the spam, hoping I’d come up with a better idea.

I found it in a Perl script called disspam.pl, written by Mina Naguib.

It took a little doing to get it running in Debian. Theoretically it’ll run on any OS that has Perl installed. Here’s what I did in Debian:

su (to become root)
apt-get install libnet-perl (Perl couldn’t see the network without this, so the next command in this sequence was failing. This hopefully isn’t necessary on other distros, as I have no idea what the equivalent would be.)
perl -MCPAN -e shell (as per readme–I accepted the defaults, then when it asked for CPAN servers, I told it my continent and country. Then it gave me 48 choices. I picked a handful at random, since none were any more obviously close to me than others.)
install Net::POP3 (as per readme)
quit
cp sample.conf disspam.conf
chmod 755 disspam.pl
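
Before going any further, it can be worth confirming that Perl can actually load the module the readme asked for. This is just a minimal sketch; the function name perl_has_module is my own invention, not something that ships with disspam:

```shell
# Hypothetical helper: succeeds if Perl can load the named module.
# "-e 1" runs a do-nothing program, so the only work done is loading the module.
perl_has_module() {
    perl -M"$1" -e 1 2>/dev/null
}

if perl_has_module Net::POP3; then
    echo "Net::POP3 is available"
else
    echo "Net::POP3 is missing; revisit the CPAN step"
fi
```

If it reports the module as missing, the CPAN install above probably didn’t finish cleanly.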

Next, I loaded up disspam.conf into a text editor. It looks just like a Windows-ish INI file.

The second line gives me an exclude list. It’ll take names and e-mail addresses. So I put in a few important names that could possibly be blocked (friends with AOL and Hotmail addresses). That way if their ISPs ever misbehave and get blacklisted, their mail will still get to me. Then I popped down to the end of the file and configured my POP3 mailbox. I had an account I hadn’t read in a week, so I figured I’d get a good test. Just drop in your username, password, and POP3 server like you would for your e-mail client. If you have more than one account, copy and paste the section.

Bada bing, bada boom. You’re set. Run disspam.pl and watch. In my case, it flagged and deleted about a dozen messages, typical of what I usually get, like mail offering me Viagra or access to horny cheerleaders or how to find out anything about anyone (which I already know–I have a journalism degree). The only questionable thing it flagged was mail from MLB.com. I can’t get off their mailing list ever since I voted online for the All-Star game. No importa, I never read that mail anyway. I could have always added MLB.com to my exclude list if what they had to say mattered to me.

But if you’re like me and get lots of mail–that was my less-busy account–and about half of it is spam, that stuff’s going to scroll by really fast. So here’s what I recommend doing: when you execute disspam.pl, use the following command line:

~/disspam/disspam.pl ~/disspam/disspam.conf >> ~/disspam/disspam.log

Then you can examine disspam.log. If disspam ever deletes something it shouldn’t have, you can add the person to your exclude list and e-mail them to ask what they wanted. It looks to be less work than deleting all that spam. Probably less embarrassing too. Have you ever accidentally opened one of those horny cheerleader e-mail messages when there were people around? Yikes!
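
If the log gets long, grep saves you from reading all of it. Here’s a small sketch for checking whether a particular name or domain shows up. It assumes nothing about the log’s format beyond it being plain text, and log_mentions is a name I made up:

```shell
# Hypothetical helper: print the log lines that mention a name or address.
# -i ignores case; -F treats the search term as a literal string, not a regex.
# The second argument is optional and defaults to the log path used above.
log_mentions() {
    grep -i -F -- "$1" "${2:-$HOME/disspam/disspam.log}"
}
```

So log_mentions mlb.com would show me whatever disspam logged about my baseball mail, and log_mentions aol.com would tell me whether any of my AOL friends got caught.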

I fired up Ximian Evolution, pulled down my mail, and had 15 new messages. No spam. None. Sweet bliss.

It’s just version 0.05 and the author considers it beta, but I love it already.

Unix’s power allows you to string simple tools together to make powerful ones. Here are some suggestions.

You can e-mail the log to yourself with these commands:

mail -s disspam [your_address] < ~/disspam/disspam.log
rm ~/disspam/disspam.log
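
One thing to watch: if the mail command fails and the log gets deleted anyway, the record of what disspam threw away is gone for good. A slightly more cautious sketch follows. The function name and the MAILER variable are assumptions of mine, there only so the mail command can be swapped out:

```shell
# MAILER is an assumption of this sketch; it defaults to the standard mail command.
MAILER="${MAILER:-mail}"

# Hypothetical wrapper: mail the log, then delete it only if mailing succeeded.
mail_log() {
    address="$1"
    log="${2:-$HOME/disspam/disspam.log}"
    # -s is true only for a file that exists and is not empty,
    # so an empty or missing log is quietly skipped.
    [ -s "$log" ] || return 0
    "$MAILER" -s disspam "$address" < "$log" && rm -- "$log"
}
```

Call it as mail_log followed by your address. The && is what keeps the log around when the mail doesn’t go out.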

If you want the computer to do all the work for you, here’s the command sequence:

crontab -e

Then add these entries:

0 0 * * * mail -s disspam [your_address] < ~/disspam/disspam.log ; rm ~/disspam/disspam.log
0 * * * * ~/disspam/disspam.pl ~/disspam/disspam.conf >> ~/disspam/disspam.log

If you read your mail on the same machine that runs disspam, you can substitute your user account name for your e-mail address and save your ISP a little traffic.

You’ll have to provide explicit paths for disspam.pl and disspam.conf.

The first entry causes it to mail the log at midnight, then delete the original. The second entry filters your inbox(es) on the hour, every hour. To filter more frequently you can add more lines:


10 * * * * ~/disspam/disspam.pl ~/disspam/disspam.conf >> ~/disspam/disspam.log
20 * * * * ~/disspam/disspam.pl ~/disspam/disspam.conf >> ~/disspam/disspam.log
30 * * * * ~/disspam/disspam.pl ~/disspam/disspam.conf >> ~/disspam/disspam.log
40 * * * * ~/disspam/disspam.pl ~/disspam/disspam.conf >> ~/disspam/disspam.log
50 * * * * ~/disspam/disspam.pl ~/disspam/disspam.conf >> ~/disspam/disspam.log
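
For what it’s worth, the five leading fields in a crontab line are minute, hour, day of month, month, and day of week, in that order. Debian’s cron (Vixie cron) also understands step values, so */10 in the minute field means every ten minutes, and one line can do the work of six. Here’s a sketch that builds such a line and tacks it onto whatever crontab you already have. It assumes disspam lives in ~/disspam, and new_crontab is a name I made up:

```shell
# */10 in the minute field means "every 10 minutes".
# Single quotes keep $HOME literal here; cron's shell expands it when the job runs.
entry='*/10 * * * * $HOME/disspam/disspam.pl $HOME/disspam/disspam.conf >> $HOME/disspam/disspam.log'

# Hypothetical helper: print the current crontab (if there is one)
# followed by the new entry. Nothing is installed by this function itself.
new_crontab() {
    crontab -l 2>/dev/null || true
    printf '%s\n' "$entry"
}
```

Run new_crontab by itself to look at the result first. When it looks right, new_crontab | crontab - installs it.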

This program shouldn’t be necessary for very long. It’s short and simple (4.5K worth of Perl) so there’s no reason why mail clients shouldn’t start incorporating similar code. Until they do, you run the risk of disspam and your mail client getting out of sync and some spam coming through.

If you read your mail on a Linux box with an mbox-compliant client like Sylpheed or Balsa or Kmail, you can bring fetchmail into the equation to close that gap. Create a .fetchmailrc file in your home directory (name it ~/.fetchmailrc to ensure it goes to the right place). Here’s the format of .fetchmailrc:

poll SERVERNAME protocol PROTOCOL username NAME password PASSWORD

So here’s an example that would work for me:

poll mail.swbell.net protocol pop3 username dfarq password censored
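
Since that file holds a password in plain text, it’s worth making sure nobody else on the machine can read it. Fetchmail itself complains if the file’s permissions are too open. Here’s a sketch that writes a one-account file and locks it down; write_fetchmailrc is a name I made up, and it only handles the POP3 case shown above:

```shell
# Hypothetical helper: write a one-account .fetchmailrc readable only by its owner.
# Arguments: file path, server name, username, password.
write_fetchmailrc() {
    rcfile="$1"; server="$2"; user="$3"; pass="$4"
    (
        umask 077   # files created in this subshell get no group/other permissions
        printf 'poll %s protocol pop3 username %s password %s\n' \
            "$server" "$user" "$pass" > "$rcfile"
    )
    chmod 600 "$rcfile"   # tighten the file even if it already existed
}
```

In my case that would be write_fetchmailrc ~/.fetchmailrc mail.swbell.net dfarq censored. Keep in mind that typing a real password on the command line leaves it in your shell history, so editing the file by hand and running chmod 600 on it afterward is just as good.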

Next, set your mail client to no longer check for mail automatically, then type crontab -e and edit your disspam lines so they read like this:

20 * * * * ~/disspam/disspam.pl ~/disspam/disspam.conf >> ~/disspam/disspam.log ; fetchmail (your server name)

In case you’re interested, the semicolon tells Unix not to execute the second command until the first one is complete. If you have more than one mail account, add another fetchmail line.
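
The semicolon doesn’t care whether the first command worked, though. It just waits for it to finish. If you’d rather skip the second command when the first one fails, && does that. Here’s a quick demonstration, using false as a stand-in for a command that fails:

```shell
# ";" runs the second command no matter how the first one ended.
semi=$(sh -c 'false ; echo ran')

# "&&" runs the second command only if the first one exited with status 0.
# The "|| true" just keeps this demo from stopping on the failure.
and=$(sh -c 'false && echo ran' || true)

echo "with ';'  -> ${semi:-skipped}"   # prints: with ';'  -> ran
echo "with '&&' -> ${and:-skipped}"    # prints: with '&&' -> skipped
```

Whether && is the better choice in the crontab line depends on disspam.pl returning a nonzero exit status when something goes wrong, which I haven’t verified. With the semicolon you always get your mail, filtered or not.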

As an aside, Evolution seems to use the mbox file format but it doesn’t store its file where fetchmail will find it. I think you could symlink /var/spool/mail/yourusername to ~/evolution/local/Inbox/mbox and it would work. I haven’t tried that little trick yet.

But even if you’re not ambitious enough to make it run automatically and integrate with all that other stuff, it’s still a killer utility you can run manually. And for that matter, if you can get Perl running on NT or even on a Mac, this ought to run on them as well.

Check it out. It’ll save you time and aggravation. And since it only reads the headers to decide what’s spam and what’s not, it’ll save bandwidth and, ultimately, it’ll save your ISP a little cash. Not tons, but every little bit can help. You can’t expect them to pass their savings on to you, but they’ll certainly pass their increased expenses on to you. So you might as well do a little something to lower those expenses if you can. Sometimes goodwill comes back around.