Any Unix gurus care to help me with mod_rewrite?

I’ve watched my search engine traffic decrease steadily for the past few months since I changed blogging software. It seems most engines don’t care much for the super-long arguments this software passes in its URLs.

The solution is mod_rewrite, and I think my syntax looks correct, but it’s not working for me. The goal is to fake out search engines to make them think they’re looking at static files. Search engines are reluctant to index database-driven sites for fear of overloading the site. Since I can’t tell them not to worry about it, I have to make the site look like a static site.

To that end, I created a section at the end of my httpd.conf file:

# rewrites for GL

RewriteEngine on
RewriteRule ^/article/([0-9]+)$ /article.php?story=$1 [NC,L]

This line should make the software respond to Thursday’s entry (https://dfarq.homeip.net/article.php?story=20040902200759738) if it’s addressed as https://dfarq.homeip.net/article/20040902200759738.
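Before touching Apache again, it can help to check the pattern by itself. Here is a minimal sketch that mimics the substitution with sed. The function name rewrite_article is made up for illustration, and the story parameter is taken from the example URL above, so treat it as an assumption about what article.php expects. It only exercises the regular expression, not Apache's configuration.

```shell
# Mimic the intended rewrite outside Apache (illustrative only).
# Assumes article.php reads the "story" query parameter, as in the example URL.
rewrite_article() {
  printf '%s\n' "$1" | sed -E 's#^/article/([0-9]+)$#/article.php?story=\1#'
}

rewrite_article /article/20040902200759738
```

A path that doesn't match the pattern comes back unchanged, which is a quick way to spot a regex that is too strict.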

Once mod_rewrite is working, in theory I can modify the software to generate its links using that format and watch the search engines take more of a liking to me again. But I’ve got to get mod_rewrite going first, and I’m stumped.

Any expert advice out there?

Thanks in advance.

Need to squeeze a little more on that floppy?

I’ve been experimenting again with bootdisks and the FreeDOS project came to mind.

Boot floppies are getting rarer but they’re still hard to avoid completely. I think FreeDOS is worth a look for a variety of reasons. Its system files take up half the space of Win9x’s DOS. That extra 100K on the disk can make the difference between your tools fitting on a floppy or not.

FreeDOS supports FAT32. There’s an unofficial DR-DOS fork that does as well, but the licensing terms of FreeDOS are a whole lot more clear.

The FreeDOS FORMAT.EXE can overformat disks. If you use more than 80 tracks, the disks have problems in some machines, but a 1.68 megabyte disk using extra sectors per track should be OK. Concerned about overformatting disks? The Amiga’s default high-density disk format was 1.76 megabytes. That extra 240K can make a big difference, especially when coupled with that 100K you’ve already saved. The syntax to make a bootable 1.68 meg disk: FORMAT A: /F:1680 /S

The syntax for a 1.74 meg disk: FORMAT A: /F:1743 /S
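For the curious, the arithmetic behind those sizes can be sketched in shell. The geometries are my assumption (the usual 21 sectors per track, on 80 or 83 tracks), not something FORMAT reports, and the function name is made up.

```shell
# bytes = tracks * sides * sectors per track * 512 bytes per sector
floppy_bytes() {
  echo $(( $1 * $2 * $3 * 512 ))
}

floppy_bytes 80 2 18   # standard 1.44 meg disk
floppy_bytes 80 2 21   # assumed geometry for /F:1680
floppy_bytes 83 2 21   # assumed geometry for /F:1743
```

Dividing each result by 1024 gives 1440, 1680 and 1743, which lines up with the /F: values.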

The FreeDOS command interpreter includes command history, so you don’t need to make space on the disk or in low memory for DOSKEY.

Using FreeDOS and its 1.68 meg floppy, I was able to squeeze Ghost 8.1 (a 1.3 meg monster) onto a boot floppy and still have 197,632 bytes free to play with. With that kind of space left, if need be, one could format the disk with FreeDOS, then SYS it under Win9x and run MS-DOS 7 on it.

If you still need to squeeze a little more space, get the freeware FDFormat, which can also format oversized floppies and lets you reduce the root directory down to 16 entries from the default 224, which gives you a few more kilobytes of usable space. If you need to put more than 16 files on the disk, create a subdirectory and put your files in the subdirectory. The syntax would be FDFORMAT /D16 /F168 /S. Substitute /F172 for a bigger disk. To increase the performance of the floppy (who doesn’t want the slowpoke floppy to be a bit faster?) add the /X:2 /Y:3 options. A boot disk formatted this way yields 1,595,904 free bytes with the FreeDOS boot files installed.
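How much does the smaller root directory buy you? Each FAT directory entry takes 32 bytes, so a rough estimate is easy to sketch. The function name is made up, and the figure ignores any rounding to whole sectors.

```shell
# Size of a FAT root directory with N entries (32 bytes per entry)
root_dir_bytes() {
  echo $(( $1 * 32 ))
}

saved=$(( $(root_dir_bytes 224) - $(root_dir_bytes 16) ))
echo "$saved"
```

That works out to 6,656 bytes, or 13 sectors of 512 bytes each.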

That’s enough space to be almost useful for something again. You’ll at least be able to fit more on Bart’s modular disks or Brad’s network boot disk.

FTE – a DOS-style editor for Linux

I don’t remember what I was looking for, but I found another DOS-style editor for Linux and Unix.

FTE is another editor that harkens back to the look of the typical DOS app of about 10 years ago, similar to SETEDIT. For casual editing, either program will do very nicely, and provide a look and feel comparable to the editor that came with DOS 5 and 6.

I’ve always liked SETEDIT, but it suffers from the same identity crisis as emacs. Is it an editor? An MP3 player? A desk calculator? All of the above? And while it’s workable over a remote terminal connection, it’s not as snappy as I’d like.

FTE is a little sluggish over a remote connection too, but it’s faster than SETEDIT. I like how it gives me the ability to have multiple files open and deal with large blocks of text, and continue to use the same key sequences I’ve known and been using since early high school. Its syntax highlighting is definitely a nice feature. It takes that feature a bit further than SETEDIT. For example, it highlights the corresponding closing bracket when you move over the opening one.

FTE’s main advantage is that it’s already bundled with some distributions. There’s a Debian project page for it. And a Google search turns up anecdotal evidence that it comes in recent versions of Suse, Red Hat and Mandrake as well. If you’re a DOS veteran who’s not enamored with vi or emacs, FTE’s probably worth a look.

For what it’s worth, I typically use nano, but FTE is definitely a whole lot more powerful.

Roll your own news aggregator in PHP

M.Kelley: I’m also wondering how hard would it be to pull a PHP/MySQL (or .Net like BH uses) tool to scrape the syndicated feeds off of websites and put together a dynamic, constantly updated website.
It’s almost trivial. So simple that I hesitate to even call it “programming.” And there’s no need for MySQL at all–it can be done with a tiny bit of PHP. Since it’s so simple, and potentially so useful, it’s a great first project in PHP.

It’s also terribly addictive–I quickly found myself assembling my favorite news sources and creating my own online newspaper. To a former newspaper editor (hey, they were student papers, but one of them was at Mizzou, and in my book, if you can be sued for libel and anyone will care, it counts), it’s great fun.

All you need is a little web space and a writable directory. If you administer your own Linux webserver, you’re golden. If you have a shell account on a Unix system somewhere, you’re golden.

First, grab ShowRDF.php by Ian Monroe, a simple GPL-licensed PHP script that does all the work of grabbing and decoding an RDF or RSS file. There are tons of tutorials online that tell you how to code your own solution to do this, but I like this one because you can pass options to it to limit the number of entries, and the length of time to cache the feed. Many RDF decoders fetch the file every time you call them, and some feeds impose a once-an-hour limit and yell at you (or just flat ban you) if you go over. Using existing code is a good way to get started; you can write your own decoder that works the way you want at some later date.

ShowRDF includes a PHP function called InsertRDF that uses the following syntax:
InsertRDF("feed URL", "name of file to cache to", TRUE, number of entries to show, number of seconds to cache feed);

Given that, here’s a simple PHP page that grabs my newsfeed:


<html><body>

<?php include("showrdf.php"); ?>

<?php

// Gimme 5 entries and update once an hour (3600 seconds)

InsertRDF("https://dfarq.homeip.net/b2rss.xml", "~/farquhar.cache", TRUE, 5, 3600);

?>

</body></html>

And that’s literally all there is to it. That’ll give you a very simple HTML page with a bulleted list of my five most recent entries. Unfortunately it gives you the entries in their entirety, but that’s b2’s fault, and my fault for not modifying it. I’ll be doing that soon.
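The cache timeout is the part worth understanding. How ShowRDF implements it isn't covered here, but the general idea can be sketched in a few lines of shell: only fetch the feed again when the cached copy is older than the limit. The function names are made up, and the stat calls assume either GNU or BSD stat is installed.

```shell
# Succeeds if the cache file exists and is younger than MAX_AGE seconds.
cache_is_fresh() {
  # $1 = cache file, $2 = max age in seconds
  [ -f "$1" ] || return 1
  now=$(date +%s)
  # GNU stat first, BSD stat as a fallback
  mtime=$(stat -c %Y "$1" 2>/dev/null || stat -f %m "$1")
  [ $(( now - mtime )) -lt "$2" ]
}

# Fetch the feed only when the cached copy is missing or stale.
refresh_feed() {
  # $1 = feed URL, $2 = cache file, $3 = max age in seconds
  cache_is_fresh "$2" "$3" || wget -q -O "$2" "$1"
}

# Example: refresh_feed https://dfarq.homeip.net/b2rss.xml farquhar.cache 3600
```

With a 3600-second limit, a page can be reloaded as often as you like and the feed's server still sees at most one request an hour.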

You can see the script in action by copying and pasting it into a file on your Web server. It’s not very impressive, but it wasn’t any effort either.

You can pretty it up by making yourself a nice table, or you can grab a nice CSS layout from glish.com.

I can actually code tables without stealing even more code, so here’s an example of a fluid three-column layout using tables that’ll make a CSS advocate’s skin crawl. But this’ll get you started, even if that’s the only useful purpose it serves.


<html><body>

<?php include("showrdf.php"); ?>

<table width="99%" border="0" cellpadding="6">

<tr>

<td colspan="3" align="left">
<h1>My personal newspaper</h1>
</td>

</tr>

<tr>

<td width="25%">

<!--- This is the leftmost column's contents -->

<!--- Hey, how about a navigation bar? -->

<?php include("navigationbar.html"); ?>

</td>

<!--- Middle column -->

<td width="50%">

<h1>Dave Farquhar</h1>

<?php

// Gimme 5 entries and update once an hour (3600 seconds)

InsertRDF("https://dfarq.homeip.net/b2rss.xml", "~/farquhar.cache", TRUE, 5, 3600);

?>

</td>

<!--- Right sidebar column -->

<td width="25%">

<h2>Freshmeat</h2>

<?php

InsertRDF("http://www.freshmeat.net/backend/fm-releases-software.rdf", "~/fm.cache", TRUE, 10, 3600);

?>

<h2>Slashdot</h2>

<?php

InsertRDF("http://slashdot.org/developers.rdf", "~/slash.cache", TRUE, 10, 3600);

?>

</td>

</tr>

</table>

</body></html>

Pretty it up to suit your tastes by adding color elements to the <td> tags and using font tags. Better yet, use the knowledge you just gained to sprinkle PHP statements into a pleasing CSS layout you find somewhere.

Finding newsfeeds is easy. You can find everything you ever wanted and then some at Newsisfree.com.

Using something like this, you can create multiple pages, just like a newspaper, and put links to each of your files in a file called navigationbar.html. Every time you create a new page containing a set of feeds, link to it in navigationbar.html, and all of your other pages will reflect the change. That’s another of PHP’s niceties: managing things like navigation bars is one of the worst things about static HTML pages, and PHP makes it very convenient.

Getting out of a sticky BIND

Setting up DNS on Linux isn’t supposed to be the easiest thing in the world. But it wasn’t supposed to be this hard either.
I installed Debian (since it’s nice and lean and mean) and BIND 9.2.1 and dutifully entered the named.conf file and the zone files. I checked out their syntax with the included tools (named-checkconf and named-checkzone). It checked out fine. But my Windows PCs wouldn’t resolve against it.

Ho-hum.

Another day, another Outlook worm. Tell me again why I continue to use Outlook? Not that I ever open unexpected attachments. For that matter, I rarely open expected ones–I think it’s rude. Ever heard of cut and paste? It’s bad enough that I have to keep one resource hog open to read e-mail, so why are you going to make me load another resource hog, like Word or Excel, to read a message where the formatting doesn’t matter?
The last couple of times I received Word attachments that were important, I converted them to PDFs for grins. Would you believe the PDFs were considerably smaller? I was shocked too. Chances are there was a whole lot of revisioning data left in those documents–and it probably included speculative stuff that underlings like me shouldn’t see. Hmm. I guess that’s another selling point for that PDF-printer we whipped up as a proof of concept a couple of weeks ago, isn’t it? I’d better see if I can get that working again. I never did get it printing from the Mac, but seeing as all the decision-makers who’d be using it for security purposes use PCs, that’s no problem.

I spent the day learning a commercial firewall program. (Nope, sorry, won’t tell you which one.) My testbed for this thing will be an old Gateway 2000 box whose factory motherboard was replaced by an Asus SP97 at some point in the past. It’s got 72 megs of RAM. I put in an Intel Etherexpress Pro NIC today. I have another Etherexpress Pro card here that I’m bringing in, so I’ll have dual EEPros in the machine. The firewall has to run under Red Hat, so I started downloading Red Hat 7.2. I learned a neat trick.

First, an old trick. Never download with a web browser. Use the command-line app wget instead. It’s faster. The syntax is really simple: wget url. Example: wget http://www.linuxiso.org/download/rh7.2-i386-disc1.iso

Second trick: Download your ISOs off linuxiso.org. It uses some kind of round-robin approach to try to give you the least busy of several mirrors. It doesn’t always work so well on the first try. The mirror it sent me to first was giving me throughput rates that topped out at 200KB/sec., but frequently dropped as low as 3KB/sec. Usually they stayed in the 15KB/sec. range. I cancelled the transfer (ctrl-c) and tried again. I got a mirror that didn’t fluctuate as wildly, but it rarely went above the 20KB/sec. range. I cancelled the transfer again and got a mirror that rarely dropped below 50KB/sec. and occasionally spiked as high as 120KB/sec. Much better.

Third trick (the one I learned today): Use wget’s -c option. That allows wget to resume transfers. Yep, you can get the most important functionality of a download manager in a 147K binary. It doesn’t spy on you either. That allowed me to switch mirrors several times without wasting the little bit I’d managed to pull off the slow sites.

Fourth trick: Verify your ISOs after you download them. LinuxISO provides MD5 sums for its wares. Just run md5sum enigma-i386-disc1.iso to get a long 32-character checksum for what you just downloaded. If it doesn’t match the checksum on the site, don’t bother burning it. It might work, but you don’t want some key archive file (like, say, the kernel) to come up corrupt. Even though CD-Rs are dirt cheap these days and high-speed burners make quick work of them, there’s still no point in unnecessarily wasting 99 cents and five minutes on the disc and half an hour on a questionable install.
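The third and fourth tricks fit in a couple of small shell functions. This is only a sketch: the function names are made up, and the checksum in the example is a placeholder, not the real one from the site.

```shell
# Resume a partial download instead of starting over (wget's -c option).
fetch_iso() {
  wget -c "$1"
}

# Succeeds only if the file's MD5 sum matches the published one.
verify_md5() {
  # $1 = file, $2 = expected 32-character checksum
  actual=$(md5sum "$1" | cut -d' ' -f1)
  [ "$actual" = "$2" ]
}

# Example (placeholder checksum):
# fetch_iso http://www.linuxiso.org/download/rh7.2-i386-disc1.iso
# verify_md5 rh7.2-i386-disc1.iso 0123456789abcdef0123456789abcdef && echo "OK to burn"
```

If a transfer crawls, hit ctrl-c and run fetch_iso again; wget picks up where the partial file left off.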

As for downloading the file in separate pieces like Go!Zilla does, there’s a command-line Linux program called mget that does it, but it doesn’t follow redirection and it doesn’t do FTP except through a proxy server, so I have a hard time recommending it as a general-purpose tool. When it works, it seems to work just fine. You might try mget, but chances are decent you’ll end up falling back on wget.

Sorcerer, meet Squid. Squid, meet Sorcerer.

I didn’t feel all that well last night. Not sure if I’m coming down with something, or if it’s something else. I’ve actually felt a little weird for the last couple of days, so I’ve been sucking down zinc lozenges, and I remembered Steve DeLassus’ advice the last time I got sick: swallow a raw garlic clove. I felt fine the next day. So guess what I had for breakfast this morning? That’ll solve the problem of anyone wanting to come near me all day…
I napped a good part of the evening, but I got a little work done. I finally got the guts to raise my hand in the Sorcerer mailing list and ask if anyone else was having problems compiling XFree86. Turns out there was a bug. So now I don’t feel so stupid. It took a couple of hours to compile, and at first I configured it wrong, but now I’ve got a usable GUI.

I also installed Squid on the Sorcerer box. There isn’t a spell for Squid yet, and I’m not positive I can write it (it requires adding users and doinking with configuration files, and editing configuration files automatically goes a little beyond my Unix lack-of-expertise), but I may give it a try. One thing that annoys me about Squid: It uses really lame compiler options, and it ignores the system default options. I need to learn the syntax of make files so I can try to override that. The main reason to run Squid is for performance, so who wouldn’t want a Squid compiled to wring every ounce of performance it can out of the CPU?

But at any rate, I installed it, and did minimal–and I mean minimal–configuration: adding a user “squid” and setting it to run as that user, changing ownership of its directory hierarchy, opening it up to the world (I’m behind a firewall), running squid -NCd1, and putting a really lame script in /etc/rc3.d. Here’s the script:

#!/bin/sh
echo "Starting Squid..."
/usr/local/squid/bin/squid

See? Told you it was lame.
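A slightly less lame version would at least know how to stop Squid. Here's a sketch, assuming the same install path as above; squid's -k shutdown option asks a running Squid to exit cleanly. The squid_rc name and the SQUID variable are just for illustration.

```shell
#!/bin/sh
# Path to the Squid binary; override SQUID to point somewhere else.
SQUID=${SQUID:-/usr/local/squid/bin/squid}

squid_rc() {
  case "$1" in
    start)
      echo "Starting Squid..."
      "$SQUID"
      ;;
    stop)
      echo "Stopping Squid..."
      "$SQUID" -k shutdown
      ;;
    *)
      echo "Usage: squid_rc {start|stop}" >&2
      return 1
      ;;
  esac
}

# In a real rc script the last line would be: squid_rc "$1"
```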

Performance? It smokes. There are a few sites that Squid seems to slow down no matter what, but www.kcstar.com absolutely rips now, so I can get my Royals updates faster.

It makes sense. My Squid boxes have previously been TurboLinux boxes, which are nice, minimalist systems, but they’re designed for portability. In other words, they’re still 386-optimized. Plus, they’re running the 2.2 kernel and ext2. This one’s running 2.4.9, disk formatted reiserfs, with everything optimized for i686.

Linkfest.

I felt downright awful yesterday, but it’s my own fault. I remember now why I don’t take vitamins with breakfast. Very bad things happen.
So I’m whupped, and I’m not going to post anything original today. Just some stuff I’ve found lately and haven’t gotten around to posting anywhere.

But first, something to keep in the back of your mind: If The Good News Players, a drama troupe from the Concordia University system, is ever visiting a Lutheran church near you, be sure to go check it out. They are amazing. I put myself together enough to catch them at my church last night and I didn’t regret it in the least. They tell Bible stories in the form of mini-musicals; they’re easy to understand, professional, and just plain funny.

Linux OCR. This is huge. It’s not quite production-quality yet, but then again, neither is the cheap OCR software shipped with most cheap scanners. Check it out at claraocr.org.

It would seem to me that this is the missing link for a lot of small offices to dump Windows. Linux has always been a good network OS, providing fileshares, mail and Web services. Put Zope on your Web server and you can update your company’s site without needing anything like FrontPage. WordPerfect for Linux is available, and secretaries generally love WordPerfect, as do lawyers. ClaraOCR provides an OCR package. SANE enables a large number of scanners. GIMP is available for graphics work. And we’re close to getting a good e-mail client. And the whole shebang costs less than Windows Me.

Linux VMs, without VMware. This is just plain cool. If, for security reasons, you want one service per server, but you don’t have the budget or space for 47 servers in your server room, you can use the User-Mode Linux kernel. (The load on most Linux servers is awfully light anyway, assuming recent hardware.) This Linux Magazine article describes the process. I could see this being killer for firewalls. On one machine, create several firewalls, each using a slightly different distribution and ruleset, and route them around. “Screw you, l337 h4x0r5! You are in a maze of twisty passages, all alike!”

And a tip. I find things by typing dir /s [whatever I’m looking for] from a DOS prompt. I’m old-fashioned that way. There’s no equivalent syntax for Unix’s ls command. But Unix provides find. Here’s how you use it:

find [subdirectory] -name [filename]

So if I log in as root and my Web browser goes nuts and saves a file somewhere it shouldn’t have and I can’t find it, I can use:

find / -name obnoxious_iso_image_I’d_rather_not_download_again.iso

Or if I put a file somewhere in my Web hierarchy and lose it:

find /var/www -name dave.jpg
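If typing dir /s is burned into your fingers, a small shell function can fake it. dir_s is a made-up name; all it does is wrap find, and because -name accepts shell-style wildcards, a quoted pattern works too.

```shell
# dir_s PATTERN [DIRECTORY]
# Search DIRECTORY (default: the current directory) for names matching PATTERN.
dir_s() {
  find "${2:-.}" -name "$1" -print
}

# Example: dir_s 'dave.*' /var/www
```

Quote the pattern so the shell hands the wildcard to find instead of expanding it first.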

Windows XP activation cracked. Here’s good news, courtesy of David Huff:

Seems that the staff of Germany’s Tecchannel has demonstrated that WinXP’s
product activation scheme is full of (gaping) holes:

WinXP product activation cracked: totally, horribly, fatally and
Windows Product Activation compromised (English version)

03/29/2001

Where’ve you been all my life? Yes, I say that to every program I find that I like. But this time I think I might really mean it.

My biggest beef with disk optimizers is that I never found one with an intelligent directory sort routine. You see, the most important files in the directory should appear first for best performance on a FAT or FAT32 volume. Norton Utilities doesn’t offer a foolproof method to get the most important files up top every time. Neither does Fix-It. Nuts & Bolts (now McAfee Utilities) had the best method, but seeing as talking about McAfee Utilities is a violation of the license agreement, I can’t tell you if McAfee Utilities still has the feature, if it’s improved, if it’s worth having, or anything of the sort. Frankly I don’t want to know, unless the answer is no. I refuse adamantly to do business with any company that thinks it’s above the First Amendment. Even Microsoft isn’t that despicable. Apple’s not that despicable. Hell, Apple and Microsoft put together, with ultimate crybaby baseball players Gary Sheffield and Frank Thomas thrown in for good measure, aren’t HALF that despicable.

So who cares if McAfee Utilities is any good? You don’t want it anywhere near your computer no matter what it does. (And I suspect it’ll do a royal job of breaking it, based on my experience with Nuts & Bolts, which was a versatile suite but dangerous if used improperly. And every other McAfee product I looked at before they instituted that license agreement sucked. I mean really sucked. And it’s easier to try to stop freedom of speech than it is to improve your products.)

So… You’ve got the powerful Norton Utilities, with lots of selectable options but a couple of options that should be there that aren’t. And you’ve got Fix-It, which is a lot easier to use but not very configurable at all, so it’s better than Defrag and Scandisk but far from perfect. What to do? Buy one of them. Then download LFNsort.

LFNsort allows you to sort directories intelligently. Using multiple criteria. Fabulous. Download it, then run it (preferably you should exit all running programs first). Here’s the syntax I use:

lfnsort -a-s c: /s

This sorts your directory entries by access date, or, if no access date is available, by size (the next-best indication of importance). In the root directory I think I’d want to go with a manual sort (on my machine, the c:\windows and c:\program files entries get buried deeper than I’d like) but otherwise LFNsort seems to work really well.

So if you want the fastest computer possible, get a utilities suite, then download this, test it, and if you like the results, register it.

Troubleshooting intermittent PC problems

How to troubleshoot an intermittent PC problem. We’ve got an aging P2-233 at work that likes to bluescreen a lot under NT4–usually once every day or two. No one who looked at it was able to track it down. The first thing I noticed was that it still had the factory installation of NT, from about three years ago. Factory installations are bad news. The first thing you should do with any PC is install a fresh copy of Windows. If all you have are CAB files and no CD, don’t format the drive–just boot to DOS, go into that directory, run Setup, and install to a new directory other than C:\Windows. With NT, it’s also possible to install from DOS though the syntax escapes me momentarily.

The first thing I suggested was to run RAM Stress Test, from www.ultra-x.com, over the course of a weekend to eliminate the possibility of bad memory. I followed that by formatting the drive FAT and running SpinRite. After six hours, SpinRite gave the disk a completely clean bill of health.

Knowing the memory and disk were good, I built up the system, installing NT, then installing SP5 128-bit, then installing IE 5.01SP1, then installing Diskeeper Lite, then installing Office 97 and Outlook 98 and WRQ Reflection, then running Windows Update to get all the critical updates and SP6a. (Yes, the great hater of Windows works in a shop that runs Microsoft software almost exclusively on its PCs.) I ran Diskeeper after each installation to keep the drive in pristine condition–I find I get better results that way than by installing everything and then running Diskeeper.

The system seemed pretty stable through all that. Then I went to configure networking and got a bluescreen. Cute. I rebooted and all was well and remained well for an hour or two.

How to see if the bluescreen was a fluke?

I devised the following batch file:

:loop
  dir /w /s c:\
goto loop

Who says command lines are useless and archaic? Definitely not me! I saved the file as stress.bat and ran 10 instances of it. Then I hit Ctrl-Alt-Del to bring up Task Manager. CPU usage was at 100%. Good.
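The same trick carries over to a Linux box. Here's a rough shell counterpart, offered as a sketch rather than what I ran on the NT machine; the function name and the pass counter are additions so the loop can end by itself.

```shell
# Recursively list DIR, PASSES times, throwing the output away.
stress_pass() {
  # $1 = directory to walk, $2 = number of passes
  i=0
  while [ "$i" -lt "$2" ]; do
    ls -lR "$1" > /dev/null 2>&1
    i=$((i + 1))
  done
  echo "$i passes completed"
}

# Ten at once, in the background:
# for n in 1 2 3 4 5 6 7 8 9 10; do stress_pass / 1000 & done
```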

The system bluescreened after a couple of hours.

How to track down the problem? Well, I knew the CD-ROM drive was bad. Can a bad CD-ROM cause massive system crashes? I’ve never heard of that, but I won’t write off anything. So I disconnected the CD-ROM drive. I’d already removed all unnecessary software from the equation, and I hadn’t installed any extraneous peripherals either. So with the CD-ROM drive eliminated, I ran 10 instances of the batch file again.

The system didn’t make it through the night.

OK. Memory’s good. Hard drive’s good. Bad CD-ROM drive out of equation. Fresh installation of OS with nothing extra. What next?

I called my boss. I figured maybe he’d have an idea, and if not, he and I would contact Micron to see what they had to suggest–three-year warranties and a helpful technical support staff from a manufacturer who understands the needs of a business client are most definitely a good thing. The day Apple manages to figure that out will be the first step towards capturing and keeping more than six percent of the market. But I digress.

My boss caught the obvious possibility I missed: heat.

All the fans worked fine, and the CPU had a big heatsink put on at the factory that isn’t going anywhere. Hopefully there was thermal compound in there, but if there wasn’t, I wouldn’t be getting in there to put any in, nor would I be replacing the heatsink with a heatsink/fan combo. So I pulled the P2-333 out of the PC I use–it was the only 66 MHz-bus P2 I had–and put it in the system. I’d forgotten those old P2s weren’t multiplier-locked, so the 333 ended up running at 233. That’s fine. I’ve never had overheating problems with that chip at its rated speed, so at 100 MHz less, I almost certainly wouldn’t run into problems.

With that CPU, the system happily ran 10 instances of my batch file for 30 hours straight without a hiccup. So I had my culprit: That P2-233 was overheating.

Now, ideally a stress test would tax more system memory than this one did and would force some floating-point operations as well. So for your home system, a good stress test might be to load up several FPS games and let them run in demo mode continuously for a while. A command-line MP3 encoder, encoding the same WAV file and then deleting the resulting MP3 file over and over in a continuous-loop batch file also would suffice to put the floating-point unit to use and would also force the disk into action.
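The encoder loop might look something like this in shell. It's a sketch that assumes the lame command-line encoder is installed; any encoder that takes an input file and an output file should slot in through the ENCODER variable. The function name is made up.

```shell
# Encoder to run; assumed to accept "input output" as its arguments.
ENCODER=${ENCODER:-lame}

# Encode WAV to MP3, delete the result, and repeat PASSES times.
encode_loop() {
  # $1 = WAV file, $2 = number of passes
  i=0
  while [ "$i" -lt "$2" ]; do
    "$ENCODER" "$1" "$1.mp3" || return 1
    rm -f "$1.mp3"
    i=$((i + 1))
  done
  echo "$i"
}

# Example: encode_loop test.wav 500
```

The function stops early and returns an error if the encoder fails, which is a useful signal in its own right on a flaky machine.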

If you have time and parts available, you can troubleshoot a recalcitrant PC by running such a real-world stress test, then replacing possible suspect parts (CPU, memory, hard drive, motherboard) one at a time until you isolate the problem.
