The best e-book site I’ve found

The best e-book site I’ve found, by far, is the archive at the University of Adelaide in Australia. The selection is outstanding, but the presentation is even better.

Steve Thomas, the curator, takes tremendous care to ensure Adelaide’s e-books display their best on any device. Most e-books, even commercial books, pay little to no attention to formatting, and the result all too often is books that are difficult to read.

An easy, low-budget CMS

I have a friend who wants to set up a Web site where he and a couple of other people can post articles. The easy way out is to just set up a blog for them, but I don’t really like the blog metaphor for their site.
I found Hotani on Freshmeat this morning. It doesn’t come close to the functionality of a full-blown CMS like Zope, but it handles the basics: You feed it text, and it turns that text into a consistent-looking Web site with links to the rest of your content.

A reminder: 30 Days to a More Accessible Web Site

In a conversation today, I referred to Mark Pilgrim’s excellent 30 Days to a More Accessible Web Site.
This is must-read material. I confess to being guilty of neglecting most of the things in this piece, even though I would have gained substantial benefit from some of it at a recent point in my life, when I wasn’t able to operate a mouse and could barely keyboard.

I implemented the “add titles to links” feature. It required me to hack some PHP and is certainly the most substantial thing I’ve implemented without Steve’s help. It’s not much but it’s nice, even for those who have no disabilities: now, when you mouse over a calendar entry, the title of the entry pops up, like a tooltip. And for those using screen readers, my calendar now starts to make some sense.
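
My change was in PHP, but the idea fits in a few lines of any language. Here is a minimal Python sketch, not the code I actually used: the function name, the URL, and the entry title are all made up for illustration. The one thing that matters is escaping the title before it goes into the attribute, so a quotation mark in an entry title can’t break the markup.

```python
from html import escape


def link_with_title(url: str, text: str, title: str) -> str:
    """Build an <a> element whose title attribute describes the link target."""
    return '<a href="{}" title="{}">{}</a>'.format(
        escape(url, quote=True),    # escape quotes so the attribute stays intact
        escape(title, quote=True),
        escape(text),
    )


# Hypothetical calendar entry: the day number is the link text,
# and the entry's title becomes the tooltip.
print(link_with_title("/archive/entry.html", "14", 'Notes on "accessible" links'))
```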

As a bonus, some of this stuff will make Google treat you better if you implement it.

Read it. Download a copy and save it to your hard drive. And start implementing it.

Increase the speed of your Web pages

There are commercial utilities that will optimize your HTML and your images, cutting the size down so your stuff loads faster and you save bandwidth. But I like free.
I found free.

Back in the day, I told you about two programs, one for Windows and one for Unix, that will crunch down your JPEGs by eliminating metadata that’s useless to Web browsers. The Unix program will also optimize the Huffman tables and optionally resample the JPEG into a lossier image, which can net you tremendous savings but might also lower image quality unacceptably.

Yesterday I stumbled across a program on Freshmeat called htmlcrunch that strips out extraneous whitespace from HTML and XML files. Optionally, it will also remove comments. The program works in DOS (including under a command prompt in Windows 9x/NT/2000/XP, and it knows how to handle long filenames) or Unix.

It’s not advertised as such, but I suspect it ought to also work on PHP and ASP files.

How much it will save you depends on your coding style, of course. If you tend to put each tag on one line with lots of pretty indentation like they teach in computer science classes, it will probably save you a ton. If you code HTML like me, it’ll save you somewhat less. If you use a WYSIWYG editor, it’ll probably save you a fair bit.
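
If you want a feel for how much your own coding style costs you before you download anything, here is a rough Python sketch of the general idea. It is not htmlcrunch’s actual algorithm, and the function name is my own invention. It blindly collapses every run of whitespace, so it will mangle pre and textarea blocks and could break inline scripts; treat it as a measuring stick rather than a production tool.

```python
import re


def crunch(html: str, strip_comments: bool = False) -> str:
    """Collapse each run of whitespace to one space; optionally drop comments."""
    if strip_comments:
        # Non-greedy match so each comment is removed separately.
        html = re.sub(r"<!--.*?-->", "", html, flags=re.DOTALL)
    return re.sub(r"\s+", " ", html).strip()


# A tiny page written one tag per line with pretty indentation.
page = """<html>
    <body>
        <!-- navigation goes here -->
        <p>
            Hello
        </p>
    </body>
</html>
"""
small = crunch(page, strip_comments=True)
print(len(page), "bytes before,", len(small), "bytes after")
```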

It works well in conjunction with other tools. If you use a WYSIWYG editor, I suggest you run the code through HTML Tidy first. HTML Tidy, unlike htmlcrunch, actually interprets the HTML and removes some troublesome information. In some cases HTML Tidy will add characters, but this is usually a good thing, because its changes improve browser compatibility. If you feed HTML Tidy a bunch of broken HTML, it’ll fix it for you.

You can further optimize your HTML with the help of a pair of Unix commands. But you run Windows? No sweat. You can grab native Windows command-line versions of a whole slew of Unix tools in one big Zip file here.

I’ve found that these HTML tools sometimes leave spaces between HTML elements. Whether this is intentional or a bug in the code, who knows. But it’s easy to fix with the Unix tr command:

tr "> indexopt.html
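
If you’d rather not depend on tr, the same cleanup can be sketched in Python. This is my own approximation, assuming the goal is to delete whitespace that sits purely between one tag’s closing bracket and the next tag’s opening bracket; the function name is made up. Be aware that a space between two inline elements (two adjacent links, say) is visible in the browser, so removing it can run words together. Check the result before you publish it.

```python
import re


def squeeze_between_tags(html: str) -> str:
    """Delete whitespace that sits directly between '>' and the next '<'."""
    return re.sub(r">\s+<", "><", html)


print(squeeze_between_tags("<ul> <li>one</li>\n  <li>two words</li> </ul>"))
```

To use it on a file, read index.html into a string, pass it through the function, and write the result to indexopt.html.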

Some people believe that Web browsers parse 255-character lines faster than any other line length. I’ve never seen this demonstrated. And in my experience, any Web browser parses straight-up HTML plenty fast no matter what, unless you’re running a seriously, seriously underpowered machine, in which case optimizing the HTML isn’t going to make a whole lot of difference. Also in my experience, every browser I’ve looked at parses CSS entirely too slowly. It takes most browsers longer to render this page than it takes for my server to send it over my pokey DSL line. I’ve tried mashing my stylesheets down, and the choice between multiple 255-character lines and no line breaks whatsoever made little, if any, difference.

But if you want to try it yourself, pass your now-optimized HTML file(s) through the standard Unix fmt command, like so:

fmt -w 255 index.html > index255.html
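
If you don’t have fmt handy, Python’s standard textwrap module can do roughly the same job. This is a sketch, not a drop-in replacement: unlike fmt, it treats the whole file as one paragraph, and like fmt it may put a line break inside a quoted attribute value that contains spaces. The function name is my own.

```python
import textwrap


def rewrap(html: str, width: int = 255) -> str:
    """Re-break text at whitespace so lines stay within `width` characters."""
    return textwrap.fill(
        html,
        width=width,
        break_long_words=False,   # never cut a long URL or tag in half
        break_on_hyphens=False,   # never break after a hyphen inside a word
    )


sample = "<p>" + "word " * 200 + "</p>"
wrapped = rewrap(sample)
print(max(len(line) for line in wrapped.splitlines()))
```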

Optimizing your HTML files to the extreme will take a little time, but it’s probably something you only have to do once, and your page visitors will thank you for it.

Basic design principles

Steve DeLassus asked me for some tips on design for a site he’s building. Since that’s fairly general-interest information, I figure I might as well make a post of it.
The most important thing to remember is that there are basic rules of design that you can follow and be a good designer, even if you have zero ability. Great designers know where to break the rules. (And for the record, if I were a great designer, I’d be an art director for some magazine somewhere. I’m not.)

Fonts. General rule: Serif fonts are easier to read on paper than sans-serif fonts. The opposite is usually true onscreen, which has lower resolution than paper. You can play it safe by specifying fonts like Verdana, Lucida, and Georgia, which are specially designed for screen displays.

Use Lucida or Verdana if you want to look modern. Use Georgia if you want to look traditional.

Specify Times as a secondary font for Georgia, and Helvetica and Arial as secondaries for Lucida or Verdana, but those fonts are so common that I’d avoid them for primary use. As an experiment in college, I stopped using Times and Arial on my papers and substituted other, less-common fonts. My grades improved. Being a little bit distinctive can help. I suspect Georgia and Verdana may one day become as common online as Times and Arial, but that day isn’t here yet.

Colors. Use a high-contrast scheme like black on white. People are used to light backgrounds and dark text, so be prepared for complaints if you use a dark background and light text, even if you believe (as I do) that dark backgrounds are easier on the eyes. People in their teens and 20s (and possibly early 30s) are likely to be more forgiving on this than people who are older.

I’ve run into people who are militantly opposed to dark backgrounds. I’ve never run into anyone militantly opposed to light ones. So play it safe.
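
If you want a number to back up your eyes, the W3C’s Web Content Accessibility Guidelines define a contrast ratio you can compute. That formula is theirs, not mine, and it isn’t something I used when I picked my colors. This Python sketch follows the WCAG 2.0 definition as I understand it; the function names are my own. Black on white comes out at the maximum of 21:1, and WCAG asks for at least 4.5:1 for ordinary body text.

```python
def _linear(channel: int) -> float:
    """Convert one 0-255 sRGB channel to linear light, per WCAG 2.0."""
    c = channel / 255
    return c / 12.92 if c <= 0.03928 else ((c + 0.055) / 1.055) ** 2.4


def luminance(rgb) -> float:
    """Relative luminance of an (r, g, b) color, from 0 (black) to 1 (white)."""
    r, g, b = (_linear(v) for v in rgb)
    return 0.2126 * r + 0.7152 * g + 0.0722 * b


def contrast_ratio(fg, bg) -> float:
    """WCAG contrast ratio; the order of the two colors doesn't matter."""
    lighter, darker = sorted((luminance(fg), luminance(bg)), reverse=True)
    return (lighter + 0.05) / (darker + 0.05)


print(contrast_ratio((0, 0, 0), (255, 255, 255)))        # black on white
print(contrast_ratio((200, 200, 200), (255, 255, 255)))  # light gray on white
```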

Color schemes. Follow the rules of the color wheel unless you know better. And remember: Any color will look fine with black or white, and very nearly any color will look fine with some shade of gray. Limit funky color schemes to your navigation bar; keep your main body text close to the classic black and white.

Keep in mind what you want to convey. A funky, hip site might mix orange and blue or orange and green. A more conservative site will drift towards blue and yellow.

And a very safe choice: Black, white, and red. Anything you do with those three colors will look just fine.

An easy way to play with color schemes is to visit this site.
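
If you’d rather compute a starting point than hunt for one, the simplest color-wheel rule (pick the color directly opposite) is easy to sketch in Python with the standard colorsys module. Two assumptions to flag: this uses the RGB-based wheel that computers use, not the painter’s red-yellow-blue wheel, and the function name is my own. On this wheel the opposite of orange lands in the blues, which fits the orange-and-blue pairing mentioned above.

```python
import colorsys


def complement(rgb):
    """Rotate the hue 180 degrees, keeping lightness and saturation."""
    r, g, b = (v / 255 for v in rgb)
    h, l, s = colorsys.rgb_to_hls(r, g, b)
    r2, g2, b2 = colorsys.hls_to_rgb((h + 0.5) % 1.0, l, s)
    return tuple(round(v * 255) for v in (r2, g2, b2))


print(complement((255, 0, 0)))    # opposite of pure red
print(complement((255, 165, 0)))  # opposite of orange
```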

Backgrounds. The in thing seven years ago in Web design was to use a background pattern. Today that’s generally a no-no, at least if you’re overlaying text. Place your text against a solid color. Limit background usage to your margins. Busy backgrounds are distracting.

Animation. I have animation turned off in all of my Web browsers. Animation is distracting. You can look very professional if you never use animated GIFs, Flash, or Javascript. It is extraordinarily difficult to use animated GIFs, Flash, and gratuitous Javascript and not look amateurish.

Think about it for a minute: Our most basic instinct is survival. And out in the wild, movement could mean a couple of things. Something that moved might be lunch. Or something that moved might think you’re lunch. So you naturally pay more attention to something that moves than to something that doesn’t.

On most professionally designed sites, the only thing that moves is the ads. There’s a big reason for that.

Existing media. If you have art you intend to use, make sure your site goes with it. Better yet, design your site around it. I designed this site around the montage of photos along the top. Steve had a family crest. It’s an elegant, attractive design. Almost as attractive and elegant as my family crest. Steve’s crest utilized a blue and a gray that seem to go really well together, so I suggested Steve use those colors as the principal colors in his site. The gray will be well-suited for the background of the text portion, so the natural place to use the blue is in his navigation bar.

Rule breakage. The exception to virtually every rule here is your page title. When you blow text up really big, you can get away with almost anything. So if you’re going to get daring, get daring on your page title. And even if you don’t get daring, most fonts look terrific really big. So blow your page title up really big.

Optimizing Web graphics

Gatermann told me about a piece of freeware he found on one of my favorite sites, tinyapps.org, called JPG Cleaner. It strips out the thumbnails and other metadata that editing programs and digital cameras put in your graphics that isn’t necessary for your Web browser to render them. Sometimes it saves you 20K, and sometimes it saves you 16 bytes. Still, it’s worth doing, because more often than not it saves you something halfway significant.
That’s great, but I don’t want to be tied to Windows, so I went looking for a similar Linux program. There isn’t much. All I was able to find was a command-line program, written in 1996, called jpegoptim. I downloaded the source, but didn’t have the headers to compile it. I went digging and found that someone built an RPM for it back in 1997, but Red Hat never officially adopted it. I guess it’s just too special-purpose. The RPM is floating around; I found it on a Japanese site. If that ever goes away, just do a Google search for jpegoptim-1.1-0.i386.rpm.

I used the Debian utility alien to convert the RPM to a Debian package. It’s just a 12K binary, so there’s nothing to installing it. So if you prefer SuSE or TurboLinux or Mandrake or Caldera, it’ll install just fine for you. And Debian users can convert it, no problem.

Jpegoptim actually goes a step further than JPG Cleaner. Aside from discarding all that metadata in the header, its main claim is that it optimizes the Huffman tables that make up the image data itself, reducing the image in size without affecting its quality at all. The difference varies; I ran it on several megabytes’ worth of graphics, and found that on images that still had all those headers, it frequently shaved 20-35K from their size. On images that didn’t have all the extra baggage (including some that I’d optimized with JPG Cleaner), it reduced the file size by another 1.5-3 percent. That’s not a huge amount, but on a 3K image, that’s roughly 45-90 bytes. On a Web page that has lots of small images, those bytes add up. Your modem-based users will notice it.
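
To show what “discarding metadata” means at the byte level, here is a Python sketch. It is not how JPG Cleaner or jpegoptim is implemented, and it does nothing to the Huffman tables. It assumes a well-formed file with no padding bytes between segments, walks the segments ahead of the image data, and leaves out the APP1 through APP15 and comment segments, which is where Exif data, thumbnails, and editor notes live. One caution: that range also covers embedded color profiles and Adobe’s color marker, so colors can shift on some images. The function name is my own.

```python
import struct

SOS = 0xDA                                # start of scan: image data follows
DROP = set(range(0xE1, 0xF0)) | {0xFE}    # APP1..APP15 and COM segments


def strip_metadata(jpeg: bytes) -> bytes:
    """Return the JPEG with APP1-APP15 and comment segments removed."""
    if jpeg[:2] != b"\xff\xd8":
        raise ValueError("not a JPEG stream")
    out = bytearray(jpeg[:2])             # keep the start-of-image marker
    pos = 2
    while pos + 4 <= len(jpeg):
        if jpeg[pos] != 0xFF:
            raise ValueError("expected a marker at offset %d" % pos)
        marker = jpeg[pos + 1]
        if marker == SOS:
            out += jpeg[pos:]             # copy the compressed image verbatim
            return bytes(out)
        # The two-byte big-endian length counts itself but not the marker.
        (length,) = struct.unpack(">H", jpeg[pos + 2:pos + 4])
        end = pos + 2 + length
        if marker not in DROP:
            out += jpeg[pos:end]
        pos = end
    raise ValueError("no start-of-scan marker found")
```

To try it, read a file in binary mode, pass the bytes through strip_metadata, and write the result to a new file so the original stays untouched.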

And Jpegoptim will also let you do the standard JPEG optimization, where you set the file quality to a numeric value between 1 and 100, the higher being the truest to the original. Some image editors don’t let you adjust the quality in a very fine-grained manner. I’ve found that a level of 70 is almost always perfectly acceptable.

So, to try to get something for nothing, change into an image directory and type this:

jpegoptim -t *

And the program will see what it can save you. Don’t worry if you get a negative number; if the “optimized” file ends up actually being bigger, it’ll discard the results.

To lower the quality and potentially save even more, do this:

jpegoptim -m70 -t *

And once again, it’ll tell you what it saves you. (The program always optimizes the Huffman tables, so there’s no need to do multiple steps.) Be sure to eyeball the results if you play with quality, and back up the originals.
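
Since backing up is the step everyone skips, here is a Python sketch that does it for you before calling jpegoptim. The function names and the backup directory name are my own inventions, and it only looks at files ending in .jpg rather than everything in the directory. The -t and -m switches are the ones shown above; the -n (report only, change nothing) switch is an assumption about your build, so check your copy’s help output before relying on it.

```python
import shutil
import subprocess
from pathlib import Path


def jpegoptim_command(files, max_quality=None, dry_run=False):
    """Build the jpegoptim argument list without running anything."""
    cmd = ["jpegoptim", "-t"]             # -t prints totals at the end
    if max_quality is not None:
        cmd.append("-m%d" % max_quality)  # e.g. -m70
    if dry_run:
        cmd.append("-n")                  # assumed: report only, change nothing
    return cmd + [str(f) for f in files]


def optimize_directory(directory, backup_name="originals", **options):
    """Copy every .jpg into a backup folder, then run jpegoptim on them."""
    directory = Path(directory)
    files = sorted(directory.glob("*.jpg"))
    if not files:
        return None
    backup = directory / backup_name      # hypothetical backup location
    backup.mkdir(exist_ok=True)
    for f in files:
        shutil.copy2(f, backup / f.name)  # keeps timestamps as well
    return subprocess.run(jpegoptim_command(files, **options), check=True)
```

Calling optimize_directory("images", max_quality=70) would then be the equivalent of the second command above, with a safety net.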

Commercial programs that claim to do what these programs do cost anywhere from $50 to $100. This program may be obscure, but that’s criminal. Go get it and take advantage of it.

Also, don’t forget the general rule of file formats. GIF is the most backward-compatible, but it’s encumbered by patents and it’s limited to 256-color images. It’s good for line drawings and cartoons, because it’s a lossless format (it only compresses the data, it doesn’t change it).

PNG is the successor to GIF, sporting better compression and support for 24-bit color images. Like GIF, it’s lossless, so it’s good for line drawings, cartoons, and photographs that require every detail to be preserved. Unfortunately, not all browsers support PNG.

JPEG has the best compression, because it’s lossy. That means it looks for details that it can discard to make the image compress better. The problem with this is that when you edit JPEGs, especially if you convert them between formats, you’ll run into generation loss. Since JPEG is lossy, line drawings and cartoons generally look really bad in JPEG format. Photographs, which usually have a lot of subtle detail, survive JPEG’s onslaught much better. The advantage of JPEG is the file sizes are much smaller. But you should always examine a JPEG before putting it on the Web; blindly compressing your pictures with high compression settings can lead to hideous results. There’s not much point in squeezing an image down to 1.5K when the result is something no one wants to look at.

The tightrope of Web design

There are few challenges more daunting than designing a truly first-rate Web site.
And I’m not here to tell you how to design a first-rate Web site, because I’m not so arrogant as to assert that I’ve ever done it. I’ve tried it a dozen or so times. Some of the results have been good enough to be worthy of staying on the Web for a while. Some of them have been so bad that if someone were to hand me a printout today, I’d question what I could have possibly been thinking when I did it, and I might even question whether the design was mine. Yes, I’ve done my best to forget a lot of them.

And a lot of people are probably wondering why I’m making such a big deal out of this, since making a Web site is something that it seems like everybody does. I think everyone I went to college with had a Web site that had pictures of their cats, lists of all the CDs they owned (or wished they owned), their resumes, and links to all of their friends’ sites.

But that’s precisely the issue. Since everyone does it, it’s difficult to stand out.

There are actually three elements that make up a truly first-rate site, and the biggest problem with most near misses is that they only hit one or two of those elements. Other sites, like most personal home pages that populated the Web in the early ’90s, missed them all.

Content. A first-rate site has to have something to say. The biggest problem with those early personal home pages was that people had nothing to say. The novelty of clever ways to present boring and useless information wears off quickly. Ideally, a site should give some order to that content, so people can find what they’re looking for. A Weblog dedicated to the rebuilding of vintage BMW motorcycles could be extremely useful, but its usefulness will wear off very quickly if there isn’t a good way to find things in it.

Community. The best stuff comes from the questions people ask, or the answers people provide. Just ask any teacher. Anything that provides opportunity for banter between content provider and reader, or between readers, is a good thing. If there’s a way to organize and search that banter, so much the better. That hypothetical BMW motorcycle blog would be a lot more useful with people asking questions and sharing their own experience.

Design. This is last, and possibly least. Yet for many people it’s the most challenging. This is partly because some people aren’t naturally gifted in this area (I’m not), and partly because of the crude tools involved. There are probably other factors. We’ll concentrate on this area though, because it’s probably the only area that’s debatable.

Some people question whether design is even necessary. This is a sure sign that an awful lot of designers are doing their jobs. Design’s job is to set the mood, present the content in a facilitating manner, and get out of the way.

The challenge the Web presents is that power users are used to configuring their computers once and having those settings stick. They set the colors and the font and the window size the computer should use for everything, and some of them resent it when anyone imposes anything different on them. Some of them even seem to resent the use of p-tags to denote the end of a paragraph. They’ll decide when a paragraph ends and a new one begins, thank you very much. What’s the original author of the piece know, anyway?

On the other hand, you have users who are still trying to figure out what that blasted mouse is for. (This is in contrast to the people like me who’ve been using a computer for 20 years and are still trying to figure out what that blasted mouse is for.) They don’t know where those settings are and don’t care to set them themselves; they expect to be able to go to a Web page, and if it just looks like a raw data feed, they’ll go on to the next place because it looks nicer.

Those power users have a difficult time with this concept, but mankind has learned a few things in the thousands of years since the first time someone applied ink to parchment. Most of it was through trial and error, but most of that wisdom is timeless. Throwing that away is like deciding you don’t like the number zero. For example, in the case of Roman alphabets, a line length of between 50 and 80 characters reads much faster than any other length. If reading a page makes you feel tired, check the line length.

Given that, a browser window expanded to full screen is too short and too wide. Books and magazines and newspapers are vertically-oriented for a reason. So the primary navigation goes along the side, because there’s horizontal room to spare and vertical room is too precious to waste on something not content-oriented. Most computer users don’t want to think about this kind of stuff.

When it comes to font selection, things get a little bit easier. Fonts with serifs (feet and ears, like Times) look elegant and they’re easy to read because the serifs guide the eye. Sans-serif fonts (like Arial, which is a Helvetica ripoff) look really good when you blow them up big, but when you run them too small, the eye gets confused. The problem is that computer screens don’t have enough resolution to really do serifs justice. So the best thing to do in most situations is to run a sans-serif font with lots of line spacing. The extra space between the lines helps to guide the eye the same way serifs will. If you notice the typography, the designer has probably done a poor job. If you feel physically tired after reading the piece, the designer definitely has done a poor job.

Brightness and contrast are another issue. The rule is that for short stretches, you can read just about anything. That’s why you’ll see photos run full-page in magazines with the caption superimposed on top. But for reading anything more than a paragraph, you need a fair bit of contrast. Our society is used to black text on white. White or light grey text on black should theoretically work as well, but we’re used to light backgrounds, so we struggle sometimes with dark backgrounds.

But contrast done well can extend beyond convention. It’s possible to make an eye-catching and perfectly readable design with orange and blue, assuming you use the right shades of orange and blue and size elements appropriately. If you don’t feel physically tired after reading it, the designer did a good job, even if you don’t like blue and orange.

The problem with Web design is multifaceted. Not all browsers render pages the same way. This was a nightmare in the mid-90s, when Microsoft and Netscape sought to gain advantages over one another by extending the HTML standard and not always incorporating one another’s extensions. Netscape and Opera deciding to release browsers that follow the standards, regardless of what that does to pages developed with Microsoft tools, was a very good thing: it forced Microsoft to at least act like it cares about standards. So if a designer is willing to work hard enough, it’s possible to make a page that looks reasonably close in all the major browsers today.

HTML never helped matters any. HTML is a very crude tool, suitable for delineating paragraphs from headings and providing links but nothing else. You can tell from looking at the original standard that no one with design background participated in its creation. Anything created in strict HTML 1.0 will look like a page from a scientific journal. To adjust line spacing or create multi-column layout, people had to resort to hacks, and browsers react to those hacks in different ways.

XHTML and CSS are what journalism students like me, toiling in the early ’90s trying to figure out what to do with this new medium, should have been praying for. Together they’re still not as versatile as PostScript, but they’re very nearly good enough as a design language.

The final design hurdle, though, has always been with us and will only get worse. You could always tell in the early ’90s what pages were created on campus with $10,000 workstations and which ones were created on computers the student owned. Lab-created pages used huge fonts and didn’t look right at any resolution below 1024×768. Meanwhile, I was designing for 14-inch monitors because that was what I had. That 14-inch monitor cost me 300 bucks, buddy, so I don’t want to hear any snickers!

Today, you can buy a decent 19-inch monitor for what I paid for that 14-incher. But as monitors have gotten larger, resolutions have only varied more. A lot of people run 17-inch or even 19-inch monitors at 640×480. Sometimes this is because they haven’t figured out how to change the resolution. Sometimes it’s because they like huge text. Flat-panel displays generally look gorgeous in their native resolution but terrible in any other, so it’s not fair to ask a flat-panel user to change. These displays became affordable within the past couple of years, so they are more common now than ever. A typical flat-panel runs at 1024×768 or 800×600. And on the other extreme, a 21-inch monitor capable of displaying 1600×1200 comfortably (or higher) can be had for $700.

So, since you can’t predict the resolution or window width people will be using, what do you do? CSS and XHTML provide a bit of an answer. They’ll let you create a content column that scales to the screen size. And if you’re really, really careful, you can specify your elements’ sizes in relative terms, rather than absolute pixel measurements. But this messes up if you have lots of graphics you want to position and line up correctly.

And some designs just stop working right when you mess with the font size. Mine don’t, primarily because I’m a disciple of Roger Black. I don’t have any really strong feelings about Black; it’s just that the first book I read by a designer that I really understood was co-written by Black. And most of Roger Black’s techniques work just fine when you crank up the font sizes. If anything, they look better when you make the fonts big enough that your neighbor can read them when you have your curtains open.

Browser test

I’m curious what this looks like. I’m using a template from glish.com, heavily modified by yours truly. It’s not quite the design I originally envisioned but I think it’s close enough. It’s dark. It’s readable. It’s a little edgy. It’s me.
You can customize the text size (and font) with buttons on the left and it’ll set a cookie so the change stays persistent. For the feature to work right, it needs cookies and javascript enabled.

I do want to modify this to use relative rather than absolute text sizes so it won’t override IE’s default. That’s an incendiary issue among Web designers, but this site looks fine in huge fonts (partly by design) so I’m willing to make the concession. Besides, I know not everyone keeps cookies and javascript enabled.

Assuming this thing doesn’t just completely fall apart in IE6, I’ll move the old HTML files over (so people’s old bookmarks don’t die suddenly) and flip the switch on my router. At this point I don’t want to put any more work into it until I’m certain it’ll work fine for those using the browser I won’t touch with someone else’s 10-foot pole.

So, Konqueror and IE6 users–how’s it hold up?