I saw the headline on Slashdot: Forensic evidence trying to prove whether MS-DOS contained code lifted from CP/M. That got my attention, as the connection between MS-DOS and its predecessor, CP/M, is one of the great unsolved mysteries of computing.
Unfortunately, the forensic evidence doesn’t prove a lot.
The Register has already posted a rebuttal. I have my own problems with the analysis.
The author began from a flawed assumption. He set out to prove that QDOS (the predecessor to MS-DOS 1.0) wasn’t derived from CP/M source code, but nobody seriously claimed that it was. Nobody ever accused Tim Paterson of driving to Digital Research’s headquarters, two states away, and pilfering the source code. And, as the author noted, much of CP/M was written in a high-level language called PL/M, which Gary Kildall was fond of. Without a PL/M compiler for the 8086, or a cross-compiler with an 8086 target, the PL/M code would have been useless to Paterson anyway.
A more likely scenario is disassembling a memory dump of CP/M, then re-implementing the disassembled code in 8086 assembly language, which is similar to 8080 assembly language but not identical. In the early days of 8-bit computing, things like this happened pretty frequently. Reverse-engineering software was easier then, when you only had 64K of memory (or less) to sort through and decode. It certainly happened in the world of Atari game consoles and computers, and on the C-64.
Two cryptic clues support this scenario. The first: Kildall challenged at least one journalist to ask Bill Gates why the string passed to function 9 in both MS-DOS and CP/M is terminated by a dollar sign. Kildall said that only he knew the reason for that.
The second clue is an alleged Easter egg present both in CP/M and early versions of MS-DOS. Zeidman attributes that story to Jerry Pournelle, but John C. Dvorak told the same story in a PC Magazine column around 1995, right around the time Caldera acquired CP/M and DR-DOS from Novell. Supposedly there was a key sequence that caused both operating systems to print Gary Kildall’s name and other information. The only way this could happen is if some cryptic code from CP/M ended up in MS-DOS.
Zeidman dismisses this out of hand. He writes:
In addition, such a message would be easily seen by opening the binary files in a simple text editor unless the message was encrypted. CP/M had to fit on a floppy disk that held only 160 kilobytes; Kildall’s achievement was squeezing an entire operating system into such a small footprint. But it is difficult to imagine he could do this and also squeeze in an undetectable encryption routine. And although we’re now in an era of hackers breaking into heavily secured computers, no one has ever cracked DOS to find this secret command.
But I set out to look for it anyway. I used a utility program developed at SAFE to extract strings of text from binary files. Not only did Kildall’s name not show up in any QDOS or MS-DOS text strings, it did not show up in CP/M either. The term “Digital Research” did appear in copyright notices in the CP/M binary files, but not in MS-DOS or QDOS binary files.
If Jerry Pournelle did indeed see a hidden message revealed by a secret command, it was not in MS-DOS.
Gary Kildall was one of the most brilliant programmers of his generation. I’m not comfortable dismissing Kildall’s ability to put a small encrypted message in CP/M based on the judgment of someone who doesn’t seem to know (or at least doesn’t care) how futile it is to compare 8086 assembly code with PL/M code.
Hiding a short message is not a terribly difficult programming problem to solve. CP/M had to fit on a 160K floppy, but while it’s been a number of years since I examined a CP/M disk, I recall it fit in 160K of storage with room to spare. Gary Kildall’s name and a copyright notice would easily fit in 100 bytes of memory or less. A Caesar cipher, which would disguise the notice as something else, could be implemented in a few hundred bytes in assembly language. A clever and experienced programmer might be able to do it in considerably less. Any lookup table and the encrypted notice could conceivably do double-duty as code that does something else entirely.
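To make that concrete, here is a rough Python sketch of why a plain string scan would miss even a trivially disguised notice. Everything in it is my own illustration, not anything recovered from CP/M or taken from Zeidman's tool: the notice text, the key, the padding bytes and the function names are all hypothetical, and the "cipher" is a Caesar-style shift applied to byte values rather than letters.

```python
import re

# Hypothetical notice and key, chosen purely for illustration.
NOTICE = b"Copyright Gary Kildall"
KEY = 0x80  # shifts printable ASCII (0x20-0x7E) up to 0xA0-0xFE


def extract_strings(blob: bytes, min_len: int = 4) -> list:
    """Return runs of printable ASCII at least min_len bytes long,
    roughly what a 'strings'-style utility reports."""
    return re.findall(rb"[\x20-\x7e]{%d,}" % min_len, blob)


def shift_bytes(data: bytes, key: int) -> bytes:
    """Caesar-style shift: add key to every byte, wrapping at 256."""
    return bytes((b + key) % 256 for b in data)


# Stand-in for surrounding machine code: no printable bytes at all.
PADDING = b"\x00" * 16

plain_image = PADDING + NOTICE + PADDING
hidden_image = PADDING + shift_bytes(NOTICE, KEY) + PADDING

# The scan sees the notice only when it is stored in the clear.
found_plain = extract_strings(plain_image)
found_hidden = extract_strings(hidden_image)

# Shifting back by the same key recovers the original text.
recovered = shift_bytes(shift_bytes(NOTICE, KEY), -KEY)

if __name__ == "__main__":
    print("plain scan: ", found_plain)
    print("hidden scan:", found_hidden)
    print("recovered:  ", recovered)
```

Decoding here is one addition per byte, so the routine that prints the message need not be large. A scan like this only reports what is stored as readable text, which is why coming up empty doesn't settle the question.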
If I were going to dig through CP/M executable code to find it (and that’s where you would find it, not necessarily in the source), I would start by looking for illogical code. Code that works, but isn’t especially elegant. It might include some NOP instructions for no clear reason. And I would look for code that addresses that space as data, rather than as code.
As someone who attempted to program in 8080 assembly language perhaps twice, and well over two decades ago, I’m not the guy to find it. But that’s how I would go about looking for it, if I had the time, talent, and inclination. Unfortunately I only have the last piece.
So why hasn’t the message been found? Probably because so few people have gone looking. Almost everyone has heard the Kildall airplane story, but I don’t know anyone personally, besides myself, who knows the Easter egg story. It’s just not all that well known. Now that the story is on Slashdot, perhaps someone with the requisite abilities, hardware and software will go searching for it.
And I think the key is to look for it in CP/M, not DOS. Try to uncover the Easter egg there, then attempt the same sequence on a PC running PC DOS 1.0. (And you’re less likely to find it in MS-DOS 1.11, which was where Zeidman looked. Zeidman’s failure to track down a copy of PC DOS 1.0 for his investigation is a little disheartening.)
I’m glad Zeidman went to the effort, but his investigation hasn’t convinced me of anything. It just wasn’t very thorough.