How I began as a software engineer...

In the mid-eighties, on our TRS-80 Color Computer (the "Coco"), my dad was using a simple filing program to organize a mailing list or maybe a parts inventory, I can't remember which. We'd call this a database today :) It was written in BASIC and I think we obtained it through Rainbow magazine (for the Coco, specifically) in the days when they printed source listings and you could order the audio tape for that issue with all the programs on it.

We got to have a few entries in the filing system, probably a few hundred. I can't imagine it even being near a thousand as it all fit in memory on what was perhaps a 64K machine at the time (we had upgraded from the 16K model!) If you wanted to print out the list, it was very convenient to sort it first. I'm not joking when I tell you that this took at least two hours to complete before printing would begin -- this on a ~2 Mhz machine which, in general, could run rings around the common 4.77 MHz IBM/DOS machines of the day. And, of course, the results were not saved anywhere so printing a second copy would take another two hours.

When my dad asked rhetorically, "What are those electrons *doing*???", I decided to investigate. A recent article in Rainbow magazine was discussing various sorting techniques. Turns out our little filing program was using a state-of-the-art bubble sort. Within a few hours, I'd rewritten the code to use the new fangled Quicksort algorithm, since the article recommended it as the best general purpose sorting method. This same sort now took only 20 minutes. My dad was simply amazed, and completely satisfied. But it seemed to me that 20 minutes was still an awful long time to sort a few hundred records. Where *was* the time being spent?

BASIC on the Coco has no profiling tools or anything nearly so sophisticated, even if at age 13 or 14 I had known about such things. But I was developing a concept of computer storage and was curious about this strange part of BASIC, the VARPTR. It turns out that BASIC is generally a garbage collected language and Coco BASIC came with three datatypes: numbers (which were always floating point), strings, and arrays (of any dimension) of numbers or strings. In this case, this meant that the sort was exclusively being done on strings. The other important fact is that BASIC has copy-on-assignment semantics. This meant that during the partitioning of the Quicksort, the strings being moved into one partition or the other by means of copying the original to the new location and then garbage collecting the old. I'm not sure at what level I understood all this at the time, but I know I did have the concept that the sorted result would have exactly the same strings as the original list and that those could exist anywhere, in any order, and only the order in which we referred to them mattered. VARPTR actually gives the programmer the pointer to the storage in BASIC's heap. I spent some time grappling with this concept, mostly by experimenting.

In the end, my father was astonished: the sort now took 20 seconds. He could not fathom what I could have possibly done to achieve a 60-fold increase, especially on top of the 6 times increase I had already achieved. I guess it is no wonder that he never hesitated to use software that I wrote for his businesses and such.

In today's terms, I am now using a 32bit, 550Mhz computer, compared to the 2MHz 8/16bit 8609 of the Coco (it was an 8bit Motorola processor with some 16bit registers). So I'd guess my hardware today is probably at least 500 times faster. Then consider that I'd be using a compiled language today and not BASIC, so that would give me another 20-40 fold increase in speed. So the sort today would take 1 or 2 milliseconds. I guess that sounds about right. Then again, the original 2 hours would only be about 5 seconds today...