Tuesday, March 28, 2006

Generality, Specificity & Computing

Tom Evslin's post about David Isenberg & The Rise of the Stupid Network left me thinking about a design principle that I try to follow. It's also one that I try to impress on the software engineers I mentor. It's really pretty simple:

Find the most general solution possible.

A more general solution is one that applies to a larger domain. Introducing unnecessary specifics always shrinks the solution space: the solution becomes unnecessarily less useful and capable of solving fewer problems.

For example, one might create a software function to calculate the volume of a building. The question to ask is whether a function that calculates the volume of a space can be used instead. A more general implementation might be useful in calculating, say, the volumes of ice cubes. This is the general idea.
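To make the building example concrete, here is a minimal sketch in Python. The function name `box_volume` and the dimensions are hypothetical, chosen only for illustration, and the sketch assumes the space is a simple rectangular box.

```python
def box_volume(length, width, height):
    """Return the volume of any rectangular space.

    Nothing here mentions buildings, so the same function
    serves for rooms, shipping crates, or ice cubes.
    """
    if min(length, width, height) < 0:
        raise ValueError("dimensions must be non-negative")
    return length * width * height


# The original problem: a building measured in metres.
building = box_volume(30.0, 20.0, 12.0)

# A problem nobody had in mind at design time: an ice cube, also in metres.
ice_cube = box_volume(0.03, 0.03, 0.03)
```

Had the function been named for buildings and taken, say, a floor count and a ceiling height, the ice cube would have needed its own code.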

Over the course of years and years of creating software, I have repeatedly run into cases where I needed to solve a problem very similar to one I had already solved, but I was unable to use my original solution because I had introduced all sorts of unnecessary specifics from the original problem domain. "Ack! If I'd written this more generally, I'd be able to use it here too!"

The more I've made a conscious effort to avoid unnecessary specifics and create general solutions, the more I've found myself in situations where my software could be reapplied in domains I'd never even considered when I did the original design, just like the Stupid Network that went on to do all sorts of things beyond the intentions of its designers.

Why do so many unnecessary specifics creep into software design? Part of it may be human nature: my first inclination is to design a solution strictly for a specific problem, and this seems to be the case with the people with whom I've worked. Designing for generality takes a little more time, because it requires extra thought and sometimes some extra work; unfortunately, time is usually a scarce resource.

It might sound silly, but part of the problem is that many computer languages, such as C++, seem to steer engineers naturally toward specifics. This happens because there are other advantages to be gained from being specific; e.g., it's easier to check for errors at compile time, before an application is even run. This is a complaint made by many advocates of "dynamic languages" (that debate needs to be the subject of another post).

Specifics are the bane of some computing problems, especially problems involving a lot of communication. Mr. Evslin's post demonstrates the power of generality over specificity in solving communication problems. In a Stupid Network, general data is organized into generalized packets that are addressed and shipped across a network; this Stupid Network has proved to be much more powerful than so-called "Intelligent" Networks by exploiting this principle.
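As a rough sketch of that idea, the Python below models a network that knows only about addresses and opaque payloads. The names `Packet` and `StupidNetwork` and the in-memory "delivery" are assumptions made for illustration; this is not how any real network stack is implemented.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Packet:
    """A generic packet: an address plus bytes the network never inspects."""
    destination: str
    payload: bytes


class StupidNetwork:
    """Hypothetical in-memory network that only moves packets by address."""

    def __init__(self):
        self._inboxes = {}

    def attach(self, address):
        """Register an endpoint so that packets can be delivered to it."""
        self._inboxes[address] = []

    def send(self, packet):
        """Deliver the payload unchanged; the network has no idea what it is."""
        if packet.destination not in self._inboxes:
            raise KeyError(f"unknown address: {packet.destination}")
        self._inboxes[packet.destination].append(packet.payload)

    def receive(self, address):
        """Return everything delivered to an address and empty its inbox."""
        delivered = self._inboxes[address]
        self._inboxes[address] = []
        return delivered


network = StupidNetwork()
network.attach("point-b")

# Text and the first bytes of an image travel exactly the same way.
network.send(Packet("point-b", "hello".encode("utf-8")))
network.send(Packet("point-b", bytes([0xFF, 0xD8])))
received = network.receive("point-b")
```

Because the network never looks inside a payload, a new kind of data needs no change to the network itself; only the endpoints need to agree on what the bytes mean.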

Most communication problems can be considered even more generally in the form of transportation problems. Communication amounts to the transportation of information. Transportation networks have existed much longer than communication networks, and words like "packet" and "address" have their origins in transportation. Generalizing communication problems as transportation problems allows one to leverage the experience gained over ages spent searching for solutions to transportation problems.

In the real world, there are some companies that specialize in transporting specifics; take, for example, house movers and piano movers. But most of our shipping companies (FedEx, UPS, et al.) ship generalities in the form of packages. Almost exclusively, they ship generic packages that encapsulate all sorts of specifics. As is often the case, there are algorithmic lessons to be learned from a good real-world analogue; in this case, the lesson is the value of generality and encapsulation in the transportation of both information and things.

In my opinion, there's a great deal of room for improvement in computer science when it comes to the development of computing languages and the problem of communication. In the beginning, most of our computing problems were problems of calculation. The need for calculation was the primary impetus. After that came the era of large corporate databases. This created a need for storage, more calculation and some communication (but nothing like what we see today).

With the advent of the Internet, graphical user interfaces (which rely on sending messages) and web applications, the computing problems that seem to be occupying most of our time are problems of communication (aka data transportation). The programming languages and supporting systems we use don't seem to be particularly well suited for this.

Expressed as simply as it can be expressed, these communication problems amount to moving data from point A to point B. This is a succinct definition of the general problem. In theory, it is a much easier problem than it currently is in practice using existing tools. There appears to be a lot of room for generalization and improvement in this area; much of the trouble seems to stem from hassles coming in the form of unnecessary specifics.

