Tuesday, October 31, 2006

Musing: Types as Constraints

Designing programming languages isn't my area of expertise, but that fact alone isn't enough to keep me from musing about it.

Seeing data types as constraints

A consideration with any programming language is the data typing system supported. In statically-typed languages, variables are strictly defined with immutable types such as integer, string, real, etc. These statically-typed languages tend to be compiled, and the type of the variable is determined at compile time.

In dynamically-typed languages, variables assume a mutable type determined during program execution. In contrast to a statically-typed language, the type of a variable can change in a dynamically-typed language. Generally, either one typing strategy or the other is employed; languages tend to be either statically-typed or dynamically-typed with little middle ground in-between.

Unfortunately, neither strategy is generally superior. There are software development challenges where static typing is the preferable approach and there are other challenges where dynamic typing yields greater benefits.

The point of this post is to comment on a thought I've been mulling over for a long time: the consideration of types as constraints. Googling for "types as constraints" (and "types as sets") is enough to show me I'm not original in this line of thought. Not original, but possibly not crazy either.

I've always seen type presented as a fundamental and irreducible concept, but I think there is utility in simply considering type as a constraint on a variable rather than an essential property. From this perspective, the only truly essential property of a variable is its capacity to assume various values for a set of possible values.

By viewing type as a mere constraint, I think it may be possible to build a sensible logical framework that offers a clean conceptual bridge between dynamic-typing and static-typing. (At this point, I've yet to investigate the newly open-sourced Strongtalk in any detail, but I hope to do so in the near future.)

Consider the union of integers + reals + strings. This set can be considered the set of all possible values for a variable in a dynamically-typed language. Given the perspective of type-as-a-constraint, a declaration of type in a statically-typed language merely constrains a variable to one of three subsets: integers, reals or strings.

It should be easy to see that booleans, shorts, longs, floats, doubles, enumerations, bit fields and ranged subtypes (as in Ada) can all be seen as subsets of some set of all possible values (e.g., integers + reals + strings). Likewise regular expressions might be seen as constraints as well.

Musing my way along, here's a quick stab at syntax proposal, with three declarations in a hypothetical language in which types are viewed as constraints.

variable A;
variable B : { integer };
variable C : { integer, range(1,20) };

In the first case, the variable A is completely unconstrained. It can assume any member of the set integers + reals + strings. In the second case, the variable B is restricted to the subset integers. In the final example, the range(1, 20) constraint further restricts variable C to a subset of integers, specifically the integers 1, 2, 3 ... 20.

Such a language could not only support standard intrinsic constraints associated with basic types (i.e., integer, string, real, etc.), but also an extensible system of constraints. Newly defined constraints could come in the form of boolean functions and the invocation of exception handling could be handled by the language compiler or intepreter.


variable D : { positive_integer };

where positive is a boolean function returning true (note: below const-ness is also expressed in the form of a constraint)

bool positive_integer(v : { integer, constant })
return (v > 0);

It is implied that the parametric constraints (integer, constant) are preconditions for the newly derived constraint.

There are many tradeoffs involved with various systems of typing. Generally, the more strictly typed a variable is, the greater the opportunities are for compile time error detection, performance optimization, resource optimization, etc. Although you lose many of those benefits with dynamic-typing, you do gain in terms of other benefits such as flexibility and fewer execution paths.

Given a language in which types are expressed as constraints, one could leave a variable unconstrained when the freedom of dynamic languages is desired (or the development task doesn't require highly constrained rigor). However, in cases where performance and rigor are essential, variables could be as strictly defined as in a language such as Ada. Conceptually, I think the approach could be a cohesive strategy for bridging the chasm between statically-typed and dynamically-typed languages.


Post a Comment

Subscribe to Post Comments [Atom]

<< Home