Tuesday, February 21, 2012

The Design of Hammer: Part IV

The war between dynamic and static type systems is almost as old as the idea of programming languages itself. Hell, I get into one of these debates at least once a month. The result is often a stalemate, in which the combatants agree that each philosophy is useful in its niche. When you want to exercise your creativity without being hindered by petty rules, dynamic typing is the way to go. On the other hand, when you want your program to run fast and crash less, static typing is a must.

In the previous post I stated that, when faced with a trade-off between safety and convenience, Hammer will always err on the side of safety. The trend continues here.

First, I want to clarify some terms. When I say that a type system is strong, I mean that it gives each value one type and one type only. If a language, at any point, causes a value to be cast, coerced, or converted into a different type without the programmer's explicit request, then that language has a weak type system. PHP, JavaScript, Ruby, and early flavors of C fall into this category. The general consensus is that weak typing is almost always a bad idea, because it leads to subtle, hard-to-find bugs.

When I say that a type system is static, I mean that it performs type checking prior to executing code. A dynamic type system checks types when the program is already running, so that it has more information with which to determine whether to indicate an error. Unfortunately, this also means that values in the language need to be tagged with their types in some way and that processing cycles are stolen from the actual program for validation duty. Some cleverness in language implementations can minimize these costs, but it is impossible to entirely remove the overhead of dynamic typing.

Static type systems have their own problems. Type checking without run-time information will always reject some programs that are actually valid. The number of false negatives can be reduced by making the type system more expressive, but a "perfect" type system is impossible to create, being equivalent to a general solution of the halting problem. More expressive type systems are also often more complex. Despite these disadvantages, a sound and expressive static type system is a must for Hammer.

Robin Milner once said: "Well-typed programs don't go wrong." Indeed, a sound type system can completely eliminate a whole host of common and nasty bugs. Languages like ML and Haskell never have to deal with any run time type mismatch errors. With extra cleverness, it is even possible to guarantee things like:
  • You will never attempt to access the head of an empty list.
  • You will try to multiply two matrices A and B if and only if A : R^{p * m}B : R^{m * q} and the result C : R^{p * q}.
  • Every red-black tree you construct will satisfy the color invariants.
  • Many, many more.
Note that none of the guarantees of the type system incur run-time overhead. This is very important for a language like Hammer, which does not have garbage collection and thus would not benefit from a "boxed" representation of data. Hammer, like C, deals with raw integers, floats, and pointers. Because types have no "baggage", we get optimal performance and simple FFI marshaling.

Most important, the type system constrains side effects. Types become reliable documentation for exactly what a function will do. One can be sure that foo : Int -> Int is harmless, whereas launch_missiles : IO () warrants some inspection.

No comments: