The modern approach to teaching computer science is rather unfortunate. If the goal is a practical one, to teach languages and conventions that are used in corporate shops, then Java as a first language makes sense. But if the goal is understanding of computer science and programming concepts, then alternative languages might be more effective. I just noticed Jeff Atwood recently wrote an excellent post on this very subject.
If anything, the average C, Java, or .NET programmer would benefit from increased familiarity with alternative languages. For simple GUI applications or quick data processing scripts, something like wxPython or Perl have many advantages and it's important to know why.
I've been lucky to have been exposed to quite a few of the more obscure programming languages. In this set of posts I want to explore some of my favorites and the features that make them unique. Today I'm starting with Objective Caml.
Objective Caml
For a year-long independent study program my senior year of college, I undertook an ambitious project to be implemented in OCaml. Caml is a remarkable language developed in France with origins in the lovely but obscure world of ML. As an aside, you might want to look at this family tree of computer language history.
ML was a functional language developed in 1973 for the purposes of automated theorem proving. OCaml, though far removed from these origins, incorporates many of ML's early innovations. My favorite being type inference, which may be familiar to Haskell and recent .NET users.
In other statically typed languages, like C, the type of variables must be declared. With type inference, the compiler is smart enough to figure this out on it's own by analyzing variable use within code. Consider this OCaml snippet:
#let i = 4.0 *. 2.2
val i : float = 8.8
The compiler (or interpreter in this case) inferred the type of i was float. It infers this based on the float multiplication operator (*.), which takes two floats and returns a float. Consider another example, a function called add1:
# let add1 x = x + 1;;
val add1 : int -> int = <fun>
# add1 10;;
- : int = 11
The type system has inferred that this function, add1, accepts an integer and returns an integer. What happens if you try to violate this typing by passing add1 a float value?
# add1 10.0;;
This expression has type float but is here used with type int
OCaml complains and the function fails. Just as in C or Java, the type of all variables is known at runtime, but in OCaml it is not necessary to declare these types in advance (the compiler knows).
How about this:
# let lessthan x y = x < y;;
val lessthan : 'a -> 'a -> bool = <fun>
This function takes two variables of the strange type 'a and returns a boolean. The less-than operator can make comparisons between two values of any type (as long as they're both the same), this is described as polymorphic. The "any type" concept is represented by 'a. OCaml inferred the type signature of this function based on it's definition (x < y) and the types accepted by the less than operator.
# lessthan 10 11;;
- : bool = true
# lessthan 10.0 5.0;;
- : bool = false
The following fails because the function arguments are of different (inconsistent) types:
# lessthan 10.0 5;;
This expression has type int but is here used with type float
The difference between type inference and dynamic typing is subtle but important. In a dynamically typed language, variables are typed during runtime as they're used. This means their type can change during execution, giving the programmer some flexibility. Static typing forces variables to use the same kind of data throughout and the types of all variables must be determined at compile time.
Pattern Matching
Another interesting and practical feature of OCaml is pattern matching. Think of this as the mutant big-brother of switch-statements. Consider an OCaml function:
# let is_empty x = match x with
[] -> true
| head :: tail -> false;;
val is_empty : 'a list -> bool = <fun>
OCaml inferred that this function takes a list of any type ('a list) and returns a boolean. Lists in OCaml are expressed with brackets and list items are separated by semicolons, as in: [1; 2; 3;]. The pattern matching works on the structure of the list. If the list is empty (ie. []) then the function returns true. If the list is not empty, then return false. I'll explain how this second test works in a moment.
# is_empty [];;
- : bool = true
# is_empty [1; 2; 3;];;
- : bool = false
The '::' in the function above is a special operator known as cons and is generally used to manipulate lists. I think about it as constructing or deconstructing a list (or appending, if you prefer), as in this code:
# 1 :: [2; 3;];;
- : int list = [1; 2; 3;]
Pattern matching lets you describe the structure of a list and take action based on that structure. When employed with recursion, pattern matching really shines. Here's a function to find the length of a list that uses both tail-recursion and the cons operator:
# let rec length x = match x with
[] -> 0
| head :: tail -> 1 + length tail;;
val length : 'a list -> int = <fun>
# length [1; 2; 3; 4;];;
- : int = 4
OCaml inferred that length takes a list of any type and returns an integer. The cons operator breaks the list into a head (the first item) and a tail (the rest of the list). It then returns 1 + the length of the tail. When the list is empty, zero is returned, recursion halts and the calculation completes.
Additional Notes
OCaml is one of my favorite programming languages. It's also very practical. Thanks to type inference and static typing, it effects some benefits of a dynamically typed language, but with runtime performance closer to Java or C (though still quite a bit slower). Pattern matching is a powerful tool for processing lists and symbols. It also boasts extensive documentation and a surprising amount of Internet resources.