Wednesday, March 10, 2010

Hard Typing in Qore 0.8.0

Qore 0.8.0 development is at an advanced stage. One of the major features of this release will be the addition of hard typing. This is an interesting subject because it's not always obvious how hard typing should be added to a language that previously was entirely dynamically typed.

Several issues have presented themselves so far. One is how to handle variables declared with a certain type but not assigned any value. Qore's syntax for typed variable declarations looks like this:
my int $x;

One of the reasons for adding hard (aka strong) typing to the language is that is allows many more errors to be caught at parse time that otherwise can only be caught at run time. Consider the following code:
sub func(int $i) {

my int $x;

The dilemma is - should the typed variable declaration cause $x to get a default value for the type, or should the value of $x be undefined (or actually in Qore: NOTHING) and therefore still cause a run time type error to be thrown. I've just checked how perl6 should handle this, and it appears that they take both approaches - as above with a lower-case "int", $x will get the default value for the type (0), with an upper-case "Int" the value will be undefined.

Qore's current implementation in svn (subject to change) is that $x holds NOTHING and a run time type exception will be thrown when func(int) is called.

Qore currently contains some other logic that allows such code to work:
my list $l;
$l += "str";

This will cause $l to be a list with "str" as the sole element; so even though $l was not initialized with a value in the declaration, the += operator still behaves as expected.

The idea is to allow the programmer to check if variables have been assigned a value or not. When we had Qore assigning default values in every declaration without an assignment, we found that it broke a lot of code that we otherwise expected to work.

Furthermore, some types cannot have a default value, such as objects (at least for classes that do not have a constructor that takes no arguments) and for the reference and code types. In order to make type handling consistent, we decided that variables and object members will only get values if they are explicitly assigned.

Now that I'm writing this down, it seems to me that the best solution would be to implement some kind of magic to cover the run time error above and pass the default value for the type to func(int) above - a solution similar to the one implemented for the += operator. That would be great, because then we can avoid run time type errors even though we've declared all our types (something hard typing should give us) and still be able to tell if we've assigned the variable or not. Otherwise we could modify the parser to track if a variable's been assigned or not when it's used, however this would not be a perfect solution because the parser cannot always know this with certainty (halting problem), so in some cases run time type errors would still occur.

It wasn't totally clear to me what would happen in Perl6 if an undefined Int value is passed to a function that expects an Int as a value.

Another issue is making the included function and class library friendly for strongly-typed code. There are a lot of functions and class methods that can return multiple values depending on the argument types. Since qore 0.8.0 in svn now supports overloading, we simply tag each variant with its parameter types and return type (note that an overloaded version of a function or method is called a variant in Qore). However there are cases when the return type depends on the values of the arguments, and not just the types (for example, if you pass a string with an invalid URL to the parseURL() function, it will return NOTHING). In these cases we're writing alternate, new versions of the functions that either throw an exception or return a default value for the return type. In this last example, there is now a parse_url() function that always returns a hash if the URL can be parsed, otherwise an exception is thrown.

Many of Qore's functions simply return NOTHING if the arguments are not of the expected type. In Qore 0.8.0, these functions have a default, untyped variant mapped to f_noop(), a function that simply returns NOTHING. All other possibilities are mapped to the actual functions that perform the useful work that needs to be done. This allows qore to recognize function calls that do not make sense and report that they return NOTHING at parse time (as long as types are declared in all the relevant places). An example of this type of function is parseURL(), a new version, parse_url() has been added that only accepts types that can produce a result.

Most of the other issues that have come up have been solved satisfactorily, method overloading in class trees (implemented successfully), default arguments (implemented for user and builtin code), matching variants at parse time and at run time (Qore tries to match the variant at parse time if types are available, otherwise variant matching is performed at run time).

However another issue is that the type system is very simple and flat; you may declare a variable of type list, but you cannot restrict the element type of the list. The same with hash keys and references (you cannot currently declare a variable as a reference to a particular type). Also it's currently not possible to declare code references to code with a particular signature (the new builtin type code "code" is used to restrict lvalues to being assigned a closure or call reference).

It's all subject to improvement before the 0.8.0 release. The libsmoke-based qt4 module has been converted to use hard typing and has been a very valuable test bed for the hard typing implementation.

Qore in svn is currently stable- if you want to check out qore with hard typing, grab the source from svn and compile it. Qore in svn is also currently compiling on windows with cygwin, although without modules and only as a monolithic binary.

To download qore from svn, use the following command:

svn co qore

Comments are very much appreciated!