A few key features give XL its particular expressive power. These advanced features combine with simpler ones to give you a programming language that grows as you need it, yet remains simple when you need simple things done.
To the best of my knowledge, these features are original to XL, and were implemented for the first time in the XL compiler (or for some of them in its ancestor, LX, sometimes as long as 5 years ago). But please don't hesitate to let me know if you think otherwise.
Tree-based program representation
XL source code has a standard, tree-based, object-oriented, persistent, compile-time representation, which can be used to implement all sorts of program manipulations (or meta-programs). This representation gives XL an extensibility similar to that of Lisp, but achieved in a very different way. A tree representation can be rendered back into source code, simplifying the implementation of source-to-source code transformations.
- Standard representation means that there is a language-defined representation for every source statement. Therefore, meta-programs are portable between XL implementations. Standardization also implies that the tree representation "makes sense" for a variety of uses, contrary to the typical internal representations used by more traditional compilers.
- Tree-based means that the internal representation of a source program is a familiar abstract syntax tree. It is not, for instance, a text-based or list-based representation. A tree representation simplifies most manipulations.
- Object-oriented means that each tree element has a type, that the types are related through inheritance, and that operations on the trees can use dynamic dispatch based on the tree type. An object-oriented tree representation greatly simplifies several program manipulation algorithms.
- Persistent means that the program representation can be serialized to disk. There is a standard format for programs besides source code. Program elements can be saved to persistent storage between sessions. Individual tools in a workflow can easily run in different processes (or even on different machines, or at different points in time).
- Compile-time means that the program representation is not, by default, preserved at runtime. This is not a limitation, because a plug-in can easily perform a program transformation to turn the compile-time representation into a suitable run-time representation. A typical use would be to keep run-time type information (RTTI) for selected object types.
XL is really defined in terms of its standard representation, not in terms of its syntax. Multiple syntax variants for the same tree representation are possible, and automatic conversion from one to another using rendering engines is easy. One particular flavor of this is automatic reformatting of source code according to specific guidelines.
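As a rough analogy only, Python's standard ast module also exposes a typed tree that can be transformed and rendered back into source. The sketch below is Python, not XL: the RenameVariable class, the rename helper and the sample statement are made up for illustration, and ast.unparse requires Python 3.9 or later.

```python
import ast


class RenameVariable(ast.NodeTransformer):
    """Toy meta-program: rename every occurrence of one identifier."""

    def __init__(self, old, new):
        self.old = old
        self.new = new

    def visit_Name(self, node):
        # NodeTransformer dispatches on the node type (visit_<NodeType>),
        # which plays the role of the object-oriented dispatch on trees.
        if node.id == self.old:
            return ast.copy_location(ast.Name(id=self.new, ctx=node.ctx), node)
        return node


def rename(source, old, new):
    tree = ast.parse(source)                    # source -> tree
    tree = RenameVariable(old, new).visit(tree) # tree -> tree
    return ast.unparse(tree)                    # tree -> source


print(rename("total = price * count + price", "price", "cost"))
```

The three steps in rename (parse, transform, render) are the same shape as a source-to-source transformation on a program tree.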
Pragmas
Pragmas are directives that extend the language in arbitrary ways. They are used for a variety of simple features that require keywords in languages like C or C++. But more generally, pragmas are a way for the language to specify that a compiler plug-in should be applied to a particular program section.
When you write {foozoo}integer X;, the compiler invokes the plug-in responsible for the {foozoo} pragma (located in an implementation-dependent way), and passes the program tree corresponding to the declaration integer X.
Pragmas give a readable way to extend the language beyond its basic capabilities, or to implement specific aspects of the language.
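The idea of a pragma selecting a plug-in that receives a program tree can be sketched in Python, again only as an analogy. Everything here is hypothetical: the PRAGMAS registry, the pragma decorator, apply_pragma, and the behaviour given to foozoo (it follows each assignment with a call to an assumed log function). How XL actually locates plug-ins is implementation-dependent.

```python
import ast

PRAGMAS = {}  # pragma name -> plug-in function (hypothetical registry)


def pragma(name):
    """Register a plug-in under a pragma name."""
    def register(plugin):
        PRAGMAS[name] = plugin
        return plugin
    return register


@pragma("foozoo")
def foozoo(tree):
    # Made-up behaviour: after each simple assignment, add a logging call.
    new_body = []
    for stmt in tree.body:
        new_body.append(stmt)
        if isinstance(stmt, ast.Assign):
            for target in stmt.targets:
                if isinstance(target, ast.Name):
                    new_body.append(ast.parse(f"log({target.id!r})").body[0])
    tree.body = new_body
    return tree


def apply_pragma(name, source):
    plugin = PRAGMAS[name]  # raises KeyError if no plug-in is registered
    return ast.unparse(plugin(ast.parse(source)))


print(apply_pragma("foozoo", "X = 0"))
```

The compiler's part is only the lookup and the call; what the transformation does is entirely up to the plug-in.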
Expression Reduction
Several languages (C++, Ada) offer the possibility to overload
arithmetic operators such as '+' or '/'. The languages I know of limit
overloading to one operator at a time. In some cases, combining operations
results in improved performance (vector and multimedia operations) or precision
(matrix operations). XL offers a general notation for defining
such operator combinations:
function Multiply_Add(vector A, B, C) return vector written A*B+C
vector M, N, O
vector P := M * N + O
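One way to picture such a reduction is as a pattern match on the expression tree: a '+' node whose left operand is a '*' node is replaced by a single call. The sketch below does this on Python's own syntax tree. It is an analogy, not the XL compiler's algorithm; in particular it ignores types, which XL would use to decide whether the written form applies. MultiplyAddReducer and reduce_source are illustrative names.

```python
import ast


class MultiplyAddReducer(ast.NodeTransformer):
    """Rewrite any A * B + C into Multiply_Add(A, B, C)."""

    def visit_BinOp(self, node):
        self.generic_visit(node)  # reduce inner expressions first
        if (isinstance(node.op, ast.Add)
                and isinstance(node.left, ast.BinOp)
                and isinstance(node.left.op, ast.Mult)):
            call = ast.Call(
                func=ast.Name(id="Multiply_Add", ctx=ast.Load()),
                args=[node.left.left, node.left.right, node.right],
                keywords=[],
            )
            return ast.copy_location(call, node)
        return node


def reduce_source(source):
    tree = MultiplyAddReducer().visit(ast.parse(source))
    return ast.unparse(tree)


print(reduce_source("P = M * N + O"))
```

Expressions that do not match the pattern, such as M + N, are left unchanged.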
XL also allows this notation to be used for the parameters to
generic types, in which case the use of multiple operators is even
more useful.
generic [type item; ordered Low, High] type array
    written array[Low..High] of item
type vector is array[1..100] of real
array V[1..10] of integer
A single expression can be reduced twice, once to resolve generic arguments, and once to resolve the expression itself. This makes it possible to have fairly readable construction expressions for data types:
generic [type T] type vector written vector[T]
generic [type item]
function vector(integer Size) return vector[item]
    written vector[Size] of item
// Two reductions at once
vector K[3] of integer
Some of the expression elements need not be parameters of the function being defined:
function IsIdentity(matrix M) return boolean written M = 1
function IsZero(matrix M) return boolean written M = 0
A written form can be used to create implicit conversions:
function Real(integer I) return real written I
real R := 1 // Implicit call to Real(1)
Why?
The operator overloading syntax is easier to read than in most alternatives
(consider how you indicate if operator++ is prefix or postfix in C++).
It is also more general. When applied to generic types, it allows libraries to
define a syntax such as array[A..B] of T. When applied to objects,
it allows optimizations that the compiler cannot know about to be specified by the
programmer. It also enables convenient mathematical notations such as 0 < X < 1.
Named operators
In the above array example, of is not a symbolic operator such as
'+', but a named infix operator. XL allows you to use arbitrary
names for infix operators in expressions.
function And(integer X, Y) return integer written X and Y
integer Bits := Value and Mask
function Rectangle (integer X, Y, Width, Height) return rectangle
    written Width by Height at X row Y
rectangle R := 10 by 20 at 30 row 40
This technique allows one to use an infix function call notation
where appropriate, as in Objective-C or Smalltalk. Such notations are
often (but not always) more readable.
Why?
In addition to basic operators such as A and B, named operators can also be
used for convenience notations such as A between 0 and 1.
Cons
Someone will soon realize that with the proper declarations, the following becomes legal XL:
if you read this then you are an idiot
Type-safe variable argument lists
At least one very frequent operation requires a variable number of
arguments: displaying the results of a program. So far, no language
has found a really elegant solution:
- Pascal has a WriteLn procedure which is practical to use
and efficient, but it is a special case that you cannot write in Pascal itself.
WriteLn('X=', X, 'Y=', Y);
- C has the printf function, which is practical to use and
quite efficient, but is not type safe, causing subtle programming
errors.
printf("X=%d, Y=%d\n", X, Y);
- C++ has the ostream insertion operators (<<) which
are type safe. But using them is not nearly as practical as printf for
complex formatting, and it is quite inefficient (since it involves one
function call per parameter, thus bloating the code a lot).
I also personally find it quite awkward to read.
cout << "X=" << X << ", Y=" << Y << endl;
- Ada and Java solve the problem with string concatenation
operators, which are quite inefficient (one function call per item
plus memory management and string concatenations). And they are not
that readable either, since they use the '+' or '&' operator which
also has other meanings. Last, and more importantly, this solution
only works for text output, not for variable argument lists in general.
System.out.println("X=" + X + ", Y=" + Y);
XL allows you to define a WriteLn procedure yourself, that
behaves exactly like the Pascal WriteLn yet can be defined in
a library and written in XL. This is achieved using the
others keyword, which stands for any number of arguments. A
procedure or function with others parameters is generic, and
the compiler will use it to generate functions with the appropriate
number of arguments.
generic type writeable if
    with writeable W
    WriteIt W
procedure WriteLn(others) is
    Write others
    Write NewLine
procedure Write(writeable W; others) is
    WriteIt W
    Write others
procedure WriteIt(integer I)
procedure WriteIt(real R)
procedure WriteIt(character C)
Note that this definition also makes use of a generic
type named writeable. This type indicates that the
writeable type in the first Write definition can be
replaced by any type for which it is possible to call
WriteIt. With this definition, the XL call looks similar to
the corresponding Pascal call. Formatting is achieved using the
format operator.
WriteLn "X=", X, ", Y=", Y
WriteLn X format "X=###.###", Y format "Y=###.###"
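The recursive structure of Write (handle the first argument, then recurse on the others) can be mimicked in Python with variable positional arguments and functools.singledispatch standing in for the WriteIt overloads. This is a sketch under clear assumptions: write_it, write and write_ln are Python stand-ins, the out parameter is added for testability, and dispatch happens at run time, so an unsupported type fails when the call executes rather than at compile time as it would in XL.

```python
import io
from functools import singledispatch


@singledispatch
def write_it(value, out):
    # Fallback: no WriteIt exists for this type.
    raise TypeError(f"no WriteIt for {type(value).__name__}")


@write_it.register(int)
def _(value, out):
    out.write(str(value))


@write_it.register(float)
def _(value, out):
    out.write(str(value))


@write_it.register(str)
def _(value, out):
    out.write(value)


def write(out, *others):
    if not others:          # nothing left to write
        return
    first, *rest = others
    write_it(first, out)    # WriteIt W
    write(out, *rest)       # Write others


def write_ln(out, *others):
    write(out, *others)
    out.write("\n")


buffer = io.StringIO()
write_ln(buffer, "X=", 1, ", Y=", 2.5)
print(buffer.getvalue(), end="")
```

Adding support for a new type only requires registering one more write_it function, which is the role played by the writeable constraint above.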
The same technique can be used to define a Max function
taking an arbitrary number of arguments of any type with a "less-than"
operator:
generic type ordered if
    with ordered X, Y
    with boolean B := X < Y
function Max(ordered X) return ordered is return X
function Max(ordered X; others) return ordered is
    result := Max(others)
    if result < X then result := X
real A, B, C, D
real E := Max(A, B, C, D)
integer I, J, K := Max(I, J)
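The two Max definitions above translate almost line for line into a recursive Python function. The name max_of is illustrative, and Python checks that '<' is available only when the comparison runs, whereas the XL constraint is checked when the generic is instantiated.

```python
def max_of(x, *others):
    """Largest argument, using only the less-than operator."""
    if not others:               # function Max(ordered X)
        return x
    result = max_of(*others)     # result := Max(others)
    if result < x:               # if result < X then result := X
        result = x
    return result


print(max_of(1.5, 4.0, 2.0, 3.0))
```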
Last, the same technique is also used to define generic types with
variable numbers of arguments:
generic [type item; ordered Low, High; others] type array
    written array[Low..High, others] of item
    is array[Low..High] of array[others] of item
array Matrix[1..5, 1..5] of real
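The definition above peels off the first index range and builds an array of arrays over the remaining ranges. A minimal Python sketch of that recursion follows, using nested lists; make_array and the (low, high) tuple encoding of a range are assumptions made for this example.

```python
def make_array(default, *bounds):
    """Build nested lists: array[Low..High] of array[others] of item."""
    (low, high), *others = bounds
    size = high - low + 1
    if not others:
        return [default] * size
    # One independent sub-array per index of the first dimension.
    return [make_array(default, *others) for _ in range(size)]


matrix = make_array(0.0, (1, 5), (1, 5))
print(len(matrix), len(matrix[0]))
```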
True generic types
In C++, it is possible to define template functions, for instance
generic algorithms that apply to a variety of types. However, template
parameters have to be repeated over and over, although for a family of
algorithms they often tend to be identical.
template <class T>
T min(T a, T b) {
    if (a < b) return a; else return b;
}
template <class T>
T max(T a, T b) {
    if (a > b) return a; else return b;
}
XL introduces the idea of true generic types. Generic types,
parameterized or not, can be used directly in parameters or return
types as well as in the definition of type interfaces or in generic
argument lists. In all cases, they implicitly make the corresponding
declaration generic, with the corresponding parameter. This can make
generic code significantly smaller.
generic type ordered
function Min(ordered A, B) return ordered is
    if A < B then return A; else return B
function Max(ordered A, B) return ordered is
    if A > B then return A; else return B
Within a single declaration, every occurrence of a given generic type
denotes the same type. The two instances of ordered in the Min function
above always correspond to the same type.
The same is also true for parameterized generic types. An instance
of the name of such a type without the corresponding parameters makes
the declaration generic on all the parameters of the type. In that
case, the parameters of the type can be referred to using a dotted
notation, as if they were fields of the generic type.
generic [type item] type list
function First(list L) return list.item
function Last(list L) return list.item
Constrained generic types
Generic types facilitate the declaration of generic algorithms. But
they also can make them more precise. Generic types can be constrained
using the if keyword to specify an interface that the generic
type must follow. The constraint is generally a small piece of code
that is supposed to compile for any value of the generic type. Generic
instantiation will fail otherwise.
generic type ordered if
    // Indicate that 'ordered' requires a boolean "less-than"
    with ordered A, B
    with boolean C := A < B
function Min(ordered A, B) return ordered
integer X, Y, Z := Min(X, Y) // OK
record T, U, V := Min(T, U) // Error: no less-than for records
A generic parameter can 'derive from' a generic type, indicating
that it inherits its constraints.
generic [type row like ordered; type column like ordered] type map
map M[integer, character] // OK, constraints satisfied
map N[integer, record] // Error: constraints not satisfied
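The notion of a constraint as "a small piece of code that must compile" can be imitated in Python by actually trying the code and treating failure as a rejected instantiation. This is a run-time approximation of what XL checks at instantiation time; is_ordered, min_of and the Record class are hypothetical names for this sketch.

```python
def is_ordered(a, b):
    """Constraint for 'ordered': a < b must work and yield a boolean."""
    try:
        return isinstance(a < b, bool)
    except TypeError:
        return False


def min_of(a, b):
    if not is_ordered(a, b):
        raise TypeError(f"no less-than for {type(a).__name__}")
    return a if a < b else b


class Record:
    """Stand-in for a record type with no ordering defined."""


print(min_of(3, 2))
try:
    min_of(Record(), Record())
except TypeError as error:
    print("rejected:", error)
```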
Predicated generic specialization
C++ offers template specialization and partial
specialization. Specialization is used to define special cases for
template instantiation.
template <class T> class vector; // Template
template <class T> class vector<T *>; // Partial specialization
template <> class vector<bool>; // Specialization
In addition to such "structural" specializations, XL also supports
specializations based on predicates, which enables specializations
that would require intermediate 'facet' helper classes in
C++. Predicated generic specialization makes the intent much clearer.
generic [type T] type vector
generic [type T] type vector for vector[pointer to T] // structural
generic type vector for vector[boolean] // structural
generic [type T] type vector when size(T) = size(integer) // predicated
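Selecting a specialization by predicate can be pictured as a list of (predicate, implementation) pairs tried in order. The Python sketch below uses ctypes types so that a size-based predicate is meaningful. It is an assumption-laden illustration: the SPECIALIZATIONS list, the string labels and the "most specific first" ordering are choices made here, not a description of how XL ranks candidate specializations.

```python
import ctypes

SPECIALIZATIONS = []  # (predicate, label), tried in registration order


def specialize(predicate, label):
    SPECIALIZATIONS.append((predicate, label))


# Structural: exactly the boolean type.
specialize(lambda t: t is ctypes.c_bool, "vector[boolean]")
# Predicated: any type with the same size as an integer.
specialize(lambda t: ctypes.sizeof(t) == ctypes.sizeof(ctypes.c_int),
           "word-sized vector")
# General case: always applies.
specialize(lambda t: True, "general vector")


def select_vector(t):
    """Return the label of the first specialization whose predicate holds."""
    for predicate, label in SPECIALIZATIONS:
        if predicate(t):
            return label
    raise LookupError("no specialization applies")


print(select_vector(ctypes.c_uint))
```

The predicate reads directly as the condition the programmer has in mind, with no helper class needed to encode it.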
Usable generic types
The combination of expression reduction and
type-safe variable argument lists makes it
possible to define generic types that behave much like the built-in
types of languages such as Pascal.
array A[1..5] of integer
array B[1..3, 'A'..'C'] of real
A[1] := 3
B[2, 'C'] := 2.5
It also makes it possible to combine the declaration of a complex
type with its initialization in one single expression:
generic [type T] type vector written vector[T]
generic [type item]
function vector(integer Size) return vector[item]
    written vector[Size] of item
procedure Test() is
    with vector K[3] of integer
In the above example, the vector K[3] of integer declaration contains
both a static piece of information (vector of integer) and a dynamic one (vector[3]).
In that case, the first expression reduction invokes the vector function with the type
integer for item and the value 3 for Size.
In turn, this causes the instantiation of the vector[integer] type.
Notice that the compiler can disambiguate the two uses of the brackets between
vector[3] and vector[integer] based only on the written forms that have been given.