The XL Touch

A few key features give XL its particular expressive power. These advanced features combine with simpler ones to give you a programming language that grows as you need it, yet remains simple when you need simple things done.

To the best of my knowledge, these features are original to XL, and were implemented for the first time in the XL compiler (or for some of them in its ancestor, LX, sometimes as long as 5 years ago). But please don't hesitate to let me know if you think otherwise.


Tree-based program representation

XL source code has a standard, tree-based, object-oriented, persistent, compile-time representation, which can be used to implement all sorts of program manipulations (or meta-programs). This representation gives XL an extensibility similar to that of Lisp, but achieved in a very different way. A tree representation can be rendered back into source code, which simplifies the implementation of source-to-source code transformations.

  • Standard representation means that there is a language-defined representation for every source statement. Therefore, meta-programs are portable between XL implementations. Standardization also implies that the tree representation "makes sense" for a variety of uses, contrary to the typical internal representations used by more traditional compilers.
  • Tree-based means that the internal representation of a source program is a familiar abstract syntax tree. It is not, for instance, a text-based or list-based representation. A tree representation simplifies most manipulations.
  • Object-oriented means that each tree element has a type, that the types are related through inheritance, and that operations on the trees can use dynamic dispatch based on the tree type. An object-oriented tree representation greatly simplifies several program manipulation algorithms.
  • Persistent means that the program representation can be serialized to disk. There is a standard format for programs besides source code, so program elements can be saved to persistent storage between sessions. Individual tools in a workflow can then easily run in different processes (or even on different machines, or at different points in time).
  • Compile-time means that the program representation is not, by default, preserved at runtime. This is not a limitation, because a plug-in can easily perform a program transformation to turn the compile-time representation into a suitable run-time representation. A typical use would be to keep run-time type information (RTTI) for selected object types.

XL is really defined in terms of its standard representation, not in terms of its syntax. Multiple syntax variants for the same tree representation are possible, and automatic conversion from one to another using rendering engines is easy. One particular flavor of this is automatic reformatting of source code according to specific guidelines.
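
As a rough illustration of what an object-oriented tree that can be rendered back into source might look like, here is a minimal C++ sketch. It is not XL's actual representation: the class names Tree, Name and Infix and the helper functions are assumptions made for this example, and the renderer ignores operator precedence.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>

// Hypothetical node classes; XL's actual tree types are not shown on this page.
struct Tree {
    virtual ~Tree() = default;
    // Dynamic dispatch on the node type: each node renders itself as source.
    virtual std::string render() const = 0;
};

struct Name : Tree {
    std::string value;
    explicit Name(std::string v) : value(std::move(v)) {}
    std::string render() const override { return value; }
};

struct Infix : Tree {
    std::string op;
    std::unique_ptr<Tree> left, right;
    Infix(std::string o, std::unique_ptr<Tree> l, std::unique_ptr<Tree> r)
        : op(std::move(o)), left(std::move(l)), right(std::move(r)) {
        assert(left && right);  // an infix node always has two children
    }
    std::string render() const override {
        return left->render() + " " + op + " " + right->render();
    }
};

// Convenience constructors (names are illustrative).
std::unique_ptr<Tree> name(const std::string& v) {
    return std::unique_ptr<Tree>(new Name(v));
}
std::unique_ptr<Tree> infix(const std::string& op, std::unique_ptr<Tree> l,
                            std::unique_ptr<Tree> r) {
    return std::unique_ptr<Tree>(new Infix(op, std::move(l), std::move(r)));
}

// Builds the tree for A * B + C and renders it back to text.
std::string render_sample() {
    auto tree = infix("+", infix("*", name("A"), name("B")), name("C"));
    return tree->render();
}
```

A different rendering engine would be another set of render-like operations over the same nodes, which is the sense in which several syntax variants can share one tree.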

Pragmas

Pragmas are directives that extend the language in arbitrary ways. They are used for a variety of simple features that require keywords in languages like C or C++. But more generally, pragmas are a way for the language to specify that a compiler plug-in should be applied to a particular program section.

When you write {foozoo}integer X;, the compiler invokes the plug-in responsible for the {foozoo} pragma (located in an implementation-dependent way), and passes the program tree corresponding to the declaration integer X.

Pragmas give a readable way to extend the language beyond its basic capabilities, or to implement specific aspects of the language.
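
The dispatch from a pragma name to a plug-in can be pictured with the following C++ sketch. Everything in it is hypothetical: the PragmaRegistry class, the use of a plain string as a stand-in for a program tree, and the choice to leave a declaration unchanged when no plug-in is registered are assumptions for illustration, not a description of how the XL compiler locates plug-ins.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <string>
#include <utility>

// Stand-in for a program tree; a real compiler would pass tree nodes instead.
using Tree = std::string;
using Plugin = std::function<Tree(const Tree&)>;

class PragmaRegistry {
public:
    void add(const std::string& pragma, Plugin plugin) {
        assert(static_cast<bool>(plugin));  // reject empty plug-ins
        plugins_[pragma] = std::move(plugin);
    }

    // Applies the plug-in registered for `pragma` to a declaration.
    // Assumption: an unknown pragma leaves the declaration unchanged.
    Tree apply(const std::string& pragma, const Tree& declaration) const {
        auto it = plugins_.find(pragma);
        if (it == plugins_.end()) return declaration;
        return it->second(declaration);
    }

private:
    std::map<std::string, Plugin> plugins_;
};

// Registers a made-up transformation for {foozoo} and applies it.
Tree apply_foozoo(const Tree& declaration) {
    PragmaRegistry registry;
    registry.add("foozoo", [](const Tree& d) { return "foozoo(" + d + ")"; });
    return registry.apply("foozoo", declaration);
}
```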

Expression Reduction

Several languages (C++, Ada) offer the possibility to overload arithmetic operators such as '+' or '/'. The languages I know of limit themselves to individual unary and binary operators. In some cases, combining operations results in improved performance (vector and multimedia operations) or precision (matrix operations). XL offers a general notation for defining such operator combinations:

function Multiply_Add(vector A, B, C) return vector written A*B+C
vector M, N, O
vector P := M * N + O
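
For comparison, getting a similar effect in C++ takes an intermediate proxy type. The sketch below is only an analogy under stated assumptions: the vec and product types and the fused_calls counter are invented for the example, and a bare M * N without a following + is deliberately left unsupported to keep it short.

```cpp
#include <cassert>

struct vec { double x, y, z; };

// Counts fused calls so the reduction can be observed.
int fused_calls = 0;

vec multiply_add(const vec& a, const vec& b, const vec& c) {
    ++fused_calls;
    return vec{a.x * b.x + c.x, a.y * b.y + c.y, a.z * b.z + c.z};
}

// Proxy for a product that has not been evaluated yet.
struct product {
    const vec& a;
    const vec& b;
};

product operator*(const vec& a, const vec& b) { return product{a, b}; }

// A * B + C: the pending product is folded into one multiply_add call.
vec operator+(const product& p, const vec& c) { return multiply_add(p.a, p.b, c); }

void multiply_add_example() {
    vec m{1, 2, 3}, n{4, 5, 6}, o{7, 8, 9};
    vec p = m * n + o;  // one call to multiply_add, no temporary product vector
    assert(p.x == 11 && p.y == 18 && p.z == 27);
}
```

The written form in XL states the same intent in one line, without the helper type.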

XL also allows this notation to be used for the parameters to generic types, in which case the use of multiple operators is even more useful.

generic [type item; ordered Low, High] type array
    written array[Low..High] of item

type vector is array[1..100] of real
array V[1..10] of integer

A single expression can be reduced twice, once to resolve generic arguments, and once to resolve the expression itself. This makes it possible to have fairly readable construction expressions for data types.

generic [type T] type vector written vector[T]

generic[type item]
function vector(integer Size) return vector[item]
    written vector[Size] of item

// Two reductions at once
vector K[3] of integer

Some of the expression elements need not be parameters of the function being defined:

function IsIdentity(matrix M) return boolean written M = 1
function IsZero(matrix M) return boolean written M = 0

A written form can be used to create implicit conversions:

function Real(integer I) return real written I
real R := 1     // Implicit call to Real(1)

Why? The operator overloading syntax is easier to read than most alternatives (consider how you indicate whether operator++ is prefix or postfix in C++). It is also more general. When applied to generic types, it allows libraries to define a syntax such as array[A..B] of T. When applied to objects, it lets the programmer specify optimizations that the compiler cannot know about. It also enables convenient mathematical notations such as 0 < X < 1.

Named operators

In the above array example, of is not an operator symbol but a named infix operator. XL allows you to use arbitrary names as infix operators in expressions.

function And(integer X, Y) return integer written X and Y
integer Bits := Value and Mask

function Rectangle (integer X, Y, Width, Height) return rectangle
    written Width by Height at X row Y
rectangle R := 10 by 20 at 30 row 40

This technique allows one to use an infix function call notation where appropriate, as in Objective-C or Smalltalk. Such notations are often (but not always) more readable.

Why? In addition to basic operators such as A and B, named operators can be used for convenience notations such as A between 0 and 1.

Cons: Someone will soon realize that with the proper declarations, the following becomes legal XL:

if you read this then you are an idiot

Type-safe variable argument lists

At least one very frequent operation requires a variable number of arguments: displaying the results of a program. So far, no language has found a really elegant solution:

  • Pascal has a WriteLn procedure which is practical to use and efficient, but it is a special case that you cannot write in Pascal itself.
    WriteLn('X=', X, 'Y=', Y);
    
  • C has the printf function, which is practical to use and quite efficient, but is not type safe, causing subtle programming errors.
    printf("X=%d, Y=%d\n", X, Y);
    
  • C++ has the ostream insertion operator (<<), which is type safe. But using it is not nearly as practical as printf for complex formatting, and it is quite inefficient (since it involves one function call per parameter, thus bloating the code a lot). I also personally find it quite awkward to read.
    cout << "X=" << X << ", Y=" << Y << endl;
    
  • Ada and Java solve the problem with string concatenation operators, which are quite inefficient (one function call per item plus memory management and string concatenations). And they are not that readable either, since they use the '+' or '&' operator which also has other meanings. Last, and more importantly, this solution only works for text output, not for variable argument lists in general.
    System.out.println("X=" + X + ", Y=" + Y);
    

XL allows you to define a WriteLn procedure yourself, that behaves exactly like the Pascal WriteLn yet can be defined in a library and written in XL. This is achieved using the others keyword, which stands for any number of arguments. A procedure or function with others parameters is generic, and the compiler will use it to generate functions with the appropriate number of arguments.

generic type writeable if
   with writeable W
   WriteIt W
procedure WriteLn(others) is
   Write others
   Write NewLine
procedure Write(writeable W; others) is
   WriteIt W
   Write others
procedure WriteIt(integer I)
procedure WriteIt(real R) 
procedure WriteIt(character C)   

Note that this definition also makes use of a generic type named writeable. This type indicates that the writeable type in the first Write definition can be replaced by any type for which it is possible to call WriteIt. With this definition, the XL call looks similar to the corresponding Pascal call. Formatting is achieved using the format operator.

WriteLn "X=", X, ", Y=", Y
WriteLn X format "X=###.###", Y format "Y=###.###"
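
Readers more familiar with C++ may find it useful to compare the others recursion with variadic templates, which were added to C++ in 2011, after this page was written. The sketch below mirrors the WriteLn, Write and WriteIt structure above; the names write_ln, write_items and write_it and the explicit stream parameter are choices made for this example. It does not cover the format operator.

```cpp
#include <cassert>
#include <ostream>
#include <sstream>
#include <string>

// One overload per writable type, like the WriteIt procedures above.
void write_it(std::ostream& out, int i)         { out << i; }
void write_it(std::ostream& out, double r)      { out << r; }
void write_it(std::ostream& out, char c)        { out << c; }
void write_it(std::ostream& out, const char* s) { out << s; }

// Base case: no arguments left.
void write_items(std::ostream&) {}

// Writes the first argument, then recurses on the others.
template <typename First, typename... Others>
void write_items(std::ostream& out, const First& first, const Others&... others) {
    write_it(out, first);
    write_items(out, others...);
}

template <typename... Args>
void write_ln(std::ostream& out, const Args&... args) {
    write_items(out, args...);
    out << '\n';
}

void write_ln_example() {
    std::ostringstream out;
    write_ln(out, "X=", 3, ", Y=", 2.5);
    assert(out.str() == "X=3, Y=2.5\n");
}
```

Passing a value of a type with no write_it overload is rejected at compile time, which is the type safety that printf lacks.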

The same technique can be used to define a Max function taking an arbitrary number of arguments of any type with a "less-than" operator:

generic type ordered if
    with ordered X, Y
    with boolean B := X < Y
function Max(ordered X) return ordered is return X
function Max(ordered X; others) return ordered is
    result := Max(others)
    if result < X then result := X

real A, B, C, D
real E := Max(A, B, C, D)
integer I, J, K := Max(I, J)
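
The same recursion over the remaining arguments can be written with C++ variadic templates. This is a sketch for comparison only: the name max_of is made up here, and the code assumes all arguments have the same type (mixed types would be converted to the type of the first argument).

```cpp
#include <cassert>

// One argument: it is the maximum.
template <typename T>
T max_of(const T& x) { return x; }

// Several arguments: take the maximum of the others, then compare with x.
template <typename T, typename... Others>
T max_of(const T& x, const Others&... others) {
    T result = max_of(others...);
    if (result < x) result = x;
    return result;
}

void max_of_example() {
    assert(max_of(3, 9, 4) == 9);
    assert(max_of(1.5, 0.5, 2.5, 2.0) == 2.5);
}
```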

Last, the same technique is also used to define generic types with variable numbers of arguments:

generic [type item; ordered Low, High; others] type array
    written array[Low..High, others] of item
is array[Low..High] of array[others] of item

array Matrix[1..5, 1..5] of real

True generic types

In C++, it is possible to define template functions, for instance generic algorithms that apply to a variety of types. However, template parameters have to be repeated over and over, although for a family of algorithms they often tend to be identical.

template <class T>
T min(T a, T b) {
    if (a < b) return a; else return b;
}

template <class T>
T max(T a, T b) {
    if (a > b) return a; else return b;
}

XL introduces the idea of true generic types. Generic types, parameterized or not, can be used directly in parameters or return types as well as in the definition of type interfaces or in generic argument lists. In all cases, they implicitly make the corresponding declaration generic, with the corresponding parameter. This can make generic code significantly smaller.

generic type ordered
function Min(ordered A, B) return ordered is
    if A < B then return A; else return B
function Max(ordered A, B) return ordered is
    if A > B then return A; else return B

Within a single declaration, every occurrence of a given generic type denotes the same type. The two instances of ordered in the Min function above always correspond to the same type.

The same is also true for parameterized generic types. An instance of the name of such a type without the corresponding parameters makes the declaration generic on all the parameters of the type. In that case, the parameters of the type can be referred to using a dotted notation, as if they were fields of the generic type.

generic [type item] type list
function First(list L) return list.item
function Last(list L) return list.item
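
The closest C++ counterpart to the dotted notation is a nested type name such as value_type, which standard containers provide. In the sketch below, the function names first and last are chosen for the example, and the template parameter still has to be spelled out, which is exactly the repetition that XL avoids.

```cpp
#include <cassert>
#include <list>
#include <vector>

// `typename List::value_type` plays the role of XL's list.item.
template <typename List>
typename List::value_type first(const List& l) {
    assert(!l.empty());
    return l.front();
}

template <typename List>
typename List::value_type last(const List& l) {
    assert(!l.empty());
    return l.back();
}
```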

Constrained generic types

Generic types facilitate the declaration of generic algorithms. But they also can make them more precise. Generic types can be constrained using the if keyword to specify an interface that the generic type must follow. The constraint is generally a small piece of code that is supposed to compile for any value of the generic type. Generic instantiation will fail otherwise.

generic type ordered if
    // Indicate that 'ordered' requires a boolean "less-than"
    with ordered A, B
    with boolean C := A < B

function Min(ordered A, B) return ordered

integer X, Y, Z := Min(X, Y)    // OK
record T, U, V := Min(T, U)     // Error: no less-than for records
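
A comparable check can be approximated in C++11 with std::enable_if and a small detection trait. The sketch is an analogy, not XL's mechanism; the names is_ordered, min_of and record are assumptions made for the example.

```cpp
#include <cassert>
#include <type_traits>
#include <utility>

// Primary template: by default a type is not considered ordered.
template <typename T, typename = void>
struct is_ordered : std::false_type {};

// Chosen only when `a < b` compiles and yields something convertible to bool.
template <typename T>
struct is_ordered<T, typename std::enable_if<std::is_convertible<
    decltype(std::declval<const T&>() < std::declval<const T&>()),
    bool>::value>::type> : std::true_type {};

// min_of only takes part in overload resolution for ordered types.
template <typename T>
typename std::enable_if<is_ordered<T>::value, T>::type
min_of(const T& a, const T& b) { return a < b ? a : b; }

struct record { int field; };  // no less-than defined

void min_of_example() {
    static_assert(is_ordered<int>::value, "int has a less-than");
    static_assert(!is_ordered<record>::value, "record has none");
    assert(min_of(3, 2) == 2);
    // min_of(record{1}, record{2});  // would not compile: constraint not met
}
```

The XL constraint states the requirement once, next to the type, where the C++ version spreads it across a trait and each function signature.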

A generic parameter can 'derive from' a generic type, indicating that it inherits its constraints.

generic [type row like ordered; type column like ordered] type map
map M[integer, character]       // OK, constraints satisfied
map N[integer, record]          // Error: constraints not satisfied

Predicated generic specialization

C++ offers template specialization and partial specialization. Specialization is used to define special cases for template instantiation.

template <class T> class vector;        // Template
template <class T> class vector<T *>;   // Partial specialization
template <> class vector<bool>;         // Specialization

In addition to such "structural" specializations, XL also supports specializations based on predicates, which enables specializations that would require intermediate 'facet' helper classes in C++. Predicated generic specialization makes the intent much clearer.

generic [type T] type vector
generic [type T] type vector for vector[pointer to T]     // structural
generic          type vector for vector[boolean]          // structural
generic [type T] type vector when size(T) = size(integer) // predicated
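
To show the kind of helper machinery a predicate requires in C++, here is a sketch using std::enable_if on an extra defaulted template parameter. The vector_kind template and its name() member are invented for the example. The predicate excludes pointers explicitly, because otherwise it would be ambiguous with the pointer specialization on platforms where pointers and int have the same size.

```cpp
#include <cassert>
#include <cstring>
#include <type_traits>

// Primary template; the second parameter exists only to carry predicates.
template <typename T, typename Enable = void>
struct vector_kind {
    static const char* name() { return "general"; }
};

// Structural: any pointer type.
template <typename T>
struct vector_kind<T*, void> {
    static const char* name() { return "pointer"; }
};

// Structural: exactly bool.
template <>
struct vector_kind<bool, void> {
    static const char* name() { return "boolean"; }
};

// Predicated: any non-pointer type with the size of int.
template <typename T>
struct vector_kind<T, typename std::enable_if<
    !std::is_pointer<T>::value && sizeof(T) == sizeof(int)>::type> {
    static const char* name() { return "integer-sized"; }
};

void vector_kind_example() {
    assert(std::strcmp(vector_kind<int>::name(), "integer-sized") == 0);
    assert(std::strcmp(vector_kind<int*>::name(), "pointer") == 0);
    assert(std::strcmp(vector_kind<bool>::name(), "boolean") == 0);
}
```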

Usable generic types

The combination of expression reduction and type-safe variable argument lists makes it possible to define generic types that behave just as built-in types do in languages such as Pascal.

array A[1..5] of integer
array B[1..3, 'A'..'C'] of real

A[1] := 3
B[2, 'C'] := 2.5

It also makes it possible to combine the declaration of a complex type with its initialization in one single expression:

generic [type T] type vector written vector[T]

generic[type item]
function vector(integer Size) return vector[item]
    written vector[Size] of item

procedure Test() is
    with vector K[3] of integer

In the above example, the vector K[3] of integer declaration contains both a static piece of information (vector of integer) and a dynamic one (vector[3]). In that case, the first expression reduction invokes the vector function with the type integer for item and the value 3 for Size. In turn, this causes the instantiation of the vector[integer] type. Notice that the compiler can disambiguate the two uses of the brackets between vector[3] and vector[integer] based only on the written forms that have been given.


Copyright Christophe de Dinechin
First published Feb 17, 2000
Version 1.6 (updated 2002/12/19 14:35:14)