LX Compared to Other Languages

Christophe de Dinechin
Version 1.5 (updated 2002/01/23 16:15:22)

For many programmers, seeing a language in action through a few simple examples is the best way to get a general idea of what it can do. Below are some of the interesting characteristics of LX, roughly from the simplest to the most advanced. Each feature comes with a short description, an example, and an indication of why it is useful or necessary. This list is only intended to show where LX is most likely to differ from other common languages such as C++. It is not a specification.

Nor do we discuss in depth here what really sets LX apart: the numerous ways to extend the language.

Indentation sensitive syntax, like Occam or Python
Reduced usage of punctuation characters
Style-insensitive names: open_file = OpenFile
Digit grouping in numbers: 1_000_000
Based numbers: integral and real numbers in bases 2 to 36, such as 16#FFF0_000E
Expression reduction: multi-way operator overloading, for expressions such as A+B*C
Named operators: Infix binary operators with arbitrary names
Programming by contract as in Eiffel: assertions, preconditions and postconditions
Input and output parameters to functions and procedures
Named and out-of-order arguments
Result parameter in functions
Type-safe variable argument lists: defining a Pascal-like WriteLn
Regular data types: no int keyword
Data inheritance: Building data types by aggregation
Variant records: A more compact equivalent of union
Tagged variant records: Validation of overlapping fields in a data structure
Type interface: Making the interface of a type independent of its implementation
Properties: controlled access to data fields
Logical inheritance: Inheritance does not require a shared data representation
Constructors and destructors: Safety in constructing and destructing objects, with greater flexibility than in C++
Multiple kinds of pointers: optional garbage collection, address arithmetic or persistence
Real modular programming: no #include
The body keyword: Avoiding repeating function arguments in definitions
The using statement: making field and module access shorter
No private interfaces: hidden information is really hidden
Customizable for loop: for loops use iterator objects, that the user may redefine
Customizable case statement: The case statement (equivalent to C's switch) can accept any data type.
Pragmas rather than keywords for implementation details specification
Disciplined exception handling
Function-based dynamic dispatch: Solving the weak-base-class problem
Dynamic objects: Objects that are always allocated dynamically and garbage collected.
Multi-way dynamic dispatch ("multi-methods")
Forwarding and delegation: making another object responsible for unknown operations
True generic types: facilitate the definition of generic (template) algorithms
Constrained generic types: early validation of generic (template) instantiations
Predicated generic specialization: complex specializations made easier
Usable generic types: constructed array type behaves like built-in array in other languages
Standard library: An extensive library covering containers, algorithms, user interface or networking.
Reflection: user-defined compiler and language extensions

Indentation-sensitive syntax

Most programming languages use keywords or special characters (such as '{' and '}') to delimit blocks. This makes it possible for the indentation to no longer reflect the actual structure of the program, as in the following C++ code:

if (A == B);
    printf("A==B\n");

LX, on the other hand, uses the indentation to indicate the structure of blocks. The indentation is written with either only spaces or only tabs. Mixing the two in the same source file is not allowed.

if A = B then
    WriteLn "A=B"

Indentation in LX remains very flexible. It does not prevent grouping multiple statements on a line, nor does it prevent splitting a long line in the middle. LX has terminating characters and keywords (';' and 'end'), but they are optional when indentation allows them to be deduced.

-- Single line compound statement
if N = 0 then return 1; else return N * fact(N-1); end

-- Multi-line expressions and optional end
while ptr <> nil and ptr.size > 0 and ptr.array[ptr.size] <> 0
  and ptr.array[ptr.size-1] <> 0
  and ptr.array[0] <> ptr.array[ptr.size]
  loop
     Write "A=", A,
           "B=", B
     ptr := ptr.next
  end -- 'end' is optional here, deducible from indentation

Why? Code is read many more times than it is written. Indentation is the easiest way for the human brain to visualize program structure. So forcing indentation to be kept in sync with program structure helps maintenance.

Reduced usage of punctuation

No keyword or character is required to open or close a block, since indentation is used to delimit the block:

procedure TableOfFactorials() is
    with integer N, I; real Fact
    WriteLn "--- Table of factorials ---"
    for N in 1..100 loop
        Fact := 1.0
        for I in 2..N loop Fact *= I
        WriteLn "Factorial of ", N,
                " is ", Fact
    WriteLn "--- End -------------------"

The semi-colon ';' is required to separate statements or declarations only if they are on the same line. Otherwise, a line break can act as an instruction end. Instructions or blocks can span multiple lines, or be on the same line separated with semi-colons (see the for loops above). An instruction can be split between lines using indentation. Invoking a procedure, such as WriteLn, does not require parentheses around the arguments.

Why? The punctuation is unnecessary, and causes additional visual clutter. In addition, on many non-US keyboard layouts, typical punctuation characters such as { or } are difficult to type.

Style-insensitive names

To most non-programmers, JohnSmith, john_smith and JOHN_SMITH all denote the same person. As programmers, we unlearn this identity to make them different. When looking up a name, LX ignores case and separating underscore '_' characters (two consecutive underscores are not allowed).

-- There is only one 'file' and one 'open_file' below
function OpenFile(string Name, Mode) return file
FILE F := open_file("foo.dat", "r")
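
The lookup rule described above can be modelled outside LX as a normalization step applied to every name before it is compared. The following Python fragment is only a sketch of that idea; normalize_name is a hypothetical helper introduced for illustration, not part of any LX tool.

```python
def normalize_name(name):
    """Return the key a style-insensitive lookup could use for `name`.

    Assumptions (taken from the description above): case is ignored,
    single separating underscores are dropped, and two consecutive
    underscores are rejected.
    """
    if "__" in name:
        raise ValueError("consecutive underscores are not allowed: %r" % name)
    return name.replace("_", "").lower()


# All three spellings map to the same key, so they denote the same entity.
assert normalize_name("OpenFile") == normalize_name("open_file")
assert normalize_name("JohnSmith") == normalize_name("JOHN_SMITH")
```

With such a key, a symbol table indexed by the normalized name contains a single entry for OpenFile, open_file and OPEN_FILE.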

On the other hand, name overloading allows you to reuse the same name for different entities when the name will not be used in the same context.

type rectangle
function Rectangle(integer X, Y, Width, Height) return rectangle
rectangle Rectangle := Rectangle(10, 10, 20, 20)

In addition, LX, like any Mozart-based language, requires a renderer, which can be used to present code to programmers with their preferred style.

Why? Style preferences cause religious reaction from many otherwise reasonable programmers. Style insensitivity allows reuse of libraries while maintaining a consistent style in your own code.

Cons Entities that would be distinguished by case in the real world can no longer be distinguished this way, for instance V for volume and v for speed in physics, where relying on case is common and reasonable practice.

Digit grouping

As processor architectures move toward ever increasing bit sizes (32, then 64 bits), large numbers become difficult to decipher.

long long million = 1000000;
long long billion = 1000000000;
double pi = 3.1415926535897932384626433;

LX allows you to use a separating underscore '_' between digits to make them easier to read:

integer million := 1_000_000
integer billion := 1_000_000_000
real pi := 3.14159_26535_89793_23846_26433;

Why? Large constants are getting more common with larger processor word sizes.

Based numbers

On today's architectures, the most used bases are 10, 2 and 16. Octal is rarely used. Yet C has a short notation for base 8, a longer notation for base 16, and no notation at all for base 2. Worse, these notations do not apply to floating-point numbers, making the encoding of numbers such as MAX_FLT awkward. Last, a few mathematical algorithms are best written with bases such as 3 or 7.

LX offers a general notation for based numbers, that allows any base between 2 and 36, and applies to floating-point numbers as well.

integer Large := 16#FFFF_FFFF
integer Twenty_Seven := 3#1000
real MAX_FLT is 2#1.1111_1111_1111_1111_1111_111#E127

Why? This solution is general and easy to read. It doesn't create syntactic ambiguities as other notations would. LX parses 0x0 as 0 x 0 (infix x operator), and 03H-1 as 03 H -1.
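
To make the notation concrete, here is a small Python sketch that evaluates literals written in this style. It is not the LX scanner: parse_based is a hypothetical name, and the meaning of the exponent (a decimal power of the base, as in Ada based literals) is an assumption inferred from the MAX_FLT example.

```python
import string

# '0'..'9' then 'A'..'Z' give the digit values 0..35
DIGITS = string.digits + string.ascii_uppercase


def parse_based(literal):
    """Evaluate a literal such as 16#FFFF_FFFF or 2#1.1#E3 (illustrative only)."""
    text = literal.replace("_", "").upper()        # digit grouping is ignored
    if "#" not in text:                            # plain decimal number
        return float(text) if "." in text else int(text)
    base_text, _, rest = text.partition("#")
    base = int(base_text)
    if not 2 <= base <= 36:
        raise ValueError("base must be between 2 and 36")
    mantissa, _, exponent_text = rest.partition("#E")
    exponent = int(exponent_text) if exponent_text else 0
    whole, dot, fraction = mantissa.partition(".")
    value = 0
    for digit in whole + fraction:
        digit_value = DIGITS.index(digit)          # ValueError on a bad character
        if digit_value >= base:
            raise ValueError("digit %r is not valid in base %d" % (digit, base))
        value = value * base + digit_value
    if not dot and exponent >= 0:
        return value * base ** exponent            # stays an exact integer
    # Assumption: the exponent scales by a power of the base, not of 10.
    return value * float(base) ** (exponent - len(fraction))
```

For instance, parse_based("3#1000") evaluates to 27 and parse_based("2#1.1#E3") to 12.0 under the stated assumption.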

Expression Reduction

Several languages (C++, Ada) offer the possibility to overload arithmetic operators such as '+' or '/'. The languages I know of limit themselves to binary operators. In some cases, combining operations results in improved performance (vector and multimedia operations) or precision (matrix operations). LX offers a general notation for defining such operator combinations:

function Multiply_Add(vector A, B, C) return vector written A*B+C
vector M, N, O
vector P := M * N + O

LX also allows this notation to be used for the parameters to generic types, in which case the use of multiple operators is even more useful.

generic [type item; ordered Low, High] type array
    written array[Low..High] of item

type vector is array[1..100] of real
array V[1..10] of integer

Some of the arguments of binary operators need not be parameters of the function being defined:

function IsIdentity(matrix M) return boolean written M = 1
function IsZero(matrix M) return boolean written M = 0

A written form can even be used to enable implicit conversions:

function Real(integer I) return real written I
real R := 1     -- Implicit call to Real(1)

Why? The operator overloading syntax is easier to read than in most alternatives (consider how you indicate if operator++ is prefix or postfix in C++). It is also more general. When applied to generic types, it allows libraries to define a syntax such as array[A..B] of T. When applied to objects, it allows optimizations that the compiler cannot know about to be specified by the programmer. It also enables convenient mathematical notations (0 < X < 1).

Named operators

In the above array example, of is not a keyword, but a named infix operator. LX allows you to use arbitrary names for infix operators in expressions.

function And(integer X, Y) return integer written X and Y
integer Bits := Value and Mask

function Rectangle (integer X, Y, Width, Height) return rectangle
    written Width by Height at X row Y
rectangle R := 10 by 20 at 30 row 40

This technique allows one to use an infix function call notation where appropriate, as in Objective-C or Smalltalk. Such notations are often (but not always) more readable.

Why? In addition to basic operators such as A and B, it also can be used for convenience notations such as A between 0 and 1.

Cons Someone will soon realize that with the proper declarations, the following becomes legal LX:

if you read this then you are an idiot

Programming by Contract

Bertrand Meyer's Eiffel language formalized the idea of programming by contract. LX allows you to define preconditions, postconditions and internal assertions in your objects. The compiler may either verify these predicates, or use them for better optimizations.

function Factorial(integer N) return real
    assume N >= 0
    ensure out Result >= in N

procedure Display() is
    with integer I
    for I in 1..100 loop
        with real F := Factorial(I)
        assert F > 0.0
        WriteLn "F = ", F

Note: the keyword for preconditions may change to 'require', following a suggestion that 'assume' makes the responsibilities of the caller and callee less clear.

Why? Programming by contract improves the maintainability and robustness of code, and it can improve optimizations too.
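
For readers who have not used Eiffel, the Factorial contract above can be approximated in a language without contract support by checking the predicates at run time. The Python sketch below is only an analogy: it checks the contract on every call, whereas an LX compiler may also verify it statically or use it for optimization.

```python
def factorial(n):
    """Factorial as a float, with the contract checked at run time."""
    # Precondition (the LX 'assume' clause): responsibility of the caller
    assert n >= 0, "precondition violated: N >= 0"
    result = 1.0
    for i in range(2, n + 1):
        result *= i
    # Postcondition (the LX 'ensure' clause): responsibility of the function
    assert result >= n, "postcondition violated: Result >= N"
    return result
```

Calling factorial(-1) fails at the precondition, which points at the caller rather than at the body of the function.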

Input and output parameters

C and C++ functions only have input arguments passed by value. Other kinds of argument passing conventions have to be simulated using pointers. As the underlying architecture changes, the compiler cannot take advantage of wider registers.

In the following code, assuming short is 16 bits, it makes sense to pass a pointer when the size of pointers is 16 bits.

struct Rect {
   short Top, Left, Bottom, Right;
};
void CopyRect(const Rect *from, Rect *to)

When pointers are 32 bits wide, on many RISC architectures, copying the struct from registers would be faster than accessing it from memory. On a 64-bit architecture, where the whole structure fits in a single register, having to use pointers and memory accesses becomes more expensive than passing the struct directly in a register, but the compiler is not allowed to do that.

In LX, the programmer specifies if the parameters are used for input, output or input/output to the function or procedure. As a result, the compiler is free to select the best compromise for passing the arguments, depending on the target architecture.

type rect is record with
    integer16 Top, Left, Bottom, Right
procedure CopyRect(in rect From; out rect To)

On a 64-bit architecture, one register will be passed as input, and one as output, with no memory access. C will mandate two 64-bit pointers as input, with at least one memory load and one store inside the function. On a 16-bit architecture, LX will probably use the same approach as C, passing two pointers. Note that since an LX compiler may use different methods for passing arguments, input arguments are read-only (in C and C++, the local copy in the function can be modified.)

Output arguments are constructed in the procedure, and deleting them is the responsibility of the caller, except if the callee exits with an exception. A program is invalid if there is a way for a procedure to exit without having initialized all of its output parameters. A compiler is free to check this at compile time or at run time. In general, if output arguments have complex constructors, they should be initialized before anything else is done.

Why? Specifying the intent improves readability and maintainability, and gives the compiler the information it needs to make the right optimization.

Named and out-of-order arguments

Some functions require very large parameter lists. Calls to these functions tend to become difficult to read. In C++, the problem is often worked around by passing objects that encapsulate parameters, but in some cases this is not practical.

control_register DiskController = control_register(
    0xEFFF42D0, 32, 16, 0xFFFFFFFF, true, true);

LX allows the call site to specify the name of the arguments, in which case the order of the arguments may be different than the order of the parameters.

control_register DiskController := control_register(
        Address: 16#EFFF_42D0,
        Register_Size: 32,
        Memory_Access_Size: 16,
        Bit_Mask: 16#FFFF_FFFF,
        Read_Enable: true,
        Write_Enable: true)

Result parameter in functions

C and C++ pass the result of functions by copy. Just like passing all incoming arguments by value, this can be inefficient. Some compilers perform an optimization known as the Named Return Value (NRV) optimization, but the ability to perform this optimization is very dependent on the coding style in the function. And many compilers just don't do this optimization at all.

In LX, the return value of a function is named result. Actually, there is practically no difference between the two following declarations:

function F(integer N) return integer
procedure F(out integer result; integer N)

In many cases, this allows a more compact coding of the function:

function Factorial(integer N) return real is
    with integer I
    Result := 1.0
    for I in 2..N loop Result *= I

There is, naturally, also a return statement, which can be useful for early termination of a function:

function Factorial(integer N) return real is
    if N = 0 then return 1.0
    return N * Factorial(N-1)

If the returned data type has a constructor, then it is not possible for the function to return an uninitialized value. From that point of view, result is considered like any output parameter.

Type-safe variable argument lists

At least one very frequent operation requires a variable number of arguments: displaying the results of a program. So far, no language has found a really elegant solution:

Pascal has a WriteLn procedure which is practical to use and efficient, but is a special case that you cannot write in Pascal itself.
```
WriteLn('X=', X, 'Y=', Y);
```
C has the printf function, which is practical to use and quite efficient, but is not type safe, causing subtle programming errors.
```
printf("X=%d, Y=%d\n", X, Y);
```
C++ has the ostream insertion operators (<<), which are type safe. But using them is not nearly as practical as printf for complex formatting, and it is quite inefficient (since it involves one function call per parameter, thus bloating the code a lot). I also personally find it quite awkward to read.
```
cout << "X=" << X << ", Y=" << Y << endl;
```
Ada and Java solve the problem with string concatenation operators, which are quite inefficient (one function call per item plus memory management and string concatenations). And they are not that readable either, since they use the '+' or '&' operator which also has other meanings.
```
System.out.println("X=" + X + ", Y=" + Y);
```

LX allows you to define a WriteLn procedure yourself, that behaves exactly like the Pascal WriteLn yet can be defined in a library and written in LX. This is achieved using the others keyword, which stands for any number of arguments. A procedure or function with others parameters is generic, and the compiler will use it to generate functions with the appropriate number of arguments.

generic type writeable if
   with writeable W
   WriteIt W
procedure WriteLn(others) is
   Write others
   Write NewLine
procedure Write(writeable W; others) is
   WriteIt W
   Write others
procedure WriteIt(integer I)
procedure WriteIt(real R) 
procedure WriteIt(character C)

Note that this definition also makes use of a generic type named writeable. This type indicates that the writeable type in the first Write definition can be replaced by any type for which it is possible to call WriteIt. With this definition, the LX call looks similar to the corresponding Pascal call. Formatting is achieved using the format operator.

WriteLn "X=", X, ", Y=", Y
WriteLn X format "X=###.###", Y format "Y=###.###"

The same technique can be used to define a Max function taking an arbitrary number of arguments of any type with a "less-than" operator:

generic type ordered if
    with ordered X, Y
    with boolean B := X < Y
function Max(ordered X) return ordered is return X
function Max(ordered X; others) return ordered is
    result := Max(others)
    if result < X then result := X

real A, B, C, D
real E := Max(A, B, C, D)
integer I, J, K := Max(I, J)
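
The recursion behind the two Max definitions may be easier to follow in a language where the "rest of the arguments" is an ordinary value. The Python sketch below mirrors the LX code line by line; max_of is a hypothetical name, and the important difference is that Python resolves the recursion at run time, whereas the LX compiler generates one function per argument count.

```python
def max_of(first, *others):
    """Largest argument, for any type that supports the '<' operator."""
    if not others:
        # Corresponds to: function Max(ordered X) return ordered is return X
        return first
    # Corresponds to: result := Max(others)
    result = max_of(*others)
    # Corresponds to: if result < X then result := X
    if result < first:
        result = first
    return result
```

As in the LX version, only the less-than operator is required of the arguments, so max_of(2.5, 1.0, 4.0) and max_of("b", "a") both work.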

Last, the same technique is also used to define generic types with variable numbers of arguments:

generic [type item; ordered Low, High; others] type array
    written array[Low..High, others] of item
is array[Low..High] of array[others] of item

array Matrix[1..5, 1..5] of real

Regular data types

LX has no special data type using keywords, such as int in C and C++. All data types obey the same rules. Predefined data types such as integer are simply defined by the compiler implicitly. They are actually extracted from the LX.BUILT_IN module. Therefore, the integer name can be used like any other name, for instance to define a conversion function from an arbitrary precision big_integer type.

type big_integer;
function Integer(big_integer Z) return integer

When a type is used in a declaration, only one word of the type is on the left of the declared name. The remainder of the type is on the right of the declared name.

type Proc is procedure(integer N)
procedure P(integer N)

type Rec is record with integer N
record R with integer N

type Vector5 is array[1..5] of integer
array V5[1..5] of integer

Data inheritance

LX builds data types by aggregation, in a way similar to single inheritance in C++. Additional data fields are added to existing types. A predefined empty data type, record, is used to create data records.

type point is record with real X, Y
type point_3D is point with real Z

Variant records

C and C++ have union types. The syntax is not very practical since it allows only one declaration for each union member. To get an integer followed by either a real or two shorts or four chars, one has to write:

struct VariantStruct {
   int I;
   union {
      float F;
      struct { short S1, S2; } Shorts;
      struct { char C1, C2, C3, C4; } Chars;
   } U;
};
VariantStruct S;
char C4 = S.U.Chars.C4;

LX uses tagged records to allow the declaration of such data structures, where multiple declarations can be placed under each of the conditions.

type variant_record is record with
    integer I
    when true:  real  F
    when true:  short S1, S2
    when true:  char C1, C2, C3, C4;
    when true:
       char exponent
       char mantissa1, mantissa2, mantissa3;
    when true:
       bitmask bits[32]
variant_record S
char C4 := S.C4

Another form of variant records can be declared in LX and not easily in C or C++: records in which a data member declaration involves previous data members. For instance:

type TwoStrings is record with
    unsigned Size1
    array    String1[1..Size1] of character
    unsigned Size2
    array    String2[1..Size2] of character

Tagged variant records

In C or C++, it is not possible to really control the access to fields in a variant record. LX makes this possible by adding conditions in the tags of a variant record. For instance, in the VariantStruct example above, it is possible to control when each of the fields is accessed depending on the value of I:

type variant_record is record with
    integer I
    when I >= 0:        real  F
    when I in 3..27 :   short S1, S2
    when true:          char  C1, C2, C3, C4

The guard conditions are used as assertions when accessing each of the fields. The conditions need not be mutually exclusive.
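
The effect of the guards can be pictured as an assertion wrapped around each field access. The Python class below is a rough run-time model of the F, S1 and S2 fields only; it does not reproduce the overlapping storage, and the class and helper names are hypothetical.

```python
class VariantRecord:
    """Run-time model of guarded fields; storage overlap is not modelled."""

    def __init__(self, i, f=0.0, s1=0, s2=0):
        self.i = i
        self._f = f
        self._s1 = s1
        self._s2 = s2

    def _guard(self, condition, field):
        # The guard plays the role of the 'when' condition in the LX record
        if not condition:
            raise AssertionError(
                "field %s is not accessible when I = %r" % (field, self.i))

    @property
    def f(self):
        self._guard(self.i >= 0, "F")            # when I >= 0
        return self._f

    @property
    def s1(self):
        self._guard(3 <= self.i <= 27, "S1")     # when I in 3..27
        return self._s1

    @property
    def s2(self):
        self._guard(3 <= self.i <= 27, "S2")     # when I in 3..27
        return self._s2
```

With I equal to 5, both F and S1 are accessible, which illustrates that the conditions need not be mutually exclusive; with I equal to 1, reading S1 fails.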

Type interface and properties

In C and C++, the interface of a type is its implementation. In other words, the following data type necessarily contains 4 floating point values.

struct complex { float Re, Im, Rho, Theta; };

LX allows you to define the interface of a type independently of its actual implementation. For instance, it is possible to provide read-only Rho and Theta fields in a complex type.

-- Interface of the type
type complex with
    real Re, Im
    out real Rho, Theta

-- Implementation of the type (normally hidden to the user)
type complex is record with
    real Re, Im

-- Implementation of the "missing" data fields (properties)
function Rho(complex Z) return real written Z.Rho is
    return Sqrt(Z.Re^2 + Z.Im^2)
function Theta(complex Z) return real written Z.Theta is
    return Atan2(Z.Im, Z.Re)
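
Several other languages offer a comparable separation between stored fields and computed, read-only ones. As a point of comparison only, here is how the same complex type could look with Python properties; the class is a sketch written for this comparison and is not generated from the LX code.

```python
import math


class Complex:
    """Stores Re and Im; Rho and Theta are computed on each access."""

    def __init__(self, re, im):
        self.re = re
        self.im = im

    @property
    def rho(self):
        # Same formula as the LX Rho function: Sqrt(Re^2 + Im^2)
        return math.hypot(self.re, self.im)

    @property
    def theta(self):
        # Same formula as the LX Theta function: Atan2(Im, Re)
        return math.atan2(self.im, self.re)
```

Because no setter is defined, assigning to rho or theta raises an AttributeError, which is the counterpart of declaring these fields as out in the LX interface.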

Logical inheritance

In C++, inheritance also implies a common internal representation. Therefore, it is not possible to create a Big_Integer class that would behave like an Integer class if it doesn't also share its data members.

In LX, the property of "behaving like" (or "being a") is called logical inheritance, and is independent of the underlying representation. One way of achieving logical inheritance is through data inheritance, in which case base-slicing will occur as in C++:

type shape
type rectangle like shape is shape with integer X, Y, W, H

But the data types can also be totally unrelated. In that last case, conversions must be provided between the derived and base types.

type integer
type big_integer like integer is record with
    unsigned Size
    array    Bytes[1..size] of byte
function Integer(big_integer I) return integer

Logical inheritance indicates that the derived type can be used as input argument to any function that would take the base object. It is also the basis for dynamic function selection in the case of dynamic dispatch, the LX replacement for C++ virtual functions.

Multiple logical inheritance is allowed. Multiple conversion functions must be provided in that case, for each of the base classes. C++ style multiple inheritance by aggregation can trivially be implemented using records, but more sophisticated schemes using more complex conversion functions can also be used.

Constructors and destructors

Constructors and destructors are special functions used to create and destroy objects. C++ makes heavy use of constructors and destructors, mostly because of its very weak memory management model inherited from C. While they are less useful in LX, constructors and destructors are available.

A constructor is a function with the same name as the type it returns. In any scope where a constructor is present, any object of the type must be constructed using a constructor. A constructor taking no argument is called the default constructor, and is used for object declarations with no initialization. If there is a constructor, but no default constructor, default initialization of objects is not valid.

type complex
function Complex(real Re, Im) return complex -- constructor
procedure P() is
    with complex Z := Complex(1.0, 3.0)
    with complex I       -- Error: not initialized

Contrary to C++, constructors can be defined anywhere and involve other constructors. For instance, the following is legal in LX (notice the use of a nested function to create a default constructor):

type complex
function Complex(real Re, Im) return complex -- constructor
procedure P() is
    with function Complex() return complex is return Complex(0.0, 1.0)
    with Complex I      -- OK: Uses the local default constructor

Constructors that take a single argument of a different type are called conversion constructors. In the case of logical inheritance, the constructor converting to the base class is invoked implicitly by the compiler.

As indicated previously, LX functions define a parameter named result, which needs to be initialized when there are constructors. If there is a constructor but no default constructor, the very first statement of the function must be a return statement or initialize result using an assignment.

function I() return complex is
    return Complex(0.0, 1.0)
function J() return complex is
    result := Complex(0.0, -1.0)
function K() return complex is
    -- Error: no default constructor, first statement not an init
    Write "Hello"
    return Complex (3.5, 7.2)

If there are multiple output arguments, the first N statements of the procedure or function must initialize each of the output arguments in their declaration order. The exceptions to this rule are constructors declared in the same scope as the type they construct, or other constructors if none is declared in the type scope.

Destructors

Destructors in LX are procedures called Delete and taking one input argument of the given type. They are invoked implicitly by the compiler when any variable of the given type exits a given scope.

type vector
function vector(unsigned Size) return vector
procedure delete(vector V)

procedure P() is
    with vector V := vector(270)
    -- 'delete V' implicitly called at P exit

Like constructors, destructors can be defined locally. In that case, though, the local destructor implicitly ends with a call to the global destructor it hides.

type vector
function vector(unsigned Size) return vector
procedure delete(vector V)

procedure P() is
    with procedure delete(vector V) is
        -- ... some specific code
        -- the global delete V is invoked here
    with vector V := vector(3)
    -- ... other code
    -- The local delete V (and in turn the global one) is invoked
    -- on exit from P() here

Multiple kinds of pointers

Contrary to C and C++, pointers are not built-in entities. Just like arrays, they are constructed generic types. The LX library offers multiple pointer variants, corresponding to different pointer usages, and allowing the compiler to make reasonable optimization assumptions.

In general, all these pointer types are made largely unnecessary in application code, because of LX dynamic objects, which should be used for building complex data structures.

Pointer types that may be offered by the LX library include:

References behave exactly like the object they point to. The dereferencing operation is implicit. Assigning to a reference assigns to the referred object, however initialization initializes the reference. Changing the reference target is done with the 'Rename' procedure.

integer I := 3, J := 4
reference R to integer := I   -- Create the reference
reference Q to integer := J
R := 5
Write "I=", I                 -- Will write 5, not 3
R := Q                        -- Assigns 4, the value of J, to I through R
R := 7                        -- Modifies I, not J
Rename J, R                   -- Set the reference R to point to J
R := 8                        -- Modify J through R
Write "J=", J                 -- Will write 8

Addresses behave almost exactly like C and C++ pointers. In particular, arithmetic can be performed on them as in C or C++. Dereferencing addresses must be explicit, using the prefix '*' operator. Creating addresses must also be explicit (via an address constructor). Like in C, uninitialized memory can be allocated and freed manually, and converted to any address type. The allocation function is called raw_alloc and takes a number of bytes as an argument. The deallocation procedure is called free.
```
array A[1..5] of integer
address Ptr of integer := address(A[1])
Ptr += 3        -- With some luck, points to A[4] :-)
*Ptr := 5
Ptr := raw_alloc(1000)
free Ptr
```
Pointers are normally used for applications that want to explicitly control allocation and deallocation. They are assumed by the compiler to never point to anything but dynamically allocated memory (in particular, they should not point to global or local variables). However, there is an explicit conversion from address types, which can be used at your own risk. Pointers have a special value, null, used to indicate that they point to nothing. Pointers are initialized with a null value by default. Pointers must be explicitly dereferenced with *P. Accessing fields is done with the dot '.' operator.
Memory accessed by pointers is allocated by the alloc function, which takes an object argument used to initialize the memory. Memory is freed by the free procedure, which also resets the pointer to null. free can apply to a null pointer and does nothing in this case. The destructor of pointers doesn't free the memory.
```
procedure P() is
    with
        complex I
        pointer P to complex    -- Initialized with NULL
        pointer Q to complex
    P := alloc(I)               -- Allocate and initialize
    Q := P                      -- Two pointers to same object
    P.Re := 78.25               -- Implicit dereference with '.'
    *P := I + I                 -- Replace pointed object
    free P                      -- Freeing memory (and set P to null)
    P := pointer(address(I))    -- OK: explicitly taking address
    P := pointer(I)             -- Error: can't point to "stack"
    P := P + 3                  -- Error: no arithmetic
```
Automatic pointers are normally used for temporary dynamically allocated memory. They behave like pointers, with the following differences: their destructor calls free; they are set to null whenever copied to another pointer; if copied from another pointer, free is invoked on the original value. Copy from a regular pointer or an address requires an explicit conversion. Copy to a regular pointer is allowed.
```
procedure Swap(big_blob A, B) is
    with auto_ptr P to big_blob := alloc(B)
    B := A
    A := *P
    -- The allocated memory is freed here automatically
```

Access types are used for objects allocated in garbage-collected memory. Access types behave mostly like pointers, but the allocation function is called new and there is no need to invoke free (although free can still be used to set an access type to null.)

type person is access to record with
   string       name, first_name
   person       father, mother

function person(string name, first_name;
                person father := NULL, mother := NULL) return person
                is
   result := new(person)
   result.name := name
   result.first_name := first_name
   result.father := father
   result.mother := mother

function CreateJohnDoe() return person is
   return person("Doe", "John",
                 person("Doe", "Jack"),
                 person("Duh", "Jane"))

Accesses are the building blocks for creating dynamic objects.

Real modular programming

C and C++ offer rudimentary modular programming through the #include preprocessor directive. This puts a heavy burden on the programmer, who in particular has to write include guards at the beginning and end of each include file. It has several other significant drawbacks, such as the cost for the compiler of re-parsing the include files over and over. For a language as complex as C++, the build-time cost is very high.

LX, like many other languages, has a real notion of modular programming. In LX, modules are declared using the same syntax as records. Modules are nothing more than constant records. Modules are typically declared using the module data type, which is a constant empty type.

A module interface therefore looks like the following:

module COMPLEX with
    type complex with real Re, Im;
    constant complex I
    function Complex () return complex
    function Complex (real Re, Im) return complex
    function Add(complex Z1, Z2) return complex written Z1+Z2
    function Sub(complex Z1, Z2) return complex written Z1-Z2
    -- ... and more

The implementation of the module can be defined in a different file, as follows:

module COMPLEX body is with
   type complex is record with real Re, Im
   constant complex I is Complex(0.0, 1.0)
   function Complex() return complex is return Complex(0.0, 0.0)
   function Complex(real Re, Im) return complex is
       result.Re := Re
       result.Im := Im
   function Add body is
       result.Re := Z1.Re + Z2.Re
       result.Im := Z1.Im + Z2.Im
   -- ... and so on

The import statement is used to import declarations from another source file, without the cost of re-parsing them. The import statement is typically used to import modules. It can also be used to give a local short name to a module name.

import IO = LX.TEXT_IO
procedure WriteHello() is
    IO.WriteLn "Hello"

The `body` keyword

In the module implementation above, the COMPLEX module and the Add function have their argument lists replaced with the body keyword. When the type key and the name are enough to select a unique declaration, the body keyword can be used to replace all the additional parameters of this declaration.

The COMPLEX module implementation above is therefore a short version of:

module COMPLEX with
    type complex with real Re, Im;
    constant complex I
    function Complex () return complex
    function Complex (real Re, Im) return complex
    function Add(complex Z1, Z2) return complex written Z1+Z2
    function Sub(complex Z1, Z2) return complex written Z1-Z2
    -- ... and more
is with
   type complex is record with real Re, Im
   constant complex I is Complex(0.0, 1.0)
   function Complex() return complex is return Complex(0.0, 0.0)
   function Complex(real Re, Im) return complex is
       result.Re := Re
       result.Im := Im
   function Add (complex Z1, Z2) return complex written Z1+Z2 is
       result.Re := Z1.Re + Z2.Re
       result.Im := Z1.Im + Z2.Im
   -- ... and so on

The body keyword can be used for procedures, functions, types, enumerations and record types (including modules).

The `using` statement

A good way to refer to imported entities is to prefix them with a module name, which can be a short-cut if one has been given in the import statement.

import IO = LX.TEXT_IO
procedure SayHello() is
    IO.WriteLn "Hello, world..."

Oftentimes, it is practical to be able to use members of a module or record directly. The using statement serves that purpose:

import IO = LX.TEXT_IO
procedure SayHello() is
    using IO
    WriteLn "Hello World..."
    WriteLn "I feel like talking today"
    WriteLn "What do you think?"
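
As a point of comparison, C++ namespaces offer a similar pair of facilities: a namespace alias plays the role of the short name given in import, and a using-directive plays the role of the using statement. The sketch below assumes a hypothetical lx::text_io namespace whose write_ln appends to an in-memory buffer; none of these names come from LX or from any real library.

```cpp
#include <string>

// Hypothetical module, standing in for LX.TEXT_IO.
namespace lx {
namespace text_io {
    // Collects output in a string so that the example has no side effects.
    inline std::string &buffer() {
        static std::string text;
        return text;
    }
    inline void write_ln(const std::string &line) {
        buffer() += line;
        buffer() += '\n';
    }
}
}

namespace IO = lx::text_io;     // roughly: import IO = LX.TEXT_IO

void say_hello_qualified() {
    IO::write_ln("Hello, world...");
}

void say_hello_using() {
    using namespace IO;         // roughly: using IO
    write_ln("Hello World...");
    write_ln("I feel like talking today");
}
```

Note that the C++ using-directive only applies to namespaces. It cannot open up the members of an object, which is what the LX using statement does for records.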

The same technique can be used to access members of a record directly:

function Module(vector V of (vector of complex)) return real is
    with integer I, J
    result := 0.0
    for I in 1..Size(V) loop
        for J in 1..Size(V[I]) loop
            using V[I][J]
            result += Re*Re + Im*Im  -- Found in V[I][J]
    result := Sqrt(result)

In many cases, the compiler will be able to optimize accesses to data members when this technique is used. Typically, the compiler might be able to compute the address of V[I][J] only once in the above loop.

It is also worth noting that whenever a qualified name is used to access a function or procedure, that same qualified name is implicitly used for the function or procedure arguments. Also, expression reduction applies to the contents of imported modules:

import CX = COMPLEX

function Test(CX.complex Z) return CX.complex is
    result := Z + CX.complex(1.5, 3.7)  -- '+' declared in COMPLEX
    result := CX.Sub(result, I)         -- I found in CX implicitly

No `private` interface

In C, C++ or Ada, the compiler has to know the complete description of a data type presented in an interface. Therefore, these languages require you to expose so-called "private" parts of the interface in the interface files. These parts are in fact anything but private, since they are exposed to all users of the module.

struct Complex {
    Complex(double, double);
    double Re();
    double Im();
  private:
    double _Re, _Im;
};

LX, on the other hand, relies on the normal visibility rules to identify what operations are allowed on a type and what operations are not. To achieve that goal, LX allows slightly more operations with types for which the definition is not available. Also, LX offers type interfaces to control the exposure of data fields.

-- Module interface
type complex
function Complex(real Re, Im) return complex
function Re(complex Z) return real
function Im(complex Z) return real

-- Operations allowed to the module user: the type is opaque
complex I := complex(0.0, 1.0)
real Zero := Re(I)

Customizable `for` loop

The for loop in LX is controlled by an iterator object. Iterator objects offer operations such as starting the loop, testing if this is the last iteration, and advancing to the next iteration. Iterator objects are typically created using iterator expressions, such as I in 1..5. The definition for such an integer iterator can be given as follows:

type integer_iterator is record with
    reference   Counter to integer
    integer     Low, High

function integer_iterator(reference Counter to integer;
                          integer Low, High)
                           return integer_iterator
                           written Counter in Low..High is
    result.Counter := reference(Counter)
    result.Low     := Low
    result.High    := High

procedure Start(in out integer_iterator I) is
    I.Counter := I.Low

function More(integer_iterator I) return boolean is
    return I.Counter <= I.High

procedure Next(in out integer_iterator I) is
    I.Counter += 1

procedure Test_Integer_Iterator() is
    with integer I
    for I in 1..5 loop
        WriteLn "I=", I

LX offers numerous standard iterators, for instance to iterate over arrays, lists, vectors and other containers.

procedure Write(array A) is
    with array.item I
    for I in A loop Write I
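
C++ has a comparable protocol for its range-based for loop, which may help to place the LX iterator operations. The sketch below is only an analogy: int_range is a hypothetical type, begin plays roughly the role of Start, operator!= the role of More, and operator++ the role of Next. The range is inclusive on both ends, like Low..High, and the code assumes that the upper bound is smaller than the largest int.

```cpp
#include <vector>

// Hypothetical inclusive integer range, in the spirit of LX's Low..High.
class int_range {
public:
    class iterator {
    public:
        explicit iterator(int value) : value_(value) {}
        int operator*() const { return value_; }            // current value
        iterator &operator++() { ++value_; return *this; }  // like Next
        bool operator!=(const iterator &other) const {      // like More
            return value_ != other.value_;
        }
    private:
        int value_;
    };

    int_range(int low, int high) : low_(low), high_(high) {}
    iterator begin() const { return iterator(low_); }        // like Start
    // One past the upper bound, or the lower bound itself for an empty range.
    iterator end() const { return iterator(high_ < low_ ? low_ : high_ + 1); }
private:
    int low_, high_;
};

// Collect the values visited by a range-based for loop.
std::vector<int> collect(int low, int high) {
    std::vector<int> seen;
    for (int i : int_range(low, high))
        seen.push_back(i);
    return seen;
}
```

With these definitions, collect(1, 5) visits 1, 2, 3, 4 and 5 in order, and collect(5, 1) visits nothing.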

Customizable `case` statement

C and C++ have a switch statement to select among various alternatives. This statement cannot be used for non-integral values, in particular objects. The equivalent LX statement is the case statement, which can be used for many more data types.

procedure Analyze_This(integer X) is
    case X is
        when 1:         WriteLn "X=1"
        when 2..5:      WriteLn "X is small"
        when others:    WriteLn "I have no idea"
    WriteLn "Detailed analysis complete"

The case statement can be applied to any type, provided a suitable Index function is defined. The Index function returns the one-based index of its first argument among all other arguments, or a value which is not within 1..Number_of_arguments-1 otherwise. Index functions are defined by default for numeric types, strings and characters.

type point is record with integer X, Y
type rectangle is record with integer X, Y, W, H

function Index(point P) return integer is
    return 1    -- Not found

function Index(point P; point Test; others) return integer is
    if P = Test then return 1; else return Index(P, others) + 1

function Index(point P; rectangle Test; others) return integer is
    if P in Test then return 1; else return Index(P, others) + 1

procedure TestPoint(point P; rectangle R1, R2; point Q) is
    case P is
        when R1:        WriteLn "P is in R1"
        when R2:        WriteLn "P is in R2"
        when Q:         WriteLn "P = Q"
        when others:    WriteLn "P is somewhere, who knows?"
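
The recursive Index scheme can be imitated in C++ with variadic templates feeding an ordinary switch, which makes the mechanism easy to experiment with. This is a sketch under assumptions: the names case_index and matches are hypothetical, and the rule for "point in rectangle" (half-open on the right and bottom edges) is chosen only for the example.

```cpp
#include <string>

struct point { int x, y; };
struct rectangle { int x, y, w, h; };

// A point matches an equal point, or a rectangle that contains it.
inline bool matches(const point &p, const point &test) {
    return p.x == test.x && p.y == test.y;
}
inline bool matches(const point &p, const rectangle &r) {
    return p.x >= r.x && p.x < r.x + r.w && p.y >= r.y && p.y < r.y + r.h;
}

// Base case: no candidate left. Returning 1 here makes the final result
// one more than the number of candidates, which means "not found".
inline int case_index(const point &) { return 1; }

// Recursive case: 1 if the first candidate matches, else 1 + index in the rest.
template <class Test, class... Others>
int case_index(const point &p, const Test &test, const Others &... others) {
    if (matches(p, test)) return 1;
    return case_index(p, others...) + 1;
}

std::string test_point(const point &p, const rectangle &r1,
                       const rectangle &r2, const point &q) {
    switch (case_index(p, r1, r2, q)) {
        case 1:  return "P is in R1";
        case 2:  return "P is in R2";
        case 3:  return "P = Q";
        default: return "P is somewhere, who knows?";
    }
}
```

The difference with LX is that the C++ programmer has to write the switch and number the alternatives by hand, whereas the LX case statement does that bookkeeping itself.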

Pragmas

C and C++ have numerous specialized keywords, such as register, volatile or inline, which are implementation hints to the compiler. Unfortunately, numerous situations are not covered by standard keywords. As a result, some implementations of C and C++ had to add their own keywords, such as __far, __export or __thread. C and C++ also have pragmas, but their use is awkward, due to the preprocessor-style syntax. Ada also has pragmas, but they require a lengthy notation which makes them difficult to use.

LX uses pragmas for all such keywords, and many more. Since pragmas are used often, their notation in LX is also much more convenient: they are simply placed between curly braces. Pragmas can significantly change the implementation, but they normally do not change whether the program compiles.

{inline} function Get_Amount(account A) return amount is
    return A.Amount

{address 16#EFFF_4DB0}
record Control_Register with
     boolean    Enable  {offset 0} {bit 0}
     boolean    Reset   {offset 0} {bit 3}
     boolean    Error   {offset 0} {bit 7}
     integer    Count   {offset 1} {bit 0} {bitsize 3}
     boolean    Overflow{offset 1} {bit 4}

{lazy} function And(boolean A, B) return boolean written A and B is
     if not(A) then return false
     return B

{C "_memcpy"}
function memcpy(address Target, Source; unsigned Size) return address

Every compiler can define its own pragmas, but some pragmas are standardized by the language, including:

{inline}, {lazy} to control inlining and lazy argument evaluation of procedure and function calls. Lazy argument evaluation indicates that arguments are evaluated at their point of use (which must be unique in the function) rather than on entry in the function.
{address}, {offset}, {bit}, {bitsize}, {align}, {access_size}, {access_space} to control the address, layout, alignment, memory access size and memory access space of objects.
{byval} and {byref} to control argument passing.
{language}, {name} and {C} to control interfacing with other languages.
{shared}, {export} and {import} to deal with shared libraries.
{thread} to declare thread-local storage or instruction execution threads.

Last, new pragmas can be added by the user to extend the compiler, using reflection.
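
To make the effect of {lazy} concrete, here is how lazy argument evaluation is usually emulated by hand in C++, where the caller has to wrap the argument in a callable. This is only an illustration of the evaluation order described above, not a description of how an LX compiler implements the pragma, and lazy_and is a hypothetical name.

```cpp
#include <functional>

// The second operand is passed as a callable and evaluated only if needed.
inline bool lazy_and(bool a, const std::function<bool()> &b) {
    if (!a) return false;   // b is never evaluated on this path
    return b();             // the unique point of use
}

// Counts how many times the right operand gets evaluated.
inline int evaluations_of_right_operand(bool left) {
    int count = 0;
    lazy_and(left, [&count]() -> bool { ++count; return true; });
    return count;
}
```

With {lazy}, the wrapping is done by the compiler, so the call site keeps the plain A and B notation.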

Disciplined exception handling

Several languages such as Ada, C++ and Eiffel provide exceptions as a means to deal with error conditions. Exceptions can be used to signal these conditions to calling functions. Exception handlers are used to deal with the exceptions.

Ada and C++ exceptions are signals that are handled typically at one place, and then stopped there. An exception handler can rethrow the exception, but this takes a slight effort.

Eiffel and LX use a slightly different model, where exceptions signal an anomalous condition that persists until explicitly cleared. Exception handlers perform cleanup before propagating the exception, rather than just intercepting it. LX also offers the possibility to retry the block that caused the exception.

exception SERIAL_OVERRUN

procedure Copy_Serial_Data(serial_port Input, Output) is
    with integer Retries := 0
    try
        loop
            with byte B := Read(Input)
            exit if B = EOT
    catch SERIAL_OVERRUN:
        Retries += 1
        if Retries < 3 then retry
        Reset Input
        Reset Output
    catch others:
        Reset Input
        Reset Output
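
C++ has no retry, so the same control flow has to be written as an explicit loop around the try block. The sketch below shows one way to do it. The serial_overrun type and the run_with_retry helper are hypothetical, and the rethrow after cleanup is there to imitate the "cleanup, then keep propagating" model described above.

```cpp
#include <functional>
#include <stdexcept>

// Hypothetical exception type, standing in for SERIAL_OVERRUN.
struct serial_overrun : std::runtime_error {
    serial_overrun() : std::runtime_error("serial overrun") {}
};

// Run 'body'. On serial_overrun, run it again, up to max_attempts attempts
// in total. On the final failure, or on any other exception, call 'cleanup'
// and let the exception continue to propagate.
inline void run_with_retry(const std::function<void()> &body,
                           const std::function<void()> &cleanup,
                           int max_attempts) {
    for (int attempt = 1; ; ++attempt) {
        try {
            body();
            return;                                  // success
        } catch (const serial_overrun &) {
            if (attempt < max_attempts) continue;    // like 'retry'
            cleanup();
            throw;                                   // still signalled after cleanup
        } catch (...) {
            cleanup();
            throw;
        }
    }
}
```

Calling run_with_retry with max_attempts set to 3 gives the same three attempts as the Retries counter in the LX procedure.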

Dynamic objects

More and more applications are very dynamic in nature. Some types are practically always accessed indirectly (using pointers or similar entities) rather than referenced directly. In an object-oriented design, the types being accessed often form a deep inheritance hierarchy. LX facilitates the manipulation of such types by allowing the use of pointers to become implicit. Using such types becomes very similar to the way types are used in Java.

A dynamic object is implemented by deriving the implementation from the object data type.

type person is object with
   string name, first_name
   person father, mother

function person(string name, first_name;
                person father := NULL, mother := NULL) return person is
   result.name := name
   result.first_name := first_name
   result.father := father
   result.mother := mother

function CreateJohnDoe() return person is
   return person("Doe", "John",
                 person("Doe", "Jack"),
                 person("Duh", "Joan"))

Function-based dynamic dispatch

C++ offers dynamic dispatch, that is, the invocation of different methods based on the actual type of an object. The mechanism in C++ is called virtual functions. The argument on which the dispatch is being done uses a special syntax: it is not passed within the parenthesized arguments to the function, but before a dot '.' or arrow '->' operator, and becomes the implicit 'this' in the virtual function. Virtual functions must be declared within the class declaration for the first argument. Last, dynamic dispatch is invoked through pointers or references, which are allowed to point to objects of derived classes: pointers and references are polymorphic.

struct Shape {
    virtual float Surface();
    ...
};

struct Rectangle: Shape {
    Rectangle (double, double, double, double);
    virtual float Surface();
    ...
};

Shape *shape = new Rectangle(1.3, 4.5, 7.0, 9.8);
float surface = shape->Surface();

This technique creates a subtle problem: you can't do dynamic dispatch if the functionality was not initially part of the base class. If the user of Shape needs a virtual Draw functionality, the only place where it can be added is class Shape. Adding this functionality is possible only if you own the class (that is, you can't add it if Shape was part of a third-party library). Even if you can add the functionality, it may break dependent code, for instance code that used the name Draw already. This problem is usually known as the fragile base class problem, and can be very significant in large-scale software engineering with C++.

LX takes another approach to dynamic dispatch. There is no special syntax at the call site, and function declarations involving dynamic dispatch can be placed anywhere. Dynamic dispatch is obtained by declaring a function that takes a polymorphic argument, using the any keyword. All functions with a compatible signature declared in a scope where the dispatch function is visible will be part of the dynamic dispatch.

Pointers and references are not implicitly polymorphic in LX. Instead, polymorphic objects or object types are declared using the any keyword.

type shape
type rectangle like shape
function Rectangle(real Top, Left, Bottom, Right) return rectangle

function Surface(any shape S) return real
function Surface(any rectangle S) return real

any shape S := rectangle(1.0, 3.14, 2.718, 9.00)
real Surface := Surface(S)

-- Create a polymorphic type
type polymorphic_shape is any shape

A polymorphic object can be converted to a non-polymorphic type or to a polymorphic type of a base or derived type using the as operator. The operator raises an exception if the dynamic type doesn't match the conversion being made (this never happens for conversion to a base type). Conversion to a base dynamic type may result in a truncation. The as operator is also allowed on pointer and access types and returns null rather than raise an exception if the conversion is invalid.

procedure Draw (rectangle R)
function BoundingBox(any shape S) return rectangle

procedure DrawBoundingBox(polymorphic_shape S) is
   if S isA rectangle then
      Draw S as rectangle       -- Raises exception if S not a rectangle
   else
      Draw BoundingBox(S)
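
The two behaviours of the as operator have a close counterpart in C++'s dynamic_cast, which returns a null pointer for a failed pointer conversion and throws std::bad_cast for a failed reference conversion. The small sketch below shows both forms. The shape, rectangle and circle classes are hypothetical and exist only for the comparison.

```cpp
#include <typeinfo>

struct shape {
    virtual ~shape() {}     // a virtual member makes the type polymorphic
};
struct rectangle : shape {};
struct circle : shape {};

// Pointer form: a null result means the dynamic type does not match.
inline bool is_rectangle(const shape *s) {
    return dynamic_cast<const rectangle *>(s) != nullptr;
}

// Reference form: std::bad_cast is thrown when the dynamic type does not match.
inline bool reference_cast_throws(const shape &s) {
    try {
        (void)dynamic_cast<const rectangle &>(s);
        return false;
    } catch (const std::bad_cast &) {
        return true;
    }
}
```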

Dynamic dispatch follows the rules of logical inheritance, and ignores data inheritance. In LX, data inheritance is a practical tool for implementing types, but is not in general exposed in the user-visible interface of the type.

Expression reduction can be used on expressions that result in dynamic dispatch. This can be used in particular to mimic the syntax used by other languages, whenever that notation is natural or easier to read.

function Surface(any shape S) return real
    written S.Surface()
function Offset(any shape S; point P) return any shape
    written S + P

Multi-way dynamic dispatch

In C++, dynamic dispatch can only occur on one argument at a time. LX has no such restriction. Note that dispatching through multiple arguments has a significant runtime cost. The rules for selecting the called procedure or function in that case are the same as the overloading rules that apply when the types are known at compile time.

function Intersect(any shape S1, S2) return any shape
    written S1 inter S2
function Intersect(any rectangle R1, R2) return any rectangle

any shape S1 := rectangle(1, 2, 3, 4)
any shape S2 := rectangle(3, 4, 5, 6)
any shape S3 := shape()

any shape S4 := S1 inter S2     -- Invokes (rectangle, rectangle)
any shape S5 := S2 inter S3     -- Invokes (shape, shape)

It is important to stress again that while multi-way dynamic dispatch is possible, it is either slower than single-way dynamic dispatch (it does not execute in constant time), or requires vast amounts of memory, or both.
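
To see what the compiler has to do on your behalf, here is a hand-written two-argument dispatch in C++, where it must be coded explicitly. This is a deliberately naive sketch with hypothetical names. It tests the dynamic type of each argument in turn, so its cost grows with the number of type combinations, which illustrates the runtime cost mentioned above. Each function returns a string naming the overload that was selected.

```cpp
#include <string>

struct shape {
    virtual ~shape() {}
};
struct rectangle : shape {};

// The two candidates, named apart so that the dispatcher can pick one at run time.
inline std::string intersect_shapes(const shape &, const shape &) {
    return "(shape, shape)";
}
inline std::string intersect_rectangles(const rectangle &, const rectangle &) {
    return "(rectangle, rectangle)";
}

// Hand-written dispatch on the dynamic type of both arguments.
inline std::string intersect(const shape &s1, const shape &s2) {
    const rectangle *r1 = dynamic_cast<const rectangle *>(&s1);
    const rectangle *r2 = dynamic_cast<const rectangle *>(&s2);
    if (r1 && r2) return intersect_rectangles(*r1, *r2);
    return intersect_shapes(s1, s2);    // fallback for every other combination
}
```

Every new shape type would require another branch in intersect, which is exactly the maintenance burden that language-level multi-way dispatch removes.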

Forwarding and delegation

In many object-oriented development models, invoking a method on an object is called "sending a message", and the object is said to "receive the message". Some languages, such as Smalltalk or Objective-C, have the possibility to forward unknown messages to a "delegate". This can be used to implement small objects, known as "proxies", that filter some messages and send the rest to their delegate.

Although LX does not have a dedicated mechanism for forwarding and delegation, the characteristics of its logical inheritance mechanism make it easy to implement them. The proxy will inherit from the types it wants to forward to, and have a conversion to the base types that converts to its delegates. For instance, to create a grayed_shape proxy that intercepts drawing to any shape, while letting other operations on a shape go through, one can write:

type shape
procedure Draw(any shape S)
function Surface(any shape S) return real

type grayed_shape like shape is
record with
    any shape Delegate

function grayed_shape(any shape S) return grayed_shape is
    result.Delegate := S

procedure Draw (any grayed_shape P) is
    Set_Gray_Mode
    Draw P.Delegate
    Reset_Gray_Mode

-- Conversion to base, implicitly called
function shape(grayed_shape P) return any shape is
    return P.Delegate


procedure Test() is
    with rectangle R := rectangle(1, 2, 3, 4)
    with grayed_shape G := grayed_shape(R)
    Draw R                      -- Draws the rectangle as is
    Draw G                      -- Draws the rectangle grayed out

    with real S1 := Surface(R)  -- Compute rectangle surface
    with real S2 := Surface(G)  -- Compute rectangle surface using delegate
    WriteLn "S1=", S1, " S2=", S2
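
For contrast, the same proxy in C++ needs every forwarded operation to be written out by hand. The sketch below uses hypothetical classes that return strings instead of drawing, so that the effect of the interception can be observed. The rectangle constructor takes its arguments in the order top, left, bottom, right.

```cpp
#include <string>

struct shape {
    virtual ~shape() {}
    virtual std::string draw() const = 0;
    virtual double surface() const = 0;
};

struct rectangle : shape {
    double top, left, bottom, right;
    rectangle(double t, double l, double b, double r)
        : top(t), left(l), bottom(b), right(r) {}
    std::string draw() const override { return "rectangle"; }
    double surface() const override { return (bottom - top) * (right - left); }
};

// Proxy: intercepts draw() and forwards the rest to its delegate.
struct grayed_shape : shape {
    const shape &delegate;
    explicit grayed_shape(const shape &s) : delegate(s) {}
    std::string draw() const override { return "gray(" + delegate.draw() + ")"; }
    double surface() const override { return delegate.surface(); }  // forwarded by hand
};
```

In the LX version, the conversion function to shape replaces all such hand-written forwarding members, so adding a new operation on shape does not require touching the proxy.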

True generic types

In C++, it is possible to define template functions, for instance generic algorithms that apply to a variety of types. However, template parameters have to be repeated over and over, although for a family of algorithms they often tend to be identical.

template <class T>
T min(T a, T b) {
    if (a < b) return a; else return b;
}

template <class T>
T max(T a, T b) {
    if (a > b) return a; else return b;
}

LX introduces the idea of true generic types. Generic types, parameterized or not, can be used directly in parameters or return types as well as in the definition of type interfaces or in generic argument lists. In all cases, they implicitly make the corresponding declaration generic, with the corresponding parameter. This can make generic code significantly smaller.

generic type ordered
function Min(ordered A, B) return ordered is
    if A < B then return A; else return B
function Max(ordered A, B) return ordered is
    if A > B then return A; else return B

Within a single declaration, every occurrence of a given generic type denotes the same type. The two instances of ordered in the Min function above always correspond to the same type.

The same is also true for parameterized generic types. An instance of the name of such a type without the corresponding parameters makes the declaration generic on all the parameters of the type. In that case, the parameters of the type can be referred to using a dotted notation, as if they were fields of the generic type.

generic [type item] type list
function First(list L) return list.item
function Last(list L) return list.item

Constrained generic types

Generic types facilitate the declaration of generic algorithms. But they also can make them more precise. Generic types can be constrained using the if keyword to specify an interface that the generic type must follow. The constraint is generally a small piece of code that is supposed to compile for any value of the generic type. Generic instantiation will fail otherwise.

generic type ordered if
    -- Indicate that 'ordered' requires a boolean "less-than"
    with ordered A, B
    with boolean C := A < B

function Min(ordered A, B) return ordered

integer X, Y, Z := Min(X, Y)    -- OK
record T, U, V := Min(T, U)     -- Error: no less-than for records
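
A similar "must compile for this type" constraint can be approximated in C++ with a small detection trait and a static_assert. The sketch below is one possible encoding and not a translation of the LX semantics: is_ordered, min_of and plain_record are hypothetical names, and the trait only checks that A < B is well-formed and convertible to bool.

```cpp
#include <type_traits>
#include <utility>

// Primary template: by default, a type is not considered ordered.
template <class T, class = void>
struct is_ordered : std::false_type {};

// Selected only when 'a < b' compiles and its result converts to bool.
template <class T>
struct is_ordered<T, typename std::enable_if<std::is_convertible<
        decltype(std::declval<const T &>() < std::declval<const T &>()),
        bool>::value>::type>
    : std::true_type {};

// Instantiation fails with a readable message for types that are not ordered.
template <class T>
T min_of(const T &a, const T &b) {
    static_assert(is_ordered<T>::value, "min_of requires a boolean less-than");
    return (a < b) ? a : b;
}

// A record type with no less-than operator.
struct plain_record { int field; };
```

Here min_of(3, 7) compiles, whereas min_of applied to two plain_record values is rejected at compile time, much like the record case in the LX example.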

A generic parameter can 'derive from' a generic type, indicating that it inherits its constraints.

generic [type row like ordered; type column like ordered] type map
map M[integer, character]       -- OK, constraints satisfied
map N[integer, record]          -- Error: constraints not satisfied

Predicated generic specialization

C++ offers template specialization and partial specialization. Specialization is used to define special cases for template instantiation.

template <class T> class vector;        // Template
template <class T> class vector<T *>;   // Partial specialization
template <> class vector<bool>;         // Specialization

In addition to such "structural" specializations, LX also supports specializations based on predicates, which enables specializations that would require intermediate 'facet' helper classes in C++. Predicated generic specialization makes the intent much clearer.

generic [type T] type vector
generic [type T] type vector for vector[pointer to T]     -- structural
generic          type vector for vector[boolean]          -- structural
generic [type T] type vector when size(T) = size(integer) -- predicated
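
For comparison, this is roughly what the predicated case costs in C++: the predicate has to be smuggled in through an extra template parameter and std::enable_if. The sketch uses a hypothetical vector_kind template that only reports which specialization was selected, and the predicate excludes pointers so that it cannot conflict with the structural specialization.

```cpp
#include <string>
#include <type_traits>

// Primary template. The second parameter exists only to carry predicates.
template <class T, class Enable = void>
struct vector_kind {
    static std::string name() { return "generic"; }
};

// Structural specialization: selected by the shape of the argument.
template <class T>
struct vector_kind<T *, void> {
    static std::string name() { return "pointer"; }
};

// Predicated specialization: selected by a compile-time condition on T.
template <class T>
struct vector_kind<T, typename std::enable_if<
        !std::is_pointer<T>::value && sizeof(T) == sizeof(int)>::type> {
    static std::string name() { return "int-sized"; }
};
```

The LX when clause states the same condition directly on the declaration, without the helper parameter.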

Usable generic types

The combination of expression reduction and type-safe variable argument lists makes it possible to define generic types that behave much like the built-in types of languages such as Pascal.

array A[1..5] of integer
array B[1..3, 'A'..'C'] of real

A[1] := 3
B[2, 'C'] := 2.5

Standard Library

LX has a standard library that is inspired by the C++ Standard Library for the containers and algorithms part, and by the Java library for graphics and networking. LX has its own libraries for input/output and mathematical operations. Last, LX offers access to the C library when there is one on the system.

The LX library tries to resemble these other libraries whenever possible. However, it sometimes offers a different interface to take advantage of unique LX features.

import IO = LX.Text_IO
import ALGO = LX.Algorithms
import LIST = LX.List

procedure Example() is
-- This procedure reads strings from its standard input and
-- writes them in sorted order on the output
    with string S; LIST.list L of string
    for S in IO.StdIn loop
        L += S
    ALGO.Sort L
    for S in L loop
        IO.WriteLn S

Reflection

Reflection is the ability for a program to access its own representation. It can dramatically simplify operations that are defined on complex types as a composition of operations on their simpler subtypes: cloning, encoding data to transmit over the network, disk storage. Generic programming (templates) is not generally well suited for that purpose. Reflection also makes "active libraries" possible, that is, libraries that direct the compiler.

Java offers an elementary form of reflection, giving access to a description of classes in a program. However, it is not possible to modify the program by altering the reflective representation. Such "read-only" access to reflective data is often called introspection rather than reflection. Several other languages, most notably LISP, have given programs the ability to modify themselves.

LX takes another approach to reflection. Whenever the compiler encounters a pragma, it invokes a compiler extension (typically, a shared library) and hands it a complete, standardized representation of the declaration or instruction being compiled, as well as the context in which it is being compiled. The compiler extension can then decide what to do with the code it has received. This makes LX the first language that mandates user-extensible compilers.

For instance, one may declare a {clonable} pragma that automatically generates a "Clone" function from type definitions.

{clonable} type tree is record with
    access Left to tree
    access Right to tree

For a more in-depth discussion of reflection, see the Mozart page. Mozart is the basis for LX reflection.