## Building, say, indices<6,4,2,0,-2,-4>

### December 3, 2012

A simple indexing class of variadic ** std::size_t** template parameters is often used to provide a structured method to select multiple elements from a C++11 tuple. In this post I present an alternative to this interface, commonly used when building an object of this type, wherein a finite series of indices is now generated according to a numeric range and signed stride; akin to Fortran array section syntax. Let’s look at the common solution first.

A tuple may be created either using the ** std::tuple** constructor, or

**function template. Both expect the same variadic series of arguments. In addition, the constructor requires the type of each argument explicitly. Hence, the more concise**

`std::make_tuple`

**, is used throughout the examples. In the example below, the last element of**

`std::make_tuple`

**will be an**

`t1`

**.**

`int`

auto t1 = std::make_tuple('z',false,42,"cat",7); auto t2 = std::tuple<char,bool,int,const char *,long>('z',false,42,"ktn",7);

Selection of a single tuple element is achieved using the standard tuple function template: ** std::get**. The sole template parameter of

**, a**

`std::get`

**, represents an index; and must be a constant expression. Using the**

`std::size_t`

**tuple from the code above,**

`t1`

**evaluates to an**

`std::get<4>(t1)`

**; with a value of**

`int`

**.**

`7`

As a *variadic* function template, ** std::make_tuple** can of course be used to create a new tuple from the elements of another. In the code below,

**, having type**

`t3`

**, is formed from copies of the 1st, 3rd, and 5th elements from the earlier tuple,**

`tuple<char,int,int>`

**.**

`t1`

auto t3 = std::make_tuple(std::get<0>(t1),std::get<2>(t1),std::get<4>(t1));

In the code above, we hard coded the selection using three index values: ** 0**,

**, and**

`2`

**. How could we instead write a function template,**

`4`

*select*, that accepts,

*at least*, a tuple argument, and returns another tuple formed from an arbitrary set of its elements? The conventional solution introduces the following simple variadic class template:

template <std::size_t ...Is> struct indices {};

An object of type ** indices** might then be created with, for example:

**;**

`indices<>`

**; or**

`indices<0,2,4>`

**. Our**

`indices<1,1,2,3,5,8>`

**function may then be defined, as shown in the code below.**

`select`

template <typename ...Ts, std::size_t ...Is> auto select(std::tuple<Ts...> t, indices<Is...>) -> decltype(std::make_tuple( std::get<Is>(t)... )) { return std::make_tuple( std::get<Is>(t)... ); }

Calling ** select** with the earlier tuple variable

**and a**

`t1`

**variable, again results in a tuple with type**

`indices<0,2,4>`

**.**

`tuple<char,int,int>`

There are a few C++11 things to notice in this code. The ** Is** template parameter pack is not expanded “directly” in the function body, due to the placement of the ellipsis. So, while

**, say, would expand to**

`indices<Is...>`

**, in the**

`indices<0,2,4>`

**function above, the actual expansion becomes**

`select`

**; three arguments for the variadic**

`std::get<0>(t), std::get<2>(t), std::get<4>(t)`

**function. Such ellipses will expand all parameter packs in the**

`make_tuple`

*pattern*to their left. Expanding multiple parameter packs of differing lengths, with a single ellipsis, will cause a compilation error.

Also worth noting: the use of C++11’s trailing return type, and ** decltype** specifier is, here, optional; the

**function**

`select`

*can*be typed just as effectively by the more ornate code below. Often, however, the former typing technique is preferable as it has a simple, mechanical, syntax-based application; and so is

*generally*applicable. With short functions the concise form can also provide a readable symmetry. With luck, perhaps by C++17, or maybe even C++14, we can do away with it altogether; after all, the

**keyword can bind to untyped lambda expressions.**

`auto`

template <typename ...Ts, std::size_t ...Is> std::tuple< typename std::decay< typename std::tuple_element<Is,std::tuple<Ts...>>::type >::type... > select(std::tuple<Ts...> t, indices<Is...>) { return std::make_tuple( std::get<Is>(t)... ); }

Finally, as only the template arguments of its type are used, the second function parameter is not bound to a name.

All well and good, but how would I modify *every* element of an arbitrary-length tuple? A common solution, seen here, here, and here, is to use ** make_indices**, a variadic class template with a

**member,**

`typedef`

**, which equates to an instantiation of the**

`type`

**class template. The**

`indices`

**template parameters of the**

`std::size_t`

**type are instantiated as a zero-based finite arithmetic progression, with length equal to the number of template arguments given to**

`indices`

**. For example,**

`make_indices`

**, would evaluate to**

`make_indices<short,int>::type`

**. The code below demonstrates a simple application of**

`indices<0,1>`

**within a function template,**

`make_indices`

**, which “does nothing”; well, it returns a tuple comprised of the same elements as the input. With additional parameters,**

`id`

**can easily be used to create tuple versions of**

`make_indices`

*map*and

*zipWith*.

template <typename ...Ts> std::tuple<Ts...> id(std::tuple<Ts...> t) { return select(t,typename make_indices<Ts...>::type()); }

Although useful, the ** make_indices** class template has a number of weaknesses:

- The template parameters of
are fixed as`indices`

only;`std::size_t`

- The first index created is always
;`0`

- The common difference between each index is fixed to
;`1`

- The common difference is a positive value only;
can only be applied to type template parameter lists, and not non-type template parameters.`make_indices`

The first point can be addressed by a variadic, generic, index container:

template <typename T, T...> struct indicesT {};

With tuples, the argument given to the relevant index function, ** std::get**, will commonly have type

**. The following**

`std::size_t`

*type alias template*allows the more straightforward

**specialisation of**

`std::size_t`

**to be used; e.g.**

`indicesT`

**instead of**

`indices<0,1,2>`

**; also, the earlier definition of**

`indicesT<std::size_t,0,1,2>`

**can remain unchanged.**

`select`

template <std::size_t ...Is> using indices = indicesT<std::size_t, Is...>;

The traditional ** make_indices** class template outlined earlier only allows us to specify the

*extent*of the generated, zero-based, indices. Using the

**alias template, the same may be achieved using an integral range. For example, like**

`mk_index_range`

**, the type expression**

`make_indices<int,bool,char>::type`

**will evaluate to**

`mk_index_range<0,2>`

**; or**

`indices<0,1,2>`

**in interval notation.**

`[0,2]`

The ** mk_index_range** alias template can also use a non-zero based

*start*value; for example

**will evaluate to**

`mk_index_range<8,10>`

**. A third, optional, template parameter of**

`indices<8,9,10>`

**allows the specification of a**

`mk_index_range`

*stride*. So,

**will evaluate to**

`mk_index_range<1,9,2>`

**; and**

`indices<1,3,5,7,9>`

**.**

`mk_index_range<9,1,-2>`

The indices produced by ** mk_index_range** will always have type

**; that is, the same type as the template parameter of the**

`std::size_t`

**function**

`<tuple>`

**. To produce**

`std::get`

*signed*indices, say of type

**, a second alias template is provided:**

`int`

**. Behind the scenes, something like**

`mk_index_rangei`

**, which evaluates to**

`mk_index_rangei<3,-3,-1>`

**, is defined as**

`indicesT<3,2,1,0,-1,-2,-3>`

**, using a more general alias template,**

`MkIndices<int,int,3,-3,-1>`

**.**

`MkIndices`

A fair question is, how many indices are produced by an arbitrary index range? For example, what is produced by ** mk_index_range<1,9,2>** as opposed to

**. How about**

`mk_index_range<1,10,2>`

**? The answer comes from the language of the new century: Fortran. With the**

`mk_index_range<1,0,1>`

*loop-control*of the DO construct defined by the grammar production below,

do-variable = expr1, expr2 [,expr3]

then the number of iterations is defined by the following equation:

niters = max((expr2 − expr1 + expr3)/expr3, 0)

The C++ code for this is a ** constexpr** function which can be used within the template arguments of the

**implementation.**

`mk_index_range`

We’re now in a position to have some fun. The following basic function, ** tuple_tail** is used within the definition of the tuple overload of the insertion operator

**. The**

`<<`

**function returns a tuple comprised of all elements of the input tuple argument,**

`tuple_tail`

*minus*the first element:

template <typename T, typename ...Ts> tuple<Ts...> tuple_tail(tuple<T,Ts...> t) { return select(t, mk_index_range<1,sizeof...(Ts)>()); }

The following function, ** tuple_reverse**, unsurprisingly returns a tuple constructed from all the elements of the input tuple argument, in reverse order:

template <typename ...Ts> tuple<Ts...> tuple_reverse(tuple<Ts...> t) { return select(t, mk_index_range<sizeof...(Ts)-1,0,-1>()); }

The ** mk_index_range** function template is now used throughout an updated version of the compile-time FFT code described in the previous post. The

**,**

`map`

**and**

`zipWith`

**functions there now all use**

`iota`

**; they’re also similar to each other; and are interesting enough. The following function,**

`mk_index_range`

**, is though more compelling: returning a tuple comprised of every**

`condenseN`

*n*th element of the input tuple. It is both integral to the FFT implementation;

*and*uses a stride, or common difference, that isn’t

**. Incidentally, the actual instantiation is**

`1`

**, and alas this will crash the current version of Clang; Clang 3.2 snapshot.**

`condenseN<2>`

template <size_t N, typename ...Ts> auto condenseN(tuple<Ts...> t) -> decltype(select(t,mk_index_range<0,sizeof...(Ts)-1,N>())) { return select(t,mk_index_range<0,sizeof...(Ts)-1,N>()); }

My personal favourite is also used by the compile-time FFT. The ** std::tuple_cat** is a variadic function template which should catenate all tuples provided as arguments. The implementation below uses a helper function,

**, which expands two parameter packs,**

`tuple_cat_helper`

**and**

`Is`

**, within a single statement:**

`Js`

template <typename ...Ts> tuple<> tuple_cat() { return tuple<>(); } template <typename ...Ts> tuple<Ts...> tuple_cat(tuple<Ts...> t) { return t; } template <typename Tup1, typename Tup2, std::size_t ...Is, std::size_t ...Js> auto tuple_cat_helper(Tup1 t1, Tup2 t2, indices<Is...>, indices<Js...>) -> decltype(make_tuple(get<Is>(t1)...,get<Js>(t2)...)) { return make_tuple(get<Is>(t1)...,get<Js>(t2)...); } template <typename ...Ts, typename ...Us, typename ...Tups> auto tuple_cat(tuple<Ts...> t1, tuple<Us...> t2, Tups ...ts) -> decltype(tuple_cat( tuple_cat_helper( t1,t2, mk_index_range<0,sizeof...(Ts)-1>(), mk_index_range<0,sizeof...(Us)-1>() ), ts... ) ) { return tuple_cat( tuple_cat_helper( t1,t2, mk_index_range<0,sizeof...(Ts)-1>(), mk_index_range<0,sizeof...(Us)-1>() ), ts... ); }

Note that the C++11 *standard* definition of ** std::tuple** is not defined as

**. The non-standard**

`constexpr`

**implementation provided with the FFT code is contained within the**

`tuple`

**namespace.**

`ctup`

The code for ** mk_index_range** is here and the

**-friendly**

`constexpr`

**implementation is here.**

`tuple`

## A compile-time FFT in C++11

### September 2, 2012

Generalised constant expressions are a core language feature introduced to C++ by the recent C++11 standard. A call to a function or method qualified with the new `constexpr`

keyword can now be evaluated at compile-time. Of course there are restrictions: arguments to such a call must also be amenable to compile-time evaluation. The primary restriction is that the body of a `constexpr`

function must comprise of a single `return`

statement. While this initially sounds draconian, the use of recursion, and the ternary operator, can permit essentially any calculation. The language of C++11 generalised constant expressions is that of a functional language.

The code below presents a brief example. Note the call to `sqr_int`

in the second template argument to the C++11 `std::array`

object. Also of interest: the call to `sqr_int`

will prosaically perform floating-point calculations at compile-time. Incidentally, among other restrictions, lambda expressions may not, alas, form part of a C++11 generalised constant expression.

#include <array> constexpr unsigned sqr_int(double d) { return d*d; } std::array<bool,sqr_int(4.5)> garr;

Having been impressed by the ctrace and metatrace compile-time ray-tracers, I thought I’d give the fast fourier transform (FFT) a whirl. Seeking an FFT implemented in a functional language, I came across the elegant Single Assignment C (SAC) version shown here. After failing to compile the assembled pieces with the SAC compiler, I converted it to Fortran 9x and verified the results against another Fortran implementation from here.

The next thing I needed was a container data type to stand in for the rank-one, first-class arrays employed by the Fortran. Thoughts of using `std::array`

soon evaporated as I realised that no `constexpr`

element access methods exist; neither `operator[]`

nor `std::array::front()`

comply. Similar issues exist with `std::tuple`

. Furthermore, the FFT only requires a homogeneous container. An implementation using the heterogeneous `std::tuple`

would be inclined to add additional type-checking code. In response, I created the following *recursive array* container type:

template <typename T, size_t N> struct recarr { constexpr recarr(T x, recarr<T,N-1> a) : m_x(x), m_a(a) {} constexpr T head() { return m_x; } constexpr recarr<T,N-1> tail() { return m_a; } private: T m_x; recarr<T,N-1> m_a; }; template <typename T> struct recarr<T,0> {};

The common restriction on recursive data types: members having the same type as the enclosing class must be pointers, is not applicable here; the `m_a`

member is always a different type; i.e. `recarr`

is not the same type as `recarr`

. The zero-length specialisation of `recarr`

has no members.

The list type used routinely in functional languages such as Haskell is a clear inspiration in the design of `std::recarr`

. A *cons* function for `std::recarr`

, necessary for the construction of higher order functions such as map and zipWith, may be defined as shown below. Rather than the raw `recarr`

*constructor*, the stand-alone `recarr_cons`

*function* can infer its template arguments from the runtime arguments.

template <typename T, size_t N> constexpr recarr<T,N+1> recarr_cons(T x, recarr<T,N> xs) { return recarr<T,N+1>(x,xs); }

An FFT is an operation on complex numbers. The standard `std::complex`

type, however, delivers another spanner to the works; it too lacks the `constexpr`

methods needed for a compile-time FFT; including the constructors, and arithmetic operators. So, a new complex number class was developed, along with the necessary C++11 generalised constant expression support. Complex number formulae and definitions were drawn from here and here.

GCC and Clang both have excellent support for C++11. Wouldn’t it be nice to find out which is the faster compiler for the evaluation of a `constexpr`

FFT? Some more challenges materialise here: Clang rejects all of the `cmath`

library functions, as being non-`constexpr`

. Functions required by the FFT include `sin`

, `sinh`

, `cos`

, `cosh`

, `sqrt`

, `log`

, `exp`

, `atan`

, `atan2`

and `pow`

. Initially I though that Clang was being a little cranky here, but this is in fact the correct behaviour; n.b. a `constexpr`

function may not be overloaded with a non-`constexpr`

function; and vice-versa. Standard library functions which are to be `constexpr`

must consequently be specified as such by the relevant C++ standard; the functions in `cmath`

are not. After all, functions comprised of a single return statement may not be the most performant! The N3228 and N3231 documents from the C++ Standards Committee’s November 2010 mailings elaborate on this theme.

So, `constexpr`

versions of the mathematical functions from `cmath`

necessary for the FFT were developed; with reference to the formulae and guidelines of the Algebra Help e-Book.

I am in fact a big fan of the `std::tuple`

class. In the end I couldn’t resist a second, comparative implementation of a `constexpr`

FFT using C++11 tuples; and so the magic of variadic templates. Need I mention that `std::tuple`

is also missing many of the `constexpr`

constructors and support functions necessary for the job? The github code linked below includes a simple recursive version of the `std::tuple`

type, with `constexpr`

constructors; a version of `get`

; a version of `make_tuple`

; and, as the FFT uses concat, a version of `tuple_cat`

. For comparison with `recarr_cons`

above, here is the simple, concise `tuple_cons`

; in constrast to a more verbose definition for `tuple_cat`

provided in the full project.

template <typename T, typename ...Ts> constexpr tuple<T,Ts...> tuple_cons(T x, tuple<Ts...> xs) { return tuple_cat(make_tuple(x),xs); }

The code is stored in a github repository here.

## Diagonal Selection

### February 6, 2011

A good example of the expressivity of the *forall* statement from Fortran 95 and HPF, is the selection of the elements on the diagonal of an array. We begin such a thing with the declarations of a rank 2 array *a*, and a vector *da* to hold the diagonal:

integer :: i, a(4,2) = reshape([(i,i=1,size(a))],shape(a)) integer :: da(minval(shape(a)))

Note that regardless of the rank of the original array, the length of the sought diagonal will always be equal to the mininum element of the array’s shape vector. In our case the shape vector of *a* is [ 4, 2 ], and so the diagonal has a length of 2. Of course this is due to our definition of the diagonal of an array. Informally, our notion of diagonal corresponds to the set of array elements, indexed by a set of equal values; and ordered by those values. So, for *a*, this will be the elements indexed by *a(1,1)* and *a(2,2)*: [ 1, 6 ]. So then, let’s look at the *forall* statement to achieve this:

forall (i=1:size(da)) da(i) = a(i,i)

How would this look as an array expression? As discussed in my first blog post, such array expressions hold the potential to handle input of arbitrary dimensionality; unlike the *forall* statement; though certainly more verbose.

### Diagonal Selection

If an array of rank *n *is flattened, say by the Fortran *pack* intrinsic, it becomes clear that the number of elements separating each adjacent pair of *diagonal* members, is constant. Conveniently, the set of recursive equations which can express this constant may also be encoded using the *scan_product* function from the previous blog post.

The array *a* has shape [ 4, 2 ]. The product scan of [ 4, 2 ] is [ 4, 8 ]. We’re looking for a logical mask vector to select the diagonal elements. ( True and False will also be denoted as 1 or 0 as required. ) Some examples: the earlier [ 4, 2 ] shape array will require a mask of [ **1**, 0, 0, 0, 0, **1**, *0*, *0* ]; while an array of shape [ 3, 4 ] requires a mask of [ **1**, 0, 0, 0, **1**, 0, 0, 0, **1**, *0*, *0*, *0* ]. The pattern to notice is a repeating string of False values, delimited by True values, then succeeded by False values. The repeating string of False values has a length of four in the first example, and a length of three in the second. How many False values in general? With an input array *a*, the following code will assign the answer to the *zps* (zeros_per_step) integer variable:

zps = sum(eoshift(scan_product(shape(a)),-1))

This value be used to size a logical vector (*zs*) of False values as shown. A True value should be cons’d to the start of this vector, and the result repeated *mi* times, where *mi* is one less than the length of the final diagonal. The implicit do loop can handle such concatenative actions concisely, as shown in the code below:

zs = spread(.false.,1,zps) mi = minval(shape(a))-1 tzs = [(.true.,zs,j=1,mi)]

The final, padding, string of False values, the vector *zpd* below, is longest in arrays having a relatively large final dimension; for example, shapes such as [ 2, 10 ], or [ 4, 5, 100 ].

nz = product(shape(a)) - (mi * (zps+1)) - 1 zpd = spread(.false.,1,nz)

To construct the final 1D boolean selection mask, *msk*, the *tzs* vector has both a single True value appended; and also the *zpd* vector:

msk = [(tzs,.true.,zpd,i=1,1)]

To apply this mask vector, we need to flatten the input array *a* using the *pack* intrinsic. The same *pack* intrinsic can then be used to apply the mask we have constructed above:

pack(pack(a,.true.),msk)

If the operation is to be encapsulated in a function, we are once again forced to choose a single, static rank for the result. The code below is then arbitrarily for an array of rank 3; again, an interface statement, could help construct a verbose generic representation. I’ve also used the *associate* *construct* from Fortran 2003, which operates like a non-recursive let expression. Hopefully it can aid readability; at the least one can easily print out the contents of each binding.

function diag3(a) result(r) integer, intent(in) :: a(:,:,:) integer :: i,j,r(minval(shape(a))) associate (sa => shape(a) ) associate (sc => scanl_product(sa)) associate (eo => eoshift(sc,-1) ) associate (zps => sum(eo) ) associate (ps => product(sa) ) associate (mi => minval(sa)-1 ) associate (nz => ps - (mi * (zps+1))-1) associate (zpd => spread(.false.,1,nz)) associate (zs => spread(.false.,1,zps)) associate (tzs => [(.true.,zs,j=1,mi)]) associate (msk => [(tzs,.true.,zpd,i=1,1)]) r = pack(pack(a,.true.),msk) end associate end associate end associate end associate end associate end associate end associate end associate end associate end associate end associate end function

## Scanners!

### September 6, 2010

Much can be achieved using the array functions of Fortran alone. Imperative (do/for) loops, for example, can often be avoided. The following code performs a one-dimensional convolution using array *sections*:

a(2:n-1) = a(1:n-2) + a(3:n)

There are also a few array “reduction” functions; such as *all*, *any*, *count*, *maxval*, *minval*, *product*, and *sum*. A higher-order reduction function could implement all of these functions; alas (in Fortran) it doesn’t *a priori *exist. For example, the functionality of *product* could be achieved using syntax such as:

result = reduce( (*), 1, [1,2,3] )

where *result* would be assigned a value of 6.

A common and useful partner to the *reduce* function is the *scan* function. The *scan* function returns not *one* final value, but all intermediate values too. If it existed in Fortran, a use case might look like this:

result = scan( (*), 1, [1,2,3] )

Assuming the multiplicative *unit* (1) is not included in the output, this would produce a result of [ 1, 2, 6 ]. I’d now like to implement this particular (multiplicative) scan using only existing Fortran array intrinsic functions; i.e. no loops. The earlier input of [ 1, 2, 3 ] will be used as a working example. (We will return to this scan function in the following post on diagonal selection.)

## Creating the Scan Expression

The intrinsic function *spread* will, given an array of rank n, produce a new array with rank n+1. The new array is constructed by replicating the input a number of times specified by *spread*‘s third (and final) argument. The second argument to *spread* specifies which dimension (>=1) to replicate over. We can visualise this choice with a pencil representing our 1D input. If we wish to replicate the pencil to make a square, we can begin *either* with the pencil positioned horizontally; *or,* vertically. Similarly, if we instead had a book and wished to create a cuboid, we begin with three choices.

A simple way to visualise the output of *spread*, is simply to consider the resulting arrangement of array values in memory. Often, Fortran array data is arranged in a contiguous linear arrangement, and so the output of:

spread( [1,2,3], 1, 3 )

is stored in memory as [ 1, 1, 1, 2, 2, 2, 3, 3, 3 ]. The output of:

spread( [1,2,3], 2, 3 )

however, is stored as [ 1, 2, 3, 1, 2, 3, 1, 2, 3 ]. Both are 2D arrays with a shape of [ 3, 3]; and the difference is clear. Here we take the second choice, and replicate the input vector (that is, a 1D array) across the second dimension.

We now use the *eoshift* intrinsic function to shift the individual rows of this 3×3 array to the right, filling the holes left behind with the value provided by *eoshift*‘s third (and final) argument, 1; aka. the multiplicative unit. The second argument is a vector indicating the degree of shifting required for each row. As the default shift direction is to the *left*, we are looking for a vector, [ -2, -1, 0 ]. It may be constructed using the following *implied-do* expression:

[(i,i=-(size([1,2,3])-1),0)]

So, obtaining the length of the input using the *size* intrinsic, we have built a new 1D array containing the arithmetic progression from *minus* one less than the length, up to zero.

Returning to *eoshift*, the following expression:

eoshift( spread( [1,2,3] ,2, size([1,2,3]) ), & [(i,i=-(size([1,2,3])-1),0)], & 1)

evaluates to a 2D array with shape [ 3, 3 ]; its contents in memory are: [ 1, 1, 1, 1, 1, 2, 1, 2, 3 ].

Fittingly, the *product* reduction intrinsic can at last be employed. As well as operating on a vector, *product* can be applied to all vectors aligned with the axis specified by its second (and optional) argument. In this case that is axis #1, and so final result can be obtained by applying the *product* intrinsic to the result of the *eoshift* expression from above:

product( eoshift( spread( [1,2,3] ,2, size([1,2,3]) ), & [(i,i=-(size([1,2,3])-1),0)], & 1), & 1)

## Wrapping up as a Function

Rather than crafting such an expression inline around each 1D array expression we wish the scanned product of, we can at least create a function; *scanl_product*. The *execution-part* of this function is basically the *product* expression above, with a 1D array variable abstraction *v* replacing the literal [ 1, 2, 3 ]:

function scanl_product(v) result(vout) integer, intent(in) :: v(:) integer :: i, vout(size(v)) vout = product( eoshift( spread( v, 2, size(v) ), & [(i,i=-(size(v)-1),0)], & 1), & 1) end function

## Fragrant Fortran

### February 23, 2010

Fortran is a programming language like C, with more capital letters, and less pointers. Since around 1990, the Fortran standard has also supported arrays as first class language elements. Such array features, provided by Fortran 90, Fortran 95, and Fortran 2003, are reminiscent of Kenneth Iverson’s seminal APL (**A** **P**rogramming **L**anguage), and J; though less developed.

Fortran 9x includes many useful and essential procedures for manipulating arrays. For example, the function *shape* takes one array argument (of any dimensionality) and return a one-dimensional array (often referred to as a vector) which lists the extent/size of each dimension. The length of this vector is therefore equal to the dimensionality of the argument to the *shape* function.

Tens of such array functions are specified by the Fortran standard, and included in essentially every Fortran compiler. Along with the more numerous scalar functions, these are collectively referred to as the Fortran intrinsics.

Fortran array intrinsics are often defined for arrays of arbitrary dimensionality (rank). However, the end user is made to work hard to replicate such genericity. A user-defined function accepting or returning arrays must specify a fixed dimensionality. Consequently, Fortran’s expression language is massively more concise than its procedure language. Hence,

m = sum(a) / size(a)

will assign the mean (excepting zero-sized arrays, and other desiderata) of input array *a*, to *m*, regardless of the rank of the *a*. A function to perform a similar action on a *one-dimensional* array could look like this:

function mean1(a) result(m) real, intent(in) :: a(:) real :: m m = sum(a) / size(a) end function

which could be called with:

m = mean1(a)

We would need another for an array of rank 2:

function mean2(a) result(m) real, intent(in) :: a(:,:) ! Extra colon ! real :: m m = sum(a) / size(a) end function

The two versions of mean differ only by the rank in array *a*‘s declaration. Also, apart from such “boilerplate” declarations, the core of our function is lexically the same as the earlier expression; “m = sum(a) / size(a)”. At least Fortran arrays have a maximum rank of seven, so we would “only” need seven versions of this *mean* function. Tragically however, if a function takes *n* array arguments, we will need to write *7^n* functions.

One ray of light comes in the form of Fortran’s interface statement. The interface statement allows a user to access multiple procedures via a single name; similar to C++ overloading. For example, the following interface statement allows a call to *mean* to be instantiated appropriately, determined by the types of its arguments.

interface mean module procedure mean1, mean2 end interface mean

It’s worth mentioning that Professor Martin Erwig’s Parametric Fortran project provides one solution to this convoluted aspect of Fortran.

Finally, note also that alongside array expressions, 1990 also eschewed ye olde brutal and capitalist “FORTRAN” in favour of the effervescent stateswoman we know and love as “Fortran” today.