When most programmers think of module systems, they think of rather utilitarian systems for namespace control and splitting programs across multiple files. And in most languages, the module system is indeed little more than this. But in Futhark, we have adopted an ML-style higher-order module system that permits abstraction over modules :cite:`Elsman:2018:SIH:3243631.3236792`. The module system is not just a method for organising Futhark programs, it is also a powerful facility for writing generic code. Most importantly, all module language constructs are eliminated from the program at compile time, using a technique called static interpretation :cite:`elsman99,Annenkov:phdthesis`. As a consequence, from a programmer's perspective, there is no overhead involved with making use of module language features.
Each source file is implicitly a module, but we can also define modules inside a file via the module language. This means we are actually defining nested modules - nested inside the module defined by the file itself. To understand how modules work, it is useful to ignore their relation to files at first - in contrast to most other languages, it is mostly incidental, as files are not the foundation of the Futhark module system.
At the most basic level, a module (called a structure in Standard ML) is a collection of declarations:
module add_i32 = { type t = i32 def add (x: t) (y: t): t = x + y def zero: t = 0 }
Declarations are value bindings, type bindings, module bindings, and a few other things that are allowed to occur at the top level.
After the module binding above, add_i32.t
is an alias for the type
i32
, and add_i32.add
is a function that adds two values of
type i32
. The only peculiar thing about this notation is the equal
sign before the opening brace. The declaration above is actually a
combination of a module binding
module add_i32 = ...
and a module expression
{ type t = i32 def add (x: t) (y: t): t = x + y def zero: t = 0 }
In this case, the module expression encapsulates a number of declarations enclosed in curly braces. In general, as the name suggests, a module expression is an expression that returns a module. A module expression is syntactically and conceptually distinct from a regular value expression, but serves much the same purpose. The module language is designed such that evaluation of a module expression can always be done at compile time.
Apart from a sequence of declarations, a module expression can also be merely the name of another module
module foo = add_i32
Now every name defined in add_i32
is also available in the module
foo
. At compile-time, only a single version of the add
function is defined, so there is no overhead involved.
As a starting point, every name defined by a declaration inside of a
module will be visible outside that module. We can make a declaration
invisible to users of the module by prefixing it with local
:
module m = { local def helper x = x + 2 def f x = helper (helper x) }
In this contrived example, m.f
will be visible, but m.helper
will not. Do not use local
to hide the definitions of types - it
will not work. In :numref:`module-types` we'll see facilities for
making types abstract.
To make the names of a module available without having to prefix the
module name, you can use the open
declaration. For example, after
the definition above, we can use open m
to make the function f
available in the rest of the current module and to users of the
current module. This is an important but somewhat subtle detail:
module m2 = { open m }
This makes m2.f
available because m
exposes a binding f
.
If you don't want this behaviour, use local open
.
While Futhark's module system is not file oriented, there is still a close interaction. You can access code in other files as follows:
import "module"
The above declaration will include all non-local
top-level
definitions from module.fut
and make them available in the current
module, but will not make them available to users of the module.
The .fut
extension is implied.
You can also include files from subdirectories::
import "path/to/a/file"
The above will include the file path/to/a/file.fut
relative to the
including file.
If we are defining a top-level function (or any other top-level
construct) that we do not want to be visible outside the current file,
we can prefix it with local
:
local def i_am_hidden x = x + 2
The above uses import
as a declaration. We can also use it as a
module expression. This makes qualified imports possible:
module M = import "module"
In fact, a plain import "module"
declaration is equivalent to:
local open import "module"
This declaration opens "module"
in the current file, but does not
propagate its contents to modules that in turn import
the current
file. If we wished to re-export names from another file, we would
say:
open import "module"
What we have seen so far is nothing more than a simple namespace mechanism. The ML module system only becomes truly powerful once we introduce module types and parametric modules (in Standard ML, these are called signatures and functors).
A module type is the counterpart to a value type. It describes which
names are defined, and as what. We can define a module type that
describes add_i32
:
module type i32_adder = { type t = i32 val add : t -> t -> t val zero : t }
As with modules, we have the notion of a module type expression. In this case, the module type expression is a sequence of specifications enclosed in curly braces. A specification specifies how a name must be defined: as a value (including functions) of some type, as a type abbreviation, or as an abstract type (which we will return to later).
We can assert that some module implements a specific module type via a module type ascription:
module foo = add_i32 : i32_adder
Syntactic sugar lets us move the module type to the left of the equal sign:
module add_i32: i32_adder = { ... }
When we are ascribing a module with a module type, the module type functions as a filter, removing anything not explicitly mentioned in the module type:
module bar = add_i32 : { type t = i32 val zero : t }
An attempt to access bar.add
will result in a compilation error,
as the ascription has hidden it. This is known as an opaque
ascription, because it obscures anything not explicitly mentioned in
the module type. The module system in Standard ML supports both opaque
and transparent ascription, but in Futhark we support only opaque
ascription. This example also demonstrates the use of an anonymous
module type. Module types are structural (just like value types), and
are named only for convenience.
We can use type ascription with abstract types to hide the definition of a type from the users of a module:
module speeds: { type thing val car : thing val plane : thing val futhark : thing val speed : thing -> i32 } = { type thing = i32 def car: thing = 0 def plane: thing = 1 def futhark: thing = 2 def speed (x: thing): i32 = if x == car then 120 else if x == plane then 800 else if x == futhark then 10001 else 0 -- will never happen }
The (anonymous) module type asserts that a distinct type thing
must exist, but does not mention its definition. There is no way for a
user of the speeds
module to do anything with a value of type
speeds.thing
apart from passing it to speeds.speed
. The
definition is entirely abstract. Furthermore, no values of type
speeds.thing
exists except those that are created by the speeds
module.
It is a common pattern to define generic modules with abstract types, that are then specialised or refined with concrete types. For example, consider a module type describing monoids:
module type monoid = { type t val add : t -> t -> t val zero : t }
This module type specifies the presence of an abstract type t
,
as well as a function operating on values of type t
. But in many
cases when we define modules implementing this module type, we don't
want the module to operate on an abstract type. Consider if we did
this:
module monoid_i32 = add_i32 : monoid
Now the function monoid_i32.add
operates on some abstract type
t
, rather than on i32
, and the only value of that type that we
can access is monoid_i32.zero
. This is not particularly useful.
Instead, what we want to state is that the module implements
monoid
, but specifically for the case where t
is i32
.
This can be done with a module type refinement:
module monoid_i32 = add_i32 : monoid with t = i32
Here, monoid with t = i32
is a module type expression that
produces another module type. In this case, the resulting module type
is equivalent to monoid
, but with t
replaced with i32
everywhere. We can also bind the resulting module type to a name if
we wish:
module type i32_monoid = monoid with t = i32
This is completely equivalent to writing out the module type in full:
module type i32_monoid = { type t = i32 val add : i32 -> i32 -> i32 val zero : i32 }
As with all other Futhark types, module types are completely identified by their structure, not their names. Binding module types to names is done only for brevity. This makes them dissimilar to the "interfaces" of most other programming languages, which are identified by specific names.
While module types serve some purpose for namespace control and
abstraction, their most interesting use is in the definition of
parametric modules. A parametric module is conceptually equivalent to
a function. Where a function takes a value as input and produces a
value, a parametric module takes a module and produces a module. For
example, we can define a parametric module that accepts a module
satisfying the monoid
module type given above, and produces a
module containing a function for collapsing an array
module sum (M: monoid) = { def sum (a: []M.t): M.t = reduce M.add M.zero a }
There is an implied assumption here, which is not captured by the type
system: The function add
must be associative and have zero
as
its neutral element. These constraints come from the parallel semantics
of reduce
, and the algebraic concept of a monoid. Notice that in
monoid
, no definition is given of the type t
---we only assert
that there must be some type t
, and that certain operations are
defined for it.
We can use the parametric module sum
as follows:
module sum_i32 = sum add_i32
We can now refer to the function sum_i32.sum
, which has type
[]i32 -> i32
. The type is only abstract inside the definition of the
parametric module. We can instantiate sum
again with another module,
this time an anonymous module:
module prod_f64 = sum { type t = f64 def add (x: f64) (y: f64): f64 = x * y def zero: f64 = 1.0 }
The function prod_f64.sum
has type []f64 -> f64
, and computes
the product of an array of numbers (we should probably have picked a
more generic name than sum
for this function).
Operationally, each application of a parametric module results in its definition being duplicated and references to the module parameter replace by references to the concrete module argument. This is quite similar to how C++ templates are implemented. Indeed, parametric modules can be seen as a simplified variant with no specialisation, and with module types to ensure rigid type checking. In C++, a template is type-checked when it is instantiated, whereas a parametric module is type-checked when it is defined.
Parametric modules, like other modules, can contain more than one
declaration. This feature is useful for giving related functionality a
common abstraction, for example to implement linear algebra operations
that are polymorphic over the type of scalars. The following example
uses an anonymous module type for the module parameter and the
open
declaration for bringing the names from a module into the
current scope:
module linalg(M : { type scalar val zero : scalar val add : scalar -> scalar -> scalar val mul : scalar -> scalar -> scalar }) = { open M def dotprod [n] (xs: [n]scalar) (ys: [n]scalar) : scalar = reduce add zero (map2 mul xs ys) def matmul [n] [p] [m] (xss: [n][p]scalar) (yss: [p][m]scalar) : [n][m]scalar = map (\xs -> map (dotprod xs) (transpose yss)) xss }