-
Notifications
You must be signed in to change notification settings - Fork 54
Insts
Till now, we have assumed that variables may be instantiated or not, but there is more to it.
An "instantiatedness tree" has two types of nodes, and approximates the instantiation state of a term if, for every node:
-
If a node in the instantiatedness tree is "free" then the corresponding node in the term tree is a free variable that does not share with any other variable.
-
If a node in the instantiatedness tree is "bound" then the corresponding node in the term tree is a function symbol, possibly one of several alternatives. The instantiatedness tree does not tell us which function symbol the term is bound to. Instead, for each possible function symbol it tells us exactly which arguments of the function symbol will be free and which will be bound. The same principle applies recursively to these bound arguments.
The mode system provides free
and ground
as names for instantiatedness
trees all of whose nodes are free and bound respectively (with exceptions
for solver type values, which we will not get into).
Examples of instantiatedness trees, or "insts", in concrete Mercury syntax:
-
ground
approximates1
,"foo"
,[a, b, c]
and any other non-unique term without free variables -
bound(1)
approximates the term1
-
bound(1 ; 2 ; 3)
approximates the terms1
,2
and3
-
bound(no ; yes(free))
approximatesno
andyes(V)
where V is a free variable
The last example shows that insts were designed to be capable of expressing partially instantiated data structures (data structures with holes). However, partial instantiation is not supported by the current Mercury compiler.
You can declare names for instantiatedness trees:
:- inst dog_or_cat == bound(dog ; cat).
:- inst list == bound([] ; [ground | list]).
A more readable syntax exists for bound
instantiatedness trees.
The list example could also be written as:
:- inst list
---> []
; [ground | list].
This mirrors the type definition so you can see the corresponding nodes in both trees easily:
:- type list(T)
---> []
; [T | list(T)].
The inst list
is not useful because is equivalent to ground
.
We can omit function symbols to make an inst more precise,
restricting the set of terms that it approximates.
A non-empty list is not bound to []
at the top-level so an inst for
non-empty lists is:
:- inst non_empty_list
---> [ground | ground].
Inst definitions may be parametric. If we have:
:- type pet
---> dog
; cat
; rock.
:- inst dog_or_cat
---> dog
; cat.
:- inst list(Inst)
---> []
; [Inst | list(Inst)].
then a list with inst list(dog_or_cat)
can contain dogs
and cats
,
but not rocks
.
A mode mapping is a mapping from an initial instantiatedness tree to
a final instantiatedness tree. We have used a couple of mode mappings
already without explaining them as such. The in
and out
modes are
defined in the builtin
module:
:- mode in == ground >> ground.
:- mode out == free >> ground.
On the left and right hand sides of the >>
symbol are the initial and
final insts. The mode in
says that the term will be approximated by
ground
before the goal and after the goal. The mode out
says that
the term must be approximated by free
before the goal, but ground
after the goal.
Mode definitions may also be parametric.
The builtin
module defines these modes:
:- mode in(Inst) == Inst >> Inst.
:- mode out(Inst) == free >> Inst.
You can define your own mode mappings, though you usually won't need to.
The compiler tracks the instantiation state of variables in a procedure using insts. Consider this silly example:
:- type pet
---> dog
; cat
; rock.
:- pred choose(int, pet, string).
:- mode choose(in, out, out) is semidet.
choose(Num, Pet, Name) :-
% program point 1
(
Num = 1,
Pet = dog
% program point 2
;
Num = 2,
Pet = cat
% program point 3
),
% program point 4
(
Pet = dog,
Name = "Rex"
;
Pet = cat,
Name = "Tiger"
),
% program point 5
true.
At the start of the procedure, program point 1, the instantiation states
of Num, Pet and Name are approximated by ground
, free
and free
respectively. This comes directly from the definitions of modes in
and out
.
At program point 2, Num = 1
must have succeeded, and Pet will be bound.
The insts approximating their instantiation states will become bound(1)
and bound(dog)
respectively.
On the other hand, the insts for Num and Pet at program point 3 will be
bound(2)
and bound(cat)
.
At program point 4, ground
would approximate the instantiation states
of Num and Pet, but that is unnecessarily imprecise.
Instead, the insts from program point 2 and program point 3 are merged:
the inst for Num becomes bound(1 ; 2)
, and the inst for Pet becomes
bound(dog ; cat)
.
This retains the information that Num can be bound to either 1
or 2
,
and Pet can be bound to either dog
or cat
, but does not capture
the fact that if Num is bound to 1
then Pet must be bound to dog
,
or if Num is bound to 2
then Pet must be bound to cat
.
After program point 4, we have a switch on Pet. The inst for Pet tells
us that Pet can be bound to one of two function symbols and, since both
alternatives are covered, the switch is complete. If the inst for Pet
had been ground
then the switch would be inferred to be incomplete
as it does not cover the function symbol rock
.
At program point 5, the inst for Pet remains bound(dog ; cat)
,
and the inst for Name becomes bound("Rex" ; "Tiger")
.
At the end of the procedure, the compiler checks that the final
instantiation states of the terms in the arguments are indeed approximated
by the corresponding final insts declared for each argument.
bound(1 ; 2)
is approximated by ground
(the final inst of in
) and
bound(dog ; cat)
and bound("Rex" ; "Tiger")
approximated by ground
(the final inst of out
), so all is well.
The head/2
predicate returns the first element of a list:
:- pred head(list(T), T).
:- mode head(in, out) is semidet.
head([X | _], X).
The declared mode is semidet
as head/2
must fail on an empty list.
We can declare another mode that restricts the first argument to non-empty
lists, using the non_empty_list
inst and the in(Inst)
mode presented
previously.
:- mode head(in(non_empty_list), out) is det.
which is the same as:
:- mode head(
non_empty_list >> non_empty_list,
free >> ground
) is det.
which further expands to:
:- mode head(
bound([ground | ground]) >> bound([ground | ground]),
free >> ground
) is det.
This mode can only be called when it is known that the first argument is
a non-empty list. As it will never fail, its determinism can be tightened
to det
, a nice improvement.
We can improve things a little more (but this is going beyond what most
Mercury code does). The output argument of head/2
has final inst
ground
so even if the caller had a more precise inst for the first
argument, when head/2
returns, the caller will get back an element
with inst ground
.
To retain the inst information from the first to the second argument,
we can make use of inst parameters.
First, we need to provide a parameterised definition of non_empty_list
.
:- inst non_empty_list(Inst)
---> [Inst | list(Inst)].
Then we can use that inst in the mode declaration of head/2
.
:- mode head(in(non_empty_list(I)), out(I)) is det.
The inst variable I
approximates the instantiation state of every
element of the input list. head/2
binds the second argument to an
element of the input list so, when it returns, the instantiation state
of the second argument can also be approximated by I
.
There is also an implicit constraint that I
must be compatible with
ground
, e.g. not free or unique. See "Constrained polymorphic modes"
in the reference manual for more information.
In practice, it is hard to maintain precise insts for too long.
Most Mercury predicates and functions, including the standard library,
only use the simple modes in
and out
so you very often end up
with ground
insts. At this time, it is probably impractical to use
polymorphic modes everywhere.
A somewhat common pattern that employs named insts is when breaking up a large predicate into smaller predicates. The top predicate switches on a variable, then each switch arm calls a helper predicate to handle the subset of function symbols that the variable could be bound to in that branch.
:- type foobar
---> foo(a, b, c, d, e)
; bar(f, g).
:- inst foo
---> foo(ground, ground, ground, ground, ground).
:- pred do_foobar(foobar, result).
:- mode do_foobar(in, out) is det.
do_foobar(FooBar, Result) :-
(
FooBar = foo(_, _, _, _, _),
do_foo(FooBar, Result)
;
FooBar = bar(_, _),
do_bar(FooBar, Result)
).
:- pred do_foo(foobar, result).
:- mode do_foo(in(foo), out) is det.
do_foo(Foo, Result) :-
Foo = foo(A, B, C, D, E),
...
Including the more precise inst in the mode declaration allows a helper
predicate like do_foo/2
to consider only a subset of the function
symbols of the argument's type. An alternative would be to pass each of
the foo/5
arguments separately to do_foo/2
.
Insts will also show up in the discussion of our next topic, higher-order programming in Mercury.
:- module insts.
:- interface.
:- import_module io.
:- pred main(io::di, io::uo) is det.
:- implementation.
:- import_module list.
/*
:- type list(T)
---> []
; [T | list(T)].
*/
% Already defined in the list module.
:- inst list(Inst)
---> []
; [Inst | insts.list(Inst)].
% Already defined in the list module.
:- inst non_empty_list
---> [ground | ground].
:- inst non_empty_list(Inst)
---> [Inst | insts.list(Inst)].
:- pred head(list(T), T).
:- mode head(in, out) is semidet.
% :- mode head(in(non_empty_list(I)), out) is det.
:- mode head(in(non_empty_list(I)), out(I)) is det.
head([X | _], X).
:- type pet
---> dog
; cat
; rock.
:- inst dog_or_cat
---> dog
; cat.
main(!IO) :-
% Try this:
% Pets = [],
% Try this:
% Pets = [dog, cat, rock],
Pets = [dog, cat],
head(Pets, Pet),
(
Pet = dog,
write_string("My first pet is a dog.\n", !IO)
;
Pet = cat,
write_string("My first pet is a cat.\n", !IO)
).