
Merlin Language Reference

Selina edited this page Jul 8, 2024 · 5 revisions

ESMira allows you to create some custom behavior through scripts (see the general Scripting page for more info). These scripts are written in Merlin, ESMira's own scripting language. This page will give you an overview of the language, which you can use together with the Examples page to achieve your goal. Also see the Native Functions page for an overview of builtin functions you can use in your scripts.

Script Structure

Merlin is mostly a C-style language, and inherits some of its syntactic features. Most statements are ended by a semicolon (;). Blocks are enclosed by curly braces ({}). Whitespace is mostly irrelevant, so indentation isn't important.
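As a minimal sketch of these rules, the two fragments below should behave identically, since statements and blocks are delimited by semicolons and braces rather than by layout. The variable names are placeholders chosen for illustration.

```merlin
// Conventional layout: one statement per line, block body indented.
x = 3;
if(x > 2) {
    y = x * 2;
    z = y + 1;
}

// The same statements on a single line; indentation carries no meaning.
x = 3; if(x > 2) { y = x * 2; z = y + 1; }
```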

Identifiers

Identifiers (i.e., names of code-variables, functions, etc.) can be any string of upper or lower case alphabetic characters (a to z and A to Z), digits (0 to 9), or the underscore character (_), with the restriction that they may not start with a digit. Identifiers are case-sensitive, meaning that tomato, Tomato and TOMATO would be treated as three different identifiers.
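The following sketch illustrates these naming rules. All names are made up for the example; the commented-out line shows a name that the rules above would reject.

```merlin
tomato = 1;
Tomato = 2;
TOMATO = 3;  // Three distinct variables, because identifiers are case-sensitive.

_total_2 = tomato + Tomato + TOMATO;  // Underscores and non-leading digits are allowed. _total_2 is 6.

// 2nd_total = 0;  // Not a valid identifier: it starts with a digit.
```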

Keywords

Apart from the restrictions mentioned in the section above, certain keywords are reserved as part of the language. The following table lists the keywords used by Merlin:

| Keyword | Description |
| --- | --- |
| if | see if statements |
| elif | see if statements |
| else | see if statements |
| for | see for statements |
| in | see for statements |
| while | see while statements |
| function | declares a function |
| and | logical and operator |
| or | logical or operator |
| return | see return statements |
| init | see initialization blocks |
| object | creates a new object |
| true | convenience literal, equates to 1 |
| false | convenience literal, equates to 0 |
| none | none value literal, see types |

Types

Merlin is dynamically typed and doesn't use type annotations in the code. Internally the language uses the following types.

Numbers

All numbers in Merlin are represented as double precision floating point numbers. Number literals consist of one or multiple digits (0 to 9), optionally followed by a dot (.) and further digits. Note that when numbers without a decimal part are printed or output as strings (e.g., when saved as a questionnaire variable), the decimal point is omitted.

Strings

Strings store text values. A string literal starts with a double quotation mark (") followed by any number of characters up until another double quotation mark is encountered. Merlin also supports multi-line strings.

Strings try to auto-coerce to numbers wherever possible. This means that "5" * "2"; is a valid expression, evaluating to the number 10.
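A short sketch of how these rules play out. The first result is the one stated above; the other two lines rely on the concatenation and addition rules from the Operators table on this page. The variable names are illustrative.

```merlin
product = "5" * "2";      // Both strings convert to numbers, so product is 10.
label = "cat" .. "fish";  // The concatenation operator joins two strings: "catfish".
joined = "cat" + "fish";  // Adding two non-numeric strings also concatenates: "catfish".
```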

Arrays

Arrays can hold multiple other values of any type. Array literals start with an opening bracket ([) followed by any number of values, separated by commas (,), and closed off by a closing bracket (]).

Accessing values in an array also uses brackets. Important note: The index of the first element is 1, not 0. To access elements, follow an array variable with brackets containing an index expression, which is either a number or an array of numbers. If a number is used, the array will return the element at that index. If an array of numbers is used, the array will return a new array containing the elements at all the indices in the index array.

a = [ "cat", "dog", "squirrel" ]; // Creates an array with the elements "cat", "dog", and "squirrel".
a[2] = "bear"; // Sets the second element to "bear". The array now contains "cat", "bear", "squirrel".
small_animals = a[[1, 3]]; // Sets small_animals to an array containing the first and third element of a, i.e., "cat" and "squirrel".
horde_of_squirrels = a[[3, 3, 3, 3]]; // The same index can appear multiple times in an index array, so horde_of_squirrels now contains four elements, all with the value "squirrel".

Arrays allow most of the operations that numbers do, and will try to apply them to all their elements.

a = [1, 2, 3]; // Creates an array with the elements 1, 2, and 3.
b = a * 3; // Creates a new array by multiplying each element of a with 3. b now contains the numbers 3, 6, and 9.

This also works with arrays.

a = [1, 2]; // Creates an array with the elements 1 and 2.
b = [10, 100]; // Creates an array with the elements 10 and 100.
c = a * b; // Creates a new array by multiplying the elements of a and b in order. c now contains the values 10 (1 * 10) and 200 (2 * 100).

If the arrays are of different sizes, the smaller array will wrap around as many times as necessary to cover the bigger array.

a = [1, 2, 3, 4, 5]; // Creates a new array with the elements 1, 2, 3, 4, and 5.
b = [0, 1]; // Creates a new array with the elements 0 and 1.
c = a * b; // Creates a new array by multiplying a and b. c now contains the values 0 (1 * 0), 2 (2 * 1), 0 (3 * 0), 4 (4 * 1), and 0 (5 * 0).

The previous example works by aligning the arrays as follows.

[ 1, 2,   3, 4,   5 ]
[ 0, 1 ][ 0, 1 ][ 0,    <- The smaller array is cut off for the last repetition
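The indexing and wrap-around rules can be combined with other parts of the language. The sketch below assumes that a sequence expression (see the Operators table) can be used directly as an index array, and that addition wraps around in the same way as multiplication. Both follow from the rules on this page but are not shown in its examples. The variable names are illustrative.

```merlin
a = ["cat", "dog", "squirrel", "bear"];
first_three = a[1:3];  // 1:3 creates the array 1, 2, 3, so first_three contains "cat", "dog", "squirrel".
backwards = a[4:1];    // A descending sequence as index: "bear", "squirrel", "dog", "cat".

n = [10, 20, 30];
shifted = n + [1, 2];  // The shorter array wraps around: 11 (10 + 1), 22 (20 + 2), 31 (30 + 1).
```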

Objects

Objects in Merlin are very simple. They are essentially just boxes of named values. The object keyword creates an empty object. There is no syntax for object literals. Once created, an object's fields can be accessed by a dot (.), followed by an identifier.

o = object; // Creates a new empty object.
o.animal = "squirrel"; // Stores the value "squirrel" in the field animal of the object o.
o.one = 1; // Stores the value 1 in the field one.
o.two = 2; // Stores the value 2 in the field two.
o.one + o.two; // Evaluates to 3.

None

Merlin has a none type, similar to null, representing a missing value. Some operations and functions may return none, for example when retrieving a value from a questionnaire item that hasn't been filled out. A none value can also be explicitly created with the none keyword.

Note that none will "swallow" other types in most operations. E.g., 1 + none evaluates to none. Thus, none can safely be used in calculations without causing runtime errors, but the calculation will most likely result in a none value.
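As a minimal sketch, the following shows none propagating through a calculation and being guarded against afterwards. The none value is assigned explicitly here to stand in for an unanswered item; the variable names are illustrative.

```merlin
answer = none;        // Stands in for an item that has not been filled out.
total = 10 + answer;  // none swallows the addition, so total is none.

if(total != none) {
    result = total * 2;
} else {
    result = 0;       // This branch runs, because total is none.
}
```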

Truthiness

Merlin does not have an explicit boolean type. However, in contexts like the condition of an if statement, other types can be evaluated for their truthiness. The following table shows the truthiness rules for each type.

| Type | Description |
| --- | --- |
| Number | A number is truthy when it is non-zero. A number is falsy when it is 0. |
| String | If a string can be converted to a number, the rules for numbers apply. Otherwise, an empty string is falsy, and any other string is truthy. |
| Array | An array is truthy when all of its elements are truthy. Truthiness rules are applied recursively. |
| Object | An empty object is falsy, any other object is truthy. |
| none | none is always falsy. |
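The sketch below walks through one value of each kind and counts how many conditions pass. It is an illustration of the table rather than an exhaustive test, and the variable names are made up.

```merlin
count = 0;
o = object;

if(3) count = count + 1;          // Truthy: a non-zero number.
if("0") count = count + 1;        // Falsy: "0" converts to the number 0.
if("hello") count = count + 1;    // Truthy: a non-empty, non-numeric string.
if("") count = count + 1;         // Falsy: an empty string.
if([1, 2, 0]) count = count + 1;  // Falsy: one of the elements is falsy.
if(o) count = count + 1;          // Falsy: the object has no fields yet.
if(none) count = count + 1;       // Falsy: none is always falsy.

count; // Evaluates to 2.
```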

Operators

The following table shows the available operators, listed from highest to lowest precedence.

| Operator | Description |
| --- | --- |
| ( ) | Grouping. Parentheses group an expression in order to elevate its precedence. |
| x.field, x[index], f() | Object access, array access, and function calls. |
| f() >> g() | Function pipe operator. Inserts the result of f() as the first argument of g(), making it equivalent to g(f()). Multiple functions can be chained with this operator, and will be evaluated left to right. |
| !x, -x | Boolean not and unary negation. As Merlin lacks a boolean type, the boolean not inverts the truthiness of a numeric value by turning any non-zero number into 0, and 0 into 1. |
| x:y, x:y:z | Sequence operator. Creates an array of sequential numbers from x to y (inclusive). By default, the step size is 1, but it can also be specified with z. If the step size is too big for the sequence to end on y, the sequence will end with the next number after y. z must be positive. A descending sequence can be created by specifying a y smaller than x. |
| x * y, x \ y | Multiplication and division. |
| x + y, x - y, x .. y | Addition, subtraction, and concatenation. The concatenation operator can concatenate two strings or two arrays. If the left operand is an array and the right operand is not, the right operand will be appended to the array. Addition of two non-numeric strings also results in concatenation. |
| x > y, x >= y, x < y, x <= y | Comparison. |
| x == y, x != y | Equality. |
| x and y | Logical and. See note on logical and/or and arrays. |
| x or y | Logical or. See note on logical and/or and arrays. |
| x = y | Assignment. |
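A few of these operators are easier to follow with concrete values. The sketch below uses two hypothetical helper functions, times_two and plus_one, defined only for this example. The results in the comments are derived from the descriptions in the table.

```merlin
function times_two(x) { return x * 2; }
function plus_one(x) { return x + 1; }

piped = times_two(3) >> plus_one();  // Equivalent to plus_one(times_two(3)), i.e., 7.

up = 1:5;         // 1, 2, 3, 4, 5
odds = 1:9:2;     // 1, 3, 5, 7, 9
down = 5:1;       // 5, 4, 3, 2, 1
evens = 1:3 * 2;  // The sequence operator binds tighter than *, so this is (1:3) * 2: 2, 4, 6.

pets = ["cat", "dog"] .. "bear";  // Appends a single value: "cat", "dog", "bear".
both = [1, 2] .. [3, 4];          // Joins two arrays: 1, 2, 3, 4.
flipped = !5;                     // Boolean not on a non-zero number: 0.
```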

Note on Logical AND/OR and Arrays

The behavior of the and and or operators depends on the type of the first operand. If the first operand is anything other than an array, that operand will be evaluated for its truthiness, and the operation might short-circuit: if the first operand of an and expression is falsy, it is returned without evaluating the second operand, and similarly, if the first operand of an or expression is truthy, it is returned without evaluating the second operand. Otherwise the second operand is evaluated and returned (even if it evaluates to an array).

However, if the first operand evaluates to an array, the operation will return an array where the and/or operation has been applied to all elements.

none and 3; // This will evaluate to none, as the left operand is falsy and will be returned.
2 and 3; // This will evaluate to 3, as both the left and right operands are truthy so the right one will be returned.
[0, 2] and 3; // This will evaluate to [0, 3], as an and operation is applied to each element.
[0, 2] and [2, 0]; // And operations with arrays for both operands work as well. This will evaluate to [0, 0].
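The or operator mirrors these rules. The results below are derived from the description above rather than taken from ESMira itself, so treat them as a sketch. The last line shows one way short-circuiting might be used to supply a fallback value; demo_item is a placeholder item name.

```merlin
none or 3; // Evaluates to 3, as the left operand is falsy, so the right one is returned.
2 or 3; // Evaluates to 2, as the left operand is truthy and is returned immediately.
[0, 2] or 3; // Should evaluate to [3, 2], as the or operation is applied to each element.

answer = getQuestionnaireVar("demo_item") or 0; // Falls back to 0 if the item has no (truthy) value.
```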

Comments

Code comments start with //. The remaining line after // will be ignored by the script.

If Statements

Code may branch with if statements. These start with an if, followed by a condition expression enclosed in parentheses and a statement that should be executed if the condition is true/truthy. This first branch may be followed by any number of elif branches, which are each also followed by a condition in parentheses and a statement. The statement also may have an optional closing else (without a condition).

if(x == 1)
    y = y + 1; // y gets incremented if x is 1.
elif(x == -1) { // The statement may also be a block statement, allowing multiple statements in a branch.
    y = y - 1;
    z = 0;
} else
    y = 0; // This gets executed if x is neither 1 nor -1.

Loops

Merlin supports two kinds of loops, while and for loops. while loops require a condition. The contained code will be run over and over as long as the condition evaluates as truthy.

x = 10;
while(x > 0) {
    x = x - 1;
}

For loops are based on arrays. They follow the pattern for(variableName in array). The body of the for loop will be executed once for each element in the array. For each iteration, a variable with the specified variable name will be available, set to the corresponding array element.

for(fruit in ["orange", "apricot", "apple"])
    "I'm eating an "..fruit.."."; // Will in turn evaluate to "I'm eating an orange.", "I'm eating an apricot.", and "I'm eating an apple.".

for(i in 1:10){ // Iterating over a range of numbers is also easy with sequence expressions.
    i; // Evaluates in turn to 1, 2, 3, 4, 5, 6, 7, 8, 9, 10.
}

Functions

Functions can be created with the function keyword, according to the following pattern:

function function_name(parameter_1, parameter_2) {
    // function code here
}

Functions can be created in any scope, and they can overload or shadow existing functions, depending on whether a function of the same name exists in the same or an enclosing scope, respectively. A function can return a value with the return keyword. Otherwise it will return when it is fully executed. See Return Values below.
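As a minimal sketch of the pattern, here is a complete function with three parameters and a few calls. The function name clamp_score and its parameters are hypothetical and chosen only for this example.

```merlin
function clamp_score(value, low, high) {
    if(value < low)
        return low;   // Too small: return the lower bound.
    elif(value > high)
        return high;  // Too large: return the upper bound.
    return value;     // Already within the bounds.
}

clamp_score(12, 0, 10); // Returns 10.
clamp_score(-3, 0, 10); // Returns 0.
clamp_score(7, 0, 10);  // Returns 7.
```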

Return Values

This section explains what value is returned at which point. This is also linked to expression statements. If an expression statement, i.e., an expression without any surrounding statement, is evaluated, the resulting value is stored in the current scope. If the scope, i.e., the current function or script as a whole, ends, the stored value is returned.

function f() {
    "apricot"; // This expression sets the implicit return value to the string "apricot".
}

function g() {} // The initial implicit return value is none.

f(); // This will return the string "apricot".
g(); // This will return none.

A return statement can be used with or without an explicit return value. If the return keyword is followed by an expression, the value of that expression will be returned. If the return keyword is used on its own, the current implicit return value will be returned.

function f(x) {
    if(x == 1)
        return; // Returns the initial implicit none return value.
    "apricot"; // Sets the implicit return value to "apricot".
    if(x == 2)
        return; // Returns the implicit return value "apricot".
    elif(x == 3)
        return "orange"; // Returns the explicit return value "orange".
} // If the end of the function is reached, the last implicit return value, "apricot", will be returned.

f(1); // Returns none.
f(2); // Returns "apricot".
f(3); // Returns "orange".
f(4); // Returns "apricot".

Note that the script as a whole may also have a return value that is used by ESMira, e.g., for determining the relevance of a questionnaire item.
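A relevance script might therefore simply end in an expression statement. The sketch below assumes a questionnaire item named age, which is hypothetical, and assumes that ESMira treats a truthy script result as "relevant". How the result is interpreted is not specified on this page.

```merlin
age = getQuestionnaireVar("age"); // "age" is a placeholder item name.

// The last expression statement becomes the implicit return value of the script.
// If age is missing, the left side is falsy and is returned without evaluating the comparison.
age != none and age >= 18;
```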

Globals and the Init Block

Merlin scripts are expected to be executed in a recurring fashion. Every time a user opens a questionnaire in ESMira, the scripts within that questionnaire may be run. In some use-cases it might be desirable to retain calculated values and recall them when a questionnaire is executed again, e.g., to calculate a moving average, or when another questionnaire is executed, e.g., to disable a question that is made irrelevant by information from an initial demographic questionnaire.

To achieve this, ESMira manages a globals object in Merlin's script environment. After a script is executed, ESMira will extract the current state of the globals object, save it, and restore and inject it into the environment of new scripts. This means that initially, there will be an empty globals object in every script. If any fields are set on globals, these values will be present in subsequent scripts. Globals will be persistent between all script executions within the same study.

As mentioned above, globals is initially empty, and trying to access any field would result in a runtime error. However, in many cases you want to store and later retrieve an earlier value. In order to easily do this, you can use an init block. The code within an init block is executed normally, with the exception that assignment within an init block cannot overwrite existing values. Therefore you can use an init block to set an initial value, making sure that it exists for other code to use, without needing to worry that this code might overwrite anything later.

The following example shows how to calculate a cumulative average of an item called demo_item using an init block and the globals object.

init {
    globals.values = []; // If no values field exists on globals, this will create it.
}

new_value = getQuestionnaireVar("demo_item"); // Retrieving the new value from the questionnaire.
if(new_value != none) { // Avoid using missing values for the average.
    globals.values = globals.values..new_value; // The new value is appended to the values field of globals.
}
m = mean(globals.values); // m is set to the mean of the values array.
... // Do whatever you need to do with that mean ...
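
The same pattern works for simpler state. As a smaller sketch, the following counts how often a script has been run; run_count is a hypothetical field name chosen for this example.

```merlin
init {
    globals.run_count = 0; // Only takes effect while run_count does not exist yet.
}

globals.run_count = globals.run_count + 1; // 1 after the first run, 2 after the second, and so on.
```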