Another post.

diku-dk · Oct 13, 2023 · 3a12d23
1 parent 056ad5d
commit 3a12d23
Showing 1 changed file with 313 additions and 0 deletions.
diff --git a/blog/2023-10-13-smlfut.md b/blog/2023-10-13-smlfut.md
@@ -0,0 +1,313 @@
+---
+title: Bridging Futhark and SML
+description: Connecting Futhark to the industrial Standard ML language will surely bring us into the mainstream.
+---
+
+Futhark is not a general-purpose language, and Futhark programs are
+typically used as libraries from other languages. This mainly happens
+[through a C-based
+API](https://futhark-lang.org/blog/2017-09-26-calling-futhark-from-c-and-haskell.html),
+but of course some programmers do not like writing their application
+code in C. Fortunately, due to C's popularity, most languages have
+ways of invoking code written in C. The [raw C
+API](https://futhark.readthedocs.io/en/latest/c-api.html) exposed by a
+compiled Futhark program is not particularly convenient by modern
+standards, so it is usually a good idea to write a bit of glue code to
+wrap the C types in more ergonomic high-level types. Construction of
+such glue code can be automated, and we call such a generator a
+[bridge](https://futhark-lang.org/docs.html#bridges). Bridges already
+exist for languages such as
+[Haskell](https://gitlab.com/Gusten_Isfeldt/futhask),
+[Python](https://github.com/pepijndevos/futhark-pycffi/), Rust
+([three](https://github.com/Erk-/genfut)
+[of](https://github.com/zshipko/futhark-bindgen)
+[them](https://github.com/luleyleo/cargo-futhark)), and
+[OCaml](https://github.com/zshipko/futhark-bindgen). All of these were
+written by people not directly involved with the Futhark compiler
+itself. In this post I will discuss the construction of a new bridge,
+written by myself, for the [Standard ML](https://smlfamily.github.io/)
+(SML) language. It may be interesting to people who also need to worry
+about language interoperability.
+
+## Standard ML
+
+SML is a functional language stretching back to the 70s, and was one
+of the main drivers behind language features that are now common, such
+as Hindley-Milner type inference. It was also arguably one of the
+first statically typed functional languages that were generally
+usable, with several production-grade compilers available since the
+early 90s. SML is also unusual for being perhaps the *only* industrial
+programming language with [a formal
+definition](https://github.com/SMLFamily/The-Definition-of-Standard-ML-Revised).
+
+Today, SML is nearly dead. The reasons why languages such as OCaml and
+Haskell managed to overtake SML in popularity are interesting, but a
+different topic. However, *nearly dead* is not dead, and not only are
+multiple high-quality SML compilers still maintained
+([MLton](http://mlton.org/), [MLKit](https://elsman.com/mlkit/),
+[Poly/ML](https://www.polyml.org/),
+[SML#](https://smlsharp.github.io/en/),
+[SML/NJ](https://www.smlnj.org/), and more), [interesting applications
+written in SML](https://isabelle.in.tum.de/) are also still around,
+and [cutting edge research on language
+implementation](https://github.com/MPLLang/mpl) is still being
+conducted using SML compilers. Personally, SML was the first language
+I was taught at [DIKU](https://di.ku.dk/), and although I dismissed it
+at the time when compared to Common Lisp and Haskell, I eventually
+started to appreciate its simplicity. Finally, I work about 10 metres
+from the maintainers of two *different* SML implementations (MLKit and
+[MosML](https://mosml.org/)). Despite its obscurity, SML is
+over-represented in my daily life and research.
+
+## Bridge construction
+
+Futhark's C API is designed to be simple to use not just from C, but
+also from languages that have fairly crude facilities for invoking C.
+We call the language calling Futhark-generated C code the *bridge
+language*. In particular, all types Futhark exposes are *fixed size*:
+either by being standard primitive types (e.g. `int32_t`), or by being
+pointers. Users of the API *never* have to allocate memory on the C
+heap or use `sizeof` or similar. For example, constructing a
+configuration object is done with a function with the type
+
+```C
+struct futhark_context_config *futhark_context_config_new(void);
+```
+
+where `struct futhark_context_config` is an opaque struct. This means
+that you cannot use the common performance trick of allowing the
+caller to allocate the memory in advance, but in practice most Futhark
+functions do so much internal work that this would not be a meaningful
+performance advantage. Not all languages can easily interoperate with
+unpredictably sized C structs, but most of them can figure out how to
+pass pointers around.
+
+Some functions *do* require the caller to allocate memory in advance,
+such as the function that copies data from a Futhark array (which may
+reside on the GPU) to some location in memory. We call this a `values`
+function, and for arrays of type `[]f64` it might look like this:
+
+```C
+int futhark_values_f64_1d(struct futhark_context *ctx, struct futhark_f64_1d *arr, double *data);
+```
+
+The `data` argument must point to some place in memory with enough
+room for the entire array. The size of the array is obtained with a
+different API function and multiplied by the element size, which is
+statically predictable. There are two subtleties here:
+
+1. While most languages allow you to create an empty array of some
+   size, passing this array as a pointer to a C function is not
+   necessarily straightforward. This is only safe if the language
+   guarantees that the in-memory representation of the array is
+   "unboxed", meaning it corresponds to what C expects. If the
+   language does not guarantee this, you need to allocate raw memory,
+   ask Futhark to copy the array to that location, and then copy the
+   elements into the bridge language array, which can be quite slow.
+
+2. The `values` function is allowed to copy *asynchronously*, meaning
+   the copy might still be physically ongoing when the function
+   returns. This allows overlapping copies with other work, which can
+   be important for performance. Unfortunately, many of the potential
+   bridge languages use *garbage collection*, and most garbage
+   collectors do not guarantee that objects do not move around in
+   memory. If an array was moved whilst Futhark was copying data into
+   it, the results would likely be disastrous. Some languages allow
+   one to allocate non-movable "pinned" memory, but unless great care
+   is taken, it is best to generate glue code that performs a full
+   synchronisation after calling the `values` function.
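+
+The two points above can be made concrete with a small sketch of what
+glue code on the SML side might look like when using MLton's FFI. This
+is an illustration, not the code that `smlfut` generates: the wrapper
+name `values` and the `Error` exception are hypothetical, the C
+functions are the `i32` counterparts of the one shown above, and the
+sketch assumes that MLton passes an `Int32.int array` to C as a plain
+pointer to unboxed elements. It only builds when linked against the C
+code produced by the Futhark compiler.
+
+```SML
+(* Sketch for MLton; compile with -default-ann 'allowFFI true' and link
+   against the Futhark-generated C code. *)
+type ptr = MLton.Pointer.t
+
+val c_shape =
+  _import "futhark_shape_i32_1d" public : ptr * ptr -> ptr;
+val c_values =
+  _import "futhark_values_i32_1d" public : ptr * ptr * Int32.int array -> Int32.int;
+val c_sync =
+  _import "futhark_context_sync" public : ptr -> Int32.int;
+
+(* Hypothetical exception for this sketch. *)
+exception Error of string
+
+(* Hypothetical wrapper: copy a one-dimensional Futhark array into a
+   freshly allocated SML array. *)
+fun values (ctx: ptr, arr: ptr) : Int32.int array =
+  let
+    (* The shape is a C array of int64_t; element 0 is the outer size. *)
+    val n = Int64.toInt (MLton.Pointer.getInt64 (c_shape (ctx, arr), 0))
+    val out = Array.array (n, 0 : Int32.int)
+  in
+    if c_values (ctx, arr, out) <> 0 then raise Error "values failed" else ();
+    (* The copy may be asynchronous, so wait for it to finish before
+       returning control to SML, where the collector may move `out`. *)
+    if c_sync ctx <> 0 then raise Error "sync failed" else ();
+    out
+  end
+```
+
+Synchronising on every call gives up the chance to overlap the copy
+with other work, but it keeps the wrapper safe without relying on
+pinned memory.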
+
+Another minor detail, which turns out to be a big convenience, is
+that the Futhark compiler will emit a
+[manifest](https://futhark.readthedocs.io/en/latest/c-api.html#manifest)
+(a JSON file) that describes the entire generated API in a
+machine-readable way. This is useful because the set of available
+functions depends on which entry points are defined in the Futhark
+program and which types they expose. Prior to adding the manifest,
+bridges had to analyse the generated C header file, which is awkward
+and error-prone.
+
+Futhark's C API reports errors via return codes, following the usual
+0-means-success convention. This is notoriously error-prone and
+tedious to handle in C, but at least it is very easy to wrap in
+whichever error handling facility is customary in the bridge language,
+such as exceptions.
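+
+As a minimal sketch of such wrapping in SML, the helper below turns a
+nonzero return code into an exception. The names `Error` and `check`
+are hypothetical, and `getError` is a stand-in for whatever fetches
+the error message from the Futhark context (the C API offers
+`futhark_context_get_error` for this purpose).
+
+```SML
+(* Hypothetical exception carrying the error message. *)
+exception Error of string
+
+(* Raise if a C API call returned a nonzero code; `getError` is only
+   invoked on failure. *)
+fun check (getError: unit -> string) (code: Int32.int) : unit =
+  if code = 0 then () else raise Error (getError ())
+
+(* Demonstration with a fake failing call that returned 1. *)
+val msg =
+  (check (fn () => "out of memory") 1; "ok")
+  handle Error s => "failed: " ^ s
+```
+
+Every generated wrapper can then route its return code through
+`check`, so callers never see raw integers.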
+
+Memory management is a trickier business. Futhark does not allow
+the construction of circular data structures and thus internally uses
+reference counting, but it has no way of knowing when the user is done
+with some piece of data. Therefore the user is responsible for
+eventually freeing all data returned by Futhark, using a
+Futhark-provided function (that in practice just decrements a
+reference count). While C programmers are famous for *never*
+making mistakes when doing manual memory management, programmers of
+higher-level languages are less perfect, and therefore it is
+advisable for a bridge to hook into the automatic memory management
+(reference counting,
+[finalizers](https://en.wikipedia.org/wiki/Finalizer)) of the bridge
+language.
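+
+For an SML implementation that offers finalizers, such a hook could be
+sketched as follows. The sketch is MLton-specific, since it relies on
+the `MLton.Finalizable` structure; the helper names `managed` and
+`withRaw` are hypothetical, and `free` stands for whichever
+Futhark-provided function releases the handle.
+
+```SML
+(* MLton-only sketch. *)
+structure F = MLton.Finalizable
+
+(* Wrap a raw handle so that `free` is called at some point after the
+   wrapper becomes unreachable. *)
+fun managed (free: 'a -> unit) (raw: 'a) : 'a F.t =
+  let
+    val m = F.new raw
+  in
+    F.addFinalizer (m, free);
+    m
+  end
+
+(* Access the raw handle; it is kept alive for the duration of `f`. *)
+fun withRaw (m: 'a F.t) (f: 'a -> 'b) : 'b =
+  F.withValue (m, f)
+```
+
+One design wrinkle is ordering: Futhark values should be freed before
+the context they belong to. `MLton.Finalizable` also provides
+`finalizeBefore`, which looks like the natural tool for expressing
+that constraint, although I have not shown it here.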
+
+## Implementation of `smlfut`
+
+The Futhark/SML-bridge has been implemented as a program
+[smlfut](https://github.com/diku-dk/smlfut). You pass it a Futhark
+manifest file and it will spit out SML code as well as a bit of glue C
+code. I had hoped to avoid the need for generating C, but I needed to
+do a few (uninteresting) things that could not be expressed directly
+in SML.
+
+SML has a [very nice module
+system](https://jozefg.bitbucket.io/posts/2015-01-08-modules.html)
+that allows one to write a *signature* that abstractly describes the
+interface implemented by a module. `Smlfut` makes good use of this.
+For example, the Futhark program
+
+```Futhark
+entry inc (xs: []i32) = map (+2) xs
+```
+
+currently gives rise to this collection of SML signatures:
+
+```SML
+signature FUTHARK_POLY_ARRAY =
+sig
+  type array
+  type ctx
+  type shape
+  type elem
+  val new: ctx -> elem ArraySlice.slice -> shape -> array
+  val free: array -> unit
+  val shape: array -> shape
+  val values: array -> elem Array.array
+  val values_into: array -> elem ArraySlice.slice -> unit
+end
+
+signature FUTHARK_OPAQUE =
+sig
+  type t
+  type ctx
+  val free : t -> unit
+end
+
+signature FUTHARK_RECORD =
+sig
+  include FUTHARK_OPAQUE
+  type record
+  val values : t -> record
+  val new : ctx -> record -> t
+end
+
+signature FUTHARK = sig
+  val backend : string
+  val version : string
+
+  type ctx
+  exception error of string
+  type cfg = {logging:bool, debugging:bool, profiling:bool, cache:string option}
+
+  structure Config : sig
+    val default : cfg
+    val logging : bool -> cfg -> cfg
+    val debugging : bool -> cfg -> cfg
+    val profiling : bool -> cfg -> cfg
+    val cache : string option -> cfg -> cfg
+  end
+
+  structure Context : sig
+    val new : cfg -> ctx
+    val free : ctx -> unit
+    val sync : ctx -> unit
+  end
+
+  (* []i32 *)
+  structure Int32Array1 : FUTHARK_POLY_ARRAY
+  where type ctx = ctx
+    and type shape = int
+    and type elem = Int32.int
+
+  structure Opaque : sig
+  end
+
+  structure Entry : sig
+    val inc : ctx -> Int32Array1.array -> Int32Array1.array
+  end
+end
+```
+
+The particularly interested can [read the fine
+manual](https://github.com/diku-dk/smlfut/releases/download/latest/smlfut.pdf),
+but the most interesting thing is that Futhark arrays must be
+associated with some SML type, here the built-in polymorphic `array`
+type, such that Futhark arrays can be converted to SML-comprehensible
+data. More on that design decision in a bit.
+
+It is also interesting (for people who find that sort of thing
+interesting) that identifiers from the Futhark program can influence
+identifiers in the SML program; in this case, the name of the entry
+point `inc`. What happens if a Futhark entry point has a name that is
+not a valid SML identifier? For `smlfut`, I have decided to emit an
+error. Users can always pick better public names, and this is better
+than coming up with complicated escaping rules. (This rule of course
+does not apply to names that are not visible in the API.)
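+
+A check of this kind is easy to write. The sketch below is not the
+code used by `smlfut`; `isValidName` is a hypothetical helper that
+accepts only alphanumeric SML identifiers (a letter followed by
+letters, digits, underscores, or primes) that are not reserved words.
+
+```SML
+(* Reserved words of the SML core and module languages. *)
+val reserved =
+  [ "abstype", "and", "andalso", "as", "case", "datatype", "do", "else"
+  , "end", "eqtype", "exception", "fn", "fun", "functor", "handle", "if"
+  , "in", "include", "infix", "infixr", "let", "local", "nonfix", "of"
+  , "op", "open", "orelse", "raise", "rec", "sharing", "sig", "signature"
+  , "struct", "structure", "then", "type", "val", "where", "while", "with"
+  , "withtype"
+  ]
+
+(* Hypothetical helper: can `s` be used as an SML value identifier? *)
+fun isValidName (s: string) : bool =
+  case String.explode s of
+    [] => false
+  | c :: cs =>
+      Char.isAlpha c
+      andalso
+      List.all (fn d => Char.isAlphaNum d orelse d = #"_" orelse d = #"'") cs
+      andalso not (List.exists (fn r => r = s) reserved)
+
+val okInc = isValidName "inc"          (* accepted *)
+val okUnderscore = isValidName "_inc"  (* rejected: starts with underscore *)
+val okVal = isValidName "val"          (* rejected: reserved word *)
+```
+
+Futhark allows names such as `_inc`, and a Futhark programmer is free
+to call an entry point `val`, so both cases can arise in practice.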
+
+### Compiler support
+
+Since SML has so many compilers in use, my initial plan was to support
+several of them. However, due to unexpected technical friction, only
+MLton (and its fork [MPL](https://github.com/MPLLang/)) is currently
+supported. The main obstacle is that while the [SML Basis
+Library](https://smlfamily.github.io/Basis/top-level-chapter.html)
+defines an interface for [monomorphic
+arrays](https://smlfamily.github.io/Basis/mono-array.html), which I
+would expect implies an unboxed representation, only MLton actually
+seems to implement this interface for all necessary primitive types.
+For example, MLKit (generally my favourite SML compiler) only
+implements monomorphic arrays for the types `Char8` (`uint8_t`) and
+`Real` (`double`). SML/NJ, which generally has good support for the
+Basis Library (they invented it), also has a rather anemic selection
+of monomorphic arrays. This is ironic because in MLton, *polymorphic
+arrays also have an unboxed monomorphic representation*, so I don't
+actually need to use monomorphic array types in order to access their
+elements from C. Since `smlfut` already has a command line option for
+switching between monomorphic and polymorphic array interfaces, I am
+considering adding a switch that allows one to copy Futhark arrays
+into `Char8` monomorphic arrays, and leave it up to the user to
+interpret the raw bytes as proper numbers.
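+
+To give an idea of what that interpretation would involve, here is a
+sketch that decodes the `i`th 32-bit integer from a byte array. It is
+purely illustrative, as no such option exists yet: the helper name
+`int32At` is hypothetical, a `Word8Array.array` is assumed as the byte
+container, and the bytes are assumed to be in little-endian order.
+
+```SML
+(* Hypothetical helper: read element `i` of an i32 array stored as raw
+   little-endian bytes. *)
+fun int32At (bytes: Word8Array.array, i: int) : Int32.int =
+  let
+    (* Byte k of element i, widened to 32 bits and shifted into place. *)
+    fun part k =
+      Word32.<< (Word32.fromLarge (Word8.toLarge (Word8Array.sub (bytes, 4 * i + k))),
+                 Word.fromInt (8 * k))
+    val w = List.foldl Word32.orb 0w0 (List.map part [0, 1, 2, 3])
+  in
+    (* Sign-extend, so that 0xFFFFFFFF becomes ~1 rather than overflowing. *)
+    Int32.fromLarge (Word32.toLargeIntX w)
+  end
+
+(* Two elements: 2 and ~1. *)
+val bytes =
+  Word8Array.fromList [0w2, 0w0, 0w0, 0w0, 0wxFF, 0wxFF, 0wxFF, 0wxFF]
+val first = int32At (bytes, 0)
+val second = int32At (bytes, 1)
+```
+
+This is workable, but doing it for every element access is clearly
+less pleasant than receiving a properly typed array.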
+
+While [MLton's FFI](http://mlton.org/ForeignFunctionInterface) is
+somewhat basic, the Futhark C API was designed to accommodate this, and
+I did not encounter any real obstacles. I did not get very far with
+the MLKit implementation, but this was due to array type challenges,
+not the FFI.
+
+### Resource management
+
+While MLton *does* support finalizers, neither MLKit nor MPL does. As
+a result, `smlfut` requires manual resource management by the user.
+There is no nice way to put it: this is a *major* downside. I hope to
+come up with a solution somehow, but beyond simply using finalizers
+*when available*, I'm not sure what to do.
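+
+In the meantime, users can at least make manual management robust
+against exceptions with a small scoped helper. This is a sketch of a
+common pattern in plain SML, not something `smlfut` provides; the name
+`bracket` is hypothetical.
+
+```SML
+(* Acquire a resource, use it, and release it even if `use` raises. *)
+fun bracket (acquire: unit -> 'a) (release: 'a -> unit) (use: 'a -> 'b) : 'b =
+  let
+    val r = acquire ()
+    val result = use r handle e => (release r; raise e)
+  in
+    release r;
+    result
+  end
+
+(* Demonstration with a stand-in resource that counts releases. *)
+val released = ref 0
+val answer =
+  bracket (fn () => 21) (fn _ => released := !released + 1) (fn x => 2 * x)
+```
+
+With the generated module, `acquire` would be something like
+`fn () => Futhark.Context.new Futhark.Config.default` and `release`
+would be `Futhark.Context.free`. The result of `use` must of course
+not refer to the released resource.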
+
+### What does it feel like to `smlfut`?
+
+It feels good, if somewhat verbose. This is a small chunk of SML code
+that makes use of the Futhark program above:
+
+```SML
+val ctx =
+  Futhark.Context.new Futhark.Config.default
+val arr_in =
+  Futhark.Int32Array1.new ctx (ArraySlice.full (Array.fromList [1, 2, 3])) 3
+val arr_out =
+  Futhark.Entry.inc ctx arr_in
+val arr_sml =
+  Futhark.Int32Array1.values arr_out
+val () =
+  Futhark.Int32Array1.free arr_in
+val () =
+  Futhark.Int32Array1.free arr_out
+val () =
+  Futhark.Context.free ctx
+```
+
+And in particular, it has already proven useful for some work I've
+been invited to participate in. More on that in the future, hopefully.