diff --git a/blog/2023-10-13-smlfut.md b/blog/2023-10-13-smlfut.md new file mode 100644 index 0000000..f3e615f --- /dev/null +++ b/blog/2023-10-13-smlfut.md @@ -0,0 +1,313 @@ +--- +title: Bridging Futhark and SML +description: Connecting Futhark to the industrial Standard ML language will surely bring us into the mainstream. +--- + +Futhark is not a general purpose language, and Futhark programs are +typically used as libraries from other languages. This mainly happens +[through a C-based +API](https://futhark-lang.org/blog/2017-09-26-calling-futhark-from-c-and-haskell.html), +but of course some programmers do not like writing their application +code in C. Fortunately, due to C's popularity, most languages have +ways of invoking code written in C. The [raw C +API](https://futhark.readthedocs.io/en/latest/c-api.html) exposed by a +compiled Futhark program is not particularly convenient by modern +standards, so it is usually a good idea to write a bit of glue code to +wrap the C types in more ergonomic high level types. Construction of +such glue code can be automated, and we call such a generator a +[bridge](https://futhark-lang.org/docs.html#bridges). Bridges already +exist for languages such as +[Haskell](https://gitlab.com/Gusten_Isfeldt/futhask), +[Python](https://github.com/pepijndevos/futhark-pycffi/), Rust +([three](https://github.com/Erk-/genfut) +[of](https://github.com/zshipko/futhark-bindgen) +[them](https://github.com/luleyleo/cargo-futhark)), and +[OCaml](https://github.com/zshipko/futhark-bindgen). All of these were +written by people not directly involved with the Futhark compiler +itself. In this post I will discuss the construction of a new bridge, +written by myself, for the [Standard ML](https://smlfamily.github.io/) +(SML) language. It may be interesting to people who also need to worry +about language interoperability. + +## Standard ML + +SML is a functional language stretching back to the 70s, and was one +of the main drivers behind language features that are now common, such +as Hindley-Milner type inference. It was also arguably one of the +first statically typed functional languages that were generally +usable; with several production-grade compilers available since the +early 90s. SML is also unusual for being perhaps the *only* industrial +programming language with [a formal +definition](https://github.com/SMLFamily/The-Definition-of-Standard-ML-Revised). + +Today, SML is nearly dead. The reasons why languages such as OCaml and +Haskell managed to overtake SML in popularity are interesting, but a +different topic. However, *nearly dead* is not dead, and not only are +multiple high quality SML compilers still maintained +([MLton](http://mlton.org/), [MLKit](https://elsman.com/mlkit/), +[Poly/ML](https://www.polyml.org/), +[SML#](https://smlsharp.github.io/en/), +[SML/NJ](https://www.smlnj.org/), and more), [interesting applications +written in SML](https://isabelle.in.tum.de/) are also still around, +and [cutting edge research on language +implementation](https://github.com/MPLLang/mpl) is still being +conducted using SML compilers. Personally, SML was the first language +I was taught at [DIKU](https://di.ku.dk/), and although I dismissed it +at the time when compared to Common Lisp and Haskell, I eventually +started to appreciate its simplicity. Finally, I work about 10 metres +from the maintainers of two *different* SML implementations (MLKit and +[MosML](https://mosml.org/)). Despite its obscurity, SML is +over-represented in my daily life and research. + +## Bridge construction + +Futhark's C API is designed to be simple to use not just from C, but +also from languages that have fairly crude facilities for invoking C. +We call the language calling Futhark-generated C code the *bridge +language*. In particular, all types Futhark exposes are *fixed size*: +either by being standard primitive types (e.g. `int32_t`), or by being +pointers. Users of the API *never* have to allocate memory on the C +heap or use `sizeof` or similar. For example, constructing a +configuration object is done with a function with the type + +```C +struct futhark_context_config *futhark_context_config_new(void); +``` + +where `struct futhark_context_config` is an opaque struct. This means +that you cannot use the common performance trick of allowing the +caller to allocate the memory in advance, but in practice most Futhark +functions do so much internal work that this would not be a meaningful +performance advantage. Not all languages can easily interoperate with +unpredictably sized C structs, but most of them can figure out how to +pass pointers around. + +Some functions *do* require the caller to allocate memory in advance, +such as the function that copies data from a Futhark array (which may +reside on the GPU) to some location in memory. We call this a `values` +function, and for arrays of type `[]f64` it might look like this: + +```C +int futhark_values_f64_1d(struct futhark_context *ctx, struct futhark_f64_1d *arr, double *data); +``` + +The `data` argument must point to some place in memory with enough +room for the entire array. The size of the array is obtained with a +different API function and multiplied with the element size, which is +statically predictable. There are two subtleties here: + +1. While most languages allow you to create an empty array of some + size, passing this array as a pointer to a C function is not + necessarily straightforward. This is only safe if the language + guarantees that the in-memory representation of the array is + "unboxed"; meaning it corresponds to what C expects. If the + language does not guarantee this, you need to allocate raw memory, + ask Futhark to copy the array to that location, and then copy the + elements into the bridge language array - which can be quite slow. + +2. The `values` function is allowed to copy *asynchronously*, meaning + the copy might still be physically ongoing when the function + returns. This allows overlapping copies with other work, which can + be important for performance. Unfortunately, many of the potential + bridge languages use *garbage collection*, and most garbage + collectors do not guarantee that objects do not move around in + memory. If an array was moved whilst Futhark was copying data into + it, the results would likely be disastrous. Some languages allow + one to allocate non-movable "pinned" memory, but unless great care + is taken, it is best to generate glue code that performs a full + synchronisation after calling the `values` function. + +Another minor thing, but which turns out to be a big convenience, is +that the Futhark compiler will emit a +[manifest](https://futhark.readthedocs.io/en/latest/c-api.html#manifest) +(a JSON file) that describes the entire generated API in a +machine-readable way. This is useful, as which functions are available +depends on which entry points are defined in the Futhark program, and +which types they expose. Prior to adding the manifest, bridges had to +analyse the generated C header file, which is awkward and error-prone. + +Futhark's C API reports errors via return codes, following the usual +0-means-success convention. This is notoriously error prone and +tedious to handle in C, but at least it is very easy to wrap in +whichever error handling facility is customary in the bridge language, +such as exceptions. + +Memory management is a more tricky business. Futhark does not allow +the construction of circular data structures and thus internally uses +reference counting, but it has no way of knowing when the user is done +with some piece of data. Therefore the user is responsible for +eventually freeing all data returned by Futhark, using a +Futhark-provided function (that in practice just decrements a +reference count). While C programmers are famously known for *never* +making mistakes when doing manual memory management, programmers of +more high level languages are more imperfect, and therefore it is +advisable for a bridge to hook into the automatic memory management +(reference counting, +[finalizers](https://en.wikipedia.org/wiki/Finalizer)) of the bridge +language. + +## Implmentation of `smlfut` + +The Futhark/SML-bridge has been implemented as a program +[smlfut](https://github.com/diku-dk/smlfut). You pass it a Futhark +manifest file and it will spit out SML code as well as a bit of glue C +code. I had hoped to avoid the need for generating C, but I needed to +do a few (uninteresting) things that could not be expressed directly +in SML. + +SML has a [very nice module +system](https://jozefg.bitbucket.io/posts/2015-01-08-modules.html) +that allows one to write a *signature* that abstractly describes the +interface implemented by a module. `Smlfut` makes good use of this. +For example, the Futhark program + +```Futhark +def inc (xs: []i32) = map (+2) xs +``` + +currently gives rise to this collection of SML signatures: + +```SML +signature FUTHARK_POLY_ARRAY = +sig + type array + type ctx + type shape + type elem + val new: ctx -> elem ArraySlice.slice -> shape -> array + val free: array -> unit + val shape: array -> shape + val values: array -> elem Array.array + val values_into: array -> elem ArraySlice.slice -> unit +end + +signature FUTHARK_OPAQUE = +sig + type t + type ctx + val free : t -> unit +end + +signature FUTHARK_RECORD = +sig + include FUTHARK_OPAQUE + type record + val values : t -> record + val new : ctx -> record -> t +end + +signature FUTHARK = sig + val backend : string + val version : string + + type ctx + exception error of string + type cfg = {logging:bool, debugging:bool, profiling:bool, cache:string option} + + structure Config : sig + val default : cfg + val logging : bool -> cfg -> cfg + val debugging : bool -> cfg -> cfg + val profiling : bool -> cfg -> cfg + val cache : string option -> cfg -> cfg + end + + structure Context : sig + val new : cfg -> ctx + val free : ctx -> unit + val sync : ctx -> unit + end + + (* []i32 *) + structure Int32Array1 : FUTHARK_POLY_ARRAY + where type ctx = ctx + and type shape = int + and type elem = Int32.int + + structure Opaque : sig + end + + structure Entry : sig + val inc : ctx -> Int32Array1.array -> Int32Array1.array + end +end +``` + +The particularly interested can [read the fine +manual](https://github.com/diku-dk/smlfut/releases/download/latest/smlfut.pdf), +but the most interesting thing is that Futhark arrays must be +associated with some SML type, here the built-in polymorphic `array` +type, such that Futhark arrays can be converted to SML-comprehensible +data. More on that design decision in a bit. + +It is also interesting (for people who find that sort of thing +interesting) that identifiers from the Futhark program can influence +identifiers in the SML program; in this case, the name of the entry +point `inc`. What happens if a Futhark entry point has a name that is +not a valid SML identifier? For `smlfut`, I have decided to emit an +error. Users can always pick better public names, and this is better +than coming up with complicated escaping rules. (This rule of course +does not apply to names that are not visible in the API.) + +### Compiler support + +Since SML has so many compilers in use, my initial plan was to support +several of them. However, due to unexpected technical friction, only +MLton (and its fork [MPL](https://github.com/MPLLang/)) is currently +supported. The main obstacle is that while the [SML Basis +Library](https://smlfamily.github.io/Basis/top-level-chapter.html) +defines an interface for [monomorphic +arrays](https://smlfamily.github.io/Basis/mono-array.html), which I +would expect implies an unboxed representation, only MLton actually +seems to implement this interface for all necessary primitive types. +For example, MLKit (generally my favourite SML compiler) only +implements monomorphic arrays for the types `Char8` (`uint8_t`) and +`Real` (`double`). SML/NJ, which generally has good support for the +Basis Library (they invented it), also has a rather anemic selection +of monomorphic arrays. This is ironic because in MLton, *polymorphic +arrays also have an unboxed monomorphic representation*, so I don't +actually need to use monomorphic array types in order to access their +elements from C. Since `smlfut` already has a command line option for +switching between monomorphic and polymorphic array interfaces, I am +considering adding a switch that allows one to copy Futhark arrays +into `Char8` monomorphic arrays, and make it to to the user to +interpret the raw bytes as proper numbers. + +While [MLton's FFI](http://mlton.org/ForeignFunctionInterface) is +somewhat basic, the Futhark C API was designed to accomodate this, and +I did not encounter any real obstacles. I did not get very far with +the MLKit implementation, but this was due to array type challenges, +not the FFI. + +### Resource management + +While MLton *does* support finalizers, neither MLKit nor MPL does. As +a result, `smlfut` requires manual resource management by the user. +There is no nice way to put it: this is a *major* downside. I hope to +come up with a solution somehow, but beyond simply using finalizers +*when available*, I'm not sure what to do. + +### What does it feel like to `smlfut`? + +It feels good, if somewhat verbose. This is a small chunk of SML code +that makes use of the Futhark program above: + +```SML +val ctx = + Futhark.Context.new Futhark.Config.default +val arr_in = + Futhark.Int32Array1.new ctx (ArraySlice.full (Array.fromList [1, 2, 3])) 3 +val arr_out = + Futhark.Entry.inc ctx arr_in +val arr_sml = + Futhark.Int32Array1.values arr_out +val () = + Futhark.Int32Array1.free arr_in +val () = + Futhark.Int32Array1.free arr_out +val () = + Futhark.Context.free ctx +``` + +And in particular, it has already proven useful for some work I've +been invited to participate in. More on that in the future, hopefully.