Design the handling of function pointers

Because we cannot arbitrarily set the program counter when compiling via FlatLowered, we have had to come up with a clever way to emulate calling functions via pointers in Hieratika. This PR includes that design, which makes certain assumptions that seem to hold for generic Rust code but may not hold for all LLVM IR. The design is sufficient for an initial implementation and for getting Rust code working, but it may later evolve to better support more complex notions of function pointers.
reilabs · Jan 13, 2025 · f9efacc · f9efacc
1 parent 3af45ce
commit f9efacc
Showing 1 changed file with 73 additions and 3 deletions.
diff --git a/docs/Memory Model.md b/docs/Memory Model.md
@@ -40,7 +40,7 @@ like `atomicrmw`. This memory model design is concerned with the core memory pol
 to be able to allocate memory, both on the heap and on the "stack", while also being able to
 manipulate that memory.
 
-Hieratika defines two polyfills and two _sets of_ polyfills for interacting with memory. The two
+Hieratika define two polyfills and two _sets of_ polyfills for interacting with memory. The two
 polyfills are as follows:
 
 - `fn alloc(size_bits: usize, count: usize) -> ptr`: This polyfill allocates a contiguous region of
@@ -148,7 +148,76 @@ follows:
 
 ### Function Pointers
 
-This section is TBD, and will be filled in as part of the work on function pointers.
+Function pointers are in common usage in LLVM, making it crucial that Hieratika is able to support
+their usage. To that end, the following section outlines their integration with the memory model.
+
+The precondition for this design is to first understand that we are not considering function
+pointers to be _true pointers_ at this stage. While LLVM insists that they _are_—and offsets from
+them can be used to access adjacent data such as
+[prefix](https://llvm.org/docs/LangRef.html#prefix-data) or
+[prologue](https://llvm.org/docs/LangRef.html#prologue-data) data—these features do not seem to be
+used by Rust and hence can be safely ignored for now.
+
+As a result, Hieratika treats function pointers _specially_, as follows:
+
+- In LLVM IR, function pointer values are _always_ derived from function pointer constants,
+  referring directly to the encoded name of some function.
+- The Hieratika compiler, at the point of generating function stubs, will generate global constants
+  for these function pointer constants, derived from the correct `blockaddress` expression,
+  containing a _relocatable_ reference to a block (`block_ptr`), and information about which module
+  the block was defined in (`module_id`).
+- The compiler then generates _dispatch functions_ (named
+  `format!("__hieratika_dispatch_local_{function_type}")`). These dispatch functions take the
+  correct arguments `A...` for their type, as well as the function pointer `ptr` and match on the
+  provided function pointer's `block_ptr` portion.
+- If the `block_ptr` matches, the local dispatch function will call the correct target function (in
+  the current module) for this function pointer, passing the provided arguments and thereby calling
+  the correct function for that pointer.
+
+Consider, by way of example, the dispatch function for the function type `(i8, i8) -> i16`. It
+would have an implementation similar to the following pseudocode.
+
+```rust
+fn __hieratika_dispatch_local_i8_i8_i16(function_pointer: ptr, arg1: i8, arg2: i8) -> i16 {
+    match function_pointer.block_ptr {
+        0 => function_zero(arg1, arg2),
+        1 => function_one(arg1, arg2),
+        // ...
+        _ => panic!("Found function pointer {function_pointer} for function (i8, i8) -> i16 but no such target exists")
+    }
+}
+```
+
+This results in a raft of dispatch functions for all possible function types in a module, which can
+then be used to _locally_ discover the correct function to call through a function pointer.
+
+This, however, is insufficient for _general-case_ function pointer dispatch. Whereas LLVM's
+traditional [`blockaddress`](https://llvm.org/docs/LangRef.html#addresses-of-basic-blocks)
+expressions cannot escape the local module, function pointers can _easily_ be passed between
+modules and then called.
+
+In order to solve this problem, the design for function pointers in Hieratika has a _second_ part,
+known as the _meta_-dispatch table.
+
+- These tables, generated by the _linker_ rather than the single-module Hieratika compiler, are
+  similar to the local dispatch tables above, but instead of dispatching based on the `block_ptr`
+  portion of the function pointer, they dispatch based on the `module_id`.
+- If the `module_id` matches, the meta dispatch function will call the correct target _local_
+  dispatch function for that module, passing the provided arguments and the function pointer, and
+  allowing the local dispatch function to select the correct implementation.
+
+At first glance, this dual-layer dispatch seems quite expensive, essentially amounting to performing
+comparisons of the function pointer to all possible targets in a program. However, Hieratika uses a
+slightly more sensible search mechanism.
+
+- Both `block_ptr` and `module_id` are (re-)allocated (at link time) such that they are each in a
+  contiguous block of integer identifiers.
+- The dispatch function can then perform _binary search_ on the `module_id` to resolve to the
+  correct local dispatch function in $\mathcal{O}(\log_2 n)$ time, instead of $\mathcal{O}(n)$ time.
+
+If, in the future, we work with languages that do make use of prefix or prologue data, this approach
+will be entirely insufficient. To make _that_ work, we would need to re-work our handling of
+function pointers to be _true_ pointers.
 
 ## Felt-Aligned Addressing - An Alternative Model
 
@@ -172,7 +241,8 @@ as a value of another type under byte-addressing and alignment rules—was rampa
 
 As an example, it proved common to see IR that allocated `[4 x i8]` and then wrote an `i16` to the
 first two bytes and read an `i16` from the other two. As, in the felt-aligned model, the first two
-`i8`s would be written to individual felts, reading them back as an `i16` is significantly complex.
+`i8`s would be written to individual felts, reading them back as an `i16` is significantly more
+complex.
 
 To that end, the project was forced to abandon this model in favor of a more-traditional
 byte-aligned addressing model.
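
As a rough illustration of the two-layer dispatch described in the Function Pointers section of this diff, the following Rust sketch models a function pointer as a `(module_id, block_ptr)` pair, gives each module a local dispatch function that matches on `block_ptr`, and resolves the module with a binary search over a table sorted by `module_id`. Everything here is an assumption made for illustration and is not taken from the Hieratika code base: the `FunctionPointer` struct and its field widths, the target functions `add`, `sub` and `mul`, the `dispatch_local_module_*` and `dispatch_meta` names, and the choice to return `None` for an unknown pointer where the design's pseudocode panics.

```rust
/// Hypothetical model of a function pointer as described in the design:
/// a module identifier plus a block identifier. Field widths are assumed.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
struct FunctionPointer {
    module_id: u64,
    block_ptr: u64,
}

// Hypothetical target functions, all of type (i8, i8) -> i16.
fn add(a: i8, b: i8) -> i16 {
    i16::from(a) + i16::from(b)
}
fn sub(a: i8, b: i8) -> i16 {
    i16::from(a) - i16::from(b)
}
fn mul(a: i8, b: i8) -> i16 {
    i16::from(a) * i16::from(b)
}

/// Signature shared by every local dispatch function for this function type.
type LocalDispatch = fn(FunctionPointer, i8, i8) -> Option<i16>;

/// Local dispatch for an assumed module 0: matches on `block_ptr` only.
fn dispatch_local_module_0(ptr: FunctionPointer, a: i8, b: i8) -> Option<i16> {
    match ptr.block_ptr {
        0 => Some(add(a, b)),
        1 => Some(sub(a, b)),
        _ => None, // no such target in this module
    }
}

/// Local dispatch for an assumed module 1.
fn dispatch_local_module_1(ptr: FunctionPointer, a: i8, b: i8) -> Option<i16> {
    match ptr.block_ptr {
        0 => Some(mul(a, b)),
        _ => None,
    }
}

/// Meta-dispatch table, sorted by `module_id` so it can be binary searched.
/// In the design this table would be produced at link time.
static META_TABLE: [(u64, LocalDispatch); 2] = [
    (0, dispatch_local_module_0 as LocalDispatch),
    (1, dispatch_local_module_1 as LocalDispatch),
];

/// Meta dispatch: binary search on `module_id`, then defer to the module's
/// local dispatch function, which selects the target by `block_ptr`.
fn dispatch_meta(ptr: FunctionPointer, a: i8, b: i8) -> Option<i16> {
    let index = META_TABLE
        .binary_search_by_key(&ptr.module_id, |&(module_id, _)| module_id)
        .ok()?;
    let (_, local) = META_TABLE[index];
    local(ptr, a, b)
}
```

With these assumed tables, `dispatch_meta(FunctionPointer { module_id: 1, block_ptr: 0 }, 2, 3)` resolves to `mul` in module 1, while a pointer with an unknown `module_id` or `block_ptr` yields `None`. Because the design allocates identifiers contiguously, a direct index into the table would also work in a sketch like this; binary search is used here only to mirror the text.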