uip | title | description | author | status | type | category | created |
---|---|---|---|---|---|---|---|
0127 |
Lagoon IEEE 754 Reals |
Support jetted array operations |
~lagrev-nocfep |
Draft |
Standards Track |
Arvo |
2024-03-26 |
Nock and its predicated software programs are designed to facilitate computing as a legible state machine. At-scale scientific computing has never really been part of the design criteria for the Urbit ecosystem. While it's hard to imagine density functional theory quantum chemistry packages preferring Hoon, it is far easier to envision a world of machine learning and even LLMs that are built on top of or adjacent to Urbit. At their core, essentially all of the “scientific computing” approaches rely heavily on linear algebra packages: classically BLAS and LAPACK, but many others as well today. It behooves us to provide this to future developers building on Mars.
Lagoon has matured through several refactors into a developer-friendly interface. It is time to prepare and release a subset of completed array-operational work to the developer community. Since Lagoon relies on jets to operate reasonably quickly, we will prepare and release %real
-valued operations first.
Lagoon (Linear AlGebra in hOON) will offer BLAS-like vector and matrix operations (like NumPy's linear algebra essentials, but not solvers). This will facilitate a common and performant interface for all developers relying on a common lagnuage of data types and operators to represent vector, matrix, and tensor data. By focusing on %real
s first, we can vet the interface further in production and lay the groundwork for expansion in further types in subsequent releases.
Lagoon defines the following data types for %real
-valued numbers (which are IEEE 754 floats compatible with corresponding @r
types in Hoon). (Note that we are commenting out other prospective types for which some work has been done.)
:: /sur/lagoon
:::: Types for Lagoon compatibility
::
|%
+$ ray :: $ray: n-dimensional array
$: =meta :: descriptor
data=@ux :: data, row-major order
==
::
+$ prec [a=@ b=@] :: fixed-point precision, a+b+1=bloq
+$ meta :: $meta: metadata for a $ray
$: shape=(list @) :: list of dimension lengths
=bloq :: logarithm of bitwidth
=kind :: name of data type
fxp=(unit prec) :: fixed-point scale
==
::
+$ kind :: $kind: type of array scalars
$? %real :: IEEE 754 float
:: %uint :: unsigned integer
:: %int2 :: 2s-complement integer
:: %cplx :: BLAS-compatible packed floats
:: %unum :: unum/posit
:: %fixp :: fixed-precision
==
::
+$ baum :: $baum: ndray with metadata
$: =meta ::
data=ndray ::
==
::
+$ ndray :: $ndray: n-dim array as nested list
$@ @ :: single item
(list ndray) :: nonempty list of children, in row-major order
::
+$ slice (unit [(unit @) (unit @)])
--
The core data type is a +$ray
, which consists of a pair of metadata +$meta
and data as a bit-aligned atom. The metadata which we track for any particular array are:
-
shape=(list @)
, the dimensions of an array. -
bloq
, the block size of an array in standard Hoon$2^n$ terms. -
kind
, whether the value is%real
or otherwise. (Only%real
will be supported in this UIP's release.) -
fxp=(unit prec)
, the binary-point scale of a fixed-precision number as a unit. (This will always be~
for%real
s.)
data
are in CBLAS-like row-major order, with the leading value of the array in the most signficant position bitwise. A +$ray
's data
atom has a leading 1
bit at the most significant bit of the array plus one. This allows us to represent leading zeroes.
There is also a +$ndray
type corresponding to an unpacked list of the same data.
At the current time, Lagoon defines the following operators, a handful of which are for internal library convenience. This interface is the result of several complete refactors and we do not foresee additional major alterations in the course of producing the final product.
++print
++slog
++to-tank
++get-term
++squeeze
++submatrix
++product
++gather
++get-item
++set-item
++get-row
++set-row
++get-col
++set-col
++get-bloq-offset
++get-item-number
++strides
++get-dim
++get-item-index
++ravel
++en-ray
++de-ray
++get-item-baum
++fill
++spac
++unspac
++scalar-to-ray
++eye
++zeros
++ones
++iota
++magic
++range
++linspace
++urge
++scale
++max
++argmax
++min
++argmin
++cumsum
++prod
++reshape
++stack
++hstack
++vstack
++transpose
++diag
++trace
++dot
++mmul
++abs
++add-scalar
++sub-scalar
++mul-scalar
++div-scalar
++mod-scalar
++add
++sub
++mul
++div
++mod
++pow-n
++pow
++exp
++log
++gth
++gte
++lth
++lte
++mpow-n
++is-close
++any
++all
++fun-scalar
++trans-scalar
++el-wise-op
++bin-op
The IEEE 754 rounding mode is a property of the operation rather than of the value(s) involved. To effect this correctly, /lib/lagoon
implements a ++lake
wrapper in a door-like pattern to modify the core's behavior. This information propagates to gates.
Jets will resolve at the level of (e.g.) ++add
rather than ++bin-op
.
Jets are built on top of the SoftBLAS which implements BLAS operations on top of SoftFloat. The build process for vere
has been modified to link SoftBLAS (in the lagoon-jets
branch of urbit/vere
).
Because jets need to be compiled into the vere
binary at the current time, /lib/lagoon
needs to ship on the %base
desk. However, we propose a departure from past convention: jet the library in /lib
rather than move it into /sys
.
In the long run, as more possibilities for userspace jets become available, /lib/lagoon
may be moved out of %base
. However, one change in policy at a time is all that we propose now.
Lagoon will jet /lib
into the non
chapter and place corresponding jets in /i
.
IEEE 754 %real
s are a well-understood, well-vetted component in the numerical ecosystem for scientific computing and Urbit's @r
affordances.
We have comprehensive unit test coverage of /lib/lagoon
and SoftBLAS, lending credence to verified behavior.
Taken together, these lead us to believe that introducing %real
to the developer community first is a conservative assay to roll out or try several new features:
- Jetted
/lib
. - Tetchy hardware–software coupling
- Linear algebra interface and affordances
Primary development is taking place in urbit/numerics
.
Details are available in ~mopfel-winrux/numeric-computation-and-machine-learning
and in summary form.
Copyright and related rights waived via CC0.