Built-in Functions
These standard functions are supported in GPU code without changes, or with slight limitations:
- abs
Translated to C (f|ll)abs(f) functions, depending on the argument type.
- + - * / 1+ 1-
These operations are directly translated to C arithmetic operators, and thus follow C type promotion and rounding rules: e.g. integer division truncates.
As a special exception, (/ x) always returns a floating-point value, because an integer reciprocal would truncate to zero for nearly every argument, which would be both useless and dangerous.
- t nil
These constants are interpreted as having boolean type; nil may also be used as an init expression for a declared variable of any type to specify the lack of initialization.
- and or
These support only boolean arguments, and return booleans.
- zerop < <= = /= >= >
- max min
Should work as documented.
- nonzerop
A counterpart to zerop, defined by this package.
- eq eql
Basically equivalent to =, implemented for better macro support.
- logand logior logxor logeqv lognot
- logandc1 logandc2 lognand lognor logorc1 logorc2
- sin asin sinh asinh
- cos acos cosh acosh
- tan atan tanh atanh
- exp sqrt log expt
These functions are mapped to C standard library calls and have equivalent precision properties. In cases where the result would normally be a complex value, NaN is returned.
Unlike the standard lisp library versions, these functions return only one value:
- ffloor fceiling ftruncate fround
Implemented via equivalent library calls; always return a floating-point value.
- floor ceiling truncate round
Always return an integer. In complex cases they are implemented via the floating-point equivalents.
- rem
Follows standard C arithmetic promotion rules for arguments. Difficult integer cases are implemented through floating-point code.
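Because the arithmetic built-ins inherit C semantics, their behavior can be previewed in plain C. The sketch below is not cl-gpu code and not its generated output; the helper names (int_div, reciprocal, floor_div, int_rem) are hypothetical and only model the rules described above: truncating integer division, a floating-point reciprocal for (/ x), an integer floor computed through floating point, and a remainder that follows the C operator.

```c
#include <assert.h>

/* Integer division as the C operator performs it: truncates toward zero. */
int int_div(int a, int b) {
    assert(b != 0); /* integer division by zero is undefined in C */
    return a / b;
}

/* Hypothetical model of (/ x): the reciprocal is computed in floating point. */
double reciprocal(int x) {
    return 1.0 / (double)x;
}

/* Hypothetical model of (floor a b): divide in floating point, round toward
   negative infinity, and convert the result back to an integer. */
long floor_div(int a, int b) {
    assert(b != 0);
    double q = (double)a / (double)b;
    long t = (long)q;          /* the cast truncates toward zero */
    if ((double)t > q) {
        t -= 1;                /* step down for negative non-integer quotients */
    }
    return t;
}

/* C remainder: the result takes the sign of the dividend. */
int int_rem(int a, int b) {
    assert(b != 0);
    return a % b;
}
```

With these definitions, int_div(-7, 2) truncates to -3 while floor_div(-7, 2) rounds down to -4, which is the difference to keep in mind when porting Lisp code that relies on floor to GPU code that uses /.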
These functions are specific to GPU code:
- (aref array indexes…)
- (array-total-size array)
- (array-dimension array dim-idx)
Identical to the standard lisp versions.
- (raw-aref array index)
Due to the frequent use of pitched allocation to achieve perfect data alignment in GPU arrays, row-major-aref is difficult to implement transparently. This is its functional equivalent that acknowledges the presence of alignment holes in its index range.
- (array-raw-extent array)
Returns the size of the index range for raw-aref. For pitched arrays it is greater than array-total-size.
- (array-raw-stride array dim-idx)
Returns the stepping for the corresponding dimension with corrections for pitch.
- (array-raw-index array indexes…)
Similar to array-row-major-index, but intended to be used with raw-aref.
These expressions are equivalent:
(aref a i j k)
(raw-aref a (array-raw-index a i j k))
(raw-aref a (+ (* i (array-raw-stride a 0)) (* j (array-raw-stride a 1)) k))
The value of the array parameter must be statically resolvable to the original global variable, kernel parameter or fully declared local array variable.
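The relationship between aref, raw-aref, and the raw stride functions can be modeled in plain C. This is a minimal sketch, not the cl-gpu implementation: the pitched_array struct and its pitch field are hypothetical, and it assumes that only the innermost dimension of a 3-D array is padded.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model of a 3-D pitched array; not the cl-gpu representation. */
typedef struct {
    size_t dims[3]; /* logical dimensions */
    size_t pitch;   /* allocated length of the innermost dimension, >= dims[2] */
    float *data;    /* storage holding array_raw_extent() elements */
} pitched_array;

/* Number of logical elements, ignoring padding. */
size_t array_total_size(const pitched_array *a) {
    return a->dims[0] * a->dims[1] * a->dims[2];
}

/* Size of the raw index range, including the alignment holes. */
size_t array_raw_extent(const pitched_array *a) {
    return a->dims[0] * a->dims[1] * a->pitch;
}

/* Stepping for one dimension, corrected for pitch. */
size_t array_raw_stride(const pitched_array *a, int dim) {
    assert(dim >= 0 && dim < 3);
    if (dim == 2) return 1;
    if (dim == 1) return a->pitch;
    return a->dims[1] * a->pitch;
}

/* Raw index of element (i, j, k); the innermost stride is 1. */
size_t array_raw_index(const pitched_array *a, size_t i, size_t j, size_t k) {
    assert(i < a->dims[0] && j < a->dims[1] && k < a->dims[2]);
    return i * array_raw_stride(a, 0) + j * array_raw_stride(a, 1) + k;
}

/* Element access through the raw index range. */
float raw_aref(const pitched_array *a, size_t index) {
    assert(index < array_raw_extent(a));
    return a->data[index];
}

/* Logical access expressed through the raw functions. */
float aref(const pitched_array *a, size_t i, size_t j, size_t k) {
    return raw_aref(a, array_raw_index(a, i, j, k));
}
```

For a 2x3x5 array stored with a pitch of 8, this model gives a total size of 30 but a raw extent of 48, so iterating over the raw range visits 18 padding elements that do not correspond to any (i, j, k).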
Tuples are special fixed-sized vectors that are directly supported by hardware in some way. The exact implementation, allowed combinations of type and size, and supported operations heavily depend on the target. The tuple type is denoted by (tuple elt-type size).
- (tuple x y z…)
Creates a tuple out of its arguments. The element type is determined using rules similar to the ones used by C arithmetic operations, but without mandatory upgrading to int.
- (untuple val) → x, y, z…
Unpacks a tuple into multiple returned values.
- (tuple-aref array indexes…)
Accesses the innermost dimension of the array as a tuple. The list of indexes must contain one value fewer than the rank of the array. The innermost dimension must have a constant size that is allowed for a tuple.
- (tuple-raw-aref array index size)
Accesses size elements starting at index as a tuple. The size argument must be specified as an integer constant.
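The tuple accessors can be pictured as grouped loads of consecutive elements. The following plain C sketch is only an illustration under stated assumptions: float3_tuple is a hypothetical stand-in for (tuple float 3), the array is a two-dimensional float array whose rows may be padded, and C output parameters stand in for Lisp multiple values.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the type (tuple float 3). */
typedef struct {
    float x, y, z;
} float3_tuple;

/* Model of (tuple x y z). */
float3_tuple make_tuple3(float x, float y, float z) {
    float3_tuple t = {x, y, z};
    return t;
}

/* Model of (untuple val); output parameters replace multiple values. */
void untuple3(float3_tuple t, float *x, float *y, float *z) {
    *x = t.x;
    *y = t.y;
    *z = t.z;
}

/* Model of (tuple-raw-aref array index 3): reads 3 consecutive elements
   starting at the given raw index. */
float3_tuple tuple_raw_aref3(const float *data, size_t extent, size_t index) {
    assert(index + 3 <= extent);
    return make_tuple3(data[index], data[index + 1], data[index + 2]);
}

/* Model of (tuple-aref array i) for an array of shape (rows, 3) whose rows
   are stored `stride` elements apart: one index fewer than the rank. */
float3_tuple tuple_aref3(const float *data, size_t rows, size_t stride, size_t i) {
    assert(i < rows && stride >= 3);
    return tuple_raw_aref3(data, rows * stride, i * stride);
}
```

The size is fixed in the function names here because the source requires the tuple size to be an integer constant; on a real target the load would typically map to a hardware vector type rather than a struct.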
The following functions and symbol macros may be used to retrieve thread grid dimensions and indexes.
The macros expand to invocations of the corresponding functions; “x”, “y” and “z” correspond to dimensions 0, 1 and 2. Without arguments the functions return information for all dimensions as a tuple.
- thread-idx
- thread-idx-x thread-idx-y thread-idx-z
- (thread-index &optional dim-idx)
Retrieves the index of the current thread within the block.
- thread-cnt
- thread-cnt-x thread-cnt-y thread-cnt-z
- (thread-count &optional dim-idx)
Retrieves the in-block thread grid dimensions.
- block-idx
- block-idx-x block-idx-y block-idx-z
- (block-index &optional dim-idx)
Retrieves the index of the current block within the global grid.
- block-cnt
- block-cnt-x block-cnt-y block-cnt-z
- (block-count &optional dim-idx)
Retrieves the dimensions of the global block grid.
The number of supported dimensions depends on the compilation target. The dimension index argument must be specified as an integer constant.
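A common use of these values is to derive one flat element index per thread. The plain C sketch below emulates a one-dimensional launch sequentially; it is not cl-gpu code, and grid1d, global_index, and scale_all are hypothetical names. The index formula is the usual block-index times thread-count plus thread-index combination used on CUDA-style grids.

```c
#include <assert.h>

/* Hypothetical description of a one-dimensional launch. */
typedef struct {
    unsigned block_cnt;  /* plays the role of (block-count 0) */
    unsigned thread_cnt; /* plays the role of (thread-count 0) */
} grid1d;

/* Flat index of a thread within the whole grid. */
unsigned global_index(const grid1d *g, unsigned block_idx, unsigned thread_idx) {
    assert(block_idx < g->block_cnt && thread_idx < g->thread_cnt);
    return block_idx * g->thread_cnt + thread_idx;
}

/* Sequential emulation of a kernel in which each thread scales one element.
   The bounds check handles grids that are larger than the data. */
void scale_all(const grid1d *g, float *data, unsigned n, float factor) {
    for (unsigned b = 0; b < g->block_cnt; b++) {
        for (unsigned t = 0; t < g->thread_cnt; t++) {
            unsigned i = global_index(g, b, t);
            if (i < n) {
                data[i] *= factor;
            }
        }
    }
}
```

With 3 blocks of 4 threads the grid covers 12 indexes, so a 10-element array is fully processed and the last two threads do nothing.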
The following built-in function can be used for thread synchronization:
- (barrier &optional mode)
The mode may be:
- :block (default)
Waits until all threads in the block reach the same point.
- :block-fence
Ensures that all preceding global and shared variable writes are visible to other threads in the block, and all following reads will see up-to-date data.
- :grid-fence
Likewise, but for all currently running blocks.
- :system-fence
Likewise for the whole system, including the implicit PCI-E bus transfers to/from main memory (requires Fermi).
The set of actually supported modes depends on the target.
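To see why a :block barrier matters, consider a block-wide sum in which threads repeatedly combine values written by other threads. The plain C sketch below is a sequential emulation, not cl-gpu code: BLOCK_SIZE and block_sum are hypothetical, and each inner loop over tid stands for all threads of the block executing one phase. The comments mark where a real kernel would need (barrier :block) so that no thread starts a phase before every thread has finished the previous one.

```c
#include <assert.h>

#define BLOCK_SIZE 8 /* hypothetical thread count; must be a power of two */

/* Sequential emulation of a tree reduction over one block. */
float block_sum(const float *input) {
    float shared[BLOCK_SIZE]; /* stands for a shared-memory buffer */

    assert((BLOCK_SIZE & (BLOCK_SIZE - 1)) == 0);

    /* Phase 0: every thread stores its own value. */
    for (int tid = 0; tid < BLOCK_SIZE; tid++) {
        shared[tid] = input[tid];
    }
    /* A real kernel would call (barrier :block) here. */

    for (int stride = BLOCK_SIZE / 2; stride > 0; stride /= 2) {
        /* One phase: the lower half of the active threads adds the value
           written by a partner thread in the previous phase. */
        for (int tid = 0; tid < BLOCK_SIZE; tid++) {
            if (tid < stride) {
                shared[tid] += shared[tid + stride];
            }
        }
        /* A real kernel would call (barrier :block) here as well. */
    }
    return shared[0];
}
```

In the emulation each phase loop runs to completion before the next one starts, which is exactly the ordering the barrier enforces on hardware; without it, a thread could read a partner slot that has not been updated yet.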