Wizard's First Rule
This release is named after "Wizard's First Rule" (1994), the first book of Terry Goodkind's masterpiece "The Sword of Truth".
I am very excited to announce the third release of Arraymancer, which includes numerous improvements, features and (unfortunately!) breaking changes.
Warning ⚠: ALL deprecated procs will be removed in the next release, to avoid deprecation-warning spam and to reduce the maintenance burden.
Changes:
- **Very Breaking**
  - Tensors now use reference semantics: `let a = b` will share data by default, and copies must be made explicitly.
    - There is no need to use `unsafe` procs to avoid copies, especially for slices.
    - Unsafe procs are deprecated and will be removed, leading to a smaller and simpler codebase and API/documentation.
    - Tensors and CudaTensors now work the same way.
    - Use `clone` to make copies.
    - Arraymancer now works like Numpy and Julia, making it easier to port code.
    - Unfortunately, this makes unexpected data sharing harder to debug.
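A minimal sketch of the new semantics (assuming Arraymancer's `toTensor` and `clone` procs, as described above):

```nim
# Sketch of the new reference semantics: plain assignment shares storage,
# `clone` makes an explicit deep copy.
import arraymancer

var a = [1, 2, 3].toTensor()
var shared = a         # shares a's underlying storage
var copy = a.clone()   # explicit, independent deep copy

shared[0] = 100        # the change is visible through `a` as well
copy[0] = -1           # `a` is untouched
```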
- **Breaking (?)**
  - The maximum number of supported dimensions has been reduced from 8 to 7 to reduce cache misses. Note: in deep learning, the most dimensions needed is 6, for 3D videos: [batch, time, color/feature channels, Depth, Height, Width].
- **Documentation**
  - Documentation has been completely revamped and is available here: https://mratsim.github.io/Arraymancer/
- **Huge performance improvements**
  - Use of non-initialized seqs
  - Shape and strides are now stored on the stack
  - Optimization via inlining of all higher-order functions
    - `apply_inline`, `map_inline`, `fold_inline` and `reduce_inline` templates are available.
  - All higher-order functions are parallelized through OpenMP
  - Integer matrix multiplication uses SIMD, loop unrolling, restrict and 64-bit alignment
  - Prevention of false sharing/cache contention in OpenMP reductions
  - Removal of temporary copies in several procs
  - Runtime checks/exceptions are now behind `unlikely`
  - `A*B + C` and `C += A*B` are automatically fused into one operation
  - Result tensors are not initialized
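A sketch of the inlined higher-order templates (assuming `map_inline` exposes each element as the implicit variable `x`, per the Arraymancer API):

```nim
# Sketch of an inlined element-wise map: the expression is inlined at the
# call site, avoiding closure/function-pointer overhead.
import arraymancer

let t = [1, 2, 3, 4].toTensor()
let squared = t.map_inline(x * x)
```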
- **Neural network**
  - Added `linear`, `sigmoid_cross_entropy` and `softmax_cross_entropy` layers
  - Added a convolution layer
- **Shapeshifting**
  - Added `unsqueeze` and `stack`
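A sketch of the new shapeshifting procs (assuming `unsqueeze(axis)` inserts a size-1 dimension and `stack` joins tensors along a new axis):

```nim
# Sketch: unsqueeze adds a dimension of size 1; stack joins tensors
# along a new leading axis.
import arraymancer

let a = [1, 2, 3].toTensor()   # rank 1, shape [3]
let row = a.unsqueeze(0)       # rank 2, shape [1, 3]
let col = a.unsqueeze(1)       # rank 2, shape [3, 1]
let s = stack(a, a)            # rank 2, shape [2, 3]
```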
- **Math**
  - Added `min`, `max`, `abs`, `reciprocal`, `negate` and the in-place `mnegate` and `mreciprocal`
- **Statistics**
  - Added variance and standard deviation
- **Broadcasting**
  - Added `.^` (broadcasted exponentiation)
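A sketch of the broadcasted exponentiation operator (assuming `.^` applies element-wise to tensors of compatible shape):

```nim
# Sketch: element-wise power via the broadcasted `.^` operator.
import arraymancer

let base = [1.0, 2.0, 3.0].toTensor()
let expo = [2.0, 2.0, 2.0].toTensor()
let p = base .^ expo
```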
- **Cuda**
  - Support for convolution primitives: forward and backward
  - Broadcasting ported to Cuda
- **Examples**
  - Added an example of a perceptron learning the `xor` function
- **Precision**
  - Arraymancer uses `ln1p` (`ln(1 + x)`) and `exp1m` (`exp(x) - 1`) procs where appropriate to avoid catastrophic cancellation
- **Deprecated**
  - Version 0.3.1, with ALL deprecated procs removed, will be released in a week. Due to issue nim-lang/Nim#6436, you will get a deprecation warning even when using non-deprecated procs like `zeros`, `ones` and `newTensor`. The arguments of `newTensor`, `zeros` and `ones` have changed from `zeros([5, 5], int)` to `zeros[int]([5, 5])`.
  - All `unsafe` procs are now the default behavior and are deprecated.
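A sketch of the changed constructor signatures described above: the element type moves from a trailing argument to a generic parameter.

```nim
# Sketch: the new generic-parameter style for tensor constructors.
import arraymancer

# old, deprecated: zeros([5, 5], int)
let z = zeros[int]([5, 5])      # 5x5 tensor of ints, zero-filled
let o = ones[float32]([2, 3])   # 2x3 tensor of float32, filled with 1
let t = newTensor[int]([4])     # rank-1 tensor of 4 ints
```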