Update of documentation for 0.2.0
mratsim committed Sep 24, 2017
1 parent 28615c1 commit cc316c5
Showing 4 changed files with 1,300 additions and 140 deletions.
11 changes: 6 additions & 5 deletions README.md
@@ -58,13 +58,14 @@ Putting a research model in production, on a drone or as a webservice for example
All those pain points may seem like a huge undertaking; however, thanks to the Nim language, we can have Arraymancer:
- Be as fast as C
- Accelerated routines with Intel MKL/OpenBLAS or even NNPACK
- Access to CUDA and reusing existing Torch, Tensorflow or Nervana Neon kernels
- A Python-like syntax with custom operators `a * b` for tensor multiplication instead of `a.dot(b)` (Numpy/Tensorflow) or `a.mm(b)` (Torch) and Numpy-like slicing ergonomics `t[0..4, 2..10|2]`
- Access to CUDA and the ability to generate custom CUDA kernels on the fly via metaprogramming.
- A Python-like syntax with custom operators `a * b` for tensor multiplication instead of `a.dot(b)` (Numpy/Tensorflow) or `a.mm(b)` (Torch)
- Numpy-like slicing ergonomics `t[0..4, 2..10|2]` (see the sketch below)
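
A rough sketch of what this syntax looks like in practice. The `*` operator and the `t[0..4, 2..10|2]` slice come from the list above; the `toTensor` and `reshape` calls are assumptions about the construction API, not taken from this page:

```Nim
import arraymancer, sequtils

# Two small float matrices built from nested seqs (construction API assumed).
let a = @[@[1.0, 2.0], @[3.0, 4.0]].toTensor()
let b = @[@[5.0, 6.0], @[7.0, 8.0]].toTensor()

# `a * b` is true matrix multiplication, not an element-wise product.
echo a * b

# Numpy-like slicing: rows 0 to 4, columns 2 to 10 with a step of 2.
let t = toSeq(1..66).toTensor().reshape(6, 11)
echo t[0..4, 2..10|2]
```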

## Future ambitions
Because apparently to be successful you need a vision, I would like Arraymancer to be:
- The go-to tool for Deep Learning video processing. I.e. `vid = load_video("./cats/youtube_cat_video.mkv")`
- Target javascript, WebAssembly, ARM devices, AMD Rocm, OpenCL.
- Target JavaScript, WebAssembly, Apple Metal, ARM devices, AMD ROCm, OpenCL, you name it.
- Target cryptominers' FPGAs, because they drove the price of GPUs too high for honest deep-learners.

## Support (Types, OS, Hardware)
@@ -95,7 +96,7 @@ For now Arraymancer is still at the ndarray stage, however a [vision package](ht

### Speed

On the demo benchmark, Arraymancer already reach speeds with comparable to Torch on logistic regression on OpenBLAS, though further MKL optimization are possible (batched matmul probably):
On the demo benchmark, Arraymancer already reaches speeds comparable to Torch for logistic regression with OpenBLAS, though further MKL optimizations are possible (probably batched matmul):

| Library | Timing |
| ------ | ------ |
@@ -135,7 +136,7 @@ Here is a comparative table, note that this feature set is developing very rapidly
| Iterating on a Tensor |[x]|[]|
| Slicing a Tensor |[x]|[x]|
| Slice mutation `a[1,_] = 10` |[x]|[]|
| Comparison `==`|[x]|[]|
| Comparison `==`|[x]| Coming soon|
| Element-wise basic operations|[x]|[x]|
| Universal functions |[x]|[x]|
| Automatically broadcasted operations |[x]| Coming soon|
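A concrete sketch of the slice-mutation and comparison rows on the CPU `Tensor` side. The `a[1, _] = 10` mutation and the `==` comparison come from the table above; the `toTensor` constructor is an assumption about the construction API:

```Nim
import arraymancer

# Two 2x3 integer tensors (construction API assumed).
var a = @[@[1, 2, 3], @[4, 5, 6]].toTensor()
let b = @[@[1, 2, 3], @[10, 10, 10]].toTensor()

# Slice mutation: overwrite every element of row 1 with 10.
a[1, _] = 10

# Compare the mutated `a` against `b` with `==`.
echo a == b
```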
4 changes: 2 additions & 2 deletions docs/Linear algebra notation comparison.md
@@ -1,8 +1,8 @@
| Language/lib | Normal matmul | element-wise matmul (Hadamard) | vec-vec dot product | mat-vec multiplication|
| ------------- | ---------------------------- | --- | --- | --- |
| Arraymancer | A * B | \|*\| | A * B | A * B |
| Arraymancer | A * B | .* | dot(A, B) | A * B |
| neo/linalg | A * B | \|*\| | A * B | A * B |
| Julia | A * B | .* | | dot(A, B) | A * B |
| Julia & Matlab | A * B | .* | dot(A, B) | A * B |
| Numpy ndarray| np.dot(A, B) or np.matmul(A, B) or A @ B| np.multiply(A, B) or A * B | np.dot(A, B) or np.inner(A, B) | np.dot(A, B) |
| R | A %*% B | A * B | A %*% B or dot(A, B)| A %*% B |
| Tensorflow | tf.matmul(A, B) or A @ B | tf.multiply(A, B) | tf.matmul(a, b, transpose_a=False, transpose_b=True) or tf.tensordot(a, b, 1) or tf.einsum('i,i->', x, y) | same reshape/transpose/einsum shenanigans as vec-vec|
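
A minimal sketch of the Arraymancer row in Nim, using only the operators listed in the table (`*`, `.*`, `dot`); the `toTensor` constructors are assumptions about the construction API:

```Nim
import arraymancer

# 2x2 matrices and length-2 vectors (construction API assumed).
let A = @[@[1.0, 2.0], @[3.0, 4.0]].toTensor()
let B = @[@[5.0, 6.0], @[7.0, 8.0]].toTensor()
let u = @[1.0, 2.0].toTensor()
let v = @[3.0, 4.0].toTensor()

echo A * B       # normal matrix-matrix multiplication
echo A .* B      # element-wise (Hadamard) multiplication
echo dot(u, v)   # vector-vector dot product (a scalar)
echo A * u       # matrix-vector multiplication
```

Keeping plain `*` for true multiplication and a distinct `.*` for the Hadamard product avoids the `dot`/`matmul`/`@` ambiguity visible in the Numpy row.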