Minimalistic homemade PyTorch alternative, written in C99 and Python.
Explore the docs » · View Demo · Report Bug · Request Feature
- [2025/01/14] 🎉 CPU backend now uses multiple threads with dynamic scaling and thread pooling.
- [2025/01/02] 🎈 Magnetron released on GitHub.
This project started as a learning experience and a way to understand the inner workings of PyTorch and other deep learning frameworks.
The goal is to create a minimalistic but still powerful deep learning framework that can be used for research and production.
The framework is written in C99 and Python and is designed to be easy to understand and modify.
- The project is still in its early stages and many features are missing.
- Developed by a single person in their free time.
- The project is not yet fully optimized for performance.
To get a local copy up and running, follow these simple steps.
Magnetron itself has no Python dependencies except for CFFI to call the C library from Python.
Some examples use matplotlib and numpy for plotting and data generation, but these are not required to use the framework.
- Linux, macOS, or Windows
- A C99 compiler (gcc, clang, msvc)
- Python 3.6 or higher
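As noted above, Magnetron's only Python dependency is CFFI, which it uses to call into the C library. For readers unfamiliar with that pattern, here is a minimal, generic CFFI sketch; the declared function and the library name are purely illustrative and are not Magnetron's actual binding:

```python
# Generic CFFI (ABI-level) usage sketch; not Magnetron's real binding.
from cffi import FFI

ffi = FFI()
ffi.cdef("double add(double a, double b);")  # declare the C signature to expose
lib = ffi.dlopen("./libexample.so")          # load a shared library (illustrative name)

print(lib.add(1.0, 2.0))                     # call the C function from Python
```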
A pip-installable package will be provided as soon as all core features are implemented.
- Clone the repo.
- `cd magnetron/python` (a virtual environment is recommended).
- `pip install -r requirements.txt` to install the dependencies for the examples.
- `cd magnetron_framework && bash install_wheel_local.sh && cd ../` to install the Magnetron wheel locally; a pip-installable package will be provided in the future.
- `python examples/simple/xor.py` to run the XOR example.
See the Examples directory for examples of how to use the framework. For usage in C and C++, see the Unit Tests directory in the root of the project.
- 6-dimensional, linearized tensors
- Automatic Differentiation
- Multithreaded CPU compute, SIMD-optimized operators (SSE4, AVX2, AVX512, ARM NEON)
- Modern Python API (similar to PyTorch)
- Many operators with broadcasting support and in-place variants
- High-level neural network building blocks
- Dynamic computation graph (eager evaluation)
- Modern PRNGs: Mersenne Twister and PCG
- Validation and friendly error messages
- Custom compressed tensor file formats
Code from the XOR example:
```python
def forward(self, x: Tensor) -> Tensor:
    return (self.weight @ x + self.bias).sigmoid()
```
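For context, here is a hedged sketch of the layer class around that `forward` method. Apart from `Tensor`, `@`, `+`, and `.sigmoid()` (all shown in this README), every name below, including the import path and the `Tensor.uniform` initializer, is an assumption about the API rather than documented Magnetron code:

```python
from magnetron import Tensor  # import path is an assumption

class DenseLayer:
    """Sketch of a dense layer; the initializer names are hypothetical."""

    def __init__(self, in_features: int, out_features: int) -> None:
        # Hypothetical random initialization; the real API may differ.
        self.weight = Tensor.uniform((out_features, in_features))
        self.bias = Tensor.uniform((out_features, 1))

    def forward(self, x: Tensor) -> Tensor:
        # Same computation as the snippet above: affine transform followed by sigmoid.
        return (self.weight @ x + self.bias).sigmoid()
```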
| Operation | Description |
|---|---|
| `clone(x)` | Creates a copy of the tensor |
| `view(x)` | Reshapes without changing data |
| `transpose(x)` | Swaps tensor dimensions |
| `permute(x, d0, ...)` | Reorders tensor dimensions |
| `mean(x)` | Mean across dimensions |
| `min(x)` | Minimum value of the tensor |
| `max(x)` | Maximum value of the tensor |
| `sum(x)` | Sum of elements |
| `abs(x)` | Element-wise absolute value |
| `neg(x)` | Element-wise negation |
| `log(x)` | Element-wise natural logarithm |
| `sqr(x)` | Element-wise square |
| `sqrt(x)` | Element-wise square root |
| `sin(x)` | Element-wise sine |
| `cos(x)` | Element-wise cosine |
| `softmax(x)` | Softmax along a dimension |
| `sigmoid(x)` | Element-wise sigmoid |
| `relu(x)` | ReLU activation |
| `gelu(x)` | GELU activation |
| `add(x, y)` | Element-wise addition |
| `sub(x, y)` | Element-wise subtraction |
| `mul(x, y)` | Element-wise multiplication |
| `div(x, y)` | Element-wise division |
| `matmul(A, B)` | Matrix multiplication |
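A short usage sketch tying a few of these operators together. The constructor and import path are assumptions, and whether the operators are exposed as free functions (as listed above) or as methods is not specified here, so treat the exact spellings as illustrative:

```python
from magnetron import Tensor  # import path is an assumption

a = Tensor.uniform((4, 4))    # hypothetical constructor
b = Tensor.uniform((4, 1))    # broadcastable shape, per the broadcasting support listed above

c = a + b                     # add(x, y) with broadcasting
d = a @ b                     # matmul(A, B)
e = d.sigmoid()               # element-wise sigmoid, as in the XOR example
m = e.mean()                  # mean across dimensions
```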
The goal is to implement training and inference for LLMs and other state-of-the-art models, while providing a simple and small codebase that is easy to understand and modify.
- Compute on GPU (CUDA)
- Low-precision datatypes (f16, bf16, int8)
- Distributed Training and Inference
- CPU and GPU kernel JIT compilation
- Better examples with real-world models (LLMs and other state-of-the-art models)
Contributions are what make the open source community such an amazing place to learn, inspire, and create. Any contributions you make are greatly appreciated. If you have a suggestion that would make this better, please fork the repo and create a pull request. You can also simply open an issue with the tag "enhancement".
Distributed under the Apache 2 License. See LICENSE.txt for more information.