This repository contains the official implementation of TensorGRaD, a memory-efficient gradient optimization framework for training large-scale neural operators. TensorGRaD compresses gradient updates with a robust combination of low-rank tensor decomposition and unstructured sparsification, achieving significant memory savings while maintaining or even improving model performance.
Start from a clean conda environment:
# Create and activate a new conda environment with Python 3.10
conda create -n tensorgrad python=3.10
conda activate tensorgrad
# Install PyTorch with CUDA support (depending on your platform, you may need to add pytorch-cuda=<version> -c nvidia)
conda install pytorch torchvision -c pytorch
# Clone the repository
git clone https://github.com/neuraloperator/tensorgrad.git
cd tensorgrad
# Install dependencies
pip install -r requirements.txt
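Before launching training, it can help to confirm that PyTorch sees a GPU. A minimal, generic check (plain PyTorch, not specific to this repository):

```python
# Generic sanity check that PyTorch and CUDA are available.
import torch

print("PyTorch version:", torch.__version__)
print("CUDA available:", torch.cuda.is_available())
if torch.cuda.is_available():
    print("Device:", torch.cuda.get_device_name(0))
```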
TensorGRaD is a drop-in optimizer that replaces standard optimizers like AdamW. It applies compression at the gradient level through:
- Low-rank compression via a Tucker (higher-order) decomposition of the gradient tensor
- Gradient sparsification using structured or unstructured sparsity (top-k, random-k, or probabilistic)
- Composite projectors that combine low-rank and sparse compression: TensorGRaD first applies either a low-rank or a sparse decomposition to the gradient, then compresses the residual with the other method. This sequential scheme lets the low-rank and sparse components complement each other for more effective compression.
TensorGRaD supports mixed-precision training and is built for scientific ML workloads whose weights and gradients are higher-order tensors.
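To make the composition concrete, here is a minimal sketch of the sparse-first variant (top-k sparsification, then a Tucker approximation of the residual). It uses TensorLy for the Tucker step; the function name, ratios, and library choice are illustrative assumptions, not the repository's API.

```python
# Minimal sketch of the "sparse first, then low-rank on the residual" composition.
# Names, ratios, and the use of TensorLy are illustrative, not the repo's API.
import torch
import tensorly as tl
from tensorly.decomposition import tucker

tl.set_backend("pytorch")

def compress_gradient(grad, sparse_ratio=0.05, rank_ratio=0.20):
    # 1) Unstructured top-k sparsification: keep the largest-magnitude entries.
    k = max(1, int(sparse_ratio * grad.numel()))
    flat = grad.reshape(-1)
    idx = flat.abs().topk(k).indices
    sparse_part = torch.zeros_like(flat)
    sparse_part[idx] = flat[idx]
    sparse_part = sparse_part.reshape(grad.shape)

    # 2) Tucker low-rank approximation of the residual the sparse part missed.
    residual = grad - sparse_part
    ranks = [max(1, int(rank_ratio * s)) for s in residual.shape]
    core, factors = tucker(residual, rank=ranks)
    low_rank_part = tl.tucker_to_tensor((core, factors))

    # The compressed update is the sum of the two complementary components.
    return sparse_part + low_rank_part

# Toy 4-way "gradient" tensor (e.g., the shape of an FNO weight gradient).
g = torch.randn(16, 16, 8, 8)
g_hat = compress_gradient(g)
print(torch.linalg.norm(g - g_hat) / torch.linalg.norm(g))
```

The repository also supports random-k and probabilistic selection for the sparse step, as well as the low-rank-first ordering.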
- tensorgrad/: Optimizer implementations
  - adamw.py: Single projector optimizers
  - tensorgrad.py: Composite projector variant (TensorGRaD)
  - projectors/: Includes all projector logic (tensor/matrix, sparse/low-rank)
- scripts/experiments/: Runs for ablation studies and benchmarks (low-rank, sparse, mixed)
- scripts/profiling/: Memory profiling tools for different architectures
- train_ns_repro_tensorgrad.py: Main training script on Navier–Stokes
Use YAML-based configs and command-line overrides for training:
python train_ns_repro_tensorgrad.py --config_file ns_tensorgrad_repro_config.yaml
Or use the prepared bash scripts in scripts/experiments/.
--opt.tensorgrad True # Enable TensorGRaD
--opt.tensorgrad False # Use AdamW
# Composite projector (TensorGRaD): sparse decomposition first, then low-rank on the residual
--opt.proj_type unstructured_sparse \
--opt.sparse_ratio 0.05 \
--opt.sparse_type randk \
--opt.second_proj_type low_rank \
--opt.second_rank 0.20

# Single projector: low-rank only
--opt.proj_type low_rank \
--opt.rank 0.25

# Or single projector: structured sparse only
--opt.proj_type structured_sparse \
--opt.sparse_ratio 0.25 \
--opt.sparse_type randk
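The sparse_type option controls how the kept entries are chosen. A short illustration of top-k versus rand-k index selection in plain PyTorch (not the repository's code):

```python
# Illustrative index selection for the sparse projector (generic PyTorch).
import torch

g = torch.randn(64, 64)                         # stand-in gradient
k = int(0.25 * g.numel())                       # sparse_ratio = 0.25
flat = g.reshape(-1)

topk_idx = flat.abs().topk(k).indices           # topk: largest-magnitude entries
randk_idx = torch.randperm(flat.numel())[:k]    # randk: uniformly random entries
```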
--opt.update_proj_gap 1000 # Projection update interval
--fno.fno_block_precision mixed # Activations: mixed precision
--fno.fno_block_weights_precision half # Weights: half precision
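The projection update interval means the projector (for example, a sparse index set or low-rank factors) is refit only every update_proj_gap steps and reused in between, which amortizes the cost of the decomposition. A self-contained sketch of that control flow, with a hypothetical TopKProjector class (not the repository's implementation):

```python
# Sketch of reusing a cached projector between refits (hypothetical class,
# not the repository's implementation).
import torch

class TopKProjector:
    def __init__(self, sparse_ratio=0.25, gap=1000):
        self.sparse_ratio, self.gap, self.idx = sparse_ratio, gap, None

    def project(self, grad, step):
        flat = grad.reshape(-1)
        if self.idx is None or step % self.gap == 0:
            k = max(1, int(self.sparse_ratio * flat.numel()))
            self.idx = flat.abs().topk(k).indices     # refit the projector
        out = torch.zeros_like(flat)
        out[self.idx] = flat[self.idx]                # otherwise reuse cached indices
        return out.reshape(grad.shape)

proj = TopKProjector(gap=1000)
for step in range(3000):
    grad = torch.randn(32, 32)                        # stand-in for a real gradient
    compressed = proj.project(grad, step)
```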
- Navier–Stokes ($Re=1000$)
  - Resolutions: 128×128 and 1024×1024
  - Automatically downloaded via neuraloperator
- Navier–Stokes ($Re=10^5$)
  - High-resolution (1024×1024)
  - Download manually from Hugging Face
  - Requires nsforcing_test_1024.hdf5 and nsforcing_train_1024.hdf5 (a quick way to inspect these files is shown below)
  - See paper for pretraining and dataset details
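After a manual download, you can confirm the files' contents by listing their HDF5 datasets. This uses generic h5py calls; the key names inside the files depend on the dataset itself.

```python
# List the datasets stored in a downloaded HDF5 file (generic h5py usage;
# the key names depend on the dataset).
import h5py

with h5py.File("nsforcing_train_1024.hdf5", "r") as f:
    def show(name, obj):
        if isinstance(obj, h5py.Dataset):
            print(name, obj.shape, obj.dtype)
    f.visititems(show)
```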
To prepare your own data:
- Follow the structure used in neuraloperator
- Review the FullSizeNavierStokes class in tensorgrad/navier_stokes.py
- Utilities are available in dataset_creation/
Use the bash scripts under scripts/profiling/ for benchmarking:
bash scripts/profiling/128modes_256channels_4layers/US_LR_025.sh
Profiling outputs are written to memstats/ and profiler_outputs/.
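For a quick standalone check outside those scripts, PyTorch's built-in CUDA memory counters can report peak usage around a training step (generic PyTorch, not the repository's profiling tooling):

```python
# Generic peak-GPU-memory check with PyTorch's built-in counters
# (not the repository's profiling tooling).
import torch

assert torch.cuda.is_available()
torch.cuda.reset_peak_memory_stats()

x = torch.randn(4096, 4096, device="cuda")
y = x @ x                      # stand-in for a training step
torch.cuda.synchronize()

print(f"Peak allocated: {torch.cuda.max_memory_allocated() / 1e9:.2f} GB")
```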
If you use this code, please cite:
@misc{tensorgrad,
title={TensorGRaD: Tensor Gradient Robust Decomposition for Memory-Efficient Neural Operator Training},
author={Sebastian Loeschcke and David Pitt and Robert Joseph George and Jiawei Zhao and Cheng Luo and Yuandong Tian and Jean Kossaifi and Anima Anandkumar},
year={2025},
eprint={2501.02379},
archivePrefix={arXiv},
primaryClass={cs.LG}
}