
New library: Reverse-mode automatic differentiation #1302


Open · wants to merge 49 commits into base: develop

Conversation


@demroz demroz commented Aug 16, 2025

This pull request introduces a new library for reverse-mode automatic differentiation. It is a tape-based reverse-mode autodiff: the idea is to build a computational graph and then call backward on the entire graph to compute all the gradients at once.
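
For orientation, here is a minimal sketch of that tape-based workflow. It uses the rvar and make_rvar names that appear later in this description; the backward() and adjoint() calls are placeholder names for illustration, not the library's confirmed interface, and includes are omitted.

// Leaf variables are recorded on the tape when they are created.
rvar<double, 1> x = make_rvar<double, 1>(2.0);
rvar<double, 1> y = make_rvar<double, 1>(3.0);

// Building the expression records the computational graph.
auto f = x * y + x;

// One reverse sweep propagates adjoints to every leaf at once.
// backward() and adjoint() are stand-ins for the library's actual names.
f.backward();

double df_dx = x.adjoint(); // d(xy + x)/dx = y + 1 = 4
double df_dy = y.adjoint(); // d(xy + x)/dy = x     = 2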

Currently it supports all the basic operations (+, -, *, /), everything listed in the conceptual requirements for real number types, and Boost calls to erf, erfc, erf_inv and erfc_inv. Everything is tested up to the 4th derivative. The list of tests:
test_reverse_mode_autodiff_basic_math_ops.cpp
test_reverse_mode_autodiff_comparison_operators.cpp
test_reverse_mode_autodiff_constructors.cpp
test_reverse_mode_autodiff_error_functions.cpp
test_reverse_mode_autodiff_flat_linear_allocator.cpp
test_reverse_mode_autodiff_stl_support.cpp

There are also two examples in the example directory:
reverse_mode_linear_regression_example.cpp -> simple linear regression that demonstrates how this library can be used for optimization

autodiff_reverse_black_scholes.cpp -> a rewrite of the forward mode equivalent.

Important notes

  1. The intent of the library is to be an engine for gradient-based optimization. (In the future I'd be interested in adding some gradient-based optimizers to Boost if there is interest.) I wrote it with first derivatives in mind. Although everything is tested up to the 4th derivative, the library is intended to perform best when calculating first-order derivatives. It is best to use forward-mode autodiff if very high orders are needed.
  2. The library is based on expression templates. This means that although my autodiff variable rvar does satisfy all the requirements of a real type, it doesn't necessarily play nicely with Boost special functions. For example:
rvar<double, 1> x = make_rvar<double, 1>(1.0);
rvar<double, 1> y = make_rvar<double, 1>(1.0);
rvar<double, 1> z = make_rvar<double, 1>(1.0);
auto f = x + y * z;

f in this case is not actually of type rvar, but add_expr<rvar, mult_expr<rvar, rvar>>.
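
A brief illustration of the consequence. Whether the library allows assigning the lazy expression back into a concrete rvar is an assumption here, flagged in the comments:

// f keeps the lazy expression type, so a Boost function templated on a
// RealType may deduce add_expr<rvar, mult_expr<rvar, rvar>> rather than
// rvar and fail to find a matching overload.
static_assert(!std::is_same<decltype(f), rvar<double, 1>>::value,
              "f is an expression template, not an rvar"); // needs <type_traits>

// Materializing into a concrete rvar first (assuming the library supports
// this assignment, which is not confirmed by the PR text) gives special
// functions an argument that models the real-type requirements:
rvar<double, 1> g = x + y * z;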

  3. reverse_mode_autodiff_memory_management.hpp has a flat_linear_allocator class that is essentially a specialized memory arena. It uses placement new for memory allocations. This is a deliberate design choice. The flat_linear_allocator destructor explicitly calls the destructors of the individual elements, so explicit calls to delete shouldn't be needed here.
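
To make that design choice concrete, below is a small self-contained sketch of the placement-new arena pattern described in point 3. It illustrates the general idea only; it is not the actual flat_linear_allocator from this PR.

#include <cstddef>
#include <new>
#include <utility>

// Illustrative arena: objects are constructed with placement new into a
// preallocated buffer, and the arena's destructor runs each element's
// destructor explicitly, so callers never use delete on these objects.
template <typename T>
class linear_arena {
public:
    explicit linear_arena(std::size_t capacity)
        : buffer_(static_cast<unsigned char*>(
              ::operator new(capacity * sizeof(T), std::align_val_t(alignof(T))))),
          count_(0),
          capacity_(capacity) {}

    linear_arena(const linear_arena&) = delete;
    linear_arena& operator=(const linear_arena&) = delete;

    // Construct a T in the next free slot with placement new.
    template <typename... Args>
    T* emplace(Args&&... args) {
        if (count_ == capacity_) { throw std::bad_alloc(); }
        void* slot = buffer_ + count_ * sizeof(T);
        T* p = ::new (slot) T(std::forward<Args>(args)...);
        ++count_;
        return p;
    }

    // Placement new bypasses delete, so the arena itself runs the destructors.
    ~linear_arena() {
        for (std::size_t i = 0; i < count_; ++i) {
            reinterpret_cast<T*>(buffer_ + i * sizeof(T))->~T();
        }
        ::operator delete(buffer_, std::align_val_t(alignof(T)));
    }

private:
    unsigned char* buffer_;
    std::size_t    count_;
    std::size_t    capacity_;
};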

Thank you, and I'm looking forward to your feedback.

@demroz
Author

demroz commented Aug 17, 2025

Looks like the MSVC errors are gone. Not sure why CI fails, but looking inside it doesn't seem to be related to any of my code.

@ckormanyos
Member

ckormanyos commented Aug 17, 2025

Not sure why CI fails but looking inside it doesn't seem to be related to any of my code

The failure(s) in CI seem harmless, as you mentioned.

The job that failed CI is on Drone. In fact, very many jobs run on various older runner images on Drone. Most of the tests for older compilers such as GCC 7 and 8 or old Clang, and for non-x86_64 architectures (insofar as these are available), run on Drone.

But sometimes Drone has a hard time acquiring a VM or setting it up properly. So your jobs failed in about 3 minutes - too short for a normal run, as these mostly take 7 to 30 minutes. It looks like all the Drone failures were failures to properly set up Linux. It gets annoying, but Drone sometimes has to be judged manually to see whether failures are real or bogus. This gets doubly annoying when there is actually a mixture of real and bogus errors, which is what happens when we write new code. Sigh, that's just how it goes.

On the upside, GHA is super-reliable nowadays.

@ckormanyos
Member

Hi @demroz, I will be taking a dedicated look at this hopefully in the next few days. I intend to extend one of your examples and really get my hands on this code and learn more about it.

I'll also be looking into top-level quality aspects such as retaining code coverage and compiler warnings (if there are any). I might have questions along the way. Please give me a few days on this.

We could also benefit later from some feedback from John and Matt.

Let me get into this thing for a few days.

Member

@mborland mborland left a comment


A lot of my comments could be repeated in many places (e.g. constants, using statements, etc.), but I didn't repeat them so as not to completely overload this.

@NAThompson
Collaborator

NAThompson commented Aug 18, 2025

Just as a quick suggestion: test_functions_for_optimization.hpp might be a very nice place to draw unit tests from for this, as most of those functions are well suited to reverse-mode AD; e.g., they map from ℝ^n → ℝ.
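
For example, a scalar-valued function of several variables such as the Rosenbrock function exercises exactly that ℝ^n → ℝ case. The sketch below reuses the rvar/make_rvar names from the PR description and the same placeholder backward()/adjoint() interface (assumed, not confirmed):

// Rosenbrock: f(a, b) = (1 - a)^2 + 100 (b - a^2)^2, minimum at (1, 1).
rvar<double, 1> a = make_rvar<double, 1>(0.5);
rvar<double, 1> b = make_rvar<double, 1>(0.5);

auto f = (1.0 - a) * (1.0 - a) + 100.0 * (b - a * a) * (b - a * a);
f.backward(); // placeholder name for the reverse sweep

// Analytic gradient for checking the reverse-mode result:
//   df/da = -2 (1 - a) - 400 a (b - a^2)  ->  -51 at (0.5, 0.5)
//   df/db =  200 (b - a^2)                ->   50 at (0.5, 0.5)
double df_da = a.adjoint(); // placeholder accessor
double df_db = b.adjoint();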

As an aside: the gradient-based optimizers are a great idea and would make a nice addition to the black-box optimizers we currently have.
