Skip to content

Commit

Permalink
No byproduct by default (#57)
Browse files Browse the repository at this point in the history
* no byproduct by default

* fixes and update examples

* consistent byproduct type parameter

* same byproduct type parameter

* Improve docs and revert to handle_byproduct / return_byproduct

* Move byproduct syntax to its own example

* Work in progress

* Clarify public/private API in docs

* Split nightly CI

---------

Co-authored-by: Guillaume Dalle <[email protected]>
  • Loading branch information
mohamed82008 and gdalle authored Jun 1, 2023
1 parent ed66786 commit 70dab9e
Show file tree
Hide file tree
Showing 20 changed files with 386 additions and 217 deletions.
33 changes: 33 additions & 0 deletions .github/workflows/CI-nightly.yml
Original file line number Diff line number Diff line change
@@ -0,0 +1,33 @@
name: CI-nightly
on:
push:
branches:
- main
tags:
- '*'
pull_request:
concurrency:
# Skip intermediate builds: always.
# Cancel intermediate builds: only if it is a pull request build.
group: ${{ github.workflow }}-${{ github.ref }}
cancel-in-progress: ${{ startsWith(github.ref, 'refs/pull/') }}
jobs:
test:
name: Julia ${{ matrix.version }} - ${{ matrix.os }} - ${{ matrix.arch }} - ${{ github.event_name }}
runs-on: ${{ matrix.os }}
strategy:
fail-fast: false
matrix:
version: ['nightly']
os: [ubuntu-latest]
arch: [x64]
allow_failure: [false]
steps:
- uses: actions/checkout@v2
- uses: julia-actions/setup-julia@v1
with:
version: ${{ matrix.version }}
arch: ${{ matrix.arch }}
- uses: julia-actions/cache@v1
- uses: julia-actions/julia-buildpkg@v1
- uses: julia-actions/julia-runtest@v1
5 changes: 0 additions & 5 deletions .github/workflows/CI.yml
Original file line number Diff line number Diff line change
Expand Up @@ -22,11 +22,6 @@ jobs:
os: [ubuntu-latest]
arch: [x64]
allow_failure: [false]
include:
- version: 'nightly'
os: ubuntu-latest
arch: x64
allow_failure: true
steps:
- uses: actions/checkout@v2
- uses: julia-actions/setup-julia@v1
Expand Down
2 changes: 2 additions & 0 deletions Project.toml
Original file line number Diff line number Diff line change
Expand Up @@ -9,6 +9,7 @@ ChainRulesCore = "d360d2e6-b24c-11e9-a2a3-2a2ae2dbcce4"
Krylov = "ba0b0d4f-ebba-5204-a429-3ac8c609bfb7"
LinearOperators = "5c8ed15e-5a4c-59e4-a42b-c7e8811fb125"
Requires = "ae029012-a4dd-5104-9daa-d747884805df"
SimpleUnPack = "ce78b400-467f-4804-87d8-8f486da07d0a"

[weakdeps]
ChainRulesCore = "d360d2e6-b24c-11e9-a2a3-2a2ae2dbcce4"
Expand All @@ -26,6 +27,7 @@ ForwardDiff = "0.10"
Krylov = "0.8, 0.9"
LinearOperators = "2.2"
Requires = "1.3"
SimpleUnPack = "1.1"
julia = "1.6"

[extras]
Expand Down
7 changes: 6 additions & 1 deletion docs/Manifest.toml
Original file line number Diff line number Diff line change
Expand Up @@ -310,7 +310,7 @@ uuid = "615f187c-cbe4-4ef1-ba3b-2fcf58d6d173"
version = "0.1.1"

[[deps.ImplicitDifferentiation]]
deps = ["AbstractDifferentiation", "Krylov", "LinearOperators", "Requires"]
deps = ["AbstractDifferentiation", "Krylov", "LinearOperators", "Requires", "SimpleUnPack"]
path = ".."
uuid = "57b37032-215b-411a-8a7c-41a003a55207"
version = "0.5.0-DEV"
Expand Down Expand Up @@ -591,6 +591,11 @@ git-tree-sha1 = "e2cc6d8c88613c05e1defb55170bf5ff211fbeac"
uuid = "efcf1570-3423-57d1-acb7-fd33fddbac46"
version = "1.1.1"

[[deps.SimpleUnPack]]
git-tree-sha1 = "58e6353e72cde29b90a69527e56df1b5c3d8c437"
uuid = "ce78b400-467f-4804-87d8-8f486da07d0a"
version = "1.1.0"

[[deps.SnoopPrecompile]]
deps = ["Preferences"]
git-tree-sha1 = "e760a70afdcd461cf01a575947738d359234665c"
Expand Down
2 changes: 2 additions & 0 deletions docs/make.jl
Original file line number Diff line number Diff line change
@@ -1,4 +1,6 @@
using ChainRulesCore: ChainRulesCore
using Documenter
using ForwardDiff: ForwardDiff
using ImplicitDifferentiation
using Literate

Expand Down
19 changes: 14 additions & 5 deletions docs/src/api.md
Original file line number Diff line number Diff line change
@@ -1,12 +1,21 @@
# API reference

## Index
## Public

```@index
```@docs
ImplicitFunction
```

## Docstrings
## Internals

```@docs
ImplicitDifferentiation.Forward
ImplicitDifferentiation.Conditions
ImplicitDifferentiation.PushforwardMul!
ImplicitDifferentiation.PullbackMul!
```

```@autodocs
Modules = [ImplicitDifferentiation]
## Index

```@index
```
8 changes: 4 additions & 4 deletions docs/src/faq.md
Original file line number Diff line number Diff line change
Expand Up @@ -10,19 +10,19 @@ In the future, we would like to add [Enzyme.jl](https://github.com/EnzymeAD/Enzy
## Higher-dimensional arrays

For simplicity, our examples only display functions that eat and spit out vectors.
However, arbitrary array shapes are supported, as long as the forward _and_ conditions callables return similar arrays.
However, arbitrary array shapes are supported, as long as the forward mapping _and_ conditions return similar arrays.
Beware however, sparse arrays will be densified in the differentiation process.

## Scalar input / output

Functions that eat or spit out a single number are not supported.
The forward _and_ conditions callables need arrays: for example, instead of returning `value` you should return `[value]` (a 1-element `Vector`).
The forward mapping _and_ conditions need arrays: for example, instead of returning `value` you should return `[value]` (a 1-element `Vector`).
Consider using an `SVector` from [StaticArrays.jl](https://github.com/JuliaArrays/StaticArrays.jl) if you seek increased performance.

## Multiple inputs / outputs

In this package, implicit functions can only take a single input array `x` and output a single output array `y` (plus the byproduct `z`).
But sometimes, your forward pass or conditions may require multiple input arrays, say `a` and `b`:
But sometimes, your forward mapping or conditions may require multiple input arrays, say `a` and `b`:

```julia
function f(a, b)
Expand All @@ -41,7 +41,7 @@ The same trick works for multiple outputs.

## Using byproducts

At first glance, it is not obvious why we impose that the `forward` callable returns a byproduct `z` in addition to `y`.
At first glance, it is not obvious why we impose that the forward mapping should return a byproduct `z` in addition to `y`.
It is mainly useful when the solution procedure creates objects such as Jacobians, which we want to reuse when computing or differentiating the `conditions`.
We will provide simple examples soon.
In the meantime, an advanced application is given by [DifferentiableFrankWolfe.jl](https://github.com/gdalle/DifferentiableFrankWolfe.jl).
Expand Down
6 changes: 3 additions & 3 deletions docs/src/index.md
Original file line number Diff line number Diff line change
Expand Up @@ -4,16 +4,16 @@ CurrentModule = ImplicitDifferentiation

# ImplicitDifferentiation.jl

[ImplicitDifferentiation.jl](https://github.com/gdalle/ImplicitDifferentiation.jl) is a package for automatic differentiation of functions defined implicitly, i.e., mappings
[ImplicitDifferentiation.jl](https://github.com/gdalle/ImplicitDifferentiation.jl) is a package for automatic differentiation of functions defined implicitly, i.e., _forward mappings_

```math
x \in \mathbb{R}^n \longmapsto y(x) \in \mathbb{R}^m
```

whose output is defined by conditions
whose output is defined by _conditions_

```math
F(x,y(x)) = 0 \in \mathbb{R}^m
c(x,y(x)) = 0 \in \mathbb{R}^m
```

## Background
Expand Down
39 changes: 19 additions & 20 deletions examples/0_basic.jl
Original file line number Diff line number Diff line change
Expand Up @@ -5,7 +5,6 @@ In this example, we demonstrate the basics of our package on a simple function t
=#

using ChainRulesCore #src
using ChainRulesTestUtils #src
using ForwardDiff
using ImplicitDifferentiation
using JET #src
Expand Down Expand Up @@ -86,46 +85,43 @@ We represent it using a type called `ImplicitFunction`, which you will see in ac
=#

#=
First we define a `forward` pass correponding to the function we consider.
It returns the actual output $y(x)$ of the function, as well as the optional byproduct $z(x)$.
Here we don't need any additional information, so we set $z(x)$ to $0$.
Importantly, this forward pass _doesn't need to be differentiable_.
First we define a forward mapping correponding to the function we consider.
It returns the actual output $y(x)$ of the function, and can be thought of as a black box solver.
Importantly, this Julia callable _doesn't need to be differentiable by automatic differentiation packages but the underlying function still needs to be mathematically differentiable_.
=#

function forward(x)
y = mysqrt(x)
z = 0
return y, z
end
forward(x) = mysqrt(x)

#=
Then we define `conditions` $F(x, y, z) = 0$ that the output $y(x)$ is supposed to satisfy.
These conditions must be array-valued, with the same size as $y$, and take $z$ as an additional argument.
And unlike the forward pass, _the conditions need to be differentiable_ with respect to $x$ and $y$.
Here they are very obvious: the square of the square root should be equal to the original value.
Then we define `conditions` $c(x, y) = 0$ that the output $y(x)$ is supposed to satisfy.
These conditions must be array-valued, with the same size as $y$.
Unlike the forward mapping, _the conditions need to be differentiable by automatic differentiation packages_ with respect to both $x$ and $y$.
Here the conditions are very obvious: the square of the square root should be equal to the original value.
=#

function conditions(x, y, z)
function conditions(x, y)
c = y .^ 2 .- x
return c
end

#=
Finally, we construct a wrapper `implicit` around the previous objects.
What does this wrapper do?
By default, `forward` is assumed to return a single output and `conditions`
is assumed to accept 2 arguments.
=#

implicit = ImplicitFunction(forward, conditions)

#=
What does this wrapper do?
When we call it as a function, it just falls back on `first ∘ implicit.forward`, so unsurprisingly we get the first output $y(x)$.
=#

implicit(x) sqrt.(x)
@test implicit(x) sqrt.(x) #src

#=
And when we try to compute its Jacobian, the [implicit function theorem](https://en.wikipedia.org/wiki/Implicit_function_theorem) is applied in the background to circumvent the lack of differentiablility of the forward pass.
And when we try to compute its Jacobian, the [implicit function theorem](https://en.wikipedia.org/wiki/Implicit_function_theorem) is applied in the background to circumvent the lack of differentiablility of the forward mapping.
=#

# ## Forward and reverse mode autodiff
Expand All @@ -151,14 +147,17 @@ We can even go higher-order by mixing the two packages (forward-over-reverse mod
The only technical requirement is to switch the linear solver to something that can handle dual numbers:
=#

linear_solver(A, b) = (Matrix(A) \ b, (solved=true,))
implicit2 = ImplicitFunction(forward, conditions, linear_solver)
manual_linear_solver(A, b) = (Matrix(A) \ b, (solved=true,))

implicit_higher_order = ImplicitFunction(
forward, conditions; linear_solver=manual_linear_solver
)

#=
Then the Jacobian itself is differentiable.
=#

h = rand(2)
J_Z(t) = Zygote.jacobian(implicit2, x .+ t .* h)[1]
J_Z(t) = Zygote.jacobian(implicit_higher_order, x .+ t .* h)[1]
ForwardDiff.derivative(J_Z, 0) Diagonal((-0.25 .* h) ./ (x .^ 1.5))
@test ForwardDiff.derivative(J_Z, 0) Diagonal((-0.25 .* h) ./ (x .^ 1.5)) #src
27 changes: 9 additions & 18 deletions examples/1_unconstrained_optim.jl
Original file line number Diff line number Diff line change
Expand Up @@ -32,35 +32,26 @@ In this case, the optimization problem boils down to the componentwise square ro
Note the presence of a keyword argument.
=#

function mysqrt_optim(x; method)
function forward_optim(x; method)
f(y) = sum(abs2, y .^ 2 .- x)
y0 = ones(eltype(x), size(x))
result = optimize(f, y0, method)
return Optim.minimizer(result)
end

#=
First, we create the forward pass which returns the solution $y(x)$.
Remember that it should also return a byproduct $z(x)$, which is useless here.
=#
function forward_optim(x; method)
y = mysqrt_optim(x; method)
z = 0
return y, z
end

#=
Even though they are defined as a gradient, it is better to provide optimality conditions explicitly: that way we avoid nesting autodiff calls.
Remember, the conditions should accept three arguments to take additional information into account when needed.
Moreover, the forward pass and the conditions should accept the same set of keyword arguments.
Even though they are defined as a gradient, it is better to provide optimality conditions explicitly: that way we avoid nesting autodiff calls. By default, the conditions should accept two arguments as input.
The forward mapping and the conditions should accept the same set of keyword arguments.
=#

function conditions_optim(x, y, z; method)
function conditions_optim(x, y; method)
∇₂f = 2 .* (y .^ 2 .- x)
return ∇₂f
end

# We now have all the ingredients to construct our implicit function.
#=
We now have all the ingredients to construct our implicit function.
=#

implicit_optim = ImplicitFunction(forward_optim, conditions_optim)

Expand Down Expand Up @@ -89,7 +80,7 @@ Unsurprisingly, the Jacobian is the identity.
In this instance, we could use ForwardDiff.jl directly on the solver, but it returns the wrong result (not sure why).
=#

ForwardDiff.jacobian(_x -> mysqrt_optim(x; method=LBFGS()), x)
ForwardDiff.jacobian(_x -> forward_optim(x; method=LBFGS()), x)

# ## Reverse mode autodiff

Expand All @@ -102,7 +93,7 @@ In this instance, we cannot use Zygote.jl directly on the solver (due to unsuppo
=#

try
Zygote.jacobian(_x -> mysqrt_optim(x; method=LBFGS()), x)[1]
Zygote.jacobian(_x -> forward_optim(x; method=LBFGS()), x)[1]
catch e
e
end
20 changes: 6 additions & 14 deletions examples/2_nonlinear_solve.jl
Original file line number Diff line number Diff line change
Expand Up @@ -31,7 +31,7 @@ F(x, y) = y \odot y - x = 0
In this case, the optimization problem boils down to the componentwise square root function, but we implement it using a black box solver from [NLsolve.jl](https://github.com/JuliaNLSolvers/NLsolve.jl).
=#

function mysqrt_nlsolve(x; method)
function forward_nlsolve(x; method)
F!(storage, y) = (storage .= y .^ 2 - x)
initial_y = ones(eltype(x), size(x))
result = nlsolve(F!, initial_y; method)
Expand All @@ -40,17 +40,9 @@ end

#-

function forward_nlsolve(x; method)
y = mysqrt_nlsolve(x; method)
z = 0
return y, z
end

#-

function conditions_nlsolve(x, y, z; method)
F = y .^ 2 .- x
return F
function conditions_nlsolve(x, y; method)
c = y .^ 2 .- x
return c
end

#-
Expand All @@ -77,7 +69,7 @@ ForwardDiff.jacobian(_x -> implicit_nlsolve(_x; method=:newton), x)

#-

ForwardDiff.jacobian(_x -> mysqrt_nlsolve(_x; method=:newton), x)
ForwardDiff.jacobian(_x -> forward_nlsolve(_x; method=:newton), x)

# ## Reverse mode autodiff

Expand All @@ -87,7 +79,7 @@ Zygote.jacobian(_x -> implicit_nlsolve(_x; method=:newton), x)[1]
#-

try
Zygote.jacobian(_x -> mysqrt_nlsolve(_x; method=:newton), x)[1]
Zygote.jacobian(_x -> forward_nlsolve(_x; method=:newton), x)[1]
catch e
e
end
Loading

0 comments on commit 70dab9e

Please sign in to comment.