Issue 7/implicit functions #8

IvanYashchuk · 2020-04-15T09:50:06Z

Hello @charlesm93, here are my friendly suggestions on the section you wrote.

I propose renaming "algebraic equations" -> "nonlinear system" as the same results hold for this more general case (including also PDEs).
Implicit function theorem just states that du/dpsi exists under certain conditions, therefore, I renamed the section to "tangent linear method".
I think that there is no need to introduce the Lagrangian approach for the adjoint method, as it is naturally derived from the tangent method. I propose to remove that section.
In "practical considerations" I have added the expressions needed for the implementation of forward and reverse AD of nonlinear solvers.

Ref. #7

bob-carpenter · 2020-04-15T15:44:43Z

Thanks for the PR!

"Algebraic equations" is how these are typically described in the literature. I'd like to continue to at least connect to the mainstream literature.

Part of the motivation for writing this volume is to provide an accessible presentation. So I'd like to keep the presentations to the simplest possible level rather than the most general mathematical level. I'm just bringing this up as a criterion to consider this PR.

I'll let @charlesm93 respond directly on the content.

This reverts commit 2c459c3.

charlesm93 · 2020-04-15T20:23:11Z

Hi @IvanYashchuk,
thank you for chiming in. I'll go through your PR, but first a few comments:

Another reason for keeping "algebraic equation" is that the handbook is meant to be usable as a lookup textbook.
Statements of the implicit function theorem I've encountered also give you the Jacobian matrix. In any case, to derive the Jacobian, you need u to be a function of psi. Is "tangent linear method" a termed used in the certainf fields?
the case of PDEs is different because the target is an integral of f, rather than a function of it. The adjoint method is then more sophisticated.
You're right that we don't need to discuss the Lagrangian approach for algebraic equation, but I'm introducing it here for pedagogical reasons. It is easy to understand with algebraic equations and much more difficult for ODEs, DAEs, etc where we will need them.

What I do in this section might make more sense once I've written the other sections on implicit functions.

IvanYashchuk · 2020-04-16T06:57:46Z

Is "tangent linear method" a termed used in the certain fields?

It is a common term, even used in the current draft

ad-handbook/index.Rmd

Line 103 in 09c8556

Forward-mode automatic differentiation employs the tangent method of

the case of PDEs is different because the target is an integral of f, rather than a function of it.

There are many different PDEs and there are many different objective (target?) functions. AD rules are independent of the form of the objective functions. Stationary PDEs are nonlinear equations of form F(x) = 0, linear stationary PDEs can be written in the same form, time-dependent PDEs are often reduced to a sequence of stationary PDEs.

Alright, I agree that the Lagrangian approach is needed for ODEs, etc. Anyways the actual rules for AD are more important than the derivation of them.

charlesm93 · 2020-04-16T13:49:02Z

It is a common term,

Point taken.

Adding a summary at the end, as you have done, with only the results plainly stated works well. These sections can be named "tangent linear method" and "adjoint method" (again, as you have done). The first sections, in which the differentiation algorithms are derived, can be named "Implicit function theorem" and "Lagrangian approach". The implicit function theorem allows you to derive the tangent linear and adjoint method.

AD rules are independent of the form of the objective functions

Some properties of the objective function matter. A scalar objective motivates an adjoint method, while a long vector may warrant a tangent method.

Stationary PDEs are nonlinear equations of form F(x) = 0

Isn't an equation of the form F(x) = 0 simply an algebraic equation? To have a PDE, F must also depend on partial derivatives of x. Unless of course you can point me to an example where this is not the case.

Anyways the actual rules for AD are more important than the derivation of them.

It depends on the reader. For Stan developers, being able to extend results to other cases can sometimes be important, as the AD handbook may not be exhaustive.

charlesm93

Overall this is very good. Please address the requested changes.

I'm still drafting this section (and the other chapters on implicit functions). In the future, please wait for a chapter to be finished and merged to the master branch before proposing a PR. You can always bring up issues. All that said, thank you for your help!

algebraic-equations.Rmd

charlesm93 · 2020-04-16T13:59:25Z

algebraic-equations.Rmd

+with one additional matrix multiplication operation
+\begin{equation*}
+  \frac{\mathrm d j}{\mathrm d \psi}^{*} = - \frac{\partial f}{\partial \psi}^{*} \lambda
+  + \frac{\partial j}{\partial \psi}^{*}.
 \end{equation*}


This is a pain to read without the pdf output. Can you summarize the change you made here?

Here are derivations with rendered latex, though the symbols I used there are different
https://colab.research.google.com/drive/1zA75xSfsy2d7-7ojWoUbmCqS6CZy30m3

Seeing I'm doing a code review, linking to similar code that does the calculations with different symbols is not good enough. The code you have doesn't compile, and I had to make a few corrections to render it. You use * for transpose, but this is inconsistent with previous notation where T is used. You also do not use []^{-1} as I have requested for other sections of your code.

Last but not least, it is unclear to me why you have replaced the original calculations.

algebraic-equations.Rmd

IvanYashchuk · 2020-04-16T14:43:35Z

I think I've addressed all the needed changes.
Rendered latex derivation: #8 (comment)

charlesm93

There is still a bit of work to be done and you need to test that your code compiles.

(I realize that I probably misinterpreted your friendly suggestions as a formal PR, and I could've taken some of your suggestions without doing code review. But since we're almost done, we may as well make a merge.)

charlesm93 · 2020-04-17T18:50:46Z

algebraic-equations.Rmd

+with one additional matrix multiplication operation
+\begin{equation*}
+  \frac{\mathrm d j}{\mathrm d \psi}^{*} = - \frac{\partial f}{\partial \psi}^{*} \lambda
+  + \frac{\partial j}{\partial \psi}^{*}.
 \end{equation*}


Seeing I'm doing a code review, linking to similar code that does the calculations with different symbols is not good enough. The code you have doesn't compile, and I had to make a few corrections to render it. You use * for transpose, but this is inconsistent with previous notation where T is used. You also do not use []^{-1} as I have requested for other sections of your code.

Last but not least, it is unclear to me why you have replaced the original calculations.

charlesm93 · 2020-04-17T18:51:39Z

algebraic-equations.Rmd

    + \frac{\partial f}{\partial u}\frac{\mathrm d u}{\mathrm d \psi} = 0 \\
  \iff & \frac{\partial f}{\partial u}\frac{\mathrm d u}{\mathrm d \psi} 
    = - \frac{\partial f}{\partial \psi}  \\
  \iff & \frac{\mathrm d u}{\mathrm d \psi} = - \left [\frac{\partial f}{\partial u} \right]^{-1}
    \frac{\partial f}{\partial \psi}
 \end{align*}
 where we assume the requisite differentiation and inversion are possible.
+The linear system that is solved for $\frac{\mathrm d u}{\mathrm d \psi}$ is called the _tangent linear system_.


This may be true but I don't see the point of introducing this notion here.

IvanYashchuk added 6 commits April 15, 2020 09:55

Fixed typos

393ea4a

Added a few fixes to implicit function theorem section

3560b76

Added (imo better) derivation of the adjoint approach

24a7879

Added info on forward and reverse AD

41b22a2

Renamed algebraic eqns -> nonlinear system

2c459c3

Fixed typo

c5b9c16

Revert "Renamed algebraic eqns -> nonlinear system"

027c306

This reverts commit 2c459c3.

charlesm93 self-assigned this Apr 16, 2020

charlesm93 requested changes Apr 16, 2020

View reviewed changes

IvanYashchuk added 2 commits April 16, 2020 17:29

Addressed the requested changes

028b658

Move new content to summary of results section

96f0b3f

charlesm93 requested changes Apr 17, 2020

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Issue 7/implicit functions #8

Issue 7/implicit functions #8

IvanYashchuk commented Apr 15, 2020

bob-carpenter commented Apr 15, 2020

charlesm93 commented Apr 15, 2020 •

edited

Loading

IvanYashchuk commented Apr 16, 2020

charlesm93 commented Apr 16, 2020 •

edited

Loading

charlesm93 left a comment •

edited

Loading

charlesm93 Apr 16, 2020

IvanYashchuk Apr 16, 2020

charlesm93 Apr 17, 2020

IvanYashchuk commented Apr 16, 2020

charlesm93 left a comment

charlesm93 Apr 17, 2020

charlesm93 Apr 17, 2020

Issue 7/implicit functions #8

Are you sure you want to change the base?

Issue 7/implicit functions #8

Conversation

IvanYashchuk commented Apr 15, 2020

bob-carpenter commented Apr 15, 2020

charlesm93 commented Apr 15, 2020 • edited Loading

IvanYashchuk commented Apr 16, 2020

charlesm93 commented Apr 16, 2020 • edited Loading

charlesm93 left a comment • edited Loading

Choose a reason for hiding this comment

charlesm93 Apr 16, 2020

Choose a reason for hiding this comment

IvanYashchuk Apr 16, 2020

Choose a reason for hiding this comment

charlesm93 Apr 17, 2020

Choose a reason for hiding this comment

IvanYashchuk commented Apr 16, 2020

charlesm93 left a comment

Choose a reason for hiding this comment

charlesm93 Apr 17, 2020

Choose a reason for hiding this comment

charlesm93 Apr 17, 2020

Choose a reason for hiding this comment

charlesm93 commented Apr 15, 2020 •

edited

Loading

charlesm93 commented Apr 16, 2020 •

edited

Loading

charlesm93 left a comment •

edited

Loading