
Advanced Features

Partial Derivatives

FMI 3.0 provides optional access to partial derivatives for variables of an FMU. Partial derivatives can be used:

  • in Newton algorithms to solve algebraic loops,

  • in implicit integration algorithms in Model Exchange,

  • in iterative Co-Simulation algorithms, and

  • in optimization algorithms.

To avoid expensive numeric approximations of these derivatives, FMI offers dedicated functions to retrieve partial derivatives for variables of an FMU:

  • fmi3GetDirectionalDerivative to compute the directional derivatives \(\mathbf{v}_{\mathit{sensitivity}} = \mathbf{J} \cdot \mathbf{v}_{\mathit{seed}}\), and

  • fmi3GetAdjointDerivative to calculate the adjoint derivatives \(\mathbf{\bar{v}}_{\mathit{sensitivity}}^T = \mathbf{\bar{v}}_{\mathit{seed}}^T \cdot \mathbf{J}\)

with the Jacobian

\[\mathbf{J} = \begin{bmatrix} \frac{\partial g_1}{\partial v_{\mathit{known},1}} & \cdots & \frac{\partial g_1}{\partial v_{\mathit{known},n}} \\ \vdots & \ddots & \vdots \\ \frac{\partial g_m}{\partial v_{\mathit{known},1}} & \cdots & \frac{\partial g_m}{\partial v_{\mathit{known},n}} \end{bmatrix}\]

where \(\mathbf{v}_{\mathit{known}}\) are the \(n\) knowns, and \(\mathbf{g}\) are the \(m\) functions to calculate the \(m\) unknown variables \(\mathbf{v}_{\mathit{unknown}}\) from the knowns.

The functions can be used to compute Jacobian-vector products or to construct the partial derivative matrices column-wise or row-wise by choosing the seed vector \(\mathbf{v}_{\mathit{seed}} \in \mathbb{R}^n\) or \(\mathbf{\bar{v}}_{\mathit{seed}} \in \mathbb{R}^m\), respectively.

For information on the call signatures, see the FMI specification.
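
For orientation, the FMI 3.0 headers declare both functions with the same argument list; the following is a condensed sketch (the specification is normative):

    fmi3Status fmi3GetDirectionalDerivative(
        fmi3Instance instance,
        const fmi3ValueReference unknowns[], size_t nUnknowns,
        const fmi3ValueReference knowns[],   size_t nKnowns,
        const fmi3Float64 seed[],            size_t nSeed,
        fmi3Float64 sensitivity[],           size_t nSensitivity);

    // fmi3GetAdjointDerivative takes the same argument list.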

Directional Derivatives

The function fmi3GetDirectionalDerivative computes the directional derivative

\[\mathbf{v}_{\mathit{sensitivity}} = \mathbf{J} \cdot \mathbf{v}_{\mathit{seed}}\]

One can either retrieve the \(\mathit{i}\)-th column of the Jacobian by specifying the \(\mathit{i}\)-th unit vector \(\mathbf{e}_{\mathit{i}}\) as the seed vector \(\mathbf{v}_{\mathit{seed}}\), or compute a Jacobian-vector product \(\mathbf{Jv}\) by using \(\mathbf{v}\) as the seed vector \(\mathbf{v}_{\mathit{seed}}\). Therefore, the function can be utilized for the following purposes, among others:

  • Solving algebraic loops with a nonlinear solver requires matrix \({\frac{\partial \mathbf{g}}{\partial \mathbf{u}}}\).

  • Numerical integrators of stiff methods need matrix \({\frac{\partial \mathbf{f}}{\partial \mathbf{x}}}\).

  • If the FMU is connected with other FMUs, the partial derivatives of the state derivatives and outputs with respect to the continuous states and the inputs are needed in order to compute the Jacobian for the system of the connected FMUs.

  • If the FMU shall be linearized, the same derivatives as in the previous item are needed.

  • If the FMU is used as the model for an extended Kalman filter, \({\frac{\partial \mathbf{f}}{\partial \mathbf{x}}}\) and \({\frac{\partial \mathbf{g}}{\partial \mathbf{x}}}\) are needed.

  • If matrix-free linear solvers shall be used, Jacobian-vector products \({\mathbf{Jv}}\) are needed (e.g. as a user-supplied routine in CVODE [CVODE570]); see the sketch after this list.
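
To illustrate the last item: a Jacobian-vector product \(\mathbf{Jv} = \frac{\partial \mathbf{f}}{\partial \mathbf{x}} \cdot \mathbf{v}\) needs only a single call, because \(\mathbf{v}\) can be passed directly as the seed vector. A minimal sketch, assuming vr_dx and vr_x hold the value references of the state derivatives and the continuous states:

    // minimal sketch: matrix-free Jacobian-vector product Jv = (df/dx) * v
    fmi3Status jacobianVectorProduct(fmi3Instance S,
                                     const fmi3ValueReference vr_dx[],
                                     const fmi3ValueReference vr_x[],
                                     const fmi3Float64 v[],
                                     fmi3Float64 Jv[],
                                     size_t nx) {
        // a single call with the full vector v as seed returns J*v
        return fmi3GetDirectionalDerivative(S, vr_dx, nx, vr_x, nx, v, nx, Jv, nx);
    }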

Example 1:
Assume an FMU has the output equations

\[\begin{bmatrix} y_1 \\ y_2 \end{bmatrix} = \begin{bmatrix} g_1(x, u_1, u_3, u_4) \\ g_2(x, u_1) \end{bmatrix}\]

and this FMU is connected, so that \({y_1, u_1, u_3}\) appear in an algebraic loop. Then the nonlinear solver needs a Jacobian, and this Jacobian can be computed (without numerical differentiation) provided the partial derivatives of \({y_1}\) with respect to \({u_1}\) and \({u_3}\) are available. Depending on the environment where the FMUs are connected, these derivatives can be provided:

(a) with one wrapper function around function fmi3GetDirectionalDerivative to compute the directional derivatives with respect to these two variables (in other words, \({v_{\mathit{unknown}} = y_1}\), \({v_{\mathit{known}} = \left \{ u_1, u_3 \right \}}\)), and then the environment calls this wrapper function with \({v_{\mathit{seed}} = \left \{ 1, 0 \right \}}\) to compute the partial derivative with respect to \({u_1}\) and \({v_{\mathit{seed}} = \left \{ 0, 1 \right \}}\) to compute the partial derivative with respect to \({u_3}\), or

(b) with two direct function calls of fmi3GetDirectionalDerivative (in other words, \({v_{\mathit{unknown}} = y_1, v_{\mathit{known}} = u_1, v_{\mathit{seed}} = 1}\); and \({v_{\mathit{unknown}} = y_1, v_{\mathit{known}} = u_3, v_{\mathit{seed}} = 1}\)).
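
Variant (b), written out as a minimal sketch (vr_y1, vr_u1 and vr_u3 are assumed to hold the value references of \({y_1}\), \({u_1}\) and \({u_3}\), and S is the FMU instance):

    const fmi3Float64 seed = 1.0;
    fmi3Float64 dy1_du1, dy1_du3;

    // partial derivative of y1 with respect to u1
    fmi3GetDirectionalDerivative(S, &vr_y1, 1, &vr_u1, 1, &seed, 1, &dy1_du1, 1);

    // partial derivative of y1 with respect to u3
    fmi3GetDirectionalDerivative(S, &vr_y1, 1, &vr_u3, 1, &seed, 1, &dy1_du3, 1);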

Note that a direct implementation of this function with analytic derivatives:

(a) Provides the directional derivative for all input variables; so in the above example: \({\Delta y_1 = \frac{\partial g_1}{\partial x} \cdot \Delta x + \frac{\partial g_1}{\partial u_1} \cdot \Delta u_1 + \frac{\partial g_1}{\partial u_3} \cdot \Delta u_3 + \frac{\partial g_1}{\partial u_4} \cdot \Delta u_4}\)

(b) Initializes all seed-values to zero; so in the above example: \({\Delta x = \Delta u_1 = \Delta u_3 = \Delta u_4 = 0}\)

(c) Computes the directional derivative with the seed-values provided in the function arguments; so in the above example: \({v_{\mathit{sensitivity}} = \Delta y_1 (\Delta x = 0, \Delta u_1 = 1, \Delta u_3 = 0, \Delta u_4 = 0)}\) and \({v_{\mathit{sensitivity}} = \Delta y_1 (\Delta x = 0, \Delta u_1 = 0, \Delta u_3 = 1, \Delta u_4 = 0)}\)
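
A direct analytic implementation inside the FMU could thus look like the following sketch; the coefficients dg1_dx, dg1_du1, dg1_du3 and dg1_du4 are hypothetical names for the analytically derived partials of \({g_1}\):

    // hypothetical sketch: analytic directional derivative of g1;
    // all seed values default to zero, only the ones passed in are set
    fmi3Float64 delta_y1 = dg1_dx  * delta_x
                         + dg1_du1 * delta_u1
                         + dg1_du3 * delta_u3
                         + dg1_du4 * delta_u4;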

Example 2:
If a dense matrix shall be computed, the columns of the matrix can be easily constructed by successive calls of fmi3GetDirectionalDerivative. For example, constructing the system Jacobian \({\mathbf{A} = \frac{\partial \mathbf{f}}{\partial \mathbf{x}}}\) as a dense matrix can be performed in the following way:

    //   dk = 1.0  seed value for the current column
    //   c[]       column vector of length nx

    // set time, states and inputs
    fmi3SetTime(S, time);
    fmi3SetContinuousStates(S, x, nx);
    // fmi3Set{VariableType}(S, ...);

    // if required at this step, compute the Jacobian as a dense matrix
    for (i = 0; i < nx; i++) {
        // construct the Jacobian matrix column-wise: seeding state i with 1.0
        // yields the i-th column, the derivative of dx with respect to x[i]
        fmi3GetDirectionalDerivative(S, vr_dx, nx, &vr_x[i], 1, &dk, 1, c, nx);
        for (j = 0; j < nx; j++) {
            J[j][i] = c[j];
        }
    }

If the sparsity of a matrix shall be taken into account, then the matrix can be constructed in the following way:

  1. The incidence information of the matrix (whether an element is zero or not zero) is extracted from the XML file from element ModelStructure.

  2. A so-called graph coloring algorithm is employed to determine the columns of the matrix that can be computed by one call of fmi3GetDirectionalDerivative.

  3. For the columns determined in (2), one call to fmi3GetDirectionalDerivative is made. After each such call, the elements of the resulting directional derivative vector are copied into their correct locations of the partial derivative matrix.

More details and implementation notes are available in [ABL12].
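
A minimal sketch of the compressed evaluation in step (3), assuming a coloring has already been computed; nColors, colorSize[] and colorColumn[][] are hypothetical placeholders for its result:

    //   seed[]   seed vector of length nx
    //   v[]      sensitivity vector of length nx

    for (c = 0; c < nColors; c++) {
        // seed every state column that belongs to color group c with 1.0
        for (i = 0; i < nx; i++) seed[i] = 0.0;
        for (k = 0; k < colorSize[c]; k++) seed[colorColumn[c][k]] = 1.0;

        // columns of one color share no non-zero rows, so the compressed
        // result v can be scattered back using the incidence pattern
        fmi3GetDirectionalDerivative(S, vr_dx, nx, vr_x, nx, seed, nx, v, nx);
        // scatter v into the sparse Jacobian according to ModelStructure
    }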

Example 3:
Directional derivatives for higher-dimensional variables are treated in almost the same way. Consider, for example, an FMU which calculates its output \({Y}\) by multiplying its 2x2 input \({U}\) with a 3x2 constant gain \({K}\), with

\[K = \begin{bmatrix} a & b \\ c & d \\ e & f \end{bmatrix}\]

The output \({Y=K U}\) is a matrix of size 3x2. The directional derivative of an output element \({Y(i,j)}\) with respect to the input \({U}\) and the seed \({\Delta U}\) is:

\[\Delta Y(i,j) = \frac{\partial Y(i,j)}{\partial U(1,1)} \cdot \Delta U(1,1) + \frac{\partial Y(i,j)}{\partial U(1,2)} \cdot \Delta U(1,2) + \frac{\partial Y(i,j)}{\partial U(2,1)} \cdot \Delta U(2,1) + \frac{\partial Y(i,j)}{\partial U(2,2)} \cdot \Delta U(2,2)\]
\[\Delta \mathbf{Y} = \begin{bmatrix} a \Delta U(1,1) + b \Delta U(2,1) & a \Delta U(1,2) + b \Delta U(2,2) \\ c \Delta U(1,1) + d \Delta U(2,1) & c \Delta U(1,2) + d \Delta U(2,2) \\ e \Delta U(1,1) + f \Delta U(2,1) & e \Delta U(1,2) + f \Delta U(2,2) \end{bmatrix}\]

To get the directional derivative of \({Y}\) with respect to \({U(2,1)}\), the command fmi3GetDirectionalDerivative(m, vr_Y, 1, vr_U, 1, {0.0, 0.0, 1.0, 0.0}, 4, dd, 6) can be used, where vr_Y and vr_U are the value references of the variables \({Y}\) and \({U}\), respectively. Note that in order to get the directional derivative of \({Y}\) with respect to \({U(2,1)}\), the seed value {0.0, 0.0, 1.0, 0.0} has been used. The retrieved directional derivative dd is stored in a matrix of size 3x2, so nSensitivity is 6.
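
Written out as a sketch, with the seed array selecting \({U(2,1)}\) in the row-major serialization of \({U}\):

    // vr_Y and vr_U are the value references of the array variables Y and U
    const fmi3ValueReference unknowns[1] = { vr_Y };
    const fmi3ValueReference knowns[1]   = { vr_U };
    const fmi3Float64 seed[4] = { 0.0, 0.0, 1.0, 0.0 };  // 1.0 at U(2,1)
    fmi3Float64 dd[6];  // receives the 3x2 result (row-major)

    fmi3GetDirectionalDerivative(m, unknowns, 1, knowns, 1, seed, 4, dd, 6);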

Adjoint Derivatives

The function fmi3GetAdjointDerivative computes the adjoint derivative

\[\mathbf{\bar{v}}_{\mathit{sensitivity}}^T = \mathbf{\bar{v}}_{\mathit{seed}}^T \cdot \mathbf{J} \quad \text{or} \quad \mathbf{\bar{v}}_{\mathit{sensitivity}} = \mathbf{J}^T \cdot \mathbf{\bar{v}}_{\mathit{seed}}\]

One can either retrieve the \(\mathit{i}\)-th row of the Jacobian by specifying the \(\mathit{i}\)-th unit vector \(\mathbf{e}_{\mathit{i}}\) as the seed vector \(\mathbf{\bar{v}}_{\mathit{seed}}\), or compute a vector-Jacobian product \(\mathbf{v}^T\mathbf{J}\) by using \(\mathbf{v}\) as the seed vector \(\mathbf{\bar{v}}_{\mathit{seed}}\).

Adjoint derivatives are beneficial in several contexts:

  • in artificial intelligence (AI) frameworks, where adjoint derivatives are called "vector-Jacobian products" (VJPs) and are used in the backpropagation process to perform gradient-based optimization of parameters using reverse mode automatic differentiation (AD), see, e.g., [BPRS15], and

  • in parameter estimation (see [BKF17]).

Typically, reverse mode automatic differentiation (AD) is more efficient for these use cases than forward mode AD because the number of knowns is much larger than the number of unknowns (\(\mathit{n} \gg \mathit{m}\)), as explained in the cited references. If the full Jacobian is needed and the numbers of knowns and unknowns are roughly equal (\(\mathit{m} \approx \mathit{n}\)) or small, the column-wise construction using fmi3GetDirectionalDerivative is generally more efficient.

Example:
Assume an FMU has the output equations

\[\begin{bmatrix} y_1 \\ y_2 \end{bmatrix} = \begin{bmatrix} h_1(u_1, u_2) \\ h_2(u_1, u_2) \end{bmatrix}\]

and \(\left( w_1, w_2 \right)^T \cdot \mathbf{ \frac{\partial h}{\partial u} }\) for some vector \(\left( w_1, w_2 \right)^T\) is needed. Then one can get this with one function call of fmi3GetAdjointDerivative (with arguments \(\mathbf{v}_{\mathit{unknown}} = \text{valueReferences of} \left \{ y_1, y_2 \right \}\), \(\mathbf{v}_{\mathit{known}} = \text{valueReferences of} \left \{ u_1, u_2 \right \}\), \(\mathbf{\bar{v}}_{\mathit{seed}} = \left( w_1, w_2 \right)^T\)), while with fmi3GetDirectionalDerivative at least two calls would be necessary to first construct the Jacobian column-wise and then multiply it from the left with \(\left( w_1, w_2 \right)^T\).
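
As a minimal sketch (vr_y and vr_u are assumed arrays holding the value references of \(\left \{ y_1, y_2 \right \}\) and \(\left \{ u_1, u_2 \right \}\), and S is the FMU instance):

    const fmi3Float64 w[2] = { w1, w2 };  // seed vector (w1, w2)^T
    fmi3Float64 wTJ[2];                   // result: w^T * (dh/du)

    // a single call returns the vector-Jacobian product as a row vector
    fmi3GetAdjointDerivative(S, vr_y, 2, vr_u, 2, w, 2, wTJ, 2);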

If a dense matrix shall be computed, the rows of the matrix can be easily constructed by successive calls of fmi3GetAdjointDerivative. For example, constructing the system Jacobian \({\mathbf{A} = \frac{\partial \mathbf{f}}{\partial \mathbf{x}}}\) as a dense matrix can be performed in the following way:

    //   dk = 1.0  seed value for the current row
    for (i = 0; i < nx; i++) {
        // construct the Jacobian matrix row-wise
        fmi3GetAdjointDerivative(S, &vr_dx[i], 1, vr_x, nx, &dk, 1, &J[i][0], nx);
    }