Refactor to directly support higher order derivatives (infiniteopt#341)
* Refactor to support higher order derivatives

* fix test bugs

* Minor fixes

* fix finitedifference behavior

* more fixes
pulsipher authored May 18, 2024
1 parent f1b41c2 commit dc1e303
Showing 26 changed files with 748 additions and 441 deletions.
35 changes: 25 additions & 10 deletions docs/src/develop/extensions.md
we provide an API to do just this. A complete template is provided to help
streamline this process. The extension steps are:
1. Define the new method `struct` that inherits from the correct
[`AbstractDerivativeMethod`](@ref) subtype
2. Extend [`allows_high_order_derivatives`](@ref)
3. Extend [`InfiniteOpt.generative_support_info`](@ref InfiniteOpt.generative_support_info(::AbstractDerivativeMethod))
   if the method is a [`GenerativeDerivativeMethod`](@ref)
4. Extend [`InfiniteOpt.evaluate_derivative`](@ref).

To exemplify this process, let's implement first-order explicit Euler, which is
already implemented via `FiniteDifference(Forward())`, but let's make our own
anyway for the sake of example. For a first-order derivative ``\frac{d y(t)}{dt}``,
explicit Euler is expressed:
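```math
y(t_{n+1}) = y(t_n) + (t_{n+1} - t_n)\frac{d y(t_n)}{dt}
```
This is the standard explicit Euler relation between consecutive supports. Step
1 is then to define our method `struct`; a minimal sketch, consistent with the
`ExplicitEuler` type used throughout the rest of this example, is:
```jldoctest deriv_ext; output = false
# Step 1: our new method type; explicit Euler needs no extra fields and reuses
# the existing supports, so it is a NonGenerativeDerivativeMethod
struct ExplicitEuler <: NonGenerativeDerivativeMethod end
# output

```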
Our method inherits from [`NonGenerativeDerivativeMethod`](@ref) since explicit
Euler uses the existing support scheme without adding any additional supports.
If our desired method needed to add additional supports (e.g., orthogonal
collocation over finite elements), then we would need to have used
[`GenerativeDerivativeMethod`](@ref).
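For contrast, here's a quick sketch of selecting the built-in
[`OrthogonalCollocation`](@ref) method, which is such a generative method (the
`derivative_method` keyword is the standard way to pick a method):
```julia
using InfiniteOpt

model = InfiniteModel()
# OrthogonalCollocation generates internal collocation nodes between the
# existing supports, hence it is a GenerativeDerivativeMethod
# (here 3 nodes per finite element)
@infinite_parameter(model, t in [0, 10], num_supports = 3,
                    derivative_method = OrthogonalCollocation(3))
```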

Now we need to decide whether this method will directly support higher order
derivatives. In this case, let's say it won't and define:
```jldoctest deriv_ext; output = false
InfiniteOpt.allows_high_order_derivatives(::ExplicitEuler) = false
# output
```
Conversely, we could set the output to `true` if we wanted to directly support
higher order derivatives, in which case we would need to query
[`derivative_order`](@ref) in [`InfiniteOpt.evaluate_derivative`](@ref) and
account for the order as needed.
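For illustration, a hypothetical sketch of that alternative (not used in the
remainder of this example):
```julia
# Hypothetical alternative: declare direct support for higher order derivatives
InfiniteOpt.allows_high_order_derivatives(::ExplicitEuler) = true

# Inside InfiniteOpt.evaluate_derivative we would then branch on
#     order = derivative_order(dref)  # 1, 2, 3, ...
# and build difference expressions appropriate for that order
```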

Since this is a `NonGenerativeDerivativeMethod`, we skip step 3. This is,
however, exemplified in the extension template.

Now we just need to do step 4, which is to extend
[`InfiniteOpt.evaluate_derivative`](@ref). This function generates all the
expressions necessary to build the derivative evaluation equations (derivative
constraints). We assume these relations to be of the form ``h = 0`` where ``h``
With this in mind, let's now extend `InfiniteOpt.evaluate_derivative`:
```jldoctest deriv_ext; output = false
function InfiniteOpt.evaluate_derivative(
    dref::GeneralVariableRef,
    vref::GeneralVariableRef, # the variable that the derivative acts on
    method::ExplicitEuler,
    write_model::JuMP.AbstractModel
    )
    # get the basic derivative information
    pref = operator_parameter(dref)
    # generate the derivative expressions h_i corresponding to the equations of
    # the form h_i = 0
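    # (the remainder of the body is collapsed in this diff; a plausible sketch,
    # assuming consecutive supports are paired as in explicit Euler, is:)
    #
    #     supps = supports(pref, label = All)
    #     exprs = map(1:length(supps)-1) do i
    #         change = supps[i+1] - supps[i]
    #         JuMP.@expression(write_model,
    #             InfiniteOpt.make_reduced_expr(vref, pref, supps[i+1], write_model) -
    #             InfiniteOpt.make_reduced_expr(vref, pref, supps[i], write_model) -
    #             change * InfiniteOpt.make_reduced_expr(dref, pref, supps[i], write_model))
    #     end
    #     return exprs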
We used [`InfiniteOpt.make_reduced_expr`](@ref) as a convenient helper function
to generate the semi-infinite variables/expressions we need to generate at each
support point. Also note that [`InfiniteOpt.add_generative_supports`](@ref) needs
to be included for `GenerativeDerivativeMethods`, but is not necessary in this
example. We also would have needed to query [`derivative_order`](@ref) and take
it into account if we had selected this method to support higher order
derivatives.
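For instance, a hypothetical sketch of directly handling a 2nd-order derivative
(illustrative only; it reuses the local names `supps`, `pref`, `vref`, and
`write_model` from the sketch above and assumes uniformly spaced supports):
```julia
# Hypothetical: forward-difference analogue of h_i = 0 for a 2nd derivative,
#     y(t_{i+2}) - 2 y(t_{i+1}) + y(t_i) - Δt² d²y/dt²(t_i) = 0
order = derivative_order(dref)
if order == 2
    exprs = map(1:length(supps)-2) do i
        Δt = supps[i+1] - supps[i]
        JuMP.@expression(write_model,
            InfiniteOpt.make_reduced_expr(vref, pref, supps[i+2], write_model) -
            2 * InfiniteOpt.make_reduced_expr(vref, pref, supps[i+1], write_model) +
            InfiniteOpt.make_reduced_expr(vref, pref, supps[i], write_model) -
            Δt^2 * InfiniteOpt.make_reduced_expr(dref, pref, supps[i], write_model))
    end
end
```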

Now that we have completed all the necessary steps, let's try it out!
```jldoctest deriv_ext
julia> evaluate(dy)
julia> derivative_constraints(dy)
2-element Vector{InfOptConstraintRef}:
 y(5) - y(0) - 5 d/dt[y(t)](0) = 0.0
 y(10) - y(5) - 5 d/dt[y(t)](5) = 0.0
```
We implemented explicit Euler and it works! Now go and extend away!
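As a final usage sketch (hedged: the fresh model below is illustrative, and
`set_derivative_method` is the standard API for swapping methods):
```julia
using InfiniteOpt

model = InfiniteModel()
@infinite_parameter(model, t in [0, 10], num_supports = 3)
@variable(model, y, Infinite(t))

# attach our custom method to t; derivatives w.r.t. t now use ExplicitEuler
set_derivative_method(t, ExplicitEuler())
dy = deriv(y, t)
evaluate(dy)  # builds the derivative constraints via our extension
```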

177 changes: 106 additions & 71 deletions docs/src/guide/derivative.md

Large diffs are not rendered by default.

73 changes: 47 additions & 26 deletions docs/src/guide/transcribe.md
Thus, we have an optimization problem whose decision space is infinite with
respect to time ``t`` and position ``x``. Now let's transcribe it following the
above steps. First, we need to specify the infinite parameter supports and for
simplicity let's choose the following sparse sets:
- ``t \in \{0, 5, 10\}``
- ``x \in \{[-1, -1]^T, [-1, 1]^T, [1, -1]^T, [1, 1]^T\}``.
To handle the derivative ``\frac{\partial g(t, x)}{\partial t}``, we'll use
backward finite difference, so no additional supports will need to be added.
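Concretely, backward finite difference approximates the time derivative at each
non-initial support via:
```math
\frac{\partial g(t_k, x)}{\partial t} \approx \frac{g(t_k, x) - g(t_{k-1}, x)}{t_k - t_{k-1}}
```
which is the relation behind the auxiliary equations derived below.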

Now we expand the two integrals (measures) via a finite approximation using only
the above supports and term coefficients of 1 (note this is not numerically
correct but is done for conciseness in this example). Doing this, we obtain the
form:
```math
\begin{aligned}
&&\min_{y(t), g(t, x)} &&& y^2(0) + y^2(5) + y^2(10) \\
&&\text{s.t.} &&& y(0) = 1 \\
&&&&& g(0, x) = 0 \\
&&&&& \frac{\partial g(t, [-1, -1])}{\partial t} + \frac{\partial g(t, [-1, 1])}{\partial t} + \frac{\partial g(t, [1, -1])}{\partial t} + \frac{\partial g(t, [1, 1])}{\partial t} = 42, && \forall t \in [0, 10] \\
&&&&& 3g(t, x) + 2y^2(t) \leq 2, && \forall t \in T, \ x \in [-1, 1]^2. \\
\end{aligned}
```
Notice that the infinite variable ``y(t)`` in the objective measure has been
replaced with finite transcribed variables ``y(0)``, ``y(5)``, and ``y(10)``.
Also, the
infinite derivative ``\frac{\partial g(t, x)}{\partial t}`` was replaced with
partially transcribed variables in the second constraint in accordance with the
measure over the positional domain ``x``.
and the third constraint needs to be transcribed for each unique combination
of the time and position supports. Applying this transcription yields:
```math
\begin{aligned}
&&\min_{y(t), g(t, x)} &&& y^2(0) + y^2(5) + y^2(10) \\
&&\text{s.t.} &&& y(0) = 1 \\
&&&&& g(0, [-1, -1]) = 0 \\
&&&&& g(0, [-1, 1]) = 0 \\
&&&&& g(0, [1, -1]) = 0 \\
&&&&& g(0, [1, 1]) = 0 \\
&&&&& \frac{\partial g(0, [-1, -1])}{\partial t} + \frac{\partial g(0, [-1, 1])}{\partial t} + \frac{\partial g(0, [1, -1])}{\partial t} + \frac{\partial g(0, [1, 1])}{\partial t} = 42\\
&&&&& \frac{\partial g(5, [-1, -1])}{\partial t} + \frac{\partial g(5, [-1, 1])}{\partial t} + \frac{\partial g(5, [1, -1])}{\partial t} + \frac{\partial g(5, [1, 1])}{\partial t} = 42\\
&&&&& \frac{\partial g(10, [-1, -1])}{\partial t} + \frac{\partial g(10, [-1, 1])}{\partial t} + \frac{\partial g(10, [1, -1])}{\partial t} + \frac{\partial g(10, [1, 1])}{\partial t} = 42\\
&&&&& 3g(0, [-1, -1]) + 2y^2(0) \leq 2 \\
&&&&& 3g(0, [-1, 1]) + 2y^2(0) \leq 2 \\
infinite equation (in this case we have 3 supports in the time domain, so
backward difference yields two such equations per spatial support point) which
is then transcribed over the spatial domain to yield:
```math
\begin{aligned}
&&& g(5, [-1, -1]) = g(0, [-1, -1]) + 5\frac{\partial g(5, [-1, -1])}{\partial t} \\
&&& g(5, [-1, 1]) = g(0, [-1, 1]) + 5\frac{\partial g(5, [-1, 1])}{\partial t} \\
&&& g(5, [1, -1]) = g(0, [1, -1]) + 5\frac{\partial g(5, [1, -1])}{\partial t} \\
&&& g(5, [1, 1]) = g(0, [1, 1]) + 5\frac{\partial g(5, [1, 1])}{\partial t} \\
&&& g(10, [-1, -1]) = g(5, [-1, -1]) + 5\frac{\partial g(10, [-1, -1])}{\partial t} \\
&&& g(10, [-1, 1]) = g(5, [-1, 1]) + 5\frac{\partial g(10, [-1, 1])}{\partial t} \\
&&& g(10, [1, -1]) = g(5, [1, -1]) + 5\frac{\partial g(10, [1, -1])}{\partial t} \\
&&& g(10, [1, 1]) = g(5, [1, 1]) + 5\frac{\partial g(10, [1, 1])}{\partial t}
\end{aligned}
```

using InfiniteOpt
inf_model = InfiniteModel()
# Define parameters and supports
@infinite_parameter(inf_model, t in [0, 10], supports = [0, 5, 10])
@infinite_parameter(inf_model, x[1:2] in [-1, 1], supports = [-1, 1], independent = true)
# Define variables
julia> build_optimizer_model!(inf_model)
julia> trans_model = optimizer_model(inf_model);
julia> print(trans_model)
Min y(support: 1)² + y(support: 2)² + y(support: 3)²
Subject to
y(support: 1) = 1
g(support: 1) = 0
g(support: 4) = 0
g(support: 7) = 0
g(support: 10) = 0
∂/∂t[g(t, x)](support: 1) + ∂/∂t[g(t, x)](support: 4) + ∂/∂t[g(t, x)](support: 7) + ∂/∂t[g(t, x)](support: 10) = 42
∂/∂t[g(t, x)](support: 2) + ∂/∂t[g(t, x)](support: 5) + ∂/∂t[g(t, x)](support: 8) + ∂/∂t[g(t, x)](support: 11) = 42
∂/∂t[g(t, x)](support: 3) + ∂/∂t[g(t, x)](support: 6) + ∂/∂t[g(t, x)](support: 9) + ∂/∂t[g(t, x)](support: 12) = 42
g(support: 1) - g(support: 2) + 5 ∂/∂t[g(t, x)](support: 2) = 0
g(support: 2) - g(support: 3) + 5 ∂/∂t[g(t, x)](support: 3) = 0
g(support: 4) - g(support: 5) + 5 ∂/∂t[g(t, x)](support: 5) = 0
g(support: 5) - g(support: 6) + 5 ∂/∂t[g(t, x)](support: 6) = 0
g(support: 7) - g(support: 8) + 5 ∂/∂t[g(t, x)](support: 8) = 0
g(support: 8) - g(support: 9) + 5 ∂/∂t[g(t, x)](support: 9) = 0
g(support: 10) - g(support: 11) + 5 ∂/∂t[g(t, x)](support: 11) = 0
g(support: 11) - g(support: 12) + 5 ∂/∂t[g(t, x)](support: 12) = 0
y(support: 1)² + 3 g(support: 1) ≤ 2
y(support: 2)² + 3 g(support: 2) ≤ 2
y(support: 3)² + 3 g(support: 3) ≤ 2
y(support: 1)² + 3 g(support: 4) ≤ 2
y(support: 2)² + 3 g(support: 5) ≤ 2
y(support: 3)² + 3 g(support: 6) ≤ 2
y(support: 1)² + 3 g(support: 7) ≤ 2
y(support: 2)² + 3 g(support: 8) ≤ 2
y(support: 3)² + 3 g(support: 9) ≤ 2
y(support: 1)² + 3 g(support: 10) ≤ 2
y(support: 2)² + 3 g(support: 11) ≤ 2
y(support: 3)² + 3 g(support: 12) ≤ 2
```
This precisely matches what we found analytically. Note that the unique support
combinations are determined automatically and are represented visually as
`support: #`. The precise support values can be looked up via `supports`:
```jldoctest trans_example
julia> supports(y)
3-element Vector{Tuple}:
(0.0,)
(5.0,)
(10.0,)
julia> supports(g)
12-element Vector{Tuple}:
(0.0, [-1.0, -1.0])
(5.0, [-1.0, -1.0])
(10.0, [-1.0, -1.0])
(0.0, [1.0, -1.0])
(5.0, [1.0, -1.0])
(10.0, [1.0, -1.0])
(0.0, [-1.0, 1.0])
(5.0, [-1.0, 1.0])
(10.0, [-1.0, 1.0])
(0.0, [1.0, 1.0])
(5.0, [1.0, 1.0])
(10.0, [1.0, 1.0])
julia> supports(g, ndarray = true) # format it as an n-dimensional array (t by x[1] by x[2])
3×2×2 Array{Tuple, 3}:
[:, :, 1] =
(0.0, [-1.0, -1.0]) (0.0, [1.0, -1.0])
(5.0, [-1.0, -1.0]) (5.0, [1.0, -1.0])
(10.0, [-1.0, -1.0]) (10.0, [1.0, -1.0])
[:, :, 2] =
(0.0, [-1.0, 1.0]) (0.0, [1.0, 1.0])
(5.0, [-1.0, 1.0]) (5.0, [1.0, 1.0])
(10.0, [-1.0, 1.0]) (10.0, [1.0, 1.0])
```
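Similarly, once a solver is attached and the model is optimized, results follow
the same layout; a quick hedged sketch (`Ipopt` is just an assumed example
solver):
```julia
using InfiniteOpt, Ipopt  # Ipopt is an assumed example solver

set_optimizer(inf_model, Ipopt.Optimizer)
optimize!(inf_model)
value(g, ndarray = true)  # 3×2×2 Array of g values (t by x[1] by x[2])
```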

1 change: 1 addition & 0 deletions docs/src/index.md
Accepted infinite/finite problem forms currently include:
- Conic
- Semi-definite
- Indicator
- Anything else supported by JuMP

### Infinite-Dimensional Optimization with InfiniteOpt.jl
See our YouTube overview of infinite-dimensional programming and InfiniteOpt.jl's
2 changes: 2 additions & 0 deletions docs/src/manual/derivative.md
```@docs
derivative_argument(::DerivativeRef)
operator_parameter(::DerivativeRef)
derivative_order(::DerivativeRef)
num_derivatives
all_derivatives
parameter_refs(::DerivativeRef)
has_derivative_constraints(::DerivativeRef)
derivative_constraints(::DerivativeRef)
delete_derivative_constraints(::DerivativeRef)
evaluate_derivative
allows_high_order_derivatives
generative_support_info(::AbstractDerivativeMethod)
support_label(::AbstractDerivativeMethod)
InfiniteOpt.make_reduced_expr
1 change: 1 addition & 0 deletions docs/src/manual/expression.md
Original file line number Diff line number Diff line change
measure_data(::GeneralVariableRef)
is_analytic(::GeneralVariableRef)
derivative_argument(::GeneralVariableRef)
operator_parameter(::GeneralVariableRef)
derivative_order(::GeneralVariableRef)
derivative_method(::GeneralVariableRef)
evaluate(::GeneralVariableRef)
derivative_constraints(::GeneralVariableRef)