SAE script multiple parameters #161

Merged: 14 commits, Jul 3, 2024
1 change: 1 addition & 0 deletions .github/workflows/Latex.yml
@@ -35,6 +35,7 @@ jobs:
make put_figures_outside_of_minted_environment -C docs
make do_correct_quotation_marks -C docs
make make_correct_thrm_and_dfntn_and_xmpl_and_rmrk_and_proof_environment -C docs
make references_instead_of_hyperlinks -C docs
- name: compile tex document
run: |
cd docs/build
5 changes: 4 additions & 1 deletion docs/Makefile
@@ -11,7 +11,7 @@ latex: latex_no_pdf
$(MAKE) compile_tex;
$(MAKE) compile_tex

latex_no_pdf_no_images: install_brenier_two_fluid latex_docs_no_pdf copy_png_files put_figures_outside_of_minted_environment do_correct_quotation_marks make_correct_thrm_and_dfntn_and_xmpl_and_rmrk_and_proof_environment
latex_no_pdf_no_images: install_brenier_two_fluid latex_docs_no_pdf copy_png_files put_figures_outside_of_minted_environment do_correct_quotation_marks make_correct_thrm_and_dfntn_and_xmpl_and_rmrk_and_proof_environment references_instead_of_hyperlinks

latex_no_pdf: latex_images latex_no_pdf_no_images

@@ -95,6 +95,9 @@ do_correct_quotation_marks:
sed -i'' -e 's/{\\textquotedbl}/"/g' build/G*.tex;
sed -i'' -e 's/ "/ ``/g' build/G*.tex

references_instead_of_hyperlinks:
sed -i'' -e 's/\\hyperlinkref{\([0-9]*\)}{\([a-zA-Z -]*\)}/\2 (see \\Cref{\1})/g' build/G*.tex

copy_png_files:
find build/manifolds -name \*.png -exec cp {} build \; ;
find build/optimizers/manifold_related -name \*.png -exec cp {} build \; ;
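To illustrate the substitution that the new `references_instead_of_hyperlinks` rule performs, here is a small sketch that mirrors the sed pattern in Julia (the example string and reference number are made up):

```julia
# mirrors the sed rule added above: \hyperlinkref{<number>}{<text>} -> <text> (see \Cref{<number>})
line    = raw"see \hyperlinkref{42}{the snapshot matrix} for details"
pattern = r"\\hyperlinkref\{([0-9]*)\}\{([a-zA-Z -]*)\}"

replace(line, pattern => s"\2 (see \\Cref{\1})")
# "see the snapshot matrix (see \Cref{42}) for details"
```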
2 changes: 1 addition & 1 deletion docs/make.jl
@@ -25,7 +25,7 @@ const html_format = Documenter.HTML(;
# specifies that we do not display the package name again (it's already in the logo)
sidebar_sitename = false,
# we should get rid of this line again eventually. We will be able to do this once we got rid of library.md
size_threshold = 262144,
size_threshold = 524288,
)

# if platform is set to "none" then no output pdf is generated
45 changes: 27 additions & 18 deletions docs/src/data_loader/data_loader.md
@@ -1,9 +1,6 @@
# Data Loader

```@eval
using GeometricMachineLearning, Markdown
Markdown.parse(description(Val(:DataLoader)))
```
The `DataLoader` in `GeometricMachineLearning` is designed to make training convenient.

The data loader can be called with various types of arrays as input, for example a [snapshot matrix](snapshot_matrix.md):
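A minimal sketch of such a call (the array sizes here are chosen arbitrarily):

```julia
using GeometricMachineLearning

# a snapshot matrix: system dimension × number of time steps
SnapshotMatrix = rand(Float32, 10, 100)

dl = DataLoader(SnapshotMatrix)
dl.input_dim  # 10
```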

@@ -25,7 +22,8 @@ SnapshotTensor = rand(Float32, 10, 100, 5)
dl = DataLoader(SnapshotTensor)
```

Here the `DataLoader` has different properties `:RegularData` and `:TimeSeries`. This indicates that in the first case we treat all columns in the input tensor independently (this is mostly used for autoencoder problems), whereas in the second case we have *time series-like data*, which are mostly used for integration problems.
Here the `DataLoader` has different properties `:RegularData` and `:TimeSeries`. This indicates that in the first case we treat all columns in the input tensor independently (this is mostly used for autoencoder problems), whereas in the second case we have *time series-like data*, which are mostly used for integration problems. The default when using a matrix is `:RegularData` and the default when using a tensor is `:TimeSeries`.

We can also treat a problem with a matrix as input as a time series-like problem by providing an additional keyword argument: `autoencoder=false`:

```@example
@@ -37,11 +35,7 @@ dl = DataLoader(SnapshotMatrix; autoencoder=false)
dl.input_time_steps
```


```@eval
using GeometricMachineLearning, Markdown
Markdown.parse(description(Val(:data_loader_for_named_tuple)))
```
If we deal with Hamiltonian systems we typically split the coordinates into a ``q`` and a ``p`` part. Such data can also be used as input arguments for `DataLoader`:

```@example named_tuple_tensor
using GeometricMachineLearning # hide
@@ -51,17 +45,15 @@ SymplecticSnapshotTensor = (q = rand(Float32, 10, 100, 5), p = rand(Float32, 10,
dl = DataLoader(SymplecticSnapshotTensor)
```

The dimension of the system is then the sum of the dimensions of the ``q`` and the ``p`` component:

```@example named_tuple_tensor
dl.input_dim
```

## The `Batch` struct

```@eval
using GeometricMachineLearning, Markdown
Markdown.parse(description(Val(:Batch)))
```

If we want to draw mini batches from a data loader, we need to allocate an instance of [`Batch`](@ref). If we call the corresponding functor on an instance of [`DataLoader`](@ref) we get the following result:

```@example
using GeometricMachineLearning # hide
@@ -73,7 +65,15 @@ batch = Batch(3)
batch(dl)
```

This also works if the data are in ``qp`` form:
The output of applying the batch functor is always of the form:

```math
([(b_{1,1}^t, b_{1,1}^p), (b_{1,2}^t, b_{1,2}^p), \ldots], [(b_{2,1}^t, b_{2,1}^p), (b_{2,2}^t, b_{2,2}^p), \ldots], [(b_{3,1}^t, b_{3,1}^p), \ldots], \ldots),
```

so it is a tuple of vectors of tuples. One vector represents one batch. The tuples in one vector always have two entries: a *time index* and a *parameter index*, indicated by the superscripts ``t`` and ``p`` respectively in the equation above.
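A hedged sketch of how this output can be read (shapes as in the example above):

```julia
using GeometricMachineLearning

dl    = DataLoader(rand(Float32, 10, 100, 5))
batch = Batch(3)

batches      = batch(dl)       # tuple of vectors of (time index, parameter index) tuples
first_batch  = batches[1]      # one batch: a vector of index tuples
t_idx, p_idx = first_batch[1]  # one sample: a time index and a parameter index
```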

This also works if the data are in ``(q, p)`` form:

```@example
using GeometricMachineLearning # hide
@@ -103,7 +103,7 @@ Specifically the routines do the following:
3. ``\mathcal{I}_i \leftarrow \mathtt{indices[(i - 1)} \cdot \mathtt{batch\_size} + 1 \mathtt{:} i \cdot \mathtt{batch\_size]}\text{ for }i=1, \ldots, (\mathrm{last} -1),``
4. ``\mathcal{I}_\mathtt{last} \leftarrow \mathtt{indices[}(\mathtt{n\_batches} - 1) \cdot \mathtt{batch\_size} + 1\mathtt{:end]}.``

Note that the routines are implemented in such a way that no two indices appear double.
Note that the routines are implemented in such a way that no index appears twice, i.e. we *sample without replacement*.
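The following sketch roughly reproduces the partitioning described above in plain Julia (not the package implementation; the sizes are made up):

```julia
using Random: randperm

n_indices, batch_size = 10, 3
n_batches = ceil(Int, n_indices / batch_size)

indices = randperm(n_indices)   # shuffle all indices once per epoch
batches = [indices[(i - 1) * batch_size + 1 : min(i * batch_size, n_indices)]
           for i in 1:n_batches] # the last batch may be smaller than batch_size
```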

## Sampling from a tensor

@@ -133,4 +133,13 @@ This algorithm can be visualized the following way (here `batch_size = 4`):
Main.include_graphics("../tikz/tensor_sampling") # hide
```

Here the sampling is performed over the second axis (the *time step dimension*) and the third axis (the *parameter dimension*). Whereas each block has thickness 1 in the ``x`` direction (i.e. pertains to a single parameter), its length in the ``y`` direction is `seq_length`. In total we sample as many such blocks as the batch size is big. By construction those blocks are never the same throughout a training epoch but may intersect each other!
Here the sampling is performed over the second axis (the *time step dimension*) and the third axis (the *parameter dimension*). Whereas each block has thickness 1 in the ``x`` direction (i.e. pertains to a single parameter), its length in the ``y`` direction is `seq_length`. In total we sample as many such blocks as the batch size is big. By construction those blocks are never the same throughout a training epoch but may intersect each other!
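A hedged sketch of drawing a single such block (all names and sizes here are illustrative):

```julia
dim, time_steps, n_params, seq_length = 10, 100, 5, 4
snapshot_tensor = rand(Float32, dim, time_steps, n_params)

t_start = rand(1:(time_steps - seq_length + 1))  # position along the time step axis
p_index = rand(1:n_params)                       # which parameter instance
block   = snapshot_tensor[:, t_start:(t_start + seq_length - 1), p_index]

size(block)  # (10, 4)
```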

## Library Functions

```@docs; canonical=false
DataLoader
Batch
GeometricMachineLearning.number_of_batches
GeometricMachineLearning.convert_input_and_batch_indices_to_array
```
54 changes: 26 additions & 28 deletions docs/src/layers/sympnet_gradient.md
@@ -19,36 +19,34 @@ Main.proof(raw"""Proving this is straightforward by looking at the gradient of
""" * Main.indentation * raw""" \nabla_q^2f & \mathbb{I}
""" * Main.indentation * raw""" \end{pmatrix},
""" * Main.indentation * raw"""```
""" * Main.indentation * raw"""
""" * Main.indentation * raw"""where ``\nabla_q^2f`` is the Hessian of ``f``. This matrix is symmetric and for any symmetric matrix ``A`` we have that:
""" * Main.indentation * raw"""
""" * Main.indentation * raw"""```math
""" * Main.indentation * raw""" \begin{pmatrix}
""" * Main.indentation * raw""" \mathbb{I} & \mathbb{O} \\
""" * Main.indentation * raw""" A & \mathbb{I}
""" * Main.indentation * raw""" \end{pmatrix}^T \mathbb{J}_{2n}
""" * Main.indentation * raw""" \begin{pmatrix}
""" * Main.indentation * raw""" \mathbb{I} & \mathbb{O} \\
""" * Main.indentation * raw""" A & \mathbb{I}
""" * Main.indentation * raw""" \end{pmatrix} =
""" * Main.indentation * raw""" \begin{pmatrix}
""" * Main.indentation * raw""" \mathbb{I} & A \\
""" * Main.indentation * raw""" \mathbb{O} & \mathbb{I}
""" * Main.indentation * raw""" \end{pmatrix}
""" * Main.indentation * raw""" \begin{pmatrix}
""" * Main.indentation * raw""" \mathbb{O} & \mathbb{I} \\
""" * Main.indentation * raw""" -\mathbb{I} & \mathbb{O}
""" * Main.indentation * raw""" \end{pmatrix}
""" * Main.indentation * raw""" \begin{pmatrix}
""" * Main.indentation * raw""" \mathbb{I} & \mathbb{O} \\
""" * Main.indentation * raw""" A & \mathbb{I}
""" * Main.indentation * raw""" \end{pmatrix} =
""" * Main.indentation * raw""" \begin{pmatrix}
""" * Main.indentation * raw""" \mathbb{O} & \mathbb{I} \\
""" * Main.indentation * raw""" -\mathbb{I} & \mathbb{O}
""" * Main.indentation * raw""" \end{pmatrix} = \mathbb{J}_{2n}.
""" * Main.indentation * raw""" \begin{pmatrix}
""" * Main.indentation * raw""" \mathbb{I} & \mathbb{O} \\
""" * Main.indentation * raw""" A & \mathbb{I}
""" * Main.indentation * raw""" \end{pmatrix}^T \mathbb{J}_{2n}
""" * Main.indentation * raw""" \begin{pmatrix}
""" * Main.indentation * raw""" \mathbb{I} & \mathbb{O} \\
""" * Main.indentation * raw""" A & \mathbb{I}
""" * Main.indentation * raw""" \end{pmatrix} =
""" * Main.indentation * raw""" \begin{pmatrix}
""" * Main.indentation * raw""" \mathbb{I} & A \\
""" * Main.indentation * raw""" \mathbb{O} & \mathbb{I}
""" * Main.indentation * raw""" \end{pmatrix}
""" * Main.indentation * raw""" \begin{pmatrix}
""" * Main.indentation * raw""" \mathbb{O} & \mathbb{I} \\
""" * Main.indentation * raw""" -\mathbb{I} & \mathbb{O}
""" * Main.indentation * raw""" \end{pmatrix}
""" * Main.indentation * raw""" \begin{pmatrix}
""" * Main.indentation * raw""" \mathbb{I} & \mathbb{O} \\
""" * Main.indentation * raw""" A & \mathbb{I}
""" * Main.indentation * raw""" \end{pmatrix} =
""" * Main.indentation * raw""" \begin{pmatrix}
""" * Main.indentation * raw""" \mathbb{O} & \mathbb{I} \\
""" * Main.indentation * raw""" -\mathbb{I} & \mathbb{O}
""" * Main.indentation * raw""" \end{pmatrix} = \mathbb{J}_{2n},
""" * Main.indentation * raw"""```
""" * Main.indentation * raw"""Thus showing symplecticity.""")
""" * Main.indentation * raw"""thus showing symplecticity.""")
```
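As a quick numerical sanity check of this proof we can verify ``(\nabla\Phi)^T\mathbb{J}_{2n}\nabla\Phi = \mathbb{J}_{2n}`` for the map ``(q, p) \mapsto (q, p + \nabla_qf(q))``; the following sketch assumes the `ForwardDiff` package is available (it is not part of the docs build):

```julia
using LinearAlgebra
import ForwardDiff

n = 2
f(q) = sum(q .^ 3)                                # an arbitrary scalar function of q
Φ(z) = vcat(z[1:n], z[n+1:2n] .+ ForwardDiff.gradient(f, z[1:n]))

J  = [zeros(n, n) I(n); -I(n) zeros(n, n)]        # the Poisson matrix J_{2n}
DΦ = ForwardDiff.jacobian(Φ, randn(2n))

DΦ' * J * DΦ ≈ J  # true: the Jacobian is symplectic
```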

If we deal with [`GSympNet`](@ref)s this function ``f`` is
@@ -60,7 +58,7 @@ If we deal with [`GSympNet`](@ref)s this function ``f`` is
where ``a, b\in\mathbb{R}^m``, ``K\in\mathbb{R}^{m\times{}n}`` and ``\Sigma`` is the antiderivative of some common activation function ``\sigma``. We routinely refer to ``m`` as the *upscaling dimension* in `GeometricMachineLearning`. Computing the gradient of ``f`` gives:

```math
[\nabla_qf]_k = \sum_{i=1}^m a_i \sigma(\sum_{j=1}^nk_{ij}q_j + b_i)k_{ik} = K^T a \odot \sigma(Kq + b),
[\nabla_qf]_k = \sum_{i=1}^m a_i \sigma(\sum_{j=1}^n k_{ij}q_j + b_i)k_{ik} = [K^T (a \odot \sigma(Kq + b))]_k,
```

where ``\odot`` is the element-wise product, i.e. ``[a\odot{}v]_k = a_kv_k``. This is the form that *gradient layers* take. In addition to gradient layers `GeometricMachineLearning` also has *linear* and *activation* layers implemented. Activation layers are simplified versions of *gradient layers*. These are equivalent to taking ``m = n`` and ``K = \mathbb{I}.``
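The gradient formula can be checked with a small numerical sketch (sizes and the activation function are chosen arbitrarily):

```julia
σ(x) = tanh(x)
m, n = 5, 3
K, a, b, q = randn(m, n), randn(m), randn(m), randn(n)

# the k-th entry is Σ_i a_i σ(Σ_j K_ij q_j + b_i) K_ik
∇qf = K' * (a .* σ.(K * q .+ b))
```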
26 changes: 6 additions & 20 deletions docs/src/tutorials/symplectic_autoencoder.md
@@ -139,30 +139,14 @@ We can see that the autoencoder approach has much more approximation capabilitie
Instead of using a standard integrator we can also use a neural network that is trained on the reduced data. For this:

```@example toda_lattice
data_unprocessed = encoder(sae_nn)(dl.input)
data_processed = ( q = reshape(data_unprocessed.q, reduced_dim ÷ 2, length(data_unprocessed.q)),
p = reshape(data_unprocessed.p, reduced_dim ÷ 2, length(data_unprocessed.p))
)

dl_reduced = DataLoader(data_processed; autoencoder = false)
integrator_batch_size = 128
integrator_train_epochs = 4
integrator_train_epochs = 10

integrator_nn = NeuralNetwork(GSympNet(reduced_dim))
o_integrator = Optimizer(AdamOptimizer(Float64), integrator_nn)
struct ReducedLoss{ET, DT} <: GeometricMachineLearning.NetworkLoss
encoder::ET
decoder::DT
end
function (loss::ReducedLoss)(model::Chain, params::Tuple, input::CT, output::CT) where {AT <:Array, CT <: NamedTuple{(:q, :p), Tuple{AT, AT}}}
GeometricMachineLearning._compute_loss(loss.decoder(model(loss.encoder(input), params)), output)
end

loss = ReducedLoss(encoder(sae_nn), decoder(sae_nn))
dl_integration = DataLoader((q = reshape(dl.input.q, size(dl.input.q, 1), size(dl.input.q, 3)),
p = reshape(dl.input.p, size(dl.input.p, 1), size(dl.input.p, 3)));
autoencoder = false
)
dl_integration = DataLoader(dl; autoencoder = false)

o_integrator(integrator_nn, dl_integration, Batch(integrator_batch_size), integrator_train_epochs, loss)

@@ -172,15 +156,17 @@ nothing # hide
We can now evaluate the solution:

```@example toda_lattice
ics = (q = dl_reduced.input.q[:, 1], p = dl_reduced.input.p[:, 1])
ics = encoder(sae_nn)((q = dl.input.q[:, 1, 1], p = dl.input.p[:, 1, 1]))
time_series = iterate(integrator_nn, ics; n_points = t_steps)
prediction = (q = time_series.q[:, end], p = time_series.p[:, end])
sol = decoder(sae_nn)(prediction)

plot!(sol.q; label = "Neural Network Integrator")
```


```@eval
Main.remark(raw"The results presented here are not ideal. In order to be able to run them quickly on a relatively weak GPU we have chosen a very low number of epochs. For more useful results the number of epochs should be increased to at least 1000.")
```


## References
22 changes: 11 additions & 11 deletions scripts/symplectic_autoencoders/online_sympnet.jl
@@ -1,18 +1,19 @@
using CUDA
using GeometricIntegrators: integrate, ImplicitMidpoint
using CairoMakie
using GeometricProblems.TodaLattice: hodeproblem
using GeometricProblems.TodaLattice: hodeensemble
using GeometricMachineLearning
import Random

backend = CUDABackend()

pr = hodeproblem(; tspan = (0.0, 100.))
params = [(α = α̃ ^ 2, N = 200) for α̃ in 0.8 : .1 : 2.0]
pr = hodeensemble(; tspan = (0.0, 100.), parameters = params)
sol = integrate(pr, ImplicitMidpoint())
dl_cpu = DataLoader(sol; autoencoder = true)
dl = DataLoader(dl_cpu, backend, Float32)

const reduced_dim = 2
const reduced_dim = 4

psd_arch = PSDArch(dl.input_dim, reduced_dim)
sae_arch = SymplecticAutoencoder(dl.input_dim, reduced_dim; n_encoder_blocks = 4, n_decoder_blocks = 4, n_encoder_layers = 2, n_decoder_layers = 2)
@@ -22,11 +23,13 @@ psd_nn = NeuralNetwork(psd_arch, backend)
sae_nn = NeuralNetwork(sae_arch, backend)

const n_epochs = 8192
const batch_size = 512
const batch_size = 2048

sae_method = AdamOptimizerWithDecay(n_epochs)
o = Optimizer(sae_nn, sae_method)

println("Number of batches: ", GeometricMachineLearning.number_of_batches(dl, Batch(batch_size)))

psd_error = solve!(psd_nn, dl)
sae_error = o(sae_nn, dl, Batch(batch_size), n_epochs)

@@ -65,14 +68,11 @@ sol_sae_reduced = integrate_reduced_system(sae_rs)

# train sympnet

data_unprocessed = encoder(sae_nn)(dl.input)
data_processed = ( q = reshape(data_unprocessed.q, reduced_dim ÷ 2, length(data_unprocessed.q)),
p = reshape(data_unprocessed.p, reduced_dim ÷ 2, length(data_unprocessed.p))
)
data_processed = encoder(sae_nn)(dl.input)

dl_reduced = DataLoader(data_processed; autoencoder = false)
integrator_train_epochs = 8192
integrator_batch_size = 512
integrator_batch_size = 2048

seq_length = 4
integrator_architecture = StandardTransformerIntegrator(reduced_dim; transformer_dim = 10, n_blocks = 3, n_heads = 5, L = 2, upscaling_activation = tanh)
@@ -129,7 +129,7 @@ end
# plot the reduced data (should be a closed orbit)
reduced_data_matrix = vcat(dl_reduced.input.q, dl_reduced.input.p)
fig_reduced = Figure()
ax_reduced = Axis(reduced_fig[1, 1])
lines!(ax_reduced, reduced_data_matrix[1, :, 1], reduced_data_matrix[2, :, 1]; color = mgreen, label = "Reduced Data")
ax_reduced = Axis(fig_reduced[1, 1])
lines!(ax_reduced, reduced_data_matrix[1, :, 1] |> mtc, reduced_data_matrix[2, :, 1] |> mtc; color = mgreen, label = "Reduced Data")
axislegend(; position = (.82, .75), backgroundcolor = :transparent, color = text_color)
save("reduced_data.png", fig_reduced)