1 parent: 45536dd
Commit: 9c6c3fa
Showing 3 changed files with 339 additions and 1 deletion.
@@ -0,0 +1,17 @@ | ||
.. quickBayes documentation master file, created by | ||
   sphinx-quickstart on Mon Apr 3 15:37:35 2023. | ||
   You can adapt this file completely to your liking, but it should at least | ||
   contain the root `toctree` directive. | ||
 | ||
Examples | ||
======== | ||
 | ||
Contents | ||
-------- | ||
 | ||
.. toctree:: | ||
   :maxdepth: 2 | ||
   :caption: Contents: | ||
 | ||
   muon.ipynb | ||
   QENS.ipynb | ||
@@ -0,0 +1,321 @@ | ||
{ | ||
"cells": [ | ||
{ | ||
"cell_type": "markdown", | ||
"id": "be4f0f00-aa17-4130-8d9f-791593a44884", | ||
"metadata": {}, | ||
"source": [ | ||
"## Muon Decay Rates\n", | ||
"\n", | ||
"In this example we will look at some simulated muon spectroscopy data. This data has been used at the ISIS Neutron and Muon source to test software. The simulated data is described by a background plus some exponential decays. The decay constants can indicate the muon stopping site or the relaxation time for the muon, depending on the experimental setup. There can be multiple muon stopping sites or relaxation times within a sample, so it is important to be able to identify the number of exponential decays. \n", | ||
"\n", | ||
"The first step in the analysis is to import all functionality that we will need. From `quickBayes` there are three imports; \n", | ||
"- The workflow `MuonExpDecay`\n", | ||
"- A function for adding fitting functions `CompositeFunction`\n", | ||
"- A function for getting the background term from a string `get_background_function`" | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": null, | ||
"id": "c58c06d2-268b-4ba3-b855-d3dc05b6d8de", | ||
"metadata": {}, | ||
"outputs": [], | ||
"source": [ | ||
"from quickBayes.functions.composite import CompositeFunction\n", | ||
"from quickBayes.utils.general import get_background_function\n", | ||
"from quickBayes.workflow.model_selection.muon_decay import MuonExpDecay\n", | ||
"import numpy as np\n", | ||
"import matplotlib.pyplot as plt\n", | ||
"import os" | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"id": "a9dfa5bf-e487-4d64-94ef-20d22e7c75b0", | ||
"metadata": {}, | ||
"source": [ | ||
"The data is contained as part of the tests for `quickBayes`. On loading the data it can be useful to visually inspect it." | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": null, | ||
"id": "b13cec5f-8fa2-4274-abec-aa454d5711cc", | ||
"metadata": {}, | ||
"outputs": [], | ||
"source": [ | ||
"DATA_DIR = os.path.join('..', '..', '..', 'test', 'data', 'muon')\n", | ||
"data_file = os.path.join(DATA_DIR, 'muon_expdecay_2.npy')\n", | ||
"\n", | ||
"sx, sy, se = np.loadtxt(data_file)\n", | ||
"\n", | ||
"fig, ax = plt.subplots()\n", | ||
"ax.errorbar(sx, sy, se, fmt='kx', label='data');\n", | ||
"ax.set(xlabel='Time ($\\mu s$)', ylabel='Asymmetry', title='Muon data');\n", | ||
"plt.legend();\n" | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"id": "0eb3b451-fb86-4871-a342-b0e6db5f2615", | ||
"metadata": {}, | ||
"source": [ | ||
"Looking at the data its not obvious how many decays are present (the name of the file tells us that the answer is two). To continue we need to set the problem definition. The start value is set by the time zero of the experiment and the end point is chosen based on the signal to noise ratio. The maximum number of decays that we want to consider is four. The results and results_errors are empty as we are doing a fresh calculation." | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": null, | ||
"id": "08555b48-20e4-41da-9e86-1b34001a3bcb", | ||
"metadata": {}, | ||
"outputs": [], | ||
"source": [ | ||
"start_x = 0.16\n", | ||
"end_x = 15.0\n", | ||
"max_features = 4\n", | ||
"results = {}\n", | ||
"results_errors = {}" | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"id": "22b6a37e-b0da-497b-abcf-24054f6a4817", | ||
"metadata": {}, | ||
"source": [ | ||
"The `workflow` contains most of the complicated calculations. The `workflow` takes the `results` and `results_errors` and will append to them. The next step is to prepare that data for analysis using the `preprocess_data` method, in this case it just crops the data between the start and end values. " | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": null, | ||
"id": "77e519bb-5ca4-4035-bb99-ef3ff45a78fc", | ||
"metadata": {}, | ||
"outputs": [], | ||
"source": [ | ||
"workflow = MuonExpDecay(results, results_errors)\n", | ||
"workflow.preprocess_data(sx, sy, se, start_x, end_x)" | ||
] | ||
}, | ||
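{ | ||
"cell_type": "markdown", | ||
"id": "crop-sketch-note", | ||
"metadata": {}, | ||
"source": [ | ||
"For illustration only, the next cell sketches roughly what the cropping step does, using plain `numpy` on the raw arrays. This is an assumption about the behaviour (keep the points with times between `start_x` and `end_x`), not the `quickBayes` implementation." | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": null, | ||
"id": "crop-sketch-code", | ||
"metadata": {}, | ||
"outputs": [], | ||
"source": [ | ||
"# Illustrative sketch (assumed behaviour), not the quickBayes internals:\n", | ||
"# keep only the points whose time values lie inside [start_x, end_x].\n", | ||
"mask = (sx >= start_x) & (sx <= end_x)\n", | ||
"cx, cy, ce = sx[mask], sy[mask], se[mask]\n", | ||
"print(f'kept {cx.size} of {sx.size} points')" | ||
] | ||
}, | ||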
{ | ||
"cell_type": "markdown", | ||
"id": "be8a8427-54e5-49bd-8f56-41f517ece4c7", | ||
"metadata": {}, | ||
"source": [ | ||
"The function we want to consider is a function of time ($t$) of the form\n", | ||
"$$\n", | ||
"y_N(t) = ct + \\sum_{j=1}{N} A_j \\exp(-\\lambda_j t),\n", | ||
"$$\n", | ||
"where $c$ is a constant background value, $A_j$ is the amplitude of the $j^\\mathrm{th}$ decay, $N$ is the number of decays being considered and $\\lambda_j$ is the decay rate of the $j^\\mathrm{th}$ decay. \n", | ||
"\n", | ||
"First we use the `get_backgound_function` to get a constant background (flat), the other options are `None` and `linear`. The `CompositeFunction` is a wrapper for adding fitting functions together, this is done by using the `add_function` method. " | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": null, | ||
"id": "a13ad359-9f8f-417d-b6d7-9159f2c1cb3b", | ||
"metadata": {}, | ||
"outputs": [], | ||
"source": [ | ||
"BG = get_background_function('flat')\n", | ||
"func = CompositeFunction()\n", | ||
"func.add_function(BG)" | ||
] | ||
}, | ||
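{ | ||
"cell_type": "markdown", | ||
"id": "model-sketch-note", | ||
"metadata": {}, | ||
"source": [ | ||
"As an illustration of the model above (not part of the workflow), the next cell evaluates $y_N(t)$ directly with `numpy` for some made-up parameter values, to show what a flat background plus two exponential decays looks like. The amplitudes and decay rates used here are arbitrary choices, not fitted values." | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": null, | ||
"id": "model-sketch-code", | ||
"metadata": {}, | ||
"outputs": [], | ||
"source": [ | ||
"# Illustration only: y_N(t) = c + sum_j A_j exp(-lambda_j t)\n", | ||
"# with made-up values for c, A_j and lambda_j.\n", | ||
"t = np.linspace(0.16, 15.0, 200)\n", | ||
"c = 0.05\n", | ||
"amps = [0.2, 0.1]\n", | ||
"rates = [0.8, 3.0]\n", | ||
"y = c + sum(A*np.exp(-lam*t) for A, lam in zip(amps, rates))\n", | ||
"\n", | ||
"fig, ax = plt.subplots()\n", | ||
"ax.plot(t, y, 'g-', label='illustrative $y_2(t)$')\n", | ||
"ax.set(xlabel='Time', ylabel='Asymmetry', title='Example of the model form');\n", | ||
"plt.legend();" | ||
] | ||
}, | ||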
{ | ||
"cell_type": "markdown", | ||
"id": "12596157-4e97-4c0d-b8d7-029232bbfa07", | ||
"metadata": {}, | ||
"source": [ | ||
"It is helpful to know the default bounds for the fitting function, this can be obtained with the `get_bounds` method." | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": null, | ||
"id": "4135b236-b88b-4489-827e-4688e613b0d9", | ||
"metadata": {}, | ||
"outputs": [], | ||
"source": [ | ||
"lower, upper = func.get_bounds()\n", | ||
"print('lower bounds', lower)\n", | ||
"print('upper bounds', upper)" | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"id": "ec6039eb-9159-4738-ba53-8c106f514d4e", | ||
"metadata": {}, | ||
"source": [ | ||
"Looking at the plot above, it is clear that these bounds are too large. We can set them using the `set_bounds` method, it takes a list of the lower and upper bounds as arguments. The final optional argument is the `index` this tells the `CompositeFunction` which function the new bounds should be associated with, by default it is the last function to be added to the `CompositeFunction`. " | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": null, | ||
"id": "b6561e43-d295-4494-8230-1f846438bd83", | ||
"metadata": {}, | ||
"outputs": [], | ||
"source": [ | ||
"func.set_bounds([-.1], [.25])\n", | ||
"print(func.get_bounds())" | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"id": "56e37ad4-7da5-463d-bfd0-eaca26ccfdac", | ||
"metadata": {}, | ||
"source": [ | ||
"There is also a `get_guess` method for getting the start values for a fit. If these are not suitable then it can be updated using the `set_guess` method. The arguments are the new guess and the `index` of the function you would like to update (the default is the last function to be added). " | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": null, | ||
"id": "0bf154db-5a94-48fe-8713-61f0da200e7d", | ||
"metadata": {}, | ||
"outputs": [], | ||
"source": [ | ||
"print(func.get_guess())" | ||
] | ||
}, | ||
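{ | ||
"cell_type": "markdown", | ||
"id": "set-guess-sketch-note", | ||
"metadata": {}, | ||
"source": [ | ||
"For illustration only, the next cell updates the guess for the most recently added function (the flat background) using `set_guess`, as described above. The value `0.1` is an arbitrary choice, not a recommended starting point." | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": null, | ||
"id": "set-guess-sketch-code", | ||
"metadata": {}, | ||
"outputs": [], | ||
"source": [ | ||
"# Illustrative only: replace the default guess for the flat background\n", | ||
"# (the last function added) with an arbitrary value.\n", | ||
"func.set_guess([0.1])\n", | ||
"print(func.get_guess())" | ||
] | ||
}, | ||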
{ | ||
"cell_type": "markdown", | ||
"id": "e48bfbba-1ee6-4459-be2a-8c9f9bd30e94", | ||
"metadata": {}, | ||
"source": [ | ||
"The workflow requires a fitting engine to compute the results. At present there are two options `scipy` and `gofit` (global optimization, see __[here for more details](https://ralna.github.io/GOFit/_build/html/index.html)__). The next line executes the workflow, this will calculate the log of the posterior probability for one to four decays, along with the corresponding fitting results. It returns the last fitting function it used, in this case four decays plus a flat background. " | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": null, | ||
"id": "46faff90-a530-417b-8140-5cf9db5bdf52", | ||
"metadata": {}, | ||
"outputs": [], | ||
"source": [ | ||
"workflow.set_scipy_engine(func.get_guess(), *func.get_bounds())\n", | ||
"func = workflow.execute(max_features, func, func.get_guess())" | ||
] | ||
}, | ||
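{ | ||
"cell_type": "markdown", | ||
"id": "returned-func-note", | ||
"metadata": {}, | ||
"source": [ | ||
"As a quick, purely illustrative check, the returned function should now describe a flat background plus four decays, so its guess list is correspondingly longer than before (assuming one guess value per parameter)." | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": null, | ||
"id": "returned-func-check", | ||
"metadata": {}, | ||
"outputs": [], | ||
"source": [ | ||
"# Illustrative check: the guess now covers the background plus four decays.\n", | ||
"print(len(func.get_guess()), func.get_guess())" | ||
] | ||
}, | ||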
{ | ||
"cell_type": "markdown", | ||
"id": "28fb2a5e-2d00-4f39-9c62-131541d04ee3", | ||
"metadata": {}, | ||
"source": [ | ||
"The results and errors of the calculation can be obtained from the `get_parameters_and_errors` method. This returns two dictionaries, one for the values and one for the errors. The keys indicate the parameter, with the `Nx` showing how many features (decays) were used in the calculation. Within the `results` are the loglikelihoods ($log_{10}(P)$), which are the logs of the unnormalized posterior probability. Hence, the most likely model is the one with the largest value. " | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": null, | ||
"id": "fb35d454-c112-44c3-b4b4-60514011f528", | ||
"metadata": {}, | ||
"outputs": [], | ||
"source": [ | ||
"results, results_errors = workflow.get_parameters_and_errors\n", | ||
"for key in results.keys():\n", | ||
"    if 'log' in key:\n", | ||
"        print(key, results[key][0])" | ||
] | ||
}, | ||
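{ | ||
"cell_type": "markdown", | ||
"id": "relative-prob-note", | ||
"metadata": {}, | ||
"source": [ | ||
"Since these are unnormalized $\log_{10}$ posterior probabilities, one way to compare the models is to normalize them into relative probabilities. The next cell is a sketch of that calculation using `numpy`; it is not part of the `quickBayes` API and assumes the printed values are directly comparable between models." | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": null, | ||
"id": "relative-prob-code", | ||
"metadata": {}, | ||
"outputs": [], | ||
"source": [ | ||
"# Sketch: convert the unnormalized log10(P) values into relative probabilities.\n", | ||
"log_keys = [key for key in results.keys() if 'log' in key]\n", | ||
"logp = np.array([results[key][0] for key in log_keys])\n", | ||
"rel = 10.0**(logp - logp.max())  # shift by the maximum to avoid overflow\n", | ||
"prob = rel / rel.sum()\n", | ||
"for key, p in zip(log_keys, prob):\n", | ||
"    print(key, f'{p:.3f}')" | ||
] | ||
}, | ||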
{ | ||
"cell_type": "markdown", | ||
"id": "442f31c1-6a43-4c0e-9155-b485a8340d1d", | ||
"metadata": {}, | ||
"source": [ | ||
"The results show that two decays are most likely, followed by one decay. The code includes a penalty for overfitting, this reduces the probability of a model that is overparameterized. \n", | ||
"\n", | ||
"It would be helpful to view the fits against the data. To do this we get the `fit_engine` object directly, which keeps a history of the fits it has performed. The `fit_engine` has a `get_fit_values` method that returns;\n", | ||
"- the x data\n", | ||
"- the fitted y values\n", | ||
"- the errors on the fit\n", | ||
"- the difference between the fit and the original data (not interpolated)\n", | ||
"- the errors on the differences\n", | ||
"\n", | ||
"for the requested fit. The first fit was for one decay, so will be at index `0`. " | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": null, | ||
"id": "0c5527d4-3918-4c93-a4e9-5b1c8ef8c759", | ||
"metadata": {}, | ||
"outputs": [], | ||
"source": [ | ||
"fit_engine = workflow.fit_engine\n", | ||
"x1, y1, e1, _, _ = fit_engine.get_fit_values(0)\n", | ||
"\n", | ||
"fit_engine = workflow.fit_engine\n", | ||
"x2, y2, e2, _, _ = fit_engine.get_fit_values(1)\n" | ||
] | ||
}, | ||
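{ | ||
"cell_type": "markdown", | ||
"id": "chi2-sketch-note", | ||
"metadata": {}, | ||
"source": [ | ||
"As an optional, purely illustrative check, the differences and their errors returned by `get_fit_values` can be combined into a chi-squared-like number per data point for each fit. This sketch assumes the fourth and fifth return values are the fit-minus-data differences and their errors, as listed above; it is not part of the workflow." | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": null, | ||
"id": "chi2-sketch-code", | ||
"metadata": {}, | ||
"outputs": [], | ||
"source": [ | ||
"# Sketch: a chi-squared-like measure per point, using the returned\n", | ||
"# differences and their errors (assumed meanings, see the note above).\n", | ||
"for index, label in [(0, '1 exp'), (1, '2 exp')]:\n", | ||
"    _, _, _, diff, diff_e = fit_engine.get_fit_values(index)\n", | ||
"    chi2 = np.sum((np.asarray(diff) / np.asarray(diff_e))**2) / len(diff)\n", | ||
"    print(label, chi2)" | ||
] | ||
}, | ||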
{ | ||
"cell_type": "markdown", | ||
"id": "9860a0b8-e903-44b3-8e16-db219dfa6b3c", | ||
"metadata": {}, | ||
"source": [ | ||
"Plotting the results with the data shows how difficult it is to determine the number of decays by eye. The simulation was for two decays, so the workflow has correctly identified the number of decays. " | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": null, | ||
"id": "55920234-61f1-4fc0-961f-01c5787c21c5", | ||
"metadata": {}, | ||
"outputs": [], | ||
"source": [ | ||
"fig, ax = plt.subplots()\n", | ||
"ax.errorbar(sx, sy, se, fmt='ok', label='data')\n", | ||
"ax.set(xlabel='time ($\\mu s)$', ylabel='Asymmetry', title='input data');\n", | ||
"ax.errorbar(x1, y1, e1, fmt='b--', label='1 exp')\n", | ||
"ax.errorbar(x2, y2, e2, fmt='r--', label='2 exp')\n", | ||
"ax.set(xlabel='Time ($\\mu s$)', ylabel='Asymmetry', title='Muon data');\n", | ||
"\n", | ||
"plt.legend();" | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"id": "295e67c4-06dc-400c-9ac6-157ad928b625", | ||
"metadata": {}, | ||
"source": [ | ||
"The fit parameters for the two decays are contained within the `results` dictionary and the `results_errors` has the corresponding errors. It is important to note that the loglikelihoods do not have corresponding error measurements. " | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": null, | ||
"id": "3e8a5293-51bd-479d-9d97-a68c3299335e", | ||
"metadata": {}, | ||
"outputs": [], | ||
"source": [ | ||
"for key in results.keys():\n", | ||
"    if 'N2' in key and 'log' not in key:\n", | ||
"        print(key, results[key][0], results_errors[key][0])" | ||
] | ||
} | ||
], | ||
"metadata": { | ||
"kernelspec": { | ||
"display_name": "Python 3 (ipykernel)", | ||
"language": "python", | ||
"name": "python3" | ||
}, | ||
"language_info": { | ||
"codemirror_mode": { | ||
"name": "ipython", | ||
"version": 3 | ||
}, | ||
"file_extension": ".py", | ||
"mimetype": "text/x-python", | ||
"name": "python", | ||
"nbconvert_exporter": "python", | ||
"pygments_lexer": "ipython3", | ||
"version": "3.11.10" | ||
} | ||
}, | ||
"nbformat": 4, | ||
"nbformat_minor": 5 | ||
} |