From eeea159a5a51775e3fdd719da5421d2ef2e016d8 Mon Sep 17 00:00:00 2001
From: Bryan Hilbert <hilbert@stsci.edu>
Date: Thu, 7 Nov 2024 23:26:27 -0500
Subject: [PATCH] Add notebook for NIRCam WFSS Box Extraction (#246)

* Add notebook for NIRCam WFSS Box Extraction

* Tweaks to text from final readthrough

* PEP8 fixes

* Adjustments after consulting with Nor

* ensure pipeline steps have access to necessary reference files

---------

Co-authored-by: gibsongreen <gibsongreen99@outlook.com>
---
 .../BoxExtraction_using_Grismconf_CRDS.ipynb  | 1146 +++++++++++++++++
 .../requirements.txt                          |    9 +
 2 files changed, 1155 insertions(+)
 create mode 100755 notebooks/NIRCam/NIRCam_WFSS_Box_extraction/BoxExtraction_using_Grismconf_CRDS.ipynb
 create mode 100755 notebooks/NIRCam/NIRCam_WFSS_Box_extraction/requirements.txt
diff --git a/notebooks/NIRCam/NIRCam_WFSS_Box_extraction/BoxExtraction_using_Grismconf_CRDS.ipynb b/notebooks/NIRCam/NIRCam_WFSS_Box_extraction/BoxExtraction_using_Grismconf_CRDS.ipynb
new file mode 100755
index 000000000..74a89bd97
--- /dev/null
+++ b/notebooks/NIRCam/NIRCam_WFSS_Box_extraction/BoxExtraction_using_Grismconf_CRDS.ipynb
@@ -0,0 +1,1146 @@
+{
+ "cells": [
+  {
+   "cell_type": "markdown",
+   "id": "4d6f827d-b694-4875-b3a0-e6ef001c602e",
+   "metadata": {},
+   "source": [
+    "# WFSS Box Extraction Example"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "9e5bd923-8f7f-40b8-a86f-2b7e3f5d5351",
+   "metadata": {},
+   "source": [
+    "This notebook demonstrates how to use the Generalized World Coordinate System (gWCS) in a Wide Field Slitless Spectrscopy (WFSS) observation to determine source locations and wavelengths. It shows how to calculate the location of a source in a WFSS observation given its location in a corresponding imaging observation, and how to calculate the wavelength at a particular pixel along an object's trace.\n",
+    "\n",
+    "It then shows how to use the gWCS to perform a box extraction of a spectrum and translate the 1D spectrum into physical units.\n",
+    "\n",
+    "In this example, we use exposures from JWST program 01076. We want to work on files that have full gWCS information in their headers, and that have had the flat field applied. We also need to run the flux calibration step of the pipeline in order to populate the name of the photom reference file in the header of the WFSS file (in the S_PHOTOM header keyword). This reference file will be used as part of the extraction process below. The photom step will not change the values of the science data in the WFSS exposure, because the observing mode (OBS_MODE header keyword) in the file is set to NRC_WFSS.\n",
+    "\n",
+    "In order to accomplish this, the <b>assign_wcs</b>, <b>flat field</b>, and <b>photom</b> steps of the pipeline must be run on the data. Ordinarily this means we could simply download *_cal.fits files from MAST, and that is true for the imaging mode data used in this notebook. However as we show below, we want to apply the imaging mode flat field to the WFSS data. This means that we must download the *_rate.fits file, and manually run these pipeline steps on the data. For consistency, we do the same with the imaging mode data.\n",
+    "\n",
+    "JWST detectors show little to no wavelength dependence in their flat-field, and just as is regularly done with HST WFSS data, in this example we have the pipeline apply the flat field for the direct cross filter to all the imaging as well as WFSS observations. We do not use a WFSS-specific flat field.\n",
+    "\n",
+    "Once the data have been properly calibrated, the notebook uses the <b>grismconf</b> package to translate between source locations in the imaging and WFSS data, and calculate wavelengths associated with a given location in the WFSS data. <b>grismconf</b> also uses the flux calibration curve in the photom reference file for the grisms to translate the data from units of $DN/sec$ to $F_{lambda}$ units ($erg / sec / cm^2 / \\overset{\\circ}{A}$). <b>grismconf</b> will obtain the needed NIRCam WFSS configuration files from the Calibration Reference Data System (CRDS). Note that the photom step must be run on the data in order to obain the name of the approproate CRDS sesitivity file.\n",
+    "\n",
+    "Note: At this stage, the important part of this is not the absolute accuracy of the WCS. Instead, we rely on accurate self-consistency between the imaging and the WFSS observations. \n",
+    "\n",
+    "Author: N. Pirzkal <br>\n",
+    "Date created: 24 Sept 2024"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "89e0c28a-df34-4bfb-af97-9ddf9d5768b4",
+   "metadata": {},
+   "source": [
+    "## Table of Contents\n",
+    "1. [Package Imports](#Package-Imports)\n",
+    "2. [Define Functions and Parameters](#Define-Functions-and-Parameters)\n",
+    "3. [Download Data](#Download-Data)\n",
+    "4. [Run Pipeline Steps](#Run-Pipeline-Steps)\n",
+    "5. [Basic Computation of WFSS Information](#Basic-Computation-of-WFSS-Information)\n",
+    "      * [Compute where light gets dispersed to](#Compute-where-light-gets-dispersed-to)\n",
+    "      * [Compute the spectral trace for a given object](#Compute-the-spectral-trace-for-a-given-object)\n",
+    "      * [Basic Box Extraction](#Basic-Box-Extraction)"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "7ffc9e2c-5958-4e61-8073-d584f8331c30",
+   "metadata": {},
+   "source": [
+    "## Package Imports"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "3d34005b-94f5-42c0-b13b-cc10aea8a7f9",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "from copy import deepcopy\n",
+    "import matplotlib.pyplot as plt\n",
+    "import numpy as np\n",
+    "import os\n",
+    "import requests\n",
+    "from scipy.stats import sigmaclip\n",
+    "\n",
+    "from crds import client\n",
+    "import grismconf\n",
+    "from jwst.assign_wcs import AssignWcsStep\n",
+    "from jwst.flatfield import FlatFieldStep\n",
+    "from jwst.photom import PhotomStep"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "117e4ec7-610c-4d11-82f4-756971dc23e0",
+   "metadata": {},
+   "source": [
+    "## Define Functions and Parameters"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "46833187-8b09-4abb-8317-572154b56700",
+   "metadata": {},
+   "source": [
+    "Define a function to download a named file via the MAST API to the current directory. The function includes authentication logic, but the example in this notebook uses public data, so no MAST API token is required."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "a2b09d49-838f-4523-a6b2-9e678d10666a",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "def get_jwst_file(name, mast_api_token=None, overwrite=False):\n",
+    "    \"\"\"Retrieve a JWST data file from MAST archive.\n",
+    "    \n",
+    "    Parameters\n",
+    "    ----------\n",
+    "    name : str\n",
+    "        Name of the file to download from MAST\n",
+    "        \n",
+    "    mast_api_token : str\n",
+    "        MAST API token. Required only for proprietary data\n",
+    "        \n",
+    "    overwrite : bool\n",
+    "        If True and the requested file already exists locally, the file will not be downloaded. IF False,\n",
+    "        the file will be downloaded\n",
+    "    \"\"\"\n",
+    "    # If the file already exists locally, don't redownload it, unless the\n",
+    "    # user has set the overwrite keyword\n",
+    "    if os.path.isfile(name):\n",
+    "        if not overwrite:\n",
+    "            print(f'{name} already exists locally. Skipping download.')\n",
+    "            return\n",
+    "        else:\n",
+    "            print(f'{name} exists locally. Re-downloading.')\n",
+    "\n",
+    "    mast_url = \"https://mast.stsci.edu/api/v0.1/Download/file\"\n",
+    "    params = dict(uri=f\"mast:JWST/product/{name}\")\n",
+    "    if mast_api_token:\n",
+    "        headers = dict(Authorization=f\"token {mast_api_token}\")\n",
+    "    else:\n",
+    "        headers = {}\n",
+    "    r = requests.get(mast_url, params=params, headers=headers, stream=True)\n",
+    "    r.raise_for_status()\n",
+    "    with open(name, \"wb\") as fobj:\n",
+    "        for chunk in r.iter_content(chunk_size=1024000):\n",
+    "            fobj.write(chunk)"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "c3f0f371-8912-425a-8911-7eec9b46f841",
+   "metadata": {},
+   "source": [
+    "Define a function that will run assign_wcs and flat fielding on an input rate file"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "0de8bf4a-3f12-466c-8887-f7d1bed0f543",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "def run_pipeline_steps(filename):\n",
+    "    \"\"\"Run the assign_wcs, flat field, and photom calibration steps on the given file.\n",
+    "    If the file contains WFSS data, trick the pipeline to use the imaging mode flat\n",
+    "    field reference file.\n",
+    "    \n",
+    "    Parameters\n",
+    "    ----------\n",
+    "    filename : str\n",
+    "        Name of the input file upon which the steps will be run\n",
+    "        \n",
+    "    Returns\n",
+    "    -------\n",
+    "    filename : str\n",
+    "        Name of the output file saved by the pipeline steps\n",
+    "        \n",
+    "    photom : jwst.datamodels.ImageModel\n",
+    "        Datamodel instance containing the calibrated data\n",
+    "    \"\"\"\n",
+    "    assign_wcs = AssignWcsStep.call(filename)\n",
+    "\n",
+    "    # In order to apply the imaging mode flat field reference file to the data,\n",
+    "    # we need to trick CRDS by temporarily changing the pupil value to be CLEAR\n",
+    "    reset_pupil = False\n",
+    "    if 'GRISM' in assign_wcs.meta.instrument.pupil:\n",
+    "        true_pupil = deepcopy(assign_wcs.meta.instrument.pupil)\n",
+    "        assign_wcs.meta.instrument.pupil = 'CLEAR'\n",
+    "        reset_pupil = True\n",
+    "\n",
+    "    # Run the flat field step\n",
+    "    flat = FlatFieldStep.call(assign_wcs, save_results=True)\n",
+    "    \n",
+    "    # Run the photom step to populate the name of the WFSS sensitivity \n",
+    "    photom = PhotomStep.call(flat, save_results=True)\n",
+    "    \n",
+    "    # Set the pupil back to the original value now that flat fielding is complete\n",
+    "    if reset_pupil:\n",
+    "        photom.meta.instrument.pupil = true_pupil\n",
+    "        photom.save(photom.meta.filename)\n",
+    "    \n",
+    "    # Return the name of the output file, as well as the datamodel\n",
+    "    return photom.meta.filename, photom"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "41068262-9a54-46c5-a9d2-ca2217ad1fee",
+   "metadata": {},
+   "source": [
+    "## Download Data and Ensure CRDS Configuration"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "3f7b6965-2313-45c3-9399-c8601be0fdbd",
+   "metadata": {},
+   "source": [
+    "Download an example imaging mode rate file and corresponding WFSS mode rate file from MAST."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "253df6eb-f6ab-4c53-8c0e-ad58f64cda9b",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "# First, download the imaging and WFSS files from MAST\n",
+    "imaging_file = \"jw01076103001_02102_00001_nrcalong_rate.fits\"\n",
+    "wfss_file = \"jw01076103001_02101_00001_nrcalong_rate.fits\"\n",
+    "get_jwst_file(imaging_file)\n",
+    "get_jwst_file(wfss_file)"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "faa27c94",
+   "metadata": {},
+   "source": [
+    "If CRDS is not configured, set the CRDS server URL and define a local cache path \n",
+    "to ensure the pipeline steps can access reference files as needed."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "af7a4057",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "if os.environ.get(\"CRDS_PATH\") is None:\n",
+    "    client.set_crds_server('https://jwst-crds.stsci.edu')\n",
+    "    os.environ[\"CRDS_SERVER_URL\"] = \"https://jwst-crds.stsci.edu\"\n",
+    "    os.environ[\"CRDS_PATH\"] = \"\""
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "ab5f3c78-258f-49c4-99c4-87e26f0972cf",
+   "metadata": {},
+   "source": [
+    "## Run Pipeline Steps"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "d5c025ca-ebba-49c2-ab24-a6318a29b65c",
+   "metadata": {},
+   "source": [
+    "Run the assign_wcs, flat field, and photom calibration steps on both the imaging and WFSS files."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "27a15b96-cfd7-4670-9142-a5378e053dd1",
+   "metadata": {
+    "scrolled": true
+   },
+   "outputs": [],
+   "source": [
+    "# Run AssignWcsStep, FlatFieldStep, and PhotomStep on the imaging rate file\n",
+    "imaging_flat_file, imaging_data = run_pipeline_steps(imaging_file)"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "67401d19-bbcf-45fc-b2a0-06c133f998e0",
+   "metadata": {
+    "scrolled": true
+   },
+   "outputs": [],
+   "source": [
+    "# Run AssignWcsStep, FlatFieldStep, and PhotomStep on the WFSS rate file\n",
+    "wfss_flat_file, wfss_data = run_pipeline_steps(wfss_file)"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "5da7a2f7-7546-4791-ae9f-7ff0953af46a",
+   "metadata": {},
+   "source": [
+    "## Basic Computation of WFSS Information"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "455e24ae-791b-4331-85e8-b28d62a287fe",
+   "metadata": {},
+   "source": [
+    "All computations for WFSS are performed in detector coordinate space. All of the characteristics of the dispersed traces, including any change in the relative positions and the global shape (e.g. curvature, offsets...) of the traces is handled using a series of straight forward equations. This is described in <a href=\"https://www.stsci.edu/files/live/sites/www/files/home/hst/instrumentation/wfc3/documentation/instrument-science-reports-isrs/_documents/2017/WFC3-2017-01.pdf\">ISR WFC3 2017-01</a>: \"A more generalized coordinate transformation approach for grisms\".\n",
+    "Here we assume that a source would be at the pixel coordinates of ($x$, $y$). The coordinate of a single pixel on on the dispersed trace for the same source is denoted as ($x_g$, $y_g$) and the relative position of this dispersed trace element is therefore offset (x$_g$-x, \n",
+    " y$_g$-y) pixels with respect to the position of the source. The functional relation between ($x$, $y$), ($x_g$, $y_g$) and the wavelength of the light $\\lambda$, as well as their inverses are:\n",
+    "\n",
+    "$$\n",
+    "\\begin{align}\n",
+    "\\delta x = x_g - x = f_x(x,y;t)\\\\\n",
+    "\\delta y = y_g - y = f_y(x,y;t)\\\\\n",
+    "\\lambda = f_\\lambda(x,y;t)\n",
+    "\\end{align}\n",
+    "$$\n",
+    "\n",
+    "and \n",
+    "$$\n",
+    "\\begin{align}\n",
+    "t = f^{-1}_x(x,y;\\delta x)\\\\\n",
+    "t = f^{-1}_y(x,y;\\delta y)\\\\\n",
+    "t = f^{-1}_\\lambda(x,y;\\lambda)\n",
+    "\\end{align}\n",
+    "$$"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "83ea3d86-19fc-4529-b551-22a2ec98818f",
+   "metadata": {},
+   "source": [
+    "Note that these functions are parametrized with respect to the parameter $t$. This allows for some flexibility on the part of the calibration effort as $t$ can be defined somewhat arbitrarilly. In the case of the NIRCam grisms however, $t$ was chosen to be the $\\delta x$ or $\\delta y$, for the GRISMR and GRISMC, respectively since these grisms disperse light along the x-direction and y-direction, respectively. However, for additional convenience, the $t$ parameter is normalized to unity so that values of $t = 0$ and $t = 1$ correspond to the blue and red light edges of a dispersed spectrum.\n",
+    "Using the 6 equations above, one can relate any combination of ($x$,$y$), ($x'$,$y'$), $t$, and $\\lambda$ values. The equations listed above are implemented as DISPX(), DISPY(), DISPL(), INVDISPX(), INVDISPY(), and INVDISPL() in the GRISMCONF package."
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "d179fcbb-9d08-4eed-a3d4-29d716823a8f",
+   "metadata": {},
+   "source": [
+    "Now we will use the Grismconf package to retrieve information about the WFSS file. Note that we are using the output file from the calibration steps above."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "fe3fd942-7839-44c7-9ad9-a188e2a1d942",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "# This is the final output file from the pipeline call on the WFSS file above\n",
+    "wfss_file = \"jw01076103001_02101_00001_nrcalong_photomstep.fits\""
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "513059b9-0ebd-4431-a942-b6182fcc64c7",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "# Load a WFSS configuration file to use in the example below.\n",
+    "C = grismconf.Config(wfss_file)"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "edca7185-8e19-445b-b45d-1617e52cd75f",
+   "metadata": {},
+   "source": [
+    "### Compute where light gets dispersed to"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "077da68e-0367-4e80-9adb-97794a5b951f",
+   "metadata": {},
+   "source": [
+    "Here we show how to calculate the location of the point on the trace corresponding to a given wavelength for a source at a given detector location ($x$, $y$). For these calculations, we need only the WFSS file. The corresponding imaging mode file is not necessary."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "7ef508ca-a7c6-4b8b-a089-110b652984e4",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "x = 1000  # Pixel x coordinate\n",
+    "y = 1000  # Pixel y coordinate\n",
+    "\n",
+    "wavelength = 3.5  # wavelength, in microns"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "31248889-2e95-469f-8526-f2b89c83e4fa",
+   "metadata": {},
+   "source": [
+    "We want to compute $\\hat x$, the amount of dispersion in a pixel for photons with a wavelength of $\\lambda$. We first use the relation between $t$ and $\\lambda$ and then the relation between $\\hat x$ and $t$. This is done using INVDISPL() for order \"+1\" for an object at location ($x$, $y$):"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "ea7ebf64-6469-43bc-9176-b50b957223b0",
+   "metadata": {},
+   "source": [
+    "Check which orders are available"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "e50b4915-34a3-4161-b7e4-f15c514f8619",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "C.orders"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "c5a89da8-1694-40a6-af02-b166ea20810d",
+   "metadata": {},
+   "source": [
+    "Calculate $t$ for the given position and wavelength."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "251e5c79-22a9-4f09-9f31-a3410b025fb2",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "t = C.INVDISPL(\"+1\", x, y, wavelength)\n",
+    "print(\"t =\", t)"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "a1f1617a-7bc1-42cf-bb35-76dfeb8c78a3",
+   "metadata": {},
+   "source": [
+    "We now can compute $\\delta x$ and $\\delta y$ using DISPX():"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "c2b817a4-0da0-4bf7-84d9-683a4e72979a",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "𝛿x = C.DISPX(\"+1\", x, y, t)\n",
+    "𝛿y = C.DISPY(\"+1\", x, y, t)\n",
+    "print(\"𝛿x =\", 𝛿x)\n",
+    "print(\"𝛿y =\", 𝛿y)"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "de14b4c7-a8c3-486f-9147-b4cfe7a1908f",
+   "metadata": {},
+   "source": [
+    "The final pixel coordinates are therefore:"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "e5283aa4-fd40-45d9-a865-fe05f17005e8",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "xg = x + 𝛿x\n",
+    "yg = y + 𝛿y\n",
+    "print(\"Trace coordinates:\", xg, yg)"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "0482f604-97b7-4c90-952e-09ffd8ba78de",
+   "metadata": {},
+   "source": [
+    "Alternatively, we could compute the approximate wavelength of the light at a given position on the spectral trace. For example, we would like to compute the wavelength of a pixel that is at coordinates ($x_g$, $y_g$) for a 1st order spectrum of a source that is known to be at the coordinates ($x$, $y$). As this is a Grism R spectrum, we can use the relation between $\\delta x$ and t and $\\lambda$. "
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "c77920de-3dc0-429a-a266-de8a95b20ae6",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "# Source is at the coordinates (1000, 1000) and we are looking at a pixel\n",
+    "# along the trace at pixel coordinate 1558\n",
+    "x = 1000\n",
+    "y = 1000"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "f7b33090-f6ca-4fe9-bae1-055e708f8f23",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "t = C.INVDISPX(\"+1\", x, y, xg-x)"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "1fe1df61-db4a-4594-8722-1364e2a9e2ae",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "wavelength = C.DISPL(\"+1\", x, y, t)\n",
+    "print(f\"Wavelength = {wavelength} microns\")"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "7a1550bd-572f-4662-bfef-5f3b1fac82d1",
+   "metadata": {},
+   "source": [
+    "Here we see that we get back the 3.5 micron wavelength that we used as input when calculating $x_g$ and $y_g$ above."
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "2abb689b-06b8-461d-bd9a-d79ff590b568",
+   "metadata": {},
+   "source": [
+    "### Compute the spectral trace for a given object"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "6ac1665e-3fd0-4c5f-9251-0a6474be9afc",
+   "metadata": {},
+   "source": [
+    "We can compute where we would expect the dispersed 1st order trace for a given object in a similar manner. We can use a series of $t$ values to cover the whole spectra trace (in this case the NIRCam calibration assumes $0<t<1$)"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "4c0e0be7-791a-4331-a718-563fb94d1c03",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "x = 1000\n",
+    "y = 1000\n",
+    "ts = np.arange(0, 1, 0.01)"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "108bc026-4d4e-4fad-a256-8a4fa5464109",
+   "metadata": {},
+   "source": [
+    "We can compute all of the corresponding trace coordinates and wavelengths (x$_g$, y$_g$, $\\lambda$):"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "2a5b6bee-2a00-4a10-966d-749a7ceba444",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "xgs = C.DISPX(\"+1\", x, y, ts) + x\n",
+    "ygs = C.DISPY(\"+1\", x, y, ts) + y \n",
+    "waves = C.DISPL(\"+1\", x, y, ts)"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "ad32a4e7-20fe-4549-8855-a35f21e07c20",
+   "metadata": {},
+   "source": [
+    "Show an image of the location of the trace across the detector."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "730f3d0e-0595-49fc-a5a6-01a32478731b",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "fig, ax = plt.subplots(1, 1, figsize=(20, 2))\n",
+    "plt.scatter(xgs, ygs, c=waves)\n",
+    "plt.colorbar(label=\"Wavelength (μm)\")\n",
+    "plt.grid()\n",
+    "plt.ylabel(r\"y$_g$\")\n",
+    "plt.xlabel(r\"x$_g$\")\n",
+    "plt.title(\"Trace location across the detector\")"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "ea8598ee",
+   "metadata": {},
+   "source": [
+    "### Basic Box Extraction"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "0454645e",
+   "metadata": {},
+   "source": [
+    "A very basic \"extraction\" of a spectrum can be performed using the WFSS transformation listed above. Here we show how one could perfrom a basic \"Box\" extraction of a well-isolated object i.e. not contaminated by overlapping spectra from other sources in the field)."
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "fcc1b242",
+   "metadata": {},
+   "source": [
+    "One of the key concepts is the virtual location of the source in the dispersed WFSS observation. We are not able to determine that location precisely using the WFSS data alone and must rely on additional information. The location of the source is usually measured in a direct image of the field. In cases where the imaging and WFSS data are undithered with respect to one another, the position of the source in the WFSS observation is the position of the same source in the undithered imaging. \n",
+    "\n",
+    "If dithering was performed, one can rely on the WCS to estimate the location of the source in imaging data that would have been observed at the same position and orientation as the WFSS observation. This is done by computing the observed RA and Dec of the source in the available imaging data, and then converting these newly computed celestrial coordinates back to a detector ($x$, $y$) location using a WCS from the dithered WFSS observation."
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "3a3b7e35-ea97-4192-84df-02e971a26896",
+   "metadata": {},
+   "source": [
+    "#### Examine a source in the data"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "df889818-f14c-4827-9ad6-60f3d9088b03",
+   "metadata": {},
+   "source": [
+    "Look at a source in the imaging data"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "fc6998fe",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "xd, yd = 1562, 696"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "c92b21d1",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "plt.imshow(imaging_data.data[yd - 20:yd + 20, xd - 20:xd + 20], origin=\"lower\", vmin=0, vmax=10)\n",
+    "plt.xlabel('Column Number')\n",
+    "plt.ylabel('Row Number')"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "0288321c-ffa0-49d1-8c7b-bfcb8c913ad2",
+   "metadata": {},
+   "source": [
+    "#### Calculate trace location"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "fda8eb1d-f408-45f2-b8c3-b542f385fc8d",
+   "metadata": {},
+   "source": [
+    "Now get the WCS from the imaging and WFSS files, so that we can perform coordinate transforms and calculate where the trace of this source lands in the WFSS observation."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "3d4ff71e",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "imaging_to_world = imaging_data.meta.wcs.get_transform('detector', 'world')"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "16678969",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "wfss_to_pix = wfss_data.meta.wcs.get_transform('world', 'detector')"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "c29ce542-5867-4d28-bdd3-7ce538f772f7",
+   "metadata": {},
+   "source": [
+    "Translate the source's (x, y) location in the imaging mode data to RA, Dec, using the imaging mode WCS"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "a286c856",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "ra, dec = imaging_to_world(xd, yd)\n",
+    "print(ra, dec)"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "6acb41e1-9cae-415b-bf22-81815aa6732c",
+   "metadata": {},
+   "source": [
+    "Now translate the RA, Dec to a location on the detector in the WFSS data, using the WCS from the WFSS file. Note that for this translation, the wavelength and order are required inputs, but they do not actually affect the calculation. We'll use a wavelength of 3.56 microns and an order of 1 in the cell below, but you can see that changing these values does not change the resulting x, y values."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "c9e8b8a2",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "x, y, wav, ord = wfss_to_pix(ra, dec, 3.56, 1)\n",
+    "x, y, wav, ord"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "5d05a900",
+   "metadata": {},
+   "source": [
+    "Compute an approximate bounding box for the spectrum of this source. We use the fact that calibration define the edges of the spectra at values of t=0 and t=1 in the dispersion direction"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "9fa975dc",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "# First calculate the distance from the nominal x, y location to the left and right\n",
+    "# edges of the box (i.e. where t = 0 and t = 1)\n",
+    "ts = np.array([0, 1])\n",
+    "dxs = C.DISPX(\"+1\", x, y, ts)\n",
+    "dys = C.DISPY(\"+1\", x, y, ts)\n",
+    "dxs, dys"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "35590b9b",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "# Use the distances above to calculate the x location of the left and right edges of the box.\n",
+    "x_min = int(x + dxs[0])\n",
+    "x_max = int(x + dxs[1])\n",
+    "\n",
+    "# Set the height of the box to be 50 pixels (25 pixels above and below the nominal location\n",
+    "y_min = int(y + dys[1] - 25)\n",
+    "y_max = int(y + dys[1] + 25)\n",
+    "\n",
+    "x_min, x_max"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "b7f820e7-64c9-481b-ba35-effa4c43eb3e",
+   "metadata": {},
+   "source": [
+    "Show an image of the box to be extracted."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "b20354b1",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "fig, ax = plt.subplots(1, 1, figsize=(20, 5))\n",
+    "ax.imshow(wfss_data.data[y_min:y_max, x_min:x_max], origin=\"lower\", vmin=0.25, vmax=1)\n",
+    "\n",
+    "ax.plot(x+dxs-x_min, y+dys-y_min)\n",
+    "ax.set_xlabel('Column Number')\n",
+    "ax.set_ylabel('Row Number')"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "964ad3fb",
+   "metadata": {},
+   "source": [
+    "The simplest extraction can be done under the assumption that the spectral trace is essentially linear and that the wavelength of the light falling on a pixel is only a function of the dispersion direction. As such, we can use the approximation that all pixels in a column correspond to the same wavelength. This is only appropriate for nearly flat dispersions and has often been assumed in the past. This also assumes that the field dependence of the dispersion is small within the area covered by a spectrum."
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "050a4f9c",
+   "metadata": {},
+   "source": [
+    " #### Estimate Background"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "b2672209",
+   "metadata": {},
+   "source": [
+    "Here, for this simplest case, we estimate and subtract a simple background value from the data, taken to be the median value in the region contianing our spectrum. This very simple step illustrates the limits of simple box extraction as any such estimate will be biased by any other sources in the field, or any non-flatness in the structure of the dispersed background. "
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "98027608",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "# We want to ignore all NaN pixels, so locate all non-NaN pixels\n",
+    "ok = np.isfinite(wfss_data.data[y_min:y_max, x_min:x_max])\n",
+    "\n",
+    "# Sigma-clip the pixels in the box\n",
+    "clipped, low, upp = sigmaclip(wfss_data.data[y_min:y_max, x_min:x_max][ok], 1.5, 1.5)\n",
+    "\n",
+    "# Calculate the median of the sigma-clipped pixels\n",
+    "med_bck = np.nanmedian(clipped)\n",
+    "print(\"Background level estimate:\", med_bck)"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "b237bdba",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "# Plot a histogram of the data, along with a line indicating the median level\n",
+    "plt.hist(np.ravel(wfss_data.data[y_min:y_max, x_min:x_max]), 100, range=(0, 1.5))\n",
+    "plt.axvline(med_bck, color='r')\n",
+    "plt.title('Histogram of pixel values in extraction box')\n",
+    "plt.xlabel('Signal Value (DN/sec)')\n",
+    "plt.ylabel('Occurrences')"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "3ffce123",
+   "metadata": {},
+   "source": [
+    "#### Assign Wavelengths"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "56917762",
+   "metadata": {},
+   "source": [
+    "Compute the wavelength everywhere on the array for this object. We use the full array here for simplicity and compute the distance in the x- direction (dispersion direction) between a pixel and the source."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "c59da7fd",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "ys, xs = np.indices(np.shape(wfss_data.data))"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "9c55e9da-3875-491b-9501-c4a32d20bf10",
+   "metadata": {},
+   "source": [
+    "Translate the x values to be distances from the nominal source location"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "f5aa8977",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "dxs = xs - x"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "7015c062",
+   "metadata": {},
+   "source": [
+    "We use the grismconf relations to compute the $t$ values everywhere and then compute the wavelengths, again everywhere."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "b202ed97",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "ts = C.INVDISPX(\"+1\", x, y, dxs)\n",
+    "lams = C.DISPL(\"+1\", x, y, ts)"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "a21064e9",
+   "metadata": {},
+   "source": [
+    "Displaying the wavelength array on the same area on the detector where the spectrum is, we see what we now have a wavelength estimate for each element of the 2D WFSS data."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "1faf5150",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "fig, ax = plt.subplots(1, 1, figsize=(20, 3))\n",
+    "tt = ax.imshow(lams[y_min:y_max, x_min:x_max], origin=\"lower\", vmin=2, vmax=4)\n",
+    "cbar = plt.colorbar(tt).ax.set_ylabel('Wavelength (microns)', rotation=270, fontsize=12, labelpad=25)\n",
+    "plt.title('Wavelength array covering all pixels of extraction box')\n",
+    "plt.xlabel('Column Number')\n",
+    "plt.ylabel('Row Number')"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "cbef6f40",
+   "metadata": {},
+   "source": [
+    "#### Box extraction"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "75c3f5d4",
+   "metadata": {},
+   "source": [
+    "Creating a 1D spectrum from the data above is, in its simplest form, a matter of collapsing things along the cross dipsersion direction (y- direction). The 1D spectrum will be affected by bad pixels, any spectra contamination, as well as any error in the amount of background subtracted. These effects become increasingly important as the box size is increased. We also subtract the median background from each column."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "4fbe376b",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "cs = np.nansum(wfss_data.data[y_min:y_max, x_min:x_max] - med_bck, axis=0)\n",
+    "plt.plot(cs)\n",
+    "plt.ylim(-5, 20)\n",
+    "plt.grid()\n",
+    "plt.xlabel('Column Number')\n",
+    "plt.ylabel('Signal (DN/sec)')\n",
+    "plt.title('Extracted 1D spectrum versus detector column')"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "b366a9da",
+   "metadata": {},
+   "source": [
+    "#### Wavelength calibration"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "741a0153",
+   "metadata": {},
+   "source": [
+    "In this case, we can average the wavelength array in the cross dispersion direction (y-direction) to obtain a single wavelength vector that is appropriate for the spectum we just box extracted. Plotting the 1D count vector versus the wavelength vector thus created results in a wavelength calibrated 1D spectrum"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "be76cfbe",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "ws = np.nanmean(lams[y_min:y_max, x_min:x_max], axis=0)\n",
+    "plt.plot(ws, cs)\n",
+    "plt.ylim(-5, 20)\n",
+    "plt.grid()\n",
+    "plt.xlabel('Wavelength (microns)')\n",
+    "plt.ylabel('Signal (DN/sec)')\n",
+    "plt.title('Extracted 1D spectrum versus wavelength')"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "1ae8d3ca",
+   "metadata": {},
+   "source": [
+    "#### Converting to physical units"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "96c88153",
+   "metadata": {},
+   "source": [
+    "Here we use information contained in grismconf to convert our 1D spectrum, which is in units of DN/sec (per bin, or pixel in this case since there is no resampling of the data) into units of $erg / sec / cm^2 / \\overset{\\circ}{A}$. The grism inverse sensitivity curve in grismconf is in units of DN/sec (per pixel) per $erg / sec / cm^2 / \\overset{\\circ}{A}$. As the sensitivity is defined per Angstrom while our 1D spectrum is extracted per pixel, we need to account for this when applying the sensitivity. In most WFSS data the variation of the pixel size in wavelength units is small but this can easily be accounted for by computing the size of each pixel in wavelength units (Angstrom) as we show below."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "34104ce0",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "# Plot the inverse sensitivity curve\n",
+    "tws = np.arange(2.9, 4.2, 0.01)\n",
+    "plt.plot(tws, C.SENS[\"+1\"](tws))\n",
+    "plt.grid()\n",
+    "plt.xlabel(\"Wavelength (micron)\")\n",
+    "plt.ylabel(\"Inverse Sensitivity\")\n",
+    "plt.title(\"Inverse sensitivity curve\")"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "37b026eb",
+   "metadata": {},
+   "source": [
+    "Calculate the size of each element in our 1D spectrum, in Angstrom"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "e25d39e4",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "# The value of 10000 comes from the NIRCam dispersion value of ~10A per pixel\n",
+    "dws = (ws[1:] - ws[:-1]) * 10000"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "c60703d2",
+   "metadata": {},
+   "source": [
+    "Apply the sensitivity curve by dividing the 1D counts by the sensitivity and by the size of each bin in Angstrom to produce flux values in units of $erg / sec / cm^2 / \\overset{\\circ}{A}$"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "7c26b68e",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "fs = cs[1:] / C.SENS[\"+1\"](ws[1:]) / dws"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "71fc1dc4",
+   "metadata": {},
+   "source": [
+    "Plotting the resulting flux array versus our wavelength scale shows the fully calbirated 1D spectrum. One caveat of the conversion from DN/sec per pixel to $erg / sec / cm^2 / \\overset{\\circ}{A}$ is the well known edge effect caused by dividing counts values that are zero or close to zero by the sensitivity. As this method does not account for the broadening of the spectra by the footprint of the object in the dispersion direction, this effect is worse for extended sources. This effect can only be mitigated by more forward modeling extraction methods."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "e4e27c86",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "plt.plot(ws[1:], fs)\n",
+    "plt.xlabel(\"Wavelength (micron)\")\n",
+    "plt.ylabel(\"Signal ($erg/s/cm^2/A$)\")\n",
+    "plt.title(\"Extracted 1D spectrum in $F_{lambda}$ units\")"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "9a744786",
+   "metadata": {},
+   "source": [
+    "Concentrating on regions off the spectral edges reveals the spectrum of our source. One of the many drawbacks of this simplistic box extraction method is the impact of bad pixels and any residual cosmic rays. These can cause significnat peaks in the spectra, which can be confused for emission lines."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "7e883b58",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "plt.plot(ws[1:], fs)\n",
+    "plt.xlim(3.1, 4.)\n",
+    "plt.ylim(0, 0.5e-16)\n",
+    "plt.xlabel(\"Wavelength (micron)\")\n",
+    "plt.ylabel(\"Signal ($erg/s/cm^2/A$)\")\n",
+    "plt.title(\"Extracted 1D spectrum in $F_{lambda}$ units\")"
+   ]
+  }
+ ],
+ "metadata": {
+  "kernelspec": {
+   "display_name": "Python 3 (ipykernel)",
+   "language": "python",
+   "name": "python3"
+  },
+  "language_info": {
+   "codemirror_mode": {
+    "name": "ipython",
+    "version": 3
+   },
+   "file_extension": ".py",
+   "mimetype": "text/x-python",
+   "name": "python",
+   "nbconvert_exporter": "python",
+   "pygments_lexer": "ipython3",
+   "version": "3.11.10"
+  }
+ },
+ "nbformat": 4,
+ "nbformat_minor": 5
+}
diff --git a/notebooks/NIRCam/NIRCam_WFSS_Box_extraction/requirements.txt b/notebooks/NIRCam/NIRCam_WFSS_Box_extraction/requirements.txt
new file mode 100755
index 000000000..94cdddf08
--- /dev/null
+++ b/notebooks/NIRCam/NIRCam_WFSS_Box_extraction/requirements.txt
@@ -0,0 +1,9 @@
+astropy >= 6.1.3
+grismconf >= 1.51
+jupyter >= 1.1.1
+jwst >= 1.16.0
+matplotlib >= 3.9.2
+numpy == 1.26.4
+requests >= 2.32.3
+scipy >= 1.14.1
+crds >= 12.0.4
\ No newline at end of file