diff --git a/notebooks/NIRISS_WFSS_advanced/00_niriss_mast_query_data_setup.ipynb b/notebooks/NIRISS_WFSS_advanced/00_niriss_mast_query_data_setup.ipynb new file mode 100644 index 000000000..01f5e6afa --- /dev/null +++ b/notebooks/NIRISS_WFSS_advanced/00_niriss_mast_query_data_setup.ipynb @@ -0,0 +1,555 @@ +{ + "cells": [ + { + "cell_type": "markdown", + "id": "592da8da", + "metadata": {}, + "source": [ + "# Download Wide Field Slitless Spectroscopy (WFSS) Data\n", + "This notebook uses the Python [astroquery.mast Observations](https://astroquery.readthedocs.io/en/latest/mast/mast_obsquery.html) class of the [MAST API](https://mast.stsci.edu/api/v0/) to query specific data products of a specific program. We are looking for NIRISS imaging and WFSS files of the [NGDEEP program](https://www.stsci.edu/jwst/phase2-public/2079.pdf) (ID 2079). The observations are in three [NIRISS filters](https://jwst-docs.stsci.edu/jwst-near-infrared-imager-and-slitless-spectrograph/niriss-instrumentation/niriss-pupil-and-filter-wheels): F115W, F150W, and F200W, using both the GR150R and GR150C [grisms](https://jwst-docs.stsci.edu/jwst-near-infrared-imager-and-slitless-spectrograph/niriss-instrumentation/niriss-gr150-grisms). A [WFSS observation sequence](https://jwst-docs.stsci.edu/jwst-near-infrared-imager-and-slitless-spectrograph/niriss-observing-strategies/niriss-wfss-recommended-strategies) typically consists of a direct image followed by a grism observation in the same blocking filter to help identify the sources in the field. In program 2079, the exposure sequence follows the pattern: direct image -> GR150R -> direct image -> GR150C -> direct image.\n", + "\n", + "**Use case**: use MAST to download data products.
\n", + "**Data**: JWST/NIRISS images and spectra from program 2079 observation 004.
\n", + "**Tools**: astropy, astroquery, glob, matplotlib, numpy, os, pandas, (yaml)
\n", + "**Cross-instrument**: all
\n", + "\n", + "**Content**\n", + "- [Imports](#imports)\n", + "- [Querying for Observations](#setup)\n", + " - [Search with Proposal ID](#propid)\n", + " - [Search with Observation ID](#obsid)\n", + "- [Filter and Download Products](#filter)\n", + " - [Filtering Data Before Downloading](#filter_data)\n", + " - [Downloading Data](#downloading)\n", + "- [Inspect Downloaded Data](#inspect)\n", + "\n", + "\n", + "**Author**: Camilla Pacifici (cpacifici@stsci.edu) & Rachel Plesha (rplesha@stsci.edu) & Jo Taylor (jotaylor@stsci.edu)
\n", + "**First Published**: May 2024\n", + "\n", + "This notebook was inspired by the [JWebbinar session about MAST](https://github.com/spacetelescope/jwebbinar_prep/blob/main/mast_session/Crowded_Field/Crowded_Field.ipynb)." + ] + }, + { + "cell_type": "markdown", + "id": "ff93c82c", + "metadata": {}, + "source": [ + "\n", + "## Imports" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "7e92dee7", + "metadata": {}, + "outputs": [], + "source": [ + "from astropy.io import fits\n", + "from astroquery.mast import Observations\n", + "from matplotlib import pyplot as plt\n", + "import numpy as np\n", + "import os\n", + "import glob\n", + "import pandas as pd" + ] + }, + { + "cell_type": "markdown", + "id": "008fcb1e", + "metadata": {}, + "source": [ + "\n", + "## Querying for Observations" + ] + }, + { + "cell_type": "markdown", + "id": "9562f354", + "metadata": {}, + "source": [ + "The observations class in ``astroquery.mast`` is used to download JWST data. Use the metadata function to see the available search options and their descriptions.\n", + "\n", + "Note that for JWST, the instrument names have a specific format. More information about that can be found at: https://outerspace.stsci.edu/display/MASTDOCS/JWST+Instrument+Names " + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "f3185204", + "metadata": {}, + "outputs": [], + "source": [ + "Observations.get_metadata(\"observations\")" + ] + }, + { + "cell_type": "markdown", + "id": "9447551a", + "metadata": {}, + "source": [ + "The two most common ways to download specific datasets are by using the [proposal ID](https://www.stsci.edu/jwst/science-execution/program-information) or by using the [observation ID](https://jwst-pipeline.readthedocs.io/en/latest/jwst/data_products/file_naming.html)." + ] + }, + { + "cell_type": "markdown", + "id": "d47efd71", + "metadata": {}, + "source": [ + "\n", + "#### Search with Proposal ID" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "569efa6d", + "metadata": { + "scrolled": true + }, + "outputs": [], + "source": [ + "# Select the proposal ID, instrument, and some useful keywords (filters in this case).\n", + "obs_table = Observations.query_criteria(obs_collection=[\"JWST\"], \n", + " instrument_name=[\"NIRISS/IMAGE\", \"NIRISS/WFSS\"],\n", + " provenance_name=[\"CALJWST\"], # Executed observations\n", + " filters=['F115W', 'F150W', 'F200W'],\n", + " proposal_id=[2079],\n", + " )\n", + "\n", + "print(len(obs_table), 'files found')\n", + "# look at what was obtained in this query for a select number of column names of interest\n", + "obs_table[['obs_collection', 'instrument_name', 'filters', 'target_name', 'obs_id', 's_ra', 's_dec', 't_exptime', 'proposal_id']]" + ] + }, + { + "cell_type": "markdown", + "id": "4d358734", + "metadata": {}, + "source": [ + "\n", + "#### Search with Observation ID\n", + "The observation ID (obs_id) allows for flexibility of searching by the proposal ID and the observation ID because of how the JWST filenames are structured. More information about the JWST file naming conventions can be found at: https://jwst-pipeline.readthedocs.io/en/latest/jwst/data_products/file_naming.html. For the purposes of this notebook series, we will use only one of the two observations (004) in program 2079.\n", + "\n", + "Additionally, there is flexibility using wildcards inside of the search criteria. 
For example, instead of specifying both \"NIRISS/IMAGE\" and \"NIRISS/WFSS\", we can specify \"NIRISS*\", which picks up both file modes. The wildcard also works within the obs_id, so we do not have to list all of the different IDs." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "6ddce5a5", + "metadata": {}, + "outputs": [], + "source": [ + "# Obtain a list to download from a specific list of observation IDs instead\n", + "obs_id_table = Observations.query_criteria(instrument_name=[\"NIRISS*\"],\n", + " provenance_name=[\"CALJWST\"], # Executed observations\n", + " obs_id=['jw02079-o004*'], # Searching for PID 2079 observation 004\n", + " ) \n", + "\n", + "# this number will change with JWST pipeline and reference file updates\n", + "print(len(obs_id_table), 'files found') # ~613 files" + ] + }, + { + "cell_type": "markdown", + "id": "7e57c3c0", + "metadata": {}, + "source": [ + "\n", + "## Filter and Download Products" + ] + }, + { + "cell_type": "markdown", + "id": "9daaeaf9", + "metadata": {}, + "source": [ + "If there are too many files to download, the API will time out. Instead, it is better to divide the observations in batches to download one at a time." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "b8a653a9", + "metadata": {}, + "outputs": [], + "source": [ + "batch_size = 5 # 5 files at a time maximizes the download speed.\n", + "\n", + "# Let's split up our list of files, ``obs_id_table``, into batches according to our batch size.\n", + "obs_batches = [obs_id_table[i:i+batch_size] for i in range(0, len(obs_id_table), batch_size)]\n", + "print(\"How many batches?\", len(obs_batches))\n", + "\n", + "single_group = obs_batches[0] # Useful to inspect the files obtained from one group\n", + "print(\"Inspect the first batch to ensure that it matches expectations of what you want downloaded:\") \n", + "single_group['obs_collection', 'instrument_name', 'filters', 'target_name', 'obs_id', 's_ra', 's_dec', 't_exptime', 'proposal_id']" + ] + }, + { + "cell_type": "markdown", + "id": "10c7d144", + "metadata": {}, + "source": [ + "Select the type of products needed. 
The various levels are:\n", + "- uncalibrated files\n", + " - productType=[\"SCIENCE\"]\n", + " - productSubGroupDescription=['UNCAL']\n", + " - calib_level=[1]\n", + "- rate images\n", + " - productType=[\"SCIENCE\"]\n", + " - productSubGroupDescription=['RATE']\n", + " - calib_level=[2]\n", + "- level 2 associations for both spectroscopy and imaging\n", + " - productType=[\"INFO\"]\n", + " - productSubGroupDescription=['ASN']\n", + " - calib_level=[2]\n", + "- level 3 associations for imaging\n", + " - productType=[\"INFO\"]\n", + " - productSubGroupDescription=['ASN']\n", + " - dataproduct_type=[\"image\"]\n", + " - calib_level=[3]" + ] + }, + { + "cell_type": "markdown", + "id": "179f01e4", + "metadata": {}, + "source": [ + "\n", + "#### Filtering Data Before Downloading" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "fd82b142", + "metadata": {}, + "outputs": [], + "source": [ + "# creating a dictionary of the above information to use for inspection of the filtering function\n", + "file_dict = {'uncal': {'product_type': 'SCIENCE',\n", + " 'productSubGroupDescription': 'UNCAL',\n", + " 'calib_level': [1]},\n", + " 'rate': {'product_type': 'SCIENCE',\n", + " 'productSubGroupDescription': 'RATE',\n", + " 'calib_level': [2]},\n", + " 'level2_association': {'product_type': 'INFO',\n", + " 'productSubGroupDescription': 'ASN',\n", + " 'calib_level': [2]},\n", + " 'level3_association': {'product_type': 'INFO',\n", + " 'productSubGroupDescription': 'ASN',\n", + " 'calib_level': [3]},\n", + " }" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "fed7ca65", + "metadata": { + "scrolled": true + }, + "outputs": [], + "source": [ + "# Look at the files existing for each of these different levels\n", + "files_to_download = []\n", + "for index, batch_exposure in enumerate(single_group):\n", + " \n", + " print('*'*50)\n", + " print(f\"Exposure #{index+1} ({batch_exposure['obs_id']})\")\n", + " # pull out the product names from the list to filter\n", + " products = Observations.get_product_list(batch_exposure)\n", + " \n", + " for filetype, query_dict in file_dict.items():\n", + " print('File type:', filetype)\n", + " filtered_products = Observations.filter_products(products,\n", + " productType=query_dict['product_type'],\n", + " productSubGroupDescription=query_dict['productSubGroupDescription'],\n", + " calib_level=query_dict['calib_level'],\n", + " )\n", + " print(filtered_products['productFilename'])\n", + " files_to_download.extend(filtered_products['productFilename'])\n", + " print()\n", + " print('*'*50)" + ] + }, + { + "cell_type": "markdown", + "id": "e98933e6", + "metadata": {}, + "source": [ + "From above, we can see that for each exposure name in the observation list (`obs_id_table`), there are many associated files in the background that need to be downloaded as well. This is why we need to work in batches to download.\n", + "\n", + "\n", + "#### Downloading Data\n", + "To actually download the products, provide ``Observations.download_products()`` with a list of the filtered products. \n", + "\n", + "Typically, adjustments aren't needed to the [detector1 pipeline](https://jwst-pipeline.readthedocs.io/en/latest/jwst/pipeline/calwebb_detector1.html), so we can start with the outputs from detector1, the rate files, rather than the uncal files. Because of this, we only need to download the rate and association files. 
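Should you need to rerun the detector1 pipeline yourself, you can retrieve the uncal files instead by adjusting `productSubGroupDescription` and `calib_level` in the `Observations.filter_products` call. A minimal sketch (assuming the `products` table from the filtering example above):\n", + "\n", + "```python\n", + "uncal_products = Observations.filter_products(products,\n", + "                                              productType=['SCIENCE'],\n", + "                                              productSubGroupDescription=['UNCAL'],\n", + "                                              calib_level=[1])\n", + "```\n", + "\n", + "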
If the data are proprietary, you may also need to set up your API token. **NEVER** commit your token to a public repository. An alternative is to create a separate configuration file (e.g., config_file.yaml) that is readable only by you and contains the key 'mast_token' with your API token as its value.\n", + "\n", + "To create a new API token, visit the following link: \n", + " https://auth.mast.stsci.edu/token?suggested_name=Astroquery&suggested_scope=mast:exclusive_access\n", + "\n", + "*Note that astroquery version >= 0.4.7 is required for the `flat=True` option when downloading the data. If you prefer to use an earlier version, remove that line from the call, download the data, and move all of the files from the downloaded subfolders into the single directory defined by the `download_dir` variable.*" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "b9ca7055-81bd-4a94-a766-45bb463bc0fa", + "metadata": {}, + "outputs": [], + "source": [ + "# check that the astroquery version is at least 0.4.7; see the note above for more information\n", + "import astroquery\n", + "astroquery.__version__" + ] + }, + { + "cell_type": "raw", + "id": "fc1897a3", + "metadata": {}, + "source": [ + "# if needed, create a separate configuration file and replace Observations.download_products() with the following:\n", + "\n", + "import yaml\n", + "\n", + "config_file = 'config_file.yaml' # configuration file containing your mast_token\n", + "\n", + "with open(config_file) as f:\n", + " mast_config = yaml.safe_load(f)\n", + " \n", + "mysession = Observations(mast_config['mast_token'])\n", + "mysession.download_products(filtered_products)" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "21c50976-a56a-4cd2-9cd6-11f9266b61bb", + "metadata": {}, + "outputs": [], + "source": [ + "download_dir = 'data'\n", + "\n", + "# make sure the download directory exists; if not, write a new directory\n", + "if not os.path.exists(download_dir):\n", + " os.mkdir(download_dir)" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "616ff15c", + "metadata": {}, + "outputs": [], + "source": [ + "# Now let's get the products for each batch of observations, and filter down to only the products of interest.\n", + "for index, batch in enumerate(obs_batches):\n", + " \n", + " # Progress indicator...\n", + " print('\\n'+f'Batch #{index+1} / {len(obs_batches)}')\n", + " \n", + " # Make a list of the `obsid` identifiers from our Astropy table of observations to get products for.\n", + " obsids = batch['obsid']\n", + " print('Working with the following obsids:')\n", + " for number, obs_text in zip(obsids, batch['obs_id']):\n", + " print(f\"{number} : {obs_text}\")\n", + " \n", + " # Get list of products \n", + " products = Observations.get_product_list(obsids)\n", + " \n", + " # Filter the products to get only the products of interest\n", + " filtered_products = Observations.filter_products(products, \n", + " productType=[\"SCIENCE\", \"INFO\"],\n", + " productSubGroupDescription=[\"RATE\", \"ASN\"], # Not using \"UNCAL\" here since we can start with the rate files\n", + " calib_level=[2, 3] # not using 1 here since not getting the UNCAL files\n", + " )\n", + " \n", + " # Download products for these records.\n", + " # The `flat=True` option stores all files in a single directory specified by `download_dir`.\n", + " manifest = Observations.download_products(filtered_products,\n", + "                                              
download_dir=download_dir,\n", + "                                              flat=True, # astroquery v0.4.7 or later only\n", + "                                              ) \n", + " print('Products downloaded:\\n', filtered_products['productFilename'])" + ] + }, + { + "cell_type": "markdown", + "id": "31ff8769-1240-452d-b84c-eca69d53a13f", + "metadata": {}, + "source": [ + "If you are continuing on with the WFSS notebooks, double check that you have downloaded all of the files needed for the remaining notebooks. There should be 149 files downloaded. " + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "39310ec6-f58c-4dd6-96ba-b5cda55ee7d5", + "metadata": {}, + "outputs": [], + "source": [ + "downloaded_files = glob.glob(os.path.join(download_dir, '*.fits')) + glob.glob(os.path.join(download_dir, '*.json'))\n", + "print(len(downloaded_files), 'files downloaded to:', download_dir)" + ] + }, + { + "cell_type": "markdown", + "id": "e840d7c4-3643-4152-bd04-bfdcc150cb3e", + "metadata": {}, + "source": [ + "\n", + "## Inspect Downloaded Data\n", + "\n", + "The purpose of this section is to give you a better idea of what data are available to you. Additionally, you will be able to use this dataframe to select specific files that match the mode you would like to take a closer look at." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "949795c1-5ab9-45a5-aaf2-ba30593a1505", + "metadata": {}, + "outputs": [], + "source": [ + "ratefile_datadir = 'data/'\n", + "\n", + "# first look for all of the rate files you have downloaded\n", + "rate_files = np.sort(glob.glob(os.path.join(ratefile_datadir, \"*rate.fits\")))\n", + "\n", + "for file_num, ratefile in enumerate(rate_files):\n", + "\n", + " rate_hdr = fits.getheader(ratefile) # Primary header for each rate file\n", + "\n", + " # information we want to store that might be useful to us later for evaluating the data\n", + " temp_hdr_dict = {\"FILENAME\": ratefile,\n", + " \"TARG_RA\": [rate_hdr[\"TARG_RA\"]],\n", + " \"TARG_DEC\": [rate_hdr[\"TARG_DEC\"]],\n", + " \"FILTER\": [rate_hdr[\"FILTER\"]], # Grism; GR150R/GR150C\n", + " \"PUPIL\": [rate_hdr[\"PUPIL\"]], # Filter used; F090W, F115W, F140M, F150W, F158M, F200W\n", + " \"EXPSTART\": [rate_hdr['EXPSTART']], # Exposure start time (MJD)\n", + " \"PATT_NUM\": [rate_hdr[\"PATT_NUM\"]], # Position number within dither pattern for WFSS\n", + " \"NUMDTHPT\": [rate_hdr[\"NUMDTHPT\"]], # Total number of points in entire dither pattern\n", + " \"XOFFSET\": [rate_hdr[\"XOFFSET\"]], # X offset from pattern starting position for NIRISS (arcsec)\n", + " \"YOFFSET\": [rate_hdr[\"YOFFSET\"]], # Y offset from pattern starting position for NIRISS (arcsec)\n", + " }\n", + "\n", + " # Turn the dictionary into a pandas dataframe\n", + " if file_num == 0:\n", + " # if this is the first file, make an initial dataframe\n", + " rate_df = pd.DataFrame(temp_hdr_dict)\n", + " else:\n", + " # otherwise, append to the dataframe for each file\n", + " new_data_df = pd.DataFrame(temp_hdr_dict)\n", + "\n", + " # merge the two dataframes together to create a dataframe with all of the files\n", + " rate_df = pd.concat([rate_df, new_data_df], ignore_index=True, axis=0)\n", + "\n", + "rate_dfsort = rate_df.sort_values('EXPSTART', ignore_index=False)\n", + "\n", + "# Save the dataframe to a file to read in later, if desired\n", + "outfile = './list_ngdeep_rate.csv'\n", + "rate_dfsort.to_csv(outfile, sep=',', index=False)\n", + "print('Saved:', outfile)\n", + "\n", + "# Look at the resulting dataframe\n", + "rate_dfsort" + ] + }, + { + "cell_type": "markdown", + "id": 
"4f49059b-cc92-448f-aca1-b5af87b0c208", + "metadata": {}, + "source": [ + "In particular, let's look at the observation sequence these rate files follow. We have sorted the files above by exposure time, so they should already be in time order in the dataframe. \n", + "\n", + "`FILTER = CLEAR` indicates a direct image while `FILTER=GR150R` or `FILTER=GR150C` indicates a dispersed image. PUPIL is the blocking filter used. The first 14 exposures make up the first sequence set of direct image -> grism -> direct image -> grism. There are also multiple dither positions for the dispersed images and the direct images. The multiple direct image dithers will be combined in image3, while the multiple dispersed images can be combined in spec3. " + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "ccd23e56-4441-4ad0-ac90-68358802133d", + "metadata": {}, + "outputs": [], + "source": [ + "rate_df[['EXPSTART', 'FILTER', 'PUPIL', 'PATT_NUM', 'XOFFSET', 'YOFFSET']].head(14)" + ] + }, + { + "cell_type": "markdown", + "id": "38503db6-e37b-4237-b518-6b1a15cefd66", + "metadata": {}, + "source": [ + "Shown below are the first 14 rate files to give an idea of the above sequence visually. Grid lines are shown as a visual guide for any dithers that are made." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "5a285064-b1ee-4ead-8890-4d20f1ba7865", + "metadata": {}, + "outputs": [], + "source": [ + "# plot set up\n", + "fig = plt.figure(figsize=(20, 35))\n", + "cols = 3\n", + "rows = int(np.ceil(14 / cols))\n", + "\n", + "# loop over the first 14 rate files and plot them\n", + "for plt_num, rf in enumerate(rate_dfsort[0:14]['FILENAME']):\n", + "\n", + " # determine where the subplot should be\n", + " xpos = (plt_num % 40) % cols\n", + " ypos = ((plt_num % 40) // cols) # // to make it an int.\n", + "\n", + " # make the subplot\n", + " ax = plt.subplot2grid((rows, cols), (ypos, xpos))\n", + "\n", + " # open the data and plot it\n", + " with fits.open(rf) as hdu:\n", + " data = hdu[1].data\n", + " data[np.isnan(data)] = 0 # filling in nan data with 0s to help with the matplotlib color scale.\n", + " \n", + " ax.imshow(data, vmin=0, vmax=1.5, origin='lower')\n", + "\n", + " # adding in grid lines as a visual aid\n", + " for gridline in [500, 1000, 1500]:\n", + " ax.axhline(gridline, color='black', alpha=0.5)\n", + " ax.axvline(gridline, color='black', alpha=0.5)\n", + "\n", + " ax.set_title(f\"#{plt_num+1}: {hdu[0].header['FILTER']} {hdu[0].header['PUPIL']} Dither{hdu[0].header['PATT_NUM']}\")" + ] + }, + { + "cell_type": "markdown", + "id": "8c0e1d71", + "metadata": {}, + "source": [ + "\"Space" + ] + } + ], + "metadata": { + "kernelspec": { + "display_name": "Python 3 (ipykernel)", + "language": "python", + "name": "python3" + }, + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 3 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython3", + "version": "3.11.7" + } + }, + "nbformat": 4, + "nbformat_minor": 5 +} diff --git a/notebooks/NIRISS_WFSS_advanced/01_niriss_wfss_image2_image3.ipynb b/notebooks/NIRISS_WFSS_advanced/01_niriss_wfss_image2_image3.ipynb new file mode 100644 index 000000000..e49adbe82 --- /dev/null +++ b/notebooks/NIRISS_WFSS_advanced/01_niriss_wfss_image2_image3.ipynb @@ -0,0 +1,1205 @@ +{ + "cells": [ + { + "cell_type": "markdown", + "id": "7a935d7c", + "metadata": {}, + "source": [ + "# Run Image Pipeline and Create Source Catalog\n", + "\n", 
+ "In this run, we are starting from the rate files which have already been calibrated with the detector1 pipeline. This will save time as we do not need to edit any of the steps being performed as part of detector1. Therefore, the first calibration that should be done as part of a WFSS run is to run the rate files direct images through the Image2 and Image3 steps of the JWST pipeline. This includes creating a source catalog, which most likely will need to be adjusted from the pipeline default values. **Not having a good source catalog will result in non optimal extraction of sources in the dispersed, WFSS, images.**\n", + "\n", + "**Use case**: The default parameters for the pipeline do not extract the expected sources, so custom parameters need to be set to obtain new combined image and source catalog.
\n", + "**Data**: JWST/NIRISS images and spectra from program 2079 observation 004. This should be stored in a single directory `data`, and can be downloaded from the previous notebook, 00_niriss_mast_query_data_setup.ipynb.
\n", + "**Tools**: astropy, crds, glob, jdaviz, json, jwst, matplotlib, numpy, os, pandas, urllib, warnings, zipfile
\n", + "**Cross-instrument**: NIRISS
\n", + "\n", + "**Content**\n", + "- [Imports & Data Setup](#imports)\n", + "- [Default Imaging Pipeline Run](#default)\n", + " - [Image2](#default_image2)\n", + " - [Image3](#default_image3)\n", + " - [Inspecting Default Results](#view_default)\n", + "- [Custom Imaging Pipeline Run](#custom)\n", + " - [Image3](#custom_image3)\n", + " - [Inspecting Custom Results](#view_custom)\n", + "- [Refining the Source Catalog Further](#source_cat)\n", + " - [Matching Source IDs Across Catalogs](#match_sources) \n", + " - [Manually Editing the Source Catalog](#manual_cat)\n", + "\n", + "**Author**: Rachel Plesha (rplesha@stsci.edu), Camilla Pacifici (cpacifici@stsci.edu), JWebbinar notebooks.
\n", + "**First Published**: May 2024
\n", + "**Last tested**: This notebook was last tested with JWST pipeline version 1.12.5 and the CRDS context jwst_1225.pmap." + ] + }, + { + "cell_type": "markdown", + "id": "7fdaf1a1-57e5-4239-b82e-6552261d8d82", + "metadata": {}, + "source": [ + "\n", + "## Imports & Data Setup" + ] + }, + { + "cell_type": "markdown", + "id": "e3ccd6a3-e781-4d7f-8120-fa142723a6d9", + "metadata": {}, + "source": [ + "[CRDS Documentation](https://jwst-pipeline.readthedocs.io/en/latest/jwst/user_documentation/reference_files_crds.html#crds)" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "6bf2ea52-2568-44cf-8074-eec3b5b76cc0", + "metadata": {}, + "outputs": [], + "source": [ + "# Update the CRDS path to your local directory\n", + "%env CRDS_PATH=crds_cache\n", + "%env CRDS_SERVER_URL=https://jwst-crds.stsci.edu" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "028352f4", + "metadata": {}, + "outputs": [], + "source": [ + "import os\n", + "import glob\n", + "import json\n", + "import warnings\n", + "import urllib\n", + "import zipfile\n", + "import numpy as np\n", + "import pandas as pd\n", + "\n", + "import astropy.units as u\n", + "from astropy.io import fits\n", + "from astropy.coordinates import SkyCoord\n", + "from astropy.table import Table\n", + "\n", + "from jdaviz import Imviz\n", + "from matplotlib import pyplot as plt\n", + "%matplotlib widget\n", + "# %matplotlib inline\n", + "\n", + "from jwst.pipeline import Image2Pipeline\n", + "from jwst.pipeline import Image3Pipeline" + ] + }, + { + "cell_type": "markdown", + "id": "fb035a4d-e63c-4125-b083-462e3353e49f", + "metadata": {}, + "source": [ + "Check what version of the JWST pipeline you are using. To see what the latest version of the pipeline is available or install a previous version, check [GitHub](https://github.com/spacetelescope/jwst#software-vs-dms-build-version-map\"). Also verify what [CRDS version](https://jwst-crds.stsci.edu/) you are using. [CRDS documentation](https://jwst-pipeline.readthedocs.io/en/latest/jwst/user_documentation/reference_files_crds.html) explains how to set a specific context to use in the JWST pipeline. If either of these values are different from the last tested note above there may be differences in this notebook." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "ae10a560", + "metadata": {}, + "outputs": [], + "source": [ + "import jwst\n", + "import crds\n", + "print('JWST Pipeliene Version:', jwst.__version__)\n", + "print('CRDS Context:', crds.get_context_name('jwst'))" + ] + }, + { + "cell_type": "markdown", + "id": "feb25e11-bc16-44b8-830c-d34f12e2dcf7", + "metadata": {}, + "source": [ + "The data directory, `data_dir` here should contain all of the association and rate files in a single, flat directory. `default_run_image3` and `custom_run_image3` are directories that we will use later for our calibrated data. They are separated so that we can compare the two outputs." 
+ ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "e2ecc213-479a-422c-a0f8-8608309b845c", + "metadata": {}, + "outputs": [], + "source": [ + "data_dir = 'data'\n", + "default_run_image3 = 'default_image3_calibrated' # where the results of the default image3 run will be saved (inside of data_dir)\n", + "custom_run_image3 = 'custom_image3_calibrated' # where the results of the custom image3 run will be saved (inside of data_dir)" + ] + }, + { + "cell_type": "markdown", + "id": "d991e323-8e98-4487-a8ab-1b696f91899b", + "metadata": {}, + "source": [ + "The association files expect that 1) all of the data are in the same directory and 2) that you are also performing the pipeline call in that directory. Because of that, we need to change into the data directory to run the imaging pipelines." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "39fbaa8a-5062-4a2d-b8a9-f3764086aa91", + "metadata": {}, + "outputs": [], + "source": [ + "# if you have not downloaded the data from notebook 00, run this cell. Otherwise, feel free to skip it.\n", + "\n", + "# Download the input data (rate and association files) from Box into the data directory:\n", + "boxlink = 'https://data.science.stsci.edu/redirect/JWST/jwst-data_analysis_tools/niriss_wfss_advanced/niriss_wfss_advanced_01_input.zip'\n", + "boxfile = os.path.basename(boxlink)\n", + "urllib.request.urlretrieve(boxlink, boxfile)\n", + "\n", + "zf = zipfile.ZipFile(boxfile, 'r')\n", + "zf.extractall(path=data_dir)\n", + "\n", + "# move the files downloaded from the box file into the top level data directory\n", + "box_download_dir = os.path.join(data_dir, boxfile.split('.zip')[0])\n", + "for filename in glob.glob(os.path.join(box_download_dir, '*')):\n", + " if '.csv' in filename:\n", + " # move to the current directory\n", + " os.rename(filename, os.path.basename(filename))\n", + " else:\n", + " # move to the data directory \n", + " os.rename(filename, os.path.join(data_dir, os.path.basename(filename)))\n", + "\n", + "# remove unnecessary files now\n", + "os.remove(boxfile)\n", + "os.rmdir(box_download_dir)" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "3d348ad9-4344-4a03-a58d-344357eeea28", + "metadata": {}, + "outputs": [], + "source": [ + "# From the csv file we created earlier, read in the list of rate files and their metadata\n", + "listrate_file = 'list_ngdeep_rate.csv'\n", + "rate_df = pd.read_csv(listrate_file)" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "a4ea2cb0-e5d5-4347-8c4c-9126cda54561", + "metadata": {}, + "outputs": [], + "source": [ + "cwd = os.getcwd() # get the current working directory \n", + "if cwd != data_dir: # if you are not already in the location of the data, change into it\n", + " try:\n", + " os.chdir(data_dir)\n", + " except FileNotFoundError:\n", + " print(f\"Not able to change into: {data_dir}.\\nRemaining in: {cwd}\")\n", + " pass" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "b88f9b55-dbde-466c-9647-86e5fff3fdcc", + "metadata": {}, + "outputs": [], + "source": [ + "for temp_dir in [default_run_image3, custom_run_image3]:\n", + " if not os.path.exists(temp_dir):\n", + " os.mkdir(temp_dir)" + ] + }, + { + "cell_type": "markdown", + "id": "10824c6a-4843-4f31-beb7-05ecc58e8b84", + "metadata": {}, + "source": [ + "\n", + "## Default Imaging Pipeline Run\n", + "To start, run the default image2 and image3 steps of the pipeline on all direct images observed with the WFSS data."
+ ] + }, + { + "cell_type": "markdown", + "id": "0ac742e8", + "metadata": {}, + "source": [ + "\n", + "### Run Default Image2\n", + "\n", + "Image2 is run on the direct image rate files. While your program should have valid association files to download from MAST, if for any reason you need to make your own association file, see [Creating Custom ASN Files](#customasn)." + ] + }, + { + "cell_type": "markdown", + "id": "82eb2c93", + "metadata": {}, + "source": [ + "#### Looking in a Level 2 Imaging Association File\n", + "First, take a look inside the association (ASN) files to better understand everything that is contained in them.\n", + "\n", + "For image2 association files, there should be one asn file for each dither position in an observing sequence which is set by the [exposure strategy](https://jwst-docs.stsci.edu/jwst-near-infrared-imager-and-slitless-spectrograph/niriss-observing-strategies/niriss-wfss-recommended-strategies). In this case, that should match the number of direct images (`FILTER=CLEAR`) in `rate_df` because each direct image is at a unique dither position (XOFFSET, YOFFSET) within an observing sequence. For this program and observation, there is one direct image before a grism change with only one dither, another direct image with four dithers between the change in grisms, and a direct image at the end of a sequence with three dithers. This leads to a total of eight images per observing sequence, with five observing sequences in the observation using the blocking filters F115W -> F115W -> F150W -> F150W -> F200W." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "f6429ae4-cc03-4718-b2c2-12924170fe89", + "metadata": {}, + "outputs": [], + "source": [ + "image2_asns = glob.glob('*image2*asn*.json')\n", + "print(len(image2_asns), 'Image2 ASN files') # there should be 40 asn files for image2\n", + "\n", + "# the number of association files should match the number of rate files\n", + "print(len(rate_df[rate_df['FILTER'] == 'CLEAR']), 'Direct Image rate files')" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "9226e1b9-9553-402b-910d-73f7cffcb784", + "metadata": {}, + "outputs": [], + "source": [ + "# look at one of the association files\n", + "asn_data = json.load(open(image2_asns[0]))\n", + "for key, data in asn_data.items():\n", + " print(f\"{key} : {data}\")" + ] + }, + { + "cell_type": "markdown", + "id": "99c3a7f0-ec0a-403a-9c55-58b2ac6559a7", + "metadata": {}, + "source": [ + "From this association, we can tell many things about the observation:\n", + "1. From `asn_type` and `asn_rule`, we can see that this is an image2 association\n", + "2. From `degraded_status` we can see that there are no exposures to not be included in the calibration.\n", + "3. From `constraints`, we can see this is not a time series observation (TSO), the observation is part of program 2079, observed with NIRISS with the CLEAR (i.e. imaging for WFSS) and F150W filters.\n", + "4. From `products` we can see there is only one exposure associated. This is typical for image2 where there is usually only one exposure per dither per observing sequence." 
+ ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "e5641931-1042-4494-a527-43e521803f3c", + "metadata": {}, + "outputs": [], + "source": [ + "# in particular, take a closer look at the product filenames with the association file:\n", + "for product in asn_data['products']:\n", + " for key, value in product.items():\n", + " if key == 'members':\n", + " print(f\"{key}:\")\n", + " for member in value:\n", + " print(f\" {member['expname']} {member['exptype']}\")\n", + " else:\n", + " print(f\"{key}: {value}\")" + ] + }, + { + "cell_type": "markdown", + "id": "db2dbaea-228e-4565-928b-2984f4d7fda7", + "metadata": {}, + "source": [ + "#### Run image2\n", + "\n", + "The `rate.fits` products will be calibrated into `cal.fits` files. More information about the steps performed in the Image2 part of the pipeline can be found in the [Image2 pipeline documentation](https://jwst-pipeline.readthedocs.io/en/latest/jwst/pipeline/calwebb_image2.html).\n", + "\n", + "In this case, we're saving the outputs to the same directory we are running the pipeline in so that we can then use the output `cal` files to run the Image3 pipeline" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "dff099be", + "metadata": { + "scrolled": true + }, + "outputs": [], + "source": [ + "for img2_asn in image2_asns:\n", + " # check if the calibrated file already exists\n", + " asn_data = json.load(open(img2_asn))\n", + " cal_file = f\"{asn_data['products'][0]['name']}_cal.fits\"\n", + " if os.path.exists(cal_file):\n", + " print(cal_file, 'cal file already exists.')\n", + " continue\n", + " # if not, calibrated with image2\n", + " img2 = Image2Pipeline.call(img2_asn, save_results=True)" + ] + }, + { + "cell_type": "markdown", + "id": "fe1ae678", + "metadata": {}, + "source": [ + "\n", + "### Run Default Image3 " + ] + }, + { + "cell_type": "markdown", + "id": "e1fd5c10", + "metadata": {}, + "source": [ + "#### Looking in a Level 3 Association File\n", + "The contents are quite similar to image2, but notice now that there are many more members that are associated together, and they use the individual pointing cal files from image2. Image3 resamples and combines images of the same blocking filter (PUPIL for NIRISS WFSS) from all dither and observing sequences to form a single image, which leads to fewer image3 association files." 
+ ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "ccf6ac3b-333f-451d-8157-f83ccba9352c", + "metadata": {}, + "outputs": [], + "source": [ + "image3_asns = glob.glob('*image3*asn*.json')\n", + "print(len(image3_asns), 'Image3 ASN files') # there should be 3 image3 association files\n", + "\n", + "# the number of image3 association files should match the number of unique blocking filters used\n", + "uniq_filters = np.unique(rate_df[rate_df['FILTER'] == 'CLEAR']['PUPIL'])\n", + "print(f\"{len(uniq_filters)} unique filters used: {uniq_filters}\")" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "add4839e", + "metadata": {}, + "outputs": [], + "source": [ + "# look at one of the association files\n", + "image3_asn_data = json.load(open(image3_asns[0]))\n", + "for key, data in image3_asn_data.items():\n", + " print(f\"{key} : {data}\")" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "2831ae47-5b1b-4d92-b02a-8ce1ee359710", + "metadata": {}, + "outputs": [], + "source": [ + "# in particular, take a closer look at the product filenames with the association file:\n", + "for product in image3_asn_data['products']:\n", + " for key, value in product.items():\n", + " if key == 'members':\n", + " print(f\"{key}:\")\n", + " for member in value:\n", + " print(f\" {member['expname']} {member['exptype']}\")\n", + " else:\n", + " print(f\"{key}: {value}\")" + ] + }, + { + "cell_type": "markdown", + "id": "0cab171d", + "metadata": {}, + "source": [ + "#### Run image3" + ] + }, + { + "cell_type": "markdown", + "id": "ca3bd322-7f67-43c4-8dae-dc47779462ee", + "metadata": { + "editable": true, + "slideshow": { + "slide_type": "" + }, + "tags": [] + }, + "source": [ + "In Image3, the individual pointing `cal.fits` files will be calibrated into a single combined `i2d.fits` image. The Image3 step is also where the final source catalog is created, so we can change some input parameters to obtain a more refined output source catalog. This is done below in the [Custom Imaging Pipeline Run](#custom) section. However, we will first calibrate the data with the default parameters. 
More information about the steps performed in the Image3 part of the pipeline can be found in the [Image3 pipeline documentation](https://jwst-pipeline.readthedocs.io/en/latest/jwst/pipeline/calwebb_image3.html).\n", + "\n", + "**Note: Image3 can take a while to run.**" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "6cf900af", + "metadata": {}, + "outputs": [], + "source": [ + "for img3_asn in image3_asns:\n", + " # check if the calibrated file already exists\n", + " asn_data = json.load(open(img3_asn))\n", + " cal_file = os.path.join(default_run_image3, f\"{asn_data['products'][0]['name']}_i2d.fits\")\n", + " if os.path.exists(cal_file):\n", + " print(cal_file, 'i2d file already exists.')\n", + " continue\n", + " # if not, calibrate with image3\n", + " img3 = Image3Pipeline.call(img3_asn, save_results=True, output_dir=default_run_image3)" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "eb5b3614-cfc3-42b7-b5ce-e243eb680ae7", + "metadata": {}, + "outputs": [], + "source": [ + "# remove unnecessary files to save disk space\n", + "for crf in glob.glob(os.path.join(default_run_image3, '*crf.fits')):\n", + " os.remove(crf)" + ] + }, + { + "cell_type": "markdown", + "id": "33f0dbeb-4caf-4d97-8b2c-3255f906e76b", + "metadata": {}, + "source": [ + "\n", + "### Inspecting Default Results" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "47da18f6-92e3-4fff-8153-222cfc55d353", + "metadata": {}, + "outputs": [], + "source": [ + "# These are all results from the Image3 pipeline\n", + "image3_i2d = np.sort(glob.glob(os.path.join(default_run_image3, '*i2d.fits'))) # combined image over multiple dithers/mosaic\n", + "image3_segm = np.sort(glob.glob(os.path.join(default_run_image3, '*segm.fits'))) # segmentation map that defines the extent of a source\n", + "image3_cat = np.sort(glob.glob(os.path.join(default_run_image3, '*cat.ecsv'))) # Source catalog that defines the RA/Dec of a source at a particular pixel" + ] + }, + { + "cell_type": "markdown", + "id": "7442c52c-d399-4593-a488-e30e02261383", + "metadata": {}, + "source": [ + "#### Matplotlib\n", + "Matplotlib has limitations where Imviz might better suit your needs -- especially if you like to look at things in WCS coordinates. For the purposes of this notebook, we are highlighting a few key areas using the matplotlib package instead.\n", + "\n", + "Using the `i2d` combined image and the source catalog produced by Image3, we can visually inspect whether we are happy with where the pipeline found the sources. In the following figures, what has been defined as an extended source by the pipeline is shown in orange-red, and what has been defined as a point source by the pipeline is shown in grey. This definition affects the extraction box in the WFSS images."
+ ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "0945914a-bebc-4a8e-b619-fded2bd42485", + "metadata": {}, + "outputs": [], + "source": [ + "fig = plt.figure(figsize=(10, 10))\n", + "\n", + "cols = 2\n", + "rows = int(np.ceil(len(image3_i2d) / cols))\n", + "\n", + "for plt_num, img in enumerate(image3_i2d):\n", + "\n", + " # determine where the subplot should be\n", + " xpos = (plt_num % 40) % cols\n", + " ypos = ((plt_num % 40) // cols) # // to make it an int.\n", + "\n", + " # make the subplot\n", + " ax = plt.subplot2grid((rows, cols), (ypos, xpos))\n", + "\n", + " # plot the image\n", + " with fits.open(img) as hdu:\n", + " ax.imshow(hdu[1].data, vmin=0, vmax=0.3, origin='lower')\n", + " ax.set_title(f\"obs{hdu[0].header['OBSERVTN']} {hdu[0].header['PUPIL']}\")\n", + "\n", + " # also plot the associated catalog\n", + " cat = Table.read(img.replace('i2d.fits', 'cat.ecsv'))\n", + " extended_sources = cat[cat['is_extended'] == 1] # 1 is True; i.e. is extended\n", + " point_sources = cat[cat['is_extended'] == 0] # 0 is False; i.e. is a point source\n", + " ax.scatter(extended_sources['xcentroid'], extended_sources['ycentroid'], s=20, facecolors='None', edgecolors='orangered', alpha=0.9)\n", + " ax.scatter(point_sources['xcentroid'], point_sources['ycentroid'], s=20, facecolors='None', edgecolors='dimgrey', alpha=0.9)\n", + "\n", + "# Helps to make the axes not overlap; you can also set this manually if this doesn't work\n", + "plt.tight_layout()" + ] + }, + { + "cell_type": "markdown", + "id": "d157e3a8-2f2a-4487-8329-f6fa56713212", + "metadata": {}, + "source": [ + "The segmentation maps are also a product of the Image3 pipeline, and they are used to help determine the source catalog. Let's take a look at those to ensure we are happy with what the pipeline is defining as a source.\n", + "\n", + "In the segmentation map, each blue blob should correspond to a physical target. There are cases where sources can be blended, in which case the parameters for making the segmentation map and source catalog should be changed. An example of this can be seen below in the observation 004 F200W filter image, where two galaxies at ~(1600, 1300) have been blended into one source. This is discussed in more detail below in [Custom Imaging Pipeline Run](#custom). 
\n", + "\n", + "*Note that because of the [filter offset difference](https://jwst-docs.stsci.edu/jwst-near-infrared-imager-and-slitless-spectrograph/niriss-instrumentation/niriss-gr150-grisms#NIRISSGR150Grisms-Blockingfilters) the field of views for the shown in the cutouts below differs for each filter.*" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "b27845f3-65f0-46ce-9c0e-70e64ce65bcd", + "metadata": {}, + "outputs": [], + "source": [ + "# we will look at this multiple times, so let's define this as a function\n", + "def plot_image_and_segmentation_map(i2d_images, segm_images, xmin=1250, xmax=1750, ymin=1250, ymax=1750, cat_suffix='cat.ecsv'):\n", + " \n", + " cols = 2\n", + " rows = len(i2d_images)\n", + "\n", + " fig = plt.figure(figsize=(10, 10*(rows/2)))\n", + "\n", + " for plt_num, img in enumerate(np.sort(np.concatenate([segm_images, i2d_images]))):\n", + " \n", + " # determine where the subplot should be\n", + " xpos = (plt_num % 40) % cols\n", + " ypos = ((plt_num % 40) // cols) # // to make it an int.\n", + "\n", + " # make the subplot\n", + " ax = plt.subplot2grid((rows, cols), (ypos, xpos))\n", + " \n", + " if 'i2d' in img:\n", + " cat = Table.read(img.replace('i2d.fits', cat_suffix))\n", + " cmap = 'gist_gray'\n", + " else:\n", + " cmap = 'tab20c_r'\n", + " \n", + " # plot the image\n", + " with fits.open(img) as hdu:\n", + " ax.imshow(hdu[1].data, vmin=0, vmax=0.3, origin='lower', cmap=cmap)\n", + " title = f\"obs{hdu[0].header['OBSERVTN']} {hdu[0].header['PUPIL']}\"\n", + " \n", + " # also plot the associated catalog\n", + " extended_sources = cat[cat['is_extended'] == 1] # 1 is True; i.e. is extended\n", + " point_sources = cat[cat['is_extended'] == 0] # 0 is False; i.e. is a point source\n", + " \n", + " for color, sources in zip(['darkred', 'black'], [extended_sources, point_sources]):\n", + " # plotting the sources\n", + " ax.scatter(sources['xcentroid'], sources['ycentroid'], s=20, facecolors='None', edgecolors=color, alpha=0.9)\n", + " \n", + " # adding source labels \n", + " for i, source_num in enumerate(sources['label']):\n", + " ax.annotate(source_num, \n", + " (sources['xcentroid'][i]+0.5, sources['ycentroid'][i]+0.5), \n", + " fontsize=8,\n", + " color=color)\n", + " if 'i2d' in img:\n", + " ax.set_title(f\"{title} combined image\")\n", + " else:\n", + " ax.set_title(f\"{title} segmentation map\")\n", + " \n", + " # zooming in on a smaller region\n", + " ax.set_xlim(xmin, xmax)\n", + " ax.set_ylim(ymin, ymax)\n", + " \n", + " # Helps to make the axes not overlap ; you can also set this manually if this doesn't work\n", + " plt.tight_layout()\n", + "\n", + " return fig" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "0599d52e-7a92-4c16-92e7-59ae62cfe8d9", + "metadata": {}, + "outputs": [], + "source": [ + "default_fig = plot_image_and_segmentation_map(image3_i2d, image3_segm)" + ] + }, + { + "cell_type": "markdown", + "id": "d1addba0-ae56-4565-b12a-3ded03d9b0bf", + "metadata": {}, + "source": [ + "#### ImViz\n", + "\n", + "Similarly to DS9, ImViz allows you to interactively view these images and the corresponding source catalog as well." 
+ ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "7bec8a55", + "metadata": {}, + "outputs": [], + "source": [ + "imviz = Imviz()\n", + "viewer = imviz.default_viewer\n", + "\n", + "# plot each i2d image\n", + "catalogs = [] # for plotting the catalogs\n", + "labels = [] # for plotting the catalogs\n", + "for img in image3_i2d:\n", + " print(f'Plotting: {img}')\n", + " label = f\"obs{fits.getval(img, 'OBSERVTN')} {fits.getval(img, 'PUPIL')}\"\n", + " with warnings.catch_warnings():\n", + " warnings.simplefilter('ignore')\n", + " imviz.load_data(img, data_label=label)\n", + "\n", + " # save info to plot the catalogs next\n", + " catalogs.append(img.replace('i2d.fits', 'cat.ecsv'))\n", + " labels.append(label)\n", + "\n", + "# this aligns the image to use the WCS coordinates; \n", + "# the images need to be loaded first, but before adding markers\n", + "imviz.link_data(link_type='wcs')\n", + "\n", + "# also plot the associated catalog\n", + "# this needs to be a separate loop due to linking in imviz when using sky coordinates\n", + "for catname, label in zip(catalogs, labels):\n", + " cat = Table.read(catname)\n", + " \n", + " # format the table into the format imviz expects\n", + " sky_coords = Table({'coord': [SkyCoord(ra=cat['sky_centroid'].ra.degree,\n", + " dec=cat['sky_centroid'].dec.degree,\n", + " unit=\"deg\")]})\n", + "\n", + " viewer.marker = {'color': 'orange', 'alpha': 1, 'markersize': 20, 'fill': False}\n", + " viewer.add_markers(sky_coords, use_skycoord=True, marker_name=f\"{label} catalog\")\n", + "\n", + "# This changes the stretch of all of the images\n", + "plotopt = imviz.plugins['Plot Options']\n", + "plotopt.select_all(viewers=True, layers=True)\n", + "plotopt.stretch_preset = '99.5%'\n", + " \n", + "imviz.show()" + ] + }, + { + "cell_type": "markdown", + "id": "4ea23a1b-4b6f-4ee1-9e2b-ff132dd9127d", + "metadata": {}, + "source": [ + "\n", + "## Custom Imaging Pipeline Run" + ] + }, + { + "cell_type": "markdown", + "id": "14216c26-cf89-4f15-b7ec-b5a8ca0071ba", + "metadata": {}, + "source": [ + "\n", + "### Image3" + ] + }, + { + "cell_type": "markdown", + "id": "ccc697e5-92de-4a2a-a9e1-3f81d20fcf27", + "metadata": {}, + "source": [ + "Try editing a few parameters and compare the outcomes to the default run above, at first for a single file.\n", + "\n", + "When we call the image3 pipeline, we can add modifications to a specific step of the pipeline. In this case we're going to edit the `source_catalog` and `tweakreg` steps of the pipeline. 
An explanation of the different parameters to tweak can be found in the documentation linked below, while the default values are a combination of the pipeline defaults listed there and the parameter reference files that are supplied.\n", + "- [source_catalog Further Information](https://jwst-pipeline.readthedocs.io/en/latest/api/jwst.source_catalog.SourceCatalogStep.html)\n", + "- [tweakreg Further Information](https://jwst-pipeline.readthedocs.io/en/latest/api/jwst.tweakreg.tweakreg_step.TweakRegStep.html)" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "3f2f7f0f", + "metadata": {}, + "outputs": [], + "source": [ + "image3_asns = np.sort(glob.glob('*image3*asn*.json'))\n", + "test_asn = image3_asns[1]\n", + "\n", + "# check if the calibrated file already exists\n", + "asn_data = json.load(open(test_asn))\n", + "i2d_file = os.path.join(custom_run_image3, f\"{asn_data['products'][0]['name']}_i2d.fits\")\n", + "\n", + "if os.path.exists(i2d_file):\n", + " print(i2d_file, 'i2d file already exists.')\n", + "else:\n", + " # call the image3 pipeline in the same way as before, but add a few new modifications\n", + " cust_img3 = Image3Pipeline.call(test_asn,\n", + " steps={\n", + " 'source_catalog': {'kernel_fwhm': 5.0,\n", + " 'snr_threshold': 10.0,\n", + " 'npixels': 50,\n", + " 'deblend': True,\n", + " },\n", + " 'tweakreg': {'snr_threshold': 20,\n", + " 'abs_refcat': 'GAIADR3',\n", + " 'save_catalogs': True,\n", + " 'searchrad': 3.0,\n", + " 'kernel_fwhm': 2.302,\n", + " 'fitgeometry': 'shift',\n", + " },\n", + " },\n", + " save_results=True,\n", + " output_dir=custom_run_image3)" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "403345ea-1f26-4ead-b680-ab6a54f73421", + "metadata": {}, + "outputs": [], + "source": [ + "# remove unnecessary files to save disk space\n", + "for crf in glob.glob(os.path.join(custom_run_image3, '*crf.fits')):\n", + " os.remove(crf)" + ] + }, + { + "cell_type": "markdown", + "id": "17d4cdae-0cad-4385-b93b-6976ecf90898", + "metadata": {}, + "source": [ + "\n", + "### Inspecting Custom Results" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "96e15c74-65e5-45b2-bf83-4e512cff4c6c", + "metadata": {}, + "outputs": [], + "source": [ + "default_i2d = os.path.join(default_run_image3, os.path.basename(i2d_file))\n", + "compare_i2ds = [i2d_file, default_i2d]\n", + "compare_segm = [i2d_file.replace('i2d.fits', 'segm.fits'), default_i2d.replace('i2d.fits', 'segm.fits')]\n", + "\n", + "compare_fig = plot_image_and_segmentation_map(compare_i2ds, compare_segm)" + ] + }, + { + "cell_type": "markdown", + "id": "8fd7891a-22ec-4913-a9fd-b153b66a932e", + "metadata": {}, + "source": [ + "The cell below shows similar information, using Imviz to visualize it instead."
+ ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "724d7a9e-f923-4cc0-8e99-6d21008db42a", + "metadata": {}, + "outputs": [], + "source": [ + "imviz = Imviz()\n", + "viewer = imviz.default_viewer\n", + "\n", + "for img, label in zip([i2d_file, os.path.join(default_run_image3, os.path.basename(i2d_file))], ['custom', 'default']):\n", + " print(f'Plotting: {img}')\n", + " title = f\"{label} obs{fits.getval(img, 'OBSERVTN')} {fits.getval(img, 'PUPIL')}\"\n", + " with warnings.catch_warnings():\n", + " warnings.simplefilter('ignore')\n", + " imviz.load_data(img, data_label=title)\n", + "\n", + " # this aligns the image to use the WCS coordinates\n", + " linking = imviz.plugins['Links Control']\n", + " linking.link_type = 'WCS'\n", + "\n", + " # also plot the associated catalog\n", + " cat = Table.read(img.replace('i2d.fits', 'cat.ecsv'))\n", + " # format the table into the format imviz expects\n", + " t_xy = Table({'x': cat['xcentroid'],\n", + " 'y': cat['ycentroid']})\n", + " viewer.marker = {'color': 'orange', 'alpha': 1, 'markersize': 20, 'fill': False}\n", + " viewer.add_markers(t_xy, marker_name=f\"{label} catalog\")\n", + "\n", + "# This changes the stretch of all of the images\n", + "plotopt = imviz.plugins['Plot Options']\n", + "plotopt.select_all(viewers=True, layers=True)\n", + "plotopt.stretch_preset = '99.5%'\n", + " \n", + "imviz.show()" + ] + }, + { + "cell_type": "markdown", + "id": "48ae7dc1-45ce-41a6-a5a3-c07226efc912", + "metadata": {}, + "source": [ + "If you are happy with the above results, calibrate the remaining images." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "4787d1bb-57a7-4056-b93e-c5e7497d14cf", + "metadata": {}, + "outputs": [], + "source": [ + "image3_asns = np.sort(glob.glob('*image3*asn*.json'))\n", + "\n", + "for img3_asn in image3_asns:\n", + " # check if the calibrated file already exists\n", + " asn_data = json.load(open(img3_asn))\n", + " i2d_file = os.path.join(custom_run_image3, f\"{asn_data['products'][0]['name']}_i2d.fits\")\n", + " if os.path.exists(i2d_file):\n", + " print(i2d_file, 'i2d file already exists.')\n", + " continue\n", + " # call the image3 pipeline in the same way as before, but add a few new modifications\n", + " cust_img3 = Image3Pipeline.call(img3_asn,\n", + " steps={\n", + " 'source_catalog': {'kernel_fwhm': 5.0,\n", + " 'snr_threshold': 10.0,\n", + " 'npixels': 50,\n", + " 'deblend': True,\n", + " },\n", + " 'tweakreg': {'snr_threshold': 20,\n", + " 'abs_refcat': 'GAIADR3',\n", + " 'save_catalogs': True,\n", + " 'searchrad': 3.0,\n", + " 'kernel_fwhm': 2.302,\n", + " 'fitgeometry': 'shift',\n", + " },\n", + " },\n", + " save_results=True,\n", + " output_dir=custom_run_image3)" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "828566bf-fb5f-4ba5-8ef7-0de09a73f0fb", + "metadata": {}, + "outputs": [], + "source": [ + "# remove unnecessary files to save disk space\n", + "for crf in glob.glob(os.path.join(custom_run_image3, '*crf.fits')):\n", + " os.remove(crf)" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "8286e568-a553-4434-93ff-74829b79d110", + "metadata": {}, + "outputs": [], + "source": [ + "# These are all results from the Image3 pipeline\n", + "cust_image3_i2d = np.sort(glob.glob(os.path.join(custom_run_image3, '*i2d.fits'))) # combined image over multiple dithers/mosaic\n", + "cust_image3_segm = np.sort(glob.glob(os.path.join(custom_run_image3, '*segm.fits'))) # segmentation map that defines the extent of a source\n", + "\n", + 
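"# plot the custom i2d images next to their segmentation maps, reusing the helper function defined earlier\n", + 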
"custom_fig = plot_image_and_segmentation_map(cust_image3_i2d, cust_image3_segm)" + ] + }, + { + "cell_type": "markdown", + "id": "812a461f-27f5-4ae7-84bd-4c4cafe0c603", + "metadata": {}, + "source": [ + "\n", + "## Refining the Source Catalog Further" + ] + }, + { + "cell_type": "markdown", + "id": "191a12fb-0989-46d2-8e62-792cf3b23f4a", + "metadata": {}, + "source": [ + "In the above cases, we have modified the results using the pipeline directly. It might be the case that perhaps you want to use someone else's custom source catalog, or modify the source catalog even further from what was output by the pipeline. In these cases, we will then need to modify the spec2 ASN files to point to the new source catalog, which will be discussed in the spec2 notebook. Additionally, it can be useful to match all of the source IDs across the different catalogs. In this case, there are three different catalogs created by the pipeline that identify the same RA/Dec or X/Y pixel location as a different source ID, so we will edit those to have the same source ID values across the catalogs. These extra steps aren't always necessary, but could be helpful in analyses of NIRISS WFSS data." + ] + }, + { + "cell_type": "markdown", + "id": "4eef927a-4301-481c-b730-d80219b9d14e", + "metadata": {}, + "source": [ + "\n", + "#### Matching Source IDs Across Catalogs\n", + "\n", + "In the above figures, you can see that the same source has multiple source IDs. Here we want to match all of the source IDs across all observations to be sure we are talking about the same source regardless of which filter or observation we look at. To do this, we use the astropy `match_to_catalog_3d` function and rebase the labels." + ] + }, + { + "cell_type": "markdown", + "id": "6ca9e9eb-6420-421a-818b-1faab1db788a", + "metadata": {}, + "source": [ + "The first step is to decide on a base catalog to match all of the other catalogs to. Here we'll use the the observation 004 F115W filter." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "d024dff0-6ffc-408c-ae86-b65f766e73ef", + "metadata": {}, + "outputs": [], + "source": [ + "custom_cats = np.sort(glob.glob(os.path.join(custom_run_image3, '*niriss_clear-f????_cat.ecsv'))) # cat filename format from image3 results\n", + "print(\"All image3 catalogs:\\n\", custom_cats)\n", + "\n", + "base_cat = Table.read(custom_cats[0])\n", + "\n", + "# save out the base catalog with a new name to be consistent\n", + "base_cat_name = custom_cats[0].replace('cat.ecsv', 'source-match_cat.ecsv')\n", + "base_cat.write(base_cat_name, overwrite=True)\n", + "print(\"\\nBase catalog:\", base_cat_name)" + ] + }, + { + "cell_type": "markdown", + "id": "589f1d52-3915-43c6-93b7-32915139e7d6", + "metadata": {}, + "source": [ + "Loop through the remaining catalogs to match the IDs based off of sky coordinates matching to within 1 arcsecond." 
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "id": "55f6ce9b-d995-489c-8f1b-b6ce76e4b5ee",
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "max_sep = 1 * u.arcsec # adjust if necessary\n",
+ "\n",
+ "base_sky = base_cat['sky_centroid']\n",
+ "\n",
+ "for to_match_cat in custom_cats[1:]:\n",
+ "    # read in the catalog\n",
+ "    other_cat = Table.read(to_match_cat)\n",
+ "    other_sky = other_cat['sky_centroid']\n",
+ "\n",
+ "    # find the matching sources between the two catalogs based on sky coordinates\n",
+ "    idx, d2d, d3d = base_sky.match_to_catalog_3d(other_sky)\n",
+ "    sep_constraint = d2d < max_sep\n",
+ "    base_matches = base_cat[sep_constraint]\n",
+ "    other_matches = other_cat[idx[sep_constraint]]\n",
+ "\n",
+ "    # rebase the ID values to be the same\n",
+ "    other_matches['label'] = base_matches['label']\n",
+ "\n",
+ "    # save out the new catalog\n",
+ "    match_cat_name = to_match_cat.replace('cat.ecsv', 'source-match_cat.ecsv')\n",
+ "    other_matches.write(match_cat_name, overwrite=True)\n",
+ "    print('Saved:', match_cat_name)"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "f080887a-c1dd-435e-ae8a-72dcfebf9606",
+ "metadata": {},
+ "source": [
+ "Look at the new source label numbers. They should match!"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "id": "8e0e4837-f956-4cd2-b4f9-68a60d647605",
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "new_cat_fig = plot_image_and_segmentation_map(cust_image3_i2d, cust_image3_segm, cat_suffix='source-match_cat.ecsv',\n",
+ "                                              xmin=1500, xmax=2000, ymin=800, ymax=1300)"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "686b13dc-0573-45ec-8828-9be428bf32ea",
+ "metadata": {},
+ "source": [
+ "\n",
+ "#### Manually Editing the Source Catalog\n",
+ "\n",
+ "Looking ahead to the WFSS extraction, it might be that we only really care about one or two sources at any particular moment. In that case, it's helpful to pare down the source catalog to just those sources, which makes it easier to look at the extracted 1-D spectra in the WFSS data.\n",
+ "\n",
+ "- [Source Catalog Column Information](https://jwst-pipeline.readthedocs.io/en/latest/jwst/source_catalog/main.html#output-products)"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "a379ccea-58a4-4440-b1f5-26c9736e36c6",
+ "metadata": {},
+ "source": [
+ "To start, look at the current custom source catalog for one of the filters to get an idea of what is contained in the catalogs."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "id": "e51fa992-9785-4c96-a4fa-99572e16f63c",
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "# first, look at the current, custom source catalog for the F200W filter\n",
+ "cat_name = np.sort(glob.glob(os.path.join(custom_run_image3, '*source-match_cat.ecsv')))[2]\n",
+ "cat = Table.read(cat_name)\n",
+ "\n",
+ "print(cat_name)\n",
+ "cat"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "66930bfc-8647-4e00-8369-936eaff148a9",
+ "metadata": {},
+ "source": [
+ "There may be multiple ways to look for a source, so shown below are three options:\n",
+ "1. With a known RA/Dec of an object\n",
+ "2. With a known x/y location of an object\n",
+ "3. With a source ID of an object. Note that we are using the rebased source catalogs here for the IDs.\n",
+ "\n",
+ "This same concept can be extended to filter by any of the columns contained in the source catalog.\n",
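+ "\n",
+ "For example, a hypothetical cut that keeps only point sources brighter than a chosen AB magnitude (the threshold here is illustrative):\n",
+ "```python\n",
+ "bright_points = cat[(cat['is_extended'] == 0) & (cat['isophotal_abmag'] < 25)]\n",
+ "```"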
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "id": "2a5af10e-90c8-4e08-bff7-b6b798f9bea5",
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "# with a known RA/Dec\n",
+ "desired_ra = 53.15437048946369\n",
+ "desired_dec = -27.771689847051736\n",
+ "\n",
+ "c = SkyCoord(ra=desired_ra*u.degree, dec=desired_dec*u.degree)\n",
+ "nearest_id, distance_2d, distance_3d = c.match_to_catalog_sky(cat['sky_centroid'])\n",
+ "\n",
+ "cat[['label', 'xcentroid', 'ycentroid', 'sky_centroid', 'is_extended', 'isophotal_abmag', 'isophotal_vegamag']][nearest_id]"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "id": "04bd4079-c774-471d-94da-004a7fd287e8",
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "# alternatively with a known X/Y pixel location in the F200W image (based on what you've defined cat to be)\n",
+ "known_x = 1880\n",
+ "known_y = 1100\n",
+ "\n",
+ "nearest_pos = [np.sqrt((x-known_x)**2 + (y-known_y)**2) for x, y in zip(cat['xcentroid'], cat['ycentroid'])]\n",
+ "\n",
+ "wh_nearest = np.where(np.array(nearest_pos) == min(nearest_pos))[0][0]\n",
+ "\n",
+ "cat[['label', 'xcentroid', 'ycentroid', 'sky_centroid', 'is_extended', 'isophotal_abmag', 'isophotal_vegamag']][wh_nearest]"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "id": "572fe4a4-3d4d-4364-9672-7b2d23154fad",
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "# alternatively with a known source number\n",
+ "# note that this number might be different for you depending on the pipeline version and the parameters changed\n",
+ "source = 118\n",
+ "\n",
+ "wh_source = np.where(np.array(cat['label'] == source))[0][0]\n",
+ "\n",
+ "cat[['label', 'xcentroid', 'ycentroid', 'sky_centroid', 'is_extended', 'isophotal_abmag', 'isophotal_vegamag']][wh_source]"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "3303d1c5-d484-46d1-9472-e57862660c31",
+ "metadata": {},
+ "source": [
+ "Using any of the three methods above, write out the new catalog.\n",
+ "- with RA/Dec: `new_cat = Table(cat[nearest_id])`\n",
+ "- with x/y: `new_cat = Table(cat[wh_nearest])`\n",
+ "- with source ID: `new_cat = Table(cat[wh_source])`"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "id": "aae96ac3-a587-4672-8edf-8bb327eea6e5",
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "new_cat = Table(cat[wh_source]) # turn the Row instance back into a Table\n",
+ "\n",
+ "# save the new catalog with a unique name\n",
+ "new_cat_name = cat_name.replace('cat.ecsv', f'source{source}_cat.ecsv')\n",
+ "new_cat.write(new_cat_name, overwrite=True)\n",
+ "print('Saved:', new_cat_name)"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "9d2a8566-eb1c-4df0-b4eb-f24c6edc8e56",
+ "metadata": {},
+ "source": [
+ "Once we have an updated source catalog that we are content with, we can move on to the spec2 step of the pipeline. It will likely be necessary to come back to this step after running spec2. Let's take a quick look at the source that we will be extracting in the following notebook with spec2."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "id": "443842e0-b30a-4a31-937c-1880eb7f697a",
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(9, 6))\n",
+ "\n",
+ "img = cust_image3_i2d[-1]\n",
+ "\n",
+ "cat = Table.read(new_cat_name)\n",
+ "\n",
+ "for ax in [ax1, ax2]:\n",
+ "    # plot the image\n",
+ "    with fits.open(img) as hdu:\n",
+ "        ax.imshow(hdu[1].data, vmin=0, vmax=0.3, origin='lower')\n",
+ "        title = f\"obs{hdu[0].header['OBSERVTN']} {hdu[0].header['PUPIL']}\"\n",
+ "\n",
+ "    # also plot the associated catalog\n",
+ "    extended_sources = cat[cat['is_extended'] == 1] # 1 is True; i.e. is extended\n",
+ "    point_sources = cat[cat['is_extended'] == 0] # 0 is False; i.e. is a point source\n",
+ "\n",
+ "    for color, sources in zip(['darkred', 'black'], [extended_sources, point_sources]):\n",
+ "        # plotting the sources\n",
+ "        ax.scatter(sources['xcentroid'], sources['ycentroid'], s=20, facecolors='None', edgecolors=color, alpha=0.9)\n",
+ "\n",
+ "        # adding source labels\n",
+ "        for i, source_num in enumerate(sources['label']):\n",
+ "            ax.annotate(source_num,\n",
+ "                        (sources['xcentroid'][i]+0.5, sources['ycentroid'][i]+0.5),\n",
+ "                        fontsize=8,\n",
+ "                        color=color)\n",
+ "\n",
+ "fig.suptitle(\"Specific Source to Extract with Spec2\")\n",
+ "\n",
+ "# zooming in on a smaller region\n",
+ "ax2.set_xlim(known_x-50, known_x+50)\n",
+ "ax2.set_ylim(known_y-50, known_y+50)\n",
+ "\n",
+ "fig.tight_layout()"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "d73170cc-0e25-44d2-ac6b-0ba56e8122a4",
+ "metadata": {},
+ "source": [
+ "\"Space
+ ]
+ }
+ ],
+ "metadata": {
+ "kernelspec": {
+ "display_name": "Python 3 (ipykernel)",
+ "language": "python",
+ "name": "python3"
+ },
+ "language_info": {
+ "codemirror_mode": {
+ "name": "ipython",
+ "version": 3
+ },
+ "file_extension": ".py",
+ "mimetype": "text/x-python",
+ "name": "python",
+ "nbconvert_exporter": "python",
+ "pygments_lexer": "ipython3",
+ "version": "3.11.7"
+ }
+ },
+ "nbformat": 4,
+ "nbformat_minor": 5
+}
diff --git a/notebooks/NIRISS_WFSS_advanced/02_niriss_wfss_spec2.ipynb b/notebooks/NIRISS_WFSS_advanced/02_niriss_wfss_spec2.ipynb
new file mode 100644
index 000000000..38ec1db11
--- /dev/null
+++ b/notebooks/NIRISS_WFSS_advanced/02_niriss_wfss_spec2.ipynb
@@ -0,0 +1,1313 @@
+{
+ "cells": [
+ {
+ "cell_type": "markdown",
+ "id": "6d674dc1",
+ "metadata": {},
+ "source": [
+ "# Run spec2 pipeline\n",
+ "\n",
+ "The dispersed images for WFSS will be run through the [spec2 pipeline](https://jwst-pipeline.readthedocs.io/en/latest/jwst/pipeline/calwebb_spec2.html) after the image2 and image3 pipelines have been run on the corresponding direct images. As mentioned in the previous notebook, it is extremely helpful for this step if you have a source catalog that includes only the sources that you would like to look at. Any additional sources will take extra time to calibrate.\n",
+ "\n",
+ "**Use case**: After creating a custom source catalog, spec2 should be run on the dispersed WFSS images.<br>
\n", + "**Data**: JWST/NIRISS images and spectra from program 2079 obs 004. This should be stored in a single directory `data`, and can be downloaded from the notebook 00_niriss_mast_query_data_setup.ipynb.
\n", + "**Tools**: astropy, crds, glob, jdaviz, json, jwst, matplotlib, numpy, os, pandas, shutil, urllib, zipfile
\n", + "**Cross-instrument**: NIRISS
\n", + "\n", + "**Content**\n", + "- [Imports & Data Setup](#imports)\n", + "- [Custom Spec2 Pipeline Run](#custom_spec2)\n", + " - [Spec2 Association Files](#spec2_asn)\n", + " - [Run Spec2](#spec2_run)\n", + " - [Examine the Outputs of Spec2](#spec2_examine)\n", + "- [Explore Data Further](#explore)\n", + " - [Find a Source](#source)\n", + " - [Limit the Number of Extracted Sources](#limit_source)\n", + " - [Final Visualization](#final_visualize)\n", + "\n", + "**Author**: Rachel Plesha (rplesha@stsci.edu), Camilla Pacifici (cpacifici@stsci.edu), JWebbinar notebooks.
\n", + "**First Published**: May 2024
\n", + "**Last tested**: This notebook was last tested with JWST pipeline version 1.12.5 and the CRDS context jwst_1225.pmap." + ] + }, + { + "cell_type": "markdown", + "id": "b28e541b", + "metadata": {}, + "source": [ + "\n", + "## Imports & Data Setup" + ] + }, + { + "cell_type": "markdown", + "id": "ab693165-555f-41b3-8135-1155da31393f", + "metadata": {}, + "source": [ + "[CRDS Documentation](https://jwst-pipeline.readthedocs.io/en/latest/jwst/user_documentation/reference_files_crds.html#crds)" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "974ab7ce-d1dd-4bcc-b405-25c6dd683a25", + "metadata": {}, + "outputs": [], + "source": [ + "# Update the CRDS path to your local directory\n", + "%env CRDS_PATH=crds_cache\n", + "%env CRDS_SERVER_URL=https://jwst-crds.stsci.edu" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "9443f2ff", + "metadata": {}, + "outputs": [], + "source": [ + "import glob\n", + "import json\n", + "import os\n", + "import shutil\n", + "import urllib\n", + "import zipfile\n", + "import numpy as np\n", + "import pandas as pd\n", + "\n", + "from astropy.table import Table\n", + "from astropy.io import fits\n", + "from astropy import constants as const\n", + "\n", + "from matplotlib import pyplot as plt\n", + "from matplotlib import patches\n", + "%matplotlib widget\n", + "# %matplotlib inline\n", + "\n", + "from jwst.pipeline import Spec2Pipeline" + ] + }, + { + "cell_type": "markdown", + "id": "64b6e8d7-64a0-4a3b-9776-29b181452ff9", + "metadata": {}, + "source": [ + "Check what version of the JWST pipeline you are using. To see what the latest version of the pipeline is available or install a previous version, check [GitHub](https://github.com/spacetelescope/jwst#software-vs-dms-build-version-map\"). Also verify what [CRDS version](https://jwst-crds.stsci.edu/) you are using. [CRDS documentation](https://jwst-pipeline.readthedocs.io/en/latest/jwst/user_documentation/reference_files_crds.html) explains how to set a specific context to use in the JWST pipeline. If either of these values are different from the last tested note above there may be differences in this notebook." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "87cdec65", + "metadata": {}, + "outputs": [], + "source": [ + "import jwst\n", + "import crds\n", + "print('JWST Pipeliene Version:', jwst.__version__)\n", + "print('CRDS Context:', crds.get_context_name('jwst'))" + ] + }, + { + "cell_type": "markdown", + "id": "436ed7e4-f642-41d0-83dd-565c24f78d5c", + "metadata": {}, + "source": [ + "#### Data Setup\n", + "\n", + "The data directory, `data_dir` should contain all of the association and rate files in a single, flat directory. `custom_run_image3` should match the output directory for the custom image3 calibration in the notebook 01_niriss_wfss_image2_image3.ipynb.\n", + "\n", + "For spec2, we need the rate files that we downloaded as well as the segmentation maps and source catalogs from image3. Because of that, we will create a new directory (defined by `custom_run_spec2`), change into it, and copy the relevant data over. \n", + "\n", + "*For a regular workflow, it is likely easier to leave all files in a single directory compared to the multiple directories we make here.*" + ] + }, + { + "cell_type": "markdown", + "id": "d218c8e0-7a25-4688-b904-c4225a66a8ca", + "metadata": {}, + "source": [ + "Open up our data information file which will make parsing the rate files to copy over easier as we only need the dispersed images, i.e. 
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "id": "168c2146",
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "data_dir = 'data'\n",
+ "if not os.path.exists(data_dir):\n",
+ "    os.mkdir(data_dir)\n",
+ "\n",
+ "custom_run_spec2 = 'custom_spec2' # saving files here in this notebook\n",
+ "custom_run_image3 = 'custom_image3_calibrated' # results of custom image3 calibration\n",
+ "\n",
+ "# if the directories don't exist yet, make them\n",
+ "for custom_dir in [custom_run_spec2, custom_run_image3]:\n",
+ "    if not os.path.exists(os.path.join(data_dir, custom_dir)):\n",
+ "        os.mkdir(os.path.join(data_dir, custom_dir))"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "id": "634e2f5f-8a49-4eea-83f6-850456436635",
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "# if you have not downloaded the data from notebook 00 or have not run notebook 01, run this cell. Otherwise, feel free to skip it.\n",
+ "\n",
+ "# Download uncalibrated data from Box into the data directory:\n",
+ "boxlink = 'https://data.science.stsci.edu/redirect/JWST/jwst-data_analysis_tools/niriss_wfss_advanced/niriss_wfss_advanced_02_input.zip'\n",
+ "boxfile = os.path.basename(boxlink)\n",
+ "urllib.request.urlretrieve(boxlink, boxfile)\n",
+ "\n",
+ "zf = zipfile.ZipFile(boxfile, 'r')\n",
+ "zf.extractall(path=data_dir)\n",
+ "\n",
+ "# move the files downloaded from the box file into the top level data directory\n",
+ "box_download_dir = os.path.join(data_dir, boxfile.split('.zip')[0])\n",
+ "\n",
+ "for filename in glob.glob(os.path.join(box_download_dir, '*')):\n",
+ "    if '.csv' in filename:\n",
+ "        # move to the current directory\n",
+ "        os.rename(filename, os.path.basename(filename))\n",
+ "    elif '_segm.fits' in filename or '_cat.ecsv' in filename:\n",
+ "        # move the image2 products to the appropriate directory\n",
+ "        os.rename(filename, os.path.join(data_dir, custom_run_spec2, os.path.basename(filename)))\n",
+ "    elif '_i2d.fits' in filename:\n",
+ "        # move image3 products to their directory, too\n",
+ "        os.rename(filename, os.path.join(data_dir, custom_run_image3, os.path.basename(filename)))\n",
+ "    else:\n",
+ "        # move to the data directory\n",
+ "        os.rename(filename, os.path.join(data_dir, os.path.basename(filename)))\n",
+ "\n",
+ "# remove unnecessary files now\n",
+ "os.remove(boxfile)\n",
+ "os.rmdir(box_download_dir)"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "id": "a3f8222a-92f1-476a-9dee-85604e8415d0",
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "# From the csv file we created earlier, find a list of all of the grism observations we will want to calibrate with spec2\n",
+ "listrate_file = 'list_ngdeep_rate.csv'\n",
+ "rate_df = pd.read_csv(listrate_file)"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "id": "6bfd083d-55b5-4610-b11e-276388394e27",
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "# copy all of the grism rate files\n",
+ "gr150r_files = list(rate_df[rate_df['FILTER'] == 'GR150R']['FILENAME'])\n",
+ "gr150c_files = list(rate_df[rate_df['FILTER'] == 'GR150C']['FILENAME'])\n",
+ "\n",
+ "for grism_rate in gr150r_files + gr150c_files:\n",
+ "    if os.path.exists(grism_rate):\n",
+ "        shutil.copy(grism_rate, os.path.join(data_dir, custom_run_spec2, os.path.basename(grism_rate)))\n",
+ "    else:\n",
+ "        print(f'{grism_rate} does not exist. 
Not able to copy file.')" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "e03967b5-e6d1-4cd6-a0ca-d429b5fa18c6", + "metadata": {}, + "outputs": [], + "source": [ + "# copy all of the spec2 asn files\n", + "for asn in glob.glob(os.path.join(data_dir, '*spec2*asn*.json')):\n", + " if os.path.exists(asn):\n", + " shutil.copy(asn, os.path.join(data_dir, custom_run_spec2, os.path.basename(asn)))\n", + " else:\n", + " print(f'{asn} does not exist. Not able to copy file.')" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "d5beb0fe-ff62-453c-95b5-046d6f2aec10", + "metadata": {}, + "outputs": [], + "source": [ + "# copy all of the necessary image3 output files\n", + "cats = glob.glob(os.path.join(data_dir, custom_run_image3, '*source*_cat.ecsv')) # copy both the source-match and source118 catalogs\n", + "segm = glob.glob(os.path.join(data_dir, custom_run_image3, '*_segm.fits'))\n", + "\n", + "for image3_file in cats + segm:\n", + " if os.path.exists(image3_file):\n", + " shutil.copy(image3_file, os.path.join(data_dir, custom_run_spec2, os.path.basename(image3_file)))\n", + " else:\n", + " print(f'{image3_file} does not exist. Not able to copy file.')" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "c02d9212-7f9a-4f8d-a769-6765bc5bf710", + "metadata": {}, + "outputs": [], + "source": [ + "cwd = os.getcwd() # get the current working directory \n", + "if cwd != data_dir: # if you are not already in the location of the data, change into it\n", + " try:\n", + " os.chdir(data_dir)\n", + " except FileNotFoundError:\n", + " print(f\"Not able to change into: {data_dir}.\\nRemaining in: {cwd}\")\n", + " pass\n", + "\n", + "if not os.path.exists(custom_run_spec2):\n", + " os.mkdir(custom_run_spec2)\n", + "\n", + "os.chdir(custom_run_spec2)\n", + "\n", + "new_cwd = os.getcwd()\n", + "print('Now in:', new_cwd)" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "84915ddb-15a0-491b-ada1-965112ac9311", + "metadata": {}, + "outputs": [], + "source": [ + "# directories for calibrating later\n", + "source_outdir = 'new_catalog_calibrated'\n", + "if not os.path.exists(source_outdir):\n", + " os.mkdir(source_outdir)\n", + "\n", + "\n", + "param_outdir = 'parameter_input_calibrated'\n", + "if not os.path.exists(param_outdir):\n", + " os.mkdir(param_outdir)" + ] + }, + { + "cell_type": "markdown", + "id": "e0a176fd-094d-44e5-b860-5b9f730e0898", + "metadata": {}, + "source": [ + "\n", + "## Custom Spec2 Pipeline Run\n", + "\n", + "Because this is a custom spec2 pipeline run, the first step is to make sure the correct source catalog is being used. This will define the spectral trace cutouts, and therefore inform the pipeline on where to extract the spectrum." + ] + }, + { + "cell_type": "markdown", + "id": "aee8bd1b-1012-47ea-8ce4-fece4c2866a0", + "metadata": {}, + "source": [ + "\n", + "### Spec2 Association Files\n", + "\n", + "As with the imaging part of the pipeline, there are association files for spec2. These are a bit more complex in that they need to have the science (WFSS) data, direct image, source catalog, and segmentation map included as members. For the science data, the rate files are used as inputs, similarly to image2. Also like image2, there should be one asn file for each dispersed image dither position in an observing sequence. In this case, that should match the number of rate files where `FILTER=GR150R` or `FILTER=GR150C`. 
For this program and observation, there are three dithers per grism, and both GR150R and GR150C are used, totaling six dispersed exposures per observing sequence. There are five observing sequences in this observation, which use the blocking filters F115W -> F115W -> F150W -> F150W -> F200W."
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "7228f9ea-5e57-41d3-b236-69d7de008423",
+ "metadata": {},
+ "source": [
+ "#### Looking in a Spec2 Association File"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "id": "bf6515ea-2b60-4cb1-92ab-3b8dd72a8c01",
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "spec2_asns = np.sort(glob.glob('*spec2*asn*.json'))\n",
+ "print(len(spec2_asns), 'Spec2 ASN files')\n",
+ "# number of asn files should match the number of dispersed images -- 30 for obs 004\n",
+ "print(len(rate_df[(rate_df['FILTER'] == 'GR150R') | (rate_df['FILTER'] == 'GR150C')]), 'Dispersed image rate files')"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "id": "87e4b9de-b857-4c2b-a58d-a90ef2aadf05",
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "# look at one of the association files\n",
+ "asn_data = json.load(open(spec2_asns[0]))\n",
+ "for key, data in asn_data.items():\n",
+ "    print(f\"{key} : {data}\")"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "id": "41c5c5a0-0641-4b7d-bf0c-f1ce688c48e1",
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "# in particular, take a closer look at the product filenames within the association file:\n",
+ "for product in asn_data['products']:\n",
+ "    for key, value in product.items():\n",
+ "        if key == 'members':\n",
+ "            print(f\"{key}:\")\n",
+ "            for member in value:\n",
+ "                print(f\"    {member['expname']} : {member['exptype']}\")\n",
+ "        else:\n",
+ "            print(f\"{key}: {value}\")"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "5caee78b-b2e4-432d-9838-836495f390ac",
+ "metadata": {},
+ "source": [
+ "#### Modify the Association File to Use the Custom Source Catalog\n",
+ "\n",
+ "The association files currently point to the source catalogs produced by the pipeline. In the previous notebook, we created a custom source catalog that we want to use instead, with the extension `source-match_cat.ecsv`, so we need to point to that catalog in the association files.\n",
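+ "\n",
+ "After the edit in the next cell, the `sourcecat` member of each product should look something like this (the file root shown is illustrative):\n",
+ "```\n",
+ "jw02079-o004_..._niriss_clear-f115w_source-match_cat.ecsv : sourcecat\n",
+ "```"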
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "id": "c4074a86-fd34-42e7-b09f-bf6e66c4b866",
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "new_source_ext = 'source-match_cat.ecsv'\n",
+ "\n",
+ "# loop through all of the spec2 association files in the current directory\n",
+ "for asn in spec2_asns:\n",
+ "    asn_data = json.load(open(asn))\n",
+ "    for product in asn_data['products']: # for every product, check the members\n",
+ "        for member in product['members']: # there are multiple members per product\n",
+ "            if member['exptype'] == 'sourcecat':\n",
+ "                cat_in_asn = member['expname']\n",
+ "                # check that we haven't already updated the source catalog name\n",
+ "                if new_source_ext not in cat_in_asn:\n",
+ "                    new_cat = cat_in_asn.replace('cat.ecsv', new_source_ext)\n",
+ "                    # actually update the association file member\n",
+ "                    if os.path.exists(new_cat):\n",
+ "                        member['expname'] = new_cat\n",
+ "                    else:\n",
+ "                        print(f\"{new_cat} does not exist in the current working directory\")\n",
+ "\n",
+ "    # write out the new association file\n",
+ "    with open(asn, 'w', encoding='utf-8') as f:\n",
+ "        json.dump(asn_data, f, ensure_ascii=False, indent=4)"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "id": "413eeb50-7e0a-4190-b999-f870b02fdc2e",
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "# double check that things still look ok:\n",
+ "asn_check = json.load(open(spec2_asns[0]))\n",
+ "for product in asn_check['products']:\n",
+ "    for key, value in product.items():\n",
+ "        if key == 'members':\n",
+ "            print(f\"{key}:\")\n",
+ "            for member in value:\n",
+ "                print(f\"    {member['expname']} : {member['exptype']}\")\n",
+ "        else:\n",
+ "            print(f\"{key}: {value}\")"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "5584e673-2907-4483-bb80-28bc16c11911",
+ "metadata": {},
+ "source": [
+ "Alternatively, open the .json files in your favorite text editor and manually edit each of the catalog names."
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "d71ccaa3",
+ "metadata": {},
+ "source": [
+ "\n",
+ "### Run Spec2\n",
+ "\n",
+ "Like the image pipeline, we can supply custom parameters for steps in the pipeline.\n",
+ "\n",
+ "More information about everything that is run during the spec2 stage of the pipeline can be found [here](https://jwst-pipeline.readthedocs.io/en/latest/jwst/pipeline/calwebb_spec2.html).\n",
+ "\n",
+ "To start, we are only going to run spec2 on one association file, because spec2 can take a while to run if there are many sources. We are saving the outputs to the directory we are currently in, which should be the `custom_spec2` directory made above.\n",
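+ "\n",
+ "As a general pattern (a sketch only -- the step and value here are placeholders, not recommendations), parameters are passed per step through a `steps` dictionary, and any step can also be skipped entirely:\n",
+ "```python\n",
+ "Spec2Pipeline.call(spec2_asns[0],\n",
+ "                   steps={'bkg_subtract': {'skip': True}},\n",
+ "                   save_results=True)\n",
+ "```"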
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "id": "71e686d1-51b9-4548-8117-57a8815c9d39",
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "test_asn = spec2_asns[0]\n",
+ "print(f'Calibrating: {test_asn}')"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "id": "b2d5b653",
+ "metadata": {
+ "scrolled": true
+ },
+ "outputs": [],
+ "source": [
+ "# check if the calibrated file already exists\n",
+ "asn_data = json.load(open(test_asn))\n",
+ "x1d_file = f\"{asn_data['products'][0]['name']}_x1d.fits\"\n",
+ "\n",
+ "if os.path.exists(x1d_file):\n",
+ "    print(x1d_file, ': x1d file already exists.')\n",
+ "else:\n",
+ "    spec2 = Spec2Pipeline.call(test_asn, save_results=True)"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "c9dc3a2d-e33e-4f4d-88cb-c335e38bdcc8",
+ "metadata": {},
+ "source": [
+ "\n",
+ "### Examining the Outputs of Spec2\n",
+ "\n",
+ "The outputs of spec2 are `cal.fits` and `x1d.fits` files. Here we take a quick look at some important parts of these files.\n",
+ "- [cal further reading](https://jwst-pipeline.readthedocs.io/en/latest/jwst/data_products/science_products.html#calibrated-data-cal-and-calints)\n",
+ "- [x1d further reading](https://jwst-pipeline.readthedocs.io/en/latest/jwst/data_products/science_products.html#extracted-1-d-spectroscopic-data-x1d-and-x1dints)"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "id": "d84fe30e-3737-4886-ae51-28bfe21771e5",
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "asn_example = json.load(open(spec2_asns[0]))\n",
+ "rate_file = asn_example['products'][0]['members'][0]['expname']\n",
+ "source_cat = asn_example['products'][0]['members'][2]['expname']\n",
+ "cal_file = rate_file.replace('rate', 'cal')\n",
+ "x1d_file = rate_file.replace('rate', 'x1d')"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "id": "f0b56449-c61d-4330-9698-17818b95f7b7",
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "# open all of the files to look at\n",
+ "rate_hdu = fits.open(rate_file)\n",
+ "cal_hdu = fits.open(cal_file)\n",
+ "x1d_hdu = fits.open(x1d_file)\n",
+ "cat = Table.read(source_cat)\n",
+ "\n",
+ "# first look at how many sources we expect from the catalog\n",
+ "print(f'There are {len(cat)} sources identified in the current catalog.\\n')\n",
+ "\n",
+ "# then look at how long the cal and x1d files are for comparison\n",
+ "print(f'The x1d file has {len(x1d_hdu)} extensions & the cal file has {len(cal_hdu)} extensions')"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "a0f6c961-dfba-4b12-a7ee-cb61623308dd",
+ "metadata": {},
+ "source": [
+ "Note that the 0th and final extensions in each file do not contain science data; the remaining extensions correspond to each source. The `x1d` file contains the extracted spectrum for each source, while the `cal` file contains the 2D cutout information in seven extensions for each source (SCI, DQ, ERR, WAVELENGTH, VAR_POISSON, VAR_RNOISE, VAR_FLAT).\n",
+ "\n",
+ "This is part of why it is so important to have a refined source catalog. If there are sources that are not useful for your research, there is no reason to create a cutout and extract them.\n",
+ "\n",
+ "Notice that there are more sources in the catalog than there are extensions in the files. This is because the pipeline defaults to only extracting the 100 brightest sources. To change this behavior, supply the pipeline with the parameter `wfss_nbright`, which we do [below](#limit_source)."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "id": "68ae6ca2-b804-4158-824f-0ed859d6a876",
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "print(x1d_hdu.info())"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "id": "a0773e93-9baa-4d66-b181-a9623446159e",
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "print(cal_hdu.info())"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "52473e1d-c5b4-490d-b2e9-e0d049897c7b",
+ "metadata": {},
+ "source": [
+ "The `x1d` file is a BinTable, so there are additional columns contained inside each of the data extensions. [x1d further reading](https://jwst-pipeline.readthedocs.io/en/latest/jwst/data_products/science_products.html#extracted-1-d-spectroscopic-data-x1d-and-x1dints) goes through what is contained in each column, but we can also take a quick look at the columns in one of the data extensions."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "id": "1fa8d0c6-ea17-475a-ab74-2a40fba25670",
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "# a quick look at the columns also available in the x1d file\n",
+ "print(x1d_hdu[1].data.columns)"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "2e1a31a6-fbba-4ca5-920a-ac59014c8495",
+ "metadata": {},
+ "source": [
+ "\n",
+ "## Explore Spec2 Data Further"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "ba3ce4c1-64c0-4fcb-a274-b1003fade978",
+ "metadata": {},
+ "source": [
+ "\n",
+ "### Find a Source in the Spec2 Data\n",
+ "\n",
+ "Each extension of the cal and x1d files has a source ID in the header. These values should match the values in the source catalog."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "id": "d7504a2f-74b9-4f02-a760-79a99a1b80be",
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "def find_source_ext(x1d_hdu, cal_hdu, source_id, info=True):\n",
+ "    # x1d extension first\n",
+ "    x1d_source_ids = np.array([x1d_hdu[ext].header['SOURCEID'] for ext in range(len(x1d_hdu))[1:-1]]) # cut off the 0th and last data extensions\n",
+ "    wh_x1d = np.where(x1d_source_ids == source_id)[0][0] + 1 # need to add 1 for the primary header\n",
+ "\n",
+ "    # look for the cal extension, too, but only in the SCI extensions;\n",
+ "    # fill in with a source ID of -999 for all other extensions to get the right extension value\n",
+ "    cal_source_ids = np.array([cal_hdu[ext].header['SOURCEID'] if cal_hdu[ext].header['EXTNAME'] == 'SCI'\n",
+ "                               else -999 for ext in range(len(cal_hdu))[1:-1]])\n",
+ "    wh_cal = np.where(cal_source_ids == source_id)[0][0] + 1 # need to add 1 for the primary header\n",
+ "\n",
+ "    if info:\n",
+ "        print(f\"All source IDs in x1d file:\\n{x1d_source_ids}\\n\")\n",
+ "        print(f\"Extension {wh_x1d} in {x1d_hdu[0].header['FILENAME']} contains the data for source {source_id} from our catalog\")\n",
+ "        print(f\"Extension {wh_cal} in {cal_hdu[0].header['FILENAME']} contains the data for source {source_id} from our catalog\")\n",
+ "\n",
+ "    return wh_x1d, wh_cal"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "id": "df474283-d660-4b5a-a4f8-9e5a6b187c30",
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "# Let's look for source 118 that we identified in the previous notebook.\n",
+ "source_id = 118\n",
+ "wh_x1d_118, wh_cal_118 = find_source_ext(x1d_hdu, cal_hdu, source_id)\n",
+ "\n",
+ "x1d_data_118 = x1d_hdu[wh_x1d_118].data\n",
+ "cal_data_118 = cal_hdu[wh_cal_118].data"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "a362fa9f-ae33-4dcc-954c-471a80b41d37",
+ "metadata": {},
+ "source": [
"First, let's look at the extraction box as it appears on the rate image. This will help us get a global feel for how much we can trust the pipeline's extraction. \n", + "\n", + "*Note: In this example, the extraction box isn't fully cenetered around the spectrum. There is an ongoing calibration effort to better account for difference in the spectral trace shape across the NIRISS detector. Updates about the status of this calibration can be found on the [NIRISS jdox](https://jwst-docs.stsci.edu/jwst-calibration-pipeline-caveats/jwst-wide-field-slitless-spectroscopy-pipeline-caveats#JWSTWideFieldSlitlessSpectroscopyPipelineCaveats-NIRISSWFSS)*" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "ddc84d02-8b67-440b-9380-53d5a27a5349", + "metadata": {}, + "outputs": [], + "source": [ + "# fill in the nan values from the bad pixels with zero to make it easier to look at\n", + "rate_data = rate_hdu[1].data\n", + "rate_data[np.isnan(rate_data)] = 0\n", + "\n", + "# extraction box parameters from the header of the cal data:\n", + "cal_header = cal_hdu[wh_cal_118].header\n", + "sx_left = cal_header['SLTSTRT1']\n", + "swidth = cal_header['SLTSIZE1']\n", + "sx_right = cal_header['SLTSTRT1'] + swidth\n", + "sy_bottom = cal_header['SLTSTRT2']\n", + "sheight = cal_header['SLTSIZE2']\n", + "sy_top = cal_header['SLTSTRT2'] + sheight\n", + "\n", + "# plot the rate file and the extraction box\n", + "fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(12, 5))\n", + "ax1.imshow(rate_data, origin='lower', vmin=0.2, vmax=1, aspect='auto') # the scaling may need some adjustment\n", + "ax2.imshow(rate_data, origin='lower', vmin=0.2, vmax=1, aspect='auto') # the scaling may need some adjustment\n", + "\n", + "rectangle = patches.Rectangle((sx_left, sy_bottom), swidth, sheight, edgecolor='darkorange', facecolor=\"None\", linewidth=1)\n", + "ax1.add_patch(rectangle)\n", + "ax1.set_title(rate_file)\n", + "\n", + "rectangle2 = patches.Rectangle((sx_left, sy_bottom), swidth, sheight, edgecolor='darkorange', facecolor=\"None\", linewidth=2)\n", + "ax2.add_patch(rectangle2)\n", + "ax2.set_xlim(sx_left-30, sx_right+30)\n", + "ax2.set_ylim(sy_bottom-30, sy_top+30)\n", + "ax2.set_title(f'Source {source_id} Zoom in')\n", + "\n", + "plt.suptitle(f\"{cal_hdu[0].header['FILTER']} {cal_hdu[0].header['PUPIL']}\")" + ] + }, + { + "cell_type": "markdown", + "id": "22a7c98b-0a1b-4d1f-9323-25eb08696e4d", + "metadata": {}, + "source": [ + "We can then take a look at the extracted spectrum in this box both in the cal file and the x1d file. In the extracted spectrum below you can see the [OII] and H$\\beta$ emission lines from the galaxy.\n", + "\n", + "*Note: The upturned edge effects seen in the 1-D spectrum are due to interpolation at the edges of the extraction box for the current flux calibration. This is also part of an ongoing calibration effort.*\n", + "\n", + "*Additional note: The default units of flux from the pipeline are in Jansky. However, in these extracted spectra we show units of erg/s/cm^2/Angstrom. 
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "id": "ce6bfa92-3ba5-4863-b36f-8cd3dbb4d704",
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "def Fnu_to_Flam(wave_micron, flux_jansky):\n",
+ "    # convert Jansky flux units to erg/s/cm^2/Angstrom with an input wavelength in microns:\n",
+ "    # F_lambda = F_nu * c / lambda^2, where 1 Jy = 1e-23 erg/s/cm^2/Hz; with c in m/s and\n",
+ "    # wavelengths converted from microns to Angstroms, the unit factors combine to 1e-21\n",
+ "    f_lambda = 1E-21 * flux_jansky * (const.c.value) / (wave_micron**2) # erg/s/cm^2/Angstrom\n",
+ "    return f_lambda"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "id": "87f80f3f-b2c9-4496-a1e9-0b8005b49f7c",
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "def plot_cutout_and_spectrum(cal_data, x1d_data, cal_file, x1d_file, source_id, convert=True):\n",
+ "    fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(11, 5))\n",
+ "\n",
+ "    # plot the cal image\n",
+ "    ax1.imshow(cal_data, origin='lower', vmin=0, vmax=np.nanmax(cal_data)*.01, aspect='auto')\n",
+ "    ax1.set_title(os.path.basename(cal_file))\n",
+ "\n",
+ "    # plot the spectrum\n",
+ "    wave = x1d_data['WAVELENGTH'].ravel()\n",
+ "    flux = x1d_data['FLUX'].ravel()\n",
+ "\n",
+ "    if convert:\n",
+ "        flux = Fnu_to_Flam(wave, flux)\n",
+ "        fluxunit = 'erg/s/cm^2/Angstrom'\n",
+ "    else:\n",
+ "        fluxunit = 'Jy'\n",
+ "\n",
+ "    ax2.plot(wave, flux)\n",
+ "    ax2.set_title(os.path.basename(x1d_file))\n",
+ "\n",
+ "    edge_buffer = int(len(flux) * .25)\n",
+ "    max_flux = np.nanmax(flux[edge_buffer:edge_buffer*-1])\n",
+ "    ax2.set_ylim(0, max_flux+(max_flux*0.1)) # cutting the flux of the edges & adding 10% buffer to the limits\n",
+ "    ax2.set_xlabel('Wavelength (Microns)')\n",
+ "    ax2.set_ylabel(f'Flux ({fluxunit})')\n",
+ "\n",
+ "    if fits.getval(cal_file, 'FILTER') == 'GR150C':\n",
+ "        ax1.set_xlabel('X Pixels (<--- dispersion direction)')\n",
+ "        ax1.set_ylabel('Y Pixels')\n",
+ "    else:\n",
+ "        ax1.set_xlabel('X Pixels')\n",
+ "        ax1.set_ylabel('Y Pixels (<--- dispersion direction)')\n",
+ "\n",
+ "    plt.suptitle(f\"{fits.getval(cal_file, 'FILTER')} {fits.getval(cal_file, 'PUPIL')} Source {source_id}\", x=0.5, y=1)"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "id": "ca35719a-8eb7-4ba3-87c8-82cb067da4b8",
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "plot_cutout_and_spectrum(cal_data_118, x1d_data_118, cal_file, x1d_file, source_id)"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "b4e0c7f9-2400-4324-a831-d614bb51c22f",
+ "metadata": {},
+ "source": [
+ "\n",
+ "### Limit the Number of Extracted Sources\n",
+ "\n",
+ "Because it takes so long to extract so many sources, let's see if we can pare down the number of sources being extracted. We'll do this with parameter inputs (for bright point sources) and with a further refined source catalog."
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "c815381d-b50b-404a-9941-e21696e20542",
+ "metadata": {},
+ "source": [
+ "#### Limit Extraction Using Parameter Inputs\n",
+ "For this calibration, we are explicitly setting parameters for the following step:\n",
+ "- `extract_2d`, where `wfss_nbright` limits the number of bright sources that are extracted and `wfss_mmag_extract` sets a limit on the faintest magnitude we want extracted. [Further Reading](https://jwst-pipeline.readthedocs.io/en/latest/jwst/extract_2d/main.html#nircam-and-niriss-wfss)\n",
+ "\n",
+ "In this case, we'll limit the extractions to only the 10 brightest objects.\n",
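+ "\n",
+ "The two parameters can also be combined in the same call; an illustrative (not prescriptive) combination might be:\n",
+ "```python\n",
+ "steps={'extract_2d': {'wfss_nbright': 10, 'wfss_mmag_extract': 25.0}}\n",
+ "```"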
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "id": "9fb2d1af-e529-497d-92db-390d17905eb9",
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "# check if the calibrated file already exists\n",
+ "asn_data = json.load(open(spec2_asns[0]))\n",
+ "x1d_file = os.path.join(param_outdir, f\"{asn_data['products'][0]['name']}_x1d.fits\")\n",
+ "\n",
+ "if os.path.exists(x1d_file):\n",
+ "    print(x1d_file, ': x1d file already exists.')\n",
+ "else:\n",
+ "    spec2 = Spec2Pipeline.call(spec2_asns[0],\n",
+ "                               steps={'extract_2d': {'wfss_nbright': 10}, },\n",
+ "                               save_results=True,\n",
+ "                               output_dir=param_outdir)"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "id": "5c2e9d47-0e79-4005-8aa6-d3eaf1264a33",
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "# open the x1d to examine how many sources were extracted\n",
+ "x1ds = glob.glob(os.path.join(param_outdir, '*x1d.fits*'))\n",
+ "with fits.open(x1ds[0]) as temp_x1d:\n",
+ "    print(f'The x1d file has {len(temp_x1d)-2} extracted sources')"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "43e624bc-ea7b-41e5-a923-8c62374c9b13",
+ "metadata": {},
+ "source": [
+ "#### Limit Extraction Using the Source Catalog\n",
+ "In the last notebook we limited the catalog in a couple of different ways. Here, let's limit the catalog to a specific magnitude range, and then use that new catalog to extract the data."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "id": "b1cbc550-6f54-4f2b-b887-617661fabd07",
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "source_match_cats = np.sort(glob.glob('*source-match_cat.ecsv'))\n",
+ "source_match_cat = Table.read(source_cat)\n",
+ "\n",
+ "# look at the range of magnitudes available in the catalog\n",
+ "mags = cat['isophotal_vegamag']\n",
+ "min_vegmag = mags.min()\n",
+ "max_vegmag = mags.max()\n",
+ "print(f\"Magnitude range: {min_vegmag} - {max_vegmag}\")\n",
+ "\n",
+ "# source 118 should have a Vega mag of ~21.18\n",
+ "source_id = 118\n",
+ "source_mag = source_match_cat[source_match_cat['label'] == source_id]['isophotal_vegamag'][0]\n",
+ "print(f\"Magnitude for source in previous notebook (source {source_id}) : {source_mag}\")"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "id": "2cd470d5-f0f7-4f81-adf9-52e05f7dadfa",
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "# limit the catalog to sources within a specific magnitude range (around 21.18, to make sure our example source is included)\n",
+ "min_mag = 21.1\n",
+ "max_mag = 21.2\n",
+ "mag_cat = source_match_cat[(source_match_cat['isophotal_vegamag'] >= min_mag) & (source_match_cat['isophotal_vegamag'] <= max_mag)]\n",
+ "\n",
+ "mag_cat['label', 'xcentroid', 'ycentroid', 'sky_centroid', 'is_extended', 'isophotal_abmag', 'isophotal_vegamag']"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "id": "3670bd3b-00c7-40cb-906d-b63e44c044d9",
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "mag_source_ext = 'mag-limit_cat.ecsv'\n",
+ "\n",
+ "new_cat = Table(mag_cat) # copy the magnitude-limited rows into a new Table\n",
+ "\n",
+ "# save the new catalog with a unique name\n",
+ "new_cat_name = source_cat.replace('cat.ecsv', mag_source_ext)\n",
+ "new_cat.write(new_cat_name, overwrite=True)\n",
+ "print('Saved:', new_cat_name)"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "6c2f2072-b19d-4aab-92d6-2e0570bda9d8",
+ "metadata": {},
+ "source": [
+ "Once we have a source catalog that we've limited to specific sources, let's match the remaining catalogs to those sources."
+ ]
+ },
+ {
"cell_type": "code", + "execution_count": null, + "id": "afc40e15-a182-4440-8d6a-920c0ea8f526", + "metadata": {}, + "outputs": [], + "source": [ + "for cat_name in source_match_cats:\n", + "\n", + " if cat_name == source_cat:\n", + " # the base one has already been saved\n", + " continue\n", + "\n", + " # match the source IDs between the current catalog and the base catalog above\n", + " cat = Table.read(cat_name)\n", + " cat.add_index('label')\n", + " new_cat = cat.loc[list(mag_cat['label'])]\n", + "\n", + " # check to ensure the sources are all there\n", + " print(repr(new_cat['label', 'xcentroid', 'ycentroid', 'sky_centroid', 'is_extended', 'isophotal_abmag', 'isophotal_vegamag']))\n", + " \n", + " # save the new catalog with a unique name\n", + " new_cat_name = cat_name.replace('cat.ecsv', mag_source_ext)\n", + " new_cat.write(new_cat_name, overwrite=True)\n", + " print('Saved:', new_cat_name)\n", + " print()" + ] + }, + { + "cell_type": "markdown", + "id": "570bc05e-f84f-42ab-9067-ea9dd072de75", + "metadata": {}, + "source": [ + "Once the new catalogs have been made, we have to update the association files to use the new catalogs. **Note: the association files will need to be updated again if you want to calibrate again with the previous source catalogs**" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "ced9ad72-78d6-4718-8de4-1e5e108db6dc", + "metadata": {}, + "outputs": [], + "source": [ + "new_source_ext = mag_source_ext\n", + "\n", + "# loop through all of the spec2 association files in the current directory\n", + "for asn in spec2_asns:\n", + " asn_data = json.load(open(asn))\n", + " for product in asn_data['products']: # for every product, check the members \n", + " for member in product['members']: # there are multiple members per product\n", + " if member['exptype'] == 'sourcecat':\n", + " cat_in_asn = member['expname']\n", + " # check that we haven't already updated the source catalog name\n", + " if new_source_ext not in cat_in_asn:\n", + " new_cat = cat_in_asn.replace('cat.ecsv', new_source_ext)\n", + " # actually update the association file member\n", + " if os.path.exists(new_cat):\n", + " member['expname'] = new_cat\n", + " else:\n", + " print(f\"{new_cat} does not exist in the currenty working directory\")\n", + "\n", + " # write out the new association file\n", + " with open(asn, 'w', encoding='utf-8') as f:\n", + " json.dump(asn_data, f, ensure_ascii=False, indent=4)" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "2e839a1a-5c23-4185-8033-e43d0b367037", + "metadata": { + "scrolled": true + }, + "outputs": [], + "source": [ + "# double check that the source catalog has been changed\n", + "for spec2_asn in spec2_asns:\n", + " asn_check = json.load(open(spec2_asn))\n", + " for product in asn_check['products']:\n", + " for key, value in product.items():\n", + " if key == 'members':\n", + " for member in value:\n", + " if member['exptype'] == 'sourcecat':\n", + " print(f\" {member['exptype']}: {member['expname']}\")\n", + " else:\n", + " print(f\"{key}: {value}\")" + ] + }, + { + "cell_type": "markdown", + "id": "c5dd5d4c-5ef0-4690-a4da-a795c295a313", + "metadata": {}, + "source": [ + "Now when we calibrate everything, for a single file it should take a lot less time because there are a limited number of sources. However, we will calibrate all of the files in this visit, so this cell might take a bit of time to run." 
+ ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "6b7dad07-7d44-4ebe-841d-3f686b9df016", + "metadata": {}, + "outputs": [], + "source": [ + "# calibrate with the new source catalog\n", + "for spec2_asn in spec2_asns:\n", + " # check if the calibrated file already exists\n", + " asn_data = json.load(open(spec2_asn))\n", + " x1d_file = os.path.join(source_outdir, f\"{asn_data['products'][0]['name']}_x1d.fits\")\n", + " \n", + " if os.path.exists(x1d_file):\n", + " print(x1d_file, ': x1d file already exists.')\n", + " else:\n", + " spec2 = Spec2Pipeline.call(spec2_asn, save_results=True, output_dir=source_outdir)" + ] + }, + { + "cell_type": "markdown", + "id": "d03794fa-eee5-402a-af65-96378351275d", + "metadata": {}, + "source": [ + "\n", + "### Final Visualization\n", + "\n", + "Now that everything has been calibrated, it's useful to look at all of the extracted files. The cal and x1d files from spec2 are extracted at each dither step, which is shown below. Spec3 then turns the individual dither x1d files into a single combined spectrum per grism and filter for each source.\n", + "\n", + "Note that for GR150R data, the dispersion direction is in the -Y direction, and for GR150C data, the dispersion direction is in the -X direction." + ] + }, + { + "cell_type": "markdown", + "id": "83e7c35f-cf6e-4d30-be38-5b16c80c1487", + "metadata": {}, + "source": [ + "##### Look at all of the Files for a Single Source" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "94aec550-ea1f-4215-8965-bcd2ac408bfd", + "metadata": {}, + "outputs": [], + "source": [ + "# Explore the new Data\n", + "x1ds = np.sort(glob.glob(os.path.join(source_outdir, \"*x1d.fits\")))\n", + "\n", + "# get a list of all of the source IDs from the first file to look at for this example\n", + "sources = [fits.getval(x1ds[0], 'SOURCEID', ext=ext) for ext in range(len(fits.open(x1ds[0])))[1:-1]]\n", + "source_id = 118\n", + "\n", + "# plot each x1d/cal file\n", + "for i, x1d_file in enumerate(x1ds):\n", + " cal_file = x1d_file.replace('x1d.fits', 'cal.fits')\n", + " with fits.open(x1d_file) as x1d_hdu, fits.open(cal_file) as cal_hdu:\n", + "\n", + " try:\n", + " wh_x1d, wh_cal = find_source_ext(x1d_hdu, cal_hdu, source_id, info=False)\n", + " except IndexError:\n", + " # this means the source isn't in this observation\n", + " continue\n", + "\n", + " x1d_data = x1d_hdu[wh_x1d].data \n", + " cal_data = cal_hdu[wh_cal].data\n", + "\n", + " plot_cutout_and_spectrum(cal_data, x1d_data, cal_file, x1d_file, source_id)" + ] + }, + { + "cell_type": "markdown", + "id": "08135225-8662-4b89-8b2d-e422ae087e73", + "metadata": {}, + "source": [ + "Overplot these files on top of each other to compare. The two grisms will be different line styles to draw attention to any differences that could be due to the calibration, including contamination, and each blocking filter will be a different color." 
+ ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "d0fc37eb-a7c8-4083-82a6-80e329f1fa3c", + "metadata": {}, + "outputs": [], + "source": [ + "# Overplot the different grisms on top of each other for the same source\n", + "x1ds = np.sort(glob.glob(os.path.join(source_outdir, \"*x1d.fits\")))\n", + "\n", + "# get a list of all of the source IDs from the first file to look at for this example\n", + "sources = [fits.getval(x1ds[0], 'SOURCEID', ext=ext) for ext in range(len(fits.open(x1ds[0])))[1:-1]]\n", + "source_id = 118\n", + "\n", + "# create a figure with three panels\n", + "src118_fig, (ax1, ax2, ax3) = plt.subplots(3, 1, figsize=(11, 9), sharex=True, sharey=True)\n", + "\n", + "ls_dict = {'GR150R': '--',\n", + " 'GR150C': '-',\n", + " }\n", + "\n", + "color_dict = {'F115W': '#e1cb00',\n", + " 'F150W': '#32b45c',\n", + " 'F200W': '#099ab4',\n", + " }\n", + "\n", + "max_fluxes = []\n", + "all_waves = []\n", + "# plot each x1d file\n", + "for i, x1d_file in enumerate(x1ds):\n", + " with fits.open(x1d_file) as x1d_hdu:\n", + " try:\n", + " wh_x1d, wh_cal = find_source_ext(x1d_hdu, cal_hdu, source_id, info=False)\n", + " except IndexError:\n", + " # this means the source isn't in this observation\n", + " continue\n", + "\n", + " x1d_data = x1d_hdu[wh_x1d].data \n", + " grism = x1d_hdu[0].header['FILTER']\n", + " filter = x1d_hdu[0].header['PUPIL']\n", + "\n", + " wave = x1d_data['WAVELENGTH'].ravel()\n", + " flux = x1d_data['FLUX'].ravel()\n", + "\n", + " flux = Fnu_to_Flam(wave, flux)\n", + " fluxunits = 'erg/s/cm^2/Angstrom'\n", + "\n", + " ax1.plot(wave, flux, color=color_dict[filter], ls=ls_dict[grism], alpha=0.7)\n", + " if grism == 'GR150C':\n", + " ax2.plot(wave, flux, color=color_dict[filter], ls=ls_dict[grism], alpha=0.7)\n", + " else:\n", + " ax3.plot(wave, flux, color=color_dict[filter], ls=ls_dict[grism], alpha=0.7)\n", + "\n", + " # save the maximum fluxes for plotting, removing any edge effects\n", + " edge_buffer = int(len(flux) * .25)\n", + " max_fluxes.append(np.nanmax(flux[edge_buffer:edge_buffer*-1]))\n", + " all_waves.extend(wave)\n", + "\n", + "# plot limits & labels\n", + "ax1.set_ylim(0, np.max(max_fluxes))\n", + "ax1.set_xlim(np.min(all_waves), np.max(all_waves))\n", + "\n", + "src118_fig.suptitle(f'Source {source_id}')\n", + "ax1.set_title('GR150R & GR150C')\n", + "ax2.set_title('GR150C')\n", + "ax3.set_title('GR150R')\n", + "\n", + "for ax in [ax1, ax2, ax3]:\n", + " ax.set_xlabel('Wavelength (Microns)')\n", + " ax.set_ylabel(f'Flux ({fluxunits})')\n", + "\n", + "# label for each of the filters\n", + "for filt, color in color_dict.items():\n", + " ax1.plot(0, 0, color=color, label=filt)\n", + "\n", + "ax1.legend(bbox_to_anchor=(1, 1.05))\n", + "src118_fig.tight_layout()" + ] + }, + { + "cell_type": "markdown", + "id": "123375a9-4660-4205-9128-4b077bfc06dc", + "metadata": {}, + "source": [ + "##### Look at all of the sources for a single file\n", + "\n", + "Note that some sources might not actually be extracting anything interesting. If this is the case, go back to your source catalog and images to ensure that you have the correct source identified and the target is centered in the cutout. This notebook uses simple examples to create and limit the source catalog, so it might not always show the most scientifically interesting sources." 
+ ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "aa3b5777-4368-41fa-942f-c3a6f003d1a3", + "metadata": {}, + "outputs": [], + "source": [ + "x1d_file = os.path.join(source_outdir, 'jw02079004002_11101_00002_nis_x1d.fits') # this is a nice spectrum of 118 above\n", + "cal_file = x1d_file.replace('x1d.fits', 'cal.fits')\n", + "with fits.open(x1d_file) as x1d_hdu, fits.open(cal_file) as cal_hdu:\n", + "\n", + " for ext in range(len(x1d_hdu))[1:-1]:\n", + "\n", + " source_id = x1d_hdu[ext].header['SOURCEID']\n", + "\n", + " try:\n", + " wh_x1d, wh_cal = find_source_ext(x1d_hdu, cal_hdu, source_id, info=False)\n", + " except IndexError:\n", + " # this means the source isn't in this observation\n", + " continue\n", + " \n", + " x1d_data = x1d_hdu[wh_x1d].data \n", + " cal_data = cal_hdu[wh_cal].data\n", + " \n", + " plot_cutout_and_spectrum(cal_data, x1d_data, cal_file, x1d_file, source_id)" + ] + }, + { + "cell_type": "markdown", + "id": "13a0285f-edb3-4796-8377-774a65d02ecb", + "metadata": {}, + "source": [ + "##### Visualize where the extracted sources are on the dispersed image, and how that compares to the direct image" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "d87810b9-efc5-4554-a2c0-11b5338c7b8b", + "metadata": {}, + "outputs": [], + "source": [ + "# setting up the figure\n", + "fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(11, 5), sharex=True, sharey=True)\n", + "\n", + "# **grism data\n", + "\n", + "# for the dispersed image plot\n", + "# x1d_file and cal_file use the same root as we are looking at for a single source\n", + "rate_file = os.path.basename(x1d_file.replace('x1d.fits', 'rate.fits'))\n", + "\n", + "# fill in the nan values from the bad pixels with zero to make it easier to look at\n", + "with fits.open(rate_file) as rate_hdu:\n", + " rate_data = rate_hdu[1].data\n", + "rate_data[np.isnan(rate_data)] = 0\n", + "\n", + "# plot the rate file and the extraction box\n", + "ax1.imshow(rate_data, origin='lower', vmin=0, vmax=np.nanmax(rate_data)*0.01, aspect='auto')\n", + "\n", + "with fits.open(x1d_file) as x1d_hdu, fits.open(cal_file) as cal_hdu:\n", + "\n", + " for ext in range(len(x1d_hdu))[1:-1]:\n", + "\n", + " source_id = x1d_hdu[ext].header['SOURCEID']\n", + "\n", + " try:\n", + " wh_x1d, wh_cal = find_source_ext(x1d_hdu, cal_hdu, source_id, info=False)\n", + " except IndexError:\n", + " # this means the source isn't in this observation\n", + " continue\n", + " \n", + " x1d_data = x1d_hdu[wh_x1d].data \n", + " cal_data = cal_hdu[wh_cal].data\n", + " \n", + " # extraction box parameters from the header of the cal data:\n", + " cal_header = cal_hdu[wh_cal].header\n", + " sx_left = cal_header['SLTSTRT1']\n", + " swidth = cal_header['SLTSIZE1']\n", + " sx_right = cal_header['SLTSTRT1'] + swidth\n", + " sy_bottom = cal_header['SLTSTRT2']\n", + " sheight = cal_header['SLTSIZE2']\n", + " sy_top = cal_header['SLTSTRT2'] + sheight\n", + " \n", + " rectangle = patches.Rectangle((sx_left, sy_bottom), swidth, sheight, edgecolor='darkred', facecolor=\"None\", linewidth=1)\n", + " ax1.add_patch(rectangle)\n", + " ax1.text(sx_left, sy_top+10, source_id, fontsize=12, color='darkred')\n", + " \n", + " ax1.set_title(f\"{os.path.basename(x1d_file).split('_nis')[0]}: {cal_hdu[0].header['FILTER']} {cal_hdu[0].header['PUPIL']}\")\n", + "\n", + "# **imaging data\n", + "asn_data = json.load(open(fits.getval(x1d_file, 'ASNTABLE')))\n", + "i2d_name = asn_data['products'][0]['members'][1]['expname']\n", + "cat_name = 
asn_data['products'][0]['members'][2]['expname']\n", + "with fits.open(os.path.join('../../', data_dir, custom_run_image3, i2d_name)) as i2d:\n", + " ax2.imshow(i2d[1].data, origin='lower', aspect='auto', vmin=0, vmax=np.nanmax(i2d[1].data)*0.01)\n", + " ax2.set_title(f\"{os.path.basename(i2d_name).split('_i2d')[0]}\")\n", + "\n", + "# also plot the associated catalog\n", + "cat = Table.read(cat_name)\n", + "extended_sources = cat[cat['is_extended'] == 1] # 1 is True; i.e. is extended\n", + "point_sources = cat[cat['is_extended'] == 0] # 0 is False; i.e. is a point source\n", + "\n", + "for color, sources in zip(['darkred', 'black'], [extended_sources, point_sources]):\n", + " # plotting the sources\n", + " ax2.scatter(sources['xcentroid'], sources['ycentroid'], s=40, facecolors='None', edgecolors=color, alpha=0.9, lw=2)\n", + "\n", + " # adding source labels \n", + " for i, source_num in enumerate(sources['label']):\n", + " ax2.annotate(source_num, \n", + " (sources['xcentroid'][i]+0.5, sources['ycentroid'][i]+0.5), \n", + " fontsize=12,\n", + " color=color)\n", + "\n", + "fig.tight_layout()" + ] + }, + { + "cell_type": "markdown", + "id": "b115e0b7-459b-4688-a27e-b8a0afb22da3", + "metadata": {}, + "source": [ + "Continue to explore further, including using the [spec3 stage](https://jwst-pipeline.readthedocs.io/en/latest/jwst/pipeline/calwebb_spec3.html) of the pipeline!" + ] + }, + { + "cell_type": "markdown", + "id": "d252a65a-1be8-4e0b-ae67-80cac8fc4770", + "metadata": {}, + "source": [ + "\"Space" + ] + } + ], + "metadata": { + "kernelspec": { + "display_name": "Python 3 (ipykernel)", + "language": "python", + "name": "python3" + }, + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 3 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython3", + "version": "3.11.7" + } + }, + "nbformat": 4, + "nbformat_minor": 5 +} diff --git a/notebooks/NIRISS_WFSS_advanced/extra_niriss_individual_steps.ipynb b/notebooks/NIRISS_WFSS_advanced/extra_niriss_individual_steps.ipynb new file mode 100644 index 000000000..e8d5d0b32 --- /dev/null +++ b/notebooks/NIRISS_WFSS_advanced/extra_niriss_individual_steps.ipynb @@ -0,0 +1,309 @@ +{ + "cells": [ + { + "cell_type": "markdown", + "id": "28608b44", + "metadata": {}, + "source": [ + "# Running Individual Pipeline Steps\n", + "\n", + "This notebook walks through calibrating the data with individual pipeline steps (AssignWCS and FlatField) rather than running the entire pipeline stage.\n", + "\n", + "**Use case**: When using a package outside of the standard JWST pipeline, there may be certain steps that are still helpful to utilize within the JWST pipeline. Here we show the most commonly run individual steps, AssignWCS and FlatField. AssignWCS determines and stores the WCS (World Coordinate System) information, and FlatField removes [detector features](https://jwst-docs.stsci.edu/known-issues-with-jwst-data/niriss-known-issues#NIRISSKnownIssues-artifactsArtifacts).
\n", + "**Data**: JWST/NIRISS images and spectra from program 2079.
\n", + "**Tools**: astropy, crds, glob, jwst, matplotlib, numpy, os, urllib, zipfile
\n", + "**Cross-instrument**: NIRISS
\n", + "\n", + "**Content**\n", + "- [Imports & Data Setup](#imports)\n", + "- [Running Individual Pipeline Steps](#pipeline_steps)\n", + " - [Assign WCS Step](#wcs_step)\n", + " - [Flat Field Step](#ff_step)\n", + " - [Compare Rate vs. Flat Fielded Data](#compare)\n", + "\n", + "**Author**: Rachel Plesha (rplesha@stsci.edu), Camilla Pacifici (cpacifici@stsci.edu)
\n", + "**Last modified**: May 2024
\n", + "**Last tested**: This notebook was last tested with JWST pipeline version 1.12.5 and the CRDS context jwst_1229.pmap." + ] + }, + { + "cell_type": "markdown", + "id": "5ebb20f2-fb41-42c6-96a6-b35dfe77bab7", + "metadata": {}, + "source": [ + "\n", + "## Imports & Data Setup" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "54802ba1-873a-4a62-9db1-84f60d1677e0", + "metadata": {}, + "outputs": [], + "source": [ + "# Update the CRDS path to your local directory\n", + "%env CRDS_PATH=crds_cache\n", + "%env CRDS_SERVER_URL=https://jwst-crds.stsci.edu" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "216f4a35", + "metadata": {}, + "outputs": [], + "source": [ + "import os\n", + "import glob\n", + "import urllib\n", + "import zipfile\n", + "import numpy as np\n", + "\n", + "from astropy.io import fits\n", + "\n", + "from matplotlib import pyplot as plt\n", + "%matplotlib inline\n", + "\n", + "from jwst.assign_wcs import AssignWcsStep\n", + "from jwst.flatfield import FlatFieldStep" + ] + }, + { + "cell_type": "markdown", + "id": "3c208323-56f4-4fb5-b8c0-750b65b525aa", + "metadata": {}, + "source": [ + "Check what version of the JWST pipeline you are using. To see what the latest version of the pipeline is available or install a previous version, check [GitHub](https://github.com/spacetelescope/jwst#software-vs-dms-build-version-map). Also verify what [CRDS context](https://jwst-crds.stsci.edu/) you are using. [CRDS documentation](https://jwst-pipeline.readthedocs.io/en/latest/jwst/user_documentation/reference_files_crds.html) explains how to set a specific context to use in the JWST pipeline. If either of these values are different from the last tested note above there may be differences in this notebook." 
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "b649638b-0353-4b4a-874d-949fc6610876",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "import jwst\n",
+    "import crds\n",
+    "print('JWST Pipeline Version:', jwst.__version__)\n",
+    "print('CRDS Context:', crds.get_context_name('jwst'))"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "be56a062-bec7-4aff-922a-eaec53bdb346",
+   "metadata": {},
+   "source": [
+    "#### Data setup\n",
+    "Here we download and unpack the zip file that contains all of the rate files, and we create an output directory for the calibrated files if one does not already exist."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "48baec42-7aee-4731-8048-1054c9a127f4",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "data_dir_in = 'data' # directory where the rate files should be\n",
+    "data_dir_out = 'data/calibrated_steps/' # directory where the calibrated files will be saved\n",
+    "\n",
+    "# create the input and output directories if they do not already exist\n",
+    "for datadir in [data_dir_in, data_dir_out]:\n",
+    "    if not os.path.exists(datadir):\n",
+    "        os.makedirs(datadir)"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "21167b5c-3654-415d-ba9f-b5484e367c57",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "# Download uncalibrated data from Box into the data directory:\n",
+    "boxlink = 'https://data.science.stsci.edu/redirect/JWST/jwst-data_analysis_tools/niriss_wfss_advanced/niriss_wfss_extra_input.zip'\n",
+    "boxfile = os.path.basename(boxlink)\n",
+    "urllib.request.urlretrieve(boxlink, boxfile)\n",
+    "\n",
+    "with zipfile.ZipFile(boxfile, 'r') as zf:\n",
+    "    zf.extractall(path=data_dir_in)\n",
+    "\n",
+    "# move the files downloaded from the box file into the top level data directory\n",
+    "box_download_dir = os.path.join(data_dir_in, boxfile.split('.zip')[0])\n",
+    "for filename in glob.glob(os.path.join(box_download_dir, '*')):\n",
+    "    if '.csv' in filename:\n",
+    "        # move to the current directory\n",
+    "        os.rename(filename, os.path.basename(filename))\n",
+    "    else:\n",
+    "        # move to the data directory\n",
+    "        os.rename(filename, os.path.join(data_dir_in, os.path.basename(filename)))\n",
+    "\n",
+    "# remove the leftover zip file and the now-empty download directory\n",
+    "os.remove(boxfile)\n",
+    "os.rmdir(box_download_dir)"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "d9bfb726-26b5-46c1-99bf-3c86e0c9e3bb",
+   "metadata": {},
+   "source": [
+    "\n",
+    "## Running Individual Pipeline Steps"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "da194621-1d85-4481-922a-118ec1f859ec",
+   "metadata": {},
+   "source": [
+    "While you could look at the rate images directly, it is worth first running the files through the `assign_wcs` and `flat_field` steps of the pipeline to clean up detector artifacts."
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "0c83aab3-75de-4a9b-be3c-28669b4c0044",
+   "metadata": {},
+   "source": [
+    "\n",
+    "#### Assign WCS Step\n",
+    "\n",
+    "The `assign_wcs` step of the pipeline is a critical part of obtaining the correct spectral trace cutouts for WFSS images. To read more about the step, visit the [AssignWCS description page](https://jwst-pipeline.readthedocs.io/en/latest/jwst/assign_wcs/main.html)."
+   ]
+  },
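+  {
+   "cell_type": "markdown",
+   "id": "a1b2c3d4-0003-4aaa-8aaa-333333333333",
+   "metadata": {},
+   "source": [
+    "Below we run the step on a single file to keep the runtime short. If you instead want to calibrate all of the rate files at once, a loop like the commented-out sketch in the next cell works; it assumes all of the rate files live in `data_dir_in`."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "a1b2c3d4-0004-4aaa-8aaa-444444444444",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "# Sketch: run AssignWCS on every rate file in the input directory.\n",
+    "# Not executed here; the next cell calibrates a single file for demonstration.\n",
+    "# for rf in sorted(glob.glob(os.path.join(data_dir_in, '*_rate.fits'))):\n",
+    "#     AssignWcsStep.call(rf, output_dir=data_dir_out, save_results=True)"
+   ]
+  },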
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "66548075-1cf6-4d58-80ef-833f46c021a4",
+   "metadata": {
+    "scrolled": true
+   },
+   "outputs": [],
+   "source": [
+    "# Run assign_wcs; we are only running it on one file for demonstration here\n",
+    "ratefile = os.path.join(data_dir_in, 'jw02079004002_02101_00001_nis_rate.fits')\n",
+    "result = AssignWcsStep.call(ratefile, output_dir=data_dir_out, save_results=True)"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "9f97a814-64e6-44a9-bd35-4db7cb1977a7",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "# A quick sanity check to ensure that the files were calibrated.\n",
+    "# If this is zero, check the log messages above for any errors that may have occurred during the calibration.\n",
+    "wcsstep_files = glob.glob(os.path.join(data_dir_out, '*assignwcsstep*'))\n",
+    "print(len(wcsstep_files), 'assignwcsstep files written') # 1 file should have been written"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "ae590a10-cc87-46a5-adfa-829f67549ec4",
+   "metadata": {},
+   "source": [
+    "\n",
+    "#### Flat Field Step\n",
+    "\n",
+    "After `assign_wcs` has been run, we then run the `flat_field` step of the pipeline, which removes detector artifacts using the flat-field reference files. To read more about the step, visit the [FlatField description page](https://jwst-pipeline.readthedocs.io/en/latest/jwst/flatfield/main.html)."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "b6fd8419-841f-426d-aed2-31e7038c819c",
+   "metadata": {
+    "scrolled": true
+   },
+   "outputs": [],
+   "source": [
+    "# Run flat_field on all of the assign_wcs output files\n",
+    "for wcsfile in wcsstep_files:\n",
+    "    result = FlatFieldStep.call(wcsfile, output_dir=data_dir_out, save_results=True)"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "f7a250d7-85c8-4fec-8e7d-929027811ca0",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "# A quick sanity check to ensure that the files were calibrated.\n",
+    "# If this is zero, check the log messages above for any errors that may have occurred during the calibration.\n",
+    "flatfield_files = glob.glob(os.path.join(data_dir_out, '*flatfieldstep*'))\n",
+    "print(len(flatfield_files), 'flatfieldstep files written') # 1 file should have been written (matching the wcs step)"
+   ]
+  },
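+  {
+   "cell_type": "markdown",
+   "id": "a1b2c3d4-0005-4aaa-8aaa-555555555555",
+   "metadata": {},
+   "source": [
+    "As a quick provenance check, the calibrated products record which reference files were applied in their primary headers. The sketch below assumes the flat-field reference filename is stored under the standard `R_FLAT` keyword."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "a1b2c3d4-0006-4aaa-8aaa-666666666666",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "# Print which flat-field reference file CRDS selected for each output\n",
+    "# (assumes the standard R_FLAT primary-header keyword)\n",
+    "for ff in flatfield_files:\n",
+    "    print(os.path.basename(ff), '->', fits.getval(ff, 'R_FLAT'))"
+   ]
+  },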
+  {
+   "cell_type": "markdown",
+   "id": "cd2eee20-b5c9-4a77-a9e0-e23eed1baa63",
+   "metadata": {},
+   "source": [
+    "\n",
+    "#### Compare Rate vs. Flat Fielded Data\n",
+    "Running the cell below shows the same direct image as it appears in the rate file versus after the `flat_field` step of the pipeline has been run. Some detector artifacts are noticeably gone, such as the [cross-hatching](https://jwst-docs.stsci.edu/known-issues-with-jwst-data/niriss-known-issues#NIRISSKnownIssues-Cross-hatching) in the bottom right and middle of the detector.\n",
+    "\n",
+    "Remaining optical artifacts to be aware of, such as the 1/f noise, are described on the [NIRISS known issues page](https://jwst-docs.stsci.edu/known-issues-with-jwst-data/niriss-known-issues)."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "dde1dfc3-c2af-4797-b97f-16ce249212cb",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "test_rate_file = ratefile # look at a direct image for this comparison\n",
+    "test_flat_file = os.path.join(data_dir_out, os.path.basename(test_rate_file).replace('rate.fits', 'flatfieldstep.fits'))\n",
+    "\n",
+    "plot_files = [test_rate_file, test_flat_file]\n",
+    "plot_titles = ['Rate File', 'Flat Corrected File']\n",
+    "\n",
+    "fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(15, 15))\n",
+    "fig.suptitle(f\"{os.path.basename(test_rate_file).split('_rate')[0]}\n{fits.getval(test_rate_file, 'PUPIL')}\", x=0.5, y=0.72)\n",
+    "\n",
+    "for filename, title, ax in zip(plot_files, plot_titles, [ax1, ax2]):\n",
+    "    with fits.open(filename) as hdu:\n",
+    "        # fill in the nan values from the bad pixels with zero; otherwise a single, non-dithered image is hard to inspect\n",
+    "        data = hdu[1].data\n",
+    "        data[np.isnan(data)] = 0\n",
+    "\n",
+    "        ax.imshow(data, vmin=0.2, vmax=1.2, origin='lower')\n",
+    "        ax.set_title(title)"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "3cb991cf-af50-4189-8474-19adcb5768d8",
+   "metadata": {},
+   "source": [
+    "\"Space
+   ]
+  }
+ ],
+ "metadata": {
+  "kernelspec": {
+   "display_name": "Python 3 (ipykernel)",
+   "language": "python",
+   "name": "python3"
+  },
+  "language_info": {
+   "codemirror_mode": {
+    "name": "ipython",
+    "version": 3
+   },
+   "file_extension": ".py",
+   "mimetype": "text/x-python",
+   "name": "python",
+   "nbconvert_exporter": "python",
+   "pygments_lexer": "ipython3",
+   "version": "3.11.7"
+  }
+ },
+ "nbformat": 4,
+ "nbformat_minor": 5
+}
diff --git a/notebooks/NIRISS_WFSS_advanced/requirements.txt b/notebooks/NIRISS_WFSS_advanced/requirements.txt
new file mode 100644
index 000000000..dbb55021e
--- /dev/null
+++ b/notebooks/NIRISS_WFSS_advanced/requirements.txt
@@ -0,0 +1,9 @@
+astropy >= 4.3.1
+astroquery >= 0.4.7
+crds
+glob2
+jdaviz
+jwst >= 1.12.5
+matplotlib
+numpy
+pandas