diff --git a/docs/notebooks/0_introduction.ipynb b/docs/notebooks/0_introduction.ipynb index 3c2bf8d..10c2a55 100644 --- a/docs/notebooks/0_introduction.ipynb +++ b/docs/notebooks/0_introduction.ipynb @@ -4,8 +4,8 @@ "cell_type": "markdown", "metadata": {}, "source": [ - " \n", - " \n", + " \n", + " \n", "\n", "# Photo-z Server\n", "## Tutorial Notebook 0 - Introduction\n", @@ -18,7 +18,7 @@ "source": [ "Contact author: [Julia Gschwend](mailto:julia@linea.org.br)\n", "\n", - "Last verified run: **2024-Jul-04**" + "Last verified run: **2024-Jul-09**" ] }, { @@ -97,7 +97,7 @@ "metadata": {}, "source": [ "
\n", - " \n", + " \n", "
\n" ] }, @@ -120,8 +120,24 @@ "\n", "```\n", "$ pip install pzserver \n", - "``` \n", - "\n", + "``` " + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "! pip install pzserver " + ] + }, + { + "cell_type": "markdown", + "metadata": { + "jp-MarkdownHeadingCollapsed": true + }, + "source": [ "**For developers** \n", "\n", "Alternatively, if you have cloned the repository with:\n", @@ -137,7 +153,7 @@ "```\n", "\n", "--- \n", - "OBS: You might need to restart the kernel on the notebook to incorporate the new library. \n" + "OBS: You might need to restart the kernel on the notebook to incorporate the new library. " ] }, { @@ -272,6 +288,7 @@ "cell_type": "code", "execution_count": null, "metadata": { + "scrolled": true, "tags": [] }, "outputs": [], @@ -323,6 +340,7 @@ "cell_type": "code", "execution_count": null, "metadata": { + "scrolled": true, "tags": [] }, "outputs": [], @@ -629,9 +647,9 @@ ], "metadata": { "kernelspec": { - "display_name": "Python 3 (ipykernel)", + "display_name": "LSST", "language": "python", - "name": "python3" + "name": "lsst" }, "language_info": { "codemirror_mode": { @@ -643,7 +661,7 @@ "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", - "version": "3.10.10" + "version": "3.11.7" }, "nbsphinx": { "execute": "never" diff --git a/docs/notebooks/1_specz_catalogs.ipynb b/docs/notebooks/1_specz_catalogs.ipynb index a61b927..f38f26f 100644 --- a/docs/notebooks/1_specz_catalogs.ipynb +++ b/docs/notebooks/1_specz_catalogs.ipynb @@ -4,8 +4,8 @@ "cell_type": "markdown", "metadata": {}, "source": [ - " \n", - " \n", + " \n", + " \n", "\n", "# Photo-z Server\n", "## Tutorial Notebook 1 - Spec-z Catalogs\n" @@ -228,9 +228,9 @@ ], "metadata": { "kernelspec": { - "display_name": "Python 3 (ipykernel)", + "display_name": "LSST", "language": "python", - "name": "python3" + "name": "lsst" }, "language_info": { "codemirror_mode": { @@ -242,7 +242,7 @@ "name": 
"python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", - "version": "3.10.10" + "version": "3.11.7" }, "nbsphinx": { "execute": "never" diff --git a/docs/notebooks/2_training_sets.ipynb b/docs/notebooks/2_training_sets.ipynb index e4f839f..e13936a 100644 --- a/docs/notebooks/2_training_sets.ipynb +++ b/docs/notebooks/2_training_sets.ipynb @@ -4,8 +4,8 @@ "cell_type": "markdown", "metadata": {}, "source": [ - " \n", - " \n", + " \n", + " \n", "\n", "# Photo-z Server\n", "## Tutorial Notebook 2 - Training Sets\n" @@ -208,9 +208,9 @@ ], "metadata": { "kernelspec": { - "display_name": "Python 3 (ipykernel)", + "display_name": "LSST", "language": "python", - "name": "python3" + "name": "lsst" }, "language_info": { "codemirror_mode": { @@ -222,7 +222,7 @@ "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", - "version": "3.10.10" + "version": "3.11.7" }, "nbsphinx": { "execute": "never" diff --git a/docs/notebooks/images/ScreenshotProductListButtons.png b/docs/notebooks/images/ScreenshotProductListButtons.png new file mode 100644 index 0000000..256d5ba Binary files /dev/null and b/docs/notebooks/images/ScreenshotProductListButtons.png differ diff --git a/docs/notebooks/images/linea.png b/docs/notebooks/images/linea.png index 599c039..83942f0 100644 Binary files a/docs/notebooks/images/linea.png and b/docs/notebooks/images/linea.png differ diff --git a/docs/notebooks/images/linea_darkblue.png b/docs/notebooks/images/linea_darkblue.png new file mode 100644 index 0000000..599c039 Binary files /dev/null and b/docs/notebooks/images/linea_darkblue.png differ diff --git a/docs/notebooks/intro_notebook.ipynb b/docs/notebooks/intro_notebook.ipynb deleted file mode 100644 index db95dc6..0000000 --- a/docs/notebooks/intro_notebook.ipynb +++ /dev/null @@ -1,1121 +0,0 @@ -{ - "cells": [ - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - " \n", - " \n", - "\n", - "# Photo-z Server - Tutorial Notebook\n", - "\n", - "Contact 
author: Julia Gschwend ([julia@linea.org.br](mailto:julia@linea.org.br))\n", - "\n", - "Last verified run: 2024-Jun-25
" - ] - }, - { - "cell_type": "markdown", - "metadata": { - "tags": [] - }, - "source": [ - "
\n", - "\n", - "# Notebook contents\n", - "\n", - "- PZ Server\n", - " - [Introduction](#introduction) \n", - " - [How to upload a data product to the PZ Server](#how-to-upload-a-data-product-to-the-pz-server)\n", - " - [How to download a data product from the PZ Server](#how-to-download-a-data-product-from-the-pz-server)\n", - "- PZ Server API (Python library pzserver)\n", - "- PZ Server API (Python library pzserver)\n", - " - [How to get general info from PZ Server](#how-to-get-general-info-from-pz-server)\n", - " - [How to display the metadata of a data product](#how-to-display-the-metadata-of-a-data-product)\n", - " - [How to download data products as .zip files](#how-to-download-data-products-as-zip-files) \n", - " - [How to share data products with other RSP users](#how-to-share-data-products-with-other-rsp-users)\n", - " - [How to retrieve contents of data products (work on memory)](#how-to-retrieve-contents-of-data-products-work-on-memory)\n", - "- Product types \n", - " - [Spec-z Catalogs](#spec-z-catalog)\n", - " - [Training Sets](#training-sets)\n", - " - [Photo-z Validation Results](#photo-z-validation-results)\n", - " - [Photo-z Tables](#photo-z-tables)" - ] - }, - { - "cell_type": "markdown", - "metadata": { - "tags": [] - }, - "source": [ - "# The PZ Server\n", - "\n", - "
\n", - "\n", - "## Introduction \n" - ] - }, - { - "cell_type": "markdown", - "metadata": { - "tags": [] - }, - "source": [ - "The Photo-z (PZ) Server is an online service available for the LSST Community to host and share lightweight photo-z related data products. The upload and download of data and metadata can be done at the website [pz-server.linea.org.br](https://pz-server.linea.org.br/) (during the development phase, a test environment is available at [pz-server-dev.linea.org.br](https://pz-server-dev.linea.org.br/)). There, you will find two separate pages containing a list of data products each: one for LSST Data Management's oficial data products, and other for user-generated data products. **The registered data products can also be accessed directly from Python code using the PZ Server's data access API, as demonstrated below.**\n", - "\n", - "The PZ Server is developed and delivered as part of the in-kind contribution program BRA-LIN, from LIneA to the Rubin Observatory's LSST. The service is hosted in the Brazilian IDAC, not directly connected to the [Rubin Science Platform (RSP)](https://data.lsst.cloud/). However, it requires RSP credentials for user's authentication. For a comprehensive documentation about the PZ Server, please visit the [PZ Server's documentation page](https://linea-it.github.io/pz-lsst-inkind-doc/). There, you will find also an overview of all LIneA's contributions related to Photo-zs. The internal documentation of the API functions is available on the [API's documentation page](https://linea-it.github.io/pzserver/html/index.html). " - "The PZ Server is developed and delivered as part of the in-kind contribution program BRA-LIN, from LIneA to the Rubin Observatory's LSST. The service is hosted in the Brazilian IDAC, not directly connected to the [Rubin Science Platform (RSP)](https://data.lsst.cloud/). However, it requires RSP credentials for user's authentication. 
For a comprehensive documentation about the PZ Server, please visit the [PZ Server's documentation page](https://linea-it.github.io/pz-lsst-inkind-doc/). There, you will find also an overview of all LIneA's contributions related to Photo-zs. The internal documentation of the API functions is available on the [API's documentation page](https://linea-it.github.io/pzserver/html/index.html). " - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "## How to upload a data product on the PZ Server website\n", - "## How to upload a data product on the PZ Server website\n", - "\n", - "To upload a data product, click on the button **NEW PRODUCT** on the top left of the **User-generated Data Products** page and fill in the Upload Form with relevant metadata. Alternatively, the user can upload files to the PZ Server programatically via the `pzserver` Python Library (described below). \n", - "To upload a data product, click on the button **NEW PRODUCT** on the top left of the **User-generated Data Products** page and fill in the Upload Form with relevant metadata. Alternatively, the user can upload files to the PZ Server programatically via the `pzserver` Python Library (described below). \n", - "\n", - "The photo-z-related products are organized into four categories (product types):\n", - "\n", - "- **Spec-z Catalog:** Catalog of spectroscopic redshifts and positions (usually equatorial coordinates).\n", - "- **Training Set:** Training set for photo-z algorithms (tabular data). It usually contains magnitudes, errors, and true redshifts.\n", - "- **Photo-z Validation Results:** Results of a photo-z validation procedure (free format). Usually contains photo-z estimates (single estimates and/or pdf) of a validation set, photo-z validation metrics, validation plots, etc.\n", - "- **Photo-z Table:** Results of a photo-z estimation procedure. Ideally in the same format as the photo-z tables delivered by the DM as part of the LSST data releases. 
If the data is larger than the file upload limit (200MB), the product entry stores only the metadata (and instructions on accessing the data should be provided in the description field). " - "- **Photo-z Table:** Results of a photo-z estimation procedure. Ideally in the same format as the photo-z tables delivered by the DM as part of the LSST data releases. If the data is larger than the file upload limit (200MB), the product entry stores only the metadata (and instructions on accessing the data should be provided in the description field). " - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "## How to download a data product from the PZ Server website\n", - "## How to download a data product from the PZ Server website\n", - "\n", - "To download a data product available on the Photo-z Server, go to one of the two pages by clicking on the card \"LSST PZ Data Products\" (for official products released by LSST DM Team) or \"User-generated Data Products\" (for products uploaded by the members of LSST community. The download button is on the left side of each data product (each row of the list). " - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "# The PZ Server API (Python library `pzserver`)" - "# The PZ Server API (Python library `pzserver`)" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "### Installation\n", - "\n", - "**Using pip**\n", - "\n", - "The PZ Server API is avalialble on **pip** as `pzserver`. 
To install the API and its dependencies, type, on the Terminal: \n", - "\n", - "```\n", - "$ pip install pzserver \n", - "``` \n", - "\n", - "**For developers** \n", - "\n", - "Alternatively, if you have cloned the repository with:\n", - "\n", - "```\n", - "$ git clone https://github.com/linea-it/pzserver.git \n", - "``` \n", - "\n", - "To install the API and its dependencies, type:\n", - "\n", - "```\n", - "$ pip install .[dev]\n", - "```\n", - "\n", - "\n", - "OBS: You might need to restart the kernel on the notebook to incorporate the new library.\n" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "### Imports and Setup" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": { - "tags": [] - }, - "outputs": [], - "source": [ - "from pzserver import PzServer \n", - "import matplotlib.pyplot as plt\n", - "%reload_ext autoreload \n", - "%autoreload 2" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "The connection with the PZ Server from Python code is done by an object of the class `PzServer`. To get authorization to define an instance of `PzServer`, the users must provide an **API Token** generated on the top right menu on the [PZ Server website](https://pz-server.linea.org.br/) (during the development phase, on the [test environment](https://pz-server-dev.linea.org.br/)). " - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - " " - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "# pz_server = PzServer(token=\"\", host=\"pz-dev\") # \"pz-dev\" is the temporary host for test phase " - "# pz_server = PzServer(token=\"\", host=\"pz-dev\") # \"pz-dev\" is the temporary host for test phase " - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "For convenience, the token can be saved into a file named as `token.txt` (which is already listed in the .gitignore file in this repository). 
" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "with open('token.txt', 'r') as file:\n", - " token = file.read()\n", - "pz_server = PzServer(token=token, host=\"pz-dev\") # \"pz-dev\" is the temporary host for test phase " - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "For convenience, the token can be saved into a file named as `token.txt` (which is already listed in the .gitignore file in this repository). " - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "with open('token.txt', 'r') as file:\n", - " token = file.read()\n", - "pz_server = PzServer(token=token, host=\"pz-dev\") # \"pz-dev\" is the temporary host for test phase " - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "## How to get general info from PZ Server" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "The object `pz_server` just created above can provide access to data and metadata stored in the PZ Server. It also brings useful methods for users to navigate through the available contents. The methods with the preffix `get_` return the result of a query on the PZ Server database as a Python dictionary, and are most useful to be used programatically (see detaials on the [API documentation page](https://linea-it.github.io/pzserver/html/index.html)). Alternatively, those with the preffix `display_` show the results as a styled [_Pandas DataFrames_](https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.html), optimized for Jupyter Notebook (note: column names might change in the display version). For instance:\n", - "The object `pz_server` just created above can provide access to data and metadata stored in the PZ Server. It also brings useful methods for users to navigate through the available contents. 
The methods with the preffix `get_` return the result of a query on the PZ Server database as a Python dictionary, and are most useful to be used programatically (see detaials on the [API documentation page](https://linea-it.github.io/pzserver/html/index.html)). Alternatively, those with the preffix `display_` show the results as a styled [_Pandas DataFrames_](https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.html), optimized for Jupyter Notebook (note: column names might change in the display version). For instance:\n", - "\n", - "Display the list of product types supported with a short description;" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "pz_server.display_product_types()" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "Display the list of users who uploaded data products to the server;" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "pz_server.display_users()" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "Display the list of data releases available at the time; " - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "pz_server.display_releases()" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "---\n", - "\n", - "Display all data products available (WARNING: this list can rapdly grow during the survey's operation). " - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": { - "tags": [] - }, - "outputs": [], - "source": [ - "pz_server.display_products_list() " - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "The information about product type, users, and releases shown above can be used to filter the data products of interest for your search. 
For that, the method `list_products` receives as argument a dictionary mapping the products attributes to their values. " - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "pz_server.display_products_list(filters={\"release\": \"LSST DP0\", \n", - " \"product_type\": \"Training Set\"})" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "It also works if we type a string pattern that is part of the value. For instance, just \"DP0\" instead of \"LSST DP0\": " - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "pz_server.display_products_list(filters={\"release\": \"DP0\"})" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "It also allows the search for multiple strings by adding the suffix `__or` (two underscores + \"or\") to the search key. For instance, to get spec-z catalogs and training sets in the same search (notice that filtering is not case sensitive):" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": { - "tags": [] - }, - "outputs": [], - "source": [ - "pz_server.display_products_list(filters={\"product_type__or\": [\"Spec-z Catalog\", \"training set\"]})" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "To fetch the results of a search and attribute to a variable, just change the preffix `display_` by `get_`, like this: " - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": { - "tags": [] - }, - "outputs": [], - "source": [ - "search_results = pz_server.get_products_list(filters={\"product_type\": \"results\"}) # PZ Validation results\n", - "search_results" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "## How to upload a data product to via Python API (alternative method) " - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "The default method to upload a 
data product to the PZ Server is the upload tool on PZ Server website, as shown above. Alternatively, data products can be sent to the host service using the `pzserver` Python library. \n", - "## How to upload a data product to via Python API (alternative method) " - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "The default method to upload a data product to the PZ Server is the upload tool on PZ Server website, as shown above. Alternatively, data products can be sent to the host service using the `pzserver` Python library. \n", - "\n", - "First, prepare a dictionary with the relevant information about your data product: " - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "data_to_upload = {\n", - " \"name\":\"upload example 1\",\n", - " \"product_type\": \"specz_catalog\", \n", - " \"release\": None, # LSST release, use None if not LSST data \n", - " \"main_file\": \"example.csv\", # full path \n", - " \"auxiliary_files\": [\"example.html\", \"example.ipynb\"] # full path\n", - "}" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "upload = pz_server.upload(**data_to_upload) " - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "columns_dict = {\"ID\" : \"ID\", \n", - " \"RA\" : \"RA\", \n", - " \"Dec\": \"DEC\",\n", - " \"z\" : \"Z\",\n", - " \"z_err\" : \"ERR_Z\",\n", - " \"z_flag\": \"FLAG_DES\" \n", - " }" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "upload.make_columns_association(columns_dict) " - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "upload.product_id" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "First, 
prepare a dictionary with the relevant information about your data product: " - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "data_to_upload = {\n", - " \"name\":\"upload example 1\",\n", - " \"product_type\": \"specz_catalog\", \n", - " \"release\": None, # LSST release, use None if not LSST data \n", - " \"main_file\": \"example.csv\", # full path \n", - " \"auxiliary_files\": [\"example.html\", \"example.ipynb\"] # full path\n", - "}" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "upload = pz_server.upload(**data_to_upload) " - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "columns_dict = {\"ID\" : \"ID\", \n", - " \"RA\" : \"RA\", \n", - " \"Dec\": \"DEC\",\n", - " \"z\" : \"Z\",\n", - " \"z_err\" : \"ERR_Z\",\n", - " \"z_flag\": \"FLAG_DES\" \n", - " }" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "upload.make_columns_association(columns_dict) " - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "upload.product_id" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "## How to display the metadata of a data product " - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "The metadata of a given data product is the information provided by the user on the upload form. This information is attached to the data product contents and is available for consulting on the PZ Server page or using this Python API (`pzserver`). \n", - "The metadata of a given data product is the information provided by the user on the upload form. 
This information is attached to the data product contents and is available for consulting on the PZ Server page or using this Python API (`pzserver`). \n", - "\n", - "All data products stored on PZ Server are identified by a unique **id** number or an unique name, a _string_ called **internal_name**, which is created automatically at the moment of the upload by concatenating the product **id** to the name given by its owner (replacing blank spaces by \"_\", lowering cases, and removing special characters). " - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "The `PzServer`'s method `get_product_metadata()` returns a dictionary with the attibutes stored in the PZ Server about a given data product identified by its **id** or **internal_name**. For use in a Jupyter notebook, the equivalent `display_product_metadata()` shows the results in a formated table." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "# pz_server.display_product_metadata() \n", - "# pz_server.display_product_metadata(6) \n", - "# pz_server.display_product_metadata(\"6\") \n", - "pz_server.display_product_metadata(\"6_simple_training_set\") " - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "
\n", - "\n", - "[back to the top](#notebook-contents)\n", - "\n", - "
\n", - "\n", - "## How to download data products as .zip files " - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "To download any data product stored in the PZ Server, use the `PzServer`'s method `download_product` informing the product's **internal_name** and the path to where it will be saved (the default is the current folder). This method downloads a compressed .zip file which contais all the files uploaded by the user, including data, anciliary files and description files. The time spent to download a data product depends on the internet connections between the user and the host. Let's try it with a small data product. " - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "pz_server.download_product(14, save_in=\".\")" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "
\n", - "\n", - "[back to the top](#notebook-contents)\n", - "\n", - "
\n", - "\n", - "## How to share data products with other RSP users\n", - "\n", - "All data products uploaded to the PZ Server are imediately available and visible to all PZ Server users (people with RSP credentials) through the PZ Server website or via the API. Besides informing the product **id** or **internal_name** for programatic access, another way to share a data product is providing the product's URL, which leads to the product's download page. The URL is composed by the PZ Server website address + **/products/** + **internal_name**:\n", - "\n", - "https://pz-server.linea.org.br/product/ + **internal_name** \n", - "\n", - "or, if still in the development phase, \n", - "\n", - "https://pz-server-dev.linea.org.br/product/ + **internal_name**\n", - "\n", - "\n", - "For example: \n", - "\n", - "https://pz-server-dev.linea.org.br/product/6_simple_training_set\n", - "\n", - " WARNING: The URL works only with the internal name, **not** with the **id** number. \n" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "
\n", - "\n", - "[back to the top](#notebook-contents)\n", - "\n", - "
\n", - "\n", - "## How to retrieve contents of data products (work on memory)" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "Another feature of the PZ Server API is to let users retrieve the contents of a given data product to work on memory (by atributing the results of the method `get_product()` to a variable in the code). This feature is available only for tabular data (product types: **Spec-z Catalog** and **Training Set**). \n", - "\n", - "By default, the method `get_product` returns an object from a particular class, depending on the product's type. The classes `SpeczCatalog` and `TrainingSet` are simple extensions of `pandas.DataFrame` (via class composition) with a couple of additional attributes and methods, such as the attribute `metadata`, and the method `display_metadata()`. Let's see an example: " - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "catalog = pz_server.get_product(8)\n", - "catalog" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "catalog.display_metadata()" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "The tabular data is alocated in the attribute `data`, which is a `pandas.DataFrame`. 
" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "catalog.data\n" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "type(catalog.data)" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "It preserves the useful methods from `pandas.DataFrame`, such as: " - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "catalog.data.info()" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "catalog.data.describe()" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "In the prod-types you will see more details about these specific classes. For those who prefer working with `astropy.Table` or pure `pandas.DataFrame`, the method `get_product()` gives the flexibility to choose the output format (`fmt=\"pandas\"` or `fmt=\"astropy\"`). " - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "dataframe = pz_server.get_product(8, fmt=\"pandas\")\n", - "print(type(dataframe))\n", - "dataframe" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "table = pz_server.get_product(8, fmt=\"astropy\")\n", - "print(type(table))\n", - "table" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "--- \n", - "Clean up" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "del search_results, catalog, dataframe, table " - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "--- " - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "
\n", - "\n", - "[back to the top](#notebook-contents)\n", - "\n", - "
\n", - "\n", - "# Product types " - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "The PZ Server API provides Python classes with useful methods to handle particular product types. Let's recap the product types available: " - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "pz_server.display_product_types()" - ] - }, - { - "cell_type": "markdown", - "metadata": { - "tags": [] - }, - "source": [ - "
\n", - "\n", - "[back to the top](#notebook-contents)\n", - "\n", - "
\n", - "\n", - "## Spec-z Catalog " - ] - }, - { - "cell_type": "markdown", - "metadata": { - "tags": [] - }, - "source": [ - "In the context of the PZ Server, Spec-z Catalogs are defined as any catalog containing spherical equatorial coordinates and spectroscopic redshift measurements (or, analogously, true redshifts from simulations). A Spec-z Catalog can include data from a single spectroscopic survey or a combination of data from several sources. To be considered as a single Spec-z Catalog, the data should be provided as a single file to PZ Server's the upload tool. For multi-survey catalogs, it is recommended to add the survey name or identification as an extra column. \n", - "\n", - "\n", - "Mandatory columns: \n", - "* Right ascension [degrees] - `float`\n", - "* Declination [degrees] - `float`\n", - "* Spectroscopic or true redshift - `float`\n", - "\n", - "Recommended columns: \n", - "* Spectroscopic redshift error - `float`\n", - "* Quality flag - `integer`, `float`, or `string`\n", - "* Survey name (recommended for compilations of data from different surveys)\n" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "Let's see an example of Spec-z Catalog: " - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "gama = pz_server.get_product(14)" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "gama.display_metadata()" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "Display basic statistics" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "gama.data.describe()" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "The spec-z catalog object has a very basic plot method for quick visualization of catalog properties " - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - 
"outputs": [], - "source": [ - "gama.plot()" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "The attribute `data`, which is a `DataFrame` preserves the `plot` method from Pandas. " - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "gama.data.plot(x=\"RA\", y=\"DEC\", kind=\"scatter\") " - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "
\n", - "\n", - "[back to the top](#notebook-contents)\n", - "\n", - "
\n", - "\n", - "## Training Sets \n", - " \n", - "In the context of the PZ Server, Training Sets are defined as the product of matching (spatially) a given Spec-z Catalog (single survey or compilation) to the photometric data, in this case, the LSST Objects Catalog. The PZ Server API offers a tool called _Training Set Maker_ for users to build customized Training Sets based on the Spec-z Catalogs available. Please see the companion Jupyter Notebook `pz_tsm_tutorial.ipynb` for details. \n", - "\n", - "\n", - "_Note 1: Commonly the training set is split into two or more subsets for photo-z validation purposes. If the Training Set owner has previously defined which objects should belong to each subset (trainining and validation/test sets), this information must be available as an extra column in the table or as clear instructions for reproducing the subsets separation in the data product description._\n", - "\n", - " \n", - "_Note 2: The PZ Server only supports catalog-level Training Sets. 
Image-based Training Sets, e.g., for deep-learning algorithms, are not supported yet._\n", - "\n", - "\n", - "Mandatory column: \n", - "* Spectroscopic (or true) redshift - `float`\n", - "\n", - "Other expected columns\n", - "* Object ID from LSST Objects Catalog - `integer`\n", - "* Observables: magnitudes (and/or colors, or fluxes) from LSST Objects Catalog - `float`\n", - "* Observable errors: magnitude errors (and/or color errors, or flux errors) from LSST Objects Catalog - `float`\n", - "* Right ascension [degrees] - `float`\n", - "* Declination [degrees] - `float`\n", - "* Quality Flag - `integer`, `float`, or `string`\n", - "* Subset Flag - `integer`, `float`, or `string`\n", - "\n" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": { - "tags": [] - }, - "outputs": [], - "source": [ - "train_goldenspike = pz_server.get_product(9)" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": { - "tags": [] - }, - "outputs": [], - "source": [ - "train_goldenspike.display_metadata()" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "Display basic statistics" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "train_goldenspike.data.describe()" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "Quick visualization of training set properties: " - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "train_goldenspike.plot(mag_name=\"mag_i_lsst\")" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "
\n", - "\n", - "[back to the top](#notebook-contents)\n", - "\n", - "
\n", - "\n", - "## Photo-z Validation Results\n", - " \n", - "Validation Results are the outputs of any photo-z algorithm applied on a Validation Set. The format and number of files of this data product are strongly dependent on the algorithm used to create it, so there are no constraints on these two parameters. In the case of multiple files, for instance, if the user includes the results of training procedures (e.g., neural nets weights, decision trees files, or any machine learning by-product) or additional files (SED templates, filter transmission curves, theoretical magnitudes grid, Bayesian priors, etc.), it will be required to put all files together in a single compressed file (.zip or .tar, or .tar.gz) before uploading it to the Photo-z Server. \n", - "\n", - "### List Validation Results available on PZ Server" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "pz_server.display_products_list(filters={\"product_type\": \"Validation Results\"})" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "### Display metadata of a given data product of Photo-z Validation Results" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "pz_server.display_product_metadata(\"11_goldenspike_flexzboost\")" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "### Retrieve a given Photo-z Validation Results: download file\n", - "\n", - "This product type is not necessarily (only) tabular data and can be a list of files. The methods `get_product` shown above just return the data to be used on memory and only supports single tabular files. To retrieve Photo-z Validation Results, you must download the data to open locally. 
" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "# pz_server.download_product(11, save_in=\".\")" - ] - }, - { - "cell_type": "markdown", - "metadata": { - "tags": [] - }, - "source": [ - "
\n", - "\n", - "[back to the top](#notebook-contents)\n", - "\n", - "
\n", - "\n", - "### Photo-z Tables " - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "The Photo-z Tables are the results of photo-z estimation on photometrics samples. The data format is usually tabular, and might vary according to the phto-z estimation method used. \n", - "\n", - "The size limit for uploading files on the PZ Server is 200MB, therefore it does not support large Photo-z Tables such as the photo-zs of the LSST Objects catalog. The PZ Server can host small Photo-z Tables or, in case of large datasets, a data product can be registered to contain only the Photo-z Tables' metadata. For these cases, the instructions to find and access the data must be provided in the product's description. " - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "# pz_server.download_product()" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "--- \n", - "\n", - "### Users feedback \n", - "\n", - "Is something important missing? [Click here to open an issue in the PZ Server library repository on GitHub](https://github.com/linea-it/pzserver/issues/new). 
" - ] - } - ], - "metadata": { - "kernelspec": { - "display_name": "Python 3 (ipykernel)", - "language": "python", - "name": "python3" - }, - "language_info": { - "codemirror_mode": { - "name": "ipython", - "version": 3 - }, - "file_extension": ".py", - "mimetype": "text/x-python", - "name": "python", - "nbconvert_exporter": "python", - "pygments_lexer": "ipython3", - "version": "3.10.10" - "version": "3.10.10" - }, - "nbsphinx": { - "execute": "never" - }, - "vscode": { - "interpreter": { - "hash": "e9b653658693761946b8083bc5972c6593ddffeb81a0a81b81eabc816026cfc3" - } - } - }, - "nbformat": 4, - "nbformat_minor": 4 -} diff --git a/docs/notebooks/intro_notebook.rst b/docs/notebooks/intro_notebook.rst deleted file mode 100644 index a86f529..0000000 --- a/docs/notebooks/intro_notebook.rst +++ /dev/null @@ -1,629 +0,0 @@ -Photo-z Server - Tutorial Notebook -================================== - -Contact author: `Julia Gschwend `__ - -Last verified run: 2024-Jan-23 - -.. raw:: html - -
- -Notebook contents -================= - -- PZ Server - - - `Introduction <#introduction>`__ - - `How to upload a data product to the PZ - Server <#how-to-upload-a-data-product-to-the-pz-server>`__ - - `How to download a data product from the PZ - Server <#how-to-download-a-data-product-from-the-pz-server>`__ - -- PZ Server API (Python library pz-server-lib) - - - `How to get general info from PZ - Server <#how-to-get-general-info-from-pz-server>`__ - - `How to display the metadata of a data - product <#how-to-display-the-metadata-of-a-data-product>`__ - - `How to download data products as .zip - files <#how-to-download-data-products-as-zip-files>`__ - - `How to share data products with other RSP - users <#how-to-share-data-products-with-other-rsp-users>`__ - - `How to retrieve contents of data products (work on - memory) <#how-to-retrieve-contents-of-data-products-work-on-memory>`__ - -- Product types - - - `Spec-z Catalogs <#spec-z-catalog>`__ - - `Training Sets <#training-sets>`__ - - `Photo-z Validation Results <#photo-z-validation-results>`__ - - `Photo-z Tables <#photo-z-tables>`__ - -The PZ Server -============= - -.. container:: - :name: introduction - -Introduction ------------- - -The Photo-z (PZ) Server is an online service available for the LSST -Community to host and share lightweight photo-z related data products. -The upload and download of data and metadata can be done at the website -`pz-server.linea.org.br `__ (during the -development phase, a test environment is available at -`pz-server-dev.linea.org.br `__). -There, you will find two separate pages containing a list of data -products each: one for LSST Data Management’s oficial data products, and -other for user-generated data products. 
**The registered data products -can also be accessed directly from Python code using the PZ Server’s -data access API, as demonstrated below.** - -The PZ Server is developed and delivered as part of the in-kind -contribution program BRA-LIN, from LIneA to the Rubin Observatory’s -LSST. The service is hosted in the Brazilian IDAC, not directly -connected to the `Rubin Science Platform -(RSP) `__. However, it requires RSP -credentials for user’s authentication. For a comprehensive documentation -about the PZ Server, please visit the `PZ Server’s documentation -page `__. There, you -will find also an overview of all LIneA’s contributions related to -Photo-zs. The internal documentation of the API functions is available -on the `API’s documentation -page `__. - -.. container:: - :name: how-to-upload-a-data-product-to-the-pz-server - - `back to the top <#notebook-contents>`__ - -How to upload a data product to the PZ Server ---------------------------------------------- - -To upload a data product, click on the button **NEW PRODUCT** on the top -left of the **User-generated Data Products** page and fill in the Upload -Form with relevant metadata. - -The photo-z-related products are organized into four categories (product -types): - -- **Spec-z Catalog:** Catalog of spectroscopic redshifts and positions - (usually equatorial coordinates). -- **Training Set:** Training set for photo-z algorithms (tabular data). - It usually contains magnitudes, errors, and true redshifts. -- **Photo-z Validation Results:** Results of a photo-z validation - procedure (free format). Usually contains photo-z estimates (single - estimates and/or pdf) of a validation set, photo-z validation - metrics, validation plots, etc. -- **Photo-z Table:** Results of a photo-z estimation procedure. Ideally - in the same format as the photo-z tables delivered by the DM as part - of the LSST data releases. 
If the data is larger than the file upload - limit (200MB), the product entry stores only the metadata (and - instructions on accessing the data should be provided in the - description field). - -.. container:: - - `back to the top <#notebook-contents>`__ - -How to download a data product from the PZ Server -------------------------------------------------- - -To download a data product available on the Photo-z Server, go to one of -the two pages by clicking on the card “LSST PZ Data Products” (for -official products released by LSST DM Team) or “User-generated Data -Products” (for products uploaded by the members of LSST community. The -download button is on the left side of each data product (each row of -the list). - -.. container:: - :name: how-to-download-a-data-product-from-the-pz-server - - `back to the top <#notebook-contents>`__ - -The PZ Server API (Python library pz-server-lib) -================================================ - -Installation ------------- - -**Using pip** - -The PZ Server API is avalialble on **pip** as ``pzserver``. To install -the API and its dependencies, type, on the Terminal: - -:: - - $ pip install pzserver - -**For developers** - -Alternatively, if you have cloned the repository with: - -:: - - $ git clone https://github.com/linea-it/pzserver.git - -To install the API and its dependencies, type: - -:: - - $ pip install -e . - $ pip install .[dev] - -OBS: You might need to restart the kernel on the notebook to incorporate -the new library. - -Imports and Setup ------------------ - -.. code:: ipython3 - - from pzserver import PzServer - import matplotlib.pyplot as plt - %reload_ext autoreload - %autoreload 2 - -The connection with the PZ Server from Python code is done by an object -of the class ``PzServer``. 
To get authorization to define an instance of -``PzServer``, the users must provide an **API Token** generated on the -top right menu on the `PZ Server -website `__ (during the development -phase, on the `test -environment `__). - - - -.. code:: ipython3 - - pz_server = PzServer(token="", host="pz-dev") # "pz-dev" is the temporary host for test phase - -.. container:: - :name: how-to-get-general-info-from-pz-server - - `back to the top <#notebook-contents>`__ - -How to get general info from PZ Server --------------------------------------- - -The object ``pz_server`` just created above can provide access to data -and metadata stored in the PZ Server. It also brings useful methods for -users to navigate through the available contents. The methods with the -preffix ``get_`` return the result of a query on the PZ Server database -as a Python dictionary, and are most useful to be used programatically -(see detaials on the `API documentation -page `__). -Alternatively, those with the preffix ``display_`` show the results as a -styled `Pandas -DataFrames `__, -optimized for Jupyter Notebook (note: column names might change in the -display version). For instance: - -Display the list of product types supported with a short description; - -.. code:: ipython3 - - pz_server.display_product_types() - -Display the list of users who uploaded data products to the server; - -.. code:: ipython3 - - pz_server.display_users() - -Display the list of data releases available at the time; - -.. code:: ipython3 - - pz_server.display_releases() - --------------- - -Display all data products available (WARNING: this list can rapdly grow -during the survey’s operation). - -.. code:: ipython3 - - pz_server.display_products_list() - -The information about product type, users, and releases shown above can -be used to filter the data products of interest for your search. For -that, the method ``list_products`` receives as argument a dictionary -mapping the products attributes to their values. 
- -.. code:: ipython3 - - pz_server.display_products_list(filters={"release": "LSST DP0", - "product_type": "Training Set"}) - -It also works if we type a string pattern that is part of the value. For -instance, just “DP0” instead of “LSST DP0”: - -.. code:: ipython3 - - pz_server.display_products_list(filters={"release": "DP0"}) - -It also allows the search for multiple strings by adding the suffix -``__or`` (two underscores + “or”) to the search key. For instance, to -get spec-z catalogs and training sets in the same search (notice that -filtering is not case sensitive): - -.. code:: ipython3 - - pz_server.display_products_list(filters={"product_type__or": ["Spec-z Catalog", "training set"]}) - -To fetch the results of a search and attribute to a variable, just -change the preffix ``display_`` by ``get_``, like this: - -.. code:: ipython3 - - search_results = pz_server.get_products_list(filters={"product_type": "results"}) # PZ Validation results - search_results - -.. container:: - :name: how-to-display-the-metadata-of-a-data-product - - `back to the top <#notebook-contents>`__ - -How to display the metadata of a data product ---------------------------------------------- - -The metadata of a given data product is the information provided by the -user on the upload form. This information is attached to the data -product contents and is available for consulting on the PZ Server page -or using this Python API (``pz-server-lib``). - -All data products stored on PZ Server are identified by a unique **id** -number or an unique name, a *string* called **internal_name**, which is -created automatically at the moment of the upload by concatenating the -product **id** to the name given by its owner (replacing blank spaces by -"_", lowering cases, and removing special characters). 
- -The ``PzServer``\ ’s method ``get_product_metadata()`` returns a -dictionary with the attibutes stored in the PZ Server about a given data -product identified by its **id** or **internal_name**. For use in a -Jupyter notebook, the equivalent ``display_product_metadata()`` shows -the results in a formated table. - -.. code:: ipython3 - - # pz_server.display_product_metadata() - # pz_server.display_product_metadata(6) - # pz_server.display_product_metadata("6") - pz_server.display_product_metadata("6_simple_training_set") - -.. container:: - :name: how-to-download-data-products-as-zip-files - - `back to the top <#notebook-contents>`__ - -How to download data products as .zip files -------------------------------------------- - -To download any data product stored in the PZ Server, use the -``PzServer``\ ’s method ``download_product`` informing the product’s -**internal_name** and the path to where it will be saved (the default is -the current folder). This method downloads a compressed .zip file which -contais all the files uploaded by the user, including data, anciliary -files and description files. The time spent to download a data product -depends on the internet connections between the user and the host. Let’s -try it with a small data product. - -.. code:: ipython3 - - pz_server.download_product(14, save_in=".") - -.. container:: - :name: how-to-share-data-products-with-other-rsp-users - - `back to the top <#notebook-contents>`__ - -How to share data products with other RSP users ------------------------------------------------ - -All data products uploaded to the PZ Server are imediately available and -visible to all PZ Server users (people with RSP credentials) through the -PZ Server website or via the API. Besides informing the product **id** -or **internal_name** for programatic access, another way to share a data -product is providing the product’s URL, which leads to the product’s -download page. 
The URL is composed by the PZ Server website address + -**/products/** + **internal_name**: - -https://pz-server.linea.org.br/product/ + **internal_name** - -or, if still in the development phase, - -https://pz-server-dev.linea.org.br/product/ + **internal_name** - -For example: - -https://pz-server-dev.linea.org.br/product/6_simple_training_set - -WARNING: The URL works only with the internal name, **not** with the -**id** number. - -.. container:: - :name: how-to-retrieve-contents-of-data-products-work-on-memory - - `back to the top <#notebook-contents>`__ - -How to retrieve contents of data products (work on memory) ----------------------------------------------------------- - -Another feature of the PZ Server API is to let users retrieve the -contents of a given data product to work on memory (by atributing the -results of the method ``get_product()`` to a variable in the code). This -feature is available only for tabular data (product types: **Spec-z -Catalog** and **Training Set**). - -By default, the method ``get_product`` returns an object from a -particular class, depending on the product’s type. The classes -``SpeczCatalog`` and ``TrainingSet`` are simple extensions of -``pandas.DataFrame`` (via class composition) with a couple of additional -attributes and methods, such as the attribute ``metadata``, and the -method ``display_metadata()``. Let’s see an example: - -.. code:: ipython3 - - catalog = pz_server.get_product(8) - catalog - -.. code:: ipython3 - - catalog.display_metadata() - -The tabular data is alocated in the attribute ``data``, which is a -``pandas.DataFrame``. - -.. code:: ipython3 - - catalog.data - -.. code:: ipython3 - - type(catalog.data) - -It preserves the useful methods from ``pandas.DataFrame``, such as: - -.. code:: ipython3 - - catalog.data.info() - -.. code:: ipython3 - - catalog.data.describe() - -In the prod-types you will see more details about these specific -classes. 
For those who prefer working with ``astropy.Table`` or pure -``pandas.DataFrame``, the method ``get_product()`` gives the flexibility -to choose the output format (``fmt="pandas"`` or ``fmt="astropy"``). - -.. code:: ipython3 - - dataframe = pz_server.get_product(8, fmt="pandas") - print(type(dataframe)) - dataframe - -.. code:: ipython3 - - table = pz_server.get_product(8, fmt="astropy") - print(type(table)) - table - --------------- - -Clean up - -.. code:: ipython3 - - del search_results, catalog, dataframe, table - --------------- - -.. container:: - - `back to the top <#notebook-contents>`__ - -Product types -============= - -The PZ Server API provides Python classes with useful methods to handle -particular product types. Let’s recap the product types available: - -.. code:: ipython3 - - pz_server.display_product_types() - -.. container:: - :name: spec-z-catalog - - `back to the top <#notebook-contents>`__ - -Spec-z Catalog --------------- - -In the context of the PZ Server, Spec-z Catalogs are defined as any -catalog containing spherical equatorial coordinates and spectroscopic -redshift measurements (or, analogously, true redshifts from -simulations). A Spec-z Catalog can include data from a single -spectroscopic survey or a combination of data from several sources. To -be considered as a single Spec-z Catalog, the data should be provided as -a single file to PZ Server’s the upload tool. For multi-survey catalogs, -it is recommended to add the survey name or identification as an extra -column. - -Mandatory columns: \* Right ascension [degrees] - ``float`` \* -Declination [degrees] - ``float`` \* Spectroscopic or true redshift - -``float`` - -Recommended columns: \* Spectroscopic redshift error - ``float`` \* -Quality flag - ``integer``, ``float``, or ``string`` \* Survey name -(recommended for compilations of data from different surveys) - -Let’s see an example of Spec-z Catalog: - -.. code:: ipython3 - - gama = pz_server.get_product(14) - -.. 
code:: ipython3 - - gama.display_metadata() - -Display basic statistics - -.. code:: ipython3 - - gama.data.describe() - -The spec-z catalog object has a very basic plot method for quick -visualization of catalog properties - -.. code:: ipython3 - - gama.plot() - -The attribute ``data``, which is a ``DataFrame`` preserves the ``plot`` -method from Pandas. - -.. code:: ipython3 - - gama.data.plot(x="RA", y="DEC", kind="scatter") - -.. container:: - :name: training-sets - - `back to the top <#notebook-contents>`__ - -Training Sets -------------- - -In the context of the PZ Server, Training Sets are defined as the -product of matching (spatially) a given Spec-z Catalog (single survey or -compilation) to the photometric data, in this case, the LSST Objects -Catalog. The PZ Server API offers a tool called *Training Set Maker* for -users to build customized Training Sets based on the Spec-z Catalogs -available. Please see the companion Jupyter Notebook -``pz_tsm_tutorial.ipynb`` for details. - -*Note 1: Commonly the training set is split into two or more subsets for -photo-z validation purposes. If the Training Set owner has previously -defined which objects should belong to each subset (trainining and -validation/test sets), this information must be available as an extra -column in the table or as clear instructions for reproducing the subsets -separation in the data product description.* - -*Note 2: The PZ Server only supports catalog-level Training Sets. 
-Image-based Training Sets, e.g., for deep-learning algorithms, are not -supported yet.* - -Mandatory column: \* Spectroscopic (or true) redshift - ``float`` - -Other expected columns \* Object ID from LSST Objects Catalog - -``integer`` \* Observables: magnitudes (and/or colors, or fluxes) from -LSST Objects Catalog - ``float`` \* Observable errors: magnitude errors -(and/or color errors, or flux errors) from LSST Objects Catalog - -``float`` \* Right ascension [degrees] - ``float`` \* Declination -[degrees] - ``float`` \* Quality Flag - ``integer``, ``float``, or -``string`` \* Subset Flag - ``integer``, ``float``, or ``string`` - -.. code:: ipython3 - - train_goldenspike = pz_server.get_product(9) - -.. code:: ipython3 - - train_goldenspike.display_metadata() - -Display basic statistics - -.. code:: ipython3 - - train_goldenspike.data.describe() - -Quick visualization of training set properties: - -.. code:: ipython3 - - train_goldenspike.plot(mag_name="mag_i_lsst") - -.. container:: - :name: photo-z-validation-results - - `back to the top <#notebook-contents>`__ - -Photo-z Validation Results --------------------------- - -Validation Results are the outputs of any photo-z algorithm applied on a -Validation Set. The format and number of files of this data product are -strongly dependent on the algorithm used to create it, so there are no -constraints on these two parameters. In the case of multiple files, for -instance, if the user includes the results of training procedures (e.g., -neural nets weights, decision trees files, or any machine learning -by-product) or additional files (SED templates, filter transmission -curves, theoretical magnitudes grid, Bayesian priors, etc.), it will be -required to put all files together in a single compressed file (.zip or -.tar, or .tar.gz) before uploading it to the Photo-z Server. - -List Validation Results available on PZ Server -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ - -.. 
code:: ipython3 - - pz_server.display_products_list(filters={"product_type": "Validation Results"}) - -Display metadata of a given data product of Photo-z Validation Results -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ - -.. code:: ipython3 - - pz_server.display_product_metadata("11_goldenspike_flexzboost") - -Retrieve a given Photo-z Validation Results: download file -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ - -This product type is not necessarily (only) tabular data and can be a -list of files. The methods ``get_product`` shown above just return the -data to be used on memory and only supports single tabular files. To -retrieve Photo-z Validation Results, you must download the data to open -locally. - -.. code:: ipython3 - - # pz_server.download_product(11, save_in=".") - -.. container:: - :name: photo-z-tables - - `back to the top <#notebook-contents>`__ - -Photo-z Tables -~~~~~~~~~~~~~~ - -The Photo-z Tables are the results of photo-z estimation on photometrics -samples. The data format is usually tabular, and might vary according to -the phto-z estimation method used. - -The size limit for uploading files on the PZ Server is 200MB, therefore -it does not support large Photo-z Tables such as the photo-zs of the -LSST Objects catalog. The PZ Server can host small Photo-z Tables or, in -case of large datasets, a data product can be registered to contain only -the Photo-z Tables’ metadata. For these cases, the instructions to find -and access the data must be provided in the product’s description. - -.. code:: ipython3 - - # pz_server.download_product() - --------------- - -Users feedback -~~~~~~~~~~~~~~ - -Is something important missing? `Click here to open an issue in the PZ -Server library repository on -GitHub `__. 
- diff --git a/docs/notebooks/rst/0_introduction.rst b/docs/notebooks/rst/0_introduction.rst new file mode 100644 index 0000000..8b9e8e1 --- /dev/null +++ b/docs/notebooks/rst/0_introduction.rst @@ -0,0 +1,2403 @@ +Photo-z Server +============== + +Tutorial Notebook 0 - Introduction +---------------------------------- + +Contact author: `Julia Gschwend `__ + +Last verified run: **2024-Jul-09** + +.. raw:: html + +
+ +Notebook contents +================= + +- PZ Server + + - `Introduction <#introduction>`__ + - `How to upload a data product to the PZ + Server <#how-to-upload-a-data-product-to-the-pz-server>`__ + - `How to download a data product from the PZ + Server <#how-to-download-a-data-product-from-the-pz-server>`__ + +- PZ Server API (Python library pzserver) + + - `How to get general info from PZ + Server <#how-to-get-general-info-from-pz-server>`__ + - `How to display the metadata of a data + product <#how-to-display-the-metadata-of-a-data-product>`__ + - `How to download data products as .zip + files <#how-to-download-data-products-as-zip-files>`__ + - `How to share data products with other RSP + users <#how-to-share-data-products-with-other-rsp-users>`__ + - `How to retrieve contents of data products (work on + memory) <#how-to-retrieve-contents-of-data-products-work-on-memory>`__ + +The PZ Server +============= + +.. container:: + :name: introduction + +Introduction +------------ + +The PZ Server is an online service available for the LSST Community to +host and share lightweight photo-z (PZ) -related data products. The +upload and download of data and metadata can be done at the website +`pz-server.linea.org.br `__ (during the +development phase, a test environment is available at +`pz-server-dev.linea.org.br `__). +There, you will find two separate pages containing a list of data +products each: one for Rubin Observatory Data Management’s official data +products and the other for user-generated data products. **The +registered data products can also be accessed directly from Python code +using the PZ Server’s data access API, as demonstrated below.** + +The PZ Server is developed and delivered as part of the in-kind +contribution program BRA-LIN, from LIneA to the Rubin Observatory’s LSST +project. The service is hosted in the Brazilian IDAC, not directly +connected to the `Rubin Science Platform +(RSP) `__. 
However, user authorization +requires the same credentials as RSP. For comprehensive documentation +about the PZ Server, please visit the `PZ Server’s documentation +page `__. There, you +will also find an overview of all LIneA’s contributions related to the +PZ production. The internal documentation of the API functions is +available on the `API’s documentation +page `__. + +How to upload a data product on the PZ Server website +----------------------------------------------------- + +To upload a data product, click on the button **NEW PRODUCT** on the top +left of the **User-generated Data Products** page and fill in the Upload +Form with relevant metadata. Alternatively, the user can upload files to +the PZ Server programmatically via the ``pzserver`` Python Library +(described below). + +The photo-z-related products are organized into four categories (product +types): + +- **Spec-z Catalog:** Catalog of spectroscopic redshifts and positions + (usually equatorial coordinates). +- **Training Set:** Sample for training photo-z algorithms (tabular + data). It usually contains magnitudes, errors, and true redshifts. +- **Photo-z Validation Results:** The results of a photo-z validation + procedure (free format). They usually contain photo-z estimates + (single estimates and/or PDFs) of a validation set, photo-z + validation metrics, validation plots, etc. +- **Photo-z Table:** The results of a photo-z + estimation procedure. Ideally, the data should be in the same format + as the photo-z tables delivered by the DM as part of the LSST data + releases. If the data is larger than the file upload limit (200MB), + the product entry will store only the metadata, and instructions on + accessing the data should be provided in the description field. + Storage space can be provided exceptionally for larger tables, + depending on the science project justification (to be evaluated by + IDAC’s management committee). 
+ +How to download a data product from the PZ Server website +--------------------------------------------------------- + +To download a data product available on the Photo-z Server, go to one of +the two pages by clicking on the card “Rubin Observatory PZ Data +Products” (for official products released by the Rubin Data Management Team) +or “User-generated Data Products” (for products uploaded by members +of the LSST community). The **download** button is on the right side of each +data product (each row of the list). There are also buttons to +**share**, **remove**, and **edit** the metadata of a given data +product.
+ +The PZ Server API (Python library ``pzserver``) +=============================================== + +Installation +~~~~~~~~~~~~ + +**For regular users** + +The PZ Server API is available via **pip** as ``pzserver``. To install +the API and its dependencies, type in the terminal: + +:: + + $ pip install pzserver + +.. code:: ipython3 + + ! pip install pzserver + +**For developers** + +Alternatively, clone the repository with: + +:: + + $ git clone https://github.com/linea-it/pzserver.git + +Then, from the repository folder, install the API and its dependencies with: + +:: + + $ pip install .[dev] + +-------------- + +Note: You might need to restart the notebook kernel to load +the new library. + +Imports and Setup +~~~~~~~~~~~~~~~~~ + +.. code:: ipython3 + + from pzserver import PzServer + import matplotlib.pyplot as plt + %reload_ext autoreload + %autoreload 2 + +The connection with the PZ Server from Python code is done by an object +of the class ``PzServer``. To create an instance of +``PzServer``, users must provide an **API Token** generated from the +top right menu on the `PZ Server +website `__ (during the development +phase, on the `test +environment `__). + + + +.. code:: ipython3 + + # pz_server = PzServer(token="", host="pz-dev") # "pz-dev" is the temporary host for test phase + +For convenience, the token can be saved in a file named +``token.txt`` (which is already listed in the .gitignore file in this +repository). + +.. code:: ipython3 + + with open('token.txt', 'r') as file: + token = file.read().strip() # drop any trailing newline or whitespace + pz_server = PzServer(token=token, host="pz-dev") # "pz-dev" is the temporary host for test phase + +How to get general info from PZ Server +-------------------------------------- + +The object ``pz_server`` just created above can provide access to data +and metadata stored in the PZ Server. It also provides useful methods for +users to navigate through the available contents. 
The methods with the +prefix ``get_`` return the result of a query on the PZ Server database +as plain Python objects (dictionaries or lists of dictionaries), and are the most convenient for programmatic use +(see details on the `API documentation +page `__). +Alternatively, those with the prefix ``display_`` show the results as a +styled `Pandas +DataFrame `__, +optimized for Jupyter Notebook (note: column names might change in the +display version). For instance: + +Display the list of supported product types, each with a short description: + +.. code:: ipython3 + + pz_server.display_product_types() + + + +.. raw:: html + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
Product Typeproduct_typeDescription
Spec-z Catalogspecz_catalogCatalog of spectroscopic redshifts and positions (usually equatorial coordinates).
Training Settraining_setTraining set for photo-z algorithms (tabular data). It usually contains magnitudes, errors, and true redshifts.
Validation Resultsvalidation_resultsResults of a photo-z validation procedure (free format). Usually contains photo-z estimates (single estimates and/or pdf) of a validation set and photo-z validation metrics.
Photo-z Tablephotoz_tableResults of a photo-z estimation procedure. If the data is larger than the file upload limit (200MB), the product entry stores only the metadata (instructions on accessing the data should be provided in the description field.
+ + + +Display the list of users who uploaded data products to the server; + +.. code:: ipython3 + + pz_server.display_users() + + + +.. raw:: html + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
GitHub UsernameName
andreiadouradoDourado
Biancasilva9Silva
bruno-moraesMoraes
crisingulaniSingulani
deborajaniniJanini
drewoldagDrew Oldag
fpcardosoCardoso
glaubervilaGlauber Costa Vila-Verde
GloriaFAGloria Fonseca Alvarez
gschwendGschwend
gverde
hdanteHenrique
HeloisaMengisztkiHeloisaMengisztki
jandsonrjVitorino
leandrops19
luigilcsilvaSilva
luiz-nicolaci
MelissaGrahamMelissa Graham
saravizAviz
singulani
+ + + +Display the list of data releases available at the time; + +.. code:: ipython3 + + pz_server.display_releases() + + + +.. raw:: html + + + + + + + + + + + + + + + +
ReleaseDescription
LSST DP0LSST Data Preview 0
+ + + +-------------- + +Display all data products available (WARNING: this list can rapidly grow +during the survey’s operation). + +.. code:: ipython3 + + pz_server.display_products_list() + + + +.. raw:: html
idinternal_nameproduct_nameproduct_typereleaseuploaded_byofficial_productpz_codedescriptioncreated_at
8181_singo999singo999Training SetNonecrisingulaniFalseNoneNone2024-07-04T16:08:23.116849Z
8080_example_upload_via_libexample upload via libSpec-z CatalogNonegschwendFalseNoneNone2024-07-04T15:58:58.075926Z
7979_example_upload_via_libexample upload via libSpec-z CatalogNonegschwendFalseNoneNone2024-07-04T15:57:17.809438Z
7878_example_upload_via_libexample upload via libSpec-z CatalogNonegschwendFalseNoneNone2024-07-04T15:40:29.732156Z
7777_upload_example_1upload example 1Spec-z CatalogNonegschwendFalseNoneNone2024-07-04T15:37:29.099822Z
7676_upload_example_1upload example 1Spec-z CatalogNonegschwendFalseNoneNone2024-07-04T15:18:43.371144Z
7575_upload_example_1upload example 1Spec-z CatalogNonegschwendFalseNoneNone2024-06-17T19:36:50.416031Z
7373_tpz_resultsTPZ ResultsValidation ResultsNoneandreiadouradoFalseResults of photoz validation using TPZ lite on simulated training set from DC2 TruthSummary table. Files: 03_run_tpz. html -> jupyter notebook (HTML version) with algorithm train; 04_metrics.html -> jupyter notebook (HTML version) with results analysis; model.pkl -> model generated in the "inform method"; output.hdf5 -> "estimate stage" results (PDFs).2024-06-06T23:20:04.439030Z
7272_pzcompute_results_for_qa_validationPZ-Compute Results for QA ValidationValidation ResultsLSST DP0HeloisaMengisztkiFalseThis zip contains two files: validation_set.hdf5 with the data input to run estimate, contains the redshift values so that it can be used as the truth file. And the validation_set_output.hdf5 is the output after running estimate, with the computed photoz for fzboost algorithm.2024-06-05T18:57:27.428106Z
6464_training_set_lsst_dp02Training set LSST DP0.2Training SetNoneandreiadouradoFalseSimulated training set from DC2 TruthSummary table. Random data (random_data.hdf5): table with the true magnitudes used to create the simulated set.2024-05-21T14:42:47.340619Z
6363_specz_sampleSpec-z sample LSST DP0.2Spec-z CatalogNoneandreiadouradoFalseSpec-z sample created from a random fraction of object Ids from Object Table.2024-05-21T13:36:19.884481Z
5252_2dflens_public_specz2dFLenS Public spec-zSpec-z CatalogNonesaravizFalseSample containing the 2dFLenS spec-z data contained in the original file Public spec-z compilation2024-04-08T14:21:46.577298Z
5151_zcosmos_public_speczzCOSMOS Public spec-zSpec-z CatalogNonesaravizFalseSample containing the zCOSMOS spec-z data contained in the original file Public spec-z compilation2024-04-07T23:06:40.185605Z
5050_vipers_public_speczVIPERS Public spec-zSpec-z CatalogNonesaravizFalseSample containing the VIPERS spec-z data contained in the original file Public spec-z compilation2024-04-07T23:05:10.825559Z
4949_sdss_dr16_public_speczSDSS (DR16) Public spec-zSpec-z CatalogNonesaravizFalseSample containing the SDSS spec-z data contained in the original file Public spec-z compilation2024-04-07T20:14:23.831347Z
4848_saga_public_speczSAGA Public spec-zSpec-z CatalogNonesaravizFalseSample containing the SAGA spec-z data contained in the original file Public spec-z compilation2024-04-07T19:39:01.003263Z
4747_glass_public_speczGLASS Public spec-zSpec-z CatalogNonesaravizFalseSample containing the GLASS spec-z data contained in the original file Public spec-z compilation.2024-04-07T19:20:41.913016Z
4545_gama_public_speczGAMA Public spec-zSpec-z CatalogNonesaravizFalseSample containing the GAMA spec-z data contained in the original file Public spec-z compilation2024-04-03T10:19:00.379907Z
4444_deep2_public_speczDEEP2 Public spec-zSpec-z CatalogNonesaravizFalseSample containing the DEEP2 spec-z data contained in the original file Public spec-z compilation2024-03-31T22:10:01.314578Z
4242_3dhst_public_specz3DHST Public spec-zSpec-z CatalogLSST DP0HeloisaMengisztkiFalseSample containing the 3DHST spec-z data contained in the original file Public spec-z compilation.2024-03-27T23:20:29.545013Z
4141_deimos_10k_public_speczDEIMOS 10K Public spec-zSpec-z CatalogNoneluigilcsilvaFalseSample containing the DEIMOS 10K spec-z data contained in the original file Public spec-z compilation.2024-03-27T19:29:59.552926Z
3333_simple_pz_training_setSimple pz training setTraining SetLSST DP0GloriaFAFalseSimple training set produced by https://github.com/rubin-dp0/delegate-contributions-dp02/blob/main/photoz/Training_Set_Creation/simple_pz_training_set.ipynb, developed by Melissa Graham.2024-02-28T07:00:41.119818Z
2828_dc2_tiny_true_z_sampleDC2 Tiny true z sampleSpec-z CatalogNonegschwendFalseA small sample with 16917 redshifts retrieved from RSP cloud.2023-11-29T20:30:26.900286Z
2727_public_training_set_des_dr2Public Training Set DES DR2Training SetNonegschwendFalseResult of cross-matching the public spec-z compilation with DES DR2 coadd objects catalog.2023-10-17T21:32:21.727199Z
2626_public_specz_compilationPublic spec-z compilationSpec-z CatalogNonegschwendFalseA compilation of public spec-z catalogs collected over the years of operation of the Dark Energy Survey (DES) and systematically grouped by a DES Science Portal tool to form the basis of a training set for photo-z algorithms based on machine learning.2023-10-17T21:29:08.341090Z
1414_gama_specz_subsampleGAMA spec-z subsampleSpec-z CatalogNonegschwendFalseA small subsample of the GAMA DR3 spec-z catalog (Baldry et al. 2018) as an example of a typical spec-z catalog from the literature.2023-03-29T20:02:45.223568Z
1313_vvds_specz_subsampleVVDS spec-z subsampleSpec-z CatalogNonegschwendFalseA small subsample of the VVDS spec-z catalog (Le Fèvre et al. 2004, Garilli et al. 2008) as an example of a typical spec-z catalog from the literature.2023-03-29T19:50:27.593735Z
1212_goldenspike_knnGoldenspike KNNValidation ResultsNonegschwendFalseKNNResults of photoz validation using KNN on a mock test set from the example notebook goldenspike.ipynb available in RAIL's repository.2023-03-29T19:49:35.652295Z
1111_goldenspike_flexzboostGoldenspike FlexZBoostValidation ResultsNonegschwendFalseFlexZBoostResults of photoz validation using FlexZBoost on a mock test set from the example notebook goldenspike.ipynb available in RAIL's repository.2023-03-29T19:48:34.864629Z
1010_goldenspike_bpzGoldenspike BPZValidation ResultsLSST DP0gschwendFalseBPZResults of photoz validation using BPZ on a mock test set from the example notebook goldenspike.ipynb available in RAIL's repository.2023-03-29T19:42:04.424990Z
99_goldenspike_train_data_hdf5Goldenspike train data hdf5Training SetNonegschwendFalseA mock training set created using the example notebook goldenspike.ipynb available in RAIL's repository. + Test upload of files in hdf5 format.2023-03-29T19:12:59.746096Z
88_goldenspike_train_data_fitsGoldenspike train data fitsTraining SetNonegschwendFalseA mock training set created using the example notebook goldenspike.ipynb available in RAIL's repository. + Test upload of files in fits format.2023-03-29T19:09:12.958883Z
77_goldenspike_train_data_parquetGoldenspike train data parquetTraining SetNonegschwendFalseA mock training set created using the example notebook goldenspike.ipynb available in RAIL's repository. Test upload of files in parquet format.2023-03-29T19:06:58.473920Z
66_simple_training_setSimple training setTraining SetLSST DP0gschwendFalseA simple example training set created based on the Jupyter notebook simple_pz_training_set.ipynb created by Melissa Graham, available in the repository delegate-contributions-dp02. The file contains coordinates, redshifts, magnitudes, and errors, as an illustration of a typical training set for photo-z algorithms.2023-03-23T19:46:48.807872Z
11_simple_true_z_catalogSimple true z catalogSpec-z CatalogNonegschwendFalseA simple example of a spectroscopic (true) redshifts catalog created based on the Jupyter notebook simple_pz_training_set.ipynb created by Melissa Graham, available in the repository delegate-contributions-dp02. The file contains only coordinates and redshifts, as an illustration of a typical spec-z catalog.2023-03-23T13:19:32.050795Z
+ + + +The information about product type, users, and releases shown above can +be used to filter the data products of interest for your search. For +that, the methods ``display_products_list`` and ``get_products_list`` accept the argument ``filters``, a dictionary +mapping product attributes to their values. + +.. code:: ipython3 + + pz_server.display_products_list(filters={"release": "LSST DP0", + "product_type": "Training Set"}) + + + +.. raw:: html + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
idinternal_nameproduct_nameproduct_typereleaseuploaded_byofficial_productpz_codedescriptioncreated_at
3333_simple_pz_training_setSimple pz training setTraining SetLSST DP0GloriaFAFalseSimple training set produced by https://github.com/rubin-dp0/delegate-contributions-dp02/blob/main/photoz/Training_Set_Creation/simple_pz_training_set.ipynb, developed by Melissa Graham.2024-02-28T07:00:41.119818Z
66_simple_training_setSimple training setTraining SetLSST DP0gschwendFalseA simple example training set created based on the Jupyter notebook simple_pz_training_set.ipynb created by Melissa Graham, available in the repository delegate-contributions-dp02. The file contains coordinates, redshifts, magnitudes, and errors, as an illustration of a typical training set for photo-z algorithms.2023-03-23T19:46:48.807872Z
+ + + +It also works if we type a string pattern that is part of the value. For +instance, just “DP0” instead of “LSST DP0”: + +.. code:: ipython3 + + pz_server.display_products_list(filters={"release": "DP0"}) + + + +.. raw:: html + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
idinternal_nameproduct_nameproduct_typereleaseuploaded_byofficial_productpz_codedescriptioncreated_at
7272_pzcompute_results_for_qa_validationPZ-Compute Results for QA ValidationValidation ResultsLSST DP0HeloisaMengisztkiFalseThis zip contains two files: validation_set.hdf5 with the data input to run estimate, contains the redshift values so that it can be used as the truth file. And the validation_set_output.hdf5 is the output after running estimate, with the computed photoz for fzboost algorithm.2024-06-05T18:57:27.428106Z
4242_3dhst_public_specz3DHST Public spec-zSpec-z CatalogLSST DP0HeloisaMengisztkiFalseSample containing the 3DHST spec-z data contained in the original file Public spec-z compilation.2024-03-27T23:20:29.545013Z
3333_simple_pz_training_setSimple pz training setTraining SetLSST DP0GloriaFAFalseSimple training set produced by https://github.com/rubin-dp0/delegate-contributions-dp02/blob/main/photoz/Training_Set_Creation/simple_pz_training_set.ipynb, developed by Melissa Graham.2024-02-28T07:00:41.119818Z
1010_goldenspike_bpzGoldenspike BPZValidation ResultsLSST DP0gschwendFalseBPZResults of photoz validation using BPZ on a mock test set from the example notebook goldenspike.ipynb available in RAIL's repository.2023-03-29T19:42:04.424990Z
66_simple_training_setSimple training setTraining SetLSST DP0gschwendFalseA simple example training set created based on the Jupyter notebook simple_pz_training_set.ipynb created by Melissa Graham, available in the repository delegate-contributions-dp02. The file contains coordinates, redshifts, magnitudes, and errors, as an illustration of a typical training set for photo-z algorithms.2023-03-23T19:46:48.807872Z
+ + + +It also allows the search for multiple strings by adding the suffix +``__or`` (two underscores + “or”) to the search key. For instance, to +get spec-z catalogs and training sets in the same search (notice that +filtering is not case sensitive): + +.. code:: ipython3 + + pz_server.display_products_list(filters={"product_type__or": ["Spec-z Catalog", "training set"]}) + + + +.. raw:: html
idinternal_nameproduct_nameproduct_typereleaseuploaded_byofficial_productpz_codedescriptioncreated_at
8181_singo999singo999Training SetNonecrisingulaniFalseNoneNone2024-07-04T16:08:23.116849Z
8080_example_upload_via_libexample upload via libSpec-z CatalogNonegschwendFalseNoneNone2024-07-04T15:58:58.075926Z
7979_example_upload_via_libexample upload via libSpec-z CatalogNonegschwendFalseNoneNone2024-07-04T15:57:17.809438Z
7878_example_upload_via_libexample upload via libSpec-z CatalogNonegschwendFalseNoneNone2024-07-04T15:40:29.732156Z
7777_upload_example_1upload example 1Spec-z CatalogNonegschwendFalseNoneNone2024-07-04T15:37:29.099822Z
7676_upload_example_1upload example 1Spec-z CatalogNonegschwendFalseNoneNone2024-07-04T15:18:43.371144Z
7575_upload_example_1upload example 1Spec-z CatalogNonegschwendFalseNoneNone2024-06-17T19:36:50.416031Z
6464_training_set_lsst_dp02Training set LSST DP0.2Training SetNoneandreiadouradoFalseSimulated training set from DC2 TruthSummary table. Random data (random_data.hdf5): table with the true magnitudes used to create the simulated set.2024-05-21T14:42:47.340619Z
6363_specz_sampleSpec-z sample LSST DP0.2Spec-z CatalogNoneandreiadouradoFalseSpec-z sample created from a random fraction of object Ids from Object Table.2024-05-21T13:36:19.884481Z
5252_2dflens_public_specz2dFLenS Public spec-zSpec-z CatalogNonesaravizFalseSample containing the 2dFLenS spec-z data contained in the original file Public spec-z compilation2024-04-08T14:21:46.577298Z
5151_zcosmos_public_speczzCOSMOS Public spec-zSpec-z CatalogNonesaravizFalseSample containing the zCOSMOS spec-z data contained in the original file Public spec-z compilation2024-04-07T23:06:40.185605Z
5050_vipers_public_speczVIPERS Public spec-zSpec-z CatalogNonesaravizFalseSample containing the VIPERS spec-z data contained in the original file Public spec-z compilation2024-04-07T23:05:10.825559Z
4949_sdss_dr16_public_speczSDSS (DR16) Public spec-zSpec-z CatalogNonesaravizFalseSample containing the SDSS spec-z data contained in the original file Public spec-z compilation2024-04-07T20:14:23.831347Z
4848_saga_public_speczSAGA Public spec-zSpec-z CatalogNonesaravizFalseSample containing the SAGA spec-z data contained in the original file Public spec-z compilation2024-04-07T19:39:01.003263Z
4747_glass_public_speczGLASS Public spec-zSpec-z CatalogNonesaravizFalseSample containing the GLASS spec-z data contained in the original file Public spec-z compilation.2024-04-07T19:20:41.913016Z
4545_gama_public_speczGAMA Public spec-zSpec-z CatalogNonesaravizFalseSample containing the GAMA spec-z data contained in the original file Public spec-z compilation2024-04-03T10:19:00.379907Z
4444_deep2_public_speczDEEP2 Public spec-zSpec-z CatalogNonesaravizFalseSample containing the DEEP2 spec-z data contained in the original file Public spec-z compilation2024-03-31T22:10:01.314578Z
4242_3dhst_public_specz3DHST Public spec-zSpec-z CatalogLSST DP0HeloisaMengisztkiFalseSample containing the 3DHST spec-z data contained in the original file Public spec-z compilation.2024-03-27T23:20:29.545013Z
4141_deimos_10k_public_speczDEIMOS 10K Public spec-zSpec-z CatalogNoneluigilcsilvaFalseSample containing the DEIMOS 10K spec-z data contained in the original file Public spec-z compilation.2024-03-27T19:29:59.552926Z
3333_simple_pz_training_setSimple pz training setTraining SetLSST DP0GloriaFAFalseSimple training set produced by https://github.com/rubin-dp0/delegate-contributions-dp02/blob/main/photoz/Training_Set_Creation/simple_pz_training_set.ipynb, developed by Melissa Graham.2024-02-28T07:00:41.119818Z
2828_dc2_tiny_true_z_sampleDC2 Tiny true z sampleSpec-z CatalogNonegschwendFalseA small sample with 16917 redshifts retrieved from RSP cloud.2023-11-29T20:30:26.900286Z
2727_public_training_set_des_dr2Public Training Set DES DR2Training SetNonegschwendFalseResult of cross-matching the public spec-z compilation with DES DR2 coadd objects catalog.2023-10-17T21:32:21.727199Z
2626_public_specz_compilationPublic spec-z compilationSpec-z CatalogNonegschwendFalseA compilation of public spec-z catalogs collected over the years of operation of the Dark Energy Survey (DES) and systematically grouped by a DES Science Portal tool to form the basis of a training set for photo-z algorithms based on machine learning.2023-10-17T21:29:08.341090Z
1414_gama_specz_subsampleGAMA spec-z subsampleSpec-z CatalogNonegschwendFalseA small subsample of the GAMA DR3 spec-z catalog (Baldry et al. 2018) as an example of a typical spec-z catalog from the literature.2023-03-29T20:02:45.223568Z
1313_vvds_specz_subsampleVVDS spec-z subsampleSpec-z CatalogNonegschwendFalseA small subsample of the VVDS spec-z catalog (Le Fèvre et al. 2004, Garilli et al. 2008) as an example of a typical spec-z catalog from the literature.2023-03-29T19:50:27.593735Z
99_goldenspike_train_data_hdf5Goldenspike train data hdf5Training SetNonegschwendFalseA mock training set created using the example notebook goldenspike.ipynb available in RAIL's repository. + Test upload of files in hdf5 format.2023-03-29T19:12:59.746096Z
88_goldenspike_train_data_fitsGoldenspike train data fitsTraining SetNonegschwendFalseA mock training set created using the example notebook goldenspike.ipynb available in RAIL's repository. + Test upload of files in fits format.2023-03-29T19:09:12.958883Z
77_goldenspike_train_data_parquetGoldenspike train data parquetTraining SetNonegschwendFalseA mock training set created using the example notebook goldenspike.ipynb available in RAIL's repository. Test upload of files in parquet format.2023-03-29T19:06:58.473920Z
66_simple_training_setSimple training setTraining SetLSST DP0gschwendFalseA simple example training set created based on the Jupyter notebook simple_pz_training_set.ipynb created by Melissa Graham, available in the repository delegate-contributions-dp02. The file contains coordinates, redshifts, magnitudes, and errors, as an illustration of a typical training set for photo-z algorithms.2023-03-23T19:46:48.807872Z
11_simple_true_z_catalogSimple true z catalogSpec-z CatalogNonegschwendFalseA simple example of a spectroscopic (true) redshifts catalog created based on the Jupyter notebook simple_pz_training_set.ipynb created by Melissa Graham, available in the repository delegate-contributions-dp02. The file contains only coordinates and redshifts, as an illustration of a typical spec-z catalog.2023-03-23T13:19:32.050795Z
+ + + +To fetch the results of a search and assign them to a variable, just +replace the prefix ``display_`` with ``get_``, like this: + +.. code:: ipython3 + + search_results = pz_server.get_products_list(filters={"product_type": "results"}) # PZ Validation results + search_results + + + + +.. parsed-literal:: + + [{'id': 73, + 'release': None, + 'release_name': None, + 'product_type': 3, + 'product_type_name': 'Validation Results', + 'uploaded_by': 'andreiadourado', + 'is_owner': False, + 'can_delete': False, + 'can_update': False, + 'internal_name': '73_tpz_results', + 'display_name': 'TPZ Results', + 'official_product': False, + 'pz_code': '', + 'description': 'Results of photoz validation using TPZ lite on simulated training set from DC2 TruthSummary table. Files: 03_run_tpz. html -> jupyter notebook (HTML version) with algorithm train; 04_metrics.html -> jupyter notebook (HTML version) with results analysis; model.pkl -> model generated in the "inform method"; output.hdf5 -> "estimate stage" results (PDFs).', + 'created_at': '2024-06-06T23:20:04.439030Z', + 'updated_at': '2024-06-13T16:25:49.052608Z', + 'status': 1}, + {'id': 72, + 'release': 1, + 'release_name': 'LSST DP0', + 'product_type': 3, + 'product_type_name': 'Validation Results', + 'uploaded_by': 'HeloisaMengisztki', + 'is_owner': False, + 'can_delete': False, + 'can_update': False, + 'internal_name': '72_pzcompute_results_for_qa_validation', + 'display_name': 'PZ-Compute Results for QA Validation', + 'official_product': False, + 'pz_code': '', + 'description': 'This zip contains two files: validation_set.hdf5 with the data input to run estimate, contains the redshift values so that it can be used as the truth file. 
And the validation_set_output.hdf5 is the output after running estimate, with the computed photoz for fzboost algorithm.', + 'created_at': '2024-06-05T18:57:27.428106Z', + 'updated_at': '2024-06-06T16:22:33.231383Z', + 'status': 1}, + {'id': 12, + 'release': None, + 'release_name': None, + 'product_type': 3, + 'product_type_name': 'Validation Results', + 'uploaded_by': 'gschwend', + 'is_owner': True, + 'can_delete': True, + 'can_update': True, + 'internal_name': '12_goldenspike_knn', + 'display_name': 'Goldenspike KNN', + 'official_product': False, + 'pz_code': 'KNN', + 'description': "Results of photoz validation using KNN on a mock test set from the example notebook goldenspike.ipynb available in RAIL's repository.", + 'created_at': '2023-03-29T19:49:35.652295Z', + 'updated_at': '2023-12-18T16:46:36.422893Z', + 'status': 1}, + {'id': 11, + 'release': None, + 'release_name': None, + 'product_type': 3, + 'product_type_name': 'Validation Results', + 'uploaded_by': 'gschwend', + 'is_owner': True, + 'can_delete': True, + 'can_update': True, + 'internal_name': '11_goldenspike_flexzboost', + 'display_name': 'Goldenspike FlexZBoost', + 'official_product': False, + 'pz_code': 'FlexZBoost', + 'description': "Results of photoz validation using FlexZBoost on a mock test set from the example notebook goldenspike.ipynb available in RAIL's repository.", + 'created_at': '2023-03-29T19:48:34.864629Z', + 'updated_at': '2023-12-18T16:46:36.422893Z', + 'status': 1}, + {'id': 10, + 'release': 1, + 'release_name': 'LSST DP0', + 'product_type': 3, + 'product_type_name': 'Validation Results', + 'uploaded_by': 'gschwend', + 'is_owner': True, + 'can_delete': True, + 'can_update': True, + 'internal_name': '10_goldenspike_bpz', + 'display_name': 'Goldenspike BPZ', + 'official_product': False, + 'pz_code': 'BPZ', + 'description': "Results of photoz validation using BPZ on a mock test set from the example notebook goldenspike.ipynb available in RAIL's repository.", + 'created_at': 
'2023-03-29T19:42:04.424990Z', + 'updated_at': '2023-12-18T16:46:36.422893Z', + 'status': 1}] + + + +How to upload a data product via Python API (alternative method) +---------------------------------------------------------------- + +The default method to upload a data product to the PZ Server is the +upload tool on the PZ Server website, as shown above. Alternatively, data +products can be sent to the host service using the ``pzserver`` Python +library. + +First, prepare a dictionary with the relevant information about your +data product: + +.. code:: ipython3 + + data_to_upload = { + "name":"example upload via lib", + "product_type": "specz_catalog", # Product type + "release": None, # LSST release, use None if not LSST data + "main_file": "example.csv", # full path + "auxiliary_files": ["example.html", "example.ipynb"] # full path + } + +.. code:: ipython3 + + upload = pz_server.upload(**data_to_upload) + +.. code:: ipython3 + + upload.product_id + + + + +.. parsed-literal:: + + 82 + + + +How to display the metadata of a data product +--------------------------------------------- + +The metadata of a given data product is the information provided by the +user on the upload form. This information is attached to the data +product contents and can be consulted on the PZ Server page +or using this Python API (``pzserver``). + +All data products stored on the PZ Server are identified by a unique **id** +number or a unique name, a *string* called **internal_name**, which is +created automatically at the moment of the upload by concatenating the +product **id** to the name given by its owner (replacing blank spaces with +“\_“, converting to lowercase, and removing special characters). + +The ``PzServer``\ ’s method ``get_product_metadata()`` returns a +dictionary with the attributes stored in the PZ Server about a given data +product identified by its **id** or **internal_name**. 
For use in a +Jupyter notebook, the equivalent ``display_product_metadata()`` shows +the results in a formatted table. + +.. code:: ipython3 + + # pz_server.display_product_metadata() + # pz_server.display_product_metadata(6) + # pz_server.display_product_metadata("6") + pz_server.display_product_metadata("6_simple_training_set") + + + +.. raw:: html + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
keyvalue
id6
releaseLSST DP0
product_typeTraining Set
uploaded_bygschwend
internal_name6_simple_training_set
product_nameSimple training set
official_productFalse
pz_code
descriptionA simple example training set created based on the Jupyter notebook simple_pz_training_set.ipynb created by Melissa Graham, available in the repository delegate-contributions-dp02. The file contains coordinates, redshifts, magnitudes, and errors, as an illustration of a typical training set for photo-z algorithms.
created_at2023-03-23T19:46:48.807872Z
main_filesimple_pz_training_set.csv
+ + + +.. container:: + :name: how-to-download-data-products-as-zip-files + + `back to the top <#notebook-contents>`__ + +How to download data products as .zip files +------------------------------------------- + +To download any data product stored in the PZ Server, use the +``PzServer``\ ’s method ``download_product``, passing the product’s +**id** or **internal_name** and the path where it will be saved (the default is +the current folder). This method downloads a compressed .zip file that +contains all the files uploaded by the user, including data, ancillary +files, and description files. The download time +depends on the internet connection between the user and the host. Let’s +try it with a small data product. + +.. code:: ipython3 + + pz_server.download_product(14, save_in=".") + + +.. parsed-literal:: + + Connecting to PZ Server... + File saved as: ./14_gama_specz_subsample_d7738.zip + Done! + + +.. container:: + :name: how-to-share-data-products-with-other-rsp-users + + `back to the top <#notebook-contents>`__ + +How to share data products with other RSP users +----------------------------------------------- + +All data products uploaded to the PZ Server are immediately available and +visible to all PZ Server users (people with RSP credentials) through the +PZ Server website or via the API. Besides sharing the product **id** +or **internal_name** for programmatic access, another way to share a data +product is to provide the product’s URL, which leads to the product’s +download page. The URL is composed of the PZ Server website address + +**/product/** + **internal_name**: + +https://pz-server.linea.org.br/product/ + **internal_name** + +or, if still in the development phase, + +https://pz-server-dev.linea.org.br/product/ + **internal_name** + +For example: + +https://pz-server-dev.linea.org.br/product/6_simple_training_set + +WARNING: The URL works only with the **complete internal name**, not +with just the **id** number. + +.. 
container:: + :name: how-to-retrieve-contents-of-data-products-work-on-memory + + `back to the top <#notebook-contents>`__ + +How to retrieve contents of data products (work on memory) +---------------------------------------------------------- + +The PZ Server API also lets users load the contents of a given data +product into memory by assigning the result of the method +``get_product()`` to a variable in the code. This +feature is available only for tabular data (product types: **Spec-z +Catalog** and **Training Set**). + +By default, the method ``get_product`` returns an object of a +particular class, depending on the product’s type. The classes +``SpeczCatalog`` and ``TrainingSet`` are thin wrappers around +``pandas.DataFrame`` (via class composition) with a couple of additional +attributes and methods, such as the attribute ``metadata`` and the +method ``display_metadata()``. Let’s see an example: + +.. code:: ipython3 + + catalog = pz_server.get_product(8) + catalog + + +.. parsed-literal:: + + Connecting to PZ Server... + Done! + + + + +.. parsed-literal:: + + + + + +.. code:: ipython3 + + catalog.display_metadata() + + + +.. raw:: html + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
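+    <table>
+      <tr><th>key</th><th>value</th></tr>
+      <tr><td>id</td><td>8</td></tr>
+      <tr><td>release</td><td>None</td></tr>
+      <tr><td>product_type</td><td>Training Set</td></tr>
+      <tr><td>uploaded_by</td><td>gschwend</td></tr>
+      <tr><td>internal_name</td><td>8_goldenspike_train_data_fits</td></tr>
+      <tr><td>product_name</td><td>Goldenspike train data fits</td></tr>
+      <tr><td>official_product</td><td>False</td></tr>
+      <tr><td>pz_code</td><td></td></tr>
+      <tr><td>description</td><td>A mock training set created using the example notebook goldenspike.ipynb available in RAIL's repository. Test upload of files in fits format.</td></tr>
+      <tr><td>created_at</td><td>2023-03-29T19:09:12.958883Z</td></tr>
+      <tr><td>main_file</td><td>goldenspike_train_data.fits</td></tr>
+    </table>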
+ + + +The tabular data is stored in the attribute ``data``, which is a +``pandas.DataFrame``. + +.. code:: ipython3 + + catalog.data + + + + +.. raw:: html + +
+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
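+    <table>
+      <tr><th></th><th>redshift</th><th>mag_u_lsst</th><th>mag_err_u_lsst</th><th>mag_g_lsst</th><th>mag_err_g_lsst</th><th>mag_r_lsst</th><th>mag_err_r_lsst</th><th>mag_i_lsst</th><th>mag_err_i_lsst</th><th>mag_z_lsst</th><th>mag_err_z_lsst</th><th>mag_y_lsst</th><th>mag_err_y_lsst</th></tr>
+      <tr><th>0</th><td>0.769521</td><td>26.496852</td><td>0.288986</td><td>25.863170</td><td>0.056997</td><td>24.729555</td><td>0.020702</td><td>23.610683</td><td>0.012011</td><td>23.143518</td><td>0.013714</td><td>22.915156</td><td>0.024561</td></tr>
+      <tr><th>1</th><td>1.088857</td><td>26.258727</td><td>0.237964</td><td>25.509524</td><td>0.041668</td><td>24.469344</td><td>0.016648</td><td>23.532860</td><td>0.011344</td><td>22.546680</td><td>0.008992</td><td>22.070255</td><td>0.012282</td></tr>
+      <tr><th>2</th><td>1.333098</td><td>25.373855</td><td>0.112257</td><td>24.943293</td><td>0.025359</td><td>24.524998</td><td>0.017431</td><td>24.013649</td><td>0.016486</td><td>23.733274</td><td>0.022315</td><td>23.102123</td><td>0.028906</td></tr>
+      <tr><th>...</th><td>...</td><td>...</td><td>...</td><td>...</td><td>...</td><td>...</td><td>...</td><td>...</td><td>...</td><td>...</td><td>...</td><td>...</td><td>...</td></tr>
+      <tr><th>59</th><td>0.986374</td><td>26.050653</td><td>0.200164</td><td>25.641624</td><td>0.046837</td><td>25.161078</td><td>0.030090</td><td>24.460152</td><td>0.024047</td><td>23.977239</td><td>0.027567</td><td>23.831974</td><td>0.055121</td></tr>
+      <tr><th>60</th><td>0.474281</td><td>27.048056</td><td>0.444683</td><td>26.428211</td><td>0.093854</td><td>24.839984</td><td>0.022755</td><td>24.209226</td><td>0.019403</td><td>23.855082</td><td>0.024787</td><td>23.507456</td><td>0.041329</td></tr>
+      <tr><th>61</th><td>0.561923</td><td>24.680480</td><td>0.061182</td><td>23.958609</td><td>0.011430</td><td>22.900135</td><td>0.006346</td><td>22.143581</td><td>0.005820</td><td>21.867563</td><td>0.006465</td><td>21.612692</td><td>0.008967</td></tr>
+    </table>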
+

62 rows × 13 columns

+
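Because ``catalog.data`` is an ordinary ``pandas.DataFrame``, derived quantities such as colors can be computed with plain column arithmetic. The sketch below uses a small hand-made DataFrame in place of ``catalog.data``, with the g and r magnitudes copied from the first three rows of the table above. The column name ``color_g_r`` and the 1.0 mag threshold are illustrative choices, not something the PZ Server defines.

```python
import pandas as pd

# Stand-in for catalog.data: first three rows of the table above (g and r bands only).
data = pd.DataFrame(
    {
        "redshift": [0.769521, 1.088857, 1.333098],
        "mag_g_lsst": [25.863170, 25.509524, 24.943293],
        "mag_r_lsst": [24.729555, 24.469344, 24.524998],
    }
)

# A color is the difference between the magnitudes in two bands.
data["color_g_r"] = data["mag_g_lsst"] - data["mag_r_lsst"]

# Select objects redder than an arbitrary threshold (illustration only).
red = data[data["color_g_r"] > 1.0]
print(red[["redshift", "color_g_r"]])
```

With a real product, the same two lines of arithmetic and selection would be applied to ``catalog.data`` directly.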
+ + + +.. code:: ipython3 + + type(catalog.data) + + + + +.. parsed-literal:: + + pandas.core.frame.DataFrame + + + +It preserves the useful methods from ``pandas.DataFrame``, such as: + +.. code:: ipython3 + + catalog.data.info() + + +.. parsed-literal:: + + + RangeIndex: 62 entries, 0 to 61 + Data columns (total 13 columns): + # Column Non-Null Count Dtype + --- ------ -------------- ----- + 0 redshift 62 non-null float64 + 1 mag_u_lsst 61 non-null float64 + 2 mag_err_u_lsst 61 non-null float64 + 3 mag_g_lsst 62 non-null float64 + 4 mag_err_g_lsst 62 non-null float64 + 5 mag_r_lsst 62 non-null float64 + 6 mag_err_r_lsst 62 non-null float64 + 7 mag_i_lsst 62 non-null float64 + 8 mag_err_i_lsst 62 non-null float64 + 9 mag_z_lsst 62 non-null float64 + 10 mag_err_z_lsst 62 non-null float64 + 11 mag_y_lsst 61 non-null float64 + 12 mag_err_y_lsst 61 non-null float64 + dtypes: float64(13) + memory usage: 6.4 KB + + +.. code:: ipython3 + + catalog.data.describe() + + + + +.. raw:: html + +
+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
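+    <table>
+      <tr><th></th><th>redshift</th><th>mag_u_lsst</th><th>mag_err_u_lsst</th><th>mag_g_lsst</th><th>mag_err_g_lsst</th><th>mag_r_lsst</th><th>mag_err_r_lsst</th><th>mag_i_lsst</th><th>mag_err_i_lsst</th><th>mag_z_lsst</th><th>mag_err_z_lsst</th><th>mag_y_lsst</th><th>mag_err_y_lsst</th></tr>
+      <tr><th>count</th><td>62.000000</td><td>61.000000</td><td>61.000000</td><td>62.000000</td><td>62.000000</td><td>62.000000</td><td>62.000000</td><td>62.000000</td><td>62.000000</td><td>62.000000</td><td>62.000000</td><td>61.000000</td><td>61.000000</td></tr>
+      <tr><th>mean</th><td>0.780298</td><td>25.446008</td><td>0.188050</td><td>24.820000</td><td>0.038182</td><td>24.003970</td><td>0.018770</td><td>23.384804</td><td>0.016165</td><td>23.074481</td><td>0.021478</td><td>22.932354</td><td>0.054682</td></tr>
+      <tr><th>std</th><td>0.355365</td><td>1.269277</td><td>0.193747</td><td>1.314112</td><td>0.036398</td><td>1.387358</td><td>0.013750</td><td>1.381587</td><td>0.010069</td><td>1.400673</td><td>0.014961</td><td>1.540284</td><td>0.115875</td></tr>
+      <tr><th>...</th><td>...</td><td>...</td><td>...</td><td>...</td><td>...</td><td>...</td><td>...</td><td>...</td><td>...</td><td>...</td><td>...</td><td>...</td><td>...</td></tr>
+      <tr><th>50%</th><td>0.764600</td><td>25.577029</td><td>0.133815</td><td>25.069970</td><td>0.028309</td><td>24.470215</td><td>0.016660</td><td>23.748506</td><td>0.013390</td><td>23.514185</td><td>0.018540</td><td>23.293384</td><td>0.034199</td></tr>
+      <tr><th>75%</th><td>0.948494</td><td>26.263284</td><td>0.238859</td><td>25.705486</td><td>0.049576</td><td>24.985225</td><td>0.025802</td><td>24.488654</td><td>0.024650</td><td>24.165944</td><td>0.032557</td><td>23.993010</td><td>0.063585</td></tr>
+      <tr><th>max</th><td>1.755764</td><td>28.482391</td><td>1.154073</td><td>27.296152</td><td>0.198195</td><td>26.036958</td><td>0.065360</td><td>24.949645</td><td>0.036932</td><td>24.693132</td><td>0.051883</td><td>27.342151</td><td>0.909230</td></tr>
+    </table>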
+

8 rows × 13 columns

+
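The ``info()`` and ``describe()`` outputs above report 61 non-null values in the u and y bands against 62 rows, so some magnitudes are missing. The following is a minimal sketch of how such rows can be located and dropped with standard pandas calls. It uses a toy DataFrame with made-up values as a stand-in for ``catalog.data``.

```python
import numpy as np
import pandas as pd

# Toy stand-in for catalog.data (made-up values), with one missing u-band
# magnitude and one missing y-band magnitude.
data = pd.DataFrame(
    {
        "redshift": [0.77, 0.98, 0.47, 0.56],
        "mag_u_lsst": [26.50, np.nan, 27.05, 24.68],
        "mag_y_lsst": [22.92, 23.78, np.nan, 21.61],
    }
)

# Count missing values per column.
n_missing = data.isna().sum()
print(n_missing)

# Keep only the rows where every column has a value.
complete = data.dropna()
print(len(data), "rows before,", len(complete), "rows after")
```

Whether to drop incomplete rows or keep them (some photo-z codes handle non-detections explicitly) depends on the algorithm being trained, so this is only one possible treatment.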
+ + + +See the product-type sections of the tutorials for more details about +these specific classes. For those who prefer working with +``astropy.table.Table`` or pure +``pandas.DataFrame``, the method ``get_product()`` gives the flexibility +to choose the output format (``fmt="pandas"`` or ``fmt="astropy"``). + +.. code:: ipython3 + + dataframe = pz_server.get_product(8, fmt="pandas") + print(type(dataframe)) + dataframe + + +.. parsed-literal:: + + Connecting to PZ Server... + + + + + +.. raw:: html + +
+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
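+    <table>
+      <tr><th></th><th>redshift</th><th>mag_u_lsst</th><th>mag_err_u_lsst</th><th>mag_g_lsst</th><th>mag_err_g_lsst</th><th>mag_r_lsst</th><th>mag_err_r_lsst</th><th>mag_i_lsst</th><th>mag_err_i_lsst</th><th>mag_z_lsst</th><th>mag_err_z_lsst</th><th>mag_y_lsst</th><th>mag_err_y_lsst</th></tr>
+      <tr><th>0</th><td>0.769521</td><td>26.496852</td><td>0.288986</td><td>25.863170</td><td>0.056997</td><td>24.729555</td><td>0.020702</td><td>23.610683</td><td>0.012011</td><td>23.143518</td><td>0.013714</td><td>22.915156</td><td>0.024561</td></tr>
+      <tr><th>1</th><td>1.088857</td><td>26.258727</td><td>0.237964</td><td>25.509524</td><td>0.041668</td><td>24.469344</td><td>0.016648</td><td>23.532860</td><td>0.011344</td><td>22.546680</td><td>0.008992</td><td>22.070255</td><td>0.012282</td></tr>
+      <tr><th>2</th><td>1.333098</td><td>25.373855</td><td>0.112257</td><td>24.943293</td><td>0.025359</td><td>24.524998</td><td>0.017431</td><td>24.013649</td><td>0.016486</td><td>23.733274</td><td>0.022315</td><td>23.102123</td><td>0.028906</td></tr>
+      <tr><th>...</th><td>...</td><td>...</td><td>...</td><td>...</td><td>...</td><td>...</td><td>...</td><td>...</td><td>...</td><td>...</td><td>...</td><td>...</td><td>...</td></tr>
+      <tr><th>59</th><td>0.986374</td><td>26.050653</td><td>0.200164</td><td>25.641624</td><td>0.046837</td><td>25.161078</td><td>0.030090</td><td>24.460152</td><td>0.024047</td><td>23.977239</td><td>0.027567</td><td>23.831974</td><td>0.055121</td></tr>
+      <tr><th>60</th><td>0.474281</td><td>27.048056</td><td>0.444683</td><td>26.428211</td><td>0.093854</td><td>24.839984</td><td>0.022755</td><td>24.209226</td><td>0.019403</td><td>23.855082</td><td>0.024787</td><td>23.507456</td><td>0.041329</td></tr>
+      <tr><th>61</th><td>0.561923</td><td>24.680480</td><td>0.061182</td><td>23.958609</td><td>0.011430</td><td>22.900135</td><td>0.006346</td><td>22.143581</td><td>0.005820</td><td>21.867563</td><td>0.006465</td><td>21.612692</td><td>0.008967</td></tr>
+    </table>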
+

62 rows × 13 columns

+
+ + + +.. code:: ipython3 + + table = pz_server.get_product(8, fmt="astropy") + print(type(table)) + table + + +.. parsed-literal:: + + Connecting to PZ Server... + + + + + +.. raw:: html + +
Table length=62 + + + + + + + + + + + + + + + + + + + + + + + +
redshiftmag_u_lsstmag_err_u_lsstmag_g_lsstmag_err_g_lsstmag_r_lsstmag_err_r_lsstmag_i_lsstmag_err_i_lsstmag_z_lsstmag_err_z_lsstmag_y_lsstmag_err_y_lsst
float64float64float64float64float64float64float64float64float64float64float64float64float64
0.769521057605743426.496851733359980.2889864016451496625.8631701801485930.056996849251325224.729555232665350.02070246989947576223.6106832612475230.01201139145700786723.143517979331420.01371427288818984422.9151560685081040.02456124411372624
1.088857054710388226.258726903647150.2379635474665983725.509524228603690.04166792240955244424.469344487165970.01664762131418696323.5328599838842970.01134372952245139122.5466795031786620.00899216749772303922.0702554732436660.01228199507795122
1.333097815513610825.3738551394507040.1122566959777225624.943293290995960.02535893280119127424.524997784555430.0174305351556827724.013648955119970.01648631007044298223.733274349215570.02231531417162041523.1021233624494760.028905678864388565
0.72126543521881125.994096311189090.190882668154784625.617772381977740.04585803883130551425.0057857477996420.02626806353391610424.3712855019873050.02227392052069140624.2216706781082040.03416701480043831524.0658108028302560.06782119456710035
0.508699238300323523.455644928497680.02118364967662736422.1549834614831620.00545619765168589620.8542210729006750.00505430534507590620.251517785749910.00504324008960251219.9879329924582550.00507670072076654719.75317944869660.00522219293990742
1.65459752082824725.5770293127979930.133814541165693825.3571902346599920.03642206846621718524.9853640973173760.02580530557653081724.6198659479305140.02762942851550346424.315427059847160.0371171910364455923.993010472490680.06358486650868432
0.630211710929870626.294569730983540.2450898080861663425.6619607423793780.0476886775022480124.9703507885164240.02547068619454275624.443595409788620.02370506584090058324.382525689614610.0393880559359265524.274314551454060.08154556925795062
0.944600462913513223.044391449579790.0152005556449064822.8598270289764150.00635841610055997822.3920310802753240.00559849714596083821.7631717128332550.00544387100243714921.3156064883045420.00561125690062150221.0788536978553220.006819135385098381
0.78505992889404326.103505068889410.2092094615519172225.6405705624662370.04679350252717620625.2245729107701920.03181687681782085424.5707148149561050.02646946352912577724.442766763258990.04154746484476924.363597198979420.08821825914170406
.......................................
1.346864342689514225.7990566691117140.1618358280018550325.1727477988701940.03096962874499195324.731768967630090.02074152874695367524.245889724632090.02001374539849744623.873261845298210.02518113625053323623.3017518833514930.034452507469982616
0.949792087078094523.698825570527960.02600386075499280523.420784681996990.00811079505924310722.7108851424223380.00599830947764476621.8968588151091620.00555089139182906321.273081732643840.00557091556571719321.0407246844617970.006716225485067447
0.740395247936248825.4430204282417660.1191933891565676324.9608997858341370.02574836484474985624.377556409002050.01544710496912196823.649963692291250.01237012405335630323.5513875922433830.0191178956640090623.4470210254503750.03917395554632945
0.909494757652282723.475399963653020.02153524007063957823.3424579781672250.007781748463140330622.7617712265424430.006082322317851221.9445992106821280.005595036341919999521.4451333732897640.00575226659995192721.2300351919156720.007284651334311665
0.973186552524566724.9153227926672470.0752057326533048224.4606451278066070.01686522837711815623.6600061666947960.00914992597156648822.8933591885015770.0076167823067076922.410792868885050.0082969741205060127.342150888785180.9092302567884765
0.609932243824005124.60127263188670.0570645984782256723.093594024040720.00693258721546263621.696081313083990.00519564520756141520.695570374958980.00508274447208671120.387208468976390.00514013802410804720.127856340119390.00540364900110484
0.9768770933151245----26.846220736674640.1350618516521750725.70928911942380.0488676683278345724.8665762050749420.0343179145306803724.0488701229033670.02934960146430494823.784055887582750.052825335046252773
0.986374497413635326.0506532837846230.2001642698007547225.6416240096463350.0468371901521604125.1610781812188340.03008953675706706224.4601524141291370.02404672900351872323.9772390036210.02756678105109978323.8319736186345280.05512066706093889
0.474280714988708527.0480560874079860.444682506357735426.4282112805191750.0938543394596348124.839983603182140.0227549353128951224.20922601749360.0194026127508123923.8550822431599340.0247873017109994123.5074559295742880.041328512368478044
0.561922669410705624.6804795305431630.06118153192966563323.9586089979737020.0114295663681752622.9001349679331020.00634586977358199822.1435806332706240.00581963097081042821.8675628493294060.00646548086334226921.612691594536260.008966510628950788
+ + + +Specific features for each product type +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +Please take a look at the other tutorial notebooks with particular +examples of how to use the ``pzserver`` library to access and manipulate +data from the PZ Server. + +-------------- + +User feedback +~~~~~~~~~~~~~ + +Is something important missing? `Click here to open an issue in the PZ +Server library repository on +GitHub `__. diff --git a/docs/notebooks/rst/1_specz_catalogs.rst b/docs/notebooks/rst/1_specz_catalogs.rst new file mode 100644 index 0000000..a43b793 --- /dev/null +++ b/docs/notebooks/rst/1_specz_catalogs.rst @@ -0,0 +1,355 @@ +Photo-z Server +============== + +Tutorial Notebook 1 - Spec-z Catalogs +------------------------------------- + +Contact author: `Julia Gschwend `__ + +Last verified run: **2024-Jul-04** + +Introduction +~~~~~~~~~~~~ + +Welcome to the PZ Server tutorials. If you are reading these notebooks +for the first time, we recommend starting with the introduction notebook, +``0_introduction.ipynb``, also available in this repository. + +Imports and Setup +~~~~~~~~~~~~~~~~~ + +.. code:: ipython3 + + from pzserver import PzServer + import matplotlib.pyplot as plt + %reload_ext autoreload + %autoreload 2 + +.. code:: ipython3 + + # pz_server = PzServer(token="", host="pz-dev") # "pz-dev" is the temporary host for test phase + +For convenience, the token can be saved into a file named +``token.txt`` (which is already listed in the .gitignore file in this +repository). + +.. code:: ipython3 + + with open('token.txt', 'r') as file: + token = file.read() + pz_server = PzServer(token=token, host="pz-dev") # "pz-dev" is the temporary host for test phase + +Product types +~~~~~~~~~~~~~ + +The PZ Server API provides Python classes with useful methods to handle +particular product types. Let’s recap the product types available: + +.. code:: ipython3 + + pz_server.display_product_types() + + + +.. 
raw:: html + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
+    <table>
+      <tr><th>Product Type</th><th>product_type</th><th>Description</th></tr>
+      <tr><td>Spec-z Catalog</td><td>specz_catalog</td><td>Catalog of spectroscopic redshifts and positions (usually equatorial coordinates).</td></tr>
+      <tr><td>Training Set</td><td>training_set</td><td>Training set for photo-z algorithms (tabular data). It usually contains magnitudes, errors, and true redshifts.</td></tr>
+      <tr><td>Validation Results</td><td>validation_results</td><td>Results of a photo-z validation procedure (free format). Usually contains photo-z estimates (single estimates and/or pdf) of a validation set and photo-z validation metrics.</td></tr>
+      <tr><td>Photo-z Table</td><td>photoz_table</td><td>Results of a photo-z estimation procedure. If the data is larger than the file upload limit (200MB), the product entry stores only the metadata (instructions on accessing the data should be provided in the description field.</td></tr>
+    </table>
+ + + +Spec-z Catalogs +--------------- + +In the context of the PZ Server, Spec-z Catalogs are defined as any +catalog containing spherical equatorial coordinates and spectroscopic +redshift measurements (or, analogously, true redshifts from +simulations). A Spec-z Catalog can include data from a single +spectroscopic survey or a combination of data from several sources. To +be considered a single Spec-z Catalog, the data should be provided as +a single file to the PZ Server’s upload tool. For multi-survey catalogs, +it is recommended to add the survey name or identification as an extra +column. + +Mandatory columns: + +- Right ascension [degrees] - ``float`` +- Declination [degrees] - ``float`` +- Spectroscopic or true redshift - ``float`` + +Recommended columns: + +- Spectroscopic redshift error - ``float`` +- Quality flag - ``integer``, ``float``, or ``string`` +- Survey name (recommended for compilations of data from different surveys) + +PZ Server Pipelines +^^^^^^^^^^^^^^^^^^^ + +Spec-z Catalogs can be uploaded by users on the PZ Server website or via the +``pzserver`` library. They can also be created by the PZ Server’s +pipeline “Combine Spec-z Catalogs” (under development), which combines a +list of other Spec-z Catalogs previously registered in the system. Any +catalog built by the pipeline is automatically registered as a regular +user-generated data product and is no different from the uploaded +ones. + +Let’s see an example of a Spec-z Catalog: + +.. code:: ipython3 + + gama = pz_server.get_product(14) + + +.. parsed-literal:: + + Connecting to PZ Server... + Done! + + +.. code:: ipython3 + + gama.display_metadata() + + + +.. raw:: html + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
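+    <table>
+      <tr><th>key</th><th>value</th></tr>
+      <tr><td>id</td><td>14</td></tr>
+      <tr><td>release</td><td>None</td></tr>
+      <tr><td>product_type</td><td>Spec-z Catalog</td></tr>
+      <tr><td>uploaded_by</td><td>gschwend</td></tr>
+      <tr><td>internal_name</td><td>14_gama_specz_subsample</td></tr>
+      <tr><td>product_name</td><td>GAMA spec-z subsample</td></tr>
+      <tr><td>official_product</td><td>False</td></tr>
+      <tr><td>pz_code</td><td></td></tr>
+      <tr><td>description</td><td>A small subsample of the GAMA DR3 spec-z catalog (Baldry et al. 2018) as an example of a typical spec-z catalog from the literature.</td></tr>
+      <tr><td>created_at</td><td>2023-03-29T20:02:45.223568Z</td></tr>
+      <tr><td>main_file</td><td>specz_subsample_gama_example.csv</td></tr>
+    </table>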
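The ``internal_name`` shown in this metadata is the piece needed to build the product's sharing URL, which the introduction notebook describes as the website address followed by ``/product/`` and the complete internal name. The helper below is a hypothetical sketch of that rule and is not part of the ``pzserver`` library. The function name ``product_url`` and its ``development`` switch are assumptions made for illustration. Only the two base addresses come from the tutorial.

```python
# Base addresses quoted in the introduction notebook.
PROD_URL = "https://pz-server.linea.org.br/product/"
DEV_URL = "https://pz-server-dev.linea.org.br/product/"


def product_url(internal_name: str, development: bool = True) -> str:
    """Hypothetical helper: join the website address and a product's internal_name."""
    # The tutorial warns that the URL does not work with the bare id number.
    if internal_name.isdigit():
        raise ValueError("use the complete internal_name, not just the id number")
    base = DEV_URL if development else PROD_URL
    return base + internal_name


url = product_url("14_gama_specz_subsample")
print(url)
```

If the ``metadata`` attribute behaves like a dictionary (an assumption, since the tutorial only shows it rendered as a table), the internal name could be read from it instead of being typed by hand.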
+ + + +Display basic statistics + +.. code:: ipython3 + + gama.data.describe() + + + + +.. raw:: html + +
+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
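+    <table>
+      <tr><th></th><th>ID</th><th>RA</th><th>DEC</th><th>Z</th><th>ERR_Z</th><th>FLAG_DES</th></tr>
+      <tr><th>count</th><td>2.576000e+03</td><td>2576.000000</td><td>2576.000000</td><td>2576.000000</td><td>2576.0</td><td>2576.000000</td></tr>
+      <tr><th>mean</th><td>1.105526e+06</td><td>154.526343</td><td>-1.101865</td><td>0.224811</td><td>99.0</td><td>3.949534</td></tr>
+      <tr><th>std</th><td>4.006668e+04</td><td>70.783868</td><td>2.995036</td><td>0.102571</td><td>0.0</td><td>0.218947</td></tr>
+      <tr><th>...</th><td>...</td><td>...</td><td>...</td><td>...</td><td>...</td><td>...</td></tr>
+      <tr><th>50%</th><td>1.103558e+06</td><td>180.140145</td><td>-0.480830</td><td>0.217804</td><td>99.0</td><td>4.000000</td></tr>
+      <tr><th>75%</th><td>1.140619e+06</td><td>215.836583</td><td>1.170363</td><td>0.291810</td><td>99.0</td><td>4.000000</td></tr>
+      <tr><th>max</th><td>1.176440e+06</td><td>223.497080</td><td>2.998180</td><td>0.728717</td><td>99.0</td><td>4.000000</td></tr>
+    </table>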
+

8 rows × 6 columns

+
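The summary above shows the columns of this particular catalog (``ID``, ``RA``, ``DEC``, ``Z``, ``ERR_Z``, ``FLAG_DES``). Before uploading a catalog of your own, it can help to confirm that the three mandatory quantities listed earlier are present. The helper below is a hypothetical sketch and not part of ``pzserver``. The accepted column names are assumptions chosen for illustration, since the tutorial does not prescribe specific names.

```python
# Accepted spellings for each mandatory quantity (illustrative assumptions).
MANDATORY = {
    "right ascension": {"ra", "right_ascension"},
    "declination": {"dec", "declination"},
    "redshift": {"z", "redshift", "specz"},
}


def missing_quantities(columns):
    """Return the mandatory quantities that have no matching column name."""
    lowered = {str(c).lower() for c in columns}
    return [name for name, aliases in MANDATORY.items() if not (aliases & lowered)]


# Column names of the GAMA example above: nothing is missing.
print(missing_quantities(["ID", "RA", "DEC", "Z", "ERR_Z", "FLAG_DES"]))

# A catalog without a declination column.
print(missing_quantities(["ID", "RA", "Z"]))
```

For a product already loaded in memory, the same function can be given ``gama.data.columns``.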
+ + + +The spec-z catalog object has a very basic plot method for quick +visualization of catalog properties. For advanced interactive data +visualization tips, we recommend the notebook +`DP02_06b_Interactive_Catalog_Visualization.ipynb `__ +from Rubin Observatory’s DP0.2 `tutorial-notebooks +repository `__. + +.. code:: ipython3 + + gama.plot() + + + +.. image:: output_20_0.png + + +The attribute ``data``, which is a ``DataFrame``, preserves the ``plot`` +method from Pandas. + +.. code:: ipython3 + + gama.data.plot(x="RA", y="DEC", kind="scatter") + + + + +.. parsed-literal:: + + + + + + +.. image:: output_22_1.png + + +-------------- + +User feedback +~~~~~~~~~~~~~ + +Is something important missing? `Click here to open an issue in the PZ +Server library repository on +GitHub `__. diff --git a/docs/notebooks/rst/2_training_sets.rst b/docs/notebooks/rst/2_training_sets.rst new file mode 100644 index 0000000..cff1228 --- /dev/null +++ b/docs/notebooks/rst/2_training_sets.rst @@ -0,0 +1,402 @@ +Photo-z Server +============== + +Tutorial Notebook 2 - Training Sets +----------------------------------- + +Contact author: `Julia Gschwend `__ + +Last verified run: **2024-Jul-04** + +Introduction +~~~~~~~~~~~~ + +Welcome to the PZ Server tutorials. If you are reading these notebooks +for the first time, we recommend starting with the introduction notebook, +``0_introduction.ipynb``, also available in this repository. + +Imports and Setup +~~~~~~~~~~~~~~~~~ + +.. code:: ipython3 + + from pzserver import PzServer + import matplotlib.pyplot as plt + %reload_ext autoreload + %autoreload 2 + +.. code:: ipython3 + + # pz_server = PzServer(token="", host="pz-dev") # "pz-dev" is the temporary host for test phase + +For convenience, the token can be saved into a file named +``token.txt`` (which is already listed in the .gitignore file in this +repository). + +.. 
code:: ipython3 + + with open('token.txt', 'r') as file: + token = file.read() + pz_server = PzServer(token=token, host="pz-dev") # "pz-dev" is the temporary host for test phase + +Product types +~~~~~~~~~~~~~ + +The PZ Server API provides Python classes with useful methods to handle +particular product types. Let’s recap the product types available: + +.. code:: ipython3 + + pz_server.display_product_types() + + + +.. raw:: html + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
+    <table>
+      <tr><th>Product Type</th><th>product_type</th><th>Description</th></tr>
+      <tr><td>Spec-z Catalog</td><td>specz_catalog</td><td>Catalog of spectroscopic redshifts and positions (usually equatorial coordinates).</td></tr>
+      <tr><td>Training Set</td><td>training_set</td><td>Training set for photo-z algorithms (tabular data). It usually contains magnitudes, errors, and true redshifts.</td></tr>
+      <tr><td>Validation Results</td><td>validation_results</td><td>Results of a photo-z validation procedure (free format). Usually contains photo-z estimates (single estimates and/or pdf) of a validation set and photo-z validation metrics.</td></tr>
+      <tr><td>Photo-z Table</td><td>photoz_table</td><td>Results of a photo-z estimation procedure. If the data is larger than the file upload limit (200MB), the product entry stores only the metadata (instructions on accessing the data should be provided in the description field.</td></tr>
+    </table>
+ + + +Training Sets +------------- + +In the context of the PZ Server, Training Sets are defined as the +product of matching (spatially) a given Spec-z Catalog (single survey or +compilation) to the photometric data, in this case, the LSST Objects +Catalog. The PZ Server API offers a tool called *Training Set Maker* for +users to build customized Training Sets based on the Spec-z Catalogs +available. Please see the companion Jupyter Notebook +``pz_tsm_tutorial.ipynb`` for details. + +*Note 1: The training set is commonly split into two or more subsets for +photo-z validation purposes. If the Training Set owner has previously +defined which objects should belong to each subset (training and +validation/test sets), this information must be available as an extra +column in the table or as clear instructions for reproducing the subset +separation in the data product description.* + +*Note 2: The PZ Server only supports catalog-level Training Sets. +Image-based Training Sets, e.g., for deep-learning algorithms, are not +supported yet.* + +Mandatory column: + +- Spectroscopic (or true) redshift - ``float`` + +Other expected columns: + +- Object ID from LSST Objects Catalog - ``integer`` +- Observables: magnitudes (and/or colors, or fluxes) from LSST Objects Catalog - ``float`` +- Observable errors: magnitude errors (and/or color errors, or flux errors) from LSST Objects Catalog - ``float`` +- Right ascension [degrees] - ``float`` +- Declination [degrees] - ``float`` +- Quality Flag - ``integer``, ``float``, or ``string`` +- Subset Flag - ``integer``, ``float``, or ``string`` + +PZ Server Pipelines +^^^^^^^^^^^^^^^^^^^ + +Training Sets can be uploaded by users on the PZ Server website or via the +``pzserver`` library. 
They can also be created by the PZ Server’s pipeline +“Training Set Maker” (under development), which spatially cross-matches +a given Spec-z Catalog previously registered in the system with an +Object table from a given LSST Data Release available in the Brazilian +IDAC. Any Training Set built by the pipeline is +automatically registered as a regular user-generated data product and is +no different from the uploaded ones. + +.. code:: ipython3 + + train_goldenspike = pz_server.get_product(9) + + +.. parsed-literal:: + + Connecting to PZ Server... + Done! + + +.. code:: ipython3 + + train_goldenspike.display_metadata() + + + +.. raw:: html + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
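+    <table>
+      <tr><th>key</th><th>value</th></tr>
+      <tr><td>id</td><td>9</td></tr>
+      <tr><td>release</td><td>None</td></tr>
+      <tr><td>product_type</td><td>Training Set</td></tr>
+      <tr><td>uploaded_by</td><td>gschwend</td></tr>
+      <tr><td>internal_name</td><td>9_goldenspike_train_data_hdf5</td></tr>
+      <tr><td>product_name</td><td>Goldenspike train data hdf5</td></tr>
+      <tr><td>official_product</td><td>False</td></tr>
+      <tr><td>pz_code</td><td></td></tr>
+      <tr><td>description</td><td>A mock training set created using the example notebook goldenspike.ipynb available in RAIL's repository. Test upload of files in hdf5 format.</td></tr>
+      <tr><td>created_at</td><td>2023-03-29T19:12:59.746096Z</td></tr>
+      <tr><td>main_file</td><td>goldenspike_train_data.hdf5</td></tr>
+    </table>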
+ + + +Display basic statistics + +.. code:: ipython3 + + train_goldenspike.data.describe() + + + + +.. raw:: html + +
+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
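+    <table>
+      <tr><th></th><th>mag_err_g_lsst</th><th>mag_err_i_lsst</th><th>mag_err_r_lsst</th><th>mag_err_u_lsst</th><th>mag_err_y_lsst</th><th>mag_err_z_lsst</th><th>mag_g_lsst</th><th>mag_i_lsst</th><th>mag_r_lsst</th><th>mag_u_lsst</th><th>mag_y_lsst</th><th>mag_z_lsst</th><th>redshift</th></tr>
+      <tr><th>count</th><td>62.000000</td><td>62.000000</td><td>62.000000</td><td>61.000000</td><td>61.000000</td><td>62.000000</td><td>62.000000</td><td>62.000000</td><td>62.000000</td><td>61.000000</td><td>61.000000</td><td>62.000000</td><td>62.000000</td></tr>
+      <tr><th>mean</th><td>0.038182</td><td>0.016165</td><td>0.018770</td><td>0.188050</td><td>0.054682</td><td>0.021478</td><td>24.820000</td><td>23.384804</td><td>24.003970</td><td>25.446008</td><td>22.932354</td><td>23.074481</td><td>0.780298</td></tr>
+      <tr><th>std</th><td>0.036398</td><td>0.010069</td><td>0.013750</td><td>0.193747</td><td>0.115875</td><td>0.014961</td><td>1.314112</td><td>1.381587</td><td>1.387358</td><td>1.269277</td><td>1.540284</td><td>1.400673</td><td>0.355365</td></tr>
+      <tr><th>...</th><td>...</td><td>...</td><td>...</td><td>...</td><td>...</td><td>...</td><td>...</td><td>...</td><td>...</td><td>...</td><td>...</td><td>...</td><td>...</td></tr>
+      <tr><th>50%</th><td>0.028309</td><td>0.013390</td><td>0.016660</td><td>0.133815</td><td>0.034199</td><td>0.018540</td><td>25.069970</td><td>23.748506</td><td>24.470215</td><td>25.577029</td><td>23.293384</td><td>23.514185</td><td>0.764600</td></tr>
+      <tr><th>75%</th><td>0.049576</td><td>0.024650</td><td>0.025802</td><td>0.238859</td><td>0.063585</td><td>0.032557</td><td>25.705486</td><td>24.488654</td><td>24.985225</td><td>26.263284</td><td>23.993010</td><td>24.165944</td><td>0.948494</td></tr>
+      <tr><th>max</th><td>0.198195</td><td>0.036932</td><td>0.065360</td><td>1.154073</td><td>0.909230</td><td>0.051883</td><td>27.296152</td><td>24.949645</td><td>26.036958</td><td>28.482391</td><td>27.342151</td><td>24.693132</td><td>1.755764</td></tr>
+    </table>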
+

8 rows × 13 columns

+
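Note 1 earlier in this notebook asks Training Set owners to record which objects belong to the training and validation subsets, for instance as an extra column. The sketch below shows one way to add such a flag reproducibly with pandas and NumPy. The function ``add_subset_flag``, the column name ``subset``, the labels and the 25% validation fraction are all illustrative assumptions, not conventions defined by the PZ Server.

```python
import numpy as np
import pandas as pd


def add_subset_flag(data, validation_fraction=0.25, seed=42):
    """Return a copy of `data` with a 'subset' column ('train' or 'validation')."""
    rng = np.random.default_rng(seed)  # fixed seed makes the split reproducible
    n_valid = int(round(len(data) * validation_fraction))
    flags = np.array(["train"] * len(data), dtype=object)
    # Mark a random selection of n_valid row positions as the validation subset.
    flags[rng.permutation(len(data))[:n_valid]] = "validation"
    out = data.copy()
    out["subset"] = flags
    return out


# Toy stand-in for train_goldenspike.data (made-up redshifts).
toy = pd.DataFrame({"redshift": [0.1 * i for i in range(8)]})
flagged = add_subset_flag(toy)
print(flagged["subset"].value_counts())
```

Storing the flag in the table itself, instead of only describing the procedure, lets other users reproduce the exact split without depending on the random number generator.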
+ + + +The training set object has a very basic plot method for quick +visualization of catalog properties. For advanced interactive data +visualization tips, we recommend the notebook +`DP02_06b_Interactive_Catalog_Visualization.ipynb `__ +from Rubin Observatory’s DP0.2 `tutorial-notebooks +repository `__. + +.. code:: ipython3 + + train_goldenspike.plot(mag_name="mag_i_lsst") + + + +.. image:: output_18_0.png + + +-------------- + +User feedback +~~~~~~~~~~~~~ + +Is something important missing? `Click here to open an issue in the PZ +Server library repository on +GitHub `__. diff --git a/docs/notebooks/rst/output_18_0.png b/docs/notebooks/rst/output_18_0.png new file mode 100644 index 0000000..06d5743 Binary files /dev/null and b/docs/notebooks/rst/output_18_0.png differ diff --git a/docs/notebooks/rst/output_20_0.png b/docs/notebooks/rst/output_20_0.png new file mode 100644 index 0000000..a76b0e7 Binary files /dev/null and b/docs/notebooks/rst/output_20_0.png differ diff --git a/docs/notebooks/rst/output_22_1.png b/docs/notebooks/rst/output_22_1.png new file mode 100644 index 0000000..34ec56d Binary files /dev/null and b/docs/notebooks/rst/output_22_1.png differ