Merge branch 'main' into fix_open_table

linea-it · Jul 22, 2024 · eda2806
2 parents 7273d48 + 0a6b8bb
commit eda2806
Showing 11 changed files with 1,277 additions and 1,324 deletions.
diff --git a/docs/notebooks/0_introduction.ipynb b/docs/notebooks/0_introduction.ipynb
@@ -18,29 +18,7 @@
    "source": [
     "Contact author: [Julia Gschwend](mailto:[email protected])\n",
     "\n",
-    "Last verified run: **2024-Jul-09**"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "metadata": {
-    "tags": []
-   },
-   "source": [
-    "<div id=\"notebook-contents\" />\n",
-    "\n",
-    "# Notebook contents\n",
-    "\n",
-    "- PZ Server\n",
-    "    - [Introduction](#introduction) \n",
-    "    - [How to upload a data product to the PZ Server](#how-to-upload-a-data-product-to-the-pz-server)\n",
-    "    - [How to download a data product from the PZ Server](#how-to-download-a-data-product-from-the-pz-server)\n",
-    "- PZ Server API (Python library pzserver)\n",
-    "    - [How to get general info from PZ Server](#how-to-get-general-info-from-pz-server)\n",
-    "    - [How to display the metadata of a data product](#how-to-display-the-metadata-of-a-data-product)\n",
-    "    - [How to download data products as .zip files](#how-to-download-data-products-as-zip-files)  \n",
-    "    - [How to share data products with other RSP users](#how-to-share-data-products-with-other-rsp-users)\n",
-    "    - [How to retrieve contents of data products (work on memory)](#how-to-retrieve-contents-of-data-products-work-on-memory)"
+    "Last verified run: **2024-Jul-22**"
    ]
   },
   {
@@ -553,7 +531,7 @@
    "cell_type": "markdown",
    "metadata": {},
    "source": [
-    "The tabular data is alocated in the attribute `data`, which is a `pandas.DataFrame`. "
+    "The tabular data is allocated in the attribute `data`, which is a `pandas.DataFrame`. "
    ]
   },
   {

diff --git a/docs/notebooks/1_specz_catalogs.ipynb b/docs/notebooks/1_specz_catalogs.ipynb
@@ -17,7 +17,7 @@
    "source": [
     "Contact author: [Julia Gschwend](mailto:[email protected])\n",
     "\n",
-    "Last verified run: **2024-Jul-04**"
+    "Last verified run: **2024-Jul-22**"
    ]
   },
   {
@@ -28,7 +28,7 @@
    "source": [
     "### Introduction \n",
     "\n",
-    "Welcome to the PZ Server tutorials. If you are reading this notebooks for the first time, we recommend to not skip the introduction notebook: `0_introduction.ipynb` also available in this same repository. \n"
+    "Welcome to the PZ Server tutorials. If you are reading this notebook for the first time, we recommend not to skip the introduction notebook: `0_introduction.ipynb` also available in this same repository. \n"
    ]
   },
   {
@@ -117,7 +117,7 @@
     "tags": []
    },
    "source": [
-    "In the context of the PZ Server, Spec-z Catalogs are defined as any catalog containing spherical equatorial coordinates and spectroscopic redshift measurements (or, analogously, true redshifts from simulations). A Spec-z Catalog can include data from a single spectroscopic survey or a combination of data from several sources. To be considered as a single Spec-z Catalog, the data should be provided as a single file to PZ Server's the upload tool. For multi-survey catalogs, it is recommended to add the survey name or identification as an extra column. \n",
+    "In the context of the PZ Server, Spec-z Catalogs are defined as any catalog containing spherical equatorial coordinates and spectroscopic redshift measurements (or, analogously, true redshifts from simulations). A Spec-z Catalog can include data from a single spectroscopic survey or a combination of data from several sources. To be considered as a single Spec-z Catalog, the data should be provided as a single file to PZ Server's upload tool. For multi-survey catalogs, it is recommended to add the survey name or identification as an extra column. \n",
     "\n",
     "\n",
     "Mandatory columns: \n",

diff --git a/docs/notebooks/2_training_sets.ipynb b/docs/notebooks/2_training_sets.ipynb
@@ -17,7 +17,7 @@
    "source": [
     "Contact author: [Julia Gschwend](mailto:[email protected])\n",
     "\n",
-    "Last verified run: **2024-Jul-04**"
+    "Last verified run: **2024-Jul-22**"
    ]
   },
   {
@@ -28,7 +28,7 @@
    "source": [
     "### Introduction \n",
     "\n",
-    "Welcome to the PZ Server tutorials. If you are reading this notebooks for the first time, we recommend to not skip the introduction notebook: `0_introduction.ipynb` also available in this same repository. \n"
+    "Welcome to the PZ Server tutorials. If you are reading this notebook for the first time, we recommend not to skip the introduction notebook: `0_introduction.ipynb` also available in this same repository. \n"
    ]
   },
   {
@@ -136,7 +136,7 @@
    "metadata": {},
    "source": [
     "#### PZ Server Pipelines\n",
-    "Trainind Sets can be uploaded by users on PZ Server website or via the `pzserver` library. Also, they can be created as the spatial cross-matching between a given Spec-z Catalog previously registered in the system and an Object table from a given LSST Data Release available in the Brazilian IDAC by the PZ Sever's pipeline \"Training Set Maker\" (under development). Any Training Set built by the pipeline is automaticaly registered as a regular user-generated data product and has no difference from the uploaded ones. \n",
+    "Training Sets can be uploaded by users on PZ Server website or via the `pzserver` library. Also, they can be created as the spatial cross-matching between a given Spec-z Catalog previously registered in the system and an Object table from a given LSST Data Release available in the Brazilian IDAC by the PZ Sever's pipeline \"Training Set Maker\" (under development). Any Training Set built by the pipeline is automatically registered as a regular user-generated data product and has no difference from the uploaded ones. \n",
     "\n"
    ]
   },
@@ -208,9 +208,9 @@
  ],
  "metadata": {
   "kernelspec": {
-   "display_name": "LSST",
+   "display_name": "pzlib",
    "language": "python",
-   "name": "lsst"
+   "name": "pzlib"
   },
   "language_info": {
    "codemirror_mode": {
@@ -222,7 +222,7 @@
    "name": "python",
    "nbconvert_exporter": "python",
    "pygments_lexer": "ipython3",
-   "version": "3.11.7"
+   "version": "3.12.4"
   },
   "nbsphinx": {
    "execute": "never"

diff --git a/docs/notebooks/public-specz-compilation.ipynb b/docs/notebooks/public-specz-compilation.ipynb
@@ -5,17 +5,15 @@
    "id": "9d42adbb-55a5-4edd-aaae-2a5c2279e279",
    "metadata": {},
    "source": [
-    "<img align='left' src = https://linea.org.br/wp-content/themes/LIneA/imagens/logo-header.jpg width=150 style='padding: 20px'> \n",
+    "<img align='left' src = https://www.linea.org.br/brand/linea-logo-color.svg width=150 style='padding: 20px'> \n",
     "\n",
     "## Spectroscopic Redshifts Compilation\n",
     "\n",
-    "Public collection of redshift measurements made available by spectroscopic surveys prior to DES DR2.\n",
+    "Collection of public redshift catalogs made available by spectroscopic surveys prior to DES DR2.\n",
     "\n",
     "\n",
-    "Contact: Julia Gschwend ([[email protected]](mailto:[email protected]))\n",
-    "<br>\n",
-    "<br>\n",
-    "\n"
+    "Contact: Julia Gschwend ([[email protected]](mailto:[email protected])) <br>\n",
+    "Last verified run: **2024-Jul-22** <br>\n"
    ]
   },
   {
@@ -34,11 +32,13 @@
    "id": "dbbe57a5-fb05-4166-a2c3-dfd2b719f8ec",
    "metadata": {},
    "source": [
-    "#### Notes about the curation of spectroscopic _redshifts_ catalogs\n",
+    "#### Notes \n",
     "\n",
     "This notebook contains a brief characterization of a collection of spectroscopic _redshifts_ (spec-z) catalogs that have been publicly distributed and described in detail in scientific literature by their original projects. These catalogs were collected over the years of operation of the Dark Energy Survey (DES) and systematically grouped by the LIneA team (initially by Aurelio Carnero, then by Julia Gschwend) using the DES Science Portal tool (_pipeline_ Spectroscopic Sample) to form the basis of a training set for photometric _redshifts_ calculation algorithms based in machine learning. \n",
     "\n",
-    "The latest version of this notebook "
+    "The characterization of the training set created with this dataset combined with DES DR2 photometric data is available in a [separate notebook](https://github.com/linea-it/pzserver/blob/main/docs/notebooks/public-training-set-des-dr2.ipynb). If you have questions, please contact us.\n",
+    "\n",
+    "---\n"
    ]
   },
   {
@@ -153,45 +153,9 @@
     "\n",
     "1. measure with the highest quality _flag_ (`flag_des`)\n",
     "2. measurement with lowest error in redshift (`err_z`)\n",
-    "3. measurement taken by the most recent survey"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "id": "4e7ba2d5-4d08-460e-a1fd-c94cab0b1c50",
-   "metadata": {},
-   "source": [
-    "\n",
-    "--- \n",
-    "\n",
-    "## Sample characterization\n",
+    "3. measurement taken by the most recent survey\n",
     "\n",
-    "Check below a brief characterization of the data contained in the compiled collection of spectroscopic catalogs."
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "id": "f7eeca28-7bfc-4449-9afd-b88b648d0418",
-   "metadata": {},
-   "source": [
-    "Requirements for this notebook:\n",
-    "\n",
-    "* **Auxiliary file**: [des-round19-poly.txt](https://github.com/kadrlica/skymap/blob/master/skymap/data/des-round19-poly.txt) (contours of the area covered by the survey, i.e., DES _footprint_, 2019 version).\n",
-    "* **View libraries**: seaborn, bokeh, holoviews\n",
-    "\n",
-    "_Download_ the file `des-round19-poly.txt` from the repository [kadrlica/skymap](https://github.com/kadrlica/skymap) on GitHub:"
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": null,
-   "id": "b8c1f12b-4212-42cd-8271-fcbe881c9d7a",
-   "metadata": {
-    "tags": []
-   },
-   "outputs": [],
-   "source": [
-    "! wget https://raw.githubusercontent.com/kadrlica/skymap/master/skymap/data/des-round19-poly.txt  "
+    "--- \n"
    ]
   },
   {
@@ -216,42 +180,39 @@
     "import pandas as pd\n",
     "import matplotlib.pyplot as plt\n",
     "import seaborn as sns\n",
-    "import tables_io\n",
-    "import psutil\n",
     "import sys\n",
     "\n",
     "# Astropy\n",
     "from astropy import units as u\n",
     "from astropy.coordinates import SkyCoord\n",
-    "from astropy.units.quantity import Quantity\n",
     "\n",
     "# Bokeh\n",
     "import bokeh\n",
     "from bokeh.io import output_notebook, show, output_file, reset_output\n",
-    "from bokeh.models import ColumnDataSource, Range1d, HoverTool\n",
+    "from bokeh.models import ColumnDataSource, HoverTool, LinearColorMapper, ColorBar\n",
     "from bokeh.models import CDSView, GroupFilter\n",
     "from bokeh.plotting import figure, show, gridplot, output_notebook\n",
-    "from bokeh.models import Range1d, LinearColorMapper, ColorBar\n",
-    "from bokeh.transform import factor_cmap\n",
     "\n",
     "# HoloViews\n",
     "import holoviews as hv\n",
-    "from holoviews import streams, opts\n",
-    "from holoviews.operation.datashader import datashade, dynspread\n",
-    "from holoviews.plotting.util import process_cmap\n",
     "\n",
+    "# PZ Server\n",
+    "from pzserver import PzServer\n",
+    "with open('token.txt', 'r') as file:\n",
+    "    token = file.read()\n",
+    "pz_server = PzServer(token=token, host=\"pz-dev\") # \"pz-dev\" is the temporary host for test phase  \n",
     "\n",
-    "# Config\n",
+    "# Configs\n",
     "import warnings\n",
     "warnings.filterwarnings('ignore')\n",
-    "%reload_ext autoreload \n",
-    "%autoreload 2 \n",
-    "%matplotlib inline \n",
     "sns.set(color_codes=True, font_scale=1.5) \n",
     "sns.set_style('whitegrid')\n",
     "plt.rcParams.update({'figure.max_open_warning': 0})\n",
     "hv.extension('bokeh')\n",
-    "output_notebook()"
+    "output_notebook()\n",
+    "%reload_ext autoreload \n",
+    "%autoreload 2 \n",
+    "%matplotlib inline "
    ]
   },
   {
@@ -269,17 +230,35 @@
     "print('HoloViews version: ' + hv.__version__)"
    ]
   },
+  {
+   "cell_type": "markdown",
+   "id": "2b0a5956-9bd1-4229-afc7-574fae49db74",
+   "metadata": {},
+   "source": [
+    "\n",
+    "## Retrieve Data\n"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "f7eeca28-7bfc-4449-9afd-b88b648d0418",
+   "metadata": {},
+   "source": [
+    "Auxiliary file: `des-round19-poly.txt` (contours of the area covered by the survey, i.e., DES _footprint_, 2019 version) \n",
+    "\n",
+    "Download the file from the repository [kadrlica/skymap](https://github.com/kadrlica/skymap/blob/master/skymap/data/des-round19-poly.txt) on GitHub:"
+   ]
+  },
   {
    "cell_type": "code",
    "execution_count": null,
-   "id": "69883b2d-b313-4fe7-8dfd-ab511dc3082a",
+   "id": "b8c1f12b-4212-42cd-8271-fcbe881c9d7a",
    "metadata": {
     "tags": []
    },
    "outputs": [],
    "source": [
-    "def fmt(x):\n",
-    "    return '{:.1f}%'.format(x)"
+    "! wget https://raw.githubusercontent.com/kadrlica/skymap/master/skymap/data/des-round19-poly.txt  "
    ]
   },
   {
@@ -310,7 +289,7 @@
    "id": "0de5e4d4-eed4-4d3c-8b06-f2175a5d4209",
    "metadata": {},
    "source": [
-    "Read spec-z catalog file"
+    "Retrieve spec-z catalog from PZ Server  "
    ]
   },
   {
@@ -322,7 +301,29 @@
    },
    "outputs": [],
    "source": [
-    "specz_catalog = tables_io.read('public_specz_compilation.pq')"
+    "specz_catalog_obj = pz_server.get_product('26_public_specz_compilation') "
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "4d4895b7-6085-4f78-b15c-5b02972f0ed5",
+   "metadata": {
+    "tags": []
+   },
+   "outputs": [],
+   "source": [
+    "specz_catalog_obj.display_metadata()"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "6d7a7fc2-4316-47ab-8264-3c0c6e66bbbc",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "specz_catalog = specz_catalog_obj.data"
    ]
   },
   {
@@ -419,7 +420,22 @@
    "id": "1b7af5a5-219f-4177-94ed-68f51013905c",
    "metadata": {},
    "source": [
-    "Note from the minimum and maximum values of the **flag_des** column that a quality cutoff was applied where only objects with **flag_des** $\\geqslant$ 3 were included in the sample.\n"
+    "Note from the minimum value of the **flag_des** column that a quality cutoff was applied where only objects with **flag_des** $\\geqslant$ 3 were included in the sample.\n"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "4e7ba2d5-4d08-460e-a1fd-c94cab0b1c50",
+   "metadata": {
+    "tags": []
+   },
+   "source": [
+    "\n",
+    "--- \n",
+    "\n",
+    "## Sample characterization\n",
+    "\n",
+    "Check below a brief characterization of the data contained in the compiled collection of spectroscopic catalogs."
    ]
   },
   {
@@ -443,7 +459,6 @@
    "id": "6f7edf41-c446-4467-9dbc-6f197d1cd9e0",
    "metadata": {},
    "source": [
-    "--- \n",
     "\n",
     "#### Spatial distribution\n"
    ]
@@ -555,6 +570,8 @@
    },
    "outputs": [],
    "source": [
+    "def fmt(x):\n",
+    "    return '{:.1f}%'.format(x)\n",
     "counts = pd.DataFrame(data={'flag_des':[len(specz_catalog.query('flag_des ==3')), \n",
     "                                        len(specz_catalog.query('flag_des ==4'))]}, index= [3, 4])\n",
     "counts.plot.pie(y='flag_des', labels=None, autopct=fmt, colors=['darkorange', 'steelblue']) \n",
@@ -665,9 +682,9 @@
  ],
  "metadata": {
   "kernelspec": {
-   "display_name": "Python 3 (ipykernel)",
+   "display_name": "pzlib",
    "language": "python",
-   "name": "python3"
+   "name": "pzlib"
   },
   "language_info": {
    "codemirror_mode": {
@@ -679,7 +696,7 @@
    "name": "python",
    "nbconvert_exporter": "python",
    "pygments_lexer": "ipython3",
-   "version": "3.10.12"
+   "version": "3.12.4"
   }
  },
  "nbformat": 4,