Assorted small doc fixes (#310)
* Point to more information for reader kwargs.

* Point to code of conduct; inherit docstring members.

* Clear output
delucchi-cmu authored May 16, 2024
1 parent 3f7a423 commit 94c069b
Showing 7 changed files with 50 additions and 40 deletions.
2 changes: 1 addition & 1 deletion .github/pull_request_template.md
@@ -26,7 +26,7 @@ If it fixes an open issue, please link to the issue here. If this PR closes an i


## Code Quality
-- [ ] I have read the Contribution Guide
+- [ ] I have read the [Contribution Guide](https://hipscat-import.readthedocs.io/en/stable/guide/contributing.html) and [LINCC Frameworks Code of Conduct](https://lsstdiscoveryalliance.org/programs/lincc-frameworks/code-conduct/)
- [ ] My code follows the code style of this project
- [ ] My code builds (or compiles) cleanly without any errors or warnings
- [ ] My code contains relevant comments and necessary documentation
2 changes: 1 addition & 1 deletion docs/catalogs/arguments.rst
@@ -9,7 +9,7 @@ A minimal arguments block will look something like:

.. code-block:: python
-    from hipscat_import.pipeline import ImportArguments
+    from hipscat_import.pipeline.catalog.arguments import ImportArguments
args = ImportArguments(
sort_columns="ObjectID",
14 changes: 13 additions & 1 deletion docs/conf.py
@@ -38,7 +38,9 @@
extensions = [
"sphinx.ext.mathjax",
"sphinx.ext.napoleon",
-    "sphinx.ext.viewcode",
+    # viewcode and autoapi interaction creates some failures. See:
+    # https://github.com/readthedocs/sphinx-autoapi/issues/422
+    # "sphinx.ext.viewcode",
"sphinx.ext.intersphinx",
"sphinx_copybutton",
"autoapi.extension",
@@ -57,6 +59,16 @@
autoapi_add_toc_tree_entry = False
autoapi_member_order = "bysource"

+autoapi_options = [
+    "members",
+    "undoc-members",
+    "show-inheritance",
+    "show-module-summary",
+    "special-members",
+    "imported-members",
+    "inherited-members",
+]

napoleon_google_docstring = True

# -- sphinx-copybutton configuration ----------------------------------------
35 changes: 8 additions & 27 deletions docs/guide/contributing.rst
@@ -40,13 +40,11 @@ create and activate a new environment.
Once you have created a new environment, you can install this project for local
-development using the following commands:
+development using the following command:

.. code-block:: console
-   >> pip install -e .'[dev]'
-   >> pre-commit install
-   >> conda install pandoc
+   >> source .setup_dev.sh
Notes:
@@ -61,29 +59,6 @@ Notes:
the Python Project Template documentation on
`Sphinx and Python Notebooks <https://lincc-ppt.readthedocs.io/en/stable/practices/sphinx.html#python-notebooks>`_.


-.. tip::
-    Installing on Mac
-
-    ``healpy`` is a very necessary dependency for hipscat libraries at this time, but
-    native prebuilt binaries for healpy on Apple Silicon Macs
-    `do not yet exist <https://healpy.readthedocs.io/en/latest/install.html#binary-installation-with-pip-recommended-for-most-other-python-users>`_,
-    so it's recommended to install via conda before proceeding to hipscat-import.
-
-    .. code-block:: console
-        >> conda config --add channels conda-forge
-        >> conda install healpy
-        >> git clone https://github.com/astronomy-commons/hipscat-import
-        >> cd hipscat-import
-        >> pip install -e .
-
-    When installing dev dependencies, make sure to include the single quotes.
-
-    .. code-block:: console
-        >> pip install -e '.[dev]'
Testing
-------------------------------------------------------------------------------

@@ -131,3 +106,9 @@ Optional - Release a new version
Once your PR is merged you can create a new release to make your changes available.
GitHub's `instructions <https://docs.github.com/en/repositories/releasing-projects-on-github/managing-releases-in-a-repository>`_ for doing so are here.
Use your best judgement when incrementing the version. i.e. is this a major, minor, or patch fix.

+Be kind
+-------------------------------------------------------------------------------
+
+You are expected to comply with the
+`LINCC Frameworks Code of Conduct <https://lsstdiscoveryalliance.org/programs/lincc-frameworks/code-conduct/>`_.
17 changes: 11 additions & 6 deletions docs/notebooks/estimate_pixel_threshold.ipynb
@@ -32,7 +32,7 @@
"## Create a sample parquet file\n",
"\n",
"The first step is to read in your survey data in its original form, and convert a sample into parquet. This has a few benefits:\n",
-"- parquet uses compression in various ways, and by creating the sample, we can get a sense of both the overall and field-level compression with real dat\n",
+"- parquet uses compression in various ways, and by creating the sample, we can get a sense of both the overall and field-level compression with real data\n",
"- using the importer `FileReader` interface now sets you up for more success when you get around to importing!\n",
"\n",
"If your data is already in parquet format, just change the `sample_parquet_file` path to an existing file, and don't run the second cell."
@@ -96,10 +96,10 @@
"parquet_file = pq.ParquetFile(sample_parquet_file)\n",
"num_rows = parquet_file.metadata.num_rows\n",
"\n",
-"## 100MB\n",
-"ideal_file_small = 100 * 1024 * 1024\n",
-"## 800MB\n",
-"ideal_file_large = 800 * 1024 * 1024\n",
+"## 300MB\n",
+"ideal_file_small = 300 * 1024 * 1024\n",
+"## 1G\n",
+"ideal_file_large = 1024 * 1024 * 1024\n",
"\n",
"threshold_small = ideal_file_small / sample_file_size * num_rows\n",
"threshold_large = ideal_file_large / sample_file_size * num_rows\n",
@@ -191,6 +191,11 @@
}
],
"metadata": {
+"kernelspec": {
+ "display_name": "hipscatenv",
+ "language": "python",
+ "name": "python3"
+},
"language_info": {
"codemirror_mode": {
"name": "ipython",
@@ -201,7 +206,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
-"version": "3.9.15"
+"version": "3.10.14"
}
},
"nbformat": 4,
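The notebook cell above scales the sample's row count by the ratio of target file size to sample file size to get a per-pixel row threshold. That arithmetic can be sketched standalone; the 50 MB sample size and 1,000,000-row count below are illustrative assumptions, not values from the notebook:

```python
def estimate_row_threshold(sample_size_bytes, sample_num_rows, target_size_bytes):
    """Rows per output file expected to land near the target on-disk size,
    assuming the sample's compression ratio holds for the full catalog."""
    return int(target_size_bytes / sample_size_bytes * sample_num_rows)

# Hypothetical sample: 50 MB of parquet holding 1,000,000 rows.
sample_size = 50 * 1024 * 1024
num_rows = 1_000_000

threshold_small = estimate_row_threshold(sample_size, num_rows, 300 * 1024 * 1024)  # 300 MB target
threshold_large = estimate_row_threshold(sample_size, num_rows, 1024 * 1024 * 1024)  # 1 GB target
print(threshold_small)  # 6000000
```

With a 6x ratio between target and sample size, the threshold is simply six times the sample's row count; the notebook does the same division with its own measured `sample_file_size`.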
2 changes: 1 addition & 1 deletion docs/notebooks/unequal_schema.ipynb
@@ -308,7 +308,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
-"version": "3.10.13"
+"version": "3.10.14"
}
},
"nbformat": 4,
18 changes: 15 additions & 3 deletions src/hipscat_import/catalog/file_readers.py
@@ -38,6 +38,7 @@ def get_file_reader(
is available. for fits files, a list of columns to *keep*.
skip_column_names (list[str]): for fits files, a list of columns to remove.
type_map (dict): for CSV files, the data types to use for columns
+        kwargs: additional keyword arguments to pass to the underlying file reader.
"""
if file_format == "csv":
return CsvReader(
@@ -113,9 +114,11 @@ class CsvReader(InputReader):
column_names (list[str]): the names of columns if no header is available
type_map (dict): the data types to use for columns
parquet_kwargs (dict): additional keyword arguments to use when
-        reading the parquet schema metadata.
+        reading the parquet schema metadata, passed to pandas.read_parquet.
+        See https://pandas.pydata.org/docs/reference/api/pandas.read_parquet.html
kwargs (dict): additional keyword arguments to use when reading
-        the CSV files.
+        the CSV files with pandas.read_csv.
+        See https://pandas.pydata.org/docs/reference/api/pandas.read_csv.html
"""

def __init__(
@@ -188,7 +191,12 @@ class AstropyEcsvReader(InputReader):
"""Reads astropy ascii .ecsv files.
Note that this is NOT a chunked reader. Use caution when reading
-    large ECSV files with this reader."""
+    large ECSV files with this reader.
+
+    Attributes:
+        kwargs: keyword arguments passed to astropy ascii reader.
+            See https://docs.astropy.org/en/stable/api/astropy.io.ascii.read.html#astropy.io.ascii.read
+    """

def __init__(self, **kwargs):
self.kwargs = kwargs
@@ -228,6 +236,8 @@ class FitsReader(InputReader):
one of `column_names` or `skip_column_names`
skip_column_names (list[str]): list of column names to skip. only use
one of `column_names` or `skip_column_names`
+        kwargs: keyword arguments passed along to astropy.Table.read.
+            See https://docs.astropy.org/en/stable/api/astropy.table.Table.html#astropy.table.Table.read
"""

def __init__(self, chunksize=500_000, column_names=None, skip_column_names=None, **kwargs):
@@ -281,6 +291,8 @@ class ParquetReader(InputReader):
chunksize (int): number of rows of the file to process at once.
For large files, this can prevent loading the entire file
into memory at once.
+        kwargs: arguments to pass along to pyarrow.parquet.ParquetFile.
+            See https://arrow.apache.org/docs/python/generated/pyarrow.parquet.ParquetFile.html
"""

def __init__(self, chunksize=500_000, **kwargs):
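The docstring additions above all describe the same plumbing: `get_file_reader` dispatches on a format string and forwards any remaining keyword arguments to the chosen reader class, which in turn hands them to the underlying library call. A minimal sketch of that dispatch pattern follows; the `*Sketch` classes and the factory function are illustrative stand-ins, not the real hipscat-import readers:

```python
class CsvReaderSketch:
    """Stand-in for CsvReader: stores kwargs destined for pandas.read_csv."""

    def __init__(self, chunksize=500_000, **kwargs):
        self.chunksize = chunksize
        self.kwargs = kwargs


class ParquetReaderSketch:
    """Stand-in for ParquetReader: stores kwargs destined for pyarrow.parquet.ParquetFile."""

    def __init__(self, chunksize=500_000, **kwargs):
        self.chunksize = chunksize
        self.kwargs = kwargs


_READERS = {"csv": CsvReaderSketch, "parquet": ParquetReaderSketch}


def get_file_reader_sketch(file_format, **kwargs):
    """Dispatch on the format string; everything else rides along to the reader."""
    return _READERS[file_format](**kwargs)


reader = get_file_reader_sketch("csv", chunksize=10_000, sep="|")
print(type(reader).__name__, reader.chunksize, reader.kwargs)
# CsvReaderSketch 10000 {'sep': '|'}
```

The value of documenting the kwargs pass-through, as this commit does, is that users can consult the pandas, pyarrow, or astropy reference pages directly rather than waiting for every option to be mirrored in the reader signatures.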

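The CsvReader docstring above notes that its kwargs flow to `pandas.read_csv`; the chunked mode that keeps such a reader's memory bounded can be sketched with pandas alone. The column names and values here are made up for illustration:

```python
import io

import pandas as pd

# A tiny in-memory CSV standing in for survey data.
csv_text = "ObjectID,ra,dec\n1,10.0,-5.0\n2,11.0,-4.5\n3,12.0,-4.0\n"

# chunksize makes read_csv return an iterator of DataFrames, which is how a
# chunked reader avoids loading the whole file; dtype plays the role of the
# reader's type_map argument.
chunks = pd.read_csv(io.StringIO(csv_text), chunksize=2, dtype={"ObjectID": "int64"})

sizes = [len(chunk) for chunk in chunks]
print(sizes)  # [2, 1]
```

Any other `read_csv` option (separator, header handling, column subsets) would be forwarded the same way through the reader's kwargs.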