Assorted small doc fixes (#310)
* Point to more information for reader kwargs.

* Point to code of conduct; inherit docstring members.

* Clear output
delucchi-cmu authored May 16, 2024
1 parent 3f7a423 commit 94c069b
Showing 7 changed files with 50 additions and 40 deletions.
2 changes: 1 addition & 1 deletion .github/pull_request_template.md
@@ -26,7 +26,7 @@ If it fixes an open issue, please link to the issue here. If this PR closes an i


## Code Quality
-- [ ] I have read the Contribution Guide
+- [ ] I have read the [Contribution Guide](https://hipscat-import.readthedocs.io/en/stable/guide/contributing.html) and [LINCC Frameworks Code of Conduct](https://lsstdiscoveryalliance.org/programs/lincc-frameworks/code-conduct/)
- [ ] My code follows the code style of this project
- [ ] My code builds (or compiles) cleanly without any errors or warnings
- [ ] My code contains relevant comments and necessary documentation
2 changes: 1 addition & 1 deletion docs/catalogs/arguments.rst
@@ -9,7 +9,7 @@ A minimal arguments block will look something like:

.. code-block:: python
-    from hipscat_import.pipeline import ImportArguments
+    from hipscat_import.pipeline.catalog.arguments import ImportArguments
args = ImportArguments(
sort_columns="ObjectID",
14 changes: 13 additions & 1 deletion docs/conf.py
@@ -38,7 +38,9 @@
extensions = [
"sphinx.ext.mathjax",
"sphinx.ext.napoleon",
-    "sphinx.ext.viewcode",
+    # viewcode and autoapi interaction creates some failures. See:
+    # https://github.com/readthedocs/sphinx-autoapi/issues/422
+    # "sphinx.ext.viewcode",
"sphinx.ext.intersphinx",
"sphinx_copybutton",
"autoapi.extension",
@@ -57,6 +59,16 @@
autoapi_add_toc_tree_entry = False
autoapi_member_order = "bysource"

+autoapi_options = [
+    "members",
+    "undoc-members",
+    "show-inheritance",
+    "show-module-summary",
+    "special-members",
+    "imported-members",
+    "inherited-members",
+]

napoleon_google_docstring = True

# -- sphinx-copybutton configuration ----------------------------------------
35 changes: 8 additions & 27 deletions docs/guide/contributing.rst
@@ -40,13 +40,11 @@ create and activate a new environment.
Once you have created a new environment, you can install this project for local
-development using the following commands:
+development using the following command:

.. code-block:: console
-   >> pip install -e .'[dev]'
-   >> pre-commit install
-   >> conda install pandoc
+   >> source .setup_dev.sh
Notes:
@@ -61,29 +59,6 @@ Notes:
the Python Project Template documentation on
`Sphinx and Python Notebooks <https://lincc-ppt.readthedocs.io/en/stable/practices/sphinx.html#python-notebooks>`_.


-.. tip::
-    Installing on Mac
-
-    ``healpy`` is a very necessary dependency for hipscat libraries at this time, but
-    native prebuilt binaries for healpy on Apple Silicon Macs
-    `do not yet exist <https://healpy.readthedocs.io/en/latest/install.html#binary-installation-with-pip-recommended-for-most-other-python-users>`_,
-    so it's recommended to install via conda before proceeding to hipscat-import.
-
-    .. code-block:: console
-        >> conda config --add channels conda-forge
-        >> conda install healpy
-        >> git clone https://github.com/astronomy-commons/hipscat-import
-        >> cd hipscat-import
-        >> pip install -e .
-
-    When installing dev dependencies, make sure to include the single quotes.
-
-    .. code-block:: console
-        >> pip install -e '.[dev]'
Testing
-------------------------------------------------------------------------------

@@ -131,3 +106,9 @@ Optional - Release a new version
Once your PR is merged you can create a new release to make your changes available.
GitHub's `instructions <https://docs.github.com/en/repositories/releasing-projects-on-github/managing-releases-in-a-repository>`_ for doing so are here.
Use your best judgement when incrementing the version. i.e. is this a major, minor, or patch fix.

+Be kind
+-------------------------------------------------------------------------------
+
+You are expected to comply with the
+`LINCC Frameworks Code of Conduct <https://lsstdiscoveryalliance.org/programs/lincc-frameworks/code-conduct/>`_.
17 changes: 11 additions & 6 deletions docs/notebooks/estimate_pixel_threshold.ipynb
@@ -32,7 +32,7 @@
"## Create a sample parquet file\n",
"\n",
"The first step is to read in your survey data in its original form, and convert a sample into parquet. This has a few benefits:\n",
-"- parquet uses compression in various ways, and by creating the sample, we can get a sense of both the overall and field-level compression with real dat\n",
+"- parquet uses compression in various ways, and by creating the sample, we can get a sense of both the overall and field-level compression with real data\n",
"- using the importer `FileReader` interface now sets you up for more success when you get around to importing!\n",
"\n",
"If your data is already in parquet format, just change the `sample_parquet_file` path to an existing file, and don't run the second cell."
@@ -96,10 +96,10 @@
"parquet_file = pq.ParquetFile(sample_parquet_file)\n",
"num_rows = parquet_file.metadata.num_rows\n",
"\n",
-"## 100MB\n",
-"ideal_file_small = 100 * 1024 * 1024\n",
-"## 800MB\n",
-"ideal_file_large = 800 * 1024 * 1024\n",
+"## 300MB\n",
+"ideal_file_small = 300 * 1024 * 1024\n",
+"## 1G\n",
+"ideal_file_large = 1024 * 1024 * 1024\n",
"\n",
"threshold_small = ideal_file_small / sample_file_size * num_rows\n",
"threshold_large = ideal_file_large / sample_file_size * num_rows\n",
@@ -191,6 +191,11 @@
}
],
"metadata": {
+"kernelspec": {
+ "display_name": "hipscatenv",
+ "language": "python",
+ "name": "python3"
+},
"language_info": {
"codemirror_mode": {
"name": "ipython",
@@ -201,7 +206,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
-"version": "3.9.15"
+"version": "3.10.14"
}
},
"nbformat": 4,
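The notebook cell above scales the sample's row count by the ratio of target file size to sample file size to get a per-pixel row threshold. That arithmetic can be sketched standalone; the 50 MB sample size and 1,000,000-row count below are illustrative assumptions, not values from the notebook:

```python
def estimate_row_threshold(sample_size_bytes, sample_num_rows, target_size_bytes):
    """Rows per output file expected to land near the target on-disk size,
    assuming the sample's compression ratio holds for the full catalog."""
    return int(target_size_bytes / sample_size_bytes * sample_num_rows)

# Hypothetical sample: 50 MB of parquet holding 1,000,000 rows.
sample_size = 50 * 1024 * 1024
num_rows = 1_000_000

threshold_small = estimate_row_threshold(sample_size, num_rows, 300 * 1024 * 1024)  # 300 MB target
threshold_large = estimate_row_threshold(sample_size, num_rows, 1024 * 1024 * 1024)  # 1 GB target
print(threshold_small)  # 6000000
```

With a 6x ratio between target and sample size, the threshold is simply six times the sample's row count; the notebook does the same division with its own measured `sample_file_size`.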
2 changes: 1 addition & 1 deletion docs/notebooks/unequal_schema.ipynb
@@ -308,7 +308,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
-"version": "3.10.13"
+"version": "3.10.14"
}
},
"nbformat": 4,
18 changes: 15 additions & 3 deletions src/hipscat_import/catalog/file_readers.py
@@ -38,6 +38,7 @@ def get_file_reader(
is available. for fits files, a list of columns to *keep*.
skip_column_names (list[str]): for fits files, a list of columns to remove.
type_map (dict): for CSV files, the data types to use for columns
+        kwargs: additional keyword arguments to pass to the underlying file reader.
"""
if file_format == "csv":
return CsvReader(
@@ -113,9 +114,11 @@ class CsvReader(InputReader):
column_names (list[str]): the names of columns if no header is available
type_map (dict): the data types to use for columns
parquet_kwargs (dict): additional keyword arguments to use when
-        reading the parquet schema metadata.
+        reading the parquet schema metadata, passed to pandas.read_parquet.
+        See https://pandas.pydata.org/docs/reference/api/pandas.read_parquet.html
kwargs (dict): additional keyword arguments to use when reading
-        the CSV files.
+        the CSV files with pandas.read_csv.
+        See https://pandas.pydata.org/docs/reference/api/pandas.read_csv.html
"""

def __init__(
@@ -188,7 +191,12 @@ class AstropyEcsvReader(InputReader):
"""Reads astropy ascii .ecsv files.
Note that this is NOT a chunked reader. Use caution when reading
-    large ECSV files with this reader."""
+    large ECSV files with this reader.
+
+    Attributes:
+        kwargs: keyword arguments passed to astropy ascii reader.
+            See https://docs.astropy.org/en/stable/api/astropy.io.ascii.read.html#astropy.io.ascii.read
+    """

def __init__(self, **kwargs):
self.kwargs = kwargs
@@ -228,6 +236,8 @@ class FitsReader(InputReader):
one of `column_names` or `skip_column_names`
skip_column_names (list[str]): list of column names to skip. only use
one of `column_names` or `skip_column_names`
+        kwargs: keyword arguments passed along to astropy.Table.read.
+            See https://docs.astropy.org/en/stable/api/astropy.table.Table.html#astropy.table.Table.read
"""

def __init__(self, chunksize=500_000, column_names=None, skip_column_names=None, **kwargs):
@@ -281,6 +291,8 @@ class ParquetReader(InputReader):
chunksize (int): number of rows of the file to process at once.
For large files, this can prevent loading the entire file
into memory at once.
+        kwargs: arguments to pass along to pyarrow.parquet.ParquetFile.
+            See https://arrow.apache.org/docs/python/generated/pyarrow.parquet.ParquetFile.html
"""

def __init__(self, chunksize=500_000, **kwargs):
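The docstring additions above all describe the same plumbing: `get_file_reader` dispatches on a format string and forwards any remaining keyword arguments to the chosen reader class, which in turn hands them to the underlying library call. A minimal sketch of that dispatch pattern follows; the `*Sketch` classes and the factory function are illustrative stand-ins, not the real hipscat-import readers:

```python
class CsvReaderSketch:
    """Stand-in for CsvReader: stores kwargs destined for pandas.read_csv."""

    def __init__(self, chunksize=500_000, **kwargs):
        self.chunksize = chunksize
        self.kwargs = kwargs


class ParquetReaderSketch:
    """Stand-in for ParquetReader: stores kwargs destined for pyarrow.parquet.ParquetFile."""

    def __init__(self, chunksize=500_000, **kwargs):
        self.chunksize = chunksize
        self.kwargs = kwargs


_READERS = {"csv": CsvReaderSketch, "parquet": ParquetReaderSketch}


def get_file_reader_sketch(file_format, **kwargs):
    """Dispatch on the format string; everything else rides along to the reader."""
    return _READERS[file_format](**kwargs)


reader = get_file_reader_sketch("csv", chunksize=10_000, sep="|")
print(type(reader).__name__, reader.chunksize, reader.kwargs)
# CsvReaderSketch 10000 {'sep': '|'}
```

The value of documenting the kwargs pass-through, as this commit does, is that users can consult the pandas, pyarrow, or astropy reference pages directly rather than waiting for every option to be mirrored in the reader signatures.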

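The CsvReader docstring above notes that its kwargs flow to `pandas.read_csv`; the chunked mode that keeps such a reader's memory bounded can be sketched with pandas alone. The column names and values here are made up for illustration:

```python
import io

import pandas as pd

# A tiny in-memory CSV standing in for survey data.
csv_text = "ObjectID,ra,dec\n1,10.0,-5.0\n2,11.0,-4.5\n3,12.0,-4.0\n"

# chunksize makes read_csv return an iterator of DataFrames, which is how a
# chunked reader avoids loading the whole file; dtype plays the role of the
# reader's type_map argument.
chunks = pd.read_csv(io.StringIO(csv_text), chunksize=2, dtype={"ObjectID": "int64"})

sizes = [len(chunk) for chunk in chunks]
print(sizes)  # [2, 1]
```

Any other `read_csv` option (separator, header handling, column subsets) would be forwarded the same way through the reader's kwargs.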