diff --git a/docs/catalogs/arguments.rst b/docs/catalogs/arguments.rst
index b2ebb7ba..d7ab545e 100644
--- a/docs/catalogs/arguments.rst
+++ b/docs/catalogs/arguments.rst
@@ -220,6 +220,7 @@ There are several stages to the pipeline execution, and you can expect progress
 reporting to look like the following:
 
 .. code-block::
+    :class: no-copybutton
 
     Mapping  : 100%|██████████| 72/72 [58:55:18<00:00, 2946.09s/it]
     Binning  : 100%|██████████| 1/1 [01:15<00:00, 75.16s/it]
diff --git a/docs/conf.py b/docs/conf.py
index 836f4b84..d7d1df2a 100644
--- a/docs/conf.py
+++ b/docs/conf.py
@@ -24,16 +24,23 @@
 # -- General configuration ---------------------------------------------------
 # https://www.sphinx-doc.org/en/master/usage/configuration.html#general-configuration
-extensions = ["sphinx.ext.mathjax", "sphinx.ext.napoleon", "sphinx.ext.viewcode"]
-
-extensions.append("autoapi.extension")
-extensions.append("nbsphinx")
-
 templates_path = []
 exclude_patterns = ["_build", "**.ipynb_checkpoints"]
 
 # This assumes that sphinx-build is called from the root directory
 master_doc = "index"
+html_theme = "sphinx_rtd_theme"
+
+extensions = [
+    "sphinx.ext.mathjax",
+    "sphinx.ext.napoleon",
+    "sphinx.ext.viewcode",
+    "sphinx_copybutton",
+    "autoapi.extension",
+    "nbsphinx",
+]
+
+# -- autoapi configuration ----------------------------------------
 
 # Remove 'view source code' from top of page (for html, not python)
 html_show_sourcelink = False
 # Remove namespaces from class/method signatures
@@ -46,4 +53,12 @@ autoapi_member_order = "bysource"
 
 napoleon_google_docstring = True
 
-html_theme = "sphinx_rtd_theme"
+
+# -- sphinx-copybutton configuration ----------------------------------------
+## Sets up the expected prompt text in console blocks, and excludes it from
+## the text that goes into the clipboard.
+copybutton_exclude = ".linenos, .gp"
+copybutton_prompt_text = ">> "
+
+## Lets us suppress the copy button on select code blocks.
+copybutton_selector = "div:not(.no-copybutton) > div.highlight > pre"
diff --git a/docs/guide/contributing.rst b/docs/guide/contributing.rst
index 13abf1bf..b6ca6c71 100644
--- a/docs/guide/contributing.rst
+++ b/docs/guide/contributing.rst
@@ -33,7 +33,7 @@ virtual environment. LINCC-Frameworks engineers primarily use `conda` to manage
 environments. If you have conda installed locally, you can run the following to
 create and activate a new environment.
 
-.. code-block:: bash
+.. code-block:: console
 
    >> conda create env -n <env_name> python=3.10
   >> conda activate <env_name>
@@ -42,7 +42,7 @@ create and activate a new environment.
 Once you have created a new environment, you can install this project for local
 development using the following commands:
 
-.. code-block:: bash
+.. code-block:: console
 
    >> pip install -e .'[dev]'
    >> pre-commit install
@@ -70,19 +70,19 @@ Notes:
    `do not yet exist `_, so it's
    recommended to install via conda before proceeding to hipscat-import.
 
-   .. code-block:: bash
+   .. code-block:: console
 
-      $ conda config --add channels conda-forge
-      $ conda install healpy
-      $ git clone https://github.com/astronomy-commons/hipscat-import
-      $ cd hipscat-import
-      $ pip install -e .
+      >> conda config --add channels conda-forge
+      >> conda install healpy
+      >> git clone https://github.com/astronomy-commons/hipscat-import
+      >> cd hipscat-import
+      >> pip install -e .
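+
+   A quick, optional way to confirm that the editable install is picking up your
+   local checkout is to print the installed module's path (the exact path will
+   just be wherever you cloned the repository):
+
+   .. code-block:: python
+
+      import hipscat_import
+      print(hipscat_import.__file__)  # should point inside your hipscat-import checkout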
 
    When installing dev dependencies, make sure to include the single quotes.
 
-   .. code-block:: bash
+   .. code-block:: console
 
-      $ pip install -e '.[dev]'
+      >> pip install -e '.[dev]'
 
 Testing
 -------------------------------------------------------------------------------
@@ -97,17 +97,17 @@ paths. These are defined in ``conftest.py`` files. They're powerful and flexible
 
 Please add or update unit tests for all changes made to the codebase. You can
 run unit tests locally simply with:
 
-.. code-block:: bash
+.. code-block:: console
 
-   pytest
+   >> pytest
 
 If you're making changes to the sphinx documentation (anything under ``docs``),
 you can build the documentation locally with a command like:
 
-.. code-block:: bash
+.. code-block:: console
 
-   cd docs
-   make html
+   >> cd docs
+   >> make html
 
 Create your PR
 -------------------------------------------------------------------------------
diff --git a/docs/guide/index_table.rst b/docs/guide/index_table.rst
index a5f6d1bb..74382e58 100644
--- a/docs/guide/index_table.rst
+++ b/docs/guide/index_table.rst
@@ -1,8 +1,293 @@
 Index Table
 ===============================================================================
 
-See the API documentation for :py:class:`hipscat_import.index.arguments.IndexArguments`
+This page discusses topics around setting up a pipeline to generate a secondary
+index lookup for a field on an existing hipscat catalog on disk.
 
-.. note::
+This is useful if you would like to have quick access to rows of your table using
+a survey-provided unique identifier that is NOT spatially correlated. Without an
+index, finding such a row requires a full scan of all partitions. A secondary
+index performs this full scan and shuffle once, and writes out a simple mapping
+from the unique identifier to where it can be found in the sky.
 
-    TODO - write the rest
\ No newline at end of file
+At a minimum, you need arguments that include where to find the input files,
+and where to put the output files. A minimal arguments block will look something like:
+
+.. code-block:: python
+
+    from hipscat_import.index.arguments import IndexArguments
+
+    args = IndexArguments(
+        input_catalog_path="./my_data/my_catalog",
+        indexing_column="target_id",
+        output_path="./output",
+        output_artifact_name="my_catalog_target_id",
+    )
+
+More details on each of these parameters are provided in the sections below.
+
+For the curious, see the API documentation for
+:py:class:`hipscat_import.index.arguments.IndexArguments`,
+and its superclass :py:class:`hipscat_import.runtime_arguments.RuntimeArguments`.
+
+Dask setup
+-------------------------------------------------------------------------------
+
+We will either use a user-provided dask ``Client``, or create a new one with
+the following arguments:
+
+``dask_tmp`` - ``str`` - directory for dask worker space. This should be local to
+the execution of the pipeline, for speed of reads and writes. For much more
+information, see :doc:`/catalogs/temp_files`.
+
+``dask_n_workers`` - ``int`` - number of workers for the dask client. Defaults to 1.
+
+``dask_threads_per_worker`` - ``int`` - number of threads per dask worker. Defaults to 1.
+
+If you find that you need additional parameters for your dask client (e.g. you are
+creating a SLURM worker pool), you can instead create your own dask client and pass
+it along to the pipeline, ignoring the above arguments. This would look like:
+
+.. code-block:: python
+
+    from dask.distributed import Client
+    from hipscat_import.pipeline import pipeline_with_client
+
+    args = IndexArguments(...)
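+    # 'scheduler:port' below is a placeholder for the address of a dask scheduler
+    # you have already started elsewhere; replace it with your scheduler's real
+    # address, or pass a dask Cluster object to Client instead.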
+    with Client('scheduler:port') as client:
+        pipeline_with_client(args, client)
+
+If you're running within a ``.py`` file, we recommend you use a ``main`` guard to
+potentially avoid some python threading issues with dask:
+
+.. code-block:: python
+
+    from hipscat_import.pipeline import pipeline
+
+    def index_pipeline():
+        args = IndexArguments(...)
+        pipeline(args)
+
+    if __name__ == '__main__':
+        index_pipeline()
+
+Input Catalog
+-------------------------------------------------------------------------------
+
+For this pipeline, you will need to have already transformed your catalog into
+hipscat parquet format. Provide the path to the catalog data with the argument
+``input_catalog_path``.
+
+``indexing_column`` is required, and is the column that you would like to create
+an index over and sort by.
+
+``extra_columns`` will also be stored alongside the ``indexing_column``. This
+can be useful if you frequently need those fields when looking rows up, since
+they can then be fetched in a single read.
+
+Divisions
+-------------------------------------------------------------------------------
+
+In generating an index over a catalog column, we use dask's ``set_index``
+method to shuffle the catalog data around. This can be a very expensive operation.
+We can save a lot of time and general compute resources if we have some intelligent
+prior information about the distribution of the values inside the column we're
+building an index on.
+
+Roughly speaking, the index table will have some even buckets of values for
+the ``indexing_column``. The ``division_hints`` argument provides a reasonable
+prior for the starting values of those histogram bins.
+
+Note that these will NOT necessarily be the divisions that the data is
+partitioned along.
+
+.. note::
+    Use a python list
+
+    It's important to dask that the divisions be a list, and not a numpy array,
+    and don't forget to append the maximum value as an extra division at the end.
+
+String IDs
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+Gaia DR3 uses a string identifier for its objects; below, we show how to create
+sample divisions for that data.
+
+We can create these divisions with just the **prefixes** of the strings, and
+string sorting will be smart enough to collate the various strings appropriately.
+
+.. code-block:: python
+
+    divisions = [f"Gaia DR3 {i}" for i in range(10000, 99999, 12)]
+    divisions.append("Gaia DR3 999999988604363776")
+
+Getting hints from ``_metadata``
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+.. note::
+    Don't panic!
+
+    This is totally optional, and just provided here for reference if you
+    really aren't sure how to provide some division priors.
+
+Parquet's ``_metadata`` file provides some high-level statistics about its columns,
+which include the minimum and maximum value within individual parquet files.
+By reading just the ``_metadata`` file, we can construct a reasonable set
+of divisions.
+
+First, find the minimum and maximum values across all of our data. We do this
+just by looking inside that ``_metadata`` file, so we don't need to do a full
+catalog scan for these high-level statistics!
+
+Then use those values, and a little arithmetic, to create a **list** of divisions
+(it's important to dask that this be a list, and not a numpy array). Pass this
+list along to your ``IndexArguments`` as ``division_hints``!
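+
+For example, a sketch of where the hints end up (the paths and column name here
+are hypothetical):
+
+.. code-block:: python
+
+    args = IndexArguments(
+        input_catalog_path="./my_data/my_catalog",
+        indexing_column="target_id",
+        division_hints=divisions,
+        output_path="./output",
+        output_artifact_name="my_catalog_target_id",
+    )
+
+The full recipe for deriving ``divisions`` from the ``_metadata`` file might look
+like the following: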
+
+.. code-block:: python
+
+    import os
+
+    import numpy as np
+    from hipscat.io import file_io
+
+    ## Specify the catalog and column you're making your index over.
+    input_catalog_path = "/data/input_catalog"
+    indexing_column = "target_id"
+
+    ## You might not need to change anything after this point.
+    total_metadata = file_io.read_parquet_metadata(os.path.join(input_catalog_path, "_metadata"))
+
+    # This block just finds the indexing column within the _metadata file.
+    first_row_group = total_metadata.row_group(0)
+    index_column_idx = -1
+    for i in range(0, first_row_group.num_columns):
+        column = first_row_group.column(i)
+        if column.path_in_schema == indexing_column:
+            index_column_idx = i
+
+    # Now loop through all of the row groups in the input data and find the
+    # overall bounds of the indexing_column.
+    num_row_groups = total_metadata.num_row_groups
+    global_min = total_metadata.row_group(0).column(index_column_idx).statistics.min
+    global_max = total_metadata.row_group(0).column(index_column_idx).statistics.max
+
+    for index in range(1, num_row_groups):
+        global_min = min(global_min, total_metadata.row_group(index).column(index_column_idx).statistics.min)
+        global_max = max(global_max, total_metadata.row_group(index).column(index_column_idx).statistics.max)
+
+    print("global min", global_min)
+    print("global max", global_max)
+
+    increment = int((global_max - global_min) / num_row_groups)
+
+    divisions = np.append(np.arange(start=global_min, stop=global_max, step=increment), global_max)
+    divisions = divisions.tolist()
+
+Progress Reporting
+-------------------------------------------------------------------------------
+
+By default, we will display some progress bars during pipeline execution. To
+disable these (e.g. when you don't want anything written to standard out), you
+can set ``progress_bar=False``.
+
+There are several stages to the pipeline execution, and you can expect progress
+reporting to look like the following:
+
+.. code-block::
+    :class: no-copybutton
+
+    Mapping  : 100%|██████████| 2352/2352 [9:25:00<00:00, 14.41s/it]
+    Reducing : 100%|██████████| 2385/2385 [00:43<00:00, 54.47it/s]
+    Finishing: 100%|██████████| 4/4 [00:03<00:00, 1.15it/s]
+
+For very long-running pipelines (e.g. multi-TB inputs), you can get an
+email notification when the pipeline completes using the
+``completion_email_address`` argument. This will send a brief email
+on either pipeline success or failure.
+
+Output
+-------------------------------------------------------------------------------
+
+Where?
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+You must specify a name for the index table, using ``output_artifact_name``.
+A good convention is the name of the primary input catalog, followed by the
+indexing column, e.g. ``gaia_designation`` would be an index table
+based on gaia that indexes over the ``designation`` field.
+
+You must specify where you want your index table to be written, using
+``output_path``. This path should be the base directory for your catalogs, as
+the full path for the index will take the form of ``output_path/output_artifact_name``.
+
+If there is already catalog or index data in the indicated directory, you can
+force new data to be written in the directory with the ``overwrite`` flag. It's
+preferable to delete any existing contents first, however, as leftover files may
+cause unexpected side effects.
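+
+For example, a sketch of just the output-related arguments (the paths and names
+here are hypothetical):
+
+.. code-block:: python
+
+    args = IndexArguments(
+        input_catalog_path="./my_data/my_catalog",
+        indexing_column="target_id",
+        output_path="./output",
+        output_artifact_name="my_catalog_target_id",
+        # the index table will be written under ./output/my_catalog_target_id
+        overwrite=True,  # only set this if you really mean to replace existing data
+    )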
+
+If you're writing to cloud storage, or otherwise have some filesystem credential
+dict, put those in ``output_storage_options``.
+
+In addition, you can specify directories to use for various intermediate files:
+
+- dask worker space (``dask_tmp``)
+- sharded parquet files (``tmp_dir``)
+
+Most users are going to be ok with simply setting the ``tmp_dir`` for all intermediate
+file use. For more information on these parameters, when you would use each,
+and demonstrations of temporary file use, see :doc:`/catalogs/temp_files`.
+
+How?
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+You may want to tweak parameters of the final index output, and we have helper
+arguments for a few of those.
+
+``compute_partition_size`` - ``int`` - partition size used when
+computing the leaf parquet files.
+
+``include_hipscat_index`` - ``bool`` - whether or not to include the 64-bit
+hipscat spatial index in the index table. Defaults to ``True``. It can be
+useful to keep this value if the ``_hipscat_index`` is your only unique
+identifier, or you intend to re-partition your data.
+
+``include_order_pixel`` - ``bool`` - whether to include the partitioning columns
+``Norder``, ``Dir``, and ``Npix``. You probably want to keep these!
+Defaults to ``True``. If you change this, there might be unexpected behavior
+when trying to use the index table.
+
+``drop_duplicates`` - ``bool`` - drop duplicate occurrences of all fields
+that are included in the index table. This is enabled by default, but can be
+**very** slow. It interacts with the ``include_hipscat_index``
+and ``include_order_pixel`` options above. We describe some common patterns below:
+
+- I want to create an index over the target ID in my catalog. There are no
+  light curves in my data and it is a flat catalog.
+
+  .. code-block:: python
+
+      indexing_column="target_id",
+      # target_id is unique, and I don't need to keep extra data
+      include_hipscat_index=False,
+      # I want to know where my data is in the sky.
+      include_order_pixel=True,
+      # target_id is unique, and I don't need to do extra work to de-duplicate
+      drop_duplicates=False,
+
+- I have a catalog of light curve data. There is a unique ``detection_id``,
+  and light curves are grouped by the ``target_id``. I want to create an
+  index over the ``target_id`` to quickly get a light curve for a target.
+  I want one row in my index for each partition with a given ``target_id``.
+
+  .. code-block:: python
+
+      indexing_column="target_id",
+      # target_id is NOT unique
+      drop_duplicates=True,
+      # target_id is NOT unique, but including the _hipscat_index will bloat results
+      include_hipscat_index=False,
+      # I want to know where my data is in the sky.
+      include_order_pixel=True,
diff --git a/docs/guide/margin_cache.rst b/docs/guide/margin_cache.rst
index 1c97dcb6..e081ab3d 100644
--- a/docs/guide/margin_cache.rst
+++ b/docs/guide/margin_cache.rst
@@ -1,9 +1,151 @@
 Margin Cache
 ===============================================================================
 
-See the API documentation for
-:py:class:`hipscat_import.margin_cache.margin_cache_arguments.MarginCacheArguments`
+For more discussion of the whys and hows of margin caches, please see
+`Max's AAS iPoster `_.
 
-.. note::
+This page discusses topics around setting up a pipeline to generate a margin
+cache from an existing hipscat catalog on disk.
 
-    TODO - write the rest
\ No newline at end of file
+At a minimum, you need arguments that include where to find the input files,
+and where to put the output files. A minimal arguments block will look something like:
+
+.. code-block:: python
+
+    from hipscat_import.margin_cache.margin_cache_arguments import MarginCacheArguments
+
+    args = MarginCacheArguments(
+        input_catalog_path="./my_data/my_catalog",
+        output_path="./output",
+        output_artifact_name="my_catalog_10arcs",
+    )
+
+More details on each of these parameters are provided in the sections below.
+
+For the curious, see the API documentation for
+:py:class:`hipscat_import.margin_cache.margin_cache_arguments.MarginCacheArguments`,
+and its superclass :py:class:`hipscat_import.runtime_arguments.RuntimeArguments`.
+
+Dask setup
+-------------------------------------------------------------------------------
+
+We will either use a user-provided dask ``Client``, or create a new one with
+the following arguments:
+
+``dask_tmp`` - ``str`` - directory for dask worker space. This should be local to
+the execution of the pipeline, for speed of reads and writes. For much more
+information, see :doc:`/catalogs/temp_files`.
+
+``dask_n_workers`` - ``int`` - number of workers for the dask client. Defaults to 1.
+
+``dask_threads_per_worker`` - ``int`` - number of threads per dask worker. Defaults to 1.
+
+If you find that you need additional parameters for your dask client (e.g. you are
+creating a SLURM worker pool), you can instead create your own dask client and pass
+it along to the pipeline, ignoring the above arguments. This would look like:
+
+.. code-block:: python
+
+    from dask.distributed import Client
+    from hipscat_import.pipeline import pipeline_with_client
+
+    args = MarginCacheArguments(...)
+    with Client('scheduler:port') as client:
+        pipeline_with_client(args, client)
+
+If you're running within a ``.py`` file, we recommend you use a ``main`` guard to
+potentially avoid some python threading issues with dask:
+
+.. code-block:: python
+
+    from hipscat_import.pipeline import pipeline
+
+    def margin_pipeline():
+        args = MarginCacheArguments(...)
+        pipeline(args)
+
+    if __name__ == '__main__':
+        margin_pipeline()
+
+Input Catalog
+-------------------------------------------------------------------------------
+
+For this pipeline, you will need to have already transformed your catalog into
+hipscat parquet format. Provide the path to the catalog data with the argument
+``input_catalog_path``.
+
+The input hipscat catalog will provide its own right ascension and declination
+that will be used when computing margin populations.
+
+Margin calculation parameters
+-------------------------------------------------------------------------------
+
+When creating a margin catalog, we need to know how large of a margin to include
+around each pixel in the input catalog.
+
+``margin_threshold`` is the size of the margin cache boundary, given in arcseconds.
+This defaults to 5 arcseconds, but you should set this value to whatever is
+appropriate for the astrometry error/PSF width of your instruments. If you're
+not sure how to determine this, please reach out via :doc:`/guide/contact`.
+We'd love to help!
+
+Setting ``margin_order`` (the healpix order used to pre-filter margin candidates)
+*can* make your pipeline run faster; a quick way to sanity-check a value is
+sketched after this list:
+
+#. For each input catalog partition, we can quickly determine all possible
+   neighboring healpix pixels at the given ``margin_order``. All of these partitions
+   *may* contain points that are inside the ``margin_threshold``.
+#. For each point in the input catalog, we can quickly determine the healpix
+   pixel at ``margin_order`` and filter points based on this.
+#. Using this smaller, constrained data set, we do precise boundary checking
+   to determine if the points are within the ``margin_threshold``.
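+
+If you want to sanity-check a ``margin_order`` before running the pipeline, a rough
+comparison of the healpix pixel size at that order against your threshold might look
+like the following (the order and threshold values here are just examples):
+
+.. code-block:: python
+
+    import healpy as hp
+
+    margin_order = 10
+    margin_threshold = 10.0  # arcseconds
+
+    # approximate angular size of a healpix pixel at this order, in arcseconds
+    pixel_size = hp.nside2resol(2**margin_order, arcmin=True) * 60
+
+    # pixels at margin_order should be comfortably larger than the margin itself
+    print(pixel_size, margin_threshold, pixel_size > margin_threshold)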
+
+Progress Reporting
+-------------------------------------------------------------------------------
+
+By default, we will display some progress bars during pipeline execution. To
+disable these (e.g. when you don't want anything written to standard out), you
+can set ``progress_bar=False``.
+
+There are several stages to the pipeline execution, and you can expect progress
+reporting to look like the following:
+
+.. code-block::
+    :class: no-copybutton
+
+    Mapping  : 100%|██████████| 2352/2352 [9:25:00<00:00, 14.41s/it]
+    Reducing : 100%|██████████| 2385/2385 [00:43<00:00, 54.47it/s]
+    Finishing: 100%|██████████| 4/4 [00:03<00:00, 1.15it/s]
+
+For very long-running pipelines (e.g. multi-TB inputs), you can get an
+email notification when the pipeline completes using the
+``completion_email_address`` argument. This will send a brief email
+on either pipeline success or failure.
+
+Output
+-------------------------------------------------------------------------------
+
+You must specify a name for the margin catalog, using ``output_artifact_name``.
+A good convention is the name of the primary input catalog, followed by the
+margin threshold, e.g. ``gaia_10arcs`` would be a margin catalog based on gaia
+that uses 10 arcseconds for margins.
+
+You must specify where you want your margin data to be written, using
+``output_path``. This path should be the base directory for your catalogs, as
+the full path for the margin will take the form of ``output_path/output_artifact_name``.
+
+If there is already catalog or margin data in the indicated directory, you can
+force new data to be written in the directory with the ``overwrite`` flag. It's
+preferable to delete any existing contents first, however, as leftover files may
+cause unexpected side effects.
+
+If you're writing to cloud storage, or otherwise have some filesystem credential
+dict, put those in ``output_storage_options``.
+
+In addition, you can specify directories to use for various intermediate files:
+
+- dask worker space (``dask_tmp``)
+- sharded parquet files (``tmp_dir``)
+
+Most users are going to be ok with simply setting the ``tmp_dir`` for all intermediate
+file use. For more information on these parameters, when you would use each,
+and demonstrations of temporary file use, see :doc:`/catalogs/temp_files`.
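+
+Putting the pieces on this page together, a fuller arguments block might look like
+the following sketch (the paths, names, worker counts, and threshold are all
+hypothetical):
+
+.. code-block:: python
+
+    from hipscat_import.margin_cache.margin_cache_arguments import MarginCacheArguments
+    from hipscat_import.pipeline import pipeline
+
+    args = MarginCacheArguments(
+        input_catalog_path="./my_data/my_catalog",
+        output_path="./output",
+        output_artifact_name="my_catalog_10arcs",
+        margin_threshold=10.0,  # arcseconds
+        tmp_dir="/data/tmp/margin_intermediate",
+        dask_tmp="/local/dask_scratch",
+        dask_n_workers=4,
+        dask_threads_per_worker=1,
+    )
+    pipeline(args)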
diff --git a/docs/index.rst b/docs/index.rst
index 799e98d4..14e41452 100644
--- a/docs/index.rst
+++ b/docs/index.rst
@@ -9,7 +9,7 @@ Installation
 We recommend installing in a virtual environment, like venv or conda. You may
 need to install or upgrade versions of dependencies to work with hipscat-import.
 
-.. code-block:: bash
+.. code-block:: console
 
    pip install hipscat-import
 
@@ -21,10 +21,10 @@ need to install or upgrade versions of dependencies to work with hipscat-import.
    `do not yet exist `_, so it's
    recommended to install via conda before proceeding to hipscat-import.
 
-   .. code-block:: bash
+   .. code-block:: console
 
-      $ conda config --append channels conda-forge
-      $ conda install healpy
+      >> conda config --append channels conda-forge
+      >> conda install healpy
 
 Setting up a pipeline
 -------------------------------------------------------------------------------
diff --git a/docs/requirements.txt b/docs/requirements.txt
index e4a22bb3..157d2f40 100644
--- a/docs/requirements.txt
+++ b/docs/requirements.txt
@@ -5,3 +5,4 @@ nbsphinx
 ipython
 jupytext
 jupyter
+sphinx-copybutton
\ No newline at end of file