Merge pull request #27 from NREL/feat/ingest-multiple-tables
Allow user to ingest multiple tables at once
Showing 29 changed files with 1,771 additions and 145 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,20 @@ | ||
# Minimal makefile for Sphinx documentation | ||
# | ||
|
||
# You can set these variables from the command line, and also | ||
# from the environment for the first two. | ||
SPHINXOPTS ?= | ||
SPHINXBUILD ?= sphinx-build | ||
SOURCEDIR = . | ||
BUILDDIR = _build | ||
|
||
# Put it first so that "make" without argument is like "make help". | ||
help: | ||
@$(SPHINXBUILD) -M help "$(SOURCEDIR)" "$(BUILDDIR)" $(SPHINXOPTS) $(O) | ||
|
||
.PHONY: help Makefile | ||
|
||
# Catch-all target: route all unknown targets to Sphinx using the new | ||
# "make mode" option. $(O) is meant as a shortcut for $(SPHINXOPTS). | ||
%: Makefile | ||
@$(SPHINXBUILD) -M $@ "$(SOURCEDIR)" "$(BUILDDIR)" $(SPHINXOPTS) $(O) |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,56 @@ | ||
# Configuration file for the Sphinx documentation builder. | ||
# | ||
# For the full list of built-in configuration values, see the documentation: | ||
# https://www.sphinx-doc.org/en/master/usage/configuration.html | ||
|
||
# -- Project information ----------------------------------------------------- | ||
# https://www.sphinx-doc.org/en/master/usage/configuration.html#project-information | ||
|
||
project = "Chronify" | ||
copyright = "2024, NREL" | ||
author = "NREL" | ||
release = "0.1.0" | ||
|
||
# -- General configuration --------------------------------------------------- | ||
# https://www.sphinx-doc.org/en/master/usage/configuration.html#general-configuration | ||
|
||
extensions = [ | ||
"myst_parser", | ||
"sphinx.ext.githubpages", | ||
"sphinx.ext.autodoc", | ||
"sphinx.ext.napoleon", | ||
"sphinx.ext.todo", | ||
"sphinx_copybutton", | ||
"sphinxcontrib.autodoc_pydantic", | ||
"sphinx_tabs.tabs", | ||
] | ||
|
||
templates_path = ["_templates"] | ||
exclude_patterns = ["_build", "Thumbs.db", ".DS_Store"] | ||
|
||
|
||
# -- Options for HTML output ------------------------------------------------- | ||
# https://www.sphinx-doc.org/en/master/usage/configuration.html#options-for-html-output | ||
|
||
source_suffix = { | ||
".txt": "markdown", | ||
".md": "markdown", | ||
} | ||
|
||
html_theme = "furo" | ||
html_title = "Chronify Documentation" | ||
html_theme_options = { | ||
"navigation_with_keys": True, | ||
} | ||
html_static_path = ["_static"] | ||
|
||
todo_include_todos = True | ||
autoclass_content = "both" | ||
autodoc_member_order = "bysource" | ||
copybutton_only_copy_prompt_lines = True | ||
copybutton_exclude = ".linenos, .gp, .go" | ||
copybutton_line_continuation_character = "\\" | ||
copybutton_here_doc_delimiter = "EOT" | ||
copybutton_prompt_text = "$" | ||
copybutton_copy_empty_lines = False |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,11 @@ | ||
```{eval-rst} | ||
.. _explanation-page: | ||
``` | ||
# Explanation | ||
|
||
```{eval-rst} | ||
.. toctree:: | ||
:maxdepth: 2 | ||
:caption: Contents: | ||
``` |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,9 @@ | ||
# Getting Started | ||
|
||
```{eval-rst} | ||
.. toctree:: | ||
:maxdepth: 2 | ||
installation | ||
quick_start | ||
``` |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,37 @@ | ||
|
||
```{eval-rst} | ||
.. _installation: | ||
``` | ||
|
||
# Installation | ||
|
||
1. Install Python 3.11 or later. | ||
|
||
2. Create a Python 3.11+ virtual environment. This example uses the ``venv`` module in the standard | ||
library to create a virtual environment in the current directory. You may prefer to keep your | ||
environments in a single directory, such as `python-envs`, in your home directory instead. You may | ||
also prefer ``conda`` or ``mamba``. | ||
|
||
```{eval-rst} | ||
.. code-block:: console | ||
$ python -m venv env | ||
``` | ||
|
||
3. Activate the virtual environment. | ||
|
||
```{eval-rst} | ||
.. code-block:: console | ||
$ source env/bin/activate | ||
``` | ||
|
||
Whenever you are done using chronify, you can deactivate the environment by running ``deactivate``. | ||
|
||
4. Install the Python package `chronify`. | ||
|
||
```{eval-rst} | ||
.. code-block:: console | ||
$ pip install chronify | ||
``` |
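Before creating the environment, it can help to confirm that the interpreter meets the Python 3.11 minimum from step 1. The following is a minimal sketch; the helper name `meets_minimum` is hypothetical and is not part of chronify.

```python
import sys


def meets_minimum(version: tuple[int, ...], minimum: tuple[int, int] = (3, 11)) -> bool:
    """Return True if `version` is at least `minimum` (major, minor)."""
    return tuple(version[:2]) >= minimum


if __name__ == "__main__":
    # sys.version_info behaves like a tuple: (major, minor, micro, ...).
    status = "OK" if meets_minimum(sys.version_info) else "too old for chronify"
    print(f"Python {sys.version_info.major}.{sys.version_info.minor}: {status}")
```

Run it with the same `python` executable that you plan to pass to `python -m venv`.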
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,43 @@ | ||
# Quick Start | ||
|
||
```python | ||
|
||
from datetime import datetime, timedelta | ||
|
||
import numpy as np | ||
import pandas as pd | ||
from chronify import DatetimeRange, Store, TableSchema | ||
|
||
store = Store.create_file_db(file_path="time_series.db") | ||
resolution = timedelta(hours=1) | ||
time_range = pd.date_range("2020-01-01", "2020-12-31 23:00:00", freq=resolution) | ||
store.ingest_tables( | ||
    ( | ||
        pd.DataFrame({"timestamp": time_range, "value": np.random.random(8784), "id": 1}), | ||
        pd.DataFrame({"timestamp": time_range, "value": np.random.random(8784), "id": 2}), | ||
    ), | ||
    TableSchema( | ||
        name="devices", | ||
        value_column="value", | ||
        time_config=DatetimeRange( | ||
            time_column="timestamp", | ||
            start=datetime(2020, 1, 1, 0), | ||
            length=8784, | ||
            resolution=timedelta(hours=1), | ||
        ), | ||
        time_array_id_columns=["id"], | ||
    ) | ||
) | ||
query = "SELECT timestamp, value, id FROM devices WHERE id = ?" | ||
df = store.read_query("devices", query, params=(2,)) | ||
df.head() | ||
``` | ||
|
||
``` | ||
timestamp value id | ||
0 2020-01-01 00:00:00 0.594748 2 | ||
1 2020-01-01 01:00:00 0.608295 2 | ||
2 2020-01-01 02:00:00 0.297535 2 | ||
3 2020-01-01 03:00:00 0.870238 2 | ||
4 2020-01-01 04:00:00 0.376144 2 | ||
``` |
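The `length=8784` in the schema follows from 2020 being a leap year: 366 days at hourly resolution. As a quick check using only the standard library (the helper name `num_steps` is hypothetical, not a chronify function):

```python
from datetime import datetime, timedelta


def num_steps(start: datetime, end_exclusive: datetime, resolution: timedelta) -> int:
    """Count timestamps in [start, end_exclusive) spaced by `resolution`."""
    return (end_exclusive - start) // resolution


length = num_steps(datetime(2020, 1, 1), datetime(2021, 1, 1), timedelta(hours=1))
print(length)  # 8784, because 2020 has 366 days * 24 hours
```

For a non-leap year such as 2021, the same calculation gives 8760, so the `length` and the size of the generated data would both need to change together.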
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,14 @@ | ||
```{eval-rst} | ||
.. _how-tos-page: | ||
``` | ||
# How Tos | ||
|
||
```{eval-rst} | ||
.. toctree:: | ||
:maxdepth: 2 | ||
:caption: Contents: | ||
getting_started/index | ||
ingest_multiple_tables | ||
map_time_config | ||
``` |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,72 @@ | ||
# How to Ingest Multiple Tables Efficiently | ||
|
||
There are a few important considerations when ingesting many tables: | ||
- Use one database connection. | ||
- Avoid loading all tables into memory at once, if possible. | ||
- Ensure additions are atomic. If anything fails, the final state should be the same as the initial | ||
state. | ||
|
||
**Setup** | ||
|
||
The input data are in CSV files. Each file contains a timestamp column and one value column per | ||
device. | ||
|
||
```python | ||
from datetime import datetime, timedelta | ||
|
||
import numpy as np | ||
import pandas as pd | ||
from chronify import ColumnDType, CsvTableSchema, DatetimeRange, Store, TableSchema | ||
from sqlalchemy import DateTime, Double  # column types used in src_schema below | ||
|
||
store = Store.create_in_memory_db() | ||
resolution = timedelta(hours=1) | ||
time_config = DatetimeRange( | ||
    time_column="timestamp", | ||
    start=datetime(2020, 1, 1, 0), | ||
    length=8784, | ||
    resolution=timedelta(hours=1), | ||
) | ||
src_schema = CsvTableSchema( | ||
    time_config=time_config, | ||
    column_dtypes=[ | ||
        ColumnDType(name="timestamp", dtype=DateTime(timezone=False)), | ||
        ColumnDType(name="device1", dtype=Double()), | ||
        ColumnDType(name="device2", dtype=Double()), | ||
        ColumnDType(name="device3", dtype=Double()), | ||
    ], | ||
    value_columns=["device1", "device2", "device3"], | ||
    pivoted_dimension_name="device", | ||
) | ||
dst_schema = TableSchema( | ||
    name="devices", | ||
    value_column="value", | ||
    time_config=time_config, | ||
    time_array_id_columns=["id"], | ||
) | ||
``` | ||
|
||
## Automated through chronify | ||
Chronify will manage the database connection and errors. | ||
```python | ||
store.ingest_from_csvs( | ||
src_schema, | ||
dst_schema, | ||
( | ||
"/path/to/file1.csv", | ||
"/path/to/file2.csv", | ||
"/path/to/file3.csv", | ||
), | ||
) | ||
|
||
``` | ||
|
||
## Self-Managed | ||
Open one connection to the database for the duration of your additions. Handle errors. | ||
```python | ||
with store.engine.connect() as conn: | ||
    try: | ||
        store.ingest_from_csv(src_schema, dst_schema, "/path/to/file1.csv") | ||
        store.ingest_from_csv(src_schema, dst_schema, "/path/to/file2.csv") | ||
        store.ingest_from_csv(src_schema, dst_schema, "/path/to/file3.csv") | ||
    except Exception: | ||
        # Undo any partial additions, then surface the error to the caller. | ||
        conn.rollback() | ||
        raise |
``` |
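The three considerations listed at the top of this page (one connection, one file in memory at a time, atomic additions) can be illustrated without chronify. The following is a minimal sketch using only the standard library's `sqlite3` and `csv` modules. The `devices` table layout, the helper name `ingest_csvs_atomically`, and the sample data are assumptions for illustration; this is not chronify's implementation.

```python
import csv
import io
import sqlite3
from typing import Iterable, TextIO


def ingest_csvs_atomically(conn: sqlite3.Connection, files: Iterable[TextIO]) -> None:
    """Insert rows from every CSV in one transaction; roll back all of them on any error."""
    try:
        for f in files:  # only one file's rows are held in memory at a time
            rows = [(r["timestamp"], float(r["value"])) for r in csv.DictReader(f)]
            conn.executemany("INSERT INTO devices (timestamp, value) VALUES (?, ?)", rows)
        conn.commit()
    except Exception:
        conn.rollback()  # restore the state from before this call
        raise


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE devices (timestamp TEXT, value REAL)")
conn.commit()

good = "timestamp,value\n2020-01-01 00:00:00,1.0\n2020-01-01 01:00:00,2.0\n"
bad = "timestamp,value\n2020-01-01 00:00:00,not-a-number\n"

ingest_csvs_atomically(conn, [io.StringIO(good)])  # succeeds: 2 rows committed
try:
    ingest_csvs_atomically(conn, [io.StringIO(good), io.StringIO(bad)])
except ValueError:
    pass  # the batch failed, so its first file's rows were rolled back too

row_count = conn.execute("SELECT COUNT(*) FROM devices").fetchone()[0]
print(row_count)  # 2
```

The second call inserts the rows of the good file before it reaches the bad one, yet the table still holds only the two rows from the first call. That is the "final state equals initial state" property described above.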
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,90 @@ | ||
# How to Map Time | ||
This recipe demonstrates how to map a table's time configuration from one type to another. | ||
|
||
**Source table**: data is stored in representative time, with one week of hourly data per month | ||
for one year. | ||
|
||
**Destination table**: data is stored with `datetime` timestamps for each hour of the year. | ||
|
||
**Workflow**: | ||
- Add the source table to the database. | ||
- Call `Store.map_table_time_config()`. | ||
- Chronify adds the destination table to the database. | ||
|
||
This example creates a representative time table used in chronify's tests. | ||
|
||
1. Ingest the source data. | ||
|
||
```python | ||
from datetime import datetime, timedelta | ||
|
||
import numpy as np | ||
import pandas as pd | ||
|
||
from chronify import ( | ||
    DatetimeRange, | ||
    RepresentativePeriodFormat, | ||
    RepresentativePeriodTime, | ||
    Store, | ||
    CsvTableSchema, | ||
    TableSchema, | ||
) | ||
|
||
src_table_name = "ev_charging" | ||
dst_table_name = "ev_charging_datetime" | ||
hours_per_year = 12 * 7 * 24 | ||
num_time_arrays = 3 | ||
df = pd.DataFrame({ | ||
    "id": np.concatenate([np.repeat(i, hours_per_year) for i in range(1, 1 + num_time_arrays)]), | ||
    "month": np.tile(np.repeat(range(1, 13), 7 * 24), num_time_arrays), | ||
    "day_of_week": np.tile(np.tile(np.repeat(range(7), 24), 12), num_time_arrays), | ||
    "hour": np.tile(np.tile(range(24), 12 * 7), num_time_arrays), | ||
    "value": np.random.random(hours_per_year * num_time_arrays), | ||
}) | ||
schema = TableSchema( | ||
    name=src_table_name, | ||
    value_column="value", | ||
    time_config=RepresentativePeriodTime( | ||
        time_format=RepresentativePeriodFormat.ONE_WEEK_PER_MONTH_BY_HOUR, | ||
    ), | ||
    time_array_id_columns=["id"], | ||
) | ||
store = Store.create_in_memory_db() | ||
store.ingest_table(df, schema) | ||
store.read_query(src_table_name, f"SELECT * FROM {src_table_name} LIMIT 5").head() | ||
``` | ||
|
||
``` | ||
id month day_of_week hour value | ||
0 1 1 0 0 0.578496 | ||
1 1 1 0 1 0.092271 | ||
2 1 1 0 2 0.111521 | ||
3 1 1 0 3 0.671668 | ||
4 1 1 0 4 0.782365 | ||
``` | ||
|
||
2. Map the table's time to datetime. | ||
```python | ||
dst_schema = TableSchema( | ||
    name=dst_table_name, | ||
    value_column="value", | ||
    time_array_id_columns=["id"], | ||
    time_config=DatetimeRange( | ||
        time_column="timestamp", | ||
        start=datetime(2020, 1, 1, 0), | ||
        length=8784, | ||
        resolution=timedelta(hours=1), | ||
    ) | ||
) | ||
store.map_table_time_config(src_table_name, dst_schema) | ||
store.read_query(dst_table_name, f"SELECT * FROM {dst_table_name} LIMIT 5").head() | ||
``` | ||
|
||
``` | ||
id value timestamp | ||
0 3 0.006213 2020-01-01 00:00:00 | ||
1 3 0.865765 2020-01-01 01:00:00 | ||
2 3 0.187256 2020-01-01 02:00:00 | ||
3 3 0.336157 2020-01-01 03:00:00 | ||
4 3 0.582281 2020-01-01 04:00:00 | ||
``` |
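Conceptually, the mapping looks up, for every hourly timestamp in the destination range, the source row with the matching month, day of week, and hour. The following is a minimal standard-library sketch of that idea for a single time array. It assumes `day_of_week` follows Python's `datetime.weekday()` convention (Monday is 0), which may differ from chronify's convention, and the helper name `map_to_datetime` is hypothetical; chronify performs the actual mapping in the database.

```python
from datetime import datetime, timedelta

# Hypothetical source data: one value per (month, day_of_week, hour).
# Assumption: day_of_week follows datetime.weekday() (Monday == 0).
source = {
    (month, dow, hour): float(month * 1000 + dow * 24 + hour)
    for month in range(1, 13)
    for dow in range(7)
    for hour in range(24)
}


def map_to_datetime(values, start, length, resolution):
    """Look up the representative value for every timestamp in the range."""
    rows = []
    for i in range(length):
        ts = start + i * resolution
        rows.append((ts, values[(ts.month, ts.weekday(), ts.hour)]))
    return rows


mapped = map_to_datetime(source, datetime(2020, 1, 1), 8784, timedelta(hours=1))
print(len(source), len(mapped))  # 2016 8784
```

The 2,016 source rows (12 months × 7 days × 24 hours) expand to 8,784 destination rows because each representative week is reused for every matching week of its month.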
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,36 @@ | ||
# Chronify | ||
|
||
This package implements validation, mapping, and storage of time series data in support of | ||
Python-based modeling packages. | ||
|
||
## Features | ||
- Stores time series data in any database supported by SQLAlchemy. | ||
- Supports data ingestion in a variety of file formats and configurations. | ||
- Supports efficient retrieval of time series through SQL queries. | ||
- Validates consistency of timestamps and resolution. | ||
- Provides mappings between different time configurations. | ||
|
||
```{eval-rst} | ||
.. toctree:: | ||
:maxdepth: 2 | ||
:caption: Contents: | ||
:hidden: | ||
how_tos/index | ||
tutorials/index | ||
reference/index | ||
explanation/index | ||
``` | ||
|
||
## How to use this guide | ||
- Refer to [How Tos](#how-tos-page) for step-by-step instructions for creating a store and ingesting data. | ||
- Refer to [Tutorials](#tutorials-page) for examples of ingesting different types of data and mapping | ||
between time configurations. | ||
- Refer to [Reference](#reference-page) for API reference material. | ||
- Refer to [Explanation](#explanation-page) for descriptions and behaviors of the time series store. | ||
|
||
# Indices and tables | ||
|
||
- {ref}`genindex` | ||
- {ref}`modindex` | ||
- {ref}`search` |