An extension of pandas for efficient representation of nested associated datasets.
Nested-Pandas extends the pandas package with tooling and support for nested dataframes packed into the values of top-level dataframe columns. PyArrow is used internally to improve scalability and performance.
Nested-Pandas is motivated by time-domain astronomy use cases, which typically
involve two levels of information: data about astronomical objects, and an
associated set of N measurements for each of those objects. Nested-Pandas offers
a performant and memory-efficient package for working with these types of datasets.
Core advantages include:
- hierarchical column access
- efficient packing of nested information into inputs to custom user functions
- avoiding costly groupby operations
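
To make the two-level layout concrete, here is a minimal sketch using plain pandas only. It does not use the Nested-Pandas API, and the table names, column names, and the `mean_flux` helper are hypothetical, chosen for illustration. It shows the general idea of packing per-object measurements into the cells of a top-level table so that the grouping cost is paid once, after which a user function can be applied per object without another groupby.

```python
import pandas as pd

# Hypothetical top-level table: one row per astronomical object.
objects = pd.DataFrame(
    {"ra": [10.0, 20.0], "dec": [-5.0, 15.0]},
    index=pd.Index([0, 1], name="object_id"),
)

# Hypothetical flat table: N measurements per object, keyed by object_id.
measurements = pd.DataFrame(
    {
        "object_id": [0, 0, 0, 1, 1],
        "time": [1.0, 2.0, 3.0, 1.5, 2.5],
        "flux": [10.0, 12.0, 11.0, 3.0, 5.0],
    }
)

# Pack each object's measurements into list-valued cells.
# The groupby happens once, at packing time.
nested = measurements.groupby("object_id").agg(list)

# Attach the packed columns to the object table (aligned on object_id).
packed = objects.join(nested)


def mean_flux(flux):
    """Hypothetical user function: mean of one object's flux values."""
    return sum(flux) / len(flux)


# Apply the user function per object; no further groupby is needed.
packed["mean_flux"] = packed["flux"].apply(mean_flux)

print(packed[["ra", "dec", "mean_flux"]])
```

Storing Python lists in object-dtype columns is only a stand-in here; it illustrates the layout, not the memory or speed characteristics of a PyArrow-backed representation.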
This is a LINCC Frameworks project - find more information about LINCC Frameworks here.
This project is supported by Schmidt Sciences.