Install

CAUTION: PROTOTYPE! NO PRODUCTION USE, not even development.

Merdb is a data processing library that

is a relational api to query data (like SQL but in Python)
has Unix like pipes to compose operators using the | syntax
scales to multi core or a cluster(via Modin)
processes data too big to fit into memory(via Modin)
support interactive and optimized processing(optimizations in roadmap)

Install

pip install merdb

Example

import pandas as pd
from merdb.interactive import *

def is_senior(row) -> bool:
    return row['age'] > 35


def double_age(row) -> int:
    return row["age"] * 2


cols = ["name", "age"]
people_df = pd.DataFrame([
    ["Raj", 35],
    ["Sona", 20],
    ["Abby", 70],
    ["Abba", 90],
], columns=cols)

# One can specify functions without any source data like quadruple age
# map is a merdb function
quadruple_age = map(double_age, "age") | map(double_age, "age")

result = (t(people_df) # convert people_df to a merdb table
          | where(is_senior)
          | order_by("name", "asc")
          | quadruple_age # Unix like pipe syntax making it easy to refactor out intermediate processing
          | select("age")
          | rename({"age": "new_age"})
          )

# Convert to Pandas Dataframe and print
print(result.df())

# Output
   new_age
0      360
1      280

Name		Name	Last commit message	Last commit date
Latest commit History 8 Commits
src/merdb		src/merdb
tests		tests
.gitignore		.gitignore
README.md		README.md
pdm.lock		pdm.lock
pyproject.toml		pyproject.toml

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Install

Example

About

Releases

Packages

Languages

RAbraham/merdb

Folders and files

Latest commit

History

Repository files navigation

Install

Example

About

Resources

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages