This package allows you to read/write pandas dataframes in MongoDB in the simplest way possible.
- Free software: MIT license
Install pdmongo:
pip install pdmongo
Write a pandas DataFrame to a MongoDB collection:
    import pandas as pd
    import pdmongo as pdm

    df = pd.DataFrame({'A': [1, 2], 'B': [3, 4]})
    df.to_mongo("MyCollection", "mongodb://localhost:27017/mydb")
Read a MongoDB collection into a pandas DataFrame:
    import pdmongo as pdm

    df = pdm.read_mongo("MyCollection", [], "mongodb://localhost:27017/mydb")
    print(df)
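Both calls above take a MongoDB connection URI as their last argument. When the server requires credentials, the username and password need to be percent-encoded before they go into the URI. The sketch below shows one way to assemble such a string with the standard library only. The helper name mongo_uri and its parameters are hypothetical and are not part of pdmongo.

```python
from urllib.parse import quote_plus


def mongo_uri(db, host="localhost", port=27017, user=None, password=None):
    """Build a mongodb:// URI (hypothetical helper, not a pdmongo function)."""
    auth = ""
    if user is not None and password is not None:
        # Percent-encode credentials so characters such as '@' or ':'
        # cannot be confused with the URI's own separators.
        auth = f"{quote_plus(user)}:{quote_plus(password)}@"
    return f"mongodb://{auth}{host}:{port}/{db}"


uri = mongo_uri("mydb")
print(uri)
```

The resulting string can be passed wherever the examples in this README use "mongodb://localhost:27017/mydb".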
You can use an aggregation query to filter or transform data in MongoDB before loading it into a DataFrame. This delegates the expensive work to MongoDB instead of doing it in pandas.
Reading a collection from MongoDB into a pandas DataFrame by using an aggregation query:
    import pdmongo as pdm
    import pandas as pd

    # First generate some data and write it to MongoDB
    df = pd.DataFrame({'A': [1, 2], 'B': [3, 4]})
    df.to_mongo('MyCollection', "mongodb://localhost:27017/mydb")

    # Filter with an aggregation query and parse the results into a DataFrame
    query = [{"$match": {'A': 1}}]
    df = pdm.read_mongo("MyCollection", query, "mongodb://localhost:27017/mydb")
    print(df)  # Only rows where A == 1 are returned
The query accepts the same pipeline stages as the aggregate method of the pymongo package.
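Because the query is an ordinary list of stage dictionaries, it can be built in plain Python before it is handed to read_mongo. The following is a minimal sketch, assuming a collection with numeric fields A and B as in the example above. The helpers build_pipeline and read_filtered are hypothetical names used for illustration and are not part of pdmongo; $match, $gt and $project are standard MongoDB aggregation operators.

```python
def build_pipeline(min_a):
    """Return an aggregation pipeline keeping documents with A > min_a.

    build_pipeline is a hypothetical helper, not part of pdmongo.
    """
    return [
        # $match filters documents inside MongoDB, before any data is transferred
        {"$match": {"A": {"$gt": min_a}}},
        # $project keeps only fields A and B and drops MongoDB's _id field
        {"$project": {"_id": 0, "A": 1, "B": 1}},
    ]


def read_filtered(min_a, uri="mongodb://localhost:27017/mydb"):
    """Fetch the filtered rows; needs pdmongo and a running MongoDB server."""
    import pdmongo as pdm  # imported lazily so build_pipeline works without it

    return pdm.read_mongo("MyCollection", build_pipeline(min_a), uri)


pipeline = build_pipeline(1)
print(pipeline)
```

Calling read_filtered(1) against a live server would then return only the documents whose A value is greater than 1.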
You can write a MongoDB collection to a PostgreSQL table:
    import pandas as pd
    import pdmongo as pdm
    from sqlalchemy import create_engine

    # Generate some data and write it to MongoDB
    df = pd.DataFrame({'A': [1, 2, 3]})
    df.to_mongo("MyCollection", "mongodb://localhost:27017/mydb")

    # Read the data back from MongoDB and write it to PostgreSQL
    new_df = pdm.read_mongo("MyCollection", [], "mongodb://localhost:27017/mydb")
    # SQLAlchemy 1.4+ only accepts the "postgresql://" scheme, not "postgres://"
    engine = create_engine('postgresql://postgres:postgres@localhost:5432', echo=False)
    new_df[["A"]].to_sql("APostgresTable", engine)
You can plot a collection retrieved from MongoDB:
    import numpy as np
    import pandas as pd
    import pdmongo as pdm
    import matplotlib.pyplot as plt

    # Generate data and write it to MongoDB
    df = pd.DataFrame({'Value': np.random.randn(1000)})
    df.to_mongo('TimeSeries', 'mongodb://localhost:27017/mydb')

    # Read the collection from MongoDB and plot the data
    new_df = pdm.read_mongo("TimeSeries", [], "mongodb://localhost:27017/mydb")
    new_df.plot()
    plt.show()
pip install pdmongo
You can also install the in-development version with:
pip install https://github.com/pakallis/python-pandas-mongo/archive/master.zip
You can find the documentation at:
https://python-pandas-mongo.readthedocs.io/
To run all the tests, run:
tox
Note: to combine the coverage data from all the tox environments, run:
Windows:
    set PYTEST_ADDOPTS=--cov-append
    tox
Other:
    PYTEST_ADDOPTS=--cov-append tox