Describe columns in a pandas dataframe for data quality testing

rogelj/DescribeCol

DescribeCol

License: GPL v3

DescribeCol is a function that facilitates data quality testing. It takes one column of a pandas DataFrame and, based on the type of the data in that column, provides information such as:

  • data description
  • data visualisation
  • data examples
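
As a rough illustration of what "based on the type of the data" can mean, here is a minimal sketch. It is not DescribeCol's actual code: the function name `describe_column_sketch`, its `n_examples` parameter, and the returned dictionary layout are all assumptions made for this example. It only names a suitable plot type and does not draw anything.

```python
import pandas as pd
from pandas.api.types import is_numeric_dtype


def describe_column_sketch(df: pd.DataFrame, n: int, n_examples: int = 3) -> dict:
    """Hypothetical type-aware column summary (not DescribeCol's implementation)."""
    col = df.iloc[:, n]  # select the column by position
    summary = {
        "name": df.columns[n],
        "dtype": str(col.dtype),
        "missing": int(col.isna().sum()),                    # count of missing values
        "examples": col.dropna().head(n_examples).tolist(),  # first few non-missing values
    }
    if is_numeric_dtype(col):
        # Numeric data: summary statistics; a histogram would be a natural plot
        summary["kind"] = "numeric"
        summary["description"] = col.describe().to_dict()
        summary["plot"] = "histogram"
    else:
        # Non-numeric data: frequency of each value; a bar chart would fit
        summary["kind"] = "categorical"
        summary["description"] = col.value_counts().to_dict()
        summary["plot"] = "bar chart"
    return summary


df = pd.DataFrame({"one": [1.0, 2.0, 3.0, 4.0, 5.0],
                   "two": ["red", "green", "green", "red", "red"]})
numeric_summary = describe_column_sketch(df, 0)
text_summary = describe_column_sketch(df, 1)
```

In this sketch the numeric column gets summary statistics and the text column gets value counts. The real function may choose different statistics and plots.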

USAGE

describecol(df, n)

  • df - pandas DataFrame
  • n - positional index of the column to be described (starting at 0, as in the example below)

Example:

import pandas as pd
import describecol as dc

# Small example frame: one numeric column and one text column
d = {'one': [1., 2., 3., 4., 5.],
     'two': ['red', 'green', 'green', 'red', 'red']}
df = pd.DataFrame(d)

# Describe every column in turn, addressing each by its position
for n in range(len(df.columns)):
    dc.describecol(df, n)
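
Because `describecol` takes a positional index, describing one column by name first requires looking up its position. A minimal sketch using pandas' own `Index.get_loc`, assuming the column labels are unique; the `describecol` call is left commented out so the snippet runs without the package installed:

```python
import pandas as pd

df = pd.DataFrame({"one": [1.0, 2.0, 3.0, 4.0, 5.0],
                   "two": ["red", "green", "green", "red", "red"]})

# Index.get_loc returns the integer position of a unique column label
n = df.columns.get_loc("two")

# With DescribeCol installed, the position could then be passed on:
# import describecol as dc
# dc.describecol(df, n)
```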
   
