TabFairGAN

This repository contains the implementation of TabFairGAN: Fair Tabular Data Generation with Generative Adversarial Networks. TabFairGAN is a synthetic tabular data generator that can produce synthetic data with or without a fairness constraint. The model uses a Wasserstein Generative Adversarial Network to produce high-quality synthetic data.

Installation

You can install TabFairGAN either via pip from PyPI or directly from this repository.

Option 1: Install from PyPI

To install the latest version of TabFairGAN from PyPI, run:

pip install tabfairgan

Option 2: Install from source

Alternatively, you can install the package directly from the source:

git clone https://github.com/amirarsalan90/TabFairGAN.git
cd TabFairGAN
pip install .

Usage

TabFairGAN is used programmatically from Python. You can generate synthetic data with or without fairness constraints. The package provides a modular interface built around the TFG class.

Basic Usage

Without Fairness Constraints: If you do not need fairness constraints, omit the fairness_config parameter.
With Fairness Constraints: To enforce fairness constraints, pass a fairness_config dictionary with the parameters described below.

Example 1: Without Fairness Constraints

import pandas as pd
from tabfairgan import TFG

# Load your dataset
df = pd.read_csv("adult/adult.csv")

# Initialize TabFairGAN without fairness constraints
tfg = TFG(df, epochs=200, batch_size=256, device='cuda:0')

# Train the model
tfg.train()

# Generate synthetic data
fake_df = tfg.generate_fake_df(num_rows=32561)

In this case, the model will focus solely on generating high-quality synthetic data without considering fairness.
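
As a quick sanity check on the generated data, you can compare how often each category appears in a real column and in the matching synthetic column. The following is a minimal sketch and not part of the tabfairgan package: the helper name category_frequency_gap is hypothetical, and it only assumes that each column can be iterated over (a pandas Series or a plain list both work).

```python
from collections import Counter
from typing import Dict, Hashable, Iterable


def category_frequency_gap(
    real_values: Iterable[Hashable], fake_values: Iterable[Hashable]
) -> Dict[Hashable, float]:
    """Return, per category, the synthetic proportion minus the real proportion.

    Hypothetical helper for illustration; it is not provided by tabfairgan.
    """
    real_counts = Counter(real_values)
    fake_counts = Counter(fake_values)
    real_total = sum(real_counts.values())
    fake_total = sum(fake_counts.values())
    if real_total == 0 or fake_total == 0:
        raise ValueError("both columns must contain at least one value")
    # Include categories that appear in only one of the two columns.
    categories = set(real_counts) | set(fake_counts)
    return {
        c: fake_counts[c] / fake_total - real_counts[c] / real_total
        for c in categories
    }
```

With the objects from the example above, a call such as category_frequency_gap(df["sex"], fake_df["sex"]) returns values close to zero when the synthetic column reproduces the real category proportions.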

Example 2: With Fairness Constraints

To generate fair synthetic data, pass a fairness_config dictionary containing the following keys:

fair_epochs: Number of fair training epochs (integer).
lamda: Lambda parameter controlling the trade-off between fairness and accuracy (float).
S: Protected attribute (string, e.g., "sex").
Y: Decision label (string, e.g., "income").
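S_under: Value representing the underprivileged group for the protected attribute (string, e.g., " Female"). The value must match the data exactly, including any leading whitespace.
Y_desire: Desired value for the label (string, e.g., " >50K"). The value must match the data exactly, including any leading whitespace.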

import pandas as pd
from tabfairgan import TFG

# Load your dataset
df = pd.read_csv("adult/adult.csv")

# Define fairness configuration
fairness_config = {
    'fair_epochs': 50,
    'lamda': 0.5,
    'S': 'sex',
    'Y': 'income',
    'S_under': ' Female',
    'Y_desire': ' >50K'
}

# Initialize TabFairGAN with fairness constraints
tfg = TFG(df, epochs=200, batch_size=256, device='cuda:0', fairness_config=fairness_config)

# Train the model
tfg.train()

# Generate synthetic data
fake_df = tfg.generate_fake_df(num_rows=32561)

In this case, the model will generate synthetic data that not only preserves high quality but also enforces fairness with respect to the specified protected attribute and decision label.
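
One common way to check the effect of the fairness constraint is to compare the rate of the desired outcome between the underprivileged group and everyone else, often called the demographic parity gap. The sketch below is an illustration under that assumption, not the package's own evaluation code: the function name demographic_parity_gap is hypothetical, and the inputs can be pandas Series or plain lists.

```python
from typing import Hashable, Iterable


def demographic_parity_gap(
    s_values: Iterable[Hashable],
    y_values: Iterable[Hashable],
    s_under: Hashable,
    y_desire: Hashable,
) -> float:
    """Return P(Y = y_desire | S != s_under) - P(Y = y_desire | S = s_under).

    Hypothetical helper for illustration; it is not provided by tabfairgan.
    """
    under_total = under_hits = other_total = other_hits = 0
    for s, y in zip(s_values, y_values):
        hit = int(y == y_desire)
        if s == s_under:
            under_total += 1
            under_hits += hit
        else:
            other_total += 1
            other_hits += hit
    if under_total == 0 or other_total == 0:
        raise ValueError("both groups must contain at least one row")
    return other_hits / other_total - under_hits / under_total
```

For the configuration above, demographic_parity_gap(fake_df["sex"], fake_df["income"], " Female", " >50K") can be compared with the same call on df; a value closer to zero on the synthetic data suggests a smaller gap between the two groups.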

Important Notes:

Fairness Configuration: If you want to use fairness constraints, you must provide a dictionary containing all the required fairness parameters: fair_epochs, lamda, S, Y, S_under, and Y_desire.
Without Fairness: If no fairness_config is provided, the model will default to generating synthetic data without fairness constraints.
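
Because S_under and Y_desire must match the data exactly (note the leading space in " Female" and " >50K"), it can help to validate the dictionary before training. The following is a minimal sketch, not part of the tabfairgan package: check_fairness_config is a hypothetical helper, and it accepts either a pandas DataFrame or any mapping from column name to values.

```python
from typing import Any, Mapping

# The six keys listed in this README as required for fairness_config.
REQUIRED_KEYS = ("fair_epochs", "lamda", "S", "Y", "S_under", "Y_desire")


def check_fairness_config(config: Mapping[str, Any], data: Any) -> None:
    """Raise ValueError if `config` is incomplete or does not match `data`.

    Hypothetical helper for illustration; it is not provided by tabfairgan.
    `data` may be a pandas DataFrame or a mapping of column name to values.
    """
    missing = [key for key in REQUIRED_KEYS if key not in config]
    if missing:
        raise ValueError(f"missing fairness keys: {missing}")
    for column_key, value_key in (("S", "S_under"), ("Y", "Y_desire")):
        column = config[column_key]
        if column not in data:
            raise ValueError(f"column {column!r} not found in the data")
        # Exact match, so ' Female' and 'Female' are treated as different.
        if config[value_key] not in set(data[column]):
            raise ValueError(
                f"{value_key}={config[value_key]!r} not found in column {column!r}"
            )
```

Calling check_fairness_config(fairness_config, df) before constructing TFG surfaces a misspelled column or a value without the leading space early, rather than partway through training.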

Citing TabFairGAN

If you use TabFairGAN, please cite the paper:

@article{rajabi2022tabfairgan,
  title={Tabfairgan: Fair tabular data generation with generative adversarial networks},
  author={Rajabi, Amirarsalan and Garibay, Ozlem Ozmen},
  journal={Machine Learning and Knowledge Extraction},
  volume={4},
  number={2},
  pages={488--501},
  year={2022},
  publisher={MDPI}
}
