A collection of papers on, or relevant to, privacy-preserving LLMs

michele17284/Awesome-Privacy-Preserving-LLMs

Taxonomy of Privacy-Preserving LLMs

Awesome-Privacy-Preserving-LLMs

LLMs have taken the world by storm, showing outstanding capabilities in several NLP-related domains. They have been shown to have striking emergent capabilities, and unfortunately it has become painfully obvious that memorization of training data is one of them. This is not a problem for models dealing with public data, but it cannot be overlooked when the task at hand involves sensitive data. For this reason, and building on our research survey, we present here a curated list of papers on LLM data memorization, the privacy attacks it enables, and potential solutions, including data anonymization, Differential Privacy and Machine Unlearning.

Table of Contents

Data Extraction

Membership Inference Attacks

Model Inversion

Re-Identification from Anonymized Data

Attacks against Synthetic Data Generators

Data anonymization

Data anonymization with Differential Privacy

Pre-training with Differential Privacy

Fine-tuning with Differential Privacy

Parameter-Efficient Fine-Tuning with Differential Privacy

Reinforcement Learning with Differential Privacy

Inference with Differential Privacy

Federated Learning with Differential Privacy

Machine Unlearning

Tools and Frameworks

  • TensorFlow Privacy: Python library with optimizers for training ML models with DP.
  • PyVacy: PyTorch translation of TensorFlow Privacy.
  • OpenDP project: Collection of algorithms for generating DP statistics.
  • DiffPrivLib: Provides a wide range of DP tools for ML and data analysis.
  • Google DP: Provides a broad set of DP tools.
  • Microsoft DP: Inference-DP framework.
  • EKTELO: Flexible and extensible framework for DP data analysis.
  • PyTorch Opacus: Enables training PyTorch models with DP.
  • private-transformers: Provides a privacy engine built off Opacus and rewritten specifically to facilitate integration with the transformers library.
  • dp-transformers: Toolkit that simplifies integrating transformers training with DP.
  • Chorus: DP statistical queries through a cooperative query processing system.
  • autodp: Automates the calculation of privacy guarantees for complex algorithms and supports several standard DP mechanisms.
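
Several of the libraries above provide optimizers for training with DP. As a rough illustration of the core step such optimizers typically perform (per-example gradient clipping followed by Gaussian noise, as in DP-SGD), here is a minimal pure-Python sketch. It does not use or mirror the API of any listed library; the function names `clip` and `dp_mean_gradient` and the parameters `max_norm` and `noise_multiplier` are hypothetical names chosen for this example, and no privacy accounting is performed.

```python
import math
import random


def clip(grad, max_norm):
    """Scale a gradient vector so that its L2 norm is at most max_norm."""
    norm = math.sqrt(sum(g * g for g in grad))
    scale = min(1.0, max_norm / norm) if norm > 0 else 1.0
    return [g * scale for g in grad]


def dp_mean_gradient(per_example_grads, max_norm, noise_multiplier, rng):
    """Clip each per-example gradient, sum them, add Gaussian noise, then average.

    The noise standard deviation is noise_multiplier * max_norm, since
    max_norm bounds how much one example can contribute to the clipped sum.
    """
    n = len(per_example_grads)
    dim = len(per_example_grads[0])
    clipped = [clip(g, max_norm) for g in per_example_grads]
    summed = [sum(g[i] for g in clipped) for i in range(dim)]
    sigma = noise_multiplier * max_norm
    noisy = [s + rng.gauss(0.0, sigma) for s in summed]
    return [x / n for x in noisy]


# Illustrative data only: three per-example gradients in two dimensions.
rng = random.Random(0)
grads = [[3.0, 4.0], [0.3, 0.4], [-6.0, 8.0]]
private_grad = dp_mean_gradient(grads, max_norm=1.0, noise_multiplier=1.1, rng=rng)
```

Turning the noise level into an actual (ε, δ) guarantee requires a privacy accountant, which is what tools such as autodp or the accounting components of the training libraries are meant to handle.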
