Miscellaneous cookbooks and code made available for purposes of education, reproducibility, and transparency.

Example | Description |
---|---|
TLM-Demo-Notebook | A demo notebook showcasing various TLM applications |
TLM-PII-Detection | A demo notebook showing how to find and remove PII with TLM |
TLM-Record-Matching | A tutorial showing how Cleanlab's Trustworthy Language Model (TLM) can be used for record matching. In particular, it shows how TLM can reliably match records between two different data tables, achieving higher accuracy than existing methods. |
benchmarking_hallucination_metrics | Notebook that compares the performance of popular hallucination detection metrics on a set of hallucination benchmarks. |
fine_tuning_data_curation | Notebook showing how to use Cleanlab TLM and Cleanlab Studio to detect bad data in instruction tuning LLM datasets. |
Detecting GDPR Violations with TLM | Notebook showing how to analyze application logs with TLM to detect GDPR violations |
Customer Support AI Agent with NeMo Guardrails | Reliable customer support AI Agent with Guardrails and trustworthiness scoring |
few_shot_prompt_selection | Notebook showing how to clean a pool of few-shot examples to improve the prompt template for an OpenAI LLM. |
fine_tuning_classification | Notebook showing how to use Cleanlab Studio to improve the accuracy of fine-tuned LLMs for classification tasks. |
generate_llm_response | Notebook showing how to generate LLM responses for customer service requests using Llama 2 and OpenAI's API. |
gpt4-rag-logprobs | Notebook showing how to obtain logprobs from a GPT-4-based RAG system. |
fine_tuning_mistral_beavertails | Notebook showing how to analyze human-annotated AI-safety-related labels (such as toxicity) using Cleanlab Studio, and thus generate safer responses from LLMs. |
Evaluating_Toxicity_Datasets_Large_Language_Models | Notebook on analyzing toxicity annotations in the Jigsaw dataset using Cleanlab Studio. |
time_series_automl | Notebook showing how to model time series data in a tabular format and use AutoML with Cleanlab Studio to improve out-of-sample accuracy. |