Skip to content

PSTAT197-F23/vignette-XGBoost

Repository files navigation

vignette-XGBoost

Vignette on implementing XGBoost using credit card fraud data; created as a class project for PSTAT197A in Fall 2023.

Contributors: Mu Niu, Qifei Cui, Yixin Xue, Amber Wang, David Lin

Vignette abstract: Our vignette presents a comprehensive exploration of XGBoost, including its mathematical principles, programming language support, and practical application in classification problems. Our vignette is conducted on a simulated “Credit Card Fraud” dataset obtained on Kaggle (https://www.kaggle.com/datasets/dhanushnarayananr/credit-card-fraud/data). This dataset presents a binary classification problem, classifying transactions as either fraudulent or legitimate. With 8 variables and a total of 1,000,000 observations. Our model has an AUC score of 0.99988, indicating that the model is almost a perfect classifier. Such an unusual result is probably because the simulated dataset is designed for practice purposes.

Repository contents:

root directory
├── data
   ├── raw
   ├── processed
├── scripts
   ├── drafts
   ├── vignette-script.R
├── img
├── vignette.qmd
├── vignette.html
├── README.md 
├── vignette-xgboost.Rproj

Reference list:

  • Chen, Tianqi, and Carlos Guestrin. "Xgboost: A scalable tree boosting system." Proceedings of the 22nd acm sigkdd international conference on knowledge discovery and data mining. 2016.

  • He, Shen, et al. “An Effective Cost-Sensitive XGBoost Method for Malicious Urls Detection in Imbalanced Dataset.” IEEE Access, vol. 9, 2021, pp. 93089–93096, doi:10.1109/access.2021.3093094.

  • Narayanan R, Dhanush. “Credit Card Fraud”, https://www.kaggle.com/datasets/dhanushnarayananr/credit-card-fraud/data. Accessed 06 Dec. 2023.

  • “XGBoost Documentation.” XGBoost Documentation - Xgboost 2.0.2 Documentation, xgboost.readthedocs.io/.

About

XGBoost in R and application on fraud detection

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published