ai-mindset/xgboost_outlier_detection

XGBoost Time Series Outlier Detection

A system for detecting outliers in time series data using XGBoost regression with dynamic thresholding, optimised for handling periodic patterns.

Requirements

  • Python 3.13
  • uv (for dependency management)
  • Node.js/Deno (for synthetic data generation)

Dependencies

See pyproject.toml for the full dependency list.

Components

1. Data Generator

Run using Deno:

deno run gen_synth_data.js > synthetic_data.csv

Generates time series data with:

  • Daily observations (100 points from 2023-01-01)
  • Upward trend (+0.5/day)
  • Two step changes (+20 at day 30, +15 at day 60)
  • Cyclical patterns (10-point amplitude)
  • Random noise (±1.5 points)
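The components listed above can be combined additively. The following is a minimal Python sketch of that idea, not a port of gen_synth_data.js: the function name `generate_series`, the baseline of 50, the 7-day cycle period, the sine shape of the cycle, and the uniform noise model are all assumptions made for illustration.

```python
import math
import random
from datetime import date, timedelta


def generate_series(n_days=100, start=date(2023, 1, 1), seed=0,
                    base=50.0, period=7):
    """Return a list of (date, value) rows built from the listed components.

    base, period, seed, the sine cycle and the uniform noise are assumptions.
    """
    rng = random.Random(seed)
    rows = []
    for day in range(n_days):
        trend = 0.5 * day                                   # +0.5 per day
        step = (20 if day >= 30 else 0) + (15 if day >= 60 else 0)
        cycle = 10 * math.sin(2 * math.pi * day / period)   # amplitude 10
        noise = rng.uniform(-1.5, 1.5)                      # within +/-1.5
        value = base + trend + step + cycle + noise
        rows.append((start + timedelta(days=day), value))
    return rows


rows = generate_series()
```

Fixing the seed keeps the series reproducible, which helps when comparing detector settings against the same data.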

2. Outlier Detector

import pandas as pd

from outlier_detection import detect_outliers, plot_time_series

# Load data
df = pd.read_csv("synthetic_data.csv")

# Detect outliers (adds an "is_outlier" column)
df_with_outliers = detect_outliers(df)

# Visualize (saves to plots/time_series_outliers_{sheet_name}.html)
fig = plot_time_series(
    df_with_outliers,
    "Time Series Analysis",
    "dataset_name",
    df_with_outliers["is_outlier"]
)
fig.show()

Key Features:

  • Temporal feature engineering (day/month/year patterns)
  • Dynamic thresholding for periodic spikes
  • Interactive Plotly visualizations
  • Excel and CSV support
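To make the first two features concrete, here is a rough standard-library sketch of what calendar features and a data-driven threshold could look like. It is not the code in outlier_detection: `temporal_features`, `flag_outliers`, the 14-point trailing window, and the median-plus-k-times-MAD rule are hypothetical choices, and the residuals (actual value minus model prediction) are assumed to come from a regression model fitted elsewhere.

```python
import statistics
from datetime import date


def temporal_features(d):
    """Calendar features for one date (hypothetical feature set)."""
    return {
        "day_of_week": d.weekday(),  # Monday = 0, Sunday = 6
        "day_of_month": d.day,
        "month": d.month,
        "year": d.year,
    }


def flag_outliers(residuals, window=14, k=3.0):
    """Flag residuals that exceed a threshold derived from recent history.

    For each point, the threshold is median + k * MAD of the absolute
    residuals in the trailing window, so it adapts as the spread changes.
    """
    flags = []
    for i, r in enumerate(residuals):
        history = [abs(x) for x in residuals[max(0, i - window):i]]
        if len(history) < 3:
            flags.append(False)  # too little history to set a threshold
            continue
        med = statistics.median(history)
        mad = statistics.median([abs(h - med) for h in history])
        threshold = med + k * max(mad, 1e-9)  # guard against a zero MAD
        flags.append(abs(r) > threshold)
    return flags


features = temporal_features(date(2023, 1, 1))
flags = flag_outliers([1.0, -1.2, 0.8, 1.1, -0.9, 12.0, 1.0])
```

A trailing window is used so that a spike does not influence its own threshold; a larger `k` makes the rule more tolerant of regular periodic peaks.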
