SynthPred.jl is a Julia package for synthetic data analysis, advanced imputation (ARIMA, RNN), AutoML, and ensemble modeling.
- 🔍 Descriptive statistics and missing data reporting
- 🧼 Simple and advanced imputation:
  - Mean, median, mode
  - Forward/backward fill
  - Gaussian distribution sampling
  - Time series-based: ARIMA
  - Sequence learning-based: RNN (Flux.jl)
- 🤖 AutoML for classification (MLJ.jl-based)
- ⚖️ Blending top-performing models via ensembling
- 📊 Predictions on new data
- 📑 JSON/CSV imputation reports
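
To make the simple imputation strategies listed above concrete, here is a minimal sketch in plain Julia. It is not SynthPred's implementation: `impute_mean` and `forward_fill` are hypothetical helper names used only for illustration, and the sketch assumes a numeric column stored as a vector that may contain `missing`.

```julia
using Statistics  # standard library, provides `mean`

# Hypothetical helper: replace each `missing` with the mean of the observed values.
function impute_mean(v::AbstractVector)
    observed = collect(skipmissing(v))
    isempty(observed) && return collect(v)  # nothing to estimate the mean from
    fill_value = mean(observed)
    return [ismissing(x) ? fill_value : x for x in v]
end

# Hypothetical helper: carry the last observed value forward.
# Leading `missing` entries have no predecessor, so they stay missing.
function forward_fill(v::AbstractVector)
    out = collect(v)  # copy, keeps the Union{Missing, T} element type
    for i in 2:length(out)
        if ismissing(out[i])
            out[i] = out[i - 1]
        end
    end
    return out
end

col = [1.0, missing, 3.0, missing, missing, 6.0]
mean_filled = impute_mean(col)
forward_filled = forward_fill(col)
```

Mean imputation keeps the column average unchanged but flattens its variance, while forward fill preserves local structure in ordered data. That trade-off is one reason a package might also offer model-based strategies such as ARIMA or RNN imputation.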
using Pkg
Pkg.add(url="https://github.com/TyMill/SynthPred.jl")
using SynthPred
using CSV, DataFrames
# Load training data
df = CSV.read("data/example.csv", DataFrame)
# Explore data
SynthPred.Exploration.describe_data(df)
# Impute missing values (e.g. RNN strategy)
df_clean, report = SynthPred.Imputer.impute_advanced(df, "rnn", threshold=0.1)
SynthPred.Imputer.save_imputation_report(report, "reports/imputation_report.json")
# Run AutoML pipeline
top_models, scores = SynthPred.AutoML.run_automl(df_clean, :target)
X = select(df_clean, Not(:target))
y = df_clean[:, :target]
ensemble = SynthPred.AutoML.blend_top_models(top_models, X, y)
# Predict on new data
Xnew = CSV.read("data/new_data.csv", DataFrame)
preds = SynthPred.AutoML.predict_ensemble(ensemble, Xnew)
println(preds)
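
For intuition about the blending step, one common way to combine classifiers is a majority vote over their predicted labels. The sketch below illustrates that idea only. How `blend_top_models` actually combines models (voting, averaging, or stacking) is not stated here, and `majority_vote` is a hypothetical name.

```julia
# Hypothetical helper: blend per-model label predictions by majority vote.
# `predictions` holds one vector of predicted labels per model, all of equal length.
function majority_vote(predictions::AbstractVector{<:AbstractVector})
    n = length(first(predictions))
    label_type = eltype(first(predictions))
    blended = Vector{label_type}(undef, n)
    for i in 1:n
        counts = Dict{label_type,Int}()
        for p in predictions
            counts[p[i]] = get(counts, p[i], 0) + 1
        end
        # findmax on a Dict returns (largest count, its label); ties are broken arbitrarily
        _, label = findmax(counts)
        blended[i] = label
    end
    return blended
end

# Three made-up models predicting a binary label for four rows
m1 = ["yes", "no", "yes", "no"]
m2 = ["yes", "yes", "no", "no"]
m3 = ["no", "yes", "yes", "no"]
blended = majority_vote([m1, m2, m3])
```

With an odd number of models and two classes, every row has a strict majority, so the arbitrary tie-breaking never comes into play in this example.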
Full documentation is available at: https://your-username.github.io/SynthPred.jl
SynthPred/
├── Project.toml
├── src/
│ ├── SynthPred.jl
│ ├── Exploration.jl
│ ├── Imputer.jl
│ └── AutoML.jl
├── data/
│ ├── example.csv
│ └── new_data.csv
├── reports/
│ └── imputation_report.json
├── docs/
│ └── src/index.md
├── test/
│ └── runtests.jl
└── main.jl
- Core modules: Exploration, Imputer, AutoML
- ARIMA and RNN-based imputations
- AutoML + model blending with MLJ.jl
- Imputation reports (CSV/JSON)
- Documentation (Documenter.jl + GitHub Pages)
- Exporting trained models (JLD2, BSON)
- Web GUI with Pluto.jl or Dash.jl
- Integration with JuliaHub and Zenodo DOI
Pull requests are welcome! For major changes, please open an issue first to discuss your proposal.
MIT License © 2025 Tymoteusz Miller
Built with ❤️ in Julia for real-world ML and scientific discovery.