Primed and Fitted models
ekoutanov committed Nov 9, 2023
1 parent 2a64315 commit 168ba23
Showing 19 changed files with 347 additions and 205 deletions.
22 changes: 11 additions & 11 deletions README.md
@@ -22,7 +22,7 @@ use stanza::renderer::Renderer;
use brumby::display::DisplaySlice;
use brumby::file::ReadJsonFile;
use brumby::market::{Market, OverroundMethod};
use brumby::model::{Calibrator, Config, WinPlace};
use brumby::model::{Fitter, FitterConfig, WinPlace, Model};
use brumby::model::cf::Coefficients;
use brumby::model::fit::FitOptions;
use brumby::print;
@@ -53,13 +53,13 @@ fn main() -> Result<(), Box<dyn Error>> {
28.0,
];

// load coefficients from a file and create a calibrator for model fitting
// load coefficients from a file and create a fitter
let coefficients = Coefficients::read_json_file(PathBuf::from("config/thoroughbred.cf.json"))?;
let config = Config {
let config = FitterConfig {
coefficients,
fit_options: FitOptions::fast() // use the default presets in production; fast presets are used for testing
};
let calibrator = Calibrator::try_from(config)?;
let fitter = Fitter::try_from(config)?;

// fit Win and Place probabilities from the supplied prices, undoing the overrounds
let wp_markets = WinPlace {
@@ -72,10 +72,10 @@ fn main() -> Result<(), Box<dyn Error>> {
let overrounds = wp_markets.extrapolate_overrounds()?;

// fit a model using the Win/Place prices and extrapolated overrounds
let model = calibrator.fit(wp_markets, &overrounds)?.value;
let model = fitter.fit(&wp_markets, &overrounds)?.value;

// nicely format the derived price matrix
let table = print::tabulate_derived_prices(&model.top_n.as_price_matrix());
let table = print::tabulate_derived_prices(&model.prices().as_price_matrix());
println!("\n{}", Console::default().render(&table));

// simulate a same-race multi for a chosen selection vector using the previously fitted model
@@ -111,17 +111,17 @@ Note, when all rows are identical, the biased model behaves identically to the n

Take, for example, a field of 6 with win probabilities _P_ = (0.05, 0.1, 0.25, 0.1, 0.35, 0.15). For a two-place podium, _W_ might resemble the following:

_W_<sub>1,_</sub> = (0.05, 0.1, 0.25, 0.1, 0.35, 0.15) = _P_
_W_<sub>1,_</sub> = (0.05, 0.1, 0.25, 0.1, 0.35, 0.15) = _P_;

_W_<sub>2,_</sub> = (0.09, 0.13, 0.22, 0.13, 0.28, 0.15)
_W_<sub>2,_</sub> = (0.09, 0.13, 0.22, 0.13, 0.28, 0.15).

In other words, the high-probability runners have had their relative ranking probabilities penalised, while low-probability runners were instead boosted. This reflects our updated assumption that low(/high)-probability runners are under(/over)estimated to place by a naive model.
In other words, the high-probability runners have had their relative ranking probabilities suppressed, while low-probability runners were instead boosted. This reflects our updated assumption that low(/high)-probability runners are under(/over)estimated to place by a naive model.
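To make the shift between the two rows concrete, the following sketch divides each entry of _W_<sub>2,_</sub> by the corresponding entry of _P_, using the example figures above. It is not part of Brumby; `relative_shift` is a hypothetical helper written only for this illustration. A ratio above 1 marks a boosted runner and a ratio below 1 a suppressed one.

```rust
/// Hypothetical helper: ratio of each runner's rank-2 weight to its win probability.
fn relative_shift(win: &[f64], rank: &[f64]) -> Vec<f64> {
    win.iter().zip(rank).map(|(&p, &w)| w / p).collect()
}

fn main() {
    // Example rows taken from the text above.
    let p = [0.05, 0.1, 0.25, 0.1, 0.35, 0.15];
    let w2 = [0.09, 0.13, 0.22, 0.13, 0.28, 0.15];

    for (runner, shift) in relative_shift(&p, &w2).iter().enumerate() {
        let effect = if *shift > 1.0 {
            "boosted"
        } else if *shift < 1.0 {
            "suppressed"
        } else {
            "unchanged"
        };
        println!("runner {}: {:.2} ({})", runner + 1, shift, effect);
    }
}
```

For these figures the ratios come to 1.8, 1.3, 0.88, 1.3, 0.8 and 1.0: the two favourites (0.25 and 0.35) are scaled down, the outsiders are scaled up, and both rows still sum to 1.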

A pertinent question is how to assign the relative probabilities in rows 2–_N_, given _P_ and possibly other data. An intuitive approach is to fit the probabilities based on historical data. Brumby uses a linear regression model with a configurable set of regressors. For example, a third-degree polynomial comprising runner prices and the field size. (Which we found to be a reasonably effective predictor.) Distinct models may be used for different race types, competitor classes, track conditions, and so forth. The fitting process is performed offline; its output is a set of regression factor and coefficient pairs.
A pertinent question is how to assign the relative probabilities in rows 2–_N_, given _P_ and possibly other data. An intuitive approach is to fit the probabilities based on historical data. Brumby uses a linear regression model with a configurable set of regressors. For example, a third-degree polynomial comprising runner prices and the field size. (Which we found to be a reasonably effective predictor.) Distinct models may be used for different race types, competitor classes, track conditions, and so forth. The fitting process is performed offline; its output is a set of regression factors and corresponding coefficients.
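As a rough illustration of what a set of regression factors and coefficients amounts to at prediction time, the sketch below evaluates a linear model as a sum of coefficient × regressor terms. The `Regressor` enum, the use of the win probability as the price-derived input, and the coefficient values are all assumptions made for this example; they are not Brumby's actual regressors or fitted coefficients.

```rust
/// Hypothetical regressors: an intercept, powers of the win probability and the field size.
#[derive(Clone, Copy, Debug)]
enum Regressor {
    Intercept,
    WinProbPower(i32),
    FieldSize,
}

impl Regressor {
    /// Resolves the regressor to a number for one runner.
    fn value(self, win_prob: f64, field_size: usize) -> f64 {
        match self {
            Regressor::Intercept => 1.0,
            Regressor::WinProbPower(exponent) => win_prob.powi(exponent),
            Regressor::FieldSize => field_size as f64,
        }
    }
}

/// Linear prediction: the sum of coefficient * regressor value over all terms.
fn predict(terms: &[(Regressor, f64)], win_prob: f64, field_size: usize) -> f64 {
    terms
        .iter()
        .map(|&(regressor, coefficient)| coefficient * regressor.value(win_prob, field_size))
        .sum()
}

fn main() {
    // Made-up coefficients for a third-degree polynomial plus a field-size term.
    let terms = [
        (Regressor::Intercept, 0.02),
        (Regressor::WinProbPower(1), 0.9),
        (Regressor::WinProbPower(2), -0.4),
        (Regressor::WinProbPower(3), 0.1),
        (Regressor::FieldSize, 0.001),
    ];
    let predicted = predict(&terms, 0.25, 6);
    println!("predicted rank-2 weight: {predicted:.4}");
}
```

Because the model is linear in its coefficients, the polynomial terms are simply extra regressors; swapping in a different coefficient set per race type would not change the prediction code.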

The offline-fitted model does not cater to specific biases present in individual races and, crucially, it does not protect the operator of the model against _internal arbitrage_ opportunities. Let the Place market be paying _X_ places, where _X_ is typically 2 or 3. When deriving the Top-1.._N_ price matrix solely from Win prices, it is possible that the Top-_X_ prices differ from the Place prices when the latter are sourced from an alternate model. This creates an internal price incoherency, where a semi-rational bettor will select the higher of the two prices, all other terms being equal. In the extreme case, the price difference may expose value in the bet and even enable rational bettors to take a risk-free position across a pair of incoherent markets.
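The incoherency described above can be detected with a simple comparison. The sketch below flags runners whose derived Top-_X_ price departs from the offered Place price by more than a relative tolerance. The function name, the sample prices and the 1% tolerance are hypothetical and chosen only for illustration.

```rust
/// Returns the indexes of runners whose Top-X price differs from the Place
/// price by more than `tolerance`, measured relative to the Place price.
fn incoherent_runners(top_x: &[f64], place: &[f64], tolerance: f64) -> Vec<usize> {
    top_x
        .iter()
        .zip(place)
        .enumerate()
        .filter_map(|(runner, (&t, &p))| (((t - p) / p).abs() > tolerance).then_some(runner))
        .collect()
}

fn main() {
    // Illustrative prices only; not taken from any real market.
    let top_x = [1.50, 2.10, 3.40];
    let place = [1.50, 2.00, 3.41];
    let flagged = incoherent_runners(&top_x, &place, 0.01);
    println!("incoherent runners (0-based): {flagged:?}");
}
```

With these sample prices only the second runner is flagged, since 2.10 against 2.00 is a 5% departure while the other two are within 1%.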

This problem is ideally solved by unifying the models so that the Place prices are taken directly from the Top-1.._N_ matrix. Often this is not viable, particularly when the operator sources its headline Win and Place markets from a commodity pricing supplier and/or applies manual price overrides on select runners. As such, Brumby allows the fitting of the Top-_X_ prices to the offered Place prices. The fitting is entirely online, typically following a price update, iterating while adjusting _W_<sub>_X_, _</sub> until the Top-_X_ prices match the Place prices within some margin of error.
This problem is ideally solved by unifying the models so that the Place prices are taken directly from the Top-1.._N_ matrix. Often this is not viable, particularly when the operator sources its Win and Place markets from a commodity pricing supplier and/or trades them manually. As such, Brumby allows the fitting of the Top-_X_ prices to the offered Place prices. The fitting is entirely online, typically following a price update, iterating while adjusting _W_<sub>_X_, _</sub> until the Top-_X_ prices match the Place prices within some acceptable margin of error.

Fitting of the Top-_X_ market to the Place market is a _closed loop_ process, using the fitted residuals to moderate subsequent adjustments and eventually terminate the fitting process. In each iteration, for every rank _i_ and every runner _j_, a price is fitted and compared with the sample price. The difference is used to scale the probability at _W_<sub>_i_,_j_</sub>. For example, let the fitted price _f_ be 2.34 and the sample price _s_ be 2.41 for runner 5 in rank 3. The adjustment factor is _s_ / _f_ = 1.03. _W′_<sub>3,5</sub> = _W_<sub>3,5</sub> × 1.03.
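A single adjustment step of this loop can be sketched as follows. This is a simplified illustration, not Brumby's implementation: `adjustment_factor` and `adjust_rank` are hypothetical names, the weights are made up, and deriving the fitted prices (which requires pricing the model) is left out entirely.

```rust
/// Adjustment factor as described in the text: sample price over fitted price.
fn adjustment_factor(fitted_price: f64, sample_price: f64) -> f64 {
    sample_price / fitted_price
}

/// Scales every weight in one rank by its runner's adjustment factor.
fn adjust_rank(weights: &mut [f64], fitted: &[f64], sample: &[f64]) {
    for ((weight, &f), &s) in weights.iter_mut().zip(fitted).zip(sample) {
        *weight *= adjustment_factor(f, s);
    }
}

fn main() {
    // The worked example from the text: f = 2.34, s = 2.41.
    let factor = adjustment_factor(2.34, 2.41);
    println!("adjustment factor: {factor:.2}");

    // Hypothetical weights for two runners in one rank; the second runner's
    // fitted price already matches its sample price, so its weight is unchanged.
    let mut rank_weights = [0.12, 0.20];
    adjust_rank(&mut rank_weights, &[2.34, 5.00], &[2.41, 5.00]);
    println!("adjusted weights: {rank_weights:?}");
}
```

In a full loop this step would be repeated, re-deriving the fitted prices after each pass, until the residual between fitted and sample prices falls within the chosen margin of error.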

2 changes: 1 addition & 1 deletion benches/cri_mc_engine.rs
@@ -27,7 +27,7 @@ fn criterion_benchmark(c: &mut Criterion) {
let mut bitmap = [true; 14];
let mut totals = [1.0; 4];
let mut engine = MonteCarloEngine::default()
.with_iterations(1_000)
.with_trials(1_000)
.with_bitmap(CaptureMut::Borrowed(&mut bitmap))
.with_totals(CaptureMut::Borrowed(&mut totals))
.with_podium(CaptureMut::Borrowed(&mut podium))
2 changes: 1 addition & 1 deletion examples/basic.rs
@@ -25,7 +25,7 @@ fn main() {

// create an MC engine for reuse
let mut engine = mc::MonteCarloEngine::default()
.with_iterations(100_000)
.with_trials(100_000)
.with_probs(Capture::Owned(
DilatedProbs::default()
.with_win_probs(Capture::Borrowed(&probs))
12 changes: 6 additions & 6 deletions examples/multi.rs
@@ -7,7 +7,7 @@ use stanza::renderer::Renderer;
use brumby::display::DisplaySlice;
use brumby::file::ReadJsonFile;
use brumby::market::{Market, OverroundMethod};
use brumby::model::{Calibrator, Config, WinPlace};
use brumby::model::{Fitter, FitterConfig, WinPlace, Model};
use brumby::model::cf::Coefficients;
use brumby::model::fit::FitOptions;
use brumby::print;
@@ -38,13 +38,13 @@ fn main() -> Result<(), Box<dyn Error>> {
28.0,
];

// load coefficients from a file and create a calibrator
// load coefficients from a file and create a fitter
let coefficients = Coefficients::read_json_file(PathBuf::from("config/thoroughbred.cf.json"))?;
let config = Config {
let config = FitterConfig {
coefficients,
fit_options: FitOptions::fast(),
};
let calibrator = Calibrator::try_from(config)?;
let fitter = Fitter::try_from(config)?;

// fit Win and Place probabilities from the supplied prices, undoing the effect of the overrounds
let wp_markets = WinPlace {
@@ -57,10 +57,10 @@ fn main() -> Result<(), Box<dyn Error>> {
let overrounds = wp_markets.extrapolate_overrounds()?;

// fit a model using the Win/Place prices and extrapolated overrounds
let model = calibrator.fit(wp_markets, &overrounds)?.value;
let model = fitter.fit(&wp_markets, &overrounds)?.value;

// nicely format the derived prices
let table = print::tabulate_derived_prices(&model.top_n.as_price_matrix());
let table = print::tabulate_derived_prices(&model.prices().as_price_matrix());
println!("\n{}", Console::default().render(&table));

// simulate a same-race multi for a chosen selection vector using the previously fitted model
4 changes: 4 additions & 0 deletions justfile
@@ -41,6 +41,10 @@ test:
cargo doc --no-deps
cargo bench --no-run --profile dev

# run clippy with pedantic checks
clippy:
cargo clippy -- -D clippy::pedantic -A clippy::must-use-candidate -A clippy::struct-excessive-bools -A clippy::single-match-else -A clippy::inline-always -A clippy::cast-possible-truncation -A clippy::cast-precision-loss -A clippy::items-after-statements

# install Rust
install-rust:
curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh -s -- -y
24 changes: 12 additions & 12 deletions src/bin/datadump.rs
@@ -101,28 +101,28 @@ fn main() -> Result<(), Box<dyn Error>> {
Market::fit(&OVERROUND_METHOD, prices, rank as f64 + 1.0)
})
.collect();
let fit_outcome = fit::fit_all(FitOptions::default(), &markets)?;
let fit_outcome = fit::fit_all(&FitOptions::default(), &markets)?;
debug!(
"individual fitting complete: stats: {:?}, probs: \n{}",
fit_outcome.stats,
fit_outcome.fitted_probs.verbose()
);

let num_runners = markets[0].probs.len();
let runners = markets[0].probs.len();
let active_runners = markets[0].probs.iter().filter(|&&prob| prob != 0.).count();
let stdev = markets[0].probs.stdev();
for runner in 0..num_runners {
for runner in 0..runners {
if markets[0].probs[runner] != 0.0 {
let mut record = Record::with_capacity(Factor::COUNT);
record.set(Factor::RaceId, race.id);
record.set(Factor::RunnerIndex, runner);
record.set(Factor::ActiveRunners, active_runners);
record.set(Factor::PlacesPaying, race.places_paying);
record.set(Factor::Stdev, stdev);
record.set(Factor::Weight0, fit_outcome.fitted_probs[(0, runner)]);
record.set(Factor::Weight1, fit_outcome.fitted_probs[(1, runner)]);
record.set(Factor::Weight2, fit_outcome.fitted_probs[(2, runner)]);
record.set(Factor::Weight3, fit_outcome.fitted_probs[(3, runner)]);
record.set(Factor::RaceId, &race.id);
record.set(Factor::RunnerIndex, &runner);
record.set(Factor::ActiveRunners, &active_runners);
record.set(Factor::PlacesPaying, &race.places_paying);
record.set(Factor::Stdev, &stdev);
record.set(Factor::Weight0, &fit_outcome.fitted_probs[(0, runner)]);
record.set(Factor::Weight1, &fit_outcome.fitted_probs[(1, runner)]);
record.set(Factor::Weight2, &fit_outcome.fitted_probs[(2, runner)]);
record.set(Factor::Weight3, &fit_outcome.fitted_probs[(3, runner)]);
debug!("{record:?}");
csv.append(record)?;
csv.flush()?;
8 changes: 4 additions & 4 deletions src/bin/evaluate.rs
@@ -18,7 +18,7 @@ use brumby::data::{EventDetailExt, PlacePriceDeparture, PredicateClosures, RaceS
use brumby::file::ReadJsonFile;
use brumby::market::{Market, OverroundMethod};
use brumby::model::cf::Coefficients;
use brumby::model::{fit, Calibrator, Config, TopN, WinPlace};
use brumby::model::{fit, Fitter, FitterConfig, TopN, WinPlace};

const OVERROUND_METHOD: OverroundMethod = OverroundMethod::Multiplicative;
const TOP_SUBSET: usize = 25;
@@ -83,7 +83,7 @@ fn main() -> Result<(), Box<dyn Error>> {
EventType::Harness => unimplemented!(),
};
debug!("loading {race_type} config from {filename}");
let config = Config {
let config = FitterConfig {
coefficients: Coefficients::read_json_file(filename)?,
fit_options: Default::default(),
};
@@ -108,7 +108,7 @@ fn main() -> Result<(), Box<dyn Error>> {
);
let departure = race_file.race.place_price_departure();
let race = race_file.race.summarise();
let calibrator = Calibrator::try_from(configs[&race.race_type].clone())?;
let calibrator = Fitter::try_from(configs[&race.race_type].clone())?;
let sample_top_n = TopN {
markets: (0..race.prices.rows())
.map(|rank| {
@@ -123,7 +123,7 @@ fn main() -> Result<(), Box<dyn Error>> {
places_paying: race.places_paying,
};
let sample_overrounds = sample_top_n.overrounds()?;
let model = calibrator.fit(sample_wp, &sample_overrounds)?.value;
let model = calibrator.fit(&sample_wp, &sample_overrounds)?.value;
let derived_prices = model.top_n.as_price_matrix();
let errors: Vec<_> = (0..derived_prices.rows())
.map(|rank| {
53 changes: 40 additions & 13 deletions src/bin/prices.rs
@@ -2,7 +2,7 @@ use std::env;
use std::error::Error;
use std::path::PathBuf;

use anyhow::bail;
use anyhow::{anyhow, bail};
use clap::Parser;
use racing_scraper::models::{EventDetail, EventType};
use stanza::renderer::console::Console;
@@ -16,8 +16,8 @@ use brumby::display::DisplaySlice;
use brumby::file::ReadJsonFile;
use brumby::market::{Market, Overround, OverroundMethod};
use brumby::model::cf::Coefficients;
use brumby::model::fit::compute_msre;
use brumby::model::{fit, Calibrator, Config, TopN, WinPlace, PODIUM};
use brumby::model::fit::{compute_msre, FitOptions};
use brumby::model::{fit, Fitter, FitterConfig, TopN, WinPlace, PODIUM, Model, Primer};
use brumby::print::{tabulate_derived_prices, tabulate_prices, tabulate_probs, tabulate_values};
use brumby::selection::Selections;

@@ -35,6 +35,10 @@ struct Args {

/// selections to price
selections: Option<Selections<'static>>,

/// model type
#[clap(short = 'm', long, value_parser = parse_model_type, default_value = "fitted")]
model: ModelType
}
impl Args {
fn validate(&self) -> anyhow::Result<()> {
@@ -47,6 +51,19 @@ impl Args {
}
}

#[derive(Debug, Clone)]
enum ModelType {
Primed,
Fitted
}
fn parse_model_type(s: &str) -> anyhow::Result<ModelType> {
match s.to_lowercase().as_str() {
"primed" => Ok(ModelType::Primed),
"fitted" => Ok(ModelType::Fitted),
_ => Err(anyhow!("unsupported model type {s}")),
}
}

#[tokio::main]
async fn main() -> Result<(), Box<dyn Error>> {
if env::var("RUST_BACKTRACE").is_err() {
@@ -83,11 +100,6 @@ async fn main() -> Result<(), Box<dyn Error>> {
})
.collect(),
};

let calibrator = Calibrator::try_from(Config {
coefficients,
fit_options: Default::default(),
})?;
let sample_wp = WinPlace {
win: sample_top_n.markets[0].clone(),
place: sample_top_n.markets[race.places_paying - 1].clone(),
@@ -129,14 +141,29 @@ async fn main() -> Result<(), Box<dyn Error>> {
);
}

let model = calibrator.fit(sample_wp, &sample_overrounds)?;
debug!("fitted {model:?}");
let model = model.value;
let fit_options = FitOptions::default();
let model: Box<dyn Model> = match args.model {
ModelType::Primed => {
let primer = Primer::try_from(coefficients)?;
let model = primer.prime(&sample_wp.win, sample_wp.places_paying, fit_options.mc_trials, &sample_overrounds)?;
debug!("fitted {model:?}");
Box::new(model.value)
}
ModelType::Fitted => {
let calibrator = Fitter::try_from(FitterConfig {
coefficients,
fit_options
})?;
let model = calibrator.fit(&sample_wp, &sample_overrounds)?;
debug!("fitted {model:?}");
Box::new(model.value)
}
};

let probs_table = tabulate_probs(&model.fit_outcome.fitted_probs);
let probs_table = tabulate_probs(model.weighted_probs());
println!("{}", Console::default().render(&probs_table));

let derived_prices = model.top_n.as_price_matrix();
let derived_prices = model.prices().as_price_matrix();
let table = tabulate_derived_prices(&derived_prices);
info!("\n{}", Console::default().render(&table));

7 changes: 2 additions & 5 deletions src/capture.rs
@@ -1,5 +1,5 @@
//! [Capture] is a minimalistic analogue of [Cow](std::borrow::Cow) that relaxes the [ToOwned] constraint while
//! supporting [?Sized](Sized) types. [CaptureMut] extends [Capture] with support for mutable references.
//! [`Capture`] is a minimalistic analogue of [`Cow`](std::borrow::Cow) that relaxes the [`ToOwned`] constraint while
//! supporting [`?Sized`](Sized) types. [`CaptureMut`] extends [`Capture`] with support for mutable references.
use std::borrow::{Borrow, BorrowMut};
use std::ops::{Deref, DerefMut};
@@ -9,9 +9,6 @@ pub enum Capture<'a, W: Borrow<B>, B: ?Sized> {
Owned(W),
Borrowed(&'a B),
}
// impl<W: Borrow<B> + Default, B: ?Sized> Capture<'_, W, B> {
// pub fn
// }

impl<'a, W: Borrow<B> + Default, B: ?Sized> Default for Capture<'a, W, B> {
fn default() -> Self {
6 changes: 3 additions & 3 deletions src/csv.rs
@@ -24,7 +24,7 @@ impl CsvWriter {
R::Item: AsRef<str>,
{
let mut first = true;
for datum in record.into_iter() {
for datum in record {
if first {
first = false;
} else {
@@ -94,8 +94,8 @@ impl Record {
Self { items }
}

pub fn set(&mut self, ordinal: impl Into<usize>, value: impl ToString) {
self.items[ordinal.into()] = Cow::Owned(value.to_string())
pub fn set(&mut self, ordinal: impl Into<usize>, value: &impl ToString) {
self.items[ordinal.into()] = Cow::Owned(value.to_string());
}

pub fn len(&self) -> usize {
2 changes: 1 addition & 1 deletion src/linear/regression.rs
@@ -227,7 +227,7 @@ impl<O: AsIndex> RegressionModel<O> {
table.push_row(Row::new(
Styles::default(),
vec![
format!("{:?}", regressor).into(),
format!("{regressor:?}").into(),
format!("{:.8}", self.predictor.coefficients[regressor_index]).into(),
format!("{:.6}", self.std_errors[regressor_index]).into(),
format!("{:.6}", self.p_values[regressor_index]).into(),