yuji3w (Contributor) commented Apr 19, 2022

The objective of the drift detector benchmarker is to create an easy-to-use framework that benchmarks data drift detection methods. The benchmarker currently uses the NYC taxi dataset.

Key features:

Created a benchmarking framework that allows easy testing of the following drift detectors created in this PR:

  1. ClassifierDriftModel
  2. ClassifierUncertaintyModel
  3. CVMModel
  4. FETModel
  5. LearnedKernelModel
  6. MMDModel
  7. ModelWrapperInterface
  8. SpotTheDiffModel
  9. TabularModel

Created a graphing utility for plotting inter-week drift at the feature level.

Created an accessor class and factory that allow easy access to DataFrame data.

Created batch loader for loading large date ranges (especially greater than 1 year).
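The PR text does not show how the batch loader is implemented. As a rough illustration of the general idea only, a long date range can be split into smaller windows that are loaded one at a time. The function name `month_batches` and the month-sized window are assumptions made for this sketch, not names or choices taken from the PR.

```python
from datetime import date
from typing import Iterator, Tuple


def month_batches(start: date, end: date) -> Iterator[Tuple[date, date]]:
    """Yield (batch_start, batch_end) pairs covering [start, end), one per calendar month.

    Hypothetical helper for illustration; not the PR's batch loader.
    """
    current = start
    while current < end:
        # Compute the first day of the following month.
        if current.month == 12:
            next_month = date(current.year + 1, 1, 1)
        else:
            next_month = date(current.year, current.month + 1, 1)
        # Clip the last window so it never runs past the requested end.
        batch_end = min(next_month, end)
        yield current, batch_end
        current = batch_end


batches = list(month_batches(date(2019, 11, 15), date(2020, 2, 1)))
for batch_start, batch_end in batches:
    print(batch_start, "->", batch_end)
```

Each window could then be loaded and processed separately, which keeps memory use bounded when the full range spans more than a year.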

1. Add F1 and accuracy scores to CSV
2. Aggregate CSVs
3. Track runtimes for each method
4. All methods now ready for use
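Items 1 and 3 above can be sketched in plain Python. This is a minimal illustration, assuming each detector returns a binary drift flag (1 = drift) per batch and that ground-truth flags are available. The names `accuracy_and_f1`, `timed`, and `always_drift` and the CSV column layout are hypothetical and do not come from the PR.

```python
import csv
import io
import time
from typing import Callable, List, Sequence, Tuple


def accuracy_and_f1(y_true: Sequence[int], y_pred: Sequence[int]) -> Tuple[float, float]:
    """Compute accuracy and F1 for binary drift flags (1 = drift, 0 = no drift)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    accuracy = correct / len(y_true)
    denom = 2 * tp + fp + fn
    f1 = 2 * tp / denom if denom else 0.0
    return accuracy, f1


def timed(fn: Callable, *args) -> Tuple[object, float]:
    """Run fn(*args) and return (result, elapsed seconds)."""
    start = time.perf_counter()
    result = fn(*args)
    return result, time.perf_counter() - start


def always_drift(batches: Sequence[int]) -> List[int]:
    """Hypothetical stand-in detector that flags every batch as drifted."""
    return [1 for _ in batches]


y_true = [1, 0, 1, 1]
y_pred, runtime_s = timed(always_drift, y_true)
accuracy, f1 = accuracy_and_f1(y_true, y_pred)

# Write one result row per method; an in-memory buffer stands in for a CSV file.
buffer = io.StringIO()
writer = csv.writer(buffer)
writer.writerow(["method", "accuracy", "f1", "runtime_s"])
writer.writerow(["always_drift", accuracy, f1, runtime_s])
print(buffer.getvalue())
```

Keeping one row per method in a shared column layout makes it straightforward to concatenate the per-run CSVs later.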
@review-notebook-app

Check out this pull request on  ReviewNB

See visual diffs & provide feedback on Jupyter Notebooks.



`tip_amount` leaks information to `tip_percent_greater_15`
df.loc[df['vendorid'] == '2', 'vendorid'] = 2
df.loc[df['vendorid'] == '2.0', 'vendorid'] = 2
# Correct string type confusion in vendorid, payment_type
df['vendorid'] = df['vendorid'].astype(float).astype(int)
Collaborator

Why astype(float) before astype(int)?

Contributor Author

I needed to massage the type conversions because int("1.0") raises a ValueError, whereas int(float("1.0")) is valid.
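The behaviour described in this reply can be checked in plain Python. The sketch below assumes the column holds a mix of strings such as "2" and "2.0", as the diff suggests. The helper name `parse_vendor_id` and the sample values are made up for illustration.

```python
def parse_vendor_id(raw: str) -> int:
    """Convert a string such as "2" or "2.0" to an int by going through float first."""
    return int(float(raw))


# int() rejects strings that contain a decimal point.
try:
    int("1.0")
    direct_ok = True
except ValueError:
    direct_ok = False

# Going through float first accepts both "2" and "2.0".
values = ["1", "2", "2.0", "1.0"]
converted = [parse_vendor_id(v) for v in values]

print(direct_ok, converted)
```

Note that int(float(...)) truncates toward zero, so this route is only safe when the fractional part is known to be zero, as it is for IDs like "2.0".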

@yuji3w yuji3w changed the title from "[WIP] Add Drift Detector Benchmarker Prototype" to "Add Drift Detector Benchmarker Prototype" Apr 28, 2022
@yuji3w yuji3w marked this pull request as draft April 28, 2022 01:51

2 participants