The Medkit-Learn(ing) Environment. An open-source library for offline sequential decision making with a focus on medicine.

vanderschaarlab/medkit-learn


Alex J. Chan, Ioana Bica, Alihan Huyuk, Daniel Jarrett, and Mihaela van der Schaar

The Medkit-Learn(ing) Environment, or Medkit, is a publicly available Python package providing simple and easy access to high-fidelity synthetic medical data.

Primarily, Medkit is a tool that supports: (1) a variety of realistic environment models, learned from actual data to reflect real medical settings, thus allowing simulation of (2) a variety of expressive and customisable policy models that represent complex human decision behaviours; and (3) disentangled environment and policy components, which are therefore independently controllable.

By fulfilling the above, Medkit seeks to make advances in decision modelling easier to validate robustly. Users can obtain batch datasets with known ground-truth policy parameterisations that simulate decision-making behaviours with varying degrees of Markovianity, bounded rationality, confounding, individual consistency and variation in practice.

Medkit is pip installable. To work with the latest version, we recommend cloning the repository, optionally creating a virtual environment, and installing it in editable mode (this automatically installs the dependencies):

git clone https://github.com/XanderJC/medkit-learn.git

cd medkit-learn

pip install -e .

Alternatively, Medkit is available on PyPI, and can be installed simply with:

pip install medkit-learn

Example usage:

import medkit as mk

synthetic_dataset = mk.batch_generate(
    domain = "Ward",
    environment = "CRN",
    policy = "LSTM",
    size = 1000,
    test_size = 200,
    max_length = 10,
    scale = True
)

static_train, observations_train, actions_train = synthetic_dataset['training']
static_test,  observations_test,  actions_test  = synthetic_dataset['testing']
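
As a quick sanity check on the returned splits, the sketch below counts the trajectories in each split. It relies only on the layout shown in the example above: a dictionary with 'training' and 'testing' keys, each holding a (static, observations, actions) triple. It additionally assumes that each element supports len() along its first (trajectory) dimension, which the text above does not confirm. `summarise_splits` is a hypothetical helper, not part of Medkit, and the toy lists stand in for a real call to mk.batch_generate.

```python
def summarise_splits(dataset):
    """Return {split_name: trajectory_count} for a Medkit-style dataset dict.

    Assumption: every split is a (static, observations, actions) triple whose
    elements all have the same length along the first (trajectory) dimension.
    """
    summary = {}
    for split_name, (static, observations, actions) in dataset.items():
        counts = {len(static), len(observations), len(actions)}
        if len(counts) != 1:
            raise ValueError(
                f"Inconsistent trajectory counts in split {split_name!r}: {sorted(counts)}"
            )
        summary[split_name] = counts.pop()
    return summary


# Stand-in data with the same layout as the example above (not real Medkit output).
toy_dataset = {
    "training": ([[0.1], [0.2], [0.3]], [[[1.0]], [[2.0]], [[3.0]]], [[0], [1], [0]]),
    "testing": ([[0.4]], [[[4.0]]], [[1]]),
}
split_sizes = summarise_splits(toy_dataset)
print(split_sizes)
```

With a real dataset, one would pass synthetic_dataset in place of toy_dataset and compare the counts against the size and test_size arguments.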

While medical machine learning is by necessity almost always entirely offline, we also provide an interface through which you can interact online with the environment, should you find that useful. For example, you could train a custom RL policy on this environment with a specified reward function, then test inference algorithms on their ability to represent that policy.

env = mk.live_simulate(
    domain="ICU",
    environment="SVAE"
)

static_obs, observation, info = env.reset()
# `action` must be supplied by your own policy; it is not defined in this snippet.
observation, reward, info, done = env.step(action)
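
To illustrate how the reset and step calls above might be combined into a full episode, here is a minimal rollout sketch. It assumes only the return signatures shown in the snippet: reset() returns (static_obs, observation, info) and step(action) returns (observation, reward, info, done). `ToyEnv` is a hypothetical stand-in used so the example runs without Medkit; its binary actions, rewards and fixed horizon are invented for illustration and say nothing about the real environments. `rollout` and `random_policy` are likewise illustrative names, not Medkit functions.

```python
import random


class ToyEnv:
    """Hypothetical stand-in mimicking the reset/step signatures shown above."""

    def __init__(self, horizon=5):
        self.horizon = horizon
        self.t = 0

    def reset(self):
        self.t = 0
        # (static_obs, observation, info), matching the order in the snippet above.
        return [0.0], [0.0], {}

    def step(self, action):
        self.t += 1
        observation = [float(self.t)]
        reward = 1.0 if action == 1 else 0.0  # invented reward, for illustration only
        done = self.t >= self.horizon
        # (observation, reward, info, done), matching the order in the snippet above.
        return observation, reward, {}, done


def rollout(env, policy, max_steps=100):
    """Run one episode and return the (observation, action, reward) list and total reward."""
    static_obs, observation, info = env.reset()
    trajectory = []
    total_reward = 0.0
    for _ in range(max_steps):
        action = policy(static_obs, observation)
        next_observation, reward, info, done = env.step(action)
        trajectory.append((observation, action, reward))
        total_reward += reward
        observation = next_observation
        if done:
            break
    return trajectory, total_reward


rng = random.Random(0)


def random_policy(static_obs, observation):
    """Pick one of two hypothetical actions uniformly at random."""
    return rng.randint(0, 1)


trajectory, total_reward = rollout(ToyEnv(horizon=5), random_policy)
print(len(trajectory), total_reward)
```

To use the same loop with Medkit, one would pass the env returned by mk.live_simulate and a policy whose actions match that environment's action space.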

Citing

If you use this software please cite as follows:

@inproceedings{chan2021medkitlearn,
        title={The Medkit-Learn(ing) Environment: Medical Decision Modelling through Simulation},
        author={Alex James Chan and Ioana Bica and Alihan H{\"u}y{\"u}k and Daniel Jarrett and Mihaela van der Schaar},
        booktitle={Proceedings of the Neural Information Processing Systems Track on Datasets and Benchmarks},
        year={2021},
        url={https://openreview.net/forum?id=Ayf90B1yESX}
}
