Compose migration #84

Open
sarahmish opened this issue Mar 30, 2021 · 0 comments · May be fixed by #92
Labels
enhancement New feature or request new feature New feature

Comments

@sarahmish
Collaborator

Prediction Engineering

How to use Compose to write the problem definition component in Cardea.

Compose is a machine learning tool for automated prediction engineering. It lets you structure prediction problems and generate labels for supervised learning. We can use Compose to search for the cutoff times of a specific prediction problem (e.g. length of stay, LOS) and return label_times.
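
To make the idea of label times concrete, here is a minimal plain-Python sketch. It does not use Compose's API; it only mimics the shape of the result, one row per target instance holding a cutoff time and a label. The record fields (`patient`, `created`, `status`) and the helper name `label_times_sketch` are assumptions for illustration, and the cutoff is simply taken as the first timestamp seen for each instance.

```python
from collections import defaultdict
from datetime import datetime


def label_times_sketch(records, entity, time_index, labeling_function):
    """Group records by entity and emit one (entity, cutoff time, label) row per group."""
    groups = defaultdict(list)
    for record in records:
        groups[record[entity]].append(record)

    rows = []
    for instance_id in sorted(groups):
        # Order each instance's records chronologically before labeling.
        group = sorted(groups[instance_id], key=lambda r: r[time_index])
        rows.append({
            entity: instance_id,
            "time": group[0][time_index],  # assumed cutoff: earliest timestamp
            "label": labeling_function(group),
        })
    return rows


def missed(group):
    # True if any appointment in the group was a no-show.
    return any(r["status"] == "noshow" for r in group)


# Hypothetical appointment records.
appointments = [
    {"patient": "p1", "created": datetime(2021, 3, 1), "status": "booked"},
    {"patient": "p1", "created": datetime(2021, 3, 8), "status": "noshow"},
    {"patient": "p2", "created": datetime(2021, 3, 2), "status": "fulfilled"},
]

label_times = label_times_sketch(appointments, "patient", "created", missed)
```

Compose performs this search itself and adds options such as window sizes and multiple examples per instance; the sketch only shows what a label times table contains.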

The component should be easily adaptable to support multiple prediction problems:

  • appointment no show
  • mortality prediction
  • length of stay
  • etc

Design

There are two main parts that we need to define, plus supporting helpers:

  • Class whose main function is generating label times
  • Functions defining the prediction problem in mind
  • Helper functions used to create the prediction problem

Design of data_labeler.py

class DataLabeler:
    """Class that defines the prediction problem.

    This class supports the generation of `label_times` which 
    is fundamental to the feature generation phase as well 
    as specifying the target labels.

    Args:
        function (method):
            function that defines the prediction problem. It should return a
            tuple of the labeling function, the dataframe, and a dictionary
            of metadata describing the task (e.g. the target entity).
    """
    def __init__(self, function):
        self.function = function

    def generate_label_times(self, es, *args, **kwargs):
        """Searches the data to calculate label times.

        Args:
            es (featuretools.EntitySet):
                Entityset to search and extract labels from.

        Returns:
            composeml.LabelTimes:
                Calculated labels with cutoff times.
        """
        pass
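
One way `generate_label_times` might be filled in is to call the problem function and hand its three return values to a search routine. The sketch below assumes that flow. `search` is a hypothetical injected callable standing in for the Compose label search, and `toy_problem` and `toy_search` are invented for the example, so none of this is Cardea's or Compose's actual implementation.

```python
class DataLabelerSketch:
    """Illustrative stand-in for DataLabeler; not the Cardea implementation."""

    def __init__(self, function, search):
        self.function = function  # problem definition: returns (labeling_fn, data, meta)
        self.search = search      # hypothetical callable that performs the label search

    def generate_label_times(self, es, *args, **kwargs):
        # Ask the problem definition for its labeling function, data, and metadata,
        # then delegate the actual search.
        labeling_function, data, meta = self.function(es, *args, **kwargs)
        return self.search(labeling_function, data, meta)


def toy_problem(es):
    """Hypothetical problem definition over a list of dict records."""
    def any_noshow(rows):
        return any(row["status"] == "noshow" for row in rows)

    meta = {"entity": "appointment", "time_index": "created", "type": "classification"}
    return any_noshow, es["records"], meta


def toy_search(labeling_function, data, meta):
    """Hypothetical search: one label over all rows, cutoff at the earliest time."""
    ordered = sorted(data, key=lambda row: row[meta["time_index"]])
    return [{"time": ordered[0][meta["time_index"]], "label": labeling_function(ordered)}]


# A plain dict stands in for the entityset here.
es = {"records": [
    {"created": 2, "status": "noshow"},
    {"created": 1, "status": "booked"},
]}
labeler = DataLabelerSketch(toy_problem, toy_search)
result = labeler.generate_label_times(es)
```

Keeping the class this thin means that supporting a new prediction problem only requires a new problem function, not a change to the labeler.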

Design of a prediction function (e.g. appointment_no_show.py)

def appointment_no_show(es):
    def missed(ds, **kwargs):
        # label is True if any appointment in the data slice was a no-show
        return 'noshow' in ds["status"].values

    meta = {
        # values to define prediction task
        "entity": "appointment",
        "time_index": "created",
        "type": "classification",
        "num_examples_per_instance": 1
    }

    # flatten the entityset into a single dataframe (helper function defined elsewhere)
    df = denormalize(es, entities=['Appointment'])

    return missed, df, meta
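
Since the component is meant to cover several prediction problems, a second definition could follow the same three-value return contract. The sketch below is a hypothetical length-of-stay variant: the `admitted` and `discharged` field names, the `encounter` entity, the choice of the longest stay as the label, and the use of plain dict records in place of `denormalize` output are all assumptions.

```python
from datetime import datetime


def length_of_stay(records):
    """Hypothetical problem definition: returns (labeling_function, data, meta)."""
    def stay_in_days(rows, **kwargs):
        # Longest stay, in whole days, among the rows passed to the labeling function.
        return max((row["discharged"] - row["admitted"]).days for row in rows)

    meta = {
        "entity": "encounter",     # assumed entity name
        "time_index": "admitted",  # assumed time index
        "type": "regression",
        "num_examples_per_instance": 1,
    }
    return stay_in_days, records, meta


# Hypothetical encounter records.
records = [
    {"admitted": datetime(2021, 3, 1), "discharged": datetime(2021, 3, 4)},
    {"admitted": datetime(2021, 3, 10), "discharged": datetime(2021, 3, 11)},
]
labeling_function, data, meta = length_of_stay(records)
label = labeling_function(data)
```

Only the labeling function and the metadata change between problems; the return shape stays the same as in `appointment_no_show`.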

@sarahmish sarahmish added enhancement New feature or request new feature New feature labels Mar 30, 2021
@sarahmish sarahmish linked a pull request Apr 20, 2021 that will close this issue