Automatic finding of most interesting split models #120

Open
@sandervh14

Description

The question of whether several split models would work better than one single model comes up again and again with clients (we discussed it again last week with one of our current clients).
Usually we ask clients whether they know which splits of their population would show a different feature-target relation. (*)

It would be very interesting if we could find the population (basetable) splits automatically, in a data-driven way.
For example, by segmenting the population basetable and then:

  • either evaluating, per segment, whether a prefitted simple generic model fails to capture the feature-target relation (i.e. shows a larger evaluation error on that segment),
  • or evaluating, per segment, which features have PIG tables that differ strongly from those found for the generic model.
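
To make the two criteria concrete, here is a minimal sketch in plain Python. It is not an existing API: the function names, the `segment`/`score`/`target` column names and the toy rows are all assumptions for illustration. The generic model is represented only by its scores, the evaluation error is the mean squared error of those scores (Brier score), and the "PIG table" is simplified to the target incidence per feature bin.

```python
from collections import defaultdict


def brier_by_segment(rows, segment_key, target_key="target", score_key="score"):
    """Return {segment: mean squared error of the generic model's scores}."""
    sums = defaultdict(lambda: [0.0, 0])
    for row in rows:
        acc = sums[row[segment_key]]
        acc[0] += (row[score_key] - row[target_key]) ** 2
        acc[1] += 1
    return {seg: total / n for seg, (total, n) in sums.items()}


def incidence_table(rows, feature_key, target_key="target"):
    """Return {feature bin: target incidence} (simplified stand-in for a PIG table)."""
    sums = defaultdict(lambda: [0, 0])
    for row in rows:
        acc = sums[row[feature_key]]
        acc[0] += row[target_key]
        acc[1] += 1
    return {bin_: hits / n for bin_, (hits, n) in sums.items()}


def max_incidence_gap(rows, segment_key, feature_key, target_key="target"):
    """Return {segment: largest absolute incidence gap versus the full population}."""
    overall = incidence_table(rows, feature_key, target_key)
    by_segment = defaultdict(list)
    for row in rows:
        by_segment[row[segment_key]].append(row)
    gaps = {}
    for seg, seg_rows in by_segment.items():
        seg_table = incidence_table(seg_rows, feature_key, target_key)
        # Every bin seen in a segment also exists in the overall table.
        gaps[seg] = max(abs(seg_table[b] - overall[b]) for b in seg_table)
    return gaps


# Hypothetical toy basetable: in segment "B" the age_bin-target relation is reversed.
rows = [
    {"segment": "A", "age_bin": "young", "target": 0, "score": 0.2},
    {"segment": "A", "age_bin": "young", "target": 0, "score": 0.2},
    {"segment": "A", "age_bin": "old", "target": 1, "score": 0.8},
    {"segment": "A", "age_bin": "old", "target": 1, "score": 0.8},
    {"segment": "B", "age_bin": "young", "target": 1, "score": 0.2},
    {"segment": "B", "age_bin": "old", "target": 0, "score": 0.8},
]

segment_errors = brier_by_segment(rows, "segment")            # criterion 1
segment_gaps = max_incidence_gap(rows, "segment", "age_bin")  # criterion 2
```

In this toy data, segment "B" stands out on both criteria, which is the kind of signal that would suggest a separate model for that segment. A real implementation would also need a threshold or significance test and a minimum segment size, which this sketch leaves out.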

We could also leave users the option to specify, in a list argument, which basetable splits they want to create, or to provide the basetable splits themselves, based on business knowledge (see * above).
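
One possible shape for that option, sketched in plain Python. The function `make_splits` and its parameters `split_columns` and `predefined_splits` are hypothetical names, not part of any existing interface, and the basetable is represented as a list of dicts only to keep the example self-contained.

```python
from collections import defaultdict


def make_splits(rows, split_columns=None, predefined_splits=None):
    """Return {split name: list of rows}.

    predefined_splits: optional dict of ready-made splits (business knowledge),
        returned unchanged apart from a shallow copy of the dict.
    split_columns: optional list of column names; one split is created per
        distinct combination of their values.
    """
    if predefined_splits is not None:
        return dict(predefined_splits)
    if not split_columns:
        # No instruction given: keep the whole population as a single split.
        return {"all": list(rows)}
    splits = defaultdict(list)
    for row in rows:
        name = "|".join(f"{col}={row[col]}" for col in split_columns)
        splits[name].append(row)
    return dict(splits)


# Hypothetical basetable with two candidate split columns.
basetable = [
    {"region": "north", "is_new_customer": True, "target": 1},
    {"region": "north", "is_new_customer": False, "target": 0},
    {"region": "south", "is_new_customer": True, "target": 0},
]

splits = make_splits(basetable, split_columns=["region"])
```

Giving precedence to `predefined_splits` is a design choice in this sketch: if users already bring their own splits, the column-based logic is skipped entirely.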
