Open
Description
The question of whether several split models work better than one single model comes up again and again at clients (we discussed it again last week at one of our current clients).
Usually we ask clients whether they know which splits of their population are likely to have a different feature-target relation. (*)
It would be very interesting if we could find the population (basetable) splits automatically in a data-driven way.
For example, by segmenting the population basetable and:
- either evaluating per population segment whether a prefitted simple generic model fails to capture the feature-target relation (larger evaluation error in that segment than on the full population),
- or evaluating per population segment which features have PIG tables that differ strongly from those of the generic model.
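The first option could be sketched roughly as follows. This is only an illustration under assumptions: the target is taken to be binary, the generic model's predicted probabilities are assumed to be available already, and the function names, the log loss metric, the `tolerance` threshold and the `min_size` cut-off are all hypothetical choices, not an existing API.

```python
import math
from collections import defaultdict


def log_loss(y_true, y_prob, eps=1e-15):
    """Mean binary log loss; probabilities are clipped to avoid log(0)."""
    total = 0.0
    for y, p in zip(y_true, y_prob):
        p = min(max(p, eps), 1 - eps)
        total += -(y * math.log(p) + (1 - y) * math.log(1 - p))
    return total / len(y_true)


def flag_segments(segments, y_true, y_prob, tolerance=0.10, min_size=3):
    """Hypothetical helper: return {segment: (segment_loss, overall_loss)}
    for every segment whose log loss exceeds the overall log loss by more
    than `tolerance` (relative). Segments smaller than `min_size` are skipped."""
    overall = log_loss(y_true, y_prob)

    # Collect targets and generic-model predictions per segment.
    grouped = defaultdict(lambda: ([], []))
    for seg, y, p in zip(segments, y_true, y_prob):
        grouped[seg][0].append(y)
        grouped[seg][1].append(p)

    flagged = {}
    for seg, (ys, ps) in grouped.items():
        if len(ys) < min_size:
            continue
        seg_loss = log_loss(ys, ps)
        if seg_loss > overall * (1 + tolerance):
            flagged[seg] = (seg_loss, overall)
    return flagged


# Toy data (made up): the generic model fits "retail" well, "business" poorly.
segments = ["retail"] * 4 + ["business"] * 4
y_true = [1, 0, 1, 0, 1, 0, 1, 0]
y_prob = [0.9, 0.1, 0.8, 0.2, 0.4, 0.6, 0.3, 0.7]

flagged = flag_segments(segments, y_true, y_prob)
```

Segments returned by `flag_segments` would be candidates for a separate model. In practice the comparison should be done on held-out data, and the threshold would need tuning per use case.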
We could also leave the option for people to specify, in a list argument, which basetable splits they want to create, or let them provide the basetable splits directly, based on business knowledge (see * above).
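A possible shape for that option, as a rough sketch only: the function name `make_basetable_splits` and its `split_columns` and `predefined_splits` arguments are hypothetical, and the basetable is represented here as a plain list of dicts rather than whatever structure the package would actually use.

```python
from collections import defaultdict


def make_basetable_splits(basetable, split_columns=None, predefined_splits=None):
    """Hypothetical helper returning {split_key: rows}.

    - predefined_splits: splits the user already built from business
      knowledge; returned as-is.
    - split_columns: list of column names; one split is created per
      distinct combination of their values.
    - neither given: a single split containing the whole basetable.
    """
    if predefined_splits is not None:
        return dict(predefined_splits)
    if not split_columns:
        return {("all",): list(basetable)}

    splits = defaultdict(list)
    for row in basetable:
        key = tuple(row[col] for col in split_columns)
        splits[key].append(row)
    return dict(splits)


# Toy basetable (made up) split on a user-specified column.
basetable = [
    {"segment": "retail", "age": 34, "target": 0},
    {"segment": "business", "age": 51, "target": 1},
    {"segment": "retail", "age": 27, "target": 1},
]
splits = make_basetable_splits(basetable, split_columns=["segment"])
```

Keeping both arguments optional means the data-driven search and the business-knowledge route can share the same downstream code, since each produces a mapping from split key to rows.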