Remove the need to define a split when running `predict` #138

drewoldag · 2024-12-12T17:26:26Z

The current code expects that a split will be defined in the [predict] table in the config file. The user can then set the specified split to be 1.0 (100%) of the data in the data directory.

But it feels odd to require the user to update the [data_set] split definitions just to run inference.

I would advocate for an approach that doesn't require the user to specify a split in the [predict] table, and instead runs all the data found in the data directory through the trained model.


aritraghsh09 · 2024-12-12T19:20:52Z

This will be the expected default user behavior in unsupervised learning scenarios.

In supervised scenarios, users might want to run predict only on the "test" set (i.e., data - train - validation).

I don't think there is a "right" answer here.
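To make the "data - train - validation" remainder concrete, here is a minimal sketch. It assumes splits are expressed as fractions of the data, as the 1.0 (100%) example earlier in the thread suggests. The function name `remaining_test_fraction` is hypothetical and is not taken from the project's code.

```python
def remaining_test_fraction(train: float, validate: float) -> float:
    """Return the fraction of the data left over for the "test" split.

    Hypothetical helper: assumes each split is a fraction in [0, 1]
    and that "test" is whatever remains after train and validate.
    """
    for name, value in (("train", train), ("validate", validate)):
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} fraction must be in [0, 1], got {value}")
    if train + validate > 1.0:
        raise ValueError("train + validate fractions exceed 1.0")
    return 1.0 - train - validate
```

For example, with 60% of the data used for training and 20% for validation, the remaining fifth would be what a supervised user runs predict on.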

drewoldag · 2024-12-13T00:11:21Z

Proposing a solution: what if the default value for the split in [predict] is false, which would indicate "no split, just use all the data in the data directory"? If the value is not false, then we would expect it to be one of ["train", "validate", "test"].

I think that would support both use cases relatively easily:

- I just want to use an existing trained model with a bunch of data - no mods required to the config file.
- I want to test my trained model on a subset of my input data - the user would expect to have to define the subset in the config.
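The proposed rule could be sketched roughly as follows. This is only an illustration under stated assumptions: the [predict] table is assumed to have been parsed into a plain dict, and the names `resolve_predict_split` and `VALID_SPLITS` are hypothetical and not taken from the existing code.

```python
from typing import Any, Optional

# The split names proposed above; anything else is rejected.
VALID_SPLITS = ("train", "validate", "test")


def resolve_predict_split(predict_config: dict[str, Any]) -> Optional[str]:
    """Interpret the `split` value of an already-parsed [predict] table.

    Returns None when no split is requested (use all the data in the
    data directory), otherwise the name of the requested split.
    """
    # Assumption: a missing key behaves like the proposed default, false.
    split = predict_config.get("split", False)
    if split is False:
        return None
    if isinstance(split, str) and split in VALID_SPLITS:
        return split
    raise ValueError(
        f"[predict] split must be false or one of {VALID_SPLITS}, got {split!r}"
    )
```

With this shape, a config that never mentions a split covers the first use case, and only users who want a subset need to touch the config. Rejecting `true` and unknown names keeps a typo from silently falling back to "all data".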

drewoldag added the "enhancement" (New feature or request) label · Dec 12, 2024
