The current code expects that a split will be defined in the `[predict]` table of the config file. The user can then set the specified split to 1.0 (100%) of the data in the data directory.
But requiring the user to update the `[data_set]` split definitions just to run inference feels like an odd requirement.
I would advocate for an approach that doesn't require the user to specify a split in the `[predict]` table, and instead runs all the data found through the trained model.
Proposing a solution: what if the default value for `split` in `[predict]` is `false`, indicating "no split, just use all the data in the data directory"? If the value is not `false`, then we would expect it to be one of `["train", "validate", "test"]`.
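As a sketch of what the two configurations might look like under this proposal (the table and key names follow the issue; the exact schema is an assumption):

```toml
# Use case 1: run inference on everything in the data directory
# (the proposed default; omitting split entirely would behave the same way).
[predict]
split = false

# Use case 2: run inference only on a named split, e.g.
# [predict]
# split = "test"
```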
I think that would support both use cases relatively easily:

- I just want to use an existing trained model with a bunch of data: no mods required to the config file.
- I want to test my trained model on a subset of my input data: the user would expect to have to define the subset in the config.
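A minimal sketch of how the loader could resolve this (function names are hypothetical; `split_lookup` stands in for however the pipeline already maps a `[data_set]` split name to its files):

```python
VALID_SPLITS = ("train", "validate", "test")

def select_predict_files(config, all_files, split_lookup):
    """Pick the files to run inference on, based on the [predict] table.

    A split of False (the proposed default) means "no split, use all the
    data in the data directory"; otherwise the value must name one of the
    [data_set] splits.
    """
    split = config.get("predict", {}).get("split", False)
    if split is False:
        # Use case 1: run everything, no config edits required.
        return list(all_files)
    if split not in VALID_SPLITS:
        raise ValueError(
            f"[predict] split must be false or one of {VALID_SPLITS}, got {split!r}"
        )
    # Use case 2: defer to the existing split resolution.
    return split_lookup(split)
```

Keying the default off `False` (rather than an empty string or a missing key alone) keeps the config explicit while still letting a bare config file fall through to the run-everything path.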