The current k-fold cross-validation assumes that the supplied sample data is uniformly randomized and therefore performs simple slicing of the array to form the individual folds. We should instead partition the data so that the proportion of each class is maintained in every fold. Stratification could become the default (or the only) partitioning behavior, or alternatively it could be exposed as an optional boolean parameter.
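For reference, the optional parameter might look something like the following. This is only a hypothetical interface sketch; the function and parameter names are assumptions, not the project's existing API:

```python
# Hypothetical signature only -- names and defaults are assumptions, not the
# project's current API. `stratified=True` would request class-proportional folds.
def k_fold_cross_validation(samples, labels, k=10, stratified=True):
    """Split the samples into k folds; when `stratified` is True, keep each
    class's proportion roughly constant across the folds."""
    ...
```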
To enforce this, we would first have to bucket the supplied samples by class, then partition each bucket into k roughly equal parts, and finally assemble each of the k folds by taking one chunk from every bucket. It is not difficult to do, and I can take care of it when I get a chance to play with the code again. For now, however, we shuffle the sample data before splitting, which has a similar effect in expectation but is less precise, depending on the randomness.
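A minimal sketch of the approach described above, assuming labels are available as a list parallel to the samples; the function name and the round-robin dealing of each bucket across folds are my own choices, not the project's existing code:

```python
import random
from collections import defaultdict

def stratified_k_folds(labels, k, seed=None):
    """Return k folds of sample indices with class proportions
    approximately preserved in every fold."""
    rng = random.Random(seed)

    # Bucket sample indices by class label.
    buckets = defaultdict(list)
    for idx, label in enumerate(labels):
        buckets[label].append(idx)

    folds = [[] for _ in range(k)]
    for indices in buckets.values():
        rng.shuffle(indices)
        # Deal each class's indices round-robin across the k folds, so every
        # fold receives roughly len(indices) / k samples of this class.
        for i, idx in enumerate(indices):
            folds[i % k].append(idx)
    return folds

# Example: a 2:1 class ratio is preserved in each of the 3 folds.
labels = ["a"] * 6 + ["b"] * 3
for fold in stratified_k_folds(labels, k=3, seed=0):
    print(sorted(fold), [labels[i] for i in sorted(fold)])
```

Dealing each bucket out round-robin rather than slicing contiguous chunks handles bucket sizes that are not exact multiples of k without leaving any fold short of a class.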