Currently, descriptive stats are quite inconsistent across tasks. This leads to problems, e.g. if we want to calculate the number of characters per task to estimate the number of compute tokens needed.
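For context, a minimal sketch of that kind of estimate (the characters-per-token ratio is a rough heuristic, not something fixed by mteb; the actual ratio depends on the tokenizer):

```python
from typing import Iterable

CHARS_PER_TOKEN = 4  # rough heuristic; the real ratio depends on the tokenizer


def estimate_tokens(texts: Iterable[str]) -> int:
    """Estimate compute tokens for a task from its total character count."""
    total_chars = sum(len(text) for text in texts)
    return total_chars // CHARS_PER_TOKEN


print(estimate_tokens(["What is descriptive statistics?", "A summary of data."]))
```

Without consistent character counts per task, an estimate like this can't be computed across the whole benchmark.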
All of these calculations could be automated and already exist in `_calculate_metrics_from_split`; however, they are not run for all datasets. It would be great to have a test that checks that these stats are calculated consistently across all tasks.
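A hedged sketch of such a test, assuming a recent mteb version exposing `get_tasks()`; the `descriptive_stats` metadata attribute and the `REQUIRED_KEYS` schema are assumptions for illustration, not mteb's confirmed API:

```python
import mteb  # assumes a version of mteb that exposes get_tasks()
import pytest

# Keys we expect each split's stats to contain; illustrative, not a confirmed schema.
REQUIRED_KEYS = {"n_samples", "avg_character_length"}


@pytest.mark.parametrize("task", mteb.get_tasks(), ids=lambda t: t.metadata.name)
def test_descriptive_stats_present(task):
    # `descriptive_stats` as a metadata attribute is an assumption here.
    stats = getattr(task.metadata, "descriptive_stats", None)
    assert stats, f"{task.metadata.name} has no descriptive stats"
    for split, split_stats in stats.items():
        missing = REQUIRED_KEYS - set(split_stats)
        assert not missing, f"{task.metadata.name}/{split} is missing {missing}"
```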
Additionally, this data is currently included in the metadata, which might not be ideal: it often requires copy-pasting, which can introduce errors. A solution could be to write the stats to a JSON file from which the data is fetched when needed. Tests could then fail if this cache is incomplete.
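A sketch of what that cache could look like (the file location and helper names are illustrative, not a proposed final design): stats are computed once, written to disk, and read back when needed, so a test can fail on any task whose entry is missing.

```python
import json
from pathlib import Path
from typing import Any, Optional

STATS_FILE = Path("descriptive_stats.json")  # assumed cache location


def save_stats(task_name: str, stats: dict[str, Any]) -> None:
    """Write one task's computed stats into the shared JSON cache."""
    cache = json.loads(STATS_FILE.read_text()) if STATS_FILE.exists() else {}
    cache[task_name] = stats
    STATS_FILE.write_text(json.dumps(cache, indent=2))


def load_stats(task_name: str) -> Optional[dict[str, Any]]:
    """Fetch a task's stats from the cache; None means the cache is not full."""
    if not STATS_FILE.exists():
        return None
    return json.loads(STATS_FILE.read_text()).get(task_name)
```

This would also remove the copy-paste step: the metadata could reference the cached values instead of duplicating them.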