The number of time series formats is too damn high! #1
Comments
On this point, I can only report the choice made for tslearn (probably not ideal). Our time series are 3D NumPy arrays (to allow for multidimensional time series). For now, we do not tackle the irregular sampling case. To allow time series of varying lengths to be stored in a single array, we pad shorter time series with NaNs. Then, in tslearn, we have a ts_size function that takes a time series and returns its actual size (ignoring the padded NaNs).
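To make the padding convention concrete, here is a minimal NumPy-only sketch. The helpers `pad_to_dataset` and `actual_length` are hypothetical names made up for illustration, not tslearn's implementation, and the array layout `(n_series, max_length, n_dims)` is an assumption about how the 3D array is organized.

```python
import numpy as np


def pad_to_dataset(series_list):
    """Stack variable-length series into one 3D array, padding with NaN.

    Hypothetical helper; assumes the layout (n_series, max_length, n_dims).
    """
    arrays = [np.asarray(s, dtype=float) for s in series_list]
    # Promote univariate (1D) series to shape (length, 1).
    arrays = [a.reshape(-1, 1) if a.ndim == 1 else a for a in arrays]
    max_len = max(a.shape[0] for a in arrays)
    n_dims = arrays[0].shape[1]
    dataset = np.full((len(arrays), max_len, n_dims), np.nan)
    for i, a in enumerate(arrays):
        dataset[i, : a.shape[0], :] = a
    return dataset


def actual_length(ts):
    """Return the length of a padded series, ignoring trailing all-NaN steps.

    Hypothetical helper illustrating the idea behind a ts_size-style function.
    """
    ts = np.asarray(ts, dtype=float)
    if ts.ndim == 1:
        ts = ts.reshape(-1, 1)
    # A time step counts as valid if at least one dimension is not NaN.
    valid = ~np.all(np.isnan(ts), axis=1)
    if not valid.any():
        return 0
    return int(np.flatnonzero(valid)[-1]) + 1


dataset = pad_to_dataset([[1.0, 2.0, 3.0], [4.0, 5.0]])
lengths = [actual_length(ts) for ts in dataset]
```

One consequence of this convention is that a genuine missing value at the end of a series cannot be told apart from padding, which may be part of why the choice is described as "probably not ideal".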
I hear you loud and clear, having just put together a few examples. Every package has its own ideas. No two are the same, it seems.
@microprediction we'd welcome your feedback on sktime, which tries to remedy some of these issues.
Thanks, I'm taking a look. By the way, my own most recent ruminations are embodied at https://github.com/microprediction/timemachines, but I hasten to add that my scope is probably smaller. I'll try to bring more people into this thread, unless someone thinks there is a better place to discuss time series and canonical representations of data and models.
I started a file at
https://github.com/MaxBenChrist/awesome_time_series_in_python/blob/master/standardize_time_series_formats.md
to trigger a discussion about the time series formats used by Python packages that analyze time series data.
This issue should be used to discuss the file.