Lexicographical sort of column "time" after compression #7

Open
nikhase opened this issue Aug 12, 2017 · 3 comments

Comments

@nikhase
Collaborator

nikhase commented Aug 12, 2017

The "time" column holds bin labels encoded as strings such as bin_0.0. Because strings sort lexicographically, it is hard to sort by this column or to make a plot. What about renaming "time" to "bin" and providing bin numbers?

In general, one would like to pass the dataframe to tsfresh, so the "time" column should be ordered accordingly.

id feature_agg_autocorrelation_f_agg_"mean" feature_agg_autocorrelation_f_agg_"median" feature_agg_autocorrelation_f_agg_"var" time
0 -0.006695 -0.031946 0.031041 bin_0.0
0 0.003307 0.002723 0.015377 bin_1.0
0 -0.019875 -0.020356 0.016519 bin_10.0
0 -0.010753 -0.026369 0.021735 bin_100.0
0 0.011816 0.019509 0.010336 bin_101.0
0 -0.012836 -0.012418 0.038740 bin_102.0
0 -0.013034 -0.008422 0.008983 bin_103.0
0 -0.015615 -0.015442 0.022139 bin_104.0
0 -0.011075 0.006340 0.018839 bin_105.0
0 -0.012528 -0.002204 0.014608 bin_106.0
0 0.003264 -0.012552 0.012001 bin_107.0
0 -0.008267 -0.013056 0.031777 bin_108.0
0 -0.014031 -0.026050 0.011954 bin_109.0
0 -0.027372 -0.028189 0.012125 bin_11.0
0 -0.006538 -0.016846 0.020991 bin_110.0
0 0.028912 -0.002320 0.018458 bin_111.0
0 -0.011757 -0.021368 0.040606 bin_112.0
0 -0.014773 -0.022101 0.013958 bin_113.0
0 -0.010944 -0.001797 0.028481 bin_114.0
0 -0.016143 -0.028406 0.007117 bin_115.0
0 -0.013865 -0.021711 0.011233 bin_116.0
0 -0.009488 0.007354 0.008971 bin_117.0
0 -0.014187 -0.017223 0.044131 bin_118.0
0 -0.013005 -0.005250 0.011614 bin_119.0
0 -0.011601 0.010453 0.016970 bin_12.0
0 -0.012738 -0.004333 0.012729 bin_120.0
0 -0.013266 -0.016564 0.007020 bin_121.0
0 -0.015038 -0.042097 0.024701 bin_122.0
0 -0.012776 -0.004399 0.016492 bin_123.0
0 -0.012934 -0.018298 0.017719 bin_124.0
... ... ... ... ...
9 -0.017292 -0.010434 0.007727 bin_72.0
9 -0.009239 0.000410 0.007263 bin_73.0
9 -0.050343 -0.035553 0.016307 bin_74.0
9 -0.016550 -0.019668 0.007808 bin_75.0
9 -0.015879 -0.034310 0.014253 bin_76.0
9 -0.019754 -0.037949 0.018174 bin_77.0
9 -0.016839 -0.005070 0.016695 bin_78.0
9 -0.015295 -0.005584 0.012654 bin_79.0
9 -0.015647 -0.016262 0.008907 bin_8.0
9 -0.010676 -0.014450 0.010222 bin_80.0
9 -0.003566 0.010439 0.009648 bin_81.0
9 0.008290 0.015121 0.009266 bin_82.0
9 -0.004448 -0.014874 0.007668 bin_83.0
9 -0.012481 -0.017615 0.012226 bin_84.0
9 -0.018334 -0.007268 0.009883 bin_85.0
9 -0.017429 -0.029421 0.009856 bin_86.0
9 -0.000159 0.010534 0.008968 bin_87.0
9 -0.003924 -0.022100 0.018910 bin_88.0
9 0.008415 0.019052 0.020014 bin_89.0
9 -0.012393 -0.000086 0.010260 bin_9.0
9 0.006285 0.020495 0.012573 bin_90.0
9 -0.010193 -0.008106 0.008721 bin_91.0
9 -0.016792 -0.009178 0.012188 bin_92.0
9 0.008476 0.020195 0.010278 bin_93.0
9 0.005893 0.007117 0.008789 bin_94.0
9 -0.008254 -0.010829 0.017784 bin_95.0
9 0.004660 0.014164 0.009694 bin_96.0
9 0.011764 -0.004501 0.010030 bin_97.0
9 -0.017136 -0.026493 0.011077 bin_98.0
9 0.013644 0.033041 0.008518 bin_99.0
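
The ordering in the table above is what a plain string sort produces: labels are compared character by character, so "bin_10.0" comes before "bin_2.0". A minimal sketch of the difference, using only the standard library; `bin_number` is a hypothetical helper for illustration, not something the project provides:

```python
# A handful of labels in the same format as the "time" column.
labels = ["bin_0.0", "bin_1.0", "bin_10.0", "bin_100.0", "bin_2.0", "bin_11.0"]


def bin_number(label):
    """Return the numeric part of a label such as 'bin_10.0' (hypothetical helper)."""
    return int(float(label.split("_", 1)[1]))


# Default string comparison: "bin_10.0" and "bin_100.0" land before "bin_2.0".
lexicographic = sorted(labels)

# Sorting on the extracted number gives the intended bin order.
numeric = sorted(labels, key=bin_number)

print(lexicographic)
print(numeric)
```

Storing the number itself in the column, rather than the label, would make the default sort behave like the second case.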
@nikhase
Collaborator Author

nikhase commented Aug 12, 2017

After renaming "time" to "bin" and storing numeric bin numbers in the column, the dataframe can be passed to tsfresh:

extract_features(compressed_df, column_id="id", column_sort="bin")
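
Until the output format changes, the conversion could be done on the user side with pandas. This is only a sketch under assumptions: the toy `compressed_df` below stands in for the real compressed dataframe, its values are made up for illustration, and the single `feature` column is a placeholder for the actual feature columns.

```python
import pandas as pd

# Toy stand-in for the compressed dataframe (illustrative values only).
compressed_df = pd.DataFrame({
    "id": [0, 0, 0, 1, 1, 1],
    "feature": [0.1, 0.2, 0.3, 0.4, 0.5, 0.6],
    "time": ["bin_0.0", "bin_10.0", "bin_2.0", "bin_0.0", "bin_10.0", "bin_2.0"],
})

# Strip the "bin_" prefix and turn labels like "10.0" into the integer 10.
compressed_df["time"] = (
    compressed_df["time"]
    .str.replace("bin_", "", regex=False)
    .astype(float)
    .astype(int)
)

# Rename the column as proposed in this issue; "id" stays untouched.
compressed_df = compressed_df.rename(columns={"time": "bin"})

# Order rows numerically within each id.
compressed_df = compressed_df.sort_values(["id", "bin"]).reset_index(drop=True)

print(compressed_df)
```

The resulting frame has a numeric "bin" column, so it can be used with `column_sort="bin"` in the call above.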

@MaxBenChrist
Owner

I am fine with changing the naming of the bins if we also change the name of the id column to bin column afterwards.

@nikhase
Collaborator Author

nikhase commented Aug 14, 2017

Tiny correction: The id column stays the same, "time" is changed to "bin".
