Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

flux_percentile_ratio Index off by one error for small time series #8

Open
stevenstetzler opened this issue Nov 28, 2018 · 6 comments
Labels

Comments

@stevenstetzler
Copy link

I am running a full feature extraction on a time series with 14 measurements (using something like):

import feets
fs = feets.FeatureSpace(data = ['time', 'magnitude', 'error'])
features, values = fs.extract(*ts)

and I receive the following error:

---------------------------------------------------------------------------
IndexError                                Traceback (most recent call last)
<ipython-input-11-7060bccca8c7> in <module>
----> 1 feets_all_features(ts_lists[2116])

<ipython-input-9-c2ebdeac3e58> in feets_all_features(ts_pbs)
     11 
     12     for pb in range(0, 6):
---> 13         features, values = fs.extract(*ts_pbs[pb][0])
     14 
     15         features = [f + f"_{pb}" for f in features]

/epyc/projects/plasticc-ml/envs/PLASTICC/lib/python3.6/site-packages/feets/core.py in extract(self, time, magnitude, error, magnitude2, aligned_time, aligned_magnitude, aligned_magnitude2, aligned_error, aligned_error2)
    290         features = {}
    291         for fextractor in self._execution_plan:
--> 292             result = fextractor.extract(features=features, **kwargs)
    293             features.update(result)
    294 

/epyc/projects/plasticc-ml/envs/PLASTICC/lib/python3.6/site-packages/feets/extractors/core.py in extract(self, **kwargs)
    290             # setup & run te extractor
    291             self.setup()
--> 292             result = self.fit(**fit_kwargs)
    293 
    294             # validate if the extractors generates the expected features

/epyc/projects/plasticc-ml/envs/PLASTICC/lib/python3.6/site-packages/feets/extractors/ext_flux_percentile_ratio.py in fit(self, magnitude)
    203 
    204         F_10_90 = sorted_data[F_90_index] - sorted_data[F_10_index]
--> 205         F_5_95 = sorted_data[F_95_index] - sorted_data[F_5_index]
    206         F_mid80 = F_10_90 / F_5_95
    207 

IndexError: index 14 is out of bounds for axis 0 with size 14

This is due to the fact that ext_flux_percentile_ratio.py calculates the location of the flux percentiles as:

F_10_index = int(math.ceil(0.10 * lc_length))
F_90_index = int(math.ceil(0.90 * lc_length))
F_5_index = int(math.ceil(0.05 * lc_length))
F_95_index = int(math.ceil(0.95 * lc_length))

For my example, lc_length = 14 thus:

F_10_index = 2
F_90_index = 13
F_5_index = 1
F_95_index = 14

but 14 is not a valid index for sorted_data[F_95_index] since sorted_data is of length 14.

Note that this is also an issue in ext_flux_percentile_ratio.py.

I am fixing this right now but just decrementing F_95_index by 1 if it is the same as lc_length, but it would probably be better to do bounds checking on all F_X_index variables or even just redefining lc_length to be lc_length - 1 to make it an index and not a length might work.

@stevenstetzler stevenstetzler changed the title [Bug] Index out of bounds error for small time series [Bug] Index off by one error for small time series Nov 28, 2018
@leliel12
Copy link
Contributor

mfff. @stevenstetzler i just started the next version of feets (to deliver it arround april) can you help me with this bug?

@leliel12 leliel12 added the bug label Feb 4, 2019
@stevenstetzler
Copy link
Author

Yeah, I can help with this. I think the two solutions I proposed are all that can be done:

  1. if the flux index goes past the length of the array, make the index the end of the array
  2. define lc_length to be lc_length - 1 so that when computing flux indices they will always be bounded within the range of indices for the array: [0, lc_length - 1]

The problem with (2) is that making this change might make all results produced by feets different in the new version, so results between the new and old versions would be inconsistent. However, it seems to me that (2) would have been the intended behavior of the script all along, and so this is the change that should be made. I don't know anything about the ext_flux_percentile_ratio.py script or its integration with feets outside of using it as a black box, so I'm not sure which change is preferred by the development team.

@leliel12 leliel12 changed the title [Bug] Index off by one error for small time series flux_percentile_ratio Index off by one error for small time series Feb 20, 2019
@leliel12
Copy link
Contributor

mmm i'am little confused with this bug. mostly because the second proposal. So lets try to figure out this issue.

  1. I think the minimal example that break the code is
import feets
fs = feets.FeatureSpace(only=[
  'FluxPercentileRatioMid20',
  'FluxPercentileRatioMid35',
  'FluxPercentileRatioMid50',,
  'FluxPercentileRatioMid65',
  'FluxPercentileRatioMid80'])
features, values = fs.extract(*ts)

Can you provide the correct ts to reproduce the error?

Also, maybe we can use np.percentile

@stevenstetzler
Copy link
Author

If you pass in a time series with just a single value, it will break:

import feets
fs = feets.FeatureSpace(only=[
  'FluxPercentileRatioMid20',
  'FluxPercentileRatioMid35',
  'FluxPercentileRatioMid50',
  'FluxPercentileRatioMid65',
  'FluxPercentileRatioMid80'])

ts = [[0], [1]]

features, values = fs.extract(*ts)

produces

---------------------------------------------------------------------------
IndexError                                Traceback (most recent call last)
<ipython-input-2-f3865207d38e> in <module>
      8 ts = [[0], [1]]
      9 
---> 10 features, values = fs.extract(*ts)

/epyc/projects/plasticc-ml/envs/FeetsTest/lib/python3.7/site-packages/feets/core.py in extract(self, time, magnitude, error, magnitude2, aligned_time, aligned_magnitude, aligned_magnitude2, aligned_error, aligned_error2)
    290         features = {}
    291         for fextractor in self._execution_plan:
--> 292             result = fextractor.extract(features=features, **kwargs)
    293             features.update(result)
    294 

/epyc/projects/plasticc-ml/envs/FeetsTest/lib/python3.7/site-packages/feets/extractors/core.py in extract(self, **kwargs)
    290             # setup & run te extractor
    291             self.setup()
--> 292             result = self.fit(**fit_kwargs)
    293 
    294             # validate if the extractors generates the expected features

/epyc/projects/plasticc-ml/envs/FeetsTest/lib/python3.7/site-packages/feets/extractors/ext_flux_percentile_ratio.py in fit(self, magnitude)
    114         F_95_index = int(math.ceil(0.95 * lc_length))
    115 
--> 116         F_40_60 = sorted_data[F_60_index] - sorted_data[F_40_index]
    117         F_5_95 = sorted_data[F_95_index] - sorted_data[F_5_index]
    118         F_mid20 = F_40_60 / F_5_95

IndexError: index 1 is out of bounds for axis 0 with size 1

It breaks when you have < 20 points in the time series:

import feets
fs = feets.FeatureSpace(only=[
  'FluxPercentileRatioMid20',
  'FluxPercentileRatioMid35',
  'FluxPercentileRatioMid50',
  'FluxPercentileRatioMid65',
  'FluxPercentileRatioMid80'])

for n in range(1, 30):
    # Make time series of size n
    ts = [[i for i in range(n)], [i for i in range(n)]]

    try:
        features, values = fs.extract(*ts)
    except Exception as e:
        print(e)

produces

index 1 is out of bounds for axis 0 with size 1
index 2 is out of bounds for axis 0 with size 2
index 3 is out of bounds for axis 0 with size 3
index 4 is out of bounds for axis 0 with size 4
index 5 is out of bounds for axis 0 with size 5
index 6 is out of bounds for axis 0 with size 6
index 7 is out of bounds for axis 0 with size 7
index 8 is out of bounds for axis 0 with size 8
index 9 is out of bounds for axis 0 with size 9
index 10 is out of bounds for axis 0 with size 10
index 11 is out of bounds for axis 0 with size 11
index 12 is out of bounds for axis 0 with size 12
index 13 is out of bounds for axis 0 with size 13
index 14 is out of bounds for axis 0 with size 14
index 15 is out of bounds for axis 0 with size 15
index 16 is out of bounds for axis 0 with size 16
index 17 is out of bounds for axis 0 with size 17
index 18 is out of bounds for axis 0 with size 18
index 19 is out of bounds for axis 0 with size 19

This test was done using a fresh install of feets with conda with only ipykernel and feets installed.
python --version output: Python 3.7.1
conda list outputs:

astropy                   3.1.1                     <pip>
atomicwrites              1.3.0                     <pip>
attrs                     18.2.0                    <pip>
backcall                  0.1.0                      py_0    conda-forge
bzip2                     1.0.6             h14c3975_1002    conda-forge
ca-certificates           2018.11.29           ha4d7672_0    conda-forge
certifi                   2018.11.29            py37_1000    conda-forge
chardet                   3.0.4                     <pip>
decorator                 4.3.2                      py_0    conda-forge
feets                     0.4                       <pip>
idna                      2.8                       <pip>
ipykernel                 5.1.0           py37h24bf2e0_1002    conda-forge
ipython                   7.3.0            py37h24bf2e0_0    conda-forge
ipython_genutils          0.2.0                      py_1    conda-forge
jedi                      0.13.2                py37_1000    conda-forge
jupyter_client            5.2.4                      py_1    conda-forge
jupyter_core              4.4.0                      py_0    conda-forge
libffi                    3.2.1             hf484d3e_1005    conda-forge
libgcc-ng                 7.3.0                hdf63c60_0    conda-forge
libsodium                 1.0.16            h14c3975_1001    conda-forge
libstdcxx-ng              7.3.0                hdf63c60_0    conda-forge
mock                      2.0.0                     <pip>
more-itertools            6.0.0                     <pip>
ncurses                   6.1               hf484d3e_1002    conda-forge
numpy                     1.16.1                    <pip>
openssl                   1.0.2p            h14c3975_1002    conda-forge
pandas                    0.24.1                    <pip>
parso                     0.3.4                      py_0    conda-forge
patsy                     0.5.1                     <pip>
pbr                       5.1.2                     <pip>
pexpect                   4.6.0                 py37_1000    conda-forge
pickleshare               0.7.5                 py37_1000    conda-forge
pip                       19.0.2                   py37_0    conda-forge
pluggy                    0.8.1                     <pip>
prompt_toolkit            2.0.9                      py_0    conda-forge
ptyprocess                0.6.0                 py37_1000    conda-forge
py                        1.7.0                     <pip>
pygments                  2.3.1                      py_0    conda-forge
pytest                    4.3.0                     <pip>
python                    3.7.1             hd21baee_1001    conda-forge
python-dateutil           2.8.0                      py_0    conda-forge
pytz                      2018.9                    <pip>
pyzmq                     18.0.0           py37h6afc9c9_0    conda-forge
readline                  7.0               hf8c457e_1001    conda-forge
requests                  2.21.0                    <pip>
scipy                     1.2.1                     <pip>
setuptools                40.8.0                   py37_0    conda-forge
six                       1.12.0                py37_1000    conda-forge
sqlite                    3.26.0            h67949de_1000    conda-forge
statsmodels               0.9.0                     <pip>
tk                        8.6.9             h84994c4_1000    conda-forge
tornado                   5.1.1           py37h14c3975_1000    conda-forge
traitlets                 4.3.2                 py37_1000    conda-forge
urllib3                   1.24.1                    <pip>
wcwidth                   0.1.7                      py_1    conda-forge
wheel                     0.33.1                   py37_0    conda-forge
xz                        5.2.4             h14c3975_1001    conda-forge
zeromq                    4.2.5             hf484d3e_1006    conda-forge
zlib                      1.2.11            h14c3975_1004    conda-forge

The change in this commit fixes the error stevenstetzler@5480427.

@leliel12
Copy link
Contributor

sorry the delay i gona check this in a week (phd thesis next 25)

@leliel12
Copy link
Contributor

leliel12 commented Apr 1, 2019

I back i gonna create a branch for this bug and add this test to the suite. Are you still want to help me with the code?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

2 participants