Pandas error when saving dummy data in feather format #901

Open
evansd opened this issue Dec 23, 2022 · 0 comments

I haven't looked into this in any detail, but I opened a random GitHub Actions log for a random project just to copy some example log output and spotted this:

     2022-12-19 09:18:33 [info     ] cohortextractor-stats          [cohortextractor.log_utils] description=generate_cohort execution_time=0:00:00.159524 execution_time_secs=0.1595239189999802 index_date=all state=error study_definition=study_definition_controlfinal time=679.542627243 timing=stop timing_id=0
     Exception at 2022-12-19 09:18:33 UTC
     Traceback (most recent call last):
       File "/opt/venv/bin/cohortextractor", line 8, in <module>
         sys.exit(main())
       File "/app/cohortextractor/cohortextractor.py", line 891, in main
         generate_cohort(
       File "/app/cohortextractor/cohortextractor.py", line 167, in generate_cohort
         _generate_cohort(
       File "/app/cohortextractor/cohortextractor.py", line 240, in _generate_cohort
         study.to_file(
       File "/app/cohortextractor/study_definition.py", line 114, in to_file
         dataframe_to_file(df, filename)
       File "/app/cohortextractor/pandas_utils.py", line 17, in dataframe_to_file
         df.to_feather(filename, compression="zstd")
       File "/opt/venv/lib/python3.8/site-packages/pandas/util/_decorators.py", line 207, in wrapper
         return func(*args, **kwargs)
       File "/opt/venv/lib/python3.8/site-packages/pandas/core/frame.py", line 2681, in to_feather
         to_feather(self, path, **kwargs)
       File "/opt/venv/lib/python3.8/site-packages/pandas/io/feather_format.py", line 67, in to_feather
         raise ValueError(
     ValueError: feather does not support serializing <class 'pandas.core.indexes.base.Index'> for the index; you can .reset_index() to make the index into column(s)

That's from:
https://github.com/opensafely/covid-vaccine-effectiveness-seqtrial/actions/runs/3729963891/jobs/6326466775#step:3:6251

Which was running this action:
https://github.com/opensafely/covid-vaccine-effectiveness-seqtrial/blob/11c1d8381b1d3b3ffef643429315baa2df2de13b/project.yaml#L308-L317

With this version of Cohort Extractor:
https://github.com/opensafely-core/cohort-extractor/pkgs/container/cohortextractor/57403067?tag=1.80.10
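
A minimal sketch of the workaround the error message itself suggests, assuming the cause is simply that the dummy-data frame reaches `to_feather` with a non-default index (I haven't confirmed how it ends up that way). `ensure_feather_safe_index` and `dataframe_to_feather` are hypothetical names for illustration, not functions in cohort-extractor; only `reset_index()` and the `to_feather(filename, compression="zstd")` call come from pandas and the traceback above.

```python
import pandas as pd


def ensure_feather_safe_index(df: pd.DataFrame, drop: bool = False) -> pd.DataFrame:
    """Return a frame whose index is the default 0..n-1 RangeIndex.

    Hypothetical helper: not part of cohort-extractor or pandas.
    """
    default_index = pd.RangeIndex(len(df))
    if isinstance(df.index, pd.RangeIndex) and df.index.equals(default_index):
        # Already the default index, nothing to do.
        return df
    # reset_index() moves the old index into a column (or discards it when
    # drop=True) and installs a fresh RangeIndex in its place.
    return df.reset_index(drop=drop)


def dataframe_to_feather(df: pd.DataFrame, filename: str) -> None:
    # Same call as in the traceback, but on a frame with a default index.
    # Writing feather requires pyarrow to be installed.
    ensure_feather_safe_index(df).to_feather(filename, compression="zstd")


# Example frame with a plain (non-default) Index, as named in the error message.
example = pd.DataFrame({"patient_id": [1, 2, 3]}, index=pd.Index(["a", "b", "c"]))
safe = ensure_feather_safe_index(example)
print(type(safe.index).__name__, list(safe.columns))
```

Whether the old index should be kept as a column or dropped depends on whether it carries real data, which is why `drop` is left as a parameter here instead of being decided in the helper.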
