Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Avoid deprecated sample_rate in our own code #231

Draft
wants to merge 3 commits into
base: master
Choose a base branch
from

Conversation

DimitriPapadopoulos
Copy link
Contributor

@DimitriPapadopoulos DimitriPapadopoulos commented Jun 28, 2023

Fixes #198.

At least I think it should. Let's wait for feedback.

The idea is that we should be able to read files with sample_rate for backwards compatibility, but not write files with sample_rate. Of course, this means that older versions of pyedflib won't be able to read files produced with newer versions of pyedeflib. Hence, I guess, the CI failure:

            for shead1, shead2 in zip(signal_headers1, signal_headers2):
                # When only 'sample_rate' is present, we use its value to write
                # the file, ignoring 'sample_frequency', which means that when
                # we read it back only the 'sample_rate' value is present.
>               self.assertDictEqual({**shead1, 'sample_frequency': shead1['sample_rate']},
                                     shead2)
E               KeyError: 'sample_rate'

An alternative would be to output warnings when reading files, only when the values of sample_rate and sample_frequency are different.

Thoughts?

* Keep the documentation inconsistent with the actual code on purpose:
  While the default value of `sample_frequency` is documented to be 256,
  it is `None`. That's because we need to make a difference between 256
  as a default value and 256 as an explicit value. If `sample_frequency`
  is explicitly set to 256, we keep that value, as a matter of course.
  On the other hand, if it is `None`, which is the actual default value,
  we switch to the value of the deprecated `sample_rate`, the default
  value of which is 256.
* Output a warning in case of conflicting values only.
* When setting channel headers, change occurrences of `sample_rate` to
  `sample_frequency`.
* Output a warning in case of conflicting values only.
* Set default values for `sample_frequency`, not `sample_rate`!
* Remove documentation of inexistent parameter.
@cbrnr
Copy link
Contributor

cbrnr commented Jun 28, 2023

I still get the same error when running the test. I hope I installed the branch correctly:

pip install -U git+https://github.com/DimitriPapadopoulos/pyedflib@make_signal_header#egg=pyedflib

@DimitriPapadopoulos DimitriPapadopoulos marked this pull request as draft July 2, 2023 19:09
@DimitriPapadopoulos
Copy link
Contributor Author

Not sure how to install a specific branch with pip. I installed myself from a repository clone with python setup.py develop.

@skjerns How do you think compatibility should be handled?

  1. Should we write files with both sample_frequency and sample_rate, so that legacy software using older versions of pyedflib can still read new files?
  2. If the answer to the above is yes, when reading files, shouldn't we emit a deprecation warning only when sample_frequency is missing or if the values of sample_frequency and sample_rate are different? Otherwise, we get a deprecation warning when reading our own files we have just written.

@skjerns
Copy link
Collaborator

skjerns commented Nov 8, 2023

  1. Should we write files with both sample_frequency and sample_rate, so that legacy software using older versions of pyedflib can still read new files?

I think there is a confusion - in all cases, neither sample_rate nor sample_frequency are written to the file. Internally, only a smp_per_record is written and a time-span for the record_duration. If that record_duration is 1 seconds, the resulting smp_per_record equals to Hz. The usage of sample_frequency is only used to avoid confusion, as the definition of smp_per_record/record_duration is not intuitive (if the record_duration is not 1s, the sample_rate is no longer in Hz, which you would intuitively expect). However, afaik, previously, sample_rate was treated as Hz, even though it was no longer the case when record_duration was not 1.

  1. If the answer to the above is yes, when reading files, shouldn't we emit a deprecation warning only when sample_frequency is missing or if the values of sample_frequency and sample_rate are different? Otherwise, we get a deprecation warning when reading our own files we have just written.

I think we should only give this warning if the user explicitly sets sample_rate. If subfunctions write sample_rate, that should be no problem.

That means:

  • Raise DeprecationWarning only if the user explicitly tries to set it.
  • Raise no warning if subfunctions use sample_rate without user interference.

I'll try something in a different branch

@DimitriPapadopoulos
Copy link
Contributor Author

Indeed, an EDF file contains:

8 ascii : duration of a data record, in seconds
4 ascii : number of signals (ns) in data record

ns * 8 ascii : ns * nr of samples in each data record

Not sure I understand how it works out for several non-contiguous recordings. Any way, I will let you figure out an elegant solution, unless you want help/input on the subject.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

'sample_rate' is deprecated, but pyedflib still uses it internally
3 participants