Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Error storing filter coefficients for sequenced ek80 file #1317

Open
julianblanco opened this issue May 7, 2024 · 12 comments
Open

Error storing filter coefficients for sequenced ek80 file #1317

julianblanco opened this issue May 7, 2024 · 12 comments
Labels
bug Something isn't working data conversion
Milestone

Comments

@julianblanco
Copy link

When trying to open ek80 data (2 channel system 70khz and 200khz) i get the following value error
ValueError: conflicting sizes for dimension 'channel': length 1 on 'WBT_filter_i' and length 2 on {'channel': 'frequency_nominal', 'pulse_length_bin': 'sa_correction'}

calling code
echodata = ep.convert.api.open_raw('/three_ensemble-Phase0-D20240506-T053349-0.raw', sonar_model='EK80')

version 0.8.4 mac, installed via pip in pycharm venv python ver 3.11

link to file: https://drive.google.com/file/d/1doLT8V3CFXNxkulIKtV1-BXMwvm0ln5N/view?usp=sharing

@ctuguinay
Copy link
Collaborator

ctuguinay commented May 9, 2024

@leewujung This is interesting and I was able to get the same error message. What are your thoughts on this? One of the channels is missing the filter values, or they're assumed to be shared for both?
These are the transducer names from the datagram: ['ES70-18CD', 'ES200-7CDK-Split']

@leewujung
Copy link
Member

Hmm, the filters are specific to each channel, so could not be shared. I think I've seen files before that have a subset of filter values missing, and those parsed fine before. It is possible that some later changes we introduced made those not allowable, and we did not have tests covering that.

@julianblanco
Copy link
Author

Thanks for the quick attention. This is from a WBT mini system saving to a usb drive rather than collected via the ek80 software.

Let me know if i can help in any way. I wrote something to filter out some of the stuff and stitch a echogram but id be interested in helping to fix the underlying issue and learn how to make contributions back upstream.

@ctuguinay
Copy link
Collaborator

ctuguinay commented May 10, 2024

@julianblanco That would be great if you could help us out with this!

Here's the doc for how to contribute: https://echopype.readthedocs.io/en/stable/contributing.html
We're currently no longer using the dev branch so point any PR to main instead.

Here's the doc for how data is organized in the raw Echodata object: https://echopype.readthedocs.io/en/stable/data-format-raw.html. Scroll down to the EK80 part and take a look at Vendor_specific. A valid Vendor_specfic dataset should be one such that the number of individual WBT_filter_n arrays matches the number of individual channels/tranducers.

In the code itself, I would first look at the call to create the Vendor_specific dataset and work backward from there:

echopype/echopype/convert/api.py:487:
...
tree_dict["Vendor_specific"] = setgrouper.set_vendor()

It could be the case that the filtering of the original datagrams removed a channel's WBT_filter_n array, but that is just a guess. However, this case of a 'missing' array should be allowed and there should be code somewhere in the Vendor_specific dataset creation that allows for NaN padding of this missing array, and some logger warning when this NaN padding happens.

@ctuguinay
Copy link
Collaborator

Also, once you set up the test data suite, you can see how the Vendor_specific creation should 'run' on a few other EK80 examples. That may clue you into what part(s) are missing and/or need NaN padding.

@leewujung
Copy link
Member

leewujung commented Jun 4, 2024

Ha, I've got some new files that also have this problem.
@julianblanco: were you able to make some head way into this?

@leewujung leewujung added bug Something isn't working data conversion labels Jun 4, 2024
@leewujung leewujung added this to the v0.9.0 milestone Jun 4, 2024
@github-project-automation github-project-automation bot moved this to Todo in Echopype Jun 4, 2024
@leewujung
Copy link
Member

This source of the problem is that the WBT and PC filter coefficients in the file only exist for one of the channels -- that is the channel that is actually ON. But, all other parameters stored in the Vendor_specific group has 2 channels. We can resolve this by padding NaNs for the channel that was not transmitting, somewhere between:

Code section:

# Save decimation factors and filter coefficients
# Param map values
# 1: wide band transceiver (WBT)
# 2: pulse compression (PC)
param_map = {1: WIDE_BAND_TRANS, 2: PULSE_COMPRESS}
coeffs_and_decimation = {
t: {FILTER_IMAG: [], FILTER_REAL: [], DECIMATION: []} for t in list(param_map.values())
}
for ch in self.sorted_channel["all"]:
fil_coeffs = self.parser_obj.fil_coeffs.get(ch, None)
fil_df = self.parser_obj.fil_df.get(ch, None)
if fil_coeffs and fil_df:
# get filter coefficient values
for type_num, values in fil_coeffs.items():
param = param_map[type_num]
coeffs_and_decimation[param][FILTER_IMAG].append(np.imag(values))
coeffs_and_decimation[param][FILTER_REAL].append(np.real(values))
# get decimation factor values
for type_num, value in fil_df.items():
param = param_map[type_num]
coeffs_and_decimation[param][DECIMATION].append(value)
# Assemble everything into a Dataset
ds = xr.merge([ds_table, ds_cal])
# Add the coeffs and decimation
ds = ds.pipe(self._add_filter_params, coeffs_and_decimation)
# Save the entire config XML in vendor group in case of info loss
ds["config_xml"] = self.parser_obj.config_datagram["xml"]

And the function _add_filter_params:
def _add_filter_params(

@leewujung leewujung modified the milestones: v0.9.0, v0.9.1 Jun 15, 2024
@leewujung leewujung changed the title issue opening ek80 file Error storing filter coefficients for sequenced ek80 file Jun 15, 2024
@ctuguinay ctuguinay modified the milestones: v0.9.1, v0.9.0 Jul 4, 2024
@ctuguinay ctuguinay modified the milestones: v0.9.0, v0.9.1 Jul 16, 2024
@ctuguinay
Copy link
Collaborator

@leewujung Push this to 0.9.1? We didn't encounter anything like this on Shimada or Lasker

@leewujung
Copy link
Member

Agreed!

@ctuguinay ctuguinay modified the milestones: v0.9.0, v0.9.1 Jul 23, 2024
@spacetimeengineer
Copy link
Contributor

@leewujung Hi! Do you think NaN padding is the best long-term solution? I’ll need to replicate the issue first before making any recommendations, but I’d love to hear if there’s a 'dream scenario' solution that could address this.

@leewujung
Copy link
Member

Hey @spacetimeengineer! I think NaN padding would work fine, since it’s easy to drop them when needed. These coefficients are also small so space is not an issue (though zarr compressed very well as we’ve found for the backscatter data).

@spacetimeengineer
Copy link
Contributor

Sounds good to me. Thanks.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working data conversion
Projects
Status: Todo
Development

No branches or pull requests

4 participants