respecify length bins #31

Open
sgaichas opened this issue Jan 11, 2023 · 9 comments

Comments

@sgaichas
Collaborator

We started with equally spaced length bins spanning the range of observed lengths across fishery and survey data for each species. This used the data inefficiently: the smallest and largest bins were often missing data.

Options discussed in January 2023 include:

  • Start bin 1 from the minimum observed length, not 0
  • Combine bins 1-2? Finer structure thereafter? Bigger last bin? Plus group?

A new bin definition algorithm implemented in hydradata first calculates quantiles for each species from all input lengths aggregated over time. The current implementation uses the smaller of the survey and fishery 10th percentiles as the minimum size for bin width definition, and the larger of the survey and fishery 90th percentiles as the maximum size. Equal bin widths are calculated within this range; the first bin is then extended down to 0 and the last bin is extended up to the maximum observed length.
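
The rule described above can be sketched roughly as follows. This is an illustrative Python sketch only, not the hydradata code (which is an R package); the function names `quantile` and `define_length_bins`, the parameter names, and the linear-interpolation quantile method are all assumptions made for the example.

```python
def quantile(values, q):
    """Linear-interpolation quantile of a non-empty sequence, 0 <= q <= 1.

    The interpolation method is an assumption; hydradata may use another.
    """
    ordered = sorted(values)
    pos = q * (len(ordered) - 1)
    lower = int(pos)
    upper = min(lower + 1, len(ordered) - 1)
    frac = pos - lower
    return ordered[lower] + frac * (ordered[upper] - ordered[lower])


def define_length_bins(survey_lengths, fishery_lengths, n_bins=5,
                       lower_q=0.10, upper_q=0.90):
    """Return n_bins + 1 bin edges for one species (hypothetical helper).

    Lengths are assumed to be already aggregated over time.
    """
    # Smaller of the two lower percentiles, larger of the two upper ones.
    lo = min(quantile(survey_lengths, lower_q),
             quantile(fishery_lengths, lower_q))
    hi = max(quantile(survey_lengths, upper_q),
             quantile(fishery_lengths, upper_q))

    # Equal bin widths within [lo, hi].
    width = (hi - lo) / n_bins
    edges = [lo + i * width for i in range(n_bins + 1)]

    # Extend the first bin down to 0 and the last bin up to the max observed.
    edges[0] = 0.0
    edges[-1] = float(max(max(survey_lengths), max(fishery_lengths)))
    return edges


if __name__ == "__main__":
    # Made-up lengths purely for illustration, not mskeyrun data.
    survey = list(range(1, 101))
    fishery = list(range(21, 121))
    print(define_length_bins(survey, fishery))
```

Under this reading of the rule, the first interior divider sits one bin width above the lower percentile rather than at the percentile itself, because the percentile edge is replaced by 0 when the first bin is extended.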

A visualization of bin definitions (black vertical lines) for each species and aggregate dataset is below, based on the current (January 2023) mskeyrun Georges Bank dataset and 5 length bins:

image

image

Thoughts?

@sgaichas
Collaborator Author

sgaichas commented Jan 12, 2023

Implemented in new datasets called hydra_sim_GB_new5bin_1978_10F, commit f62fc60.

hydradata commit

@sgaichas
Collaborator Author

Apologies, the latest dat file push now runs with the hydra-sim.tpl currently in the main branch. I had neglected to remove the otherfoodterm that you have been commenting out by hand until now.

@sgaichas
Collaborator Author

Diagnostics from an "estimate everything" test run, including raw length comps divided into the new bins: posted here

@sgaichas
Collaborator Author

sgaichas commented Jun 7, 2023

@gavinfay when I make the new dat file with the 3 fleet inputs (and incorporating vulnerability), do you want the newer length bins shown here (in this issue) with wider "tails", or do you want the equally spaced bins? The current (May 2023) dat file you posted that adds vulnerability has the original equally spaced bins.

@gavinfay
Collaborator

gavinfay commented Jun 7, 2023 via email

@sgaichas
Collaborator Author

sgaichas commented Jun 8, 2023

New bins are added in 7550767.

@sgaichas
Collaborator Author

An update to the survey lengths in mskeyrun results in these new aggregate length distributions:
image

Corrections to the survey length data slightly changed the bin widths under the rules that determine them, and the time series of proportions in length bins have been updated. New dat and -ts.dat files are in 883fd3f.

The model runs with the new input files.

@gavinfay
Collaborator

Excellent. Thanks Sarah!
Those are some teeny cod caught in the survey :)
Odd that the 1st and 2nd length bin dividers don't seem to match the '10th percentile' rule, for quite a few of these (e.g. haddock, herring, mackerel, silver hake). Seems particularly odd for the haddock distribution (almost all the survey lengths are in the first bin). I know that the length bin structure is also taking into account fishery lengths so maybe that is influencing this.
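
One way to quantify a mismatch like this is to tabulate, separately for survey and fishery lengths, the proportion of observations that falls into each bin of a shared set of edges. The following is a minimal Python sketch; the function name `proportion_by_bin`, the bin-boundary convention, and the numbers in the example are assumptions for illustration, not hydradata code or mskeyrun data.

```python
from bisect import bisect_right


def proportion_by_bin(lengths, edges):
    """Proportion of lengths in each bin defined by ascending edges.

    Assumed convention: bin i covers [edges[i], edges[i + 1]), except the
    last bin, which also includes its upper edge. Values outside the edges
    are clamped into the first or last bin.
    """
    n_bins = len(edges) - 1
    counts = [0] * n_bins
    for length in lengths:
        i = bisect_right(edges, length) - 1
        i = max(0, min(i, n_bins - 1))  # clamp to a valid bin index
        counts[i] += 1
    total = len(lengths)
    return [c / total for c in counts]


if __name__ == "__main__":
    # Made-up edges and lengths purely for illustration.
    edges = [0.0, 30.0, 45.0, 60.0, 75.0, 100.0]
    survey = [5, 8, 10, 12, 14, 20, 22, 25, 40, 55]
    fishery = [35, 42, 48, 50, 58, 62, 66, 70, 80, 95]
    print("survey :", proportion_by_bin(survey, edges))
    print("fishery:", proportion_by_bin(fishery, edges))
```

Comparing the two vectors side by side for each species would show where one dataset concentrates in a single bin while the other spreads across several.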

@sgaichas
Collaborator Author

Yes, correct, the bins are defined based on both the fishery and the survey length distributions combined. Here are the fishery data for comparison:
image

So it could be this isn't the best approach for determining bins given the differences in the two datasets? I'm happy to explore alternatives.
