This repository has been archived by the owner on Feb 24, 2023. It is now read-only.

calculation of "Species Most Affected" on report #1197

Open
JChipault opened this issue Apr 29, 2021 · 10 comments

@JChipault
Collaborator

I'm pretty sure Neil already talked to Blake about this one and they decided it was a lower-priority fix, but we realized it wasn't logged in GitHub, so I'm putting it in as a lower-priority issue.

Undesired Design or Confusing behavior
The "Species Most Affected" on the front page of the search summary report seems to be based on how frequently a species appears, but we want it based on the total number affected per species (see the metadata at the back of the report).

If Possible, Describe a Solution
Change calculation for "Species Most Affected" to reflect the species with most affected based on sick and dead numbers. Neil found code here that calculates based on sum of affected, might be helpful (maybe not?): 5bd7121

Screenshots
image

If you run the search in the screenshot above, the 'White-tailed deer' was found in the most events (20 when search run in early April). The most 'affected' by numbers was the American Coot (5,054).
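
To illustrate the difference between the two calculations, here is a minimal Python sketch. It is not the report's actual client code; the record shape and the field names `species` and `affected` are assumptions, and the numbers are made up for illustration (they are not the counts from the search above).

```python
from collections import Counter

# Hypothetical records: one entry per species reported at an event location.
# The values are invented purely to show how the two rankings can diverge.
records = [
    {"event": 1, "species": "White-tailed deer", "affected": 1},
    {"event": 2, "species": "White-tailed deer", "affected": 2},
    {"event": 3, "species": "White-tailed deer", "affected": 1},
    {"event": 4, "species": "American Coot", "affected": 500},
]


def most_frequent_species(records):
    """Rank by how many times a species is reported, ignoring counts."""
    occurrences = Counter(r["species"] for r in records)
    return occurrences.most_common(1)[0][0]


def most_affected_species(records):
    """Rank by the summed affected count for each species."""
    totals = Counter()
    for r in records:
        totals[r["species"]] += r["affected"]
    return totals.most_common(1)[0][0]


print(most_frequent_species(records))
print(most_affected_species(records))
```

With these made-up records, the frequency ranking picks the species reported most often, while the affected-count ranking picks the species with the largest summed count, which is the behavior requested here.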

@JChipault JChipault added the UI User Interface label Apr 29, 2021
@BlakeDraper
Contributor

sharing the email thread here for posterity:

Blake to Neil:

Neil,

I looked into this and I think I know what is going on. My best guess is your memory is pointing to a discussion about calculation of affected count that happened with Aaron a while back, which only affected the server side. I remember that too. But this search results summary report is entirely composed on the client side. Looking at the code, the "Species Most Affected" is indeed being calculated based on the number of times a species appears separately in an event at a different location, irrespective of the affected count at the location. So the report is behaving as expected, or at least as written. I see no evidence this was previously calculated in a different way on the client. So it was either intentional at the time based on discussion, or erroneously written, but either way it seems this is the first we are catching it.

Moving forward, I assume you want this to be updated to calculate based on the affected count for each species?

Neil response:

Got it. Thanks for taking a look.
We'd eventually like to change the calculation to be based on species' affected count, but in the interest of keeping our limited budget to critical bug fixes, I wonder if updating the report label and metadata text would be much easier/quicker at this time. Would you agree? Would this be something like a 4 hr vs 15 min fix?

@BlakeDraper BlakeDraper self-assigned this May 24, 2021
@BlakeDraper
Contributor

BlakeDraper commented Jun 8, 2021

This is more like a 4-hour fix because the EventSummaries response does not currently have the data to make this calculation possible. It only contains the total affected count for the whole event, and has no species-level affected count. This was by design at the time. Obviously that species-level affected count can be found at the event details page and the eventdetails endpoint, but it was not included here. Since we are using the species object (generic) in the EventSummaries response and not the locationspecies (specific), the affected count is not naturally a part of this response at all. We may want to pursue calculating that in the serializer and including it as a bonus top-level field rather than totally revamping the response structure. @aaronstephenson your opinion on that?

So for a quick immediate fix, the best thing is to rename the field on the report. I am thinking "Species with highest occurrence" but will make it whatever you want @nbaertlein or @JChipault .

@aaronstephenson

Yeah, adding it as a serializer field in the back end is fine with me.

@JChipault
Collaborator Author

Let's do "Most Frequent Species" to match "Most Frequent Event Diagnosis"

@BlakeDraper
Contributor

@aaronstephenson if that is a quick change you can make soon, then I will wait on this on my end. If it may take you a few weeks to get to it, I will make this label change in the meantime.

@aaronstephenson

I could probably get to it this week, but I'll need to know the specifics of what needs to be done.

@BlakeDraper
Contributor

I will create an issue in the backend repo for this.

@BlakeDraper
Contributor

BlakeDraper commented Jun 11, 2021

So Aaron went ahead and implemented something on the backend that I asked for (see here: USGS-WiM/whispersservices#489) but I now realize that it will not meet our needs. This mistake is my fault entirely.

Here is the problem: the Summary Report, by definition and design, does not include Location Species level affected numbers. It does sort of incidentally in the cases where there is one species at a location, and that can be paired with the '# of Animals Affected' column to determine how many of that species were affected. But anywhere there are multiple species, the summary response puts all of those numbers into a single sum, with no information on the respective count for each.

That data is available at the Event Details page and the eventdetails response. In order to calculate the species with the highest affected count, the server will need to query the locationspecies data for every location in the search response and append a custom object with the count per species so that 'Species Most Affected' can be calculated for the report. This is a significant change for the summary response, which was designed to be lighter weight and more performant to get the 'gist' before the user drills down for more details. It feels like scope creep to task it with querying that additional data to serve this single field on the report.
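
To make that extra work concrete, here is a minimal Python sketch of the per-species aggregation described above. It is not the whispersservices code: the row shape and the field names `species`, `sick_count`, and `dead_count` are assumptions for illustration, and the real rows would come from a locationspecies query rather than an in-memory list.

```python
from collections import defaultdict


def affected_by_species(location_species_rows):
    """Sum sick + dead counts per species across all location species rows.

    Missing or null counts are treated as zero (an assumption of this sketch).
    """
    totals = defaultdict(int)
    for row in location_species_rows:
        sick = row.get("sick_count") or 0
        dead = row.get("dead_count") or 0
        totals[row["species"]] += sick + dead
    return dict(totals)


def species_most_affected(location_species_rows):
    """Return the species with the highest summed count, or None if no rows."""
    totals = affected_by_species(location_species_rows)
    if not totals:
        return None
    return max(totals, key=totals.get)


# Hypothetical rows: two species at one location, one species at another.
rows = [
    {"location": 10, "species": "Mallard", "sick_count": 2, "dead_count": 3},
    {"location": 10, "species": "American Coot", "sick_count": None, "dead_count": 40},
    {"location": 11, "species": "Mallard", "sick_count": 1, "dead_count": 0},
]

print(affected_by_species(rows))
print(species_most_affected(rows))
```

The aggregation itself is cheap; the cost discussed above comes from having to fetch these rows for every location in the search response.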

For the time being, I am going to make the field label change to "Most Frequent Species" as specified above so as not to hold up the publishing of any further features. We need to discuss this decision more as a team to proceed with a material change to what that field is showing.

attn: @nbaertlein @JChipault @aaronstephenson

@JChipault
Collaborator Author

@BlakeDraper I should've pointed this out more explicitly, sorry, but we'll need the metadata for this report updated too. Can it read "Top species based on the number of events with that species reported"?

@BlakeDraper
Contributor

This is done in an update to v2.16.9. Now available on test, soon available in Prod.
