This repository has been archived by the owner on May 28, 2024. It is now read-only.

Inspect hidden states for process understanding #45

Open
jsadler2 opened this issue Jan 26, 2022 · 8 comments
Assignees: jsadler2
Labels: experiment (something we want to try out); process-guidance (having to do with adding (or gleaning) process understanding to/from the model)

Comments

@jsadler2
Collaborator

An idea that has come up is to inspect the hidden states to see whether they behave as we would expect some state or flux in the process to behave.

The two examples that have come up are:

  1. biomass: some representation of biomass that accumulates in the summer, is lower in the winter, and maybe decreases sharply after a scouring event
  2. discharge: we aren't giving the baseline model discharge as an input. Does the model have a discharge-like hidden state?

We can look into answering these questions with the baseline LSTM model (#40).
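As a loose sketch of what this inspection could look like (a hypothetical PyTorch stand-in, not the repo's actual model code; the 3-input / 10-state sizes echo the dimensions discussed later in this thread):

```python
# Hypothetical sketch: run driver data through a small LSTM and keep the
# full hidden-state time series so each h_i can be plotted over time.
import torch
import torch.nn as nn
import matplotlib.pyplot as plt

n_inputs, n_hidden = 3, 10            # e.g. 3 met drivers -> 10 hidden states
lstm = nn.LSTM(n_inputs, n_hidden, batch_first=True)
head = nn.Linear(n_hidden, 3)         # DO min / mean / max

x = torch.randn(1, 365, n_inputs)     # one year of daily drivers (placeholder)
with torch.no_grad():
    h_series, _ = lstm(x)             # (1, 365, n_hidden): h_t at every step
    do_preds = head(h_series)         # predicted DO from the hidden states

fig, axes = plt.subplots(n_hidden, 1, sharex=True, figsize=(8, 12))
for i, ax in enumerate(axes):
    ax.plot(h_series[0, :, i])
    ax.set_ylabel(f"h_{i}")
plt.show()
```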

jsadler2 added the experiment and process-guidance labels Jan 26, 2022
jsadler2 self-assigned this Feb 11, 2022
@jsadler2
Collaborator Author

I started looking at this a little bit and it's interesting! A little mysterious, but interesting.

Here are a few plots so far:

Randomized weights

I first ran the model with randomized weights to get "DO" predictions and hidden states, as a baseline to compare the trained run against:

[figure: "DO" preds, randomized weights (do_rand)]

[figure: hidden states, randomized weights (h_rand)]

Trained weights

[figure: DO preds, trained weights (do)]

[figure: hidden states, trained weights (h)]
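(For the randomized-weights baseline, an identical but untrained copy of the network is enough, since a fresh init is random. Continuing the hypothetical PyTorch stand-in from the sketch above, not the actual repo code:)

```python
# Hypothetical sketch: a freshly constructed model has randomly initialized
# weights, so its "DO" preds and hidden states give a no-training baseline.
rand_lstm = nn.LSTM(n_inputs, n_hidden, batch_first=True)  # fresh init = random
with torch.no_grad():
    h_rand, _ = rand_lstm(x)          # compare against h_series from training
```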

@lekoenig
Collaborator

Awesome, @jsadler2 - thanks for sharing these! 🚀

I'm wondering about how/whether I should compare the h time series in the randomized-weights plots above versus the trained-weights plots below. The weights more or less determine how the inputs are mapped to outputs, right? So in the model run with randomized weights, it seems like the model is learning the relationship between air temperature and DO (e.g. I'm looking at h_2), but that dynamic isn't really maintained in h_2 with the trained weights. It's also interesting that predicted daily mean DO is often greater than the predicted daily max DO for this model run (and that min DO doesn't really have much seasonality) - does that just point to the importance of model training?

For the model run with trained weights, h_3 and h_9 jump out to me, as the model appears to be inferring some dynamic that is relatively high from Oct–May compared to the rest of the year. h_1 looks sort of like a hydrograph (see below), and I can imagine h_0 as some combination of h_3 and h_1. Mysterious but interesting, indeed!

Also interested in your thoughts/interpretation so far, and whether I'm reading these tea leaves appropriately with regard to the randomized versus trained weights.

[figure: observed discharge hydrograph (discharge)]

@jsadler2
Collaborator Author

does that just point to the importance of model training?

Yeah. The randomized-weight output is just that - it's totally random. So the seasonal trend we are seeing in h_2 is basically just the trend in the temperature inputs randomly coming through. The model hasn't seen any DO data, so it's all just noise... and yes, that is especially apparent in the "DO" predictions. They are all over the place and unrealistic (mean > max).

h_3 and h_9 jump out to me

Those stood out to me too. So interesting that there are these two very distinct seasons. That is most apparent in h_3, but I think you could argue that there are two distinct regimes in almost all of the states. For example:

  • h_7 has negative periods and positive periods,
  • h_8 has times with very little variability (winter) and very volatile times (spring–fall).

h_1 looks sort of like a hydrograph

I agree. It would be interesting to plot those in the same figure.
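(Something like this could do it - a hypothetical sketch where `q` is a placeholder for observed discharge aligned to the model dates, and `h_series` comes from the stand-in code in the first comment:)

```python
# Hypothetical sketch: overlay hidden state h_1 on the hydrograph.
import numpy as np
import matplotlib.pyplot as plt

q = np.random.lognormal(size=365)     # stand-in for observed discharge
h1 = h_series[0, :, 1].numpy()        # h_1 from the trained run

fig, ax = plt.subplots()
ax.plot(h1, color="tab:blue", label="h_1")
ax2 = ax.twinx()                      # second y-axis so scales don't fight
ax2.plot(q, color="tab:gray", alpha=0.6, label="Q")
ax.set_ylabel("h_1")
ax2.set_ylabel("Q")
fig.legend(loc="upper right")
plt.show()
```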

whether I'm reading these tea leaves appropriately with regard to the randomized versus trained weights.

That is how I'm reading the tea leaves too :)

One other thing that stood out to me is the sudden drops in h_0. For example, what was it in mid-April 2018 that caused the sudden drop in that state? There are many similar patterns in h_0, but the mid-Apr '18 drop is roughly the biggest in magnitude.

@lekoenig
Collaborator

The model hasn't seen any DO data

Thanks for this explanation, Jeff - that makes a lot more sense. I didn't realize the model hadn't seen any DO data in the randomized version (and that's probably why you had DO in quotes 😃 )

@lekoenig
Collaborator

One other thing that stood out to me is the sudden drops in h_0. For example, what was it in mid-April 2018 that caused the sudden drop in that state? There are many similar patterns in h_0, but the mid-Apr '18 drop is roughly the biggest in magnitude.

Yeah, that's interesting. I haven't looked at the input variable time series, but it does look like there was a storm during mid-April 2018. It's not the biggest storm in the record (or even in that year), but we might expect some storms to be more consequential than others if they occur during windows of time that are conducive to relatively high biological activity. That'd be pretty cool if the model could pick up on that.

[figure: discharge, Jan–May 2018, showing a mid-April storm (discharge_april)]

library(ggplot2)
library(magrittr)  # for %>%

# Plot instantaneous discharge for site 01480870, Jan-May 2018
dataRetrieval::readNWISuv(
  siteNumbers = "01480870", parameterCd = c("00060", "00300"),
  startDate = "2018-01-01", endDate = "2018-05-01", tz = "UTC") %>%
  dataRetrieval::renameNWISColumns(p00300 = "DO", p00060 = "Q") %>%
  ggplot() + geom_line(aes(x = dateTime, y = Q_Inst)) +
  labs(x = "") + scale_x_datetime(date_labels = "%Y-%m") +
  theme_classic()

@galengorski
Collaborator

This is a really interesting conversation. It makes me wonder:

  1. how consistent are these states from model run to model run at the same site and from site to site
  2. is there a way to label or group these in such a way that when you run the model again and the new h_7 looks like the old h_0 (for example) we have a way to track that, maybe with entropy or a spectral signature or something (see the sketch after this list)? If I were to group the hidden states of the trained model naively, I would say that h_3, h_4, h_5, and h_9 are dominated by lower-frequency signals, h_2, h_6, and h_7 are dominated by higher frequencies, and h_0, h_1, and h_8 are kind of a mix. Maybe at a new site or new model realization, the groupings would be similar even if the numbers are mixed up.
  3. what about also tracking the weights of the hidden states? Wouldn't that tell us when these hidden states are more or less important for DO?
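One cheap way to attempt the grouping in item 2 could be a spectral signature: the fraction of each state's variance at low frequencies (a hypothetical sketch with an arbitrary 30-day cutoff; `h_series` as in the earlier stand-in code):

```python
# Hypothetical sketch: label each hidden state by how much of its spectral
# power sits at periods longer than ~30 days, so states can be matched
# across runs/sites even if their indices shuffle.
import numpy as np

def low_freq_fraction(series, cutoff=1 / 30):
    """Fraction of power at frequencies below `cutoff` (cycles/day)."""
    series = series - series.mean()
    power = np.abs(np.fft.rfft(series)) ** 2
    freqs = np.fft.rfftfreq(len(series), d=1.0)
    return power[freqs < cutoff].sum() / power.sum()

h = h_series[0].numpy()                          # (time, n_hidden)
fracs = {i: low_freq_fraction(h[:, i]) for i in range(h.shape[1])}
slow = [i for i, f in fracs.items() if f > 0.8]  # "lower-frequency" group
fast = [i for i, f in fracs.items() if f < 0.5]  # "higher-frequency" group
```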

@jsadler2
Collaborator Author

  1. how consistent are these states from model run to model run at the same site and from site to site

Good question. I'll do a couple runs today to see how they compare across model runs. I can also look across sites.

  2. is there a way to label or group these

I'm also wondering if dynamic time warping would be a good way to measure similarity.
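(A bare-bones version, in case it's useful - a plain dynamic-programming DTW rather than any particular library:)

```python
# Hypothetical sketch: classic O(n*m) DTW distance between two series,
# e.g. a hidden state vs. a standardized hydrograph.
import numpy as np

def dtw_distance(a, b):
    n, m = len(a), len(b)
    cost = np.full((n + 1, m + 1), np.inf)
    cost[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            cost[i, j] = d + min(cost[i - 1, j],      # insertion
                                 cost[i, j - 1],      # deletion
                                 cost[i - 1, j - 1])  # match
    return cost[n, m]

# e.g. dtw_distance(zscore(h[:, 1]), zscore(q)) with scipy.stats.zscore
```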

  3. what about also tracking the weights of the hidden states? Wouldn't that tell us when these hidden states are more or less important for DO?

Good idea. If I'm understanding this right, though, the weights are just a static 10x3 (or 3x10?) matrix, so I don't think they would tell us anything about importance over time, but I do think they'd be worth looking at too.
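(One way to get at time-varying importance despite the static weight matrix: the contribution of state i to an output at time t is w_i * h_i(t), which does change through time. Hypothetical sketch, reusing `head` and `h` from the stand-in code above:)

```python
# Hypothetical sketch: the output weights are static, but each state's
# contribution to a given output, w_i * h_i(t), varies in time.
W = head.weight.detach().numpy()      # (3, n_hidden): one row per DO output
contrib = h * W[0]                    # (time, n_hidden) contributions to output 0
```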

jsadler2 added 8 commits to jsadler2/drb-do-ml that referenced this issue Feb 23, 2022
jsadler2 added 4 commits to jsadler2/drb-do-ml that referenced this issue Mar 16, 2022
@lekoenig
Collaborator

Jeff has some useful Python code in his forked version of the repo for plotting the hidden states. @jsadler2, do you have anything to add here, or any steps you think we should take based on the commits referenced above? Or can we close this issue?

lekoenig reopened this Jul 28, 2022