PVNet Question: Inquiry on Processing Y Value (PVgen Value) #10

Sukh-P · 2024-10-15T15:48:34Z

Sukh-P
Oct 15, 2024
Maintainer

To reduce duplication in answering questions, and in line with OCF's plan to move discussions that aren't repo-specific to a more centralised place, I am moving a discussion from HuggingFace over to here.

Answering the questions from there:

[1] about your Y-value that you are studying the model, are you accumulating all PV generation that exist within that GSP-region? or is it just ONE PV generator per GSP?

We use a data source from Sheffield Solar (PVLive), which gives an estimate of the total generation from all generators in a GSP region. That regional total is what we are trying to forecast.

[2] excerpt from paper: We train our model to predict the PV outturn divided by the effective capacity...Normalising by effective capacity ensures that... does this mean each timestep of output from your deep learning model will be between [0,1]? If you were to use it to predict actual PV generation, would you simply multiply by the effective capacity?

Yes, in most cases the target values will be in [0, 1]. In some rare cases the generation value may be higher than the stated capacity, so you can get values slightly > 1. And yes, after we get the outputs from our PVNet model we multiply by the effective capacity to get the forecasted generation in MW.
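A minimal sketch of the scaling step described above, assuming a single effective-capacity value for the region. The function name and all numbers are hypothetical and are not taken from the PVNet code.

```python
def to_megawatts(normalised_forecast, effective_capacity_mw):
    """Scale normalised model outputs (roughly in [0, 1]) back to MW."""
    return [y * effective_capacity_mw for y in normalised_forecast]


# Hypothetical normalised outputs for one GSP region.
# The last value is slightly above 1, which can happen when generation
# exceeds the stated capacity.
normalised_forecast = [0.0, 0.25, 0.5, 1.02]
effective_capacity_mw = 200.0  # hypothetical capacity estimate

forecast_mw = to_megawatts(normalised_forecast, effective_capacity_mw)
print(forecast_mw)
```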

[2-1] If there are multiple pv-generators in a GSP-region, how do you deal with "effective capacity" for Y-value (and later back to MW of PV generation?)

The total effective capacity for a GSP region at each timestamp is also given by PVLive; it is an estimate of the combined effective capacity of all generators in that region.

[2-2] May I please ask for link from your code whether from PVNet or OCF-Data?

What sort of code are you looking for? I think I will need more detail to point you to the right place. The code to open data related to GSP regions is here. ocf-data-sampler is where we are moving our data processing code to (from ocf_datapipes), so it should be easier to go through.

On the second set of questions on the HF thread:

Did you aggregate all the PV generation within each GSP region and treat it as if it were a single generator?

Please see answer to 1 above.

How did you handle changes in capacity within each GSP region over time (as capacity either increased or decreased)? (mainly if they add or disappear, how would that not affect your model…?)

In the data source we use, a capacity value is given for each timestamp for which we have generation data, so the yield (normalised PV output) should account for capacity changes over time. Of course no data source is perfect, and this is unlikely to be 100% accurate. When we make live forecasts, we try to use the most up-to-date estimates of capacity to get a MW forecast.
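A small sketch of why per-timestamp capacity keeps the target stable as capacity changes, assuming generation and capacity arrive as aligned lists. The function name and all values are hypothetical.

```python
def normalise(generation_mw, capacity_mw):
    """Divide each generation value by the capacity reported for the same timestamp."""
    if len(generation_mw) != len(capacity_mw):
        raise ValueError("generation and capacity must cover the same timestamps")
    return [g / c for g, c in zip(generation_mw, capacity_mw)]


# Hypothetical history: capacity in the region grows over time,
# and generation grows with it.
generation_mw = [50.0, 60.0, 75.0]
capacity_mw = [100.0, 120.0, 150.0]

# The normalised target stays comparable across the capacity change.
targets = normalise(generation_mw, capacity_mw)

# For a live forecast, scale by the most recent capacity estimate.
latest_capacity_mw = capacity_mw[-1]
live_forecast_mw = [y * latest_capacity_mw for y in [0.4, 0.6]]
```

Because each timestamp is divided by its own capacity, a generator being added or removed shifts the MW values but not the normalised series the model learns from.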

I hope these answers are useful! Thanks

kwon-encored · 2024-10-21T08:46:08Z

kwon-encored
Oct 21, 2024

@Sukh-P
Hi Sukh-P,

As I was reviewing your machine learning layers, I got confused about how to train each NWP (IFS, ECMWF) and Satellite separately, and then combine them into a single vector.

In your code, I noticed you were using EmbAdamWReduceLROnPlateau, but when I checked this multimodal code, I couldn't find where the optimizer is applied.

Should we use a separate optimizer for each of the IFS, ECMWF, and EUMETSAT models, or do we perform optimization just once at the end?

5 replies

Sukh-P Oct 22, 2024
Maintainer Author

Hi @kwon-encored

A single optimizer is used for the whole model. To clarify, the model is made up of different networks/elements, but these are all trained together, i.e. their weights are updated in the same training steps. Each input modality (e.g. each NWP input and Satellite) goes through its own encoder before the encoded outputs are concatenated together, which you can see being done in each commented section of the multimodal file you shared.

As for where that optimizer is in the code, it is called in the initialisation of the BaseModel, which is the base class that the multimodal Model class you were looking at inherits from. BaseModel itself inherits from the PyTorch Lightning pl.LightningModule class, which you can find documentation for here. PyTorch Lightning is the library we use to abstract away some boilerplate machine learning code; if you aren't planning on using PyTorch Lightning yourselves, you will need to modify the code a bit.
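To illustrate the single-optimizer point, here is a minimal plain-PyTorch sketch: two per-modality encoders and an output layer sit in one module, and one optimizer over `model.parameters()` updates all of them in the same step. The class name, layer sizes and data are hypothetical, and the manual training step stands in for what Lightning would normally drive.

```python
import torch
from torch import nn


class TinyMultimodal(nn.Module):
    """Hypothetical stand-in: one encoder per modality, then a shared output layer."""

    def __init__(self):
        super().__init__()
        self.nwp_encoder = nn.Linear(8, 4)
        self.sat_encoder = nn.Linear(6, 4)
        self.output_network = nn.Linear(8, 1)

    def forward(self, nwp, sat):
        # Encode each modality separately, then concatenate the features.
        fused = torch.cat([self.nwp_encoder(nwp), self.sat_encoder(sat)], dim=1)
        return self.output_network(fused)


torch.manual_seed(0)
model = TinyMultimodal()

# One optimizer covers every sub-network because they are all
# registered as parameters of the same module.
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-3)

nwp, sat, target = torch.randn(16, 8), torch.randn(16, 6), torch.rand(16, 1)
before = model.nwp_encoder.weight.detach().clone()

loss = nn.functional.mse_loss(model(nwp, sat), target)
optimizer.zero_grad()
loss.backward()
optimizer.step()

# The encoder weights changed in the same step as the output layer.
encoder_updated = not torch.equal(before, model.nwp_encoder.weight)
```

No separate optimizer per encoder is needed: gradients flow from the single loss back through the concatenation into each encoder.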

Hope this helps!

Answer selected by Sukh-P

kwon-encored Oct 23, 2024

@Sukh-P
Thank you, as always, for your consideration!
I’m a bit confused about this line. Why isn’t the model explicitly declared in other Python files when running the code? Is it because the Model class is being called or instantiated via the config.yaml file through the Hydra system?

I was also hoping to understand how the Model() is used to compute the next neural layers after fusion, since—if I understood correctly—it’s just combining the multimodal inputs into one dictionary variable.

Adding on, is there reason why I can't find any reference of "ResFCNet2" or "ResConv3DNet2" or "DefaultPVNet" within the Model() python code...

Sincere thanks in advance 🙏

Sukh-P Oct 23, 2024
Maintainer Author

Is it because the Model class is being called or instantiated via the config.yaml file through the Hydra system?

Yes, that is correct. You can see in the example main config file here that the model is set to the multimodal.yaml file, and within that file the target to be instantiated is the Model class.
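The idea of a config naming the class to build can be sketched without Hydra. The `instantiate` function below is a simplified, hypothetical stand-in for what `hydra.utils.instantiate` does with a `_target_` key; it is not Hydra's implementation, and a standard-library class takes the place of the Model class so the example runs on its own.

```python
import importlib


def instantiate(config):
    """Build the object named by a dotted `_target_` path, passing the other keys as kwargs."""
    kwargs = {key: value for key, value in config.items() if key != "_target_"}
    module_path, _, attribute = config["_target_"].rpartition(".")
    cls = getattr(importlib.import_module(module_path), attribute)
    return cls(**kwargs)


# Hypothetical config, as it might look after loading a YAML file.
model_config = {
    "_target_": "fractions.Fraction",
    "numerator": 3,
    "denominator": 4,
}

obj = instantiate(model_config)
print(type(obj).__name__, obj)
```

This is why the class never appears in an explicit `Model(...)` call elsewhere in the code: the dotted path in the config is resolved at run time.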

I was also hoping to understand how the Model() is used to compute the next neural layers after fusion, since—if I understood correctly—it’s just combining the multimodal inputs into one dictionary variable.

In the Model class you can see that an output network is used. This output network is a class which inherits from the AbstractLinearNetwork class, which has a function called cat_modes. That function takes the dictionary and flattens/concatenates all the items in it into a 1D vector, which is then passed as the input to the forward method of whichever type of output network is being used.
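A dependency-free sketch of the flatten-and-concatenate idea for a single sample, using nested Python lists in place of tensors. The function names, dictionary keys and values are hypothetical; the real cat_modes works on tensors and is not reproduced here.

```python
def flatten(values):
    """Recursively flatten nested lists into one flat list of floats."""
    flat = []
    for value in values:
        if isinstance(value, list):
            flat.extend(flatten(value))
        else:
            flat.append(float(value))
    return flat


def cat_modes_sketch(modes):
    """Concatenate each modality's flattened features in dictionary order."""
    vector = []
    for name in modes:  # dicts preserve insertion order
        vector.extend(flatten(modes[name]))
    return vector


# Hypothetical encoder outputs for one sample.
encoded = {
    "nwp": [[0.1, 0.2], [0.3, 0.4]],
    "sat": [0.5, 0.6],
    "sun": [0.7],
}

fused = cat_modes_sketch(encoded)
print(fused)
```

The resulting 1D vector is what an output network would receive as its input.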

Adding on, is there reason why I can't find any reference of "ResFCNet2" or "ResConv3DNet2" or "DefaultPVNet" within the Model() python code...

This is related to the previous answer. They are all subclasses of the AbstractLinearNetwork class, which is the type of the output network in the Model class here, so you specify in config which of the output networks you are using ("ResFCNet2", "ResConv3DNet2", etc.), and those output networks are defined here.

kwon-encored Oct 24, 2024

@Sukh-P Thank you for all your references. I believe I now have a solid understanding of the general logic behind the machine learning process. Below, I’ve illustrated my understanding as a flowchart (leaning more toward the concepts presented in the paper rather than the updated code). Could you let me know if my understanding is valid?

Within the code, I saw PV Data is being used as input. But in terms of the paper, I should not worry about this PV Data and just continue with ECMWF, UKV, Satellite, GSP, and Solar Coordinate, correct?

Furthermore, in machine learning, we typically have input X and target Y. In this study, similar to model.fit(train_X, target_Y), where can I find the dimensions and parameters you set for the model's output that are comparable to the target variable-Y?

Sukh-P Oct 24, 2024
Maintainer Author

Really nice diagram! Yes this looks right to me.

On PV data: yes, you don't need to add that in if you are just following what we did in the paper (which is how we do things now). This is because we don't currently have a reliable PV data input that helps improve forecast accuracy. There are plans to work on this in the future.

As for where to find the parameters/model dimensions of the output for the target variable Y: you can find them here for the case of quantile regression (having quantile outputs), and you can find other functions here for the case where you are just creating a point estimate. You can also find all the loss functions we use in that file, depending on whether we are creating quantiles or just a point prediction.
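For readers unfamiliar with quantile outputs, here is a sketch of the standard pinball (quantile) loss for one forecast step, where the model emits one value per quantile. This is the textbook formulation and is an assumption about the general approach; the function name, quantile levels and values are hypothetical and not taken from the PVNet code.

```python
def pinball_loss(y_true, y_pred_quantiles, quantiles):
    """Average pinball loss across quantile levels for a single target value."""
    total = 0.0
    for q, y_pred in zip(quantiles, y_pred_quantiles):
        error = y_true - y_pred
        # Under-prediction is weighted by q, over-prediction by (1 - q).
        total += max(q * error, (q - 1) * error)
    return total / len(quantiles)


# Hypothetical quantile levels and predictions for one normalised target.
quantiles = [0.1, 0.5, 0.9]
predicted = [0.3, 0.5, 0.7]  # one output per quantile
loss = pinball_loss(0.5, predicted, quantiles)
print(loss)
```

With quantile outputs the final layer has one value per quantile for each forecast step, whereas a point estimate needs a single value per step.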

Hope that helps!
