the use of OpenOA #302

Open
xiaolan0195 opened this issue Nov 28, 2024 · 6 comments

@xiaolan0195

Hello! Part of my research involves calculating the power that each turbine in a 67-turbine wind farm loses to wake interference. I have second-resolution SCADA data for every turbine. I saw the example "Estimate Operational Wake Losses based on SCADA Data". Can this example handle the large amount of data in my research? I would be very grateful for an answer!

@RHammond2
Collaborator

Hi @xiaolan0195, OpenOA should absolutely be able to handle 67 turbines! Depending on the resolution of your data, the runtimes may differ quite a bit from our examples, but they will still work. One thing we can't guarantee is whether the analysis settings in our examples will work for your data. We have seen instances where we had to deviate from the examples on data cleansing, uncertainty ranges, etc. in order to get a working analysis.

@xiaolan0195
Author

Thank you for answering! My SCADA data has a frequency of 1 s, but the example uses 10-minute data. Can my data be used directly in this example, or does it need to be converted to 10-minute data first? I am mainly using the 06_wake_loss_analysis.ipynb example, and I supplied my original SCADA data in the same format as la-haute-borne-data-2014-2015.csv. Does the data need to be preprocessed before it is written to that CSV format? When I plotted a power curve for this case, it still contained many scattered points.
[Image: power curve plot]
I used the merra2 and era5 reanalysis datasets from the example, and the following error occurred when executing the code shown in the picture:
[Image: screenshot of the code and error]
Is this caused by a mismatch with the reanalysis data, or is there another reason? Could you give me more details?

@RHammond2
Collaborator

Most frequencies should work just fine with OpenOA. You will want to change the frequency metadata settings, such as in our own example (https://github.com/NREL/OpenOA/blob/main/examples/data/plant_meta.yml#L38), or as seen in the docs (https://openoa.readthedocs.io/en/latest/api/types.html#openoa.schema.SCADAMetaData).
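
As a rough illustration of the frequency setting, the metadata can also be written as a Python dictionary. This is only a sketch: the key layout follows my reading of the linked plant_meta.yml, the column names on the right-hand side are hypothetical, and the exact frequency string that OpenOA accepts for 1-second data is an assumption that should be checked against the SCADAMetaData docs.

```python
import pandas as pd

# Assumed frequency string for 1-second SCADA data; the linked example uses "10min".
scada_frequency = "1s"

# Hypothetical metadata fragment: keys mirror the linked plant_meta.yml as I
# understand it, values are placeholder column names from the user's own file.
metadata = {
    "scada": {
        "frequency": scada_frequency,
        "time": "time",                     # hypothetical timestamp column
        "asset_id": "asset_id",             # hypothetical turbine ID column
        "WMET_HorWdDir": "wind_direction",  # hypothetical wind direction column
    }
}

# Sanity check that the string describes a one-second step.
step = pd.Timedelta(metadata["scada"]["frequency"])
print(step)
```

The point of the sketch is only that the frequency lives in the metadata, so the 1-second data itself should not need to be resampled just to match the 10-minute example.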

On preprocessing, it sounds like your SCADA data would need to be cleaned up a bit to have cleaner power curves. I would recommend checking out our examples, particularly the utils example (https://openoa.readthedocs.io/en/latest/examples/01_utils_examples.html) for how OpenOA can be utilized to clean things up further. The first example (00) can be helpful as well, but the one I've linked to is probably more relevant to your immediate concerns. This process is typically a good amount of trial and error, but let me know if you hit any snags.
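
To make the idea of cleaning a power curve concrete, here is a minimal pandas-only sketch of one common kind of filter: flagging points whose power is implausible for the wind speed window they fall in. The function name, column values, and thresholds are all hypothetical and are not taken from OpenOA; the utils example linked above shows the library's own filters.

```python
import pandas as pd


def flag_out_of_window(wind_speed, power, ws_low, ws_high, p_low, p_high):
    """Flag rows whose wind speed is inside [ws_low, ws_high] but whose
    power is outside [p_low, p_high]. Hypothetical helper, not an OpenOA API."""
    in_window = wind_speed.between(ws_low, ws_high)
    bad_power = ~power.between(p_low, p_high)
    return in_window & bad_power


# Toy data: the fourth point has near-rated wind speed but very low power.
wind_speed = pd.Series([3.0, 8.0, 12.0, 12.0, 15.0])
power = pd.Series([50.0, 900.0, 2000.0, 100.0, 2050.0])

# Assumed thresholds: between 10 and 20 m/s the turbine should produce 1800-2100 kW.
flags = flag_out_of_window(wind_speed, power, 10.0, 20.0, 1800.0, 2100.0)
clean_power = power[~flags]
print(flags.tolist())
```

In practice the thresholds come from inspecting the scatter plot, which is where the trial and error comes in.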

For the error produced, I would need to see more of the traceback to understand what exactly has gone wrong. Is it possible to share the full output?

@xiaolan0195
Author

xiaolan0195 commented Feb 10, 2025

Thanks a lot. I used six months of data (January to June) for two wind turbines, "60026014" and "60026015". When I call WakeLosses, an error occurs. The details are listed below.

---------------------------------------------------------------------------
UnsortedIndexError                        Traceback (most recent call last)
Cell In[13], line 1
----> 1 wl = WakeLosses(
      2     plant=project,
      3     wind_direction_col="WMET_HorWdDir",
      4     wind_direction_data_type="scada",
      5     #wind_direction_asset_ids=["R80711", "R80721", "R80736"],
      6     wind_direction_asset_ids=["60026014", "60026015"],
      7     start_date=None,
      8     #end_date="2015-11-25 00:00",
      9     end_date="2024-07-01 00:00:00",
     10     reanalysis_products=["merra2","era5"],
     11     end_date_lt=None,
     12     UQ=False
     13 )

File <attrs generated init openoa.analysis.wake_losses.WakeLosses>:39, in __init__(self, plant, wind_direction_col, wind_direction_data_type, wind_direction_asset_ids, UQ, num_sim, start_date, end_date, reanalysis_products, end_date_lt, wd_bin_width, freestream_sector_width, freestream_power_method, freestream_wind_speed_method, correct_for_derating, derating_filter_wind_speed_start, max_power_filter, wind_bin_mad_thresh, wd_bin_width_LT_corr, ws_bin_width_LT_corr, num_years_LT, assume_no_wakes_high_ws_LT_corr, no_wakes_ws_thresh_LT_corr, min_ws_bin_lin_reg, bin_count_thresh_lin_reg)
     37     __attr_validator_num_years_LT(self, __attr_num_years_LT, self.num_years_LT)
     38     __attr_validator_bin_count_thresh_lin_reg(self, __attr_bin_count_thresh_lin_reg, self.bin_count_thresh_lin_reg)
---> 39 self.__attrs_post_init__()

File c:\Users\lan\.conda\envs\openoa-env\lib\site-packages\openoa\logging.py:33, in logged_method_call.<locals>._wrapper(self, *args, **kwargs)
     31 logger = logging.getLogger(the_method.__module__)
     32 logger.debug(f"{self.__class__.__name__}#{id(self)}.{the_method.__name__}: {msg}")
---> 33 return the_method(self, *args, **kwargs)

File c:\Users\lan\.conda\envs\openoa-env\lib\site-packages\openoa\analysis\wake_losses.py:371, in WakeLosses.__attrs_post_init__(self)
    366     self.end_date_lt = min(
    367         [self.plant.reanalysis[product].index.max() for product in self.reanalysis_products]
    368     ).replace(minute=30)
    370 # Run preprocessing steps
--> 371 self._calculate_aggregate_dataframe()

File c:\Users\lan\.conda\envs\openoa-env\lib\site-packages\openoa\logging.py:33, in logged_method_call.<locals>._wrapper(self, *args, **kwargs)
     31 logger = logging.getLogger(the_method.__module__)
     32 logger.debug(f"{self.__class__.__name__}#{id(self)}.{the_method.__name__}: {msg}")
---> 33 return the_method(self, *args, **kwargs)

File c:\Users\lan\.conda\envs\openoa-env\lib\site-packages\openoa\analysis\wake_losses.py:998, in WakeLosses._calculate_aggregate_dataframe(self)
    995 if self.wind_direction_data_type == "scada":
    996     scada_cols.insert(1, self.wind_direction_col)
--> 998 self.aggregate_df = self.plant.scada.loc[
    999     self.start_date : self.end_date, scada_cols
   1000 ].unstack()
   1002 # Calculate reference mean wind direction
   1003 self._calculate_mean_wind_direction()

File c:\Users\lan\.conda\envs\openoa-env\lib\site-packages\pandas\core\indexing.py:1184, in _LocationIndexer.__getitem__(self, key)
   1182     if self._is_scalar_access(key):
   1183         return self.obj._get_value(*key, takeable=self._takeable)
-> 1184     return self._getitem_tuple(key)
   1185 else:
   1186     # we by definition only have the 0th axis
   1187     axis = self.axis or 0

File c:\Users\lan\.conda\envs\openoa-env\lib\site-packages\pandas\core\indexing.py:1368, in _LocIndexer._getitem_tuple(self, tup)
   1366 with suppress(IndexingError):
   1367     tup = self._expand_ellipsis(tup)
-> 1368     return self._getitem_lowerdim(tup)
   1370 # no multi-index, so validate all of the indexers
   1371 tup = self._validate_tuple_indexer(tup)

File c:\Users\lan\.conda\envs\openoa-env\lib\site-packages\pandas\core\indexing.py:1041, in _LocationIndexer._getitem_lowerdim(self, tup)
   1039 # we may have a nested tuples indexer here
   1040 if self._is_nested_tuple_indexer(tup):
-> 1041     return self._getitem_nested_tuple(tup)
   1043 # we maybe be using a tuple to represent multiple dimensions here
   1044 ax0 = self.obj._get_axis(0)

File c:\Users\lan\.conda\envs\openoa-env\lib\site-packages\pandas\core\indexing.py:1153, in _LocationIndexer._getitem_nested_tuple(self, tup)
   1150     axis -= 1
   1151     continue
-> 1153 obj = getattr(obj, self.name)._getitem_axis(key, axis=axis)
   1154 axis -= 1
   1156 # if we have a scalar, we are done

File c:\Users\lan\.conda\envs\openoa-env\lib\site-packages\pandas\core\indexing.py:1411, in _LocIndexer._getitem_axis(self, key, axis)
   1409 if isinstance(key, slice):
   1410     self._validate_key(key, axis)
-> 1411     return self._get_slice_axis(key, axis=axis)
   1412 elif com.is_bool_indexer(key):
   1413     return self._getbool_axis(key, axis=axis)

File c:\Users\lan\.conda\envs\openoa-env\lib\site-packages\pandas\core\indexing.py:1443, in _LocIndexer._get_slice_axis(self, slice_obj, axis)
   1440     return obj.copy(deep=False)
   1442 labels = obj._get_axis(axis)
-> 1443 indexer = labels.slice_indexer(slice_obj.start, slice_obj.stop, slice_obj.step)
   1445 if isinstance(indexer, slice):
   1446     return self.obj._slice(indexer, axis=axis)

File c:\Users\lan\.conda\envs\openoa-env\lib\site-packages\pandas\core\indexes\base.py:6662, in Index.slice_indexer(self, start, end, step)
   6618 def slice_indexer(
   6619     self,
   6620     start: Hashable | None = None,
   6621     end: Hashable | None = None,
   6622     step: int | None = None,
   6623 ) -> slice:
   6624     """
   6625     Compute the slice indexer for input labels and step.
   6626 
   (...)
   6660     slice(1, 3, None)
   6661     """
-> 6662     start_slice, end_slice = self.slice_locs(start, end, step=step)
   6664     # return a slice
   6665     if not is_scalar(start_slice):

File c:\Users\lan\.conda\envs\openoa-env\lib\site-packages\pandas\core\indexes\multi.py:2904, in MultiIndex.slice_locs(self, start, end, step)
   2852 """
   2853 For an ordered MultiIndex, compute the slice locations for input
   2854 labels.
   (...)
   2900                       sequence of such.
   2901 """
   2902 # This function adds nothing to its parent implementation (the magic
   2903 # happens in get_slice_bound method), but it adds meaningful doc.
-> 2904 return super().slice_locs(start, end, step)

File c:\Users\lan\.conda\envs\openoa-env\lib\site-packages\pandas\core\indexes\base.py:6879, in Index.slice_locs(self, start, end, step)
   6877 start_slice = None
   6878 if start is not None:
-> 6879     start_slice = self.get_slice_bound(start, "left")
   6880 if start_slice is None:
   6881     start_slice = 0

File c:\Users\lan\.conda\envs\openoa-env\lib\site-packages\pandas\core\indexes\multi.py:2848, in MultiIndex.get_slice_bound(self, label, side)
   2846 if not isinstance(label, tuple):
   2847     label = (label,)
-> 2848 return self._partial_tup_index(label, side=side)

File c:\Users\lan\.conda\envs\openoa-env\lib\site-packages\pandas\core\indexes\multi.py:2908, in MultiIndex._partial_tup_index(self, tup, side)
   2906 def _partial_tup_index(self, tup: tuple, side: Literal["left", "right"] = "left"):
   2907     if len(tup) > self._lexsort_depth:
-> 2908         raise UnsortedIndexError(
   2909             f"Key length ({len(tup)}) was greater than MultiIndex lexsort depth "
   2910             f"({self._lexsort_depth})"
   2911         )
   2913     n = len(tup)
   2914     start, end = 0, len(self)

UnsortedIndexError: 'Key length (1) was greater than MultiIndex lexsort depth (0)'

@RHammond2
Collaborator

@xiaolan0195 I think this might stem from a change in how pandas operates under the hood: PlantData does not explicitly sort the index (our examples do this in the preprocessing), and that appears to be causing the error.

Would you be able to add .sort_index() to all of the timeseries data before creating project in your example and let me know if that fixes your issue? If that works, I can make a patch to PlantData so this is done internally.
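
The behaviour in the traceback can be reproduced with pandas alone. The sketch below uses a toy frame with a (time, asset_id) MultiIndex, which is an assumption about how the SCADA data is indexed, and shows that a time slice fails on the unsorted index and succeeds after .sort_index().

```python
import pandas as pd

# Toy stand-in for SCADA data: a (time, asset_id) MultiIndex whose
# timestamps are deliberately out of order.
times = pd.to_datetime(
    ["2024-01-01 00:00:02", "2024-01-01 00:00:00", "2024-01-01 00:00:01"]
)
index = pd.MultiIndex.from_product(
    [times, ["60026014", "60026015"]], names=["time", "asset_id"]
)
scada = pd.DataFrame({"WTUR_W": [5.0, 6.0, 1.0, 2.0, 3.0, 4.0]}, index=index)

start = pd.Timestamp("2024-01-01 00:00:00")
end = pd.Timestamp("2024-01-01 00:00:01")

# Slicing the unsorted MultiIndex by time raises UnsortedIndexError.
try:
    scada.loc[start:end, ["WTUR_W"]]
    raised = False
except pd.errors.UnsortedIndexError:
    raised = True

# After sorting the index, the same slice succeeds.
scada_sorted = scada.sort_index()
subset = scada_sorted.loc[start:end, ["WTUR_W"]]
print(raised, len(subset))
```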

@xiaolan0195
Author

Thanks! I modified the project_ENGIE file before importing the data. The prepare function loads multiple data frames (scada_df, meter_df, curtail_df, reanalysis_merra2_df, reanalysis_era5_df), and I added an explicit .sort_index() call after each one is loaded. For example, after scada_df = clean_scada(path / "Yangjiang_scada_data_WT14_15.csv"), I added scada_df = scada_df.sort_index(). When I run the WakeLosses step of Example 6, the error still occurs.
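
One possible explanation, which is an assumption and not confirmed anywhere in this thread, is that the frame still has a default integer index when .sort_index() is called, so the call changes nothing and the (time, asset_id) index built later remains unsorted. The pandas-only sketch below uses hypothetical column names to show the difference between sorting before and after the index is set.

```python
import pandas as pd

# Toy frame as it might look right after loading a CSV: default RangeIndex,
# rows not in time order. Column names are hypothetical.
raw = pd.DataFrame(
    {
        "time": pd.to_datetime(
            [
                "2024-01-01 00:00:01",
                "2024-01-01 00:00:00",
                "2024-01-01 00:00:01",
                "2024-01-01 00:00:00",
            ]
        ),
        "asset_id": ["60026015", "60026015", "60026014", "60026014"],
        "WTUR_W": [4.0, 3.0, 2.0, 1.0],
    }
)

# Sorting by the RangeIndex leaves the row order unchanged, so the
# MultiIndex created afterwards is still unsorted.
indexed = raw.sort_index().set_index(["time", "asset_id"])
print(indexed.index.is_monotonic_increasing)

# Sorting after the (time, asset_id) index exists gives a sorted index
# that supports time slicing.
fixed = indexed.sort_index()
print(fixed.index.is_monotonic_increasing)
first_second = fixed.loc[pd.Timestamp("2024-01-01 00:00:00"):pd.Timestamp("2024-01-01 00:00:00"), ["WTUR_W"]]
```

Checking is_monotonic_increasing on the index of the frame that is actually being sliced is a quick way to see whether the sort took effect.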
