Commit

Merge pull request #391 from j-bryan/additive_templating
Refactoring template driver to dynamically create RAVEN workflows
dylanjm authored Jan 29, 2025
2 parents bb70736 + fccf760 commit dcaa3a9
Showing 85 changed files with 8,526 additions and 2,673 deletions.
Binary file added doc/developers/heron_templates.png
62 changes: 62 additions & 0 deletions doc/developers/templates.md
@@ -0,0 +1,62 @@
# The HERON Template Driver: A Guide for the HERON Developer

The HERON template driver is the portion of HERON which constructs RAVEN workflows given the information provided in the HERON input file.
In HERON, a template consists of an XML file which outlines a RAVEN workflow and a Python class which configures that workflow with the provided case information.
The template driver determines which template is most appropriate for each case.
The goal of this guide is to explain the design philosophy of the template driver and to give guidance on how it should be modified when adding features to HERON.
An illustrative diagram of the major object types and their interactions is given in the figure below.
![HERON template diagram](heron_templates.png)

The current template system was developed with a few guiding principles in mind:
1. XML nodes in the template XML should never be removed by the template's Python class. If a node is present in the XML, you can count on being able to access it anywhere in the template class.
2. Subclasses of `RavenSnippet` should handle all XML operations within the block of XML described by the snippet class. Subclasses of `RavenTemplate` should handle orchestrating the connections among these snippets (see the sketch after this list).
3. Use a flat workflow whenever possible. While any valid HERON case can be run with the bilevel template, the overhead of starting a new instance of RAVEN for each inner workflow iteration can significantly slow down analyses.
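
To make principle 2 concrete, here is a minimal, hypothetical sketch of the division of labor. The class and method names (`MonteCarloSnippet`, `add_variable`, `SweepTemplateSketch`, `_configure_sampler`) and the XML layout are illustrative assumptions, not the actual `RavenSnippet`/`RavenTemplate` API: the snippet owns every XML edit inside its own block, while the template class only decides which snippets to create and how to connect them.

```python
# Illustrative sketch only: class names, methods, and XML layout here are assumptions,
# not the actual RavenSnippet/RavenTemplate API.
import xml.etree.ElementTree as ET


class MonteCarloSnippet:
  """Hypothetical RavenSnippet-style class: it owns every XML edit inside its <MonteCarlo> block."""

  def __init__(self, name: str) -> None:
    self.node = ET.Element("MonteCarlo", name=name)

  def add_variable(self, var_name: str, dist_name: str) -> None:
    """All changes to this snippet's XML happen here, never in the template class."""
    var = ET.SubElement(self.node, "variable", name=var_name)
    ET.SubElement(var, "distribution").text = dist_name


class SweepTemplateSketch:
  """Hypothetical RavenTemplate-style class: it orchestrates snippets without touching their internals."""

  def _configure_sampler(self, sampler: MonteCarloSnippet, capacities: dict[str, str]) -> None:
    # The template decides *which* variables to sample; the snippet decides *how* that looks in XML.
    for var_name, dist_name in capacities.items():
      sampler.add_variable(var_name, dist_name)
```

Keeping all XML manipulation behind the snippet's interface also helps uphold principle 1: the template class asks snippets for what it needs rather than reaching into their elements directly.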

Also, if you're editing anything in the template drivers:
- Use type hints THOROUGHLY
- Favor properties over getter/setter methods

## Templates
There are currently three main "flavors" of templates in the HERON templating system:
- Bilevel templates: workflows with an outer workflow for varying system capacity and economic variables and an inner workflow for evaluating the system dispatch over multiple time histories. The bilevel templates are further broken down by case mode ("opt" or "sweep") and time history source (synthetic or static).
- "Flat" templates: workflows which can be collapsed to either an inner or outer workflow.
- Debug template: a special template for HERON's "debug" mode.

## Flat Template Limitations
Some cases which mathematically could be flat workflows cannot currently be implemented as such due to implementation issues in RAVEN or HERON.
- "opt" mode workflows with a single time history. The limitation is the RAVEN MultiRun step accepts either an Optimizer or a Sampler, but not both. To sample the time history (static or synthetic) requires the use of a sampler in the MultiRun step.
- Some workflows with uncertain economic parameters could be flat workflows, but the cashflows they describe are quantified in the HERON dispatch manager. There is currently no way to sample an uncertain economic parameter without running the dispatch optimization.

## Should I make a new template?
The templating system is designed to make creating new templates a somewhat easy task.
However, a balance must be struck between configuring existing templates and creating new templates.
When is it appropriate to make a new template?

Don't make a new template if you are...
- Substituting one algorithm for another (e.g. swapping Bayesian optimization for gradient descent)
- Exposing options for an algorithm or entity that is already used by one or more templates

Make a new template if...
- You are adding significant new functionality, such as new workflow types or new HERON case modes
- There is little overlap between the desired workflow and existing templates
- The feature affects many parts of the template XML

The final decision is left to the best judgement of the developer.
However, creating a new template likely represents a significant development effort and would benefit from consultation with the core HERON development team.

## So you want to...
An approximate guide to the steps needed to implement new features.

### Expose an existing RAVEN feature to the HERON user
1. Create a new `RavenSnippet` subclass for the feature if one does not yet exist. Expose subelement and attribute options as class properties (see the sketch after these steps).
2. Add unit tests for the snippet class.
3. Determine which templates can make use of the feature. If using the feature would require removing a node from the template XML, remove that node from the template XML file itself and add the feature to the workflow from the template's Python class instead.
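
As a sketch of steps 1 and 2, the code below shows one way subelement options might be exposed as properties, together with a minimal unit test. The class name, the `<tolerance>` subelement, and the base layout are hypothetical examples chosen for illustration, not the real `RavenSnippet` API; the point is only the pattern of property-backed subelements plus a focused test.

```python
# Hypothetical example: the tag, subelement, and base layout are illustrative assumptions,
# not the real RavenSnippet implementation.
from typing import Optional
import xml.etree.ElementTree as ET


class ExampleEntitySnippet:
  """Stand-in for a RavenSnippet subclass wrapping one block of RAVEN XML."""
  tag = "ExampleEntity"  # assumed tag name

  def __init__(self, name: str) -> None:
    self.node = ET.Element(self.tag, name=name)

  @property
  def tolerance(self) -> Optional[float]:
    """Expose the <tolerance> subelement as a property (preferred over get/set methods)."""
    sub = self.node.find("tolerance")
    return float(sub.text) if sub is not None and sub.text else None

  @tolerance.setter
  def tolerance(self, value: float) -> None:
    sub = self.node.find("tolerance")
    if sub is None:
      sub = ET.SubElement(self.node, "tolerance")
    sub.text = str(value)


def test_tolerance_roundtrip() -> None:
  """Minimal unit-test sketch for the property above."""
  snippet = ExampleEntitySnippet("my_entity")
  snippet.tolerance = 1e-6
  assert snippet.tolerance == 1e-6
```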

### Add a new HERON case mode
1. If the case mode will be run as a bilevel workflow, a new template file and class will likely need to be made for the bilevel outer template (the outer templates are currently split out by case mode).
2. If some cases of the mode could be run as a flat workflow, implement that as appropriate, either by modifying the existing `FlatMultiConfigTemplate` template or by creating a new template. Add any new template to the `TemplateDriver`.

### Make a new kind of workflow
1. A new template very likely needs to be made. Create one or more template XML files and their corresponding `RavenTemplate` classes to configure them.
2. Consider which features of the workflow are useful in the other templates. Refactor as necessary.
3. Add these templates to the `TemplateDriver`, as sketched below.
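
For orientation, here is a rough, hypothetical sketch of what "adding a template to the `TemplateDriver`" could look like. The class names, case attributes, and selection criteria below are assumptions made for illustration and will not match the real driver's logic.

```python
# Hypothetical sketch: names, attributes, and selection criteria are illustrative assumptions.
from dataclasses import dataclass


class BilevelTemplateSketch: ...
class FlatTemplateSketch: ...
class DebugTemplateSketch: ...


@dataclass
class CaseSketch:
  debug_mode: bool
  mode: str            # "sweep" or "opt"
  num_histories: int   # stand-in for whether the case can collapse to a flat workflow


class TemplateDriverSketch:
  """Chooses the most appropriate template for a case; new templates plug into this selection."""

  def _select_template(self, case: CaseSketch):
    if case.debug_mode:
      return DebugTemplateSketch()
    if case.mode == "sweep" and case.num_histories == 1:
      return FlatTemplateSketch()
    return BilevelTemplateSketch()
```

Whatever form the real selection takes, the `TemplateDriver` is the single place where a case is routed to its template, so new templates need to be wired into that decision.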
12 changes: 6 additions & 6 deletions doc/guide/heron_guide.md
@@ -156,9 +156,9 @@ Note that in a typical HERON analysis, on the order of two million dispatch opti

### Custom User Specified Functions

HERON allows users to create their own functions that perform computations during simulation runtime.

Currently, these functions can only deal with computations that do not occur during the dispatch optimization. For example, a user can write a function that determines the `<reference_price>` parameter of a component's cashflow because cashflows are not computed during the inner dispatch optimization.

Currently, a user would _not_ be able to write a custom transfer function that informs the dispatcher on how resources are transformed while moving between components of the specified system. This is because transfer functions are required during the dispatch of the system and would require the user to write the function in a way that could be interpreted by our underlying optimization library. To be more specific, a user would **not** be able to use a custom function within a `<transfer>` XML node in the HERON input file. **While this feature is not currently available, it may be made available in the future.**

@@ -171,9 +171,9 @@ Users can write custom functions, but they must follow the API conventions to en
A custom function utilized in a HERON input file requires two input parameters that are always returned by the function:

* `data`: A Python dictionary containing information related to the associated component that is calling the function.
* `meta`: A Python dictionary containing information pertaining to the case as a whole.

It is possible to specify ancillary functions in the Python file that do not follow the API conventions, but understand that functions called from the HERON input file will require this specification.

For example, suppose a user wanted to write a function that computed the reference price for a particular component based on the current year of the project. In the input file, under the `<reference_price>` node, the user would write:

@@ -200,13 +200,13 @@ def get_price(data, meta):
year = meta['HERON']['active_index']['year']
if year <=10:
multiplier = 3
else:
multiplier = 1.5
result = 1000 * multiplier
return {"reference_price": result}, meta
```

In the above code block, the function starts by accessing data from the `meta` parameter to determine what the current year is within the simulation. Then the function determines the multiplier based on the current year of the simulation. If the simulation is within the first ten years of the project timeline, then it sets a higher multiplier, otherwise it sets the multiplier lower. Finally, the function stores the newly computed `reference_price` into a dictionary that is returned by the function. This value will then be used as the `<reference_price>` within the component that this function is called from within the input file.



98 changes: 48 additions & 50 deletions src/Cases.py
@@ -45,22 +45,22 @@ class Case(Base):
# > 'optimization_default' - 'min' or 'max' for optimization
# > 'percent' (only for percentile) - list of percentiles to return
# > 'threshold' (only for sortinoRatio, gainLossRatio, expectedShortfall, valueAtRisk) - threshold value for calculation
stats_metrics_meta = {'expectedValue': {'prefix': 'mean', 'optimization_default': 'max'},
'minimum': {'prefix': 'min', 'optimization_default': 'max'},
'maximum': {'prefix': 'max', 'optimization_default': 'max'},
'median': {'prefix': 'med', 'optimization_default': 'max'},
'variance': {'prefix': 'var', 'optimization_default': 'min'},
'sigma': {'prefix': 'std', 'optimization_default': 'min'},
'percentile': {'prefix': 'perc', 'optimization_default': 'max', 'percent': ['5', '95']},
'variationCoefficient': {'prefix': 'varCoeff', 'optimization_default': 'min'},
'skewness': {'prefix': 'skew', 'optimization_default': 'min'},
'kurtosis': {'prefix': 'kurt', 'optimization_default': 'min'},
'samples': {'prefix': 'samp'},
'sharpeRatio': {'prefix': 'sharpe', 'optimization_default': 'max'},
'sortinoRatio': {'prefix': 'sortino', 'optimization_default': 'max', 'threshold': 'median'},
'gainLossRatio': {'prefix': 'glr', 'optimization_default': 'max', 'threshold': 'median'},
'expectedShortfall': {'prefix': 'es', 'optimization_default': 'min', 'threshold': ['0.05']},
'valueAtRisk': {'prefix': 'VaR', 'optimization_default': 'min', 'threshold': ['0.05']}}

# creating a similar dictionary, this time with the optimization defaults flipped
# (Levelized Cost does the opposite optimization for all of these stats)
@@ -76,21 +76,21 @@ class Case(Base):
# economic metrics that can be returned by sweep results OR alongside optimization results
# TODO: might be important to index the stats_metrics_meta... does VaR of IRR make sense?
# NOTE: the keys for this meta dictionary are the XML Input names
economic_metrics_meta = {'NPV': {'output_name': 'NPV',
'TEAL_in_name': 'NPV',
'TEAL_out_name': 'NPV',
'stats_map': stats_metrics_meta},
'PI': {'output_name': 'PI',
'TEAL_in_name': 'PI',
'TEAL_out_name': 'PI',
'stats_map': stats_metrics_meta},
'IRR': {'output_name': 'IRR',
'TEAL_in_name': 'IRR',
'TEAL_out_name': 'IRR',
'stats_map': stats_metrics_meta},
'LC': {'output_name': 'LC_Mult', #this is how it will appear in CSV
'TEAL_in_name': 'NPV_search', #this is how TEAL recognizes it
'TEAL_out_name': 'NPV_mult', #this is how TEAL outputs it (don't know why)
'stats_map': flipped_stats_metrics_meta}}
# the keys of the meta dictionary are the names used in XML input
economic_metrics_input_names = list(em_name for em_name,_ in economic_metrics_meta.items())
@@ -958,8 +958,15 @@ def _read_optimization_settings(self, node):
# add other information to opt_settings dictionary (type is only information implemented)
opt_settings[sub_name] = sub.value

if 'stats_metric' not in list(opt_settings):
opt_settings['stats_metric'] = {'name':self._default_stats_metric, 'tol':1e-4}
if 'stats_metric' not in opt_settings:
opt_settings['stats_metric'] = {'name': self._default_stats_metric, 'tol': 1e-4}

# Set optimization type ("min" or "max") based on default by economic metric if not provided
if 'type' not in opt_settings:
opt_metric = opt_settings['opt_metric']
stats_metric = opt_settings['stats_metric']['name']
opt_settings['type'] = self.economic_metrics_meta[opt_metric]['stats_map'][stats_metric]['optimization_default']

return opt_settings

def _read_result_statistics(self, node):
@@ -1067,14 +1074,9 @@ def _append_econ_metrics(self, new_metric, first=False):
self._econ_metrics[new_metric] = self.economic_metrics_meta[new_metric]
else:
# we are updating the stored economic metric dictionary with new entries via an ordered dict
self._econ_metrics[new_metric] = self.economic_metrics_meta[new_metric]
if first:
# there has to be a better way, but OrderedDict has no "prepend" method
new_dict = OrderedDict()
new_dict[new_metric] = self.economic_metrics_meta[new_metric]
new_dict.update(self._econ_metrics)
self._econ_metrics = new_dict
else:
self._econ_metrics[new_metric] = self.economic_metrics_meta[new_metric]
self._econ_metrics.move_to_end(new_metric, last=False) # last=False means move to beginning

def determine_inner_objective(self, components):
"""
@@ -1353,35 +1355,31 @@ def npv_target(self):
return self._npv_target

#### API ####
def write_workflows(self, components, sources, loc):
def write_workflows(self, components, sources, dest_dir):
"""
Writes workflows for this case to XMLs on disk.
@ In, components, HERON components, components for the simulation
@ In, sources, HERON sources, sources for the simulation
@ In, loc, str, location in which to write files
@ In, dest_dir, str, directory in which to write files
@ Out, None
"""
# load templates
template_class = self._load_template()
inner, outer = template_class.createWorkflow(self, components, sources)

template_class.writeWorkflow((inner, outer), loc)
# Load templates, create RAVEN workflows, and write those workflows using a TemplateDriver
driver = self._make_template_driver()
driver.create_workflow(self, components, sources)
driver.write_workflow(dest_dir, self, components, sources)

#### UTILITIES ####
def _load_template(self):
def _make_template_driver(self):
"""
Loads template files for modification
@ In, None
@ Out, template_class, RAVEN Template, instantiated Template class
@ Out, template_class, TemplateDriver, instantiated TemplateDriver class
"""
src_dir = os.path.dirname(os.path.realpath(__file__))
heron_dir = os.path.abspath(os.path.join(src_dir, '..'))
template_dir = os.path.abspath(os.path.join(heron_dir, 'templates'))
template_name = 'template_driver'
heron_dir = os.path.abspath(os.path.join(src_dir, ".."))
# import template module
sys.path.append(heron_dir)
module = importlib.import_module(f'templates.{template_name}', package="HERON")
module = importlib.import_module("templates.template_driver", package="HERON")
# load template, perform actions
template_class = module.Template(messageHandler=self.messageHandler)
template_class.loadTemplate(template_dir)
return template_class
driver = module.TemplateDriver(messageHandler=self.messageHandler)
return driver
11 changes: 11 additions & 0 deletions src/Components.py
@@ -331,6 +331,17 @@ def set_levelized_cost_meta(self, cashflows):
resource = cf.get_driver()._vp.get_resource()
self.levelized_meta[cf.name] = {tracker:resource}

def get_uncertain_cashflow_params(self):
"""
Get all uncertain economic parameters
@ In, None
@ Out, params, dict, the uncertain parameters
"""
params = {}
for cf in self.get_cashflows():
uncertain = cf.get_uncertain_params()
params |= {f"{self.name}_{k}": v for k, v in uncertain.items()}
return params


class Interaction(Base):
2 changes: 0 additions & 2 deletions src/DispatchManager.py
@@ -900,5 +900,3 @@ def run(self, raven, raven_dict):
runner.override_time(override_time) # TODO setter
dispatch, metrics, tot_activity = runner.run(raven_vars)
runner.save_variables(raven, dispatch, metrics, tot_activity)

