Add "other" industry demand #355

tud-mchen6 · 2024-04-11T13:50:11Z

Fixes #309.

Adding "other" industry in the industry module.

#340 is a prerequisite for this PR.

Checklist

Any checks which are not relevant to the PR can be pre-checked by the PR creator. All others should be checked by the reviewer. You can add extra checklist items here if required by the PR.

CHANGELOG updated
Minimal workflow tests pass
Tests added to cover contribution (not relevant)
Documentation updated (not relevant)
Configuration schema updated (not relevant)

brynpickering

@tud-mchen6 could you confirm that the only change in this PR compared to #340 is the chemicals_industry.py and other_industry.py files?

modules/industry/src/chemicals_industry.py

brynpickering

Just realised that the PR title mentions only "other", so feel free to ignore my "chemical_industry.py" comments.

modules/industry/src/other_industry.py

tud-mchen6 · 2024-04-16T07:44:58Z

@tud-mchen6 could you confirm that the only change in this PR compared to #340 is the chemicals_industry.py and other_industry.py files?

Compared to the commit of #340 at the same point in time, the only change is indeed that other_industry.py is added, along with the relevant rule in the industry.smk file.

The #340 branch has since developed quite far (e.g. it now integrates the JRC data processing script), so merging these branches will take some effort. However, #340 never developed the chemical industry or "other" industry functionality, so functionality-wise they are still separate.

irm-codebase · 2024-04-27T10:28:30Z

This PR now integrates the JRC module code and xarray processing.
Also, I've added some quality of life processing for "other industries":

You can now "turn off" the processing of specific sectors via the configuration. All non-specific sectors will be parsed through the "other" rule.
You can now select which carriers to extract at a final energy level and at an end-use level.
You can select the method of extraction. For now only "priority" is available, which reflects how SCEC does it.
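As a rough illustration of the first option, the split between individually processed sectors and the "other" rule could look like the sketch below. The `all_categories` list and the `split_categories` helper are hypothetical; only the `specific-industries` key is taken from the config diff in this PR.

```python
# Hypothetical config fragment; only the "specific-industries" key comes from the PR.
config_params = {"specific-industries": ["Iron and steel", "Chemicals Industry"]}

# Assumed list of categories present in the data (illustrative, not exhaustive).
all_categories = [
    "Iron and steel",
    "Chemicals Industry",
    "Wood and wood products",
    "Transport Equipment",
]


def split_categories(categories, specific):
    """Return (individually processed, lumped into "other") category lists."""
    separate = [c for c in categories if c in specific]
    other = [c for c in categories if c not in specific]
    return separate, other


separate, other = split_categories(
    all_categories, config_params["specific-industries"]
)
```

Removing a name from `specific-industries` moves that category into the `other` list, which is the "turn off" behaviour described in the comment.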

brynpickering · 2024-05-14T15:35:55Z

@irm-codebase ready to rebase onto develop now that #340 is merged in!

irm-codebase

@brynpickering ready for review!

CHE pre-processing was leading to some funky issues, so I've added comments on how I dealt with it.

irm-codebase · 2024-05-15T15:35:48Z

modules/industry/scripts/utils/filling.py

@@ -90,8 +81,11 @@ def fill_missing_countries_years(
    _to_fill = _to_fill.bfill(dim="year")
    all_filled = _to_fill.ffill(dim="year")

-    all_filled = jrc.ensure_standard_coordinates(all_filled)
-    all_filled = all_filled.assign_attrs(units="twh")
+    # TODO: CHE has no values for "Wood and wood products" and "Transport Equipment".


CHE was triggering assert failures due to some missing data. For now I am just assuming those values are 0 (same as SCEC).

This is mostly because the CHE processing above this module does not seem to provide data for these sectors, meaning they are filled with nan in all years, so none of our filling methods work.

Let me know what you think.

Yeah, we don't have much choice on that. I assume they are in "other industry" in the CHE data so we can't extract them.
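To make the failure mode concrete, here is a minimal pure-Python sketch of why backward and forward filling cannot help when a series is missing in every year, and what the zero fallback does. It uses plain lists instead of xarray, and `fill_years` is a hypothetical name, not a function from the module.

```python
def fill_years(series):
    """Fill gaps in a year-ordered list of floats, where None marks missing data."""
    filled = list(series)

    # Backward fill: copy the next known value into earlier gaps.
    next_known = None
    for i in range(len(filled) - 1, -1, -1):
        if filled[i] is None:
            filled[i] = next_known
        else:
            next_known = filled[i]

    # Forward fill: copy the last known value into any remaining gaps.
    last_known = None
    for i, value in enumerate(filled):
        if value is None:
            filled[i] = last_known
        else:
            last_known = value

    # ASSUME: a series with no data at all (e.g. the CHE case discussed here) is zero.
    return [0.0 if value is None else value for value in filled]


partially_missing = fill_years([None, 2.0, None, 3.0, None])
fully_missing = fill_years([None, None, None])
```

A series with at least one value is completed by the two fill passes, while a series with no values at all only becomes usable through the explicit zero assumption.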

brynpickering

Looks good, just some minor changes to make. I'll go through it again tomorrow by running the code and checking the results at different points. Would be good if a data consistency check existed somewhere, i.e. that no energy demand is lost / added.

brynpickering · 2024-05-29T16:30:32Z

modules/industry/config.yaml

@@ -9,5 +9,10 @@ industry:
        placeholder-out1:
        placeholder-out2:
    params:
+        specific-industries: ["Iron and steel", "Chemicals Industry"]


I'd rename this param. "specific" isn't very descriptive and "industries" might be better as "subsectors". subsectors-to-decarbonise, subsectors-to-electrify, electrified-subsectors, ... ? I don't like any of those particularly but maybe they can trigger a better idea 😅

separated-subsectors, subsectors-to-process-individually?

I changed it to non-generic-categories and changed other to generic-config.
Hopefully this makes processing clearer.

brynpickering · 2024-05-29T16:35:35Z

modules/industry/industry.smk

-    output:
-        path_output = f"{BUILD_PATH}/annual_demand_steel.nc"
-    script: f"{SCRIPT_PATH}/steel_industry.py"
+if "Iron and steel" in config["params"]["specific-industries"]:


Is this the best way to add this conditionality? @timtroendle maybe you can comment.

The other approach would be to use a conditional list as inputs in a later rule, e.g.:

rule merge_industry_demands: input: specific_industries = expand(f"{BUILD_PATH}/annual_demand_{subsector}.nc", subsector=subsector_translator(config["params"]["specific-industries"]))

where subsector_translator is a helper function to map e.g. Steel and Iron to steel.

Not sure I have something useful to say as I am not fully aware of the context. I wonder: Why would steel industry be excluded? Is there a use-case for that? If not, then there is no conditionality needed.

If it is not a "specific" industry then it gets automatically lumped in as "other". So it's possible to overhaul the steel industry to decarbonise feedstocks or to just pipe all demands without converting any processes

Is this the best way to add this conditionality? @timtroendle maybe you can comment.

The other approach would be to use a conditional list as inputs in a later rule, e.g.:

rule merge_industry_demands: input: specific_industries = expand(f"{BUILD_PATH}/annual_demand_{subsector}.nc", subsector=subsector_translator(config["params"]["specific-industries"]))

where subsector_translator is a helper function to map e.g. Steel and Iron to steel.

I like this approach because it's pretty easy to follow and does not add much complexity.
For now I'll keep it as-is since the merging step needs development (SCEC also has a scaling step there).

modules/industry/industry.smk

brynpickering · 2024-05-29T16:37:27Z

modules/industry/industry.smk

    conda: CONDA_PATH
    params:
+        config_params = config["params"],


This is a moment to separate out params. Pass specific-industries and other to the script individually so that e.g. steel param changes don't re-trigger this rule.
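The reasoning can be illustrated with a small sketch: if a rule receives the whole params dictionary, any change to it looks like a change to that rule, whereas passing only the relevant entries isolates it from unrelated edits. The fingerprinting below is just a stand-in for whatever change detection the workflow manager applies, and all names and values are hypothetical.

```python
import copy
import hashlib
import json


def fingerprint(params):
    """Return a stable hash of a JSON-serialisable parameter structure."""
    payload = json.dumps(params, sort_keys=True).encode()
    return hashlib.sha256(payload).hexdigest()


def other_rule_params(params):
    """Select only the entries the "other" industry rule actually needs."""
    return {
        "specific-industries": params["specific-industries"],
        "other": params["other"],
    }


config_params = {
    "steel": {"recycled-steel-share": 0.5},
    "specific-industries": ["Iron and steel", "Chemicals Industry"],
    "other": {"final-energy-method": "priority"},
}

# Change a steel-only parameter.
changed = copy.deepcopy(config_params)
changed["steel"]["recycled-steel-share"] = 0.6

whole_changed = fingerprint(config_params) != fingerprint(changed)
subset_changed = fingerprint(other_rule_params(config_params)) != fingerprint(
    other_rule_params(changed)
)
```

Here `whole_changed` is true while `subset_changed` is false, which is why passing the entries individually avoids needless re-runs.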

brynpickering · 2024-05-29T16:38:32Z

modules/industry/industry.smk

    input:
-    output: f"{BUILD_PATH}/other_industry.csv"
+        path_energy_balances = config["inputs"]["path-energy-balances"],


Since inputs are all paths, I tend to prefer not prepending with path_ here and path- in the config.

I'd like to keep path to avoid confusion between variables holding data and those holding strings.
A bit more explicit.

but if all inputs hold strings...? How about config["input-paths"]["energy-balances"] etc?

brynpickering · 2024-05-29T16:44:30Z

modules/industry/scripts/other_industry.py

+    path_jrc_industry_production: str,
+    path_output: Optional[str] = None,
+) -> xr.DataArray:
+    """Execute the default data processing pipeline all non-specific industries.


Suggested change

"""Execute the default data processing pipeline all non-specific industries.

"""Merge all industries not selected for individual processing into a single `other` subsector using a default data processing pipeline.

brynpickering · 2024-05-29T16:45:30Z

modules/industry/scripts/other_industry.py

+    # Process data:
+    # Extract useful dem. -> remove useful dem. from rest -> extract final dem.
+    selected_useful = config_params["other"]["useful-demands"]
+    other_useful_demand = jrc.convert_subsec_demand_to_carrier(


subsec -> subsector. Not worth the lack of readability caused by removal of letters.

brynpickering · 2024-05-29T16:46:04Z

modules/industry/scripts/other_industry.py

+
+    final_method = config_params["other"]["final-energy-method"]
+    jrc_energy = jrc_energy.drop_sel(subsection=selected_useful)
+    if final_method == "priority":


Are there other methods on the near-term horizon? If not, it doesn't seem worth introducing this feature.

I added another method that just keeps all the final demand without assumptions. There are probably better methods to process this part, but these two are good enough for now.

I changed the check to a match-case style so this can grow over time.

brynpickering · 2024-05-29T16:56:41Z

modules/industry/scripts/other_industry.py

+    )
+
+    # Fix the naming
+    for carrier in JRC_TO_CALLIOPE:


This is quite verbose. Something like:

other_demand.coords["carrier_name"] = other_demand["carrier_name"].to_series().rename(index=JRC_TO_CALLIOPE).index

I've actually removed the renaming step from this file. It makes more sense to do it once we merge all category files into one and re-scale it (if necessary).

brynpickering · 2024-05-29T16:57:23Z

modules/industry/scripts/other_industry.py

+def transform_final_demand_by_priority(
+    jrc_energy: xr.Dataset, carrier_priority: list[str]
+) -> xr.DataArray:
+    """Transform final demand of all sectors by giving priority to certain carriers.


Changed to "category" (see prev. comment).

irm-codebase · 2024-06-11T14:50:09Z

@brynpickering I've implemented your comments, and a couple of extras.

The biggest updates are:

improved names of stuff to reduce ambiguity
standardized naming from "sector" to "category" to match how JRC names things
carrier names are no longer modified (better do it when aggregating/rescaling files)
added an additional processing option for final energy demand

brynpickering

I'm almost happy for this one to go through! Just a couple of minor comments on naming and unit checking.

brynpickering · 2024-06-26T06:58:44Z

modules/industry/config.yaml

@@ -9,5 +9,10 @@ industry:
        placeholder-out1:
        placeholder-out2:
    params:
-        steel:
+        non-generic-categories: ["Iron and steel", "Chemicals Industry"]


I'm still concerned that these will cause confusion down the line. How about "discrete-categories" and "merged-categories"?

what about "explicit-categories" or "explicit-subsectors" or even "explicitly-modelled-subsectors"?

I thought of "explicit" initially too. I then had a look in the dictionary and decided it wasn't quite right (it's more about leaving nothing implied). I'm not sure that "discrete" is right either. "separate"? "independent"?

Hmmm... I also struggled with names. A good name has to do two things:

Specify that these are specific / individual to a given category.

Imply that all the rest will go to the "other" / generic / merged category.

Here are words with the antonym, based on your comments.

Specific / generic categories

Separate / combined categories < --- I like this one the best.

Independent / dependent categories

Explicit / implicit categories

Discrete / joined categories

I suggest "separate-category-processing" and "combined-category-processing".

I personally think "subsectors" is clearer, because "category" is generic. "category" is also an implementation detail: should we move to a different data source than JRC, that name would change. Anyways, "category" is fine by me.

I like the "separate" and "combined" distinction!

@timtroendle: I also do not like category, but keeping the code close to the data helps in this case because JRC-IDEES is already hard to wrap your head around due to the number of columns it has (sector/category/subcategory/process/energy type...)

A different data source is going to require a different approach. This (and the JRC module above it) are just very tied to how the data was constructed. I think this is unavoidable for things as heterogeneous as industry.

Agreed, "Separate" / "combined" sounds good!

brynpickering · 2024-06-26T06:58:55Z

modules/industry/config.yaml

            recycled-steel-share: 0.5  # % of recycled scrap steel for H-DRI
+        generic-config:


"merged-categories-config"?

brynpickering · 2024-06-26T07:00:28Z

modules/industry/industry.smk

    input:
-    output: f"{BUILD_PATH}/other_industry.csv"
+        path_energy_balances = config["inputs"]["path-energy-balances"],


but if all inputs hold strings...? How about config["input-paths"]["energy-balances"] etc?

brynpickering · 2024-06-26T07:01:10Z

modules/industry/industry.smk

-    output: f"{BUILD_PATH}/other_industry.csv"
-    script: f"{SCRIPT_PATH}/other_industry.py"
+        path_output = f"{BUILD_PATH}/annual_demand_generic.nc"
+    script: f"{SCRIPT_PATH}/generic_processing.py"


"merged_category_processing.py"? I would always have category added to whatever adjective we choose!

Agree. Why not use "subsector" instead of "category"?

@irm-codebase is using "category" as it aligns with the JRC-IDEES naming convention. I don't really mind if it's subsector or category, so long as it remains the same throughout the whole submodule.

@timtroendle: Bryn is right on this one. Category is not a word we use often, but it makes sense in the context of JRC-IDEES data. I'd like to keep it that way.

Will it become subsector at some point in the module, as it gets closer to being a Calliope model input?

that would make sense! the idea is:
jrc categories -> specific/combined scripts -> combine/scale/disaggregate scripts -> calliope subsector

brynpickering · 2024-06-26T07:02:03Z

modules/industry/scripts/generic_processing.py

+    jrc_prod = jrc_prod.drop_sel(cat_name=non_generic_categories)
+
+    # Process data:
+    # Extract useful dem. -> remove useful dem. from rest -> extract final dem.


dem -> demand

brynpickering · 2024-06-26T07:02:27Z

modules/industry/scripts/generic_processing.py

+            other_final_demand = transform_final_demand_by_priority(
+                jrc_energy, generic_config["final-energy-carriers"]
+            )
+        case "keep everything":


I assume this doesn't lead to any double counting?

An assert statement would be helpful.

@brynpickering it shouldn't. It's basically "assume nothing, do nothing with the data and just give me the final demand".

If a useful demand is requested, the lines above should remove it. So no double counting is possible.

brynpickering · 2024-06-26T07:03:57Z

modules/industry/scripts/utils/filling.py

@@ -90,8 +81,11 @@ def fill_missing_countries_years(
    _to_fill = _to_fill.bfill(dim="year")
    all_filled = _to_fill.ffill(dim="year")

-    all_filled = jrc.ensure_standard_coordinates(all_filled)
-    all_filled = all_filled.assign_attrs(units="twh")
+    # TODO: CHE has no values for "Wood and wood products" and "Transport Equipment".


Yeah, we don't have much choice on that. I assume they are in "other industry" in the CHE data so we can't extract them.

brynpickering · 2024-06-26T07:06:26Z

modules/industry/scripts/utils/jrc_idees_parser.py

@@ -64,16 +92,15 @@ def get_subsection_final_intensity(
    final_intensity = useful_intensity / carrier_eff

    # Prettify
-    final_intensity = ensure_standard_coordinates(final_intensity)
-    final_intensity = final_intensity.assign_attrs(units="twh/kt")
+    final_intensity = standardize(final_intensity, "twh/kt", name="final_intensity")


Is it worth checking the unit of the incoming data before setting this to twh/kt? I.e., useful_demand should have the unit twh.

I'm actually setting it because xarray, for some horrible reason, decided to actually delete attributes by default.
The library is limited to handling only twh as inputs.

I'll add a check at the start to ensure this is specified.

Another option would be to use pint (https://xarray.dev/blog/introducing-pint-xarray), but without standardising this on the JRC side there is not much benefit...

I'm actually setting it because xarray, for some horrible reason, decided to actually delete attributes by default.

At what point in the process?

Operations in general will delete xarray attributes: https://docs.xarray.dev/en/stable/getting-started-guide/faq.html#what-is-your-approach-to-metadata

So, basically everywhere?

I can change this in the xarray options, but I do not think it's a good idea unless we implement an actual unit checker. Better to lose them and mindfully set them than risk leaving invalid attributes during our operations.

brynpickering · 2024-06-26T07:06:33Z

modules/industry/scripts/utils/jrc_idees_parser.py

-    useful_intensity = ensure_standard_coordinates(useful_intensity)
-    useful_intensity = useful_intensity.assign_attrs(units="twh/kt")
-    useful_intensity.name = "useful_intensity"
+    useful_intensity = standardize(useful_intensity, "twh/kt", name="useful_intensity")


same as above RE checking unit

timtroendle

Looks good to me overall. I have a few minor comments.

timtroendle · 2024-06-26T07:40:15Z

CHANGELOG.md

@@ -4,7 +4,7 @@

 ### Added (models)

-* **ADD** industry module and steel industry energy demand processing. NOT CONNECTED TO THE MAIN WORKFLOW. Industry sectors pending: chemical, "other". (Fixes #308, #310, #347, #345 and #346)
+* **ADD** industry module and steel industry energy demand processing. NOT CONNECTED TO THE MAIN WORKFLOW. Industry sectors pending: chemical. (Fixes #308, #309, #310, #347, #345 and #346)


Should this be "ADD industry module including steel and other industry energy demand"?

Updated it with a simpler message. I'll add chemical industry once that is done.

timtroendle · 2024-06-26T07:41:24Z

modules/industry/config.yaml

@@ -9,5 +9,10 @@ industry:
        placeholder-out1:
        placeholder-out2:
    params:
-        steel:
+        non-generic-categories: ["Iron and steel", "Chemicals Industry"]


what about "explicit-categories" or "explicit-subsectors" or even "explicitly-modelled-subsectors"?

timtroendle · 2024-06-26T07:50:06Z

modules/industry/industry.smk

@@ -13,36 +13,46 @@ validate(config, "./schema.yaml")

 # Ensure rules are defined in order.
 # Otherwise commands like "rules.rulename.output" won't work!
-rule steel_industry:
-    message: "Calculate energy demand for the 'Iron and steel' sector in JRC-IDEES."
+if "Iron and steel" in config["params"]["non-generic-categories"]:


The condition doesn't seem necessary to me. Instead of making the rule conditional, make the inputs to a rule downstream conditional.

So, if "iron and steel" is in the config, then a downstream rule will require the file f"{BUILD_PATH}/annual_demand_steel.nc". You will need that anyways, right? Would be good to see how this is integrated eventually.

timtroendle · 2024-06-26T07:50:47Z

modules/industry/industry.smk

+            path_output = f"{BUILD_PATH}/annual_demand_steel.nc"
+        script: f"{SCRIPT_PATH}/steel_processing.py"
+
+if "Chemicals Industry" in config["params"]["non-generic-categories"]:


timtroendle · 2024-06-26T07:52:57Z

modules/industry/industry.smk

-    output: f"{BUILD_PATH}/other_industry.csv"
-    script: f"{SCRIPT_PATH}/other_industry.py"
+        path_output = f"{BUILD_PATH}/annual_demand_generic.nc"
+    script: f"{SCRIPT_PATH}/generic_processing.py"


Agree. Why not use "subsector" instead of "category"?

timtroendle · 2024-06-26T09:41:49Z

modules/industry/scripts/utils/jrc_idees_parser.py

+    carrier_eff = carrier_tot["useful"] / carrier_tot["final"]
+
+    # Fill NaNs (where there is demand, but no consumption in that country)
+    # First by country avg. (all years), then by year avg. (all countries).


This comment should have an ASSUME statement so we can find it.

Done. I've also added it to other parts.
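To spell out the two-step assumption in the code comment, here is a pure-Python sketch of the fill order: first each country's average over all years, then each year's average over all countries. It uses a dictionary keyed by (country, year) instead of xarray, and `fill_efficiency` is a hypothetical name.

```python
from statistics import mean


def _mean_of_known(values):
    """Average the non-missing values, or return None if there are none."""
    known = [v for v in values if v is not None]
    return mean(known) if known else None


def fill_efficiency(eff):
    """Fill gaps in a complete (country, year) grid, where None marks missing data."""
    countries = sorted({country for country, _ in eff})
    years = sorted({year for _, year in eff})

    # ASSUME: gaps first take the country's own average over all years.
    country_avg = {c: _mean_of_known(eff[c, y] for y in years) for c in countries}
    step1 = {
        (c, y): eff[c, y] if eff[c, y] is not None else country_avg[c]
        for c in countries
        for y in years
    }

    # ASSUME: remaining gaps take that year's average over all countries.
    year_avg = {y: _mean_of_known(step1[c, y] for c in countries) for y in years}
    return {
        key: value if value is not None else year_avg[key[1]]
        for key, value in step1.items()
    }


efficiency = {
    ("DEU", 2014): 0.8,
    ("DEU", 2015): None,
    ("FRA", 2014): None,
    ("FRA", 2015): None,
}
filled = fill_efficiency(efficiency)
```

In this example FRA has no data of its own, so it only receives values in the second step, taken from the already filled DEU series.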

timtroendle · 2024-06-26T09:44:35Z

modules/industry/scripts/utils/jrc_idees_parser.py

+    })
+
+    # Prettify
+    new_carrier_useful_dem = standardize(new_carrier_useful_dem, "twh")


Could be inlined with line 188.

timtroendle · 2024-06-26T09:56:41Z

modules/industry/scripts/generic_processing.py

+    jrc_prod = xr.open_dataarray(path_jrc_industry_production)
+
+    # Remove data from all specifically processed industries
+    cat_names_df = cat_names_df[~cat_names_df["jrc_idees"].isin(non_generic_categories)]


Would it make sense to explicitly list the "generic_categories"/"generically_modelled_subsectors" instead of using all that are non_generic?

This would (1) document better which subsectors are included here, (2) safeguard that list against possible changes in the future.

I'd rather avoid that, because it introduces the risk of not processing an added category.
Besides, JRC-IDEES is unfortunately quite old (2015), and this processing is very tied to it. I do not see this as a future issue.

I have my sights on this other dataset: https://iopscience.iop.org/article/10.1088/2753-3751/ad4e39
but it is going to require a different module with its own steps.

timtroendle · 2024-06-26T09:57:43Z

modules/industry/scripts/generic_processing.py

+            other_final_demand = transform_final_demand_by_priority(
+                jrc_energy, generic_config["final-energy-carriers"]
+            )
+        case "keep everything":


An assert statement would be helpful.

timtroendle · 2024-06-26T09:59:52Z

modules/industry/scripts/generic_processing.py

+
+    other_demand = jrc.standardize(other_demand, "twh")
+
+    if path_output:


What is this for? If it's needed, then the type hint of the return type in the function signature must be updated.

If you mean path_output: it is there because the function just returns an xarray object (to aid in testing). The file is only saved optionally.

Otherwise, the function would always return None, making it harder to test.

If it's needed for testing purposes, I'd recommend letting the function return a dataset which is stored somewhere else, as discussed in the last dev call. Otherwise you increase the complexity of the function signature.

I'm a bit puzzled. That is already what this is doing: the return other_demand is an xarray. You could choose to save it or evaluate it directly in a test.

I'll just remove it.

irm-codebase

Let me know if more fixes are needed

irm-codebase · 2024-07-04T12:36:07Z

modules/industry/config.yaml

-        path-carrier-names: config/energy-balances/energy-balance-carrier-names.csv
-        path-jrc-industry-energy: build/data/jrc-idees/industry/processed-energy.nc
-        path-jrc-industry-production: build/data/jrc-idees/industry/processed-production.nc
+    input-paths:


Name change requested by Bryn. This change is also reflected in rule scripts.

irm-codebase · 2024-07-04T12:37:10Z

modules/industry/industry.smk

    conda: CONDA_PATH
    params:
-        config_steel = config["params"]["steel"]
+        non_generic_categories = config["params"]["non-generic-categories"],
+        generic_config = config["params"]["generic-config"],
    input:


This was added to the "combine" rule.

irm-codebase · 2024-07-04T12:38:39Z

modules/industry/industry.smk

-#     output:
-#     script:
+SUFFIXES = [i.lower().replace(" ", "_") for i in config["params"]["specific-categories"]]
+rule combine_and_scale:


This new rule makes sure that both "combined" and "specific" JRC categories are processed.
In the future they will be combined and scaled (chemicals industry must be finished first).
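For clarity, this is what the `SUFFIXES` expression in the diff evaluates to for the two categories used elsewhere in this PR. The config dictionary and the output file pattern in the sketch are assumptions for illustration only.

```python
# Hypothetical config; the key name follows the diff above.
config = {"params": {"specific-categories": ["Iron and steel", "Chemicals Industry"]}}

# Same expression as in the rule file: lower-case and replace spaces with underscores.
SUFFIXES = [i.lower().replace(" ", "_") for i in config["params"]["specific-categories"]]

# Assumed file name pattern, shown only to illustrate how the suffixes could be used.
BUILD_PATH = "build"  # hypothetical value
specific_files = [f"{BUILD_PATH}/annual_demand_{suffix}.nc" for suffix in SUFFIXES]
```

Because the suffixes are derived directly from the category names, the output file names have to match the categories for this rule to find its inputs.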

irm-codebase · 2024-07-04T12:39:18Z

modules/industry/schema.yaml

All the requested name changes show here, too.

irm-codebase · 2024-07-04T12:40:21Z

modules/industry/scripts/generic_processing.py

    # Combine and fill missing countries
    other_demand = xr.concat(
        [other_useful_demand, other_final_demand], dim="carrier_name"
    )

+    assert other_demand.sum() < jrc_energy["final"].sum(), "Potential double counting!"


Simple assert to avoid double counting.

Shouldn't this be <=?

You are right! I'll make a quick update.
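The difference matters in the boundary case. As a sketch with plain floats (the function name and the numbers are made up): if no useful demand is extracted and everything is kept, the combined total equals the original final demand, so a strict `<` would fail even though nothing was double counted.

```python
def assert_no_double_counting(combined_total, final_total):
    """Raise if the processed demand exceeds the final demand it was derived from."""
    assert combined_total <= final_total, "Potential double counting!"


final_total = 12.5  # made-up total final demand
combined_total = 12.5  # "keep everything" with no useful demand extracted

# Passes with <=; a strict < would raise here.
assert_no_double_counting(combined_total, final_total)
strict_would_fail = not (combined_total < final_total)
```

With real floating point data, a small tolerance may also be worth considering, since sums computed in a different order can differ in the last digits.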

irm-codebase · 2024-07-04T12:41:49Z

modules/industry/scripts/utils/jrc_idees_parser.py

@@ -6,6 +6,13 @@
 STANDARD_COORDS = ["cat_name", "year", "country_code", "carrier_name"]


+def check_units(jrc_energy: xr.Dataset, jrc_prod: xr.DataArray) -> None:


New method to ensure JRC data has the right format.
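A possible shape for such a check is sketched below. It is not the implementation in this PR: it only relies on a dict-like `.attrs` attribute, so it runs here on stand-in objects, and the assumption that production data is labelled `kt` is inferred from the `twh/kt` intensity unit rather than stated anywhere in the thread.

```python
from types import SimpleNamespace


def check_units_sketch(jrc_energy, jrc_prod):
    """Raise if the inputs do not declare the units the processing assumes."""
    expected = (
        ("jrc_energy", jrc_energy, "twh"),
        ("jrc_prod", jrc_prod, "kt"),  # ASSUME: production is given in kt
    )
    for name, data, unit in expected:
        found = str(data.attrs.get("units", "")).lower()
        if found != unit:
            raise ValueError(f"{name} must be in {unit!r}, found {found!r}")


# Stand-ins for xarray objects; only the .attrs mapping is used.
energy = SimpleNamespace(attrs={"units": "twh"})
production = SimpleNamespace(attrs={"units": "kt"})
check_units_sketch(energy, production)
```

Running the check once at the start makes it safer to set the units explicitly after operations that drop attributes.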

irm-codebase · 2024-07-04T12:42:12Z

modules/industry/scripts/utils/jrc_idees_parser.py

+    carrier_eff = carrier_tot["useful"] / carrier_tot["final"]
+
+    # Fill NaNs (where there is demand, but no consumption in that country)
+    # First by country avg. (all years), then by year avg. (all countries).


Done. I've also added it to other parts.

timtroendle · 2024-07-05T13:18:19Z

modules/industry/industry.smk

    conda: CONDA_PATH
    params:
-        config_steel = config["params"]["steel"]
+        non_generic_categories = config["params"]["non-generic-categories"],
+        generic_config = config["params"]["generic-config"],
    input:


For the sake of being explicit rather than implicit, I would add them here, too.

timtroendle · 2024-07-05T13:20:40Z

modules/industry/industry.smk

    conda: CONDA_PATH
    params:
-        config_steel = config["params"]["steel"]
+        non_generic_categories = config["params"]["non-generic-categories"],
+        generic_config = config["params"]["generic-config"],
    input:


Also, the "combine" rule is downstream of this, so this won't work, right? I would definitely make the dependencies explicit, then you do not even have to think about this at all.

timtroendle · 2024-07-05T13:22:26Z

modules/industry/scripts/generic_processing.py

    # Combine and fill missing countries
    other_demand = xr.concat(
        [other_useful_demand, other_final_demand], dim="carrier_name"
    )

+    assert other_demand.sum() < jrc_energy["final"].sum(), "Potential double counting!"


Shouldn't this be <=?

irm-codebase · 2024-07-05T17:32:17Z

@timtroendle quick fix for the assert case you mentioned. I also updated the name of the files, to make them match the category (did not do it before to avoid breaking the review relations).

The aggregated rule runs with no problems!

irm-codebase · 2024-07-05T17:32:58Z

No idea why the docs are not working. It's not related to my changes, from what I can see.

tud-mchen6 · 2024-07-25T11:25:39Z

modules/industry/scripts/utils/jrc_idees_parser.py

+    carrier_eff = carrier_eff.fillna(carrier_eff.mean(dim="year"))
+    carrier_eff = carrier_eff.fillna(carrier_eff.mean(dim="country_code"))
+
+    carrier_final_demand = useful_dem_tot / carrier_eff


This basically means [all carriers demand added together] divided by [specific carrier efficiency]? But this formula only works for the entries that relate to the defined carrier and any other entries with different carriers will become null.
This then becomes an issue for line 165. useful_dem_tot.sum() will likely be larger than carrier_final_demand.sum() because the previous one contains demand of all carriers, while the latter only 'transformed' demand of one carrier. At least that is what I get from processing chemicals industry.

I must admit that I found this function very confusing when converting it from SCEC's code. @brynpickering please correct me if I say something wrong here.

This basically means [all carriers demand added together] divided by [specific carrier efficiency]? But this formula only works for the entries that relate to the defined carrier and any other entries with different carriers will become null.

This is intended. Basically, if a carrier was never used to meet a specific useful demand, its efficiency (carrier_eff) should be full of nan values.

This then becomes an issue for line 165. useful_dem_tot.sum() will likely be larger than carrier_final_demand.sum() because the previous one contains demand of all carriers, while the latter only 'transformed' demand of one carrier. At least that is what I get from processing chemicals industry.

I see your point. How about this?

assert jrc_energy["useful"].sel(carrier_name=carrier) < carrier_final_demand.sum(), "Creating energy!"

@tud-mchen6 that being said, I think this function is only used in the "other"/combined industries processing, even in the old SCEC code. Just please make sure that this is the actual function you wish to use.

@tud-mchen6 that being said, I think this function is only used in the "other"/combined industries processing, even in the old SCEC code. Just please make sure that this is the actual function you wish to use.

As I understand, that is not true. The old SCEC code uses electrical_consumption in the function get_chem_energy_consumption, and electrical_consumption is generated by the function get_carrier_demand, which is a similar version to get_carrier_demand in jrc_idees_parser.py in the old industry module.

I must admit that I found this function very confusing when converting it from SCEC's code. @brynpickering please correct me if I say something wrong here.

This basically means [all carriers demand added together] divided by [specific carrier efficiency]? But this formula only works for the entries that relate to the defined carrier and any other entries with different carriers will become null.

This is intended. Basically, if a carrier was never used to meet a specific useful demand, its efficiency (carrier_eff) should be full of nan values.

This then becomes an issue for line 165. useful_dem_tot.sum() will likely be larger than carrier_final_demand.sum() because the previous one contains demand of all carriers, while the latter only 'transformed' demand of one carrier. At least that is what I get from processing chemicals industry.

I see your point. How about this?

assert jrc_energy["useful"].sel(carrier_name=carrier) < carrier_final_demand.sum(), "Creating energy!"

Your suggestion at the end looks good, except that you may need to add a .sum() to the left hand side?
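To make the comparison concrete, here is a small pure-Python sketch of the check with both sides summed. The dictionaries stand in for xarray objects, the numbers are invented, and the helper names are hypothetical. Missing efficiencies are kept as None and skipped when summing, in the same way that xarray skips NaN by default.

```python
def carrier_final_demand(useful_demand, carrier_eff):
    """Convert useful demand to final demand per (country, year) entry."""
    return {
        key: None if carrier_eff.get(key) is None else value / carrier_eff[key]
        for key, value in useful_demand.items()
    }


def total(values):
    """Sum while skipping missing entries."""
    return sum(v for v in values if v is not None)


# Made-up useful demand met by one carrier, and that carrier's efficiency.
useful = {("DEU", 2015): 8.0, ("FRA", 2015): 3.0}
efficiency = {("DEU", 2015): 0.5, ("FRA", 2015): 0.25}

final = carrier_final_demand(useful, efficiency)

# With efficiencies below 1, final demand must exceed the useful demand it serves.
assert total(useful.values()) < total(final.values()), "Creating energy!"
```

The comparison only makes sense if the useful demand on the left covers the same carrier as the final demand on the right, which is the concern raised at the start of this thread.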

Ivan Ruiz Manuel and others added 11 commits April 10, 2024 15:47

Initial addition of 'other' industries

a2b7d35

Add chemicals industry subsector

547f66d

Merge remote-tracking branch 'ivachen/add-industry-module' into ivach…

3d4c3ca

…en/add-chemical-industry

Pre-commit formatting

0904879

Add to CHANGELOG

1a02a58

Merge changes in the main industry method

bb9b95f

Merge remote-tracking branch 'ivachen/ivachen/add-chemical-industry' …

7acc8c8

…into add-other-industry-demand

Merge remote-tracking branch 'ivachen/add-industry-module' into add-o…

368be03

…ther-industry-demand

fix configuration file merge conflicts

657b50f

Fixed other industry, now it is working

34f5c45

Add to CHANGELOG

bffdfbf

tud-mchen6 requested a review from brynpickering April 11, 2024 13:50

brynpickering reviewed Apr 12, 2024

brynpickering mentioned this pull request Apr 12, 2024

Add chemicals industry (old) #348

Closed

5 tasks

irm-codebase assigned irm-codebase and tud-mchen6 Apr 16, 2024

irm-codebase added the Industry Industrial energy demand label Apr 23, 2024

irm-codebase added 2 commits April 23, 2024 17:06

Merge branch 'add-industry-module' into add-other-industry-demand

16b5995

Convert to xarray and add flexible config.

e23a4a4

irm-codebase mentioned this pull request Apr 28, 2024

Add industry JRC data processing #354

Merged

5 tasks

irm-codebase added 2 commits May 14, 2024 18:50

Merge remote-tracking branch 'origin/develop' into add-other-industry…

dd446c2

…-demand

Update JRC processing. Add general industry module improvements.

e6068a0

irm-codebase reviewed May 15, 2024

brynpickering requested changes May 30, 2024

irm-codebase added 2 commits June 11, 2024 11:48

Merge remote-tracking branch 'origin/develop' into add-other-industry…

7f0165c

…-demand

Improve configuration naming, improve generic processing

67bcc6d

Fixed script names

56306e0

irm-codebase requested a review from brynpickering June 11, 2024 14:50

Naming fixes

de2ba7f

irm-codebase requested a review from timtroendle June 25, 2024 08:52

brynpickering reviewed Jun 26, 2024

timtroendle requested changes Jun 26, 2024

irm-codebase added 2 commits July 4, 2024 12:00

PR fixes: better naming, better rules, more assert checks

f734975

ensure scripts are re-run for utils changes

18fd2c9

irm-codebase requested review from brynpickering and timtroendle July 4, 2024 12:35

irm-codebase reviewed Jul 4, 2024

timtroendle approved these changes Jul 5, 2024

Small assert fix, updated filenames

1fdc6ea

brynpickering approved these changes Jul 9, 2024

brynpickering merged commit 5ad4e0f into calliope-project:develop Jul 9, 2024
4 checks passed

tud-mchen6 commented Jul 25, 2024

irm-codebase mentioned this pull request Jul 25, 2024

Add chemicals industry (new) #418

Open

5 tasks

	"""Execute the default data processing pipeline all non-specific industries.
	"""Merge all industries not selected for individual processing into a single `other` subsector using a default data processing pipeline.

		recycled-steel-share: 0.5 # % of recycled scrap steel for H-DRI
		generic-config:


		other_demand = jrc.standardize(other_demand, "twh")

		if path_output:

		@@ -6,6 +6,13 @@
		STANDARD_COORDS = ["cat_name", "year", "country_code", "carrier_name"]


		def check_units(jrc_energy: xr.Dataset, jrc_prod: xr.DataArray) -> None:

tud-mchen6 commented Apr 11, 2024

tud-mchen6 commented Apr 16, 2024

irm-codebase commented Apr 27, 2024

brynpickering commented May 14, 2024

irm-codebase commented Jun 11, 2024 (edited)

irm-codebase commented Jun 11, 2024

irm-codebase commented Jun 30, 2024 (edited)
