
Refactor app with new predictor/prescriptor #87

Merged: 25 commits into main from refactor-app, Jun 6, 2024
Conversation

danyoungday (Collaborator):

Deleted the old demo predictor/prescriptor code and updated the app to use the current predictors/prescriptors. Additionally, refactored the torch prescriptor so that the prescriptor object is standalone from the trainer.

Still todo after this PR:

  • Automate data downloading for the app. Currently, spinning up the app requires running the process_data.py script, but that script downloads the entire dataset before processing it. We should upload a demo dataset so that the app doesn't have to do any processing.

  • The predictor list is currently hard-coded; we will need to make it easier to add user models.

  • Prescriptors still follow the ESP convention of a single directory holding many models saved from a training run. Since we want to be able to add heuristics and other users' trained prescriptors, we need to change the scope of the Prescriptor object from a group of prescriptors to a single model that takes context as input and produces actions as output.

@danyoungday danyoungday requested a review from ofrancon May 30, 2024 21:22
@danyoungday danyoungday self-assigned this May 30, 2024
df = pd.read_csv(constants.DATA_FILE_PATH, index_col=constants.INDEX_COLS)
countries_df = regionmask.defined_regions.natural_earth_v5_0_0.countries_110.to_dataframe()
df = pd.read_csv(app_constants.DATA_FILE_PATH, index_col=app_constants.INDEX_COLS)
df.rename(columns={col + ".1": col for col in app_constants.INDEX_COLS}, inplace=True)
danyoungday (Collaborator, Author):

This is a little hacky: since we need time/lat/lon both as context and as indices, we end up with two sets of time/lat/lon columns (time, lat, lon, time.1, lat.1, lon.1) that we need to separate.
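To illustrate where the `.1` columns come from, here is a minimal sketch (the CSV content is hypothetical): pandas de-duplicates repeated header names by appending a `.1` suffix, which the rename then undoes for the non-index copies.

```python
import io
import pandas as pd

# Hypothetical CSV where time/lat/lon appear twice: once for the index, once as context.
csv = io.StringIO(
    "time,lat,lon,time,lat,lon\n"
    "2020,10.0,20.0,2020,10.0,20.0\n"
)

# pandas mangles the repeated headers to time.1/lat.1/lon.1; index_col consumes
# the first set, leaving the ".1" copies as regular columns.
index_cols = ["time", "lat", "lon"]
df = pd.read_csv(csv, index_col=index_cols)

# Rename the ".1" copies back so the context columns have their real names.
df = df.rename(columns={col + ".1": col for col in index_cols})
```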

"hidden_size": 16,
"out_size": len(constants.RECO_COLS)
}
prescriptor = TorchPrescriptor(None, encoder, None, 1, candidate_params)
danyoungday (Collaborator, Author):

This is currently how we handle the prescriptor loading. We save a snapshot of the encoder and hard-code the candidate params (which could also theoretically be snapshotted after training later).
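One way the snapshotting could look, as a hedged sketch (file name and values here are illustrative, not the project's actual code): the trainer writes the candidate params to JSON at the end of a run, and the app reloads them instead of hard-coding.

```python
import json
import os
import tempfile

# Illustrative candidate params; the real values come from the training config.
candidate_params = {"in_size": 12, "hidden_size": 16, "out_size": 8}

def snapshot_params(params: dict, path: str) -> None:
    """Save candidate params next to the model weights after training."""
    with open(path, "w", encoding="utf-8") as f:
        json.dump(params, f, indent=2)

def load_params(path: str) -> dict:
    """Reload the snapshot so the app doesn't need to hard-code the params."""
    with open(path, "r", encoding="utf-8") as f:
        return json.load(f)

# Round-trip through a temporary file.
path = os.path.join(tempfile.mkdtemp(), "candidate_params.json")
snapshot_params(candidate_params, path)
restored = load_params(path)
```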

# If we have no prescription just return an empty chart
if all(slider == 0 for slider in sliders):
return utils.create_treemap(pd.Series([]), type_context=False, year=year)
return utils.create_treemap(pd.Series([], dtype=float), type_context=False, year=year)
danyoungday (Collaborator, Author):

Fixed a warning that the default dtype for an empty Series will change; we now pass dtype=float explicitly.
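A minimal sketch of the fix (the warning itself is pandas-version-dependent; older versions warn that the empty-Series default dtype will change from float64 to object):

```python
import pandas as pd

# Passing dtype explicitly pins the dtype and avoids the deprecation warning
# that older pandas versions emit for pd.Series([]) with no dtype.
empty = pd.Series([], dtype=float)
```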

danyoungday (Collaborator, Author):

Extremely simple data processing script. In the future we may want to handle time/lat/lon being an index and context in a more elegant fashion.

from . import Predictor


class Encoder:
danyoungday (Collaborator, Author):

We don't need this anymore because we have data.ELUCEncoder now

@@ -106,27 +72,17 @@ def create_check_options(values: list) -> list:
"value": val})
return options


def compute_percent_change(context: pd.Series, presc: pd.Series) -> float:
def context_presc_to_df(context: pd.Series, presc: pd.Series) -> pd.DataFrame:
danyoungday (Collaborator, Author):

This is a new function to get the demo working with the new prescriptors. Since the prescriptor wants a context_actions dataframe, we convert the context (from our dataset) and the prescriptions (from the sliders) into a context_actions_df by computing the diff and processing the zero-change columns.
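A rough sketch of that conversion (the column names here are hypothetical placeholders, not the project's actual RECO_COLS/DIFF_RECO_COLS, and the real function may differ in detail):

```python
import pandas as pd

# Hypothetical column names standing in for RECO_COLS / DIFF_RECO_COLS.
RECO_COLS = ["crop", "pastr", "range"]
DIFF_RECO_COLS = [f"{col}_diff" for col in RECO_COLS]

def context_presc_to_df(context: pd.Series, presc: pd.Series) -> pd.DataFrame:
    """Combine a context row and slider prescriptions into a context/actions row."""
    row = context.to_dict()
    for col, diff_col in zip(RECO_COLS, DIFF_RECO_COLS):
        # The action is the prescribed amount minus the current amount;
        # zero-change columns naturally come out as 0.0.
        row[diff_col] = presc[col] - context[col]
    return pd.DataFrame([row])

context = pd.Series({"crop": 0.4, "pastr": 0.3, "range": 0.3})
presc = pd.Series({"crop": 0.2, "pastr": 0.5, "range": 0.3})
context_actions_df = context_presc_to_df(context, presc)
```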

"Primary Vegetation", "primf", "primn",
"Secondary Vegetation", "secdf", "secdn",
"Urban",
"Fields", "pastr", "range"]
parents = ["", title,
title, "Crops", "Crops", "C3", "C3", "C3", "C4", "C4",
title,
danyoungday (Collaborator, Author):

Had to redo the treemap code because we no longer have specific crop types.

plo = px.colors.qualitative.Plotly
dar = px.colors.qualitative.Dark24
# ['crop', 'pastr', 'primf', 'primn', 'range', 'secdf', 'secdn', 'urban', 'nonland']
colors = [plo[4], plo[0], plo[2], dar[14], plo[5], plo[7], dar[2], plo[3], plo[1]]
danyoungday (Collaborator, Author):

Redid colors for treemap because there are no longer crop types

if type_context:
fig.update_layout(showlegend=False)
# To make up for the hidden legend
fig.update_layout(margin={"t": 50, "b": 50, "l": 50, "r": 50})
danyoungday (Collaborator, Author):

Had to remove this code that added margins to the left pie chart: with the extra crop sliders gone, the pie charts are smaller and the margins are no longer necessary.

predictors["Global Random Forest"] = global_rf

return predictors
danyoungday (Collaborator, Author):

This function is currently hard-coded to load these predictors. We could handle this more elegantly, but for now users aren't submitting models, and since we ultimately want to control which models get added rather than adding them automatically, this should be fine.

ofrancon (Member):

Yes it's fine

"""
with open(path, "r", encoding="utf-8") as file:
fields = json.load(file)
return cls(fields)
danyoungday (Collaborator, Author):

Allows us to load an encoder from a fields file. This enables us to use a pretrained one in the demo.
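The classmethod pattern in context, as a self-contained sketch (the class name stands in for data.ELUCEncoder, and the fields structure here is an assumption):

```python
import json
import os
import tempfile

class ELUCEncoder:  # hypothetical stand-in for data.ELUCEncoder
    def __init__(self, fields: dict):
        # fields would hold per-column stats (e.g. min/max) fitted during training
        self.fields = fields

    @classmethod
    def from_json(cls, path: str) -> "ELUCEncoder":
        """Load a pretrained encoder from a saved fields file."""
        with open(path, "r", encoding="utf-8") as file:
            fields = json.load(file)
        return cls(fields)

# Round-trip: write a fields file, then load it like the demo would.
path = os.path.join(tempfile.mkdtemp(), "fields.json")
with open(path, "w", encoding="utf-8") as f:
    json.dump({"crop": {"min": 0.0, "max": 1.0}}, f)
encoder = ELUCEncoder.from_json(path)
```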

ofrancon (Member):

It's important we use the same encoder as the one that was used for training indeed.

danyoungday (Collaborator, Author):

Just some linting changes

danyoungday (Collaborator, Author):

Mostly linting changes but there are some modifications to how candidate ids are saved to align with ESP's method

"parents": self.parents,
"NSGA-II_rank": self.rank, # Named this to match ESP
"distance": self.distance,
}
danyoungday (Collaborator, Author):

Now we save our candidate id as "id" in our logs. I'm not a huge fan of this change, since "id" is a Python builtin so we can't name our variables that, but it matches ESP. It's still nicer than what I had before, which was using gen and id as a double index.

ofrancon (Member):

I don't think we have to match ESP anymore.
distance is the NSGA-II distance though, right?
Maybe "id" could become "cid" for candidate id if needed. But dictionary keys don't conflict with Python builtins, so as long as we don't call our variables id we're fine.
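To illustrate the point about keys versus variable names, a trivial sketch:

```python
# A dict key named "id" never conflicts with the builtin id().
record = {"id": 42, "NSGA-II_rank": 1, "distance": 0.5}
cid = record["id"]  # binding to `cid` (not `id`) keeps the builtin usable

# The builtin is only shadowed if we assign to a variable literally named id,
# e.g. `id = record["id"]`, which we simply avoid.
builtin_still_works = id(record)
```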

danyoungday (Collaborator, Author):

Moved all the training code out of this class into prescriptors/nsga2/trainer.py

@@ -30,7 +30,7 @@
print("Initializing prescription...")
if "seed_dir" in config["evolution_params"].keys():
config["evolution_params"]["seed_dir"] = Path(config["evolution_params"]["seed_dir"])
tp = TorchPrescriptor(
tp = TorchTrainer(
danyoungday (Collaborator, Author):

Uses new trainer object instead of directly training a TorchPrescriptor

danyoungday (Collaborator, Author):

Just cut and pasted from TorchPrescriptor.

danyoungday (Collaborator, Author):

Unit tests for some functions in the app. Still not nearly robust enough.

danyoungday (Collaborator, Author):

Old tests for the demo's compute_percent_change function, now adapted to the new compute_percent_change in the Prescriptor class.

ofrancon (Member) left a review:

lgtm


recursive=y

fail-under=9.0
fail-under=9.65
ofrancon (Member):

Thanks 👍
Little by little we can get to 10.

"""
Main function that loads the data and saves it.
"""
dataset = ELUCData(APP_START_YEAR-1, APP_START_YEAR, 2022)
ofrancon (Member):

It's hard to know, without looking at the code, that the params are start_year, test_year, and end_year. Maybe you can name the params here.
Should 2022 be a constant too? It's hard to know what we're loading here.

danyoungday (Collaborator, Author):

Updated this to be clearer, with comments, and to pass the args in as kwargs.
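The readability fix, sketched with a stub (the parameter names are taken from the review comment, ELUCData here is a stand-in for the real class, and APP_START_YEAR is an illustrative value):

```python
from dataclasses import dataclass

@dataclass
class ELUCData:  # stub standing in for the real dataset class
    start_year: int
    test_year: int
    end_year: int

APP_START_YEAR = 2012  # illustrative; the app defines its own constant

# Passing the years as keyword arguments makes the call self-documenting.
dataset = ELUCData(
    start_year=APP_START_YEAR - 1,
    test_year=APP_START_YEAR,
    end_year=2022,
)
```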

:param presc: Prescribed land use data
:return: Percent land use change
Takes a context with all columns and a presc with RECO_COLS and returns an updated context actions df.
This df takes the difference between the RECO_COLS in presc and context and sets the DIFF_RECO_COLS to that.
ofrancon (Member):

The parameters should still be documented, but we can take care of that later

predictors = {}
nn_path = "danyoung/eluc-global-nn"
ofrancon (Member):

Could move to constants


@@ -17,7 +17,7 @@
"RUS": 643,
"N": 578,
"F": 250,
"J": 388,
# "J": 388,
ofrancon (Member):

What was wrong with "J"?

danyoungday (Collaborator, Author):

There are multiple countries matching "J" in the converter I found. At some point the country-code-to-name conversion scheme may have to be overhauled, because it doesn't work for some countries.



@danyoungday danyoungday merged commit 8a6cfa5 into main Jun 6, 2024
1 check passed
@danyoungday danyoungday deleted the refactor-app branch June 6, 2024 22:45