
Redo experiments with fixed dataset #71

Merged: 11 commits merged into main on May 15, 2024
Conversation

danyoungday (Collaborator):

Previously, the year 2012 appeared in both the train and test sets of the dataset. This PR redoes all the results in the notebooks with the fixed dataset. Additionally, predictor significance is moved from a notebook to a standalone script.

@danyoungday danyoungday added the bug Something isn't working label Mar 13, 2024
@danyoungday danyoungday requested a review from ofrancon March 13, 2024 17:04
@danyoungday danyoungday self-assigned this Mar 13, 2024
  if old_abbrev in MANUAL_MAP.keys() and MANUAL_MAP[old_abbrev] in codes_df["Numeric code"].unique():
-     countries_df.iloc[i]["abbrevs"] = codes_df[codes_df["Numeric code"] == MANUAL_MAP[old_abbrev]]["Alpha-2 code"].iloc[0]
+     countries_df.loc[i, "abbrevs"] = codes_df[codes_df["Numeric code"] == MANUAL_MAP[old_abbrev]]["Alpha-2 code"].iloc[0]
danyoungday (Collaborator, Author) commented:

This was giving a pandas warning because the old chained-assignment pattern will be deprecated soon.
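
For context, a minimal sketch of the difference between the two indexing patterns (the toy frame and values below are made up; only the pattern matters):

import pandas as pd

# Toy stand-in for countries_df; the column name matches the PR, the values do not.
countries_df = pd.DataFrame({"abbrevs": ["XX", "YY"]})

# Old pattern: chained assignment writes into a temporary slice, so pandas emits a
# warning and the original frame may not actually be updated.
countries_df.iloc[0]["abbrevs"] = "AA"

# New pattern: a single .loc call with row label and column writes into the frame directly.
countries_df.loc[0, "abbrevs"] = "AA"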

@@ -169,6 +169,7 @@ def __init__(self, start_year=1851, test_year=2012, end_year=2022, countries=None

         self.train_df = df.loc[start_year:test_year-1]
         self.test_df = df.loc[test_year:end_year-1]
+        assert self.train_df['time'].max() == self.test_df["time"].min() - 1
danyoungday (Collaborator, Author) commented:

Added some assertions to make sure the same mistake doesn't happen again.
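
For illustration, a minimal sketch of how such an assertion guards the split, assuming a frame indexed by year with a "time" column as in the diff above (toy years, not the real dataset):

import pandas as pd

years = list(range(2008, 2016))
df = pd.DataFrame({"time": years}, index=years)
start_year, test_year, end_year = 2008, 2012, 2016

train_df = df.loc[start_year:test_year - 1]   # 2008..2011 (label slicing is inclusive)
test_df = df.loc[test_year:end_year - 1]      # 2012..2015

# Fails loudly if the two sets ever share a year again.
assert train_df["time"].max() == test_df["time"].min() - 1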

danyoungday (Collaborator, Author) commented:

Reran experiments with fixed dataset

danyoungday (Collaborator, Author) commented:

Moved the predictor significance portion of the notebook to its own Python script.
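
The script itself is not shown in this diff. As a rough, hypothetical sketch of what a standalone significance script can look like (synthetic data, made-up feature names, and permutation importance chosen purely as an illustration, not necessarily the method used in this PR):

import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestRegressor
from sklearn.inspection import permutation_importance


def main():
    # Synthetic stand-in data; the real script would load the project's dataset instead.
    rng = np.random.default_rng(0)
    X = pd.DataFrame(rng.normal(size=(500, 4)), columns=["f0", "f1", "f2", "f3"])
    y = 2 * X["f0"] - X["f2"] + rng.normal(scale=0.1, size=500)

    # Fit a forest, then rank features by mean permutation importance.
    model = RandomForestRegressor(n_estimators=100, random_state=0).fit(X, y)
    result = permutation_importance(model, X, y, n_repeats=10, random_state=0)
    for name, score in sorted(zip(X.columns, result.importances_mean), key=lambda t: -t[1]):
        print(f"{name}: {score:.3f}")


if __name__ == "__main__":
    main()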

@ofrancon (Member) left a comment:

lgtm

"metadata": {},
"outputs": [],
"source": [
"# Note: The original paper trains from 1982 onwards but this is too slow and large for the\n",
"# purpose of this example.\n",
"forest.fit(dataset.train_df.loc[2002:][constants.NN_FEATS], dataset.train_df.loc[2002:][\"ELUC\"])\n",
"forest.save(\"predictors/sklearn/trained_models/experiment_rf\")"
"forest_year = 1982\n",
@ofrancon (Member) commented:

In the comment above you say training from 1982 "is too slow and large for the purpose of this example". But now you do it. Should you remove the comment?
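
For reference, one possible shape of the updated cell, assuming forest_year simply parameterizes the training range; the full cell is not visible in this diff, so this is a guess built only from the names that do appear above:

forest_year = 1982
forest.fit(dataset.train_df.loc[forest_year:][constants.NN_FEATS],
           dataset.train_df.loc[forest_year:]["ELUC"])
forest.save("predictors/sklearn/trained_models/experiment_rf")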

]
}
],
"source": [
"forest.load(\"predictors/sklearn/trained_models/experiment_rf\")\n",
"# TODO: I don't think we can possibly load a model this big\n",
"# forest.load(\"predictors/sklearn/trained_models/no_overlap_rf\")\n",
@ofrancon (Member) commented:

You saved the model a few cells above. So it's not too big to be loaded. Should you remove the TODO?

@danyoungday danyoungday merged commit 87f5f97 into main May 15, 2024
1 check passed
@danyoungday danyoungday deleted the redo-results branch May 15, 2024 16:15