Merge pull request #221 from mir-group/develop
0.5.5
Linux-cpp-lisp authored Jun 20, 2022
2 parents 9bd9e30 + 9d6dfe0 commit 41d6b2d
Showing 46 changed files with 1,325 additions and 303 deletions.
2 changes: 1 addition & 1 deletion .github/workflows/tests.yml
@@ -16,7 +16,7 @@ jobs:
strategy:
matrix:
python-version: [3.7, 3.9]
torch-version: [1.8.0, 1.11.0]
torch-version: [1.10.0, 1.11.0]

steps:
- uses: actions/checkout@v2
30 changes: 29 additions & 1 deletion CHANGELOG.md
@@ -7,7 +7,35 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
Most recent change on the bottom.


## [Unreleased] - 0.5.4
## [Unreleased] - 0.5.6

## [0.5.5] - 2022-06-20
### Added
- BETA! Support for stress in training and inference
- `EMTTestDataset` for quick synthetic fake PBC data
- multiprocessing for ASE dataset loading/processing
- `nequip-benchmark` times dataset loading, model creation, and compilation
- `validation_batch_size`
- support multiple metrics on same field with different `functional`s
- allow custom metrics names
- allow `e3nn==0.5.0`
- `--verbose` option to `nequip-deploy`
- print data statistics in `nequip-benchmark`
- `normalized_sum` reduction in `AtomwiseReduce`

### Changed
- abbreviate `node_features`->`h` in loss titles
- failure of permutation equivariance tests no longer short-circuits o3 equivariance tests
- `NequIPCalculator` now stores all relevant properties computed by the model, regardless of the requested `properties`, and no longer tries to access properties the model did not compute; this allows models that compute only energy or only forces, but not both

### Fixed
- Equivariance testing correctly handles output cells
- Equivariance testing correctly handles one-node or one-edge data
- `report_init_validation` now runs on validation set instead of training set
- crash when unable to find `os.sched_getaffinity` on some systems
- don't incorrectly log per-species scales/shifts when loading model (such as for deployment)
- `nequip-benchmark` now picks data frames deterministically
- useful error message for `metrics_key: training_*` with `report_init_validation: True` (#213)

## [0.5.4] - 2022-04-12
### Added
2 changes: 1 addition & 1 deletion README.md
@@ -135,7 +135,7 @@ For installation instructions, please see the [`pair_nequip` repository](https:/

The theory behind NequIP is described in our preprint (1). NequIP's backend builds on e3nn, a general framework for building E(3)-equivariant neural networks (2). If you use this repository in your work, please consider citing NequIP (1) and e3nn (3):

1. https://arxiv.org/abs/2101.03164
1. https://www.nature.com/articles/s41467-022-29939-5
2. https://e3nn.org
3. https://doi.org/10.5281/zenodo.3724963

3 changes: 2 additions & 1 deletion configs/example.yaml
@@ -78,7 +78,8 @@ n_train: 100
n_val: 50 # number of validation data
learning_rate: 0.005 # learning rate, we found values between 0.01 and 0.005 to work best - this is often one of the most important hyperparameters to tune
batch_size: 5 # batch size, we found it important to keep this small for most applications including forces (1-5); for energy-only training, higher batch sizes work better
max_epochs: 100 # stop training after _ number of epochs, we set a small number here to have an example that finished within a few minutes, but in practice we recommend using a very large number, as e.g. 1million and then to just use early stopping and not train the full number of epochs
validation_batch_size: 10 # batch size for evaluating the model during validation. This does not affect the training results, but using the highest value possible (<=n_val) without running out of memory will speed up your training.
max_epochs: 100000 # stop training after _ number of epochs; we set a very large number (e.g. 1 million) and rely on early stopping rather than training for the full number of epochs
train_val_split: random # can be random or sequential. if sequential, first n_train elements are training, next n_val are val, else random, usually random is the right choice
shuffle: true # if true, the data loader will shuffle the data, usually a good idea
metrics_key: validation_loss # metrics used for scheduling and saving best model. Options: `set`_`quantity`, set can be either "train" or "validation", "quantity" can be loss or anything that appears in the validation batch step header, such as f_mae, f_rmse, e_mae, e_rmse
1 change: 1 addition & 0 deletions configs/full.yaml
@@ -165,6 +165,7 @@ n_train: 100
n_val: 50 # number of validation data
learning_rate: 0.005 # learning rate, we found values between 0.01 and 0.005 to work best - this is often one of the most important hyperparameters to tune
batch_size: 5 # batch size, we found it important to keep this small for most applications including forces (1-5); for energy-only training, higher batch sizes work better
validation_batch_size: 10 # batch size for evaluating the model during validation. This does not affect the training results, but using the highest value possible (<=n_val) without running out of memory will speed up your training.
max_epochs: 100000 # stop training after _ number of epochs, we set a very large number here, it won't take this long in practice and we will use early stopping instead
train_val_split: random # can be random or sequential. if sequential, first n_train elements are training, next n_val are val, else random, usually random is the right choice
shuffle: true # If true, the data loader will shuffle the data, usually a good idea
1 change: 1 addition & 0 deletions configs/minimal.yaml
@@ -46,6 +46,7 @@ wandb: false
n_train: 5
n_val: 5
batch_size: 1
validation_batch_size: 5
max_epochs: 10

# loss function
58 changes: 58 additions & 0 deletions configs/minimal_stress.yaml
@@ -0,0 +1,58 @@
# general
root: results/w-14
run_name: minimal
seed: 123
dataset_seed: 456

# network
model_builders:
- SimpleIrrepsConfig
- EnergyModel
- PerSpeciesRescale
- StressForceOutput
- RescaleEnergyEtc

num_basis: 8
r_max: 4.0
l_max: 2
parity: true
num_features: 16

# data set
dataset: ase # type of data set, can be npz or ase
dataset_url: https://qmml.org/Datasets/w-14.zip # url to download the npz. optional
dataset_file_name: ./benchmark_data/w-14.xyz # path to data set file
dataset_key_mapping:
force: forces
dataset_include_keys:
- virial
# A mapping of chemical species to type indexes is necessary if the dataset is provided with atomic numbers instead of type indexes.
chemical_symbols:
- W
# only early frames have stress
dataset_include_frames: !!python/object/apply:builtins.range
- 0
- 100
- 1

global_rescale_scale: dataset_total_energy_std
per_species_rescale_shifts: dataset_per_atom_total_energy_mean
per_species_rescale_scales: dataset_per_atom_total_energy_std

# logging
wandb: false
# verbose: debug

# training
n_train: 90
n_val: 10
batch_size: 1
max_epochs: 10

# loss function
loss_coeffs:
- virial
- forces

# optimizer
optimizer_name: Adam
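A note on the `dataset_include_frames` entry in the config above: the `!!python/object/apply:builtins.range` tag is PyYAML's syntax for constructing a Python `range` object from the listed arguments, so this entry selects frames 0-99 with stride 1. A minimal sketch of the parsing, assuming the config is read with a PyYAML loader that permits Python object construction:

```python
import yaml

snippet = """
dataset_include_frames: !!python/object/apply:builtins.range
- 0
- 100
- 1
"""
# The python/object/apply tag requires a loader that allows arbitrary object
# construction (PyYAML's safe loader rejects it).
cfg = yaml.unsafe_load(snippet)
print(cfg["dataset_include_frames"])            # range(0, 100)
print(list(cfg["dataset_include_frames"])[:3])  # [0, 1, 2]
```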
46 changes: 46 additions & 0 deletions configs/minimal_toy_emt.yaml
@@ -0,0 +1,46 @@
# general
root: results/toy-emt
run_name: minimal
seed: 123
dataset_seed: 456

# network
model_builders:
- EnergyModel
- PerSpeciesRescale
- StressForceOutput
- RescaleEnergyEtc
num_basis: 8
r_max: 4.0
irreps_edge_sh: 0e + 1o
conv_to_output_hidden_irreps_out: 16x0e
feature_irreps_hidden: 16x0o + 16x0e + 16x1o + 16x1e

# data set
dataset: EMTTest # type of data set, can be npz or ase
dataset_element: Cu
dataset_num_frames: 100
chemical_symbols:
- Cu

global_rescale_scale: dataset_total_energy_std
per_species_rescale_shifts: dataset_per_atom_total_energy_mean
per_species_rescale_scales: dataset_per_atom_total_energy_std

# logging
wandb: false
# verbose: debug

# training
n_train: 90
n_val: 10
batch_size: 1
max_epochs: 100

# loss function
loss_coeffs: # different weights to use in a weighted loss function
forces: 1 # for MD applications, we recommend a force weight of 100 and an energy weight of 1
stress: 1

# optimizer
optimizer_name: Adam
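For context on the `EMTTest` dataset used above: ASE ships an effective-medium-theory (EMT) calculator that provides cheap energies, forces, and stresses for a few metals, which is what makes quick synthetic Cu data possible. A rough sketch of that kind of data generation, assuming rattled bulk Cu (the exact procedure inside `EMTTestDataset` may differ):

```python
# A rough sketch of synthetic EMT data generation, assuming rattled bulk Cu;
# the actual procedure inside nequip's EMTTestDataset may differ.
import numpy as np
from ase.build import bulk
from ase.calculators.emt import EMT

rng = np.random.default_rng(456)
frames = []
for _ in range(100):  # dataset_num_frames: 100
    atoms = bulk("Cu", "fcc", cubic=True) * (2, 2, 2)
    atoms.positions += rng.normal(scale=0.05, size=atoms.positions.shape)
    atoms.calc = EMT()
    atoms.get_potential_energy()  # EMT provides energy, forces, and stress
    frames.append(atoms)
```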
116 changes: 116 additions & 0 deletions examples/custom_dataset.py
@@ -0,0 +1,116 @@
from typing import Dict, List, Callable, Union, Optional
import numpy as np
import logging

import torch

from nequip.data import AtomicData
from nequip.utils.savenload import atomic_write
from nequip.data.transforms import TypeMapper
from nequip.data import AtomicDataset


class ExampleCustomDataset(AtomicDataset):
"""
See https://pytorch-geometric.readthedocs.io/en/latest/notes/create_dataset.html#creating-larger-datasets.
If you don't need downloading or pre-processing, just don't define any of the relevant methods/properties.
"""

def __init__(
self,
root: str,
custom_option1,
custom_option2="default",
type_mapper: Optional[TypeMapper] = None,
):
# Initialize the AtomicDataset, which runs .download() (if present) and .process()
# See https://pytorch-geometric.readthedocs.io/en/latest/notes/create_dataset.html#creating-larger-datasets
# This will only run download and preprocessing if cached dataset files aren't found
super().__init__(root=root, type_mapper=type_mapper)

# if the processed paths don't exist, `self.process()` has been called at this point
# (if it is defined)
# but otherwise you need to load the data from the cached pre-processed dir:
if self.mydata is None:
self.mydata = torch.load(self.processed_paths[0])
# if you didn't define `process()`, this is where you would unconditionally load your data.

def len(self) -> int:
"""Return the number of frames in the dataset."""
return 42

@property
def raw_file_names(self) -> List[str]:
"""Return a list of filenames for the raw data.
Need to be simple filenames to be looked for in `self.raw_dir`
"""
return ["data.dat"]

@property
def raw_dir(self) -> str:
return "/path/to/dataset-folder/"

@property
def processed_file_names(self) -> List[str]:
"""Like `self.raw_file_names`, but for the files generated by `self.process()`.
Should not be paths, just file names. These will be stored in `self.processed_dir`,
which is set by NequIP in `AtomicDataset` based on `self.root` and a hash of the
dataset options provided to `__init__`.
"""
return ["processed-data.pth"]

# def download(self):
# """Optional method to download raw data before preprocessing if the `raw_paths` do not exist."""
# pass

def process(self):
# load things from the raw data:
# whatever is appropriate for your format
data = np.load(self.raw_dir + "/" + self.raw_file_names[0])

# if any pre-processing is necessary, do it and cache the results to
# `self.processed_paths` as you defined above:
with atomic_write(self.processed_paths[0], binary=True) as f:
# e.g., anything that takes a file `f` will work
torch.save(data, f)
# ^ use atomic writes to avoid race conditions between
# different trainings that use the same dataset
# since those separate trainings should all produce the same results,
# it doesn't matter if they overwrite each other's cached
# datasets. It only matters that they don't simultaneously try
# to write the _same_ file, corrupting it.

logging.info("Cached processed data to disk")

# optionally, save the processed data on the Dataset object
# to avoid a roundtrip from disk in `__init__` (see above)
self.mydata = data

def get(self, idx: int) -> AtomicData:
"""Return the data frame with a given index as an `AtomicData` object."""
build_an_AtomicData_here = None
return build_an_AtomicData_here

def statistics(
self,
fields: List[Union[str, Callable]],
modes: List[str],
stride: int = 1,
unbiased: bool = True,
kwargs: Optional[Dict[str, dict]] = {},
) -> List[tuple]:
"""Optional method to compute statistics over an entire dataset.
This must correctly handle `self._indices` for subsets!!!
If not provided, options like `avg_num_neighbors: auto`, `per_species_rescale_scales: dataset_*`,
and others that compute dataset statistics will not work. This only needs to support the statistics
modes that are necessary for what you need to run (i.e. if you do not use `dataset_per_species_*`
statistics, you do not need to implement them).
See `AtomicInMemoryDataset` for full documentation and example implementation.
"""
raise NotImplementedError
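To make the `get()` stub above concrete, here is a minimal sketch of one possible implementation, assuming `self.mydata` holds an array of per-frame positions; the `AtomicData.from_points` call and the cutoff value are illustrative choices, not part of the original template:

```python
# A minimal sketch of a possible `get()` body, assuming `self.mydata` is an
# array of shape (n_frames, n_atoms, 3) of positions; the cutoff is illustrative.
def get(self, idx: int) -> AtomicData:
    positions = self.mydata[idx]
    return AtomicData.from_points(
        pos=positions,
        r_max=4.0,  # neighborlist cutoff; should match the model's r_max
        # per-frame fields such as atomic_numbers, total_energy, or forces
        # would be passed as additional keyword arguments here
    )
```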
2 changes: 1 addition & 1 deletion nequip/_version.py
@@ -2,4 +2,4 @@
# See Python packaging guide
# https://packaging.python.org/guides/single-sourcing-package-version/

__version__ = "0.5.4"
__version__ = "0.5.5"
40 changes: 27 additions & 13 deletions nequip/ase/nequip_calculator.py
@@ -4,6 +4,7 @@

import ase.data
from ase.calculators.calculator import Calculator, all_changes
from ase.stress import full_3x3_to_voigt_6_stress

from nequip.data import AtomicData, AtomicDataDict
from nequip.data.transforms import TypeMapper
@@ -24,7 +25,7 @@ class NequIPCalculator(Calculator):
"""

implemented_properties = ["energy", "energies", "forces"]
implemented_properties = ["energy", "energies", "forces", "stress", "free_energy"]

def __init__(
self,
@@ -115,18 +116,18 @@ def calculate(self, atoms=None, properties=["energy"], system_changes=all_change

# predict + extract data
out = self.model(data)
forces = out[AtomicDataDict.FORCE_KEY].detach().cpu().numpy()
energy = (
out[AtomicDataDict.TOTAL_ENERGY_KEY].detach().cpu().numpy().reshape(tuple())
)

# store results
self.results = {
"energy": energy * self.energy_units_to_eV,
# force has units eng / len:
"forces": forces * (self.energy_units_to_eV / self.length_units_to_A),
}

self.results = {}
# only store results the model actually computed to avoid KeyErrors
if AtomicDataDict.TOTAL_ENERGY_KEY in out:
self.results["energy"] = self.energy_units_to_eV * (
out[AtomicDataDict.TOTAL_ENERGY_KEY]
.detach()
.cpu()
.numpy()
.reshape(tuple())
)
# "force consistant" energy
self.results["free_energy"] = self.results["energy"]
if AtomicDataDict.PER_ATOM_ENERGY_KEY in out:
self.results["energies"] = self.energy_units_to_eV * (
out[AtomicDataDict.PER_ATOM_ENERGY_KEY]
@@ -135,3 +136,16 @@ def calculate(self, atoms=None, properties=["energy"], system_changes=all_change
.cpu()
.numpy()
)
if AtomicDataDict.FORCE_KEY in out:
# force has units eng / len:
self.results["forces"] = (
self.energy_units_to_eV / self.length_units_to_A
) * out[AtomicDataDict.FORCE_KEY].detach().cpu().numpy()
if AtomicDataDict.STRESS_KEY in out:
stress = out[AtomicDataDict.STRESS_KEY].detach().cpu().numpy()
stress = stress.reshape(3, 3) * (
self.energy_units_to_eV / self.length_units_to_A**3
)
# ase wants voigt format
stress_voigt = full_3x3_to_voigt_6_stress(stress)
self.results["stress"] = stress_voigt