Permutation in analytical score #113

Merged
merged 24 commits into from
Jan 8, 2025
439d7d8
Compute log wrapped gaussians.
rousseab Dec 24, 2024
12d7871
Permutation over atomic indices.
rousseab Dec 24, 2024
83e99b8
Code a trivial factorial function to avoid importing something else.
rousseab Dec 24, 2024
58c3734
Improved analytical score network.
rousseab Dec 24, 2024
ecfe3ce
Refactor name of datamodule module.
rousseab Dec 24, 2024
b1f7255
Rename lammps preprocessor code.
rousseab Dec 24, 2024
51a612c
Better name.
rousseab Dec 24, 2024
7f10189
Align test name.
rousseab Dec 24, 2024
d6fc645
Better naming and clean data module instantiation.
rousseab Dec 24, 2024
14825ca
A new data module for on-the-fly Gaussian datasets.
rousseab Dec 24, 2024
8d54ef4
The ability to instantiate a Gaussian data module.
rousseab Dec 25, 2024
bfb8506
Make it easier to instantiate the analytical score network from a con…
rousseab Dec 25, 2024
99bbaf4
Add analytical score network to possible models to instantiate.
rousseab Dec 25, 2024
778b4d9
Fix exp details.
rousseab Dec 25, 2024
27c566a
Cleaner data module instantiation.
rousseab Dec 25, 2024
714b583
More comments for debug server.
rousseab Dec 25, 2024
bb7e1f3
Make it possible to turn off the optimizer.
rousseab Dec 25, 2024
89fb2ad
Class to plot the scores along a direction.
rousseab Dec 25, 2024
9da231e
Fancier score viewer.
rousseab Dec 26, 2024
d0d0b23
Bring scores back to cpu.
rousseab Dec 26, 2024
f335322
New callback to show scores along a path.
rousseab Dec 26, 2024
4071d3a
Fix assert error comment.
rousseab Dec 26, 2024
98afd87
Ensure that the EGNN score network works in 1D, 2D and 3D.
rousseab Dec 26, 2024
fd25a21
an example using the on-the-fly data module
rousseab Dec 27, 2024
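
The first three commits above (log wrapped Gaussians, permutation over atomic indices, a trivial factorial) suggest a permutation-invariant analytical density. The following is a minimal pure-Python sketch of that idea in 1D, not the repository's implementation. All function names are hypothetical, and it assumes the density is an equal-weight average over all N! assignments of atoms to equilibrium sites, with each wrapped Gaussian truncated to the periodic images k = -kmax..kmax.

```python
import itertools
import math


def trivial_factorial(n: int) -> int:
    """Return n! with a plain loop, so no extra dependency is needed."""
    result = 1
    for k in range(2, n + 1):
        result *= k
    return result


def log_sum_exp(values):
    """Numerically stable log(sum(exp(v) for v in values))."""
    top = max(values)
    return top + math.log(sum(math.exp(v - top) for v in values))


def log_wrapped_gaussian(x, mu, sigma, kmax):
    """Log-density of a Gaussian wrapped on the unit interval (hypothetical helper).

    Only the periodic images k = -kmax..kmax are kept in the sum.
    """
    log_norm = -0.5 * math.log(2.0 * math.pi * sigma**2)
    exponents = [-0.5 * ((x - mu + k) / sigma) ** 2 for k in range(-kmax, kmax + 1)]
    return log_norm + log_sum_exp(exponents)


def permutation_invariant_log_density(xs, mus, sigma, kmax):
    """Log of the average, over all atom-to-site assignments, of the product density."""
    terms = []
    for perm in itertools.permutations(range(len(mus))):
        # One term per assignment: atom i sits near equilibrium site perm[i].
        terms.append(
            sum(log_wrapped_gaussian(x, mus[p], sigma, kmax) for x, p in zip(xs, perm))
        )
    return log_sum_exp(terms) - math.log(trivial_factorial(len(mus)))
```

Swapping the two atoms leaves the result unchanged, for example `permutation_invariant_log_density([0.25, 0.75], [0.25, 0.75], 0.1, 5)` and the same call with `[0.75, 0.25]` agree. The number of terms grows as N!, so a sketch like this is only practical for a handful of atoms.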
6 changes: 3 additions & 3 deletions data/process_lammps_data.py
@@ -3,8 +3,8 @@
 import logging
 import tempfile

-from diffusion_for_multi_scale_molecular_dynamics.data.diffusion.data_loader import (
-    LammpsForDiffusionDataModule, LammpsLoaderParameters)
+from diffusion_for_multi_scale_molecular_dynamics.data.diffusion.lammps_for_diffusion_data_module import (
+    LammpsDataModuleParameters, LammpsForDiffusionDataModule)
 from diffusion_for_multi_scale_molecular_dynamics.utils.logging_utils import \
     setup_analysis_logger
 from diffusion_for_multi_scale_molecular_dynamics.utils.main_utils import \
@@ -31,7 +31,7 @@ def main():
     logger.info(f" --processed_datadir : {args.processed_datadir}")
     logger.info(f" --config: {args.config}")

-    data_params = LammpsLoaderParameters(**hyper_params)
+    data_params = LammpsDataModuleParameters(**hyper_params)

     with tempfile.TemporaryDirectory() as tmp_work_dir:
         data_module = LammpsForDiffusionDataModule(lammps_run_dir=lammps_run_dir,
@@ -0,0 +1,117 @@
+#================================================================================
+# Configuration file for a diffusion experiment for 2 pseudo-atoms in 1D.
+#
+# An 'on-the-fly' Gaussian dataset is created and used for training.
+#================================================================================
+exp_name: egnn_2_atoms_in_1D
+run_name: run1
+max_epoch: 1000
+log_every_n_steps: 1
+gradient_clipping: 0.0
+accumulate_grad_batches: 1 # make this number of forward passes before doing a backprop step
+
+elements: [A]
+
+# set to null to avoid setting a seed (can speed up GPU computation, but
+# results will not be reproducible)
+seed: 1234
+
+# On-the-fly Data Module that creates a Gaussian dataset.
+data:
+  data_source: gaussian
+  random_seed: 42
+  number_of_atoms: 2
+  sigma_d: 0.01
+  equilibrium_relative_coordinates:
+    - [0.25]
+    - [0.75]
+
+  train_dataset_size: 8_192
+  valid_dataset_size: 1_024
+
+  batch_size: 64
+  num_workers: 0
+  max_atom: 2
+  spatial_dimension: 1
+
+
+spatial_dimension: 1
+
+model:
+  loss:
+    coordinates_algorithm: mse
+    atom_types_ce_weight: 0.0
+    atom_types_lambda_weight: 0.0
+    relative_coordinates_lambda_weight: 1.0
+    lattice_lambda_weight: 0.0
+  score_network:
+    architecture: egnn
+    spatial_dimension: 1
+    num_atom_types: 1
+    n_layers: 4
+    coordinate_hidden_dimensions_size: 128
+    coordinate_n_hidden_dimensions: 4
+    coords_agg: "mean"
+    message_hidden_dimensions_size: 128
+    message_n_hidden_dimensions: 4
+    node_hidden_dimensions_size: 128
+    node_n_hidden_dimensions: 4
+    attention: False
+    normalize: True
+    residual: True
+    tanh: False
+    edges: fully_connected
+  noise:
+    total_time_steps: 100
+    sigma_min: 0.001
+    sigma_max: 0.2
+
+# optimizer and scheduler
+optimizer:
+  name: adamw
+  learning_rate: 0.001
+  weight_decay: 5.0e-8
+
+
+scheduler:
+  name: CosineAnnealingLR
+  T_max: 1000
+  eta_min: 0.0
+
+# early stopping
+early_stopping:
+  metric: validation_epoch_loss
+  mode: min
+  patience: 1000
+
+model_checkpoint:
+  monitor: validation_epoch_loss
+  mode: min
+
+score_viewer:
+  record_every_n_epochs: 1
+
+  score_viewer_parameters:
+    sigma_min: 0.001
+    sigma_max: 0.2
+    number_of_space_steps: 100
+    starting_relative_coordinates:
+      - [0.0]
+      - [1.0]
+    ending_relative_coordinates:
+      - [1.0]
+      - [0.0]
+  analytical_score_network:
+    architecture: "analytical"
+    spatial_dimension: 1
+    number_of_atoms: 2
+    num_atom_types: 1
+    kmax: 5
+    equilibrium_relative_coordinates:
+      - [0.25]
+      - [0.75]
+    sigma_d: 0.01
+    use_permutation_invariance: True
+
+logging:
+  - tensorboard
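
The `data` section of the configuration above describes an on-the-fly Gaussian dataset. The following is a minimal pure-Python sketch of what such sampling could look like, using the values from that section. The function name `sample_gaussian_dataset` is hypothetical, and it assumes each configuration is the equilibrium relative coordinates plus independent Gaussian noise of width `sigma_d`, wrapped back into the unit cell. The repository's data module is not shown in this diff and may differ, for example by working on tensors or by shuffling atom order.

```python
import random


def sample_gaussian_dataset(equilibrium_relative_coordinates, sigma_d, dataset_size, random_seed):
    """Draw configurations as equilibrium positions plus Gaussian noise (hypothetical helper).

    Each configuration is a list of atoms, and each atom is a list of relative
    coordinates wrapped back into the unit cell with a modulo.
    """
    rng = random.Random(random_seed)  # dedicated generator, so the seed gives reproducible data
    dataset = []
    for _ in range(dataset_size):
        configuration = [
            [(x0 + rng.gauss(0.0, sigma_d)) % 1.0 for x0 in atom]
            for atom in equilibrium_relative_coordinates
        ]
        dataset.append(configuration)
    return dataset


# Values taken from the `data` section of the configuration file.
train_dataset = sample_gaussian_dataset(
    equilibrium_relative_coordinates=[[0.25], [0.75]],
    sigma_d=0.01,
    dataset_size=8_192,
    random_seed=42,
)
```

Because the samples are generated from a seed rather than read from disk, the same `random_seed` reproduces the same dataset without any LAMMPS preprocessing step.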
@@ -7,8 +7,8 @@
 from equilibrium_structure import create_equilibrium_sige_structure
 from torch_geometric.data import DataLoader

-from diffusion_for_multi_scale_molecular_dynamics.data.diffusion.data_loader import \
-    LammpsLoaderParameters
+from diffusion_for_multi_scale_molecular_dynamics.data.diffusion.lammps_for_diffusion_data_module import \
+    LammpsDataModuleParameters
 from diffusion_for_multi_scale_molecular_dynamics.data.element_types import \
     ElementTypes
 from diffusion_for_multi_scale_molecular_dynamics.namespace import (
@@ -24,7 +24,7 @@ def __init__(
         self,
         lammps_run_dir: str, # dummy
         processed_dataset_dir: str,
-        hyper_params: LammpsLoaderParameters,
+        hyper_params: LammpsDataModuleParameters,
         working_cache_dir: Optional[str] = None, # dummy
     ):
         """Init method."""
@@ -99,7 +99,7 @@ def clean_up(self):
 elements = ["Si", "Ge"]
 processed_dataset_dir = Path("/experiments/atom_types_only_experiments")

-hyper_params = LammpsLoaderParameters(
+hyper_params = LammpsDataModuleParameters(
     batch_size=64,
     train_batch_size=1024,
     valid_batch_size=1024,
6 changes: 3 additions & 3 deletions experiments/dataset_analysis/dataset_covariance.py
@@ -13,8 +13,8 @@

 from diffusion_for_multi_scale_molecular_dynamics import (ANALYSIS_RESULTS_DIR,
                                                           DATA_DIR)
-from diffusion_for_multi_scale_molecular_dynamics.data.diffusion.data_loader import (
-    LammpsForDiffusionDataModule, LammpsLoaderParameters)
+from diffusion_for_multi_scale_molecular_dynamics.data.diffusion.lammps_for_diffusion_data_module import (
+    LammpsDataModuleParameters, LammpsForDiffusionDataModule)
 from diffusion_for_multi_scale_molecular_dynamics.utils.basis_transformations import \
     map_relative_coordinates_to_unit_cell
 from diffusion_for_multi_scale_molecular_dynamics.utils.logging_utils import \
@@ -40,7 +40,7 @@

 cache_dir = lammps_run_dir / "cache"

-data_params = LammpsLoaderParameters(batch_size=2048, max_atom=max_atom)
+data_params = LammpsDataModuleParameters(batch_size=2048, max_atom=max_atom)

 if __name__ == "__main__":
     setup_analysis_logger()
6 changes: 3 additions & 3 deletions experiments/dataset_analysis/energy_consistency_analysis.py
@@ -19,8 +19,8 @@
     PLEASANT_FIG_SIZE, PLOT_STYLE_PATH)
 from diffusion_for_multi_scale_molecular_dynamics.callbacks.sampling_visualization_callback import \
     SamplingVisualizationCallback
-from diffusion_for_multi_scale_molecular_dynamics.data.diffusion.data_loader import (
-    LammpsForDiffusionDataModule, LammpsLoaderParameters)
+from diffusion_for_multi_scale_molecular_dynamics.data.diffusion.lammps_for_diffusion_data_module import (
+    LammpsDataModuleParameters, LammpsForDiffusionDataModule)
 from diffusion_for_multi_scale_molecular_dynamics.oracle.lammps import \
     get_energy_and_forces_from_lammps
 from diffusion_for_multi_scale_molecular_dynamics.utils.logging_utils import \
@@ -38,7 +38,7 @@

 cache_dir = str(EXPERIMENT_ANALYSIS_DIR / "cache" / dataset_name)

-data_params = LammpsLoaderParameters(batch_size=64, max_atom=8)
+data_params = LammpsDataModuleParameters(batch_size=64, max_atom=8)

 sample_size = 1000
