Dev atom types diffusion #106

Merged
merged 272 commits into from
Dec 1, 2024
Commits
7162db2
No more dump.yaml after test.
rousseab Oct 25, 2024
fcdfe7c
egnn with AXL thing
sblackburn-mila Oct 26, 2024
50faeea
various namedtupl snafu
sblackburn-mila Oct 26, 2024
b86ac2d
more fixes
sblackburn-mila Oct 26, 2024
aff150d
Merge most issues.
rousseab Oct 26, 2024
b20bf38
Merged conflicts in the NoiseScheduler class.
rousseab Oct 26, 2024
f4c9420
Noisers.
rousseab Oct 26, 2024
ffec518
Noisers
rousseab Oct 26, 2024
d690970
Fix the instantiate diffusion model.
rousseab Oct 26, 2024
108bf85
Update the imports in the main PL model.
rousseab Oct 26, 2024
0d7f3d9
Merge pull request #88 from mila-iqia/bruno_refactor
sblackburn86 Oct 27, 2024
25c621d
mace score network axl-ization
sblackburn-mila Oct 27, 2024
251bce9
mlp score network axl-ization
sblackburn-mila Oct 27, 2024
964fb0b
axl & diffusion mace axl fixes
sblackburn-mila Oct 27, 2024
1563b15
variance sampler unit tests
sblackburn-mila Oct 27, 2024
d91bd41
fixing test_exploding_variance
sblackburn-mila Oct 27, 2024
993badd
axl diffusion model unit test and related fixes
sblackburn-mila Oct 27, 2024
9997721
minimal fixes for generators
sblackburn-mila Oct 27, 2024
63c7729
fixing generators unit tests
sblackburn-mila Oct 27, 2024
4cc04cb
main entry point unit test fix and related fixes - partial
sblackburn-mila Oct 27, 2024
52769eb
sample_diffusion unit tests fix
sblackburn-mila Oct 27, 2024
d103546
fixing most test_score_network except DiffusionMace
sblackburn-mila Oct 28, 2024
26f810c
force field augmented unit test
sblackburn-mila Oct 28, 2024
7817582
test egnn and rm fokker planck
sblackburn-mila Oct 28, 2024
3820266
fixing most of diffusion mace
sblackburn-mila Oct 28, 2024
a78e8db
bug fix in diffusion mace score network
sblackburn-mila Oct 28, 2024
530dd14
analytical score
sblackburn-mila Oct 28, 2024
754d32e
dataloader renaming to namespace for atom types
sblackburn-mila Oct 28, 2024
05598eb
fixing data preprocessing tests
sblackburn-mila Oct 28, 2024
6d03d69
Merge remote-tracking branch 'origin/main' into refactor_for_atom_typ…
sblackburn-mila Oct 28, 2024
a8f01c6
adding unit tests for d3pm loss
sblackburn-mila Oct 29, 2024
00f61d9
variance samplers unit tests
sblackburn-mila Oct 29, 2024
2c35f85
atom type noiser tests
sblackburn-mila Oct 29, 2024
69a6930
d3pm utils tests
sblackburn-mila Oct 29, 2024
b1f0c6f
tensor utils unit tests
sblackburn-mila Oct 29, 2024
56f6339
test egnn upgrades
sblackburn-mila Oct 29, 2024
1e8b869
testscorenetwork upgrade
sblackburn-mila Oct 29, 2024
e5d2f7e
fixing score network unit tests
sblackburn-mila Oct 30, 2024
72ae772
A different seed, and no more nans.
rousseab Oct 30, 2024
023508a
fixing cartesion positions in mace - solving the unit test issue...
sblackburn-mila Oct 31, 2024
df29fee
code review part 1
sblackburn-mila Nov 2, 2024
6e72b76
code review part 2
sblackburn-mila Nov 2, 2024
cb2520c
code review part 3
sblackburn-mila Nov 2, 2024
db026b7
first pass complete on code review
sblackburn-mila Nov 2, 2024
d1122a2
saturday morning breakfast cereal comments
sblackburn-mila Nov 2, 2024
1255222
fixing variance_sampler and various unit tests
sblackburn-mila Nov 3, 2024
f4ef300
Merge pull request #90 from mila-iqia/refactor_for_atom_type_diffusion
sblackburn86 Nov 4, 2024
cb544d8
Sorting imports, passing linting tests.
rousseab Nov 4, 2024
8a82129
More granularity in the loss modules. Also, more isort.
rousseab Nov 4, 2024
926eeca
Linting.
rousseab Nov 4, 2024
0535839
More granular loss testing.
rousseab Nov 4, 2024
9cfd069
More granular loss testing.
rousseab Nov 4, 2024
3a625b0
Revamped atomic type loss function.
rousseab Nov 4, 2024
78b9200
Fixed bug where X loss was overaggregated.
rousseab Nov 5, 2024
771bba9
Introduce 'num_atom_types' fixture in the generation of fake data.
rousseab Nov 5, 2024
267e604
sample trajectory update & unit test
sblackburn-mila Nov 5, 2024
a19f18f
sde generator & test
sblackburn-mila Nov 5, 2024
d1e2ac1
predictor_corrector axl generator and tests
sblackburn-mila Nov 5, 2024
bf0bd1f
Fix linting issue.
rousseab Nov 5, 2024
887caf2
ode generator
sblackburn-mila Nov 5, 2024
3b3dcb0
langevin generator & unit tests
sblackburn-mila Nov 5, 2024
17d7c31
constrained langevin generator and tests
sblackburn-mila Nov 5, 2024
da5f7ee
fixing more generators tests
sblackburn-mila Nov 5, 2024
4ff5796
more sampling scripts fixes
sblackburn-mila Nov 5, 2024
ae72f0e
fixing sampling tests in axl_diffusion_lm
sblackburn-mila Nov 5, 2024
cdd0f7e
fixing sample diffusion & tests
sblackburn-mila Nov 5, 2024
ebd8b7e
fixing test_diffusion_sampling
sblackburn-mila Nov 5, 2024
43c1547
Merge pull request #92 from mila-iqia/AXL_atom_type_loss_review
rousseab Nov 6, 2024
6afb9ab
Merge pull request #93 from mila-iqia/fix_broken_tests
rousseab Nov 6, 2024
070ecd6
Fixed bug where X loss was overaggregated.
rousseab Nov 5, 2024
30f784f
Introduce 'num_atom_types' fixture in the generation of fake data.
rousseab Nov 5, 2024
cb239c1
Moving useful method to a better place.
rousseab Nov 5, 2024
4420034
Pedantic tests of equivariance + rotation for MACE architecture.
rousseab Nov 5, 2024
6df10dd
Stack the point group symmetries.
rousseab Nov 5, 2024
3144923
Remove repetitive code.
rousseab Nov 5, 2024
d096ed0
fixing import conflicts
sblackburn-mila Nov 6, 2024
0b339c7
test for map_axl_to_unit_cell
sblackburn-mila Nov 6, 2024
e0f121a
refactor p(at-1 given at) in atom_loss and in langevin generator
sblackburn-mila Nov 6, 2024
1334af3
refactor p(atm1 given at)
sblackburn-mila Nov 6, 2024
4bc3709
refactor p(atm1 given at2) part 2
sblackburn-mila Nov 6, 2024
44429b7
add a unit test for atom type diffusion'=
sblackburn-mila Nov 6, 2024
968df94
More comment.
rousseab Nov 7, 2024
4e34a08
New test battery explicitly for Equivariance.
rousseab Nov 7, 2024
c1d176d
BUG FIX in DIFFUSION MACE.
rousseab Nov 7, 2024
7500b1a
Cast the node_attrs to the correct kind of float.
rousseab Nov 8, 2024
fead727
Systematic testing of Equivariance for score networks.
rousseab Nov 8, 2024
f25e425
Correct bug in definition of output score.
rousseab Nov 8, 2024
3165df6
Only test atom type output if relevant.
rousseab Nov 8, 2024
3b0b65e
Moving basic checks to its own module.
rousseab Nov 8, 2024
3328829
Better factoring of tests.
rousseab Nov 8, 2024
19a5f54
Use test base class.
rousseab Nov 8, 2024
7c738d2
Refactored the name of the base test class.
rousseab Nov 8, 2024
014650b
More general tests.
rousseab Nov 8, 2024
8672c81
removed needless tests
rousseab Nov 8, 2024
8c3156d
Remove needless tests.
rousseab Nov 8, 2024
3caaa22
Moving useful method to a better place.
rousseab Nov 5, 2024
6dfb50f
Pedantic tests of equivariance + rotation for MACE architecture.
rousseab Nov 5, 2024
216baee
Stack the point group symmetries.
rousseab Nov 5, 2024
881116f
Remove repetitive code.
rousseab Nov 5, 2024
12e1fa6
More comment.
rousseab Nov 7, 2024
00af358
New test battery explicitly for Equivariance.
rousseab Nov 7, 2024
7b65a11
BUG FIX in DIFFUSION MACE.
rousseab Nov 7, 2024
2038895
Cast the node_attrs to the correct kind of float.
rousseab Nov 8, 2024
74c73bd
Systematic testing of Equivariance for score networks.
rousseab Nov 8, 2024
41b5b79
Correct bug in definition of output score.
rousseab Nov 8, 2024
05059b2
Only test atom type output if relevant.
rousseab Nov 8, 2024
de92f2f
Moving basic checks to its own module.
rousseab Nov 8, 2024
1cef32c
Better factoring of tests.
rousseab Nov 8, 2024
c048b8e
Use test base class.
rousseab Nov 8, 2024
e61d869
Refactored the name of the base test class.
rousseab Nov 8, 2024
341ec17
More general tests.
rousseab Nov 8, 2024
0d94002
removed needless tests
rousseab Nov 8, 2024
417cdc8
Remove needless tests.
rousseab Nov 8, 2024
d419a8b
Merge remote-tracking branch 'origin/revisit_equivariance' into revis…
rousseab Nov 8, 2024
feb8c83
Fix docstring and name issues.
rousseab Nov 9, 2024
e453f32
code review
sblackburn-mila Nov 10, 2024
51c7dd7
Merge pull request #94 from mila-iqia/atom_type_generator
rousseab Nov 10, 2024
57cb4f3
Merge remote-tracking branch 'origin/dev_atom_types_diffusion' into r…
rousseab Nov 10, 2024
d5e3731
Fix variable name.
rousseab Nov 10, 2024
eeb704f
Fix dangling old variable name.
rousseab Nov 10, 2024
e8b1419
Merge pull request #95 from mila-iqia/revisit_equivariance
rousseab Nov 10, 2024
7fe4393
organise the SW files better
rousseab Nov 11, 2024
e59023a
upper case
rousseab Nov 11, 2024
45ac9d5
revamp the Si 1x1x1 data creation scripts
rousseab Nov 11, 2024
5736414
Better folder name.
rousseab Nov 11, 2024
6731273
simpler script
rousseab Nov 11, 2024
ad18f6f
update creation script
rousseab Nov 11, 2024
829b66a
input script for SiGe
rousseab Nov 11, 2024
b5a2a98
updated Si 1x1x1
rousseab Nov 11, 2024
f1400eb
Revamped Si 2x2x2.
rousseab Nov 11, 2024
9b01168
fixed in file
rousseab Nov 11, 2024
541c180
Revamped Si 3x3x3.
rousseab Nov 11, 2024
f190669
chmod 644
rousseab Nov 11, 2024
e0373a8
remove needless stuff
rousseab Nov 11, 2024
de1d5bf
removing ad hoc random stuff
rousseab Nov 11, 2024
4e80de4
644 file
rousseab Nov 11, 2024
f0f3f51
removed needless file
rousseab Nov 11, 2024
3b3d393
removed needless files
rousseab Nov 11, 2024
1152914
Bash function to drive data generation.
rousseab Nov 11, 2024
5cafd9b
Dump element, not type index.
rousseab Nov 12, 2024
135a130
Element type processing class.
rousseab Nov 12, 2024
9d7ac67
Revamping of dataloading code to deal with element strings.
rousseab Nov 12, 2024
b70363d
BLACK
rousseab Nov 12, 2024
9759ec7
Update location of SW coefficients files.
rousseab Nov 12, 2024
171f6f1
A bit of logging.
rousseab Nov 12, 2024
75f9f7a
New arguments to the bash driving function.
rousseab Nov 12, 2024
9d370f6
New arguments to the bash driving function.
rousseab Nov 12, 2024
8581feb
updated Si 1x1x1 data generation scripts
rousseab Nov 12, 2024
a3b9054
ignore data generation stuff
rousseab Nov 12, 2024
0dd1a08
update the data creation scripts
rousseab Nov 12, 2024
c52fa34
Tests for the ElementType class.
rousseab Nov 12, 2024
a49c4c8
More common name for the number of unique elements.
rousseab Nov 12, 2024
963ff76
Update the training code to use the config-specified elements list.
rousseab Nov 12, 2024
c79bb35
Add a static validation method.
rousseab Nov 12, 2024
61b6c80
Validate the element list.
rousseab Nov 12, 2024
1f5f644
Fix unique elements for score network creation.
rousseab Nov 12, 2024
3a6db04
Properties to expose the elements and their ids.
rousseab Nov 13, 2024
fe1b3f2
Lammps calculator.
rousseab Nov 13, 2024
9a1ecf1
Cleaner lammps energy oracle.
rousseab Nov 13, 2024
8ffb110
Cleaner oracle init.
rousseab Nov 13, 2024
491467a
Account for oracle in lightning model.
rousseab Nov 13, 2024
2225ba9
Fixed sampling script.
rousseab Nov 13, 2024
efc09a3
remove broken script
rousseab Nov 13, 2024
8b3e7be
Instantiate energy oracle in a polymorphic way.
rousseab Nov 13, 2024
4179b2e
Instantiate energy oracle correctly.
rousseab Nov 13, 2024
7cd2abc
Fix axl test.
rousseab Nov 13, 2024
6086c5d
Fix broken test.
rousseab Nov 13, 2024
45e5c68
Fix analytical score network to deal with atom types properly.
rousseab Nov 14, 2024
9d894cb
Account for the mask logit = -infinity in the atomic loss.
rousseab Nov 14, 2024
944fa51
Impose that the MASK probability is zero.
rousseab Nov 14, 2024
8f5b0a0
Update config files.
rousseab Nov 14, 2024
f3ac952
Instantiate for device discoverability.
rousseab Nov 14, 2024
a5f2c51
fix broken test.
brunorousseau Nov 15, 2024
4623496
a dummy file to test github connection
brunorousseau Nov 15, 2024
a4b7ade
Removing needless file.
brunorousseau Nov 15, 2024
c480707
Make sure all layers of the score network will be put on the correct …
brunorousseau Nov 15, 2024
9a82223
Fixing device bjorks.
brunorousseau Nov 15, 2024
45772b2
Fixing device bjorks.
brunorousseau Nov 15, 2024
73571e5
Update run script.
brunorousseau Nov 15, 2024
7b58288
Add the MPS device, if available.
brunorousseau Nov 15, 2024
8432c8f
Skip global tests on MPS.
brunorousseau Nov 15, 2024
66c11af
Deprecate something to fix later.
brunorousseau Nov 15, 2024
d2acb90
remove needless test
brunorousseau Nov 15, 2024
2974fa1
Fixing device bjorks.
brunorousseau Nov 15, 2024
4b47380
Merge pull request #96 from mila-iqia/dataloader_for_atom_types
sblackburn86 Nov 15, 2024
19b24a0
Fix the loss to align with the overleaf document.
brunorousseau Nov 16, 2024
22ce7f4
Squash divergent MASK value in CE calculation.
brunorousseau Nov 16, 2024
87677a6
Use a bette rname for the module!
brunorousseau Nov 16, 2024
8e15243
Fix name.
brunorousseau Nov 16, 2024
5e8c875
Sanity test on p_a0_given_a1.
brunorousseau Nov 17, 2024
671c83b
new configs to generate SiGe datasets
brunorousseau Nov 18, 2024
a7d7450
input scripts
brunorousseau Nov 18, 2024
6fffabb
Merge pull request #97 from mila-iqia/fix_atomic_loss
sblackburn86 Nov 18, 2024
9f2636c
Merge pull request #98 from mila-iqia/new_datasets
sblackburn86 Nov 18, 2024
70fbab2
Simplify the recording of trajectories. Push reshaping the recordings…
brunorousseau Nov 19, 2024
d08d631
Record on CPU.
brunorousseau Nov 19, 2024
02eaf71
Create cif files with ovito.
brunorousseau Nov 19, 2024
2cc0985
Don't record the corrector steps by default.
brunorousseau Nov 19, 2024
91f99c8
Avoid random collisions!
brunorousseau Nov 19, 2024
7c0fc15
Hack the correct L in the composition before recording it.
brunorousseau Nov 21, 2024
44d8918
Move analysis code elsewhere.
brunorousseau Nov 21, 2024
64326af
Hack the composition_i AXL to have the correct L field.
brunorousseau Nov 21, 2024
ceffd4f
Class to analyse the recorded trajectories.
brunorousseau Nov 22, 2024
7b3ceec
Plotting the content of the q-matrices.
brunorousseau Nov 22, 2024
ff66fa3
Merge pull request #101 from mila-iqia/update_ovito_viz
sblackburn86 Nov 22, 2024
7ab9892
Add an atom-type prefactor 'lambda' weight.
brunorousseau Nov 25, 2024
0b7129c
adding single transtion & greedy option for transitions in Langevin g…
sblackburn-mila Nov 18, 2024
c2dbe5b
atom type update in corrector step
sblackburn-mila Nov 18, 2024
383087a
sure but why
sblackburn-mila Nov 18, 2024
63b1b91
greedy prob adjustments
sblackburn-mila Nov 19, 2024
470acba
binary sampling fn
sblackburn-mila Nov 19, 2024
399aa5d
updating unit test
sblackburn-mila Nov 19, 2024
21a0e22
other fixes
sblackburn-mila Nov 19, 2024
5d47926
fixing the bad behavior when a transition happens at the zero-th time…
Nov 21, 2024
622ec83
code review
Nov 22, 2024
33e3ac1
Merge pull request #102 from mila-iqia/fix_conflicts
rousseab Nov 25, 2024
93f3ebf
adding single transtion & greedy option for transitions in Langevin g…
sblackburn-mila Nov 18, 2024
45aaa3c
atom type update in corrector step
sblackburn-mila Nov 18, 2024
7fe87bc
sure but why
sblackburn-mila Nov 18, 2024
12d0b9d
greedy prob adjustments
sblackburn-mila Nov 19, 2024
1584c08
binary sampling fn
sblackburn-mila Nov 19, 2024
f43e50e
updating unit test
sblackburn-mila Nov 19, 2024
688a382
other fixes
sblackburn-mila Nov 19, 2024
c4ca1fc
fixing the bad behavior when a transition happens at the zero-th time…
Nov 21, 2024
d86e511
code review
Nov 22, 2024
c0e222a
Merge pull request #103 from mila-iqia/dev_atom_types_diffusion
rousseab Nov 25, 2024
4b25c80
Weight the different terms in the loss in a transparent way.
rousseab Nov 25, 2024
894a3e6
Only do the recording hack if you need it!
rousseab Nov 25, 2024
5f6e86e
Merge pull request #104 from mila-iqia/better_atom_type_probabilities
sblackburn86 Nov 25, 2024
cd2d88e
Pseudo experiment
rousseab Nov 25, 2024
c9050a3
Add time embedding to MLP.
rousseab Nov 25, 2024
71e36d3
Fix mlp input parameters.
rousseab Nov 25, 2024
625a678
Nice convincing experiments.
rousseab Nov 25, 2024
8d8cec9
More updates for the MLP refactor.
rousseab Nov 25, 2024
a9b7246
More MLP fixes.
rousseab Nov 25, 2024
fd14920
Fixing atom type loss tests.
rousseab Nov 25, 2024
9651da7
Fixing mlp stuff again.
rousseab Nov 25, 2024
934c9c3
Merge pull request #105 from mila-iqia/atom_type_only_pseudo_experiments
sblackburn86 Nov 25, 2024
ab6cc55
doctstring error in d3pm_utils
Nov 26, 2024
37a7781
Record more stuff.
rousseab Nov 28, 2024
1bb875c
Align with how things are recorded.
rousseab Nov 28, 2024
c9e77ed
Better probability calculation.
rousseab Nov 28, 2024
e2ee046
Better docstring.
rousseab Nov 29, 2024
b4db573
Do not allow T=1.
rousseab Nov 29, 2024
5f25f77
Refactor how atom types are determined.
rousseab Nov 29, 2024
d4a8d24
name fix
rousseab Nov 29, 2024
d792c3e
Add explicit check.
rousseab Nov 29, 2024
0a5c6d2
Remove meaningless, useless broken test.
rousseab Nov 29, 2024
eca5015
Fix test.
rousseab Nov 29, 2024
6a74116
Merge pull request #108 from mila-iqia/test_greedy_sampling
sblackburn86 Nov 29, 2024
6 changes: 6 additions & 0 deletions .gitignore
@@ -9,6 +9,12 @@ examples/data/
examples/*/output/
examples/*/lightning_logs/

**/train_run*/
**/valid_run*/
**/processed/
**/cache/
**/output/

# Byte-compiled / optimized / DLL files
__pycache__/
*.py[cod]
6 changes: 6 additions & 0 deletions data/SiGe_diffusion_1x1x1/config.yaml
@@ -0,0 +1,6 @@
# Configuration for the dataloader
batch_size: 1024
num_workers: 0
max_atom: 8
spatial_dimension: 3
elements: [Si, Ge]
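The `max_atom` values in the dataloader configs of this PR (8, 64, 216) line up with the eight-atom diamond basis used in the LAMMPS inputs, replicated over an S x S x S box. A minimal sketch of that arithmetic follows; the function name `atoms_in_supercell` is hypothetical and not part of the repository.

```python
# Hypothetical helper: number of atoms in an S x S x S diamond supercell.
# The LAMMPS inputs in this PR place 8 basis atoms per conventional cell
# ("basis 1" through "basis 8" in create_atoms).
ATOMS_PER_DIAMOND_CELL = 8


def atoms_in_supercell(box_size: int) -> int:
    """Return the atom count for a cubic supercell of side `box_size` cells."""
    if box_size < 1:
        raise ValueError("box_size must be a positive integer")
    return ATOMS_PER_DIAMOND_CELL * box_size**3


if __name__ == "__main__":
    for box_size in (1, 2, 3):
        print(box_size, atoms_in_supercell(box_size))
```

Under this assumption, a config whose `max_atom` is smaller than `8 * S**3` would presumably not fit the generated structures, so the two values are best kept in sync by hand.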
16 changes: 16 additions & 0 deletions data/SiGe_diffusion_1x1x1/create_data.sh
@@ -0,0 +1,16 @@
#!/bin/bash

source ../data_generation_functions.sh

TEMPERATURE=300
BOX_SIZE=1
STEP=10000
CROP=10000
NTRAIN_RUN=10
NVALID_RUN=5

SW_PATH="../stillinger_weber_coefficients/SiGe.sw"
IN_PATH="in.SiGe.lammps"
CONFIG_PATH="config.yaml"

create_data_function $TEMPERATURE $BOX_SIZE $STEP $CROP $NTRAIN_RUN $NVALID_RUN $SW_PATH $IN_PATH $CONFIG_PATH
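The values set in this script are forwarded by `create_data_function` (defined in `data/data_generation_functions.sh` in this PR) to LAMMPS as `-v` variables. The sketch below rebuilds that argument list in Python to make two details explicit: LAMMPS runs for `STEP + CROP` steps, and the coefficient path gains one extra `../` because LAMMPS is launched from inside the run folder. The function name `lammps_arguments` is hypothetical; only the `lmp` flags that appear in the shell function are used, and the input file (fed through stdin in the shell function) is left out.

```python
# Hypothetical helper mirroring the lmp call in data_generation_functions.sh.
# It only builds the argument list; nothing is executed.
def lammps_arguments(temperature, box_size, step, crop, seed, sw_path):
    variables = {
        # The run covers the frames to keep plus the frames handed to --crop.
        "STEP": step + crop,
        "T": temperature,
        "S": box_size,
        "SEED": seed,
        # LAMMPS is launched one folder below the calling script.
        "SW_PATH": f"../{sw_path}",
    }
    args = ["lmp", "-echo", "none", "-screen", "none"]
    for name, value in variables.items():
        args += ["-v", name, str(value)]
    return args


if __name__ == "__main__":
    print(" ".join(lammps_arguments(
        300, 1, 10000, 10000, 1, "../stillinger_weber_coefficients/SiGe.sw")))
```

With `STEP=10000` and `CROP=10000`, each run therefore covers 20000 steps. How `crop_lammps_outputs.py` uses the `--crop` value is not shown in this excerpt.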
34 changes: 34 additions & 0 deletions data/SiGe_diffusion_1x1x1/in.SiGe.lammps
@@ -0,0 +1,34 @@
log log.lammps

units metal
atom_style atomic
atom_modify map array

lattice diamond 5.5421217827
region box block 0 ${S} 0 ${S} 0 ${S}

create_box 2 box
create_atoms 1 box basis 1 1 basis 2 1 basis 3 1 basis 4 1 basis 5 2 basis 6 2 basis 7 2 basis 8 2


mass 1 28.0855
mass 2 72.64

group Si type 1
group Ge type 2

pair_style sw
pair_coeff * * ${SW_PATH} Si Ge

velocity all create ${T} ${SEED}

dump dump_id all yaml 1 dump.${T}-${S}.yaml id element x y z fx fy fz
dump_modify dump_id element Si Ge

thermo_style yaml
thermo 1
#==========================Output files========================

fix 1 all nvt temp ${T} ${T} 0.01
run ${STEP}
unfix 1
6 changes: 6 additions & 0 deletions data/SiGe_diffusion_2x2x2/config.yaml
@@ -0,0 +1,6 @@
# Configuration for the dataloader
batch_size: 1024
num_workers: 0
max_atom: 64
spatial_dimension: 3
elements: [Si, Ge]
16 changes: 16 additions & 0 deletions data/SiGe_diffusion_2x2x2/create_data.sh
@@ -0,0 +1,16 @@
#!/bin/bash

source ../data_generation_functions.sh

TEMPERATURE=300
BOX_SIZE=2
STEP=10000
CROP=10000
NTRAIN_RUN=10
NVALID_RUN=5

SW_PATH="../stillinger_weber_coefficients/SiGe.sw"
IN_PATH="in.SiGe.lammps"
CONFIG_PATH="config.yaml"

create_data_function $TEMPERATURE $BOX_SIZE $STEP $CROP $NTRAIN_RUN $NVALID_RUN $SW_PATH $IN_PATH $CONFIG_PATH
34 changes: 34 additions & 0 deletions data/SiGe_diffusion_2x2x2/in.SiGe.lammps
@@ -0,0 +1,34 @@
log log.lammps

units metal
atom_style atomic
atom_modify map array

lattice diamond 5.5421217827
region box block 0 ${S} 0 ${S} 0 ${S}

create_box 2 box
create_atoms 1 box basis 1 1 basis 2 1 basis 3 1 basis 4 1 basis 5 2 basis 6 2 basis 7 2 basis 8 2


mass 1 28.0855
mass 2 72.64

group Si type 1
group Ge type 2

pair_style sw
pair_coeff * * ${SW_PATH} Si Ge

velocity all create ${T} ${SEED}

dump dump_id all yaml 1 dump.${T}-${S}.yaml id element x y z fx fy fz
dump_modify dump_id element Si Ge

thermo_style yaml
thermo 1
#==========================Output files========================

fix 1 all nvt temp ${T} ${T} 0.01
run ${STEP}
unfix 1
6 changes: 6 additions & 0 deletions data/SiGe_diffusion_3x3x3/config.yaml
@@ -0,0 +1,6 @@
# Configuration for the dataloader
batch_size: 1024
num_workers: 0
max_atom: 216
spatial_dimension: 3
elements: [Si, Ge]
16 changes: 16 additions & 0 deletions data/SiGe_diffusion_3x3x3/create_data.sh
@@ -0,0 +1,16 @@
#!/bin/bash

source ../data_generation_functions.sh

TEMPERATURE=300
BOX_SIZE=3
STEP=10000
CROP=10000
NTRAIN_RUN=10
NVALID_RUN=5

SW_PATH="../stillinger_weber_coefficients/SiGe.sw"
IN_PATH="in.SiGe.lammps"
CONFIG_PATH="config.yaml"

create_data_function $TEMPERATURE $BOX_SIZE $STEP $CROP $NTRAIN_RUN $NVALID_RUN $SW_PATH $IN_PATH $CONFIG_PATH
34 changes: 34 additions & 0 deletions data/SiGe_diffusion_3x3x3/in.SiGe.lammps
@@ -0,0 +1,34 @@
log log.lammps

units metal
atom_style atomic
atom_modify map array

lattice diamond 5.5421217827
region box block 0 ${S} 0 ${S} 0 ${S}

create_box 2 box
create_atoms 1 box basis 1 1 basis 2 1 basis 3 1 basis 4 1 basis 5 2 basis 6 2 basis 7 2 basis 8 2


mass 1 28.0855
mass 2 72.64

group Si type 1
group Ge type 2

pair_style sw
pair_coeff * * ${SW_PATH} Si Ge

velocity all create ${T} ${SEED}

dump dump_id all yaml 1 dump.${T}-${S}.yaml id element x y z fx fy fz
dump_modify dump_id element Si Ge

thermo_style yaml
thermo 1
#==========================Output files========================

fix 1 all nvt temp ${T} ${T} 0.01
run ${STEP}
unfix 1
6 changes: 6 additions & 0 deletions data/Si_diffusion_1x1x1/config.yaml
@@ -0,0 +1,6 @@
# Configuration for the dataloader
batch_size: 1024
num_workers: 0
max_atom: 8
spatial_dimension: 3
elements: [Si]
16 changes: 16 additions & 0 deletions data/Si_diffusion_1x1x1/create_data.sh
@@ -0,0 +1,16 @@
#!/bin/bash

source ../data_generation_functions.sh

TEMPERATURE=300
BOX_SIZE=1
STEP=10000
CROP=10000
NTRAIN_RUN=10
NVALID_RUN=5

SW_PATH="../stillinger_weber_coefficients/Si.sw"
IN_PATH="in.Si.lammps"
CONFIG_PATH="config.yaml"

create_data_function $TEMPERATURE $BOX_SIZE $STEP $CROP $NTRAIN_RUN $NVALID_RUN $SW_PATH $IN_PATH $CONFIG_PATH
6 changes: 4 additions & 2 deletions data/si_diffusion_2x2x2/in.si.lammps → data/Si_diffusion_1x1x1/in.Si.lammps
100755 → 100644
@@ -14,11 +14,13 @@ mass 1 28.0855
 group Si type 1

 pair_style sw
-pair_coeff * * ../../si.sw Si
+pair_coeff * * ${SW_PATH} Si


 velocity all create ${T} ${SEED}

-dump 1 all yaml 1 dump.si-${T}-${S}.yaml id type x y z fx fy fz
+dump dump_id all yaml 1 dump.${T}-${S}.yaml id element x y z fx fy fz
+dump_modify dump_id element Si

 thermo_style yaml
 thermo 1
6 changes: 6 additions & 0 deletions data/Si_diffusion_2x2x2/config.yaml
@@ -0,0 +1,6 @@
# Configuration for the dataloader
batch_size: 1024
num_workers: 0
max_atom: 64
spatial_dimension: 3
elements: [Si]
16 changes: 16 additions & 0 deletions data/Si_diffusion_2x2x2/create_data.sh
@@ -0,0 +1,16 @@
#!/bin/bash

source ../data_generation_functions.sh

TEMPERATURE=300
BOX_SIZE=2
STEP=10000
CROP=10000
NTRAIN_RUN=10
NVALID_RUN=5

SW_PATH="../stillinger_weber_coefficients/Si.sw"
IN_PATH="in.Si.lammps"
CONFIG_PATH="config.yaml"

create_data_function $TEMPERATURE $BOX_SIZE $STEP $CROP $NTRAIN_RUN $NVALID_RUN $SW_PATH $IN_PATH $CONFIG_PATH
6 changes: 4 additions & 2 deletions data/si_diffusion_1x1x1_large/in.si.lammps → data/Si_diffusion_2x2x2/in.Si.lammps
100755 → 100644
@@ -14,11 +14,13 @@ mass 1 28.0855
 group Si type 1

 pair_style sw
-pair_coeff * * ../../si.sw Si
+pair_coeff * * ${SW_PATH} Si


 velocity all create ${T} ${SEED}

-dump 1 all yaml 1 dump.si-${T}-${S}.yaml id type x y z fx fy fz
+dump dump_id all yaml 1 dump.${T}-${S}.yaml id element x y z fx fy fz
+dump_modify dump_id element Si

 thermo_style yaml
 thermo 1
6 changes: 6 additions & 0 deletions data/Si_diffusion_3x3x3/config.yaml
@@ -0,0 +1,6 @@
# Configuration for the dataloader
batch_size: 1024
num_workers: 0
max_atom: 216
spatial_dimension: 3
elements: [Si]
16 changes: 16 additions & 0 deletions data/Si_diffusion_3x3x3/create_data.sh
@@ -0,0 +1,16 @@
#!/bin/bash

source ../data_generation_functions.sh

TEMPERATURE=300
BOX_SIZE=3
STEP=10000
CROP=10000
NTRAIN_RUN=10
NVALID_RUN=5

SW_PATH="../stillinger_weber_coefficients/Si.sw"
IN_PATH="in.Si.lammps"
CONFIG_PATH="config.yaml"

create_data_function $TEMPERATURE $BOX_SIZE $STEP $CROP $NTRAIN_RUN $NVALID_RUN $SW_PATH $IN_PATH $CONFIG_PATH
6 changes: 4 additions & 2 deletions data/si_diffusion_1x1x1/in.si.lammps → data/Si_diffusion_3x3x3/in.Si.lammps
100755 → 100644
@@ -14,11 +14,13 @@ mass 1 28.0855
 group Si type 1

 pair_style sw
-pair_coeff * * ../../si.sw Si
+pair_coeff * * ${SW_PATH} Si


 velocity all create ${T} ${SEED}

-dump 1 all yaml 1 dump.si-${T}-${S}.yaml id type x y z fx fy fz
+dump dump_id all yaml 1 dump.${T}-${S}.yaml id element x y z fx fy fz
+dump_modify dump_id element Si

 thermo_style yaml
 thermo 1
56 changes: 56 additions & 0 deletions data/data_generation_functions.sh
@@ -0,0 +1,56 @@
#!/bin/bash

function create_data_function() {
    # This function drives the creation of training and validation data with LAMMPS.
    # It assumes:
    #   - the function is sourced in a bash script (the "calling script") within the folder where the data is to be created.
    #   - the calling script is invoked in a shell with the correct python environment.
    #   - the LAMMPS input file follows a template and has all the passed variables defined.
    #   - the paths are defined with respect to the folder where the generation script is called.

    TEMPERATURE="$1"
    BOX_SIZE="$2"
    STEP="$3"
    CROP="$4"
    NTRAIN_RUN="$5"
    NVALID_RUN="$6"
    SW_PATH="$7"
    IN_PATH="$8"
    CONFIG_PATH="$9"

    NRUN=$(($NTRAIN_RUN + $NVALID_RUN))

    # Generate the data: seeds 1..NTRAIN_RUN are training runs, the remaining seeds are validation runs.
    for SEED in $(seq 1 "$NRUN"); do
        if [ "$SEED" -le "$NTRAIN_RUN" ]; then
            MODE="train"
        else
            MODE="valid"
        fi
        echo "Creating LAMMPS data for ${MODE}_run_${SEED}..."
        mkdir -p "${MODE}_run_${SEED}"
        cd "${MODE}_run_${SEED}"

        # Call LAMMPS with various arguments to keep it quiet. The current location is "${MODE}_run_${SEED}",
        # which is one folder below the location of the calling script, hence the extra "../" on the paths.
        lmp -echo none -screen none < "../$IN_PATH" -v STEP $(($STEP + $CROP)) -v T "$TEMPERATURE" -v S "$BOX_SIZE" -v SEED "$SEED" -v SW_PATH "../$SW_PATH"

        # Extract the thermodynamic outputs into a yaml file ("grep -E" is the non-deprecated spelling of "egrep").
        grep -E '^(keywords:|data:$|---$|\.\.\.$| - \[)' log.lammps > thermo_log.yaml

        mkdir -p "uncropped_outputs"
        mv "dump.${TEMPERATURE}-${BOX_SIZE}.yaml" uncropped_outputs/
        mv thermo_log.yaml uncropped_outputs/

        python ../../crop_lammps_outputs.py \
            --lammps_yaml "uncropped_outputs/dump.${TEMPERATURE}-${BOX_SIZE}.yaml" \
            --lammps_thermo "uncropped_outputs/thermo_log.yaml" \
            --crop "$CROP" \
            --output_dir ./

        cd ..
    done

    # Process the data
    python ../process_lammps_data.py --data "./" --processed_datadir "./processed/" --config "${CONFIG_PATH}"
}
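Two pieces of logic in the function above are easy to misread in shell: how a seed maps to a train or validation folder, and which lines of `log.lammps` end up in `thermo_log.yaml`. The Python sketch below restates both under stated assumptions. The names `plan_runs` and `extract_thermo_yaml` are hypothetical, and the regular expression accepts any leading whitespace before `- [` (an adaptation, since the exact indentation in the original pattern may have been lost in this page).

```python
import re


def plan_runs(ntrain_run: int, nvalid_run: int):
    """Return (seed, folder) pairs: the first seeds train, the rest validate."""
    runs = []
    for seed in range(1, ntrain_run + nvalid_run + 1):
        mode = "train" if seed <= ntrain_run else "valid"
        runs.append((seed, f"{mode}_run_{seed}"))
    return runs


# Lines kept from the LAMMPS log: YAML document markers, the keyword list,
# the "data:" key, and the indented data rows.
THERMO_LINE = re.compile(r"^(keywords:|data:$|---$|\.\.\.$|\s+- \[)")


def extract_thermo_yaml(log_text: str) -> str:
    """Keep only the log lines that belong to the YAML thermo output."""
    kept = [line for line in log_text.splitlines() if THERMO_LINE.match(line)]
    return "\n".join(kept)


if __name__ == "__main__":
    print(plan_runs(2, 1))
    sample = "LAMMPS noise\n---\nkeywords: ['Step', 'Temp']\ndata:\n  - [0, 300.0]\n...\nLoop time"
    print(extract_thermo_yaml(sample))
```

The folder names produced by `plan_runs` are the ones matched by the `**/train_run*/` and `**/valid_run*/` patterns added to `.gitignore` in this PR.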
31 changes: 0 additions & 31 deletions data/lammps_input_example.lammps

This file was deleted.
