-
Notifications
You must be signed in to change notification settings - Fork 1
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
PR: minimal MTP training script #5
Conversation
tests/models/test_mtp.py
Outdated
# Mock self.write_cfg to simulate creating a config file without file operations | ||
mocker.patch.object(MTPWithMLIP3, "write_cfg", return_value="mock_filename.cfg") | ||
|
||
# mocker.patch("crystal_diffusion.models.mtp.itertools.chain", return_value=[1, 2, 3]) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I suggest deleting this commented code. Yagni.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
deleted
crystal_diffusion/models/mtp.py
Outdated
|
||
atoms_filename = "train.cfgs" | ||
|
||
with ScratchDir("."): # create a tmpdir - deleted afterew |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is a typo in the comment.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fixed
tests/models/test_mtp.py
Outdated
mocker.patch("os.path.exists", return_value=True) | ||
|
||
# Mock the ScratchDir context manager to do nothing | ||
mocker.patch("crystal_diffusion.models.mtp.ScratchDir", mocker.MagicMock()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
mtp.train
writes a min_dist
file to disk. By mocking the ScratchDir
, this min_dist
file remains in the test folder after the test.
I suggest not mocking the ScratchDir
; that way, the side effect files will get deleted along with the scratch folder at the end of the test.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fixed
tests/models/test_mtp.py
Outdated
from crystal_diffusion.models.mtp import MTPWithMLIP3 | ||
|
||
|
||
class MockStructure: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is more of a style nitpick, so feel free to ignore.
I would tend to reserve the word "Mock" for things that come from the Mock library (or the mocker fixture). Here, I would call this "FakeStructure", or "StructureStub"...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
changed, that's a good idea
tests/models/test_mtp.py
Outdated
|
||
|
||
@pytest.fixture | ||
def mock_structure(): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nitpick: I would call this fake_structure
: it is a real Structure
object, not something derived from the Mock
class.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
changed
tests/models/test_mtp.py
Outdated
return instance | ||
|
||
|
||
@pytest.mark.parametrize("mock_subprocess", [0]) # Here, 0 simulates a successful subprocess return code |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This mock_subprocess
parameter can only be 0, or else the test fails. I suggest creating a separate fixture, like
@pytest.fixture
def successful_ subrocess_code():
return 0
and use that instead of a pytest.mark.parametrize
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
changed
tests/models/test_mtp.py
Outdated
|
||
# Mock subprocess.Popen for evaluate method's call | ||
# Mock subprocess.Popen to simulate an external call to `mlp` command | ||
mock_popen = mocker.patch("subprocess.Popen") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is common to test_evaluate
and test_train_method
. I would hide that in a fixture.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fixed
tests/models/test_mtp.py
Outdated
|
||
# Mock the external dependencies and method calls within the MTPWithMLIP3.train method | ||
@pytest.mark.parametrize("mock_subprocess", [0]) # Here, 0 simulates a successful subprocess return code | ||
def test_train_method(mocker, mock_subprocess): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
test_train_method
-> test_train
, to be consistent with test_evaluate
below?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
name changed
crystal_diffusion/models/mtp.py
Outdated
# calculate_grade is the method to get the forces, energy & maxvol values | ||
cmd = [self.mlp_command, "calculate_grade", self.fitted_mtp, original_file, predict_file] | ||
predict_file += '.0' # added by mlp... | ||
with subprocess.Popen(cmd, stdout=subprocess.PIPE) as p: # run mlp |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This block is duplicated between train
and evaluate
. I would refactor that as a "private" _method.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
refactored
crystal_diffusion/models/mtp.py
Outdated
|
||
Args: | ||
filename: name of mlp output file to be parsed. | ||
nbh_grade (optional): if True, add the nbh_grades in the resulting dataframe. Defaults to False. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what does "nbh_grade" mean? Is "nbh" an acronym? A quick explanation here would be useful.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nbh stands for neighborhood. mlip does either structure or neighborhood to compute the gamma factor. Let's stick to nbh (default) for now. We could revisit this later.
Added information in the arg description.
crystal_diffusion/models/mtp.py
Outdated
outputs = d["outputs"] | ||
pos_arr = np.array(outputs["position"]) | ||
force_arr = np.array(outputs["forces"]) | ||
n_atom = force_arr.shape[0] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There's a num_atoms
field in the docs
dictionary. A few sanity check asserts on array dimensions here would help prevent the user from shooting themselves in the foot.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
good catch. Added assert for that
crystal_diffusion/models/mtp.py
Outdated
mtp_file_path = os.path.join(self.mlp_templates, unfitted_mtp) | ||
shutil.copyfile(mtp_file_path, os.path.join(os.getcwd(), unfitted_mtp)) | ||
commands = [self.mlp_command, "mindist", atoms_filename] | ||
with open("min_dist", "w") as f, subprocess.Popen(commands, stdout=f) as p: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This Popen
could be hidden in a _method
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
hidden, it is
crystal_diffusion/models/mtp.py
Outdated
# f"--bfgs-conv-tol={bfgs_conv_tol}", | ||
# f"--weighting={weighting}", | ||
] | ||
with subprocess.Popen(cmds_list, stdout=subprocess.PIPE) as p: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This Popen
could be hidden away in a _method
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should be hidden now
crystal_diffusion/train_mtp.py
Outdated
|
||
# TODO list of yaml files should come from an external call | ||
# yaml dump file | ||
lammps_yaml = ['lammps_scripts/Si/si-custom/dump.si-300-1.yaml'] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These files are not in the repo. Maybe we could add them to the examples
folder so that the main
below is runnable from a clean repo.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
added the files in examples/local/mtp_examples
These might be removed later. I'll want to have a 'real' interface to data sources at some point.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I didn't look too carefully at the mlip3_missing_files
: I assumed they are copied directly from the MLIP-2 repo. There is a CPP file in there: if you changed anything in there, I think it should be highlighted at the top of that file.
Otherwise, I make various suggestions for changes. We can have a live chat in the afternoon if useful.
crystal_diffusion/train_mtp.py
Outdated
return mtp_inputs | ||
|
||
|
||
def train_mtp(train_inputs: Dict[str, Any], mlip_cmd_path: str, save_dir: str) -> MTPWithMLIP3: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I suggest mlip_cmd_path
-> mlip_folder_path
to avoid suggesting that the 'path-to-executable' is what is needed here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fixed
crystal_diffusion/train_mtp.py
Outdated
return mtp_inputs | ||
|
||
|
||
def train_mtp(train_inputs: Dict[str, Any], mlip_cmd_path: str, save_dir: str) -> MTPWithMLIP3: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think train_inputs
should be a namedtuple https://docs.python.org/3/library/collections.html#collections.namedtuple.
That's a pretty lightweight way of "typing", ie telling the user exactly what is expected as input.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
changed
crystal_diffusion/train_mtp.py
Outdated
return mtp | ||
|
||
|
||
def evaluate_mtp(eval_inputs: Dict[str, Any], mtp: MTPWithMLIP3) -> Tuple[pd.DataFrame, pd.DataFrame]: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think eval_inputs
should be a namedtuple
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
changed
crystal_diffusion/train_mtp.py
Outdated
|
||
|
||
def train_mtp(train_inputs: Dict[str, Any], mlip_cmd_path: str, save_dir: str) -> MTPWithMLIP3: | ||
"""Create and evaluate an MTP potential. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should this say "create and train an MTP potential"?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fixed
crystal_diffusion/train_mtp.py
Outdated
|
||
|
||
def evaluate_mtp(eval_inputs: Dict[str, Any], mtp: MTPWithMLIP3) -> Tuple[pd.DataFrame, pd.DataFrame]: | ||
"""Create and evaluate an MTP potential. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should this say "Evaluate a trained MTP potential"?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fixed
crystal_diffusion/train_mtp.py
Outdated
gt_energy = df_orig.groupby('structure_index').agg({'energy': 'mean', 'atom_index': 'count'}) | ||
gt_energy = (gt_energy['energy'] / gt_energy['atom_index']).to_numpy() | ||
|
||
predicted_forces = df_predict.groupby('structure_index').agg({'fx': 'sum', 'fy': 'sum', 'fz': 'sum', |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sum on forces seems like a bad idea to me. It will lead to "cancellation of errors". When we have a single structure with
MAE =
What is implemented seems to be
BAD_MAE =
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I didn't catch that mistake when I was writing it, you are right, it was a bad implementation.
I wrote a simpler version that takes the MAE over atoms & directions. The contribution of a structure depends on the nature of atoms making it.
I doubt these metrics really mattter for us in the long run. I reused what was in maml, but you should define our own. And refactor the code so the metrics can be outside the entry point.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just a few minor things and we are good to go.
gt_energy = df_orig.groupby('structure_index').agg({'energy': 'mean', 'atom_index': 'count'}) | ||
gt_energy = (gt_energy['energy'] / gt_energy['atom_index']).to_numpy() | ||
|
||
predicted_forces = (df_predict[['fx', 'fy', 'fz']].to_numpy().flatten()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think the added parentheses are needed here:
predicted_forces = (df_predict[['fx', 'fy', 'fz']].to_numpy().flatten())
can be
predicted_forces = df_predict[['fx', 'fy', 'fz']].to_numpy().flatten()
tests/models/test_mtp.py
Outdated
df_predict = pd.DataFrame({ | ||
'structure_index': [0, 0, 1, 1], | ||
'atom_index': [0, 1, 0, 1], | ||
'energy': [1.05, 0.95, 3.1, 2.9], # Energy has a slight variation |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Replace this line with
'energy': [1.1, 1.1, 3.1, 3.1], # Energy has a slight variation
and the test fails. See comment below.
tests/models/test_mtp.py
Outdated
}) | ||
|
||
# Calculate expected MAE for energy and forces | ||
expected_mae_energy = mean_absolute_error( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is different from what is done in the function being tested. The function being tested computes the MAE on the energy per atom, whereas this computes the MAE on the total energy. The test still passes because the test data produces an MAE of zero for both possibilities.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I suggest changing the test data to be more "random".
The script trains and evaluates a MTP based on LAMMPS configurations passed as arguments. The current
main
intrain_mtp.py
depends on LAMMPS output. A clean interface with args parser needs to be done. Theevaluate
outputs dataframe in pandas format, so we land in python-friendly territory.A lot of kwargs for MLIP needs to be investigated and added to the script. The other major TODO is the write the method that converts the quasi-binary output from MLIP to a format usable by LAMMPS (not all lines are binary in the output - most are utf-8 already). maml has a script for mlip-2, I put in the code as comments. Note that this conversion was done in the
train
method. I disagree with that choice, so I move it to its own method.I am ambivalent to the choice of using a scratch dir with monty. This is from maml. I chose to keep it to simplify things, but we could revisit this decision.