[BUG] Trained model works with lammps in the first round but fails in the second #1493

Open
wangyi01 opened this issue Mar 25, 2024 · 2 comments

Comments

@wangyi01

Bug summary

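I am using DPGEN with DPA2 and FP-labeled samples. Starting from the pre-trained model plus alloy_domains, I obtained an initial trained model and ran it with lammps to generate trajectories. dpgen2 then selected the configurations for FP calculation, the FP calculations produced labeled samples, and a new model was retrained on them. Up to this point everything ran normally. However, when this new model is run with lammps again, the temperature becomes abnormal during NVT and atoms are lost. I suspect that the newly generated model hits a problem with some lammps interface in this round of the loop. The differences between the first and second rounds: the temperature changed from 1273 K to 873 K, and numb_steps, disp_freq, and save_freq were modified in train.json.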

DP-GEN Version

dpgen2

Platform, Python Version, Remote Platform, etc

No response

Input Files, Running Commands, Error Log, etc.

https://workflows.deepmodeling.com/workflows/argo/sampling-titaalcrfenico-hk4e5

Steps to Reproduce

Further Information, Files, and Links

No response

@wangyi01 wangyi01 added the bug Something isn't working label Mar 25, 2024
@njzjz
Member

njzjz commented Mar 27, 2024

We don't accept non-English issues. Please use English for discussion.

@njzjz njzjz closed this as not planned Mar 27, 2024
@njzjz njzjz added the invalid This doesn't seem right label Mar 27, 2024
@wangyi01
Author

> We don't accept non-English issues. Please use English for discussion.

The workflow combines DPGEN with DPA2 and FP-labeled samples. Starting from the pre-trained model plus alloy_domains, I obtained an initial trained model and ran it with lammps to generate trajectories. dpgen2 then selected configurations for FP calculation, the FP calculations produced labeled samples, and a new model was retrained on them. Up to this point everything ran normally. However, when this new model is run with lammps again, the temperature becomes abnormal during the NVT simulation and atoms are lost. I suspect that the newly generated model hits an interface problem with lammps in this round of the loop. The differences between the first and second rounds are that the temperature changed from 1273 K to 873 K, and that numb_steps, disp_freq, and save_freq were modified in train.json.
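To confirm that those three keys are the only training settings that differ between the two rounds, the two train.json files can be compared key by key. Below is a minimal sketch using only the Python standard library. The helper names (`flatten`, `diff_configs`), the numeric values, and the file paths in the comment are hypothetical placeholders, not the settings from this workflow.

```python
import json

MISSING = "<missing>"  # marker for a key present in only one file


def flatten(config, prefix=""):
    """Flatten nested dicts/lists into {"a/b/0": value} pairs."""
    if isinstance(config, dict) and config:
        items = config.items()
    elif isinstance(config, list) and config:
        items = enumerate(config)
    else:
        # Scalars and empty containers are kept as leaf values.
        return {prefix: config}
    flat = {}
    for key, value in items:
        path = f"{prefix}/{key}" if prefix else str(key)
        flat.update(flatten(value, path))
    return flat


def diff_configs(old, new):
    """Return {path: (old_value, new_value)} for every differing leaf."""
    old_flat, new_flat = flatten(old), flatten(new)
    changes = {}
    for path in sorted(set(old_flat) | set(new_flat)):
        before = old_flat.get(path, MISSING)
        after = new_flat.get(path, MISSING)
        if before != after:
            changes[path] = (before, after)
    return changes


# Placeholder configs; with real files one would instead do, for example:
#   with open("round1/train.json") as f:   # hypothetical path
#       round1 = json.load(f)
round1 = {"training": {"numb_steps": 1000, "disp_freq": 100,
                       "save_freq": 100, "seed": 1}}
round2 = {"training": {"numb_steps": 2000, "disp_freq": 200,
                       "save_freq": 500, "seed": 1}}

changes = diff_configs(round1, round2)
for path, (before, after) in changes.items():
    print(f"{path}: {before} -> {after}")
```

If the diff shows only numb_steps, disp_freq, and save_freq, the training settings are unlikely to explain the temperature blow-up by themselves, and the lammps input for the 873 K round is the next thing worth comparing.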

@njzjz njzjz reopened this Apr 10, 2024
@njzjz njzjz removed bug Something isn't working invalid This doesn't seem right labels Apr 10, 2024