Some users had trouble configuring the qmpy environment (#1, #2), so I created a Docker image for it.
Changes from the Previous Release
- Published the Docker image for building the OQMD v1.2 dataset: tonyy999/qmpy-v1.2 (64a1781)
- Added allow_pickle=True to np.load() (0c9c065, 3b407e1)
- Updated the dataset instruction (0a56f88, 7db8313, 4a64f16, db3306a, a7ee460)
- Added Colab notebook links (2a910bb, 0e215c3)
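For readers wondering why allow_pickle=True matters for np.load(), here is a minimal sketch. It assumes NumPy 1.16.3 or later, where np.load() refuses to unpickle object arrays by default. The file name and array contents are hypothetical and are not taken from the repository.

```python
import os
import tempfile

import numpy as np


def roundtrip_object_array():
    """Save an object array, then load it with and without allow_pickle."""
    # An object array holding variable-length lists (hypothetical content).
    data = np.empty(2, dtype=object)
    data[0] = [1, 2, 3]
    data[1] = [4, 5]

    path = os.path.join(tempfile.mkdtemp(), "example.npy")
    np.save(path, data, allow_pickle=True)

    # With the default allow_pickle=False, loading an object array fails.
    try:
        np.load(path)
        default_failed = False
    except ValueError:
        default_failed = True

    # Passing allow_pickle=True restores the original objects.
    loaded = np.load(path, allow_pickle=True)
    return default_failed, loaded


default_failed, loaded = roundtrip_object_array()
```

Pickled data can execute arbitrary code when loaded, so allow_pickle=True is best reserved for files from a trusted source.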
Testing the Docker Image
I used a GCP virtual machine to test the Docker image. The VM's only disk was a 200 GB SSD persistent boot disk.
- Import of OQMD v1.2 and execution of oqmd_data.py
  - VM: t2d-standard-1 (1 AMD Milan CPU, 4GB RAM)
  - The import took an hour, and the data extraction by oqmd_data.py took 5 hours 43 minutes.
  - The memory consumption stayed below about 2 GB.
  - The log is available at this link.
- Execution of mp_graph.py
  - VM: c2d-standard-8 (8 AMD Milan CPUs, 32GB RAM)
  - The graph generation by mp_graph.py took 210 minutes.
  - The memory consumption was lower than 2 GB.
  - The log is available at this link.
- Execution of oqmd.py
  - VM: t2d-standard-2 (2 AMD Milan CPUs, 8GB RAM)
  - The dataset creation by oqmd.py took about 30 minutes.
  - The memory consumption was lower than 6 GB.
  - The log is available at this link.
- Validation of the created dataset
  - A validation tool was used to validate the created dataset.
  - 15 mismatches were found in spacegroup, but the official dataset likely contains some incorrect space groups due to bugs in Spglib around 2019. Errata for the space groups are available at this link. These mismatches do not affect the validity of the dataset because the space groups are not used to transform the other data.
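The kind of field-by-field comparison described in the validation step can be sketched as follows. This is not the validation tool that was actually used. The function name count_mismatches, the record layout, the formation_energy field, and all toy values are assumptions made for illustration; only the spacegroup field name comes from the notes above.

```python
from collections import Counter


def count_mismatches(created, reference, ignore=()):
    """Count, per field, how many records differ between two datasets.

    Both arguments are assumed to be dicts mapping a record ID to a dict
    of field values (a hypothetical layout, not the real dataset format).
    """
    counts = Counter()
    for key, ref_record in reference.items():
        new_record = created.get(key)
        if new_record is None:
            counts["<missing record>"] += 1
            continue
        for field, ref_value in ref_record.items():
            if field in ignore:
                continue
            if new_record.get(field) != ref_value:
                counts[field] += 1
    return counts


# Toy data: the two datasets differ only in the space group of record "b".
reference = {
    "a": {"spacegroup": 225, "formation_energy": -1.0},
    "b": {"spacegroup": 62, "formation_energy": -0.5},
}
created = {
    "a": {"spacegroup": 225, "formation_energy": -1.0},
    "b": {"spacegroup": 63, "formation_energy": -0.5},
}

mismatches = count_mismatches(created, reference)
remaining = count_mismatches(created, reference, ignore=("spacegroup",))
```

Reporting mismatches per field makes it easy to see when every difference is confined to a single field such as spacegroup, and the ignore parameter confirms that nothing else differs once that field is set aside.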