Upgrades Living Document - Datasets #636

Open
ardunn opened this issue Jun 8, 2021 · 6 comments
Labels
living document ISSUE with ideas/progress on large projects or upgrades

ardunn commented Jun 8, 2021

Cubic crystal compounds

Superconductor temperatures

2D Ferromagnets

  • 786 2D ferromagnets, 26 of them with a Curie point above 400 K, from here

UV/Vis spectra of metal oxides

  • Composition and UV-Vis measurements for 179,072 metal oxides, from here

UCSB Thermoelectrics database

CMRDB 2D Materials Database

Vickers load-dependent hardness dataset

TAATA polymorphs dataset addressed in #794

@CompRhys WBM set

@ardunn ardunn added the living document ISSUE with ideas/progress on large projects or upgrades label Jun 8, 2021

ardunn commented Jan 13, 2022

See also materialsproject/matbench#2

@CompRhys

https://zenodo.org/record/5530535 I was recently able to get permission to upload this publicly for some of our work. Potentially of interest: the data is 3 highly sampled phase diagrams, including a lot of unstable structures. I uploaded both the initial (from prototyping) and relaxed structures. The authors refer to it as the TAATA data set.


ardunn commented Mar 16, 2022

@CompRhys this is great! Can we upload this to our figshare as well, in a different form (one big df with all systems in it)?

If not, will that Zenodo link always be available? If so, we can still add it as a dataset.

@CompRhys

Zenodo is a different permanent archive service that will always exist as long as CERN exists, so you can use the Zenodo links safely. However, I got permission to share with MIT, so whatever makes sense is allowed. If you want to combine them into a single data set, just note that there are some duplicated materials, found in all 3 phase diagrams, that should be removed: they have the same ht_id.
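For the "one big df" form, a minimal sketch of the combine-and-deduplicate step might look like the following. It assumes each phase diagram has already been loaded into a pandas DataFrame with an `ht_id` column; the function name, the toy ids, and the `system` column are hypothetical stand-ins, not the actual layout of the Zenodo files.

```python
import pandas as pd


def combine_phase_diagrams(frames, id_column="ht_id"):
    """Stack per-system DataFrames and keep one row per unique identifier."""
    combined = pd.concat(frames, ignore_index=True)
    # Materials shared between phase diagrams carry the same ht_id,
    # so keeping the first occurrence drops the repeated rows.
    deduplicated = combined.drop_duplicates(subset=id_column, keep="first")
    return deduplicated.reset_index(drop=True)


# Toy stand-ins for the three phase diagrams (hypothetical ids and columns).
system_a = pd.DataFrame({"ht_id": ["shared", "a1"], "system": ["A", "A"]})
system_b = pd.DataFrame({"ht_id": ["shared", "b1"], "system": ["B", "B"]})
system_c = pd.DataFrame({"ht_id": ["shared", "c1"], "system": ["C", "C"]})

df = combine_phase_diagrams([system_a, system_b, system_c])
print(df)
```

Keeping the first occurrence is an arbitrary choice here; if the duplicated rows differ in any column between systems, that would need checking before dropping them.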


ardunn commented Mar 18, 2022

BTW @CompRhys this is an excellent and somewhat unique dataset. Shame the band gaps and tensors for all the compounds weren't included, but still. I think this has the potential to be a difficult and interesting problem in matbench eventually, so I'll be adding it to the living document on that repo as well.

@CompRhys

Speaking of bandgaps, I also have another data set that I think would be good, which I have been calling WBM. The relaxed structures are available already (https://archive.materialscloud.org/record/2021.68), but there are a few issues with matching the dataset to the bandgaps in the summary file that took a while to get right. When @janosh attended the MP workshop we asked about having it added to MP, as it is MP compatible, but we never chased it. Lmk if interested and I will share the cleaned version with initial structures.

There's also some more data shared by the groups of B and M from WBM that I think might be worth making a true benchmark with, but I haven't found time to dig into it. A super nice thing about WBM is the sequential nature of the batches, which might not be easy to maintain if we add in the extra data. This was the idea I was hinting at here: materialsproject/matbench#104 (comment).
