Create mapping file for Gb3 #169

Open
MelbourneFL opened this issue Feb 8, 2024 · 9 comments
Labels
data mapping files, experiments, simulations

Comments

@MelbourneFL
Contributor

Hello,

my name is Alexander Vogel and I'm trying to add a new simulation with 3 runs (7 microseconds each) to the databank (DOIs: 10.5281/zenodo.10635871 - 10.5281/zenodo.10635875 - 10.5281/zenodo.8335207). A while back there was a request for simulations with unusual lipids and this one contains the glycolipid Gb3.

It is my first time doing this, but with your help files it has gone smoothly so far. Now, however, I need to add the composition information to the yaml file. First I would like to confirm that the first part, which covers POPC and water, is correct. The setup was done with CHARMM-GUI using the CHARMM FF. Is this correct, then:

POPC:
NAME: POPC
MAPPING: mappingPOPCcharmm.yaml
SOL:
NAME: TIP3
MAPPING: mappingTIP3PCHARMMgui.yaml

In addition the simulations contain Gb3 which is constructed from separate parts for the Ceramide backbone and the three rings. They are called CER2, BGLC, BGAL and AGAL and I guess new mapping files have to be created for those. Can you help me with this?

Thanks and best regards,

Alexander

@ohsOllila
Member

Thanks for contributing!

> It is my first time doing this but with your help files it went smooth so far. Now however I need to add the composition information to the yaml file. First I would like to confirm that the first part is correct which is POPC and water. Setup was done with CHARMM-GUI with the CHARMM FF. Is this correct then:
>
> POPC:
> NAME: POPC
> MAPPING: mappingPOPCcharmm.yaml
> SOL:
> NAME: TIP3
> MAPPING: mappingTIP3PCHARMMgui.yaml

This looks correct, but note that indentation matters in YAML, so it should be
COMPOSITION:
POPC:
NAME: POPC
MAPPING: mappingPOPCcharmm.yaml
SOL:
NAME: TIP3
MAPPING: mappingTIP3PCHARMMgui.yaml

> In addition the simulations contain Gb3 which is constructed from separate parts for the Ceramide backbone and the three rings. They are called CER2, BGLC, BGAL and AGAL and I guess new mapping files have to be created for those. Can you help me with this?

There is already a mapping file for GM1 available; maybe this can be used as a template: https://github.com/NMRLipids/Databank/blob/main/Scripts/BuildDatabank/mapping_files/mappingGM1charmm.yaml

@markussmiettinen may also have thought about this?

@MelbourneFL
Contributor Author

I'm confused. Your indents look just like mine (there are none). I guess the indents are removed here. I created them just as in the example yaml file.

About the mapping: I looked at the GM1 file and partially understand the structure, but I wouldn't know how to create one myself. For example, looking at the first segment:

M_G1_M:
ATOMNAME: C3S
FRAGMENT: backbone
RESIDUE: CER160

I guess ATOMNAME, FRAGMENT and RESIDUE come from the structure and naming in the simulation...so I could figure that out but how would I come up with M_G1_M?

Alexander

@ohsOllila
Member

Sorry, the indents were not properly shown in this editor. Check the correct indents for example from here: https://github.com/NMRLipids/Databank/blob/main/Scripts/BuildDatabank/info_files/777/info.yaml

> I guess ATOMNAME, FRAGMENT and RESIDUE come from the structure and naming in the simulation...so I could figure that out

Yes.

> but how would I come up with M_G1_M?

These are universal atom names. There is some more explanation here: https://nmrlipids.github.io/moleculesAndMapping.html#universal-atom-names-in-mapping-files. However, this may not be entirely clear for sugars. You can check how this is implemented by comparing the GM1 mapping file to the structure here: https://zenodo.org/doi/10.5281/zenodo.8331804. In principle, it does not matter how these are named as long as each atom has a unique name, but the more logical the names are, the easier it is for a human to understand them. Or maybe @markussmiettinen has some insight on this?

@markussmiettinen
Member

Hi Alexander, thank you for contributing! I am just about to go offline for a week, but maybe @comcon1 would like to comment on the naming of glycolipids in the meantime?

@MelbourneFL
Contributor Author

Hello,

I went ahead and created a mapping file based on GM1. I tried to come up with a numbering scheme for the sugars that makes sense. Unfortunately, I can't upload it here (https://github.com/NMRLipids/Databank/tree/main/Scripts/BuildDatabank/mapping_files) because it says uploads are disabled. I also can't attach it to this post since the file type is not supported. So I uploaded it here (where it will be available for 17 days): https://daten-transport.de/?id=J8FEt59PLbBm

Since I'll be going on vacation tomorrow (I didn't expect this to take this long), could somebody please add it to the databank and also add my simulations? The info.yaml files can be downloaded here (where they will be available for 17 days): https://daten-transport.de/?id=35JFSuY3TYKM

Thanks a lot!

Alexander

PS: When using the GM1 mapping as a template, I found two atoms which in my opinion don't belong there:

This O1 seems to be in the file twice:
M_G4C1O1_M:
ATOMNAME: O1
FRAGMENT: headgroup
RESIDUE: BGLC

This HO4 is not present in the BGAL for Gb3:
M_G5O4H4_M:
ATOMNAME: HO4
FRAGMENT: headgroup
RESIDUE: BGAL
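
A check for the first kind of problem (the same atom of the same residue listed under two universal names) could look roughly like the sketch below. The dictionary imitates the shape of a mapping file; the label `M_EXAMPLE1_M` and the `repeated_atoms` helper are invented for illustration and do not come from the GM1 file.

```python
from collections import defaultdict

# Hypothetical excerpt in the shape of a mapping file: universal name -> atom info.
# "M_EXAMPLE1_M" is an invented label standing in for the other O1 entry.
mapping = {
    "M_EXAMPLE1_M": {"ATOMNAME": "O1", "FRAGMENT": "headgroup", "RESIDUE": "BGLC"},
    "M_G4C1O1_M": {"ATOMNAME": "O1", "FRAGMENT": "headgroup", "RESIDUE": "BGLC"},
    "M_G5O4H4_M": {"ATOMNAME": "HO4", "FRAGMENT": "headgroup", "RESIDUE": "BGAL"},
}

def repeated_atoms(mapping):
    """Return {(residue, atom name): [universal names]} for atoms mapped more than once."""
    seen = defaultdict(list)
    for universal_name, info in mapping.items():
        seen[(info["RESIDUE"], info["ATOMNAME"])].append(universal_name)
    return {atom: names for atom, names in seen.items() if len(names) > 1}

duplicates = repeated_atoms(mapping)
print(duplicates)
```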

ohsOllila pushed a commit that referenced this issue Feb 25, 2024
@ohsOllila
Member

Thanks for the files.

You cannot upload directly to the main databank branch. You first need to make your own fork, upload there, and then make a pull request to the main branch. Anyway, I have now added the mapping file here: https://github.com/NMRLipids/Databank/blob/main/Scripts/BuildDatabank/mapping_files/mappingGB3charmm.yaml. I also added one of the info files here: https://github.com/NMRLipids/Databank/blob/main/Scripts/BuildDatabank/info_files/790/info.yaml. However, when I tried to add the simulation to the databank (python AddData.py -f info_files/790/info.yaml), the number of atoms did not match: "Number of atoms in trajectory 74381 and README.yaml 74371 do no match." The difference is ten atoms, which equals the number of GB3 molecules in the system. This suggests that one atom per GB3 molecule may be missing from the mapping file. It would also be good to check and fix the mentioned issues in the GM1 mapping file.
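
The arithmetic behind that guess, written out as a small sketch using only the numbers quoted in the error message and the GB3 count mentioned here (the variable names are made up):

```python
# Numbers taken from the error message; the GB3 count comes from this comment.
atoms_in_trajectory = 74381
atoms_from_mapping = 74371
n_gb3 = 10

missing_total = atoms_in_trajectory - atoms_from_mapping
# If the shortfall divides evenly by the number of GB3 molecules,
# the same number of atoms is probably unaccounted for in each molecule.
missing_per_gb3, remainder = divmod(missing_total, n_gb3)
print(missing_total, missing_per_gb3, remainder)
```

A remainder of zero with one atom per molecule is what points at the GB3 mapping rather than at POPC or water.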

@MelbourneFL
Contributor Author

Thanks for your help and feedback. Actually, all atoms are accounted for, but I accidentally used one M_XXX_M label twice. I fixed it and created a new fork and pull request. Could you please check again?
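
A repeated label is easy to miss because a dictionary can hold each key only once, so a loader that builds a dictionary from the file would typically keep just one of the two entries (this is an assumption about how the file is read, not something checked against the databank code). A rough sketch that scans the raw text instead, with made-up labels and a hypothetical `repeated_labels` helper:

```python
from collections import Counter

# Made-up mapping text; the labels and atom names are placeholders.
sample = """\
M_A1_M:
  ATOMNAME: C1
  RESIDUE: BGLC
M_A2_M:
  ATOMNAME: C2
  RESIDUE: BGLC
M_A1_M:
  ATOMNAME: C3
  RESIDUE: BGLC
"""

def repeated_labels(text):
    """Return top-level keys (unindented lines ending in ':') that occur more than once."""
    labels = [
        line.rstrip()[:-1]
        for line in text.splitlines()
        if line
        and not line[0].isspace()
        and not line.startswith("#")
        and line.rstrip().endswith(":")
    ]
    return sorted(label for label, count in Counter(labels).items() if count > 1)

repeats = repeated_labels(sample)
print(repeats)
```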

Alexander

ohsOllila pushed a commit that referenced this issue Mar 1, 2024
@ohsOllila
Member

Thanks. I added one of the systems: https://github.com/NMRLipids/Databank/blob/main/Scripts/BuildDatabank/info_files/790/info.yaml, and everything seems to work now (except PCA, which does not work for sugars). The results are here: https://github.com/NMRLipids/Databank/tree/main/Data/Simulations/82e/424/82e42412f1383f19441f90baf20721ab773ce1fa/6fd7d3a864973fb14dee318fc08f64f33a33e661

I had to change CER24 -> CER240 in the mapping file because CER240 is the real residue name, even though the last character is left out in gro files.

I would also have added the other systems, but I could no longer access the info files via the link. Could you commit them to the repository, or send a new link?
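
The dropped character comes from the fixed-width layout of gro files, where the residue name column is five characters wide. A minimal sketch of the effect (the `gro_resname` helper is hypothetical and only mimics the truncation):

```python
GRO_RESNAME_WIDTH = 5  # width of the residue-name column in the GRO fixed format

def gro_resname(full_name):
    """Return the residue name as it would appear in a .gro file (hypothetical helper)."""
    return full_name[:GRO_RESNAME_WIDTH]

shown = gro_resname("CER240")
print(shown)
```

Names of five characters or fewer, such as BGLC, pass through unchanged, so only the ceramide residue needed the correction.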

@ohsOllila
Member

@MelbourneFL I think there are still a couple of finished info files that could be added but have not been yet, because I was too slow to download them. Would it be possible to send them again?

@comcon1 comcon1 added the data mapping files, experiments, simulations label Jan 12, 2025