import MOP directly into RXNO #27

Open
StroemPhi opened this issue Nov 24, 2021 · 16 comments · May be fixed by #44

Comments

@StroemPhi
Contributor

At the moment MOP is not directly imported into RXNO but only classes of it are imported.
This can lead to inconsistencies, e.g.:

  • cycloaddition is subsumed under cyclisation in RXNO but not in MOP.
  • there is a conflict between the classes polymerisation (MOP:0000629) & polymerisation reaction (RXNO:0000246) (or their respective subclasses), since both classes express the same chemical reaction type/mechanism but are subsumed under different parent classes ('molecular process' and 'carbon-carbon coupling reaction', respectively)

In order to maintain MOP and RXNO better, I therefore propose to import MOP directly into RXNO and resolve any conflicting issues.

@batchelorc
Collaborator

This would make a lot of sense.
Is there anything I need to do on my part?

@StroemPhi
Contributor Author

I have not started yet. But I think it would make sense to go over the axiomatization design decisions first, as these are a bit entangled with this.

StroemPhi added a commit to NFDI4Chem/rxno that referenced this issue Apr 7, 2022
@StroemPhi
Contributor Author

StroemPhi commented Apr 7, 2022

In order to be able to switch to an ODK workflow (#40), which entails importing MOP directly into RXNO, we need to resolve some discrepancies first. The main problem at the moment is that MOP classes are defined in the namespace of RXNO as well as in MOP. I assume that most of them were either copied from MOP into RXNO or first defined in RXNO and then copied into MOP. However this was done, we are now left with some MOP classes being defined only in RXNO and some being defined in both but with different subclass axioms and possibly also different annotations. In the issue branch of our NFDI4Chem fork, I've added a TSV in which I have put the MOP classes with ID, Label and SubClassOf in the first three columns and all the MOP classes from rxno.owl in the next three columns. For this I've used the following ROBOT command on rxno.owl and mop.owl to get CSVs from which I could copy the classes defined in both OWL files.

 robot export --input rxno.owl \
  --header "ID|LABEL|SubClass Of|SubProperty Of|IRI|Type" \
  --split ", " \
  --include "classes properties" \
  --export rxno.csv

In the resulting TSV, I've sorted the MOP classes coming from RXNO according to the ID so it becomes easier to see the differences.
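The side-by-side comparison could also be scripted instead of done by hand. The following is only a minimal sketch, assuming both exports use the header from the command above (so the columns `ID`, `LABEL` and `SubClass Of` exist and multiple parents are joined with `, `). The function names `load_classes` and `compare`, as well as the sample rows and IDs, are made up for illustration and are not taken from the actual ontologies.

```python
import csv
import io


def load_classes(csv_text, prefix="MOP:"):
    """Map class ID -> (label, sorted parent pieces) for IDs starting with prefix."""
    classes = {}
    for row in csv.DictReader(io.StringIO(csv_text)):
        class_id = (row.get("ID") or "").strip()
        if not class_id.startswith(prefix):
            continue
        # Caveat: some labels contain ", " themselves, so the pieces are not
        # always whole labels. Comparing the sorted pieces still works for
        # detecting differences between the two files.
        cell = row.get("SubClass Of") or ""
        parents = sorted(p.strip() for p in cell.split(", ") if p.strip())
        classes[class_id] = ((row.get("LABEL") or "").strip(), parents)
    return classes


def compare(mop_classes, rxno_classes):
    """Return (MOP IDs found only in RXNO, MOP IDs whose label or parents differ)."""
    only_in_rxno = sorted(set(rxno_classes) - set(mop_classes))
    differing = sorted(
        cid for cid in set(mop_classes) & set(rxno_classes)
        if mop_classes[cid] != rxno_classes[cid]
    )
    return only_in_rxno, differing


# Hypothetical sample rows standing in for the exported mop.csv and rxno.csv.
MOP_SAMPLE = """ID,LABEL,SubClass Of
MOP:9000001,example class A,"alkene oxidation, oxidative cleavage"
MOP:9000002,example class B,oxidation
"""

RXNO_SAMPLE = """ID,LABEL,SubClass Of
MOP:9000001,example class A,alkene oxidative cleavage
MOP:9000002,example class B,oxidation
MOP:9000003,example class C,oxidation
RXNO:9000001,example reaction,example class C
"""

only_in_rxno, differing = compare(load_classes(MOP_SAMPLE), load_classes(RXNO_SAMPLE))
print(only_in_rxno, differing)
```

For the real files, the sample strings would be replaced by the contents of the two exports, e.g. `load_classes(open("rxno.csv", encoding="utf-8").read())`.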

@StroemPhi
Contributor Author

StroemPhi commented Apr 7, 2022

What needs to be done now is to reconcile the MOP classes defined in RXNO with the ones in MOP.
I will try my best to do so, and will therefore update the TSV and add columns with comments for you to check/sign off, @batchelorc, ok?

@StroemPhi
Contributor Author

I'm switching from the VM to my Windows environment for that, and also from TSV to Excel for better readability. The upload of the .xlsx will follow.

StroemPhi added a commit to NFDI4Chem/rxno that referenced this issue Apr 7, 2022
StroemPhi added a commit to NFDI4Chem/rxno that referenced this issue Apr 7, 2022
@StroemPhi
Contributor Author

StroemPhi commented Apr 7, 2022

FYI, @batchelorc, idk if you know, but when checking these in Protégé, it's best to load MOP & RXNO in the same workspace, so that you can just switch between the two while keeping the selected class. I use Ctrl+F to search for the respective term and then switch back and forth to see the differences in the annotations/axioms.

StroemPhi added a commit to NFDI4Chem/rxno that referenced this issue Apr 7, 2022
StroemPhi referenced this issue in NFDI4Chem/rxno Apr 7, 2022
 MOP:0000543, MOP:0000550, MOP:0000555, MOP:0000556, MOP:0000561, MOP:0000562, MOP:0000563, MOP:0000564, MOP:0000565
@StroemPhi
Contributor Author

StroemPhi commented Apr 19, 2022

@batchelorc MOP:0000584, MOP:0000585, MOP:0000586, MOP:0000587 and MOP:0000588 are subsumed only under 'alkene oxidative cleavage' (RXNO:0000344) in RXNO, but in MOP they are subsumed under 'alkene oxidation' (MOP:0000581) and 'oxidative cleavage' (MOP:0000707), although the class 'alkene oxidative cleavage' (MOP:0000708) also exists there.

So, first problem: MOP:0000708 and RXNO:0000344 seem to be the same, so where should this class be declared, in MOP or in RXNO?

Second, to avoid asserted multiple inheritance, it would be better to stick to the pattern of using chemical groups for the differentiation of the subclasses, thus asserting MOP:0000584, MOP:0000585, MOP:0000586, MOP:0000587 and MOP:0000588 to be children of 'alkene oxidative cleavage' and defining 'oxidative cleavage' (MOP:0000707) in a way that allows us to infer that 'alkene oxidative cleavage' is also an 'oxidative cleavage'. Unfortunately, my chemistry knowledge is too limited to propose such a definition.

StroemPhi added a commit to NFDI4Chem/rxno that referenced this issue Apr 19, 2022
MOP:0000584, MOP:0000585, MOP:0000586, MOP:0000587 and MOP:0000588
@StroemPhi
Contributor Author

From a non-chemist perspective, I don't understand why 'oxidative cleavage' is actually needed as a subclass of oxidation in MOP, when the various oxidations are defined according to their involved reactants (e.g. alkene). I can see that it is certainly helpful for grouping purposes. But its current definition, "A bond-breaking process where the oxidation states of the reactive centres increase.", seems too generic to be actually useful for such a grouping, as one would have to define "oxidation state" and "reactive centre" for that. Wouldn't it thus be easier to just drop 'oxidative cleavage'?

@StroemPhi
Contributor Author

OK, after trying to understand more what oxidative cleavage is, I see why it is needed. However, what remains to be done, if we want to avoid asserted multiple inheritance, is a better definition (textual and logical) of "oxidative cleavage".

StroemPhi added a commit to NFDI4Chem/rxno that referenced this issue Apr 19, 2022
@StroemPhi
Contributor Author

StroemPhi commented Apr 19, 2022

My pattern proposal for inferring subclasses of 'oxidative cleavage', such as 'alkene oxidative cleavage', is to define the former with the EquivalentTo axiom: oxidation and ('has occurrent part' some 'breaking of covalent bond').
This way we'd only have to add the SubClassOf axiom 'has occurrent part' some 'breaking of covalent bond' to each oxidation subclass that is also an 'oxidative cleavage'. Using ELK or Pellet, this multiple inheritance can then be inferred (see screenshot).

[screenshot]

Now, I wonder if using 'breaking of covalent bond with group' instead of 'breaking of covalent bond' would be more precise, as I interpret the "substrate" mentioned in the definition of the former as the "reactive centre" used in the definition of 'oxidative cleavage'. Edit: This does not make sense, as the cleavage focuses on the bond of the substrate, not on the bond between the substrate and a functional group.
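To make the effect of this pattern concrete, here is a toy sketch of the inference. It is not how ELK or Pellet work internally: it only handles defined classes of the form "genus and (property some filler)" and ignores everything else in OWL. The function names, the dictionaries and the two "example ..." classes are assumptions made up for illustration.

```python
def ancestors(cls, parents):
    """All named superclasses of cls reachable through the parents mapping."""
    seen, stack = set(), list(parents.get(cls, ()))
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(parents.get(current, ()))
    return seen


def classify(asserted_parents, restrictions, definitions):
    """Add each defined class as a parent of every class that satisfies it."""
    inferred = {cls: set(ps) for cls, ps in asserted_parents.items()}
    known = {cls: set(rs) for cls, rs in restrictions.items()}
    for cls in known:
        inferred.setdefault(cls, set())
    # A defined class is itself a subclass of its genus and carries its restriction.
    for defined, (genus, restriction) in definitions.items():
        inferred.setdefault(defined, set()).add(genus)
        known.setdefault(defined, set()).add(restriction)
    changed = True
    while changed:  # repeat until no new parent can be added
        changed = False
        for cls in list(inferred):
            lineage = ancestors(cls, inferred) | {cls}
            inherited = set().union(*(known.get(m, set()) for m in lineage))
            for defined, (genus, restriction) in definitions.items():
                if (defined != cls and genus in lineage
                        and restriction in inherited
                        and defined not in inferred[cls]):
                    inferred[cls].add(defined)
                    changed = True
    return inferred


BOND_BREAKING = ("has occurrent part", "breaking of covalent bond")

# Only single parents are asserted; the "example ..." classes are hypothetical.
ASSERTED = {
    "alkene oxidation": {"oxidation"},
    "alkene oxidative cleavage": {"alkene oxidation"},
    "example cleavage subtype": {"alkene oxidative cleavage"},
    "example non-cleaving oxidation": {"oxidation"},
}
RESTRICTIONS = {"alkene oxidative cleavage": {BOND_BREAKING}}
DEFINITIONS = {"oxidative cleavage": ("oxidation", BOND_BREAKING)}

inferred = classify(ASSERTED, RESTRICTIONS, DEFINITIONS)
print(sorted(ancestors("example cleavage subtype", inferred)))
```

In this sketch, only 'alkene oxidative cleavage' carries the restriction, yet its subtype also ends up under 'oxidative cleavage' because the restriction is inherited, while the oxidation without the restriction does not.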

@cmungall

I think this is on the right lines, most classification use cases can be satisfied with simple logical definitions like this.

I would need to know more about the modeling to know whether the proposed logical definition is too permissive. Can you have a single instance of an oxidation process that encompasses different mechanisms?

There is a bad smell here: the text def differs from the proposed logical definition:

> A bond-breaking process where the oxidation states of the reactive centres increase

But maybe this is just a matter of the genus and differentia being inverted in order, which doesn't affect the meaning? Still, it's good if these are aligned. I really recommend MOP adopt a more standard genus-differentia form where the genus is a term in the ontology, because the text def makes it seem like the superclass should be 'breaking of covalent bond'.

Looking at the structure of this part of the hierarchy, you may get more mileage from axiomatizing the terms. There are more of these, more asserted MI, and due to their triviality they are less likely to lead to divergence between intended OWL and stated OWL. (I am just looking at MOP here)

@StroemPhi
Contributor Author

StroemPhi commented Apr 20, 2022

Thank you a lot @cmungall for this feedback!

> I would need to know more about the modeling to know whether the proposed logical definition is too permissive. Can you have a single instance of an oxidation process that encompasses different mechanisms?

I'm afraid I don't really understand what you mean by the second sentence. As far as I understand it, there are oxidations which can result in a cleavage of the substrate, e.g. the double bond of an alkene is cleaved, and those are special types of 'oxidative cleavage'. If there are oxidations which entail a different mechanism than cleavage, I would assume that we'd have to define new classes for these, e.g. 'oxidative [other mechanism]', no?

> There is a bad smell here: the text def differs from the proposed logical definition.

Totally agree. Having spoken to my chemist colleagues, and given that 'oxidative cleavage' is asserted as a subclass of 'oxidation' in MOP, the inverted order of the genus and differentia doesn't change the meaning, and thus the textual definition would also have to be changed to something along the lines of: "An oxidation that also entails the breaking of a covalent bond of the oxidized substrate." But let's wait and see what @batchelorc, or some other chemists reading this, will say about this.

> Looking at the structure of this part of the hierarchy you may get more mileage from axiomatizing the terms. There are more of these, more asserted MI, and due to their triviality are less likely to lead to divergence between intended OWL and stated OWL. (I am just looking at MOP here)

If you are referring to 'primary alkene oxidation to carboxylic acid and carbon dioxide', 'quaternary alkene oxidation to ketones', 'secondary terminal alkene oxidation to ketone and carbon dioxide', 'secondary, non-terminal alkene oxidation to aldehydes' and 'tertiary alkene oxidation to carboxylic acid and ketone', I can say that they are subsumed only under 'alkene oxidative cleavage' in RXNO. This is part of the problem I am trying to solve in this issue: MOP terms are declared in both RXNO and MOP, and unfortunately sometimes in different ways. Assuming that the subsumption in RXNO is more correct, I propose to subsume them like this in MOP as well. But then we are left with the fact that in MOP they are also subsumed under 'oxidative cleavage', hence my axiomatization proposal.

Current 'oxidation' branch in MOP:
[screenshot]
vs.
current 'oxidation' branch in RXNO:
[screenshot]

What leaves me puzzled, however, is that ELK or Pellet in Protégé won't infer that the subclasses of 'alkene oxidative cleavage' are also subclasses of 'oxidative cleavage'. So the inferred MI stops at the level of 'alkene oxidative cleavage'. Why is that, and would it thus be better to drop the proposed EquivalentTo axiom on 'oxidative cleavage' and just define 'alkene oxidative cleavage' as the union of 'alkene oxidation' and 'oxidative cleavage'?

StroemPhi added a commit to NFDI4Chem/rxno that referenced this issue Apr 20, 2022
MOP:0000619, MOP:0000627, MOP:0000628, MOP:0000642, MOP:0000650, MOP:0000656, MOP:0000671, MOP:0000705, MOP:0000713, MOP:0000714,  MOP:0000715,  MOP:0000716,  MOP:0000717,  MOP:0000718,  MOP:0000719,  MOP:0000720
@balhoff

balhoff commented Apr 20, 2022

> What leaves me puzzled, however, is that ELK or Pellet in Protégé won't infer that the subclasses of 'alkene oxidative cleavage' are also subclasses of 'oxidative cleavage'. So the inferred MI stops at the level of 'alkene oxidative cleavage'. Why is that, and would it thus be better to drop the proposed EquivalentTo axiom on 'oxidative cleavage' and just define 'alkene oxidative cleavage' as the union of 'alkene oxidation' and 'oxidative cleavage'?

@StroemPhi sorry if I'm missing something more interesting, but is this just a view issue? Protege only shows direct inferred superclasses in this view. If you do a DL query for Subclasses of 'oxidative cleavage' do you see all the terms you expect?
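The difference between the two views can be illustrated with a small sketch. It assumes a hypothetical table of direct inferred parents (the class "example cleavage subtype" is made up) and is not meant to reflect how Protégé computes either view: the class description lists only the direct parents, whereas a query for all subclasses follows the hierarchy transitively.

```python
def all_superclasses(cls, direct_parents):
    """Follow the direct-parent links transitively."""
    seen, stack = set(), list(direct_parents.get(cls, ()))
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(direct_parents.get(current, ()))
    return seen


def subclasses_of(target, direct_parents):
    """All classes that have target among their direct or indirect superclasses."""
    return {
        cls for cls in direct_parents
        if target in all_superclasses(cls, direct_parents)
    }


# Hypothetical direct inferred parents after reasoning.
DIRECT_PARENTS = {
    "oxidative cleavage": {"oxidation"},
    "alkene oxidation": {"oxidation"},
    "alkene oxidative cleavage": {"alkene oxidation", "oxidative cleavage"},
    "example cleavage subtype": {"alkene oxidative cleavage"},
}

# The direct view of the subtype does not mention 'oxidative cleavage' ...
print(sorted(DIRECT_PARENTS["example cleavage subtype"]))
# ... but the transitive query still finds the subtype below it.
print(sorted(subclasses_of("oxidative cleavage", DIRECT_PARENTS)))
```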

@StroemPhi
Contributor Author

@balhoff, ha, thanks! It was indeed just a view issue. I hadn't used the DL query tab before and didn't know about the limitation of the normal view. ROBOT reason also infers as expected.

StroemPhi added a commit to NFDI4Chem/rxno that referenced this issue Apr 20, 2022
MOP:0000721, MOP:0000730-0000740, MOP:0000790-0000795, MOP:0000802, MOP:0000825, MOP:0000826, MOP:0001369, MOP:0001458, MOP:0001550, MOP:0002364, MOP:0002369, MOP:0002411, MOP:0002479, MOP:0002524, MOP:0003339, MOP:0003479, MOP:0003524, MOP:0006369
@StroemPhi
Contributor Author

StroemPhi commented Apr 20, 2022

@batchelorc I'm done with the reconciliation in the Excel file of my issue branch, so I'm waiting for your feedback on my comments in there. But tomorrow I will already start to purge from RXNO the classes that are identical in both, and push this rxno.owl to our NFDI4Chem issue branch.

StroemPhi added a commit to NFDI4Chem/rxno that referenced this issue Apr 21, 2022
StroemPhi added a commit to NFDI4Chem/rxno that referenced this issue Apr 21, 2022
StroemPhi added a commit to NFDI4Chem/rxno that referenced this issue Apr 21, 2022
StroemPhi added a commit to NFDI4Chem/rxno that referenced this issue Apr 21, 2022
…owl and correct its superclass subsumption in MOP"

This reverts commit d9a6663.
@StroemPhi
Contributor Author

StroemPhi commented Apr 21, 2022

As the owl files in the root folder are the release artefacts, I need to create editor files in which the direct import and the reconciliation is done and from which the improved release artefact will be generated. This is also needed as a prerequisite to switch to an ODK workflow (#40).

StroemPhi added a commit to NFDI4Chem/rxno that referenced this issue Apr 21, 2022
StroemPhi added a commit to NFDI4Chem/rxno that referenced this issue Apr 21, 2022
subsume children according to RXNO:0000344 before its obsoletion and add axiomatization as proposed in rsc-ontologies#27 (comment)
StroemPhi added a commit to NFDI4Chem/rxno that referenced this issue May 5, 2022
@StroemPhi StroemPhi linked a pull request May 11, 2022 that will close this issue
StroemPhi added a commit to NFDI4Chem/rxno that referenced this issue May 19, 2022
…cleavage" and alkene oxidative cleavage

reason: 'has occurrent part' some 'breaking of covalent bond' is too unspecific, as it doesn't restrict the bond breaking to the reactive centre, which would actually be needed in this case. Thus keeping both parents.

see also: rsc-ontologies#27 (comment)
4 participants