Cleaning up the custom Makefile #2973

gouttegd · 2023-07-13T18:13:45Z

The uberon.Makefile is currently a big mess.

Since burning it down to the ground and starting anew is not an option (not that I wouldn’t want to do it, but more reasonable people on this project would disagree, and that’s probably for the best), we need to improve it incrementally instead.

Purging commented out code

It’s easy to think that code that is commented out is harmless, but I argue that it is not the case. The amount of commented code in the Makefile is one of the things that make it very hard to read. I can understand the reluctance to throw away old code just in case it could be useful again, but we should remember that everything is under version control anyway: we’ll never definitely lose anything.

So I propose to mercilessly erase commented out code in cases like the following examples:

##MAKEOBO= owltools $< --remove-axiom-annotations -o -f obo [email protected] && grep -v ^property_value: [email protected] | grep -v ^owl-axioms: > [email protected] && obo2obo [email protected] -o $@
MAKEOBO=  $(OWLTOOLS) $< --add-obo-shorthand-to-properties  -o -f obo --no-check $@.tmp1 && grep -v ^property_value: $@.tmp1 | perl -npe 's@relationship: dc-@property_value: dc-@' | grep -v ^owl-axioms: > $@.tmp && mv $@.tmp  $@

[…]

# Not used anywhere so commenting out
#$(TMPDIR)/phenoscape_homology.owl:
#       wget --no-check-certificate https://raw.githubusercontent.com/phenoscape/phenoscape-ontologies/master/demo/phenoscape_homology.owl -O $@

[…]

# First pass at making base module
# Currently this will be missing the temporary reflexivity axioms.
# should this include downward-injected axioms, e.g on ZFA?
#uberon-base.owl: $(TMPDIR)/unreasoned.owl
#       $(OWLTOOLS) $< --remove-imports-declarations --remove-axioms -t Declaration --set-ontology-id -v $(RELEASE)/$@ $(URIBASE)/uberon/$@ -o [email protected] && mv [email protected] $@

etc.

Purging obsolete comments

In addition to commented out code, there are also plain text comments that are clearly outdated. Again, they are not harmless. One can lose quite a bit of time just trying to figure out whether a given comment is actually saying something useful that one should be aware of, or if it is something that is only of interest for Uberon archeologists.

For example, there’s an entire blocks of comments dedicated on phenoscape and how to merge it with Uberon – but phenoscape as a separate entity is no longer a thing since two years ago!

Same punition as above: we should delete such comments. Again, they will remain in the history of the project for eternity, if really someone needed to consult them one day.

Obsolete mirroring and import rules

There are lots of custom rules to deal with mirrors and imports. I suspect many if not most of them should no longer be necessary especially after the switch to base merging. We should review those rules to check:

if they are used at all; if they are not, they should be deleted;
if they are used, then are they doing something useful? Are they supposed to be used? If no, again they should be deleted.

Obsolete rules related to Travis

We are no longer using Travis, so there’s no reason to keep Travis-specific rules around. We should whether those rules are still used as part of the GitHub-based CI checks, and if they are not, delete them.

Use of OWLTOOLS when a ROBOT equivalent already exists

There are rules that make use of OWLTOOLS for tasks that ROBOT is completely able to perform. For example:

$(TMPDIR)/unreasoned.owl: $(OWLSRC) $(BRIDGEDIR)/uberon-bridge-to-bfo.owl
        $(OWLTOOLS) $^ --merge-support-ontologies -o -f functional $@

This is merely equivalent to

$(TMPDIR)/unreasoned.owl: $(OWLSRC) $(BRIDGEDIR)/uberon-bridge-to-bfo.owl
        $(ROBOT) merge -i $< -i $(BRIDGEDIR)/uberon-bridge-to-bfo.owl --collapse-import-closure false -o $@

Merging rules that no longer need to be kept separate

In the previous example, the $(TMPDIR)/unreasoned.owl target is only used once elsewhere in the Makefile, as a dependency to the following rule:

$(TMPDIR)/materialized.owl: $(TMPDIR)/unreasoned.owl $(TMP_REFL)
        $(ROBOT) merge -i $< -i $(DEVELOPS_FROM_CHAIN) --collapse-import-closure false \
                relax \
                materialize -T $(CONFIGDIR)/basic_properties.txt -r elk \
                reason -r elk --exclude-duplicate-axioms true --equivalent-classes-allowed asserted-only \
                unmerge -i $(TMP_REFL) \
                unmerge -i $(DEVELOPS_FROM_CHAIN) \
                annotate -O $(URIBASE)/uberon/materialized.owl -V  $(RELEASE)/materialized.owl -o $@ 2>&1 > [email protected]

If $(TMPDIR)/unreasoned.owl is now itself generated by a ROBOT command and is not used anywhere else, then there’s no need for two separate rules and even for two separate commands! It can all be done in a single rule with a single ROBOT pipeline:

$(TMPDIR)/materialized.owl: $(OWLSRC) $(BRIDGEDIR)/uberon-bridge-to-bfo.owl $(TMP_REFL)
        $(ROBOT) merge -i $< -i $(BRIDGEDIR)/uberon-bridge-to-bfo.owl -i $(DEVELOPS_FROM_CHAIN) --collapse-import-closure false \
                relax \
                materialize -T $(CONFIGDIR)/basic_properties.txt -r elk \
                reason -r elk --exclude-duplicate-axioms true --equivalent-classes-allowed asserted-only \
                unmerge -i $(TMP_REFL) \
                unmerge -i $(DEVELOPS_FROM_CHAIN) \
                annotate -O $(URIBASE)/uberon/materialized.owl -V  $(RELEASE)/materialized.owl -o $@ 2>&1 > [email protected]

Reducing the number of times we spawn a new OWLTOOLS or ROBOT process reduces the number of times the ontology needs to be parsed from disk. In the case of Uberon I am willing to bet the gain speed wouldn’t be negligeable.

Potentially obsolete workarounds

There are several instances where we work around problems coming from imported ontologies. Nothing inherently wrong with that (sometimes we don’t have any choice), but we should check whether those workarounds are still needed.

For example:

# TODO: We should fix that upstream
# They are using too strict relationships, which are here replaced by more general ones
$(TMPDIR)/fixed-zfa.obo: $(TMPDIR)/mirror-zfa.obo
        if [ $(MIR) = true ] && [ $(IMP) = true ]; then perl -npe 's@RO:0002488@RO:0002496@;s@RO:0002492@RO:0002497@' $< > $@; fi

mirror/zfa.owl: $(TMPDIR)/fixed-zfa.obo
        if [ $(MIR) = true ] && [ $(IMP) = true ]; then $(OWLTOOLS_NO_CAT) $< -o -f ofn $@; fi

$(TMPDIR)/fixed-ehdaa2.obo: $(TMPDIR)/mirror-ehdaa2.obo | $(SCRIPTSDIR)/obo-grep.pl $(SCRIPTSDIR)/fix-ehdaa2-stages.pl
        if [ $(MIR) = true ] && [ $(IMP) = true ]; then $(SCRIPTSDIR)/obo-grep.pl -r 'id: (EHDAA2|AEO)' $< | $(SCRIPTSDIR)/fix-ehdaa2-stages.pl | grep -v ^alt_id > $@; fi

Pointless targets

Consider the following rule:

# TODO: do we need this intermediate step? Used for subsets
tmp/merged.owl: ext.owl
        $(OWLTOOLS) $< --merge-import-closure --set-ontology-id -v $(RELEASE)/$@ $(URIBASE)/uberon/$@ -o $(TMPDIR)/$@ &&\
        $(ROBOT) merge -i $(TMPDIR)/$@ --collapse-import-closure false annotate --ontology-iri $(ONTBASE)/$@ $(ANNOTATE_ONTOLOGY_VERSION) -o $@

First, it is again using OWLTOOLS where ROBOT would do just fine. But most importantly, the only point of that rule is to merge the imports. OK, this may be “used for subsets”, except that… at this point in the pipeline the imports have already been merged! They have been merged since the early steps of the pipeline. tmp/merged.owl is actually the same thing as ext.owl (modulo the differences caused by serialisation, e.g. different IDs for blank nodes). This entire step is useless: two invocations of OWLTOOLS/ROBOT, two loads of the same ontology… for nothing.

Bottom line

Without re-writing the Makefile scratch, I think that, step-by-step, we can considerably improvide its readibility and therefore its maintainability, all the while making the pipeline faster (by removing needless steps), without introducing any changes in the pipeline’s outputs.

The text was updated successfully, but these errors were encountered:

gouttegd · 2023-07-13T18:16:43Z

@matentzn @anitacaron Any thoughts or objections before I start tackling this?

matentzn · 2023-07-15T15:41:51Z

Purging commented out code/comments

Its all in version control, follow your instincts on what to delete! I will support it as a principle, and may object to specific things being removed if I see them in a PR.

Obsolete mirroring and import rules

I did a refactoring at some point, but yes, you can remove if you know for a fact they are redundant. However, i know that the ones called "local-%" are not 100% redundant, but I never had the time to investigate in what way.

Generally agreed

Travis rules

Misnowner, these are just tests that should be run on the normal gitub action QC CI. If the all tests that are labelled as travis_test are called by test as well, then yes, delete all mentions of travis.

Use of OWLTOOLS when a ROBOT equivalent already exists

Hard yes. ready for burn down.

Remove pointless targets

I would trust your judgement.

Merging rules that no longer need to be kept separate

I will trust you and support you on this!

Potentially obsolete workarounds

Lots of work to investigate, would have that low in list of priorities, but yes.

In summary, all your suggestions are sane and make sense! As for the evalution, I would like to make sure that three files are kept an eye on during major refactorings:

uberon.obo
uberon-base.owl
subsets/human-view.owl

As long as all changes to these three files are under control, I am happy!

gouttegd · 2023-07-15T16:14:52Z

Travis rules

Misnowner, these are just tests that should be run on the normal gitub action QC CI.

I’m talking about those rules:

# This OWLTools call is designed for running in travis; does not clog stdout
ELKRUN= $(OWLTOOLS) $< $(QELK) --run-reasoner -r elk -u > $@.tmp || (tail -1000 $@.tmp && exit -1) && (tail -1000 $@.tmp && mv $@.tmp $@)

travis_test: $(TMPDIR)/is_ok

ttest-uberon: uberon.owl $(REPORTDIR)/bfo-check.txt
	$(ELKRUN)

ttest-ext: ext.owl
	$(ELKRUN)

ttest-tax: $(TMPDIR)/uberon-edit-plus-tax-equivs.owl
	$(ELKRUN)

They are not called from anywhere else in the Makefile and they are not explicitly called from the CI workflow. As far as I can tell, it’s dead code.

Use of OWLTOOLS when a ROBOT equivalent already exists

Hard yes. ready for burn down.

Filling my flame thrower right now.

Lots of work to investigate, would have that low in list of priorities

Yes, checking which workarounds are still needed and which can be jettisoned could prove to be quite a bit of work, I agree. This would clearly come after easier bits such as removing comments.

(Overall I’ve listed the subtasks more or less in the order in which I plan to tackle them.)

matentzn · 2023-07-16T15:04:55Z

IMO, all this travis stuff can go!

gouttegd · 2023-08-07T14:28:54Z

Related issues that can be fixed as part of the Makefile cleanup:

gouttegd self-assigned this Jul 13, 2023

gouttegd added the tech label Jul 13, 2023

gouttegd mentioned this issue Aug 5, 2023

BUG: some strange CL classes in last release of composite-metazoan.obo (v2023-01-09) #2782

Closed

gouttegd mentioned this issue Aug 18, 2023

Cleaning up makefile #3028

Merged

gouttegd closed this as completed in #3028 Aug 23, 2023

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Cleaning up the custom Makefile #2973

Cleaning up the custom Makefile #2973

gouttegd commented Jul 13, 2023

gouttegd commented Jul 13, 2023

matentzn commented Jul 15, 2023

gouttegd commented Jul 15, 2023 •

edited

Loading

matentzn commented Jul 16, 2023

gouttegd commented Aug 7, 2023

Cleaning up the custom Makefile #2973

Cleaning up the custom Makefile #2973

Comments

gouttegd commented Jul 13, 2023

Purging commented out code

Purging obsolete comments

Obsolete mirroring and import rules

Obsolete rules related to Travis

Use of OWLTOOLS when a ROBOT equivalent already exists

Merging rules that no longer need to be kept separate

Potentially obsolete workarounds

Pointless targets

Bottom line

gouttegd commented Jul 13, 2023

matentzn commented Jul 15, 2023

gouttegd commented Jul 15, 2023 • edited Loading

matentzn commented Jul 16, 2023

gouttegd commented Aug 7, 2023

gouttegd commented Jul 15, 2023 •

edited

Loading