Skip to content
Yevgeny Kazakov edited this page Jan 19, 2018 · 4 revisions

This project contains programs and scripts for performing experiments on axiom-pinpointing algorithms from PULi and other related tools. It also contains utilities for collecting statistics and processing the experiment results.

Compilation

From the root directory of the project run:

$ mvn clean install

Running the experiments

All experiments can be executed using main java files (e.g., when importing the project from eclipse using Import > Maven > Existing Maven Projects and creating an appropriate Java application Run configuration), or from the command line using the shell scripts located in the bin directory.

The experiments evaluate axiom-pinpointing algorithms that extract justifications from proofs. In particular, the algorithms find all justifications of why some conclusion is entailed by some ontology and they use a complete proof of that conclusion to do this. So an input for such an algorithm is an ontology, a conclusion and a proof. When performing one experiment, the tested algorithm, the ontology and the type of proofs is fixed. The experiment runs the algorithm on each conclusion from some collection. In this context we say that one conclusion is a query and a collection of conclusions is specified by a query file.

Query Files

A query file is a simple text file where each line is a query. One such file should contain queries that are conclusions entailed by one ontology. The current implementation of the experiments recognizes only queries that are SubClassOf axioms (subsumptions). One line in a query file contains full IRI of the subclass and the superclass from such axiom separated by a space.

Query files can be generated using the main class org.liveontologies.pinpointing.ExtractSubsumptions or the script extract_subsumptions.sh. It collects subsumptions between atomic classes entailed by the provided ontology. Optionally, the subsumptions may be limited to direct ones and told subsumptions may be filtered out. Invocation:

$ ./bin/extract_subsumptions.sh ${options} ${ontology} ${output}

The important command-line arguments are:

  • -h: prints the help message and exit
  • ${ontology}: the ontology in functional style OWL that entails the collected subsumptions
  • ${output}: name of the query file that is generated

Example

The following command collects all subsumptions between atomic classes entailed by the test ontology in src/test/resources/test_input/simple/ExistCycle.owl and outputs them into the query file ExistCycle.queries:

$ ./bin/extract_subsumptions.sh src/test/resources/test_input/simple/ExistCycle.owl ExistCycle.queries

When successful, it creates an ExistCycle.queries file with the content:

http://example.org/B http://example.org/A
http://example.org/C http://example.org/A
http://example.org/D http://example.org/A
http://example.org/A http://example.org/B
http://example.org/C http://example.org/B
http://example.org/D http://example.org/B
http://example.org/A http://example.org/C
http://example.org/B http://example.org/C
http://example.org/D http://example.org/C
http://example.org/E http://example.org/F
http://example.org/G http://example.org/F
http://example.org/E http://example.org/G

The following command collects only direct subsumptions:

$ ./bin/extract_subsumptions.sh --direct src/test/resources/test_input/simple/ExistCycle.owl ExistCycle.direct.queries

So the file ExistCycle.direct.queries will be the same as above minus but without the indirect subsumption:

http://example.org/E http://example.org/F.

Experiments Implemented in Java

The Java main class org.liveontologies.pinpointing.RunJustificationExperiments runs an axiom-pinpointing algorithms that implement the interface org.liveontologies.puli.pinpointing.MinimalSubsetsFromProofs from PULi. The corresponding shell script can be used as follows:

$ ./bin/run_justification_experiments.sh ${options} ${record} ${queries} ${experiment} -- ${experiment_arguments}

The important options and arguments are:

  • -h: prints the help message and exit
  • -t T: local timeout per query in milliseconds
  • -g G: global timeout for the whole experiment in milliseconds
  • ${record}: the file into which the experiment results will be written
  • ${queries}: the query file
  • ${experiment}: the name of the class that implements the experiment
  • ${experiment_arguments}: the arguments for the experiment

Experiment classes

The particular experiments are implementations of interface org.liveontologies.pinpointing.experiments.JustificationExperiment. When one of these classes is specified as ${experiment} in the command line arguments, it is instantiated by reflexion and the arguments ${experiment_arguments} that follow -- are passed to it.

Current implementations of org.liveontologies.pinpointing.experiments.JustificationExperiment:

  • org.liveontologies.pinpointing.experiments.ElkFactoryJustificationExperiment
  • org.liveontologies.pinpointing.experiments.ElkResolutionJustificationExperiment
  • org.liveontologies.pinpointing.experiments.OwlFactoryJustificationExperiment
  • org.liveontologies.pinpointing.experiments.OwlResolutionJustificationExperiment
  • org.liveontologies.pinpointing.experiments.SatFactoryJustificationExperiment
  • org.liveontologies.pinpointing.experiments.SatResolutionJustificationExperiment

The first word in the class names specifies what kind of proofs are used for the experiment and the second word specifies what kind of axiom-pinpointing algorithm is used. Experiments with names matching the following patterns have the following properties:

Elk*

Uses internal proofs of ELK. Experiment arguments contain:

  • ${ontology}: ontology file in functional style OWL

This ontology must be the same as the one from which the query file specified by the argument ${queries} was generated. The ontology is classified before the experiments are run. Tracing of each query is done during the experiment run on that query.

Owl*

Uses external OWL API proofs of ELK. Experiment arguments contain:

  • ${ontology}: ontology file in a format recognized by OWL API

This ontology must be the same as the one from which the query file specified by the argument ${queries} was generated. The ontology is classified before the experiments are run. Tracing of each query is done during the experiment run on that query.

Sat*

Uses proofs encoded in SAT as described here. Experiment arguments contain:

  • ${input_dir}: directory with the encoding

The SAT encoding must be generated using the same query file as specified by the argument ${queries} was generated. The experiment does not need to perform any ontology classification nor tracing, because they were performed during encoding to SAT. Encoded proof of a query is loaded before the experiment run on that query.

*Factory*

The axiom-pinpointing algorithm is specified by the class that implements it. The class must have public static method getFactory() that returns a org.liveontologies.puli.pinpointing.MinimalSubsetsFromProofs.Factory that instantiates the class. Experiment arguments contain:

  • ${class}: the name of the class of the axiom-pinpointing algorithm

Some of the axiom-pinpointing algorithm classes are:

  • org.liveontologies.pinpointing.BottomUpJustificationComputation
  • org.liveontologies.pinpointing.TopDownJustificationComputation
  • org.liveontologies.puli.pinpointing.ResolutionJustificationComputation (only default selection type can be used with this experiment)

*Resolution*

The axiom-pinpointing algorithm is org.liveontologies.puli.pinpointing.ResolutionJustificationComputation. Experiment arguments contain:

  • ${selection}: the selection type used by the resolution algorithm (one of TOP_DOWN, BOTTOM_UP, THRESHOLD)

Example

The following command

$ ./bin/run_justification_experiments.sh \
 -t 60000 record.csv ExistCycle.queries org.liveontologies.pinpointing.experiments.ElkResolutionJustificationExperiment \
 -- src/test/resources/test_input/simple/ExistCycle.owl THRESHOLD

runs resolution axiom-pinpointing algorithm on internal proofs from ELK. This is specified by the experiment class org.liveontologies.pinpointing.experiments.ElkResolutionJustificationExperiment. The selection type for the resolution algorithm is specified by THRESHOLD. The specified query file ExistCycle.queries was generated from the specified ontology src/test/resources/test_input/simple/ExistCycle.owl (see previous example). The algorithm will be run on each query from ExistCycle.queries and time out for each of these runs will be 1 minute as specified by -t 60000. The results will be written into record.csv.

Experiments implemented in scripts

TODO

Results

The results of an experiment are written into file specified as ${record} when invoking the experiment. These are simple CSV files that contain one line per query from query file with which the experiment was run, except the first line that contains column names.

First few columns are:

  • query: The query as in the query file.
  • didTimeOut: Whether the algorithm timed out with the local timeout.
  • time: The time spent on the query in milliseconds.
  • nJust: The number of justifications found within the local timeout. The following columns contain various statistics, which depend on the particular experiment.

Example

The previous example writes the results into file record.csv, which looks something like this:

query,didTimeOut,time,nJust,usedMemory,firstQuartileJustSize,just1Time,just2Time,justHalfTime,justificationComputation.nMinimalInferences,justificationComputation.nProducedInferences,maxJustSize,meanJustSize,medianJustSize,minJustSize,obtainingInferencesTime,proofProvider.proof.elk.traceState.evictor.capacity,proofProvider.proof.elk.traceState.evictor.size,proofProvider.proof.elk.traceState.nCacheHits,proofProvider.proof.elk.traceState.nCacheMisses,thirdQuartileJustSize
"http://example.org/B http://example.org/A",FALSE,86.970762,1,20815896,2.0,"86.623158","","86.623158",21,23,2,2.0,2.0,2,56.209206,16896,7,6,1,2.0
"http://example.org/C http://example.org/A",FALSE,3.11089,1,21792064,1.0,"2.787223","","2.787223",18,21,1,1.0,1.0,1,12.073124,16896,17,6,1,1.0
"http://example.org/D http://example.org/A",FALSE,3.450787,1,23389528,1.0,"3.250585","","3.250585",22,24,1,1.0,1.0,1,7.47551,16896,29,8,1,1.0
"http://example.org/A http://example.org/B",FALSE,1.71385,1,24373864,1.0,"1.570368","","1.570368",18,21,1,1.0,1.0,1,4.674867,16896,36,6,1,1.0
"http://example.org/C http://example.org/B",FALSE,3.018365,1,25038392,2.0,"1.882771","","1.882771",21,23,2,2.0,2.0,2,3.59218,16896,36,7,0,2.0
"http://example.org/D http://example.org/B",FALSE,3.259645,1,25979328,2.0,"2.341369","","2.341369",25,27,2,2.0,2.0,2,17.647464,16896,36,9,0,2.0
"http://example.org/A http://example.org/C",FALSE,2.210724,1,26299424,2.0,"2.070114","","2.070114",21,23,2,2.0,2.0,2,3.723399,16896,36,7,0,2.0
"http://example.org/B http://example.org/C",FALSE,3.769187,1,27579376,1.0,"3.359034","","3.359034",18,21,1,1.0,1.0,1,34.496105,16896,36,7,0,1.0
"http://example.org/D http://example.org/C",FALSE,3.622303,1,28856488,3.0,"3.027331","","3.027331",28,30,3,3.0,3.0,3,27.671066,16896,36,9,0,3.0
"http://example.org/E http://example.org/F",FALSE,14.525833,4,29496392,3.5,"11.07972","12.188796","12.188796",115,121,5,4.25,4.5,3,14.335333,16896,49,34,1,5.0
"http://example.org/G http://example.org/F",FALSE,4.562595,2,29816488,2.0,"4.184983","4.406288","4.184983",60,63,4,3.0,3.0,2,10.449368,16896,58,18,1,4.0
"http://example.org/E http://example.org/G",FALSE,2.293428,1,31725040,1.0,"1.87716","","1.87716",14,15,1,1.0,1.0,1,5.993184,16896,58,5,0,1.0

Plotting the results

TODO

Creating SAT Encoding

A "direct" SAT encoding is used by a number of axiom-pinpointing tools. EL+SAT describes a format of this encoding, that is, up to small differences, used by all these tools. The java main class org.liveontologies.pinpointing.DirectSatEncodingUsingElkCsvQuery produces such encoding in a format usable by most of these tools. It essentially encodes internal proofs of ELK into CNF. The corresponding shell script can be used as follows:

$ ./bin/direct_sat_enconding_using_elk_csv_query.sh ${options} ${ontology} ${queries} ${output_dir}

The important command-line arguments are:

  • -h: prints the help message and exit
  • ${ontology}: ontology file in functional style OWL
  • ${queries}: the query file generated from the ontology
  • ${output_dir}: name of the directory that will contain the encoding after success After success, the specified output directory will contain one subdirectory for each query, each of which will contain the "direct" SAT encoding for that query. For more information see documentation of org.liveontologies.pinpointing.DirectSatEncodingUsingElkCsvQuery.

(The encoding is called direct, because inferences are encoded as implications from the premises to the conclusion.)

Example

After producing ExistCycle.queries by the first example, queries from this file can be encoded in SAT by:

$ ./bin/direct_sat_enconding_using_elk_csv_query.sh src/test/resources/test_input/simple/ExistCycle.owl ExistCycle.queries ExistCycle.elk_sat

TODO: how to encode proofs provided by EL+SAT?