Commit
Update readme & pre-commit; remove vocabularies dir
dalito committed Jul 27, 2023
1 parent 93ed9dd commit 31eef45
Showing 7 changed files with 16 additions and 1,551 deletions.
2 changes: 1 addition & 1 deletion .github/workflows/ci.yml
@@ -75,7 +75,7 @@ jobs:
run: |
python -m coverage combine
python -m coverage html --skip-empty --skip-covered
# By merging files from vocexcel coverage dropped from 100% to 93%.
# By merging files from vocexcel in #119 coverage dropped from 100% to 93%.
python -m coverage report --fail-under=95
- name: Upload HTML report
2 changes: 1 addition & 1 deletion .pre-commit-config.yaml
@@ -14,7 +14,7 @@ repos:
- id: black
args: ['--target-version', 'py38']
- repo: https://github.com/astral-sh/ruff-pre-commit
rev: v0.0.278
rev: v0.0.280
hooks:
- id: ruff
args: [--fix, --exit-non-zero-on-fix]
25 changes: 12 additions & 13 deletions README.md
@@ -25,7 +25,7 @@ To support what is not provided by the original VocExcel project we have develop
- Generating documentation (with [pyLODE](https://github.com/RDFLib/pyLODE) or [ontospy](http://lambdamusic.github.io/Ontospy/))
- Support for expressing concept-hierarchies by indentation.

Starting with v0.5.0 (July 2023) voc4cat uses its own internal vocexcel command line tool ([PR #119](https://github.com/nfdi4cat/voc4cat-tool/pull/119)).
Starting with v0.5.0 (July 2023) voc4cat uses its internal converter and does no longer depend on vocexcel ([PR #119](https://github.com/nfdi4cat/voc4cat-tool/pull/119)).

## Installation

@@ -78,43 +78,42 @@ You may first use simple temporary IRIs like (`ex:my_term`).

With voc4cat you can later replace all IDs belonging to a given prefix (here `ex`) by numeric IDs e.g. starting from 1001:

`voc4cat --make-ids ex 1001 --output-directory output example/concept_hierarchy_043_4Cat.xlsx`
`voc4cat --make-ids temp 1001 --output-directory output example/photocatalysis_example_prelim-IDs.xlsx`

This will update all IRIs matching the `ex:`-prefix in the sheets "Concepts", "Additional Concept Features" and "Collections".
This will update all IRIs matching the `temp:`-prefix in the sheets "Concepts", "Additional Concept Features" and "Collections".

Manually filling the Children URI (in sheet "Concepts") and Members URI (in sheet "Collections") with lists of IRIs can be tedious.
An easier way to express hierarchies between concepts is to use indentation.
voc4Cat understands Excel-indentation (the default) for this purpose but can also work with other indentation formats (e.g. by 3 spaces per level).
voc4cat supports converting between indentation-based hierarchy and Children-URI hierarchy (both directions). For example, use

`voc4cat --hierarchy-from-indent --output-directory output example/indent_043_4Cat.xlsx`
`voc4cat --hierarchy-from-indent --output-directory output example/photocatalysis_example_indented_prelim-IDs.xlsx`

or if you were using 3 spaces per level
or if you were using 3 spaces per level (this file does not exist)

`voc4cat --hierarchy-from-indent --indent-separator " " --output-directory output example/indent_3spaces_043_4Cat.xlsx`
`voc4cat --hierarchy-from-indent --indent-separator " " --output-directory output example/photocatalysis_example_prelim-IDs_3spaces_indent.xlsx`

to convert to ChildrenURI-hierarchy. For ChildrenURI-hierarchy to Excel-indentation, use
to convert to ChildrenURI-hierarchy. To create such a file you can convert from ChildrenURI-hierarchy to indentation by

`voc4cat --hierarchy-to-indent --output-directory output example/concept_hierarchy_043_4Cat.xlsx`
`voc4cat --hierarchy-to-indent --indent-separator " " --output-directory output example/photocatalysis_example_prelim-IDs.xlsx`

Finally, the vocabulary file can be converted to turtle format. In this case the wrapper script forwards the job to VocExcel:

`voc4cat vocabulary.xlsx`
`voc4cat example/photocatalysis_example.xlsx`

A turtle file `vocabulary.ttl` is created in the same directory where the xlsx-file is located.
A turtle file `photocatalysis_example.ttl` is created in the same directory where the xlsx-file is located.

The reverse is also possible. You can create an xlsx file from a turtle vocabulary file. Optionally a custom XLSX-template-file can be specified for this conversion:

`voc4cat --template template/VocExcel-template_043_4Cat.xlsx vocabulary.ttl`

Options that are specific for VocExcel can be put at the end of a `voc4cat` command.
Here is an example that forwards the `-e 3` and `-m 3` options to VocExcel and moreover demonstrates a complex combination of options (as used in CI):
Here is an example that forwards the `-e 3` and `-m 3` options to VocExcel and moreover demonstrates a complex combination of options:

`voc4cat --check --forward --docs pylode --output-directory outbox inbox-excel-vocabs/ -e 3 -m 3`

Besides `voc4cat` this project also installs its own version of the `vocexcel` command line tool (for historic reasons). To get help on how to use it type
Besides `voc4cat` this project also installs its own version of the `vocexcel` command line tool (for historic reasons). Its use is deprecated in version 0.5.0 and it will be removed in version 0.6.0.

`vocexcel --help` (or simply `vocexcel`)

## Feedback and code contributions

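The README hunk above describes expressing concept hierarchies by indentation (for example 3 spaces per level) and converting them to a Children-URI hierarchy. The following is a minimal sketch of that idea only, not voc4cat's actual implementation: the function name `children_from_indent`, the plain-text input, and the concept labels are all hypothetical, and real voc4cat works on xlsx sheets rather than lists of strings.

```python
from collections import defaultdict


def children_from_indent(lines, sep="   "):
    """Map each concept to its direct children, based on indentation depth.

    Hypothetical helper for illustration; `sep` is one indentation level.
    """
    if not sep:
        raise ValueError("sep must be a non-empty string")
    children = defaultdict(list)
    stack = []  # stack[i] is the most recent concept seen at level i
    for line in lines:
        # Count how many times the separator prefixes the line.
        level = 0
        while line.startswith(sep):
            line = line[len(sep):]
            level += 1
        name = line.strip()
        if not name:
            continue  # skip blank rows
        if level > len(stack):
            raise ValueError(f"Indentation jumps more than one level at {name!r}")
        del stack[level:]  # drop deeper levels left over from earlier rows
        if stack:
            children[stack[-1]].append(name)  # parent is one level up
        stack.append(name)
    return dict(children)


# Made-up labels, indented by 3 spaces per level.
rows = ["energy", "   solar energy", "      photocatalysis", "   wind energy"]
hierarchy = children_from_indent(rows)
print(hierarchy)
```

Keeping a stack of the most recent concept per level means each row only needs to look one level up to find its parent, so the input is processed in a single pass.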
2 changes: 1 addition & 1 deletion src/voc4cat/wrapper.py
@@ -284,7 +284,7 @@ def hierarchy_to_indent(fpath, outfile, sep):
if iri in row_by_iri: # merge needed
# compare fields, merge if one is empty, error if different values
new_data = [row[col_no] for col_no in range(6, col_last)]
old_data = row_by_iri[iri][list(row_by_iri[iri].keys())[0]][5:]
old_data = row_by_iri[iri][next(iter(row_by_iri[iri].keys()))][5:]
merged = []
for old, new in zip(old_data, new_data):
if (old and new) and (old != new):
2 changes: 1 addition & 1 deletion tests/test_checks.py
@@ -250,7 +250,7 @@ def test_check_for_removed_iris( # noqa: PLR0913
# Prepare data with removed concept
g = Graph()
g.parse(original, format="turtle")
one_uri = list(g.subjects(RDF.type, skos_el))[0]
one_uri = next(iter(g.subjects(RDF.type, skos_el)))
g.remove((one_uri, None, None))
reduced = tmp_path / (str(original.stem) + "_reduced.turtle")
g.serialize(destination=reduced, format="turtle")
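Both Python hunks above replace `list(x)[0]` with `next(iter(x))`. A small sketch of why the two forms return the same first element while doing different amounts of work is given below. The `tracked` generator and the sample data are assumptions made up for illustration; they stand in for the dict keys in `wrapper.py` and the `g.subjects(...)` generator in the test.

```python
def tracked(items, seen):
    """Yield items one by one, recording each item that is actually consumed."""
    for item in items:
        seen.append(item)
        yield item


# Dict case: both forms return the first key in insertion order.
row = {"first": [1, 2], "second": [3, 4]}  # made-up data
key_old = list(row.keys())[0]     # builds a full list of keys first
key_new = next(iter(row.keys()))  # stops after the first key

# Generator case: list() consumes everything, next() consumes one item.
seen_old, seen_new = [], []
first_old = list(tracked(["s1", "s2", "s3"], seen_old))[0]
first_new = next(iter(tracked(["s1", "s2", "s3"], seen_new)))

print(key_old == key_new, first_old == first_new)
print(len(seen_old), len(seen_new))

# Behaviour differs on empty input: list(...)[0] raises IndexError,
# next(iter(...)) raises StopIteration unless a default is supplied.
empty_default = next(iter([]), None)
```

One point to keep in mind when applying this rewrite elsewhere: the exception type on empty input changes from `IndexError` to `StopIteration`, so code that catches the former would need adjusting.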