#250 - Convenience for setting the document language
- Update documentation
reckart committed Feb 4, 2024
1 parent 8260333 commit f604231
Showing 2 changed files with 42 additions and 9 deletions.
45 changes: 38 additions & 7 deletions README.rst
@@ -72,8 +72,10 @@ Usage

Example CAS XMI and type system files can be found under :code:`tests\test_files`.

-Loading a CAS
-~~~~~~~~~~~~~
+.. _reading_a_cas_file:
+
+Reading a CAS file
+~~~~~~~~~~~~~~~~~~

**From XMI:** A CAS can be deserialized from the UIMA CAS XMI (XML 1.0) format either
by reading from a file or string using :code:`load_cas_from_xmi`.
@@ -98,8 +100,10 @@ Most UIMA JSON CAS files come with an embedded typesystem, so it is not necessar
     with open('cas.json', 'rb') as f:
         cas = load_cas_from_json(f)

-Writing a CAS
-~~~~~~~~~~~~~
+.. _writing_a_cas_file:
+
+Writing a CAS file
+~~~~~~~~~~~~~~~~~~

**To XMI:** A CAS can be serialized to XMI with :code:`cas.to_xmi()`, which either
writes to a file or returns the result as a string.
@@ -126,6 +130,30 @@ returned as a string using :code:`cas.to_xmi()`.
     # Written to file
     cas.to_json("my_cas.json")

+.. _creating_a_cas:
+
+Creating a CAS
+~~~~~~~~~~~~~~
+
+A CAS (Common Analysis System) object typically represents a (text) document. When using cassis,
+you will most often :ref:`read <reading_a_cas_file>` existing CAS files, modify them, and then
+:ref:`write <writing_a_cas_file>` them out again. But you can also create CAS objects from scratch,
+e.g. if you want to convert some data into a CAS object in order to create a pre-annotated text.
+If you do not have a pre-defined typesystem to work with, you will have to :ref:`define one <creating_a_typesystem>`.
+
+.. code:: python
+
+    typesystem = TypeSystem()
+    cas = Cas(
+        sofa_string = "Joe waited for the train . The train was late .",
+        document_language = "en",
+        typesystem = typesystem)
+
+    print(cas.sofa_string)
+    print(cas.sofa_mime)
+    print(cas.document_language)
+
Adding annotations
~~~~~~~~~~~~~~~~~~

@@ -237,6 +265,8 @@ The same goes for setting:
     assert lst["tail.tail.head"] == "newer_baz"

+.. _creating_a_typesystem:
+
Creating types and adding features
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

@@ -269,12 +299,13 @@ properties of the Sofa can be read and written:

 .. code:: python

-    cas = Cas()
-    cas.sofa_string = "Joe waited for the train . The train was late ."
-    cas.sofa_mime = "text/plain"
+    cas = Cas(
+        sofa_string = "Joe waited for the train . The train was late .",
+        document_language = "en")

     print(cas.sofa_string)
     print(cas.sofa_mime)
+    print(cas.document_language)

Array support
~~~~~~~~~~~~~
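A note on the README change above: the hunks replace property-by-property setup with constructor keyword arguments. The pattern can be sketched without cassis installed. `ToyCas` below is a hypothetical stand-in written for this note, not the cassis `Cas` class, and it only mirrors the three attribute names that appear in the diff.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ToyCas:
    """Hypothetical stand-in for illustration only; NOT the cassis Cas class."""

    sofa_string: Optional[str] = None
    sofa_mime: Optional[str] = None
    document_language: Optional[str] = None


text = "Joe waited for the train . The train was late ."

# Old style: create an empty object, then set each property.
old_style = ToyCas()
old_style.sofa_string = text
old_style.document_language = "en"

# New style: pass the same values as keyword arguments.
new_style = ToyCas(sofa_string=text, document_language="en")

# Both routes end in the same state.
print(old_style == new_style)
```

Under this assumption the two styles are interchangeable, so the constructor form is purely a convenience.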
6 changes: 4 additions & 2 deletions tests/test_documentation.py
@@ -10,5 +10,7 @@ def test_readme_is_proper_rst():
     with path_to_readme.open() as f:
         rst = f.read()

-    errors = list(rstcheck.check(rst))
-    assert len(errors) == 0, "; ".join(str(e) for e in errors)
+    errors = [str(e) for e in list(rstcheck.check(rst))]
+    # https://github.com/rstcheck/rstcheck-core/issues/4
+    errors = [s for s in errors if "Hyperlink target" not in s]
+    assert len(errors) == 0, "; ".join(errors)
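
The filter added to the test can be tried in isolation. This is a minimal sketch, assuming the checker yields objects whose `str()` contains the message text. The sample messages are made up for illustration and are not real rstcheck output, and `drop_ignored` is a hypothetical helper name.

```python
def drop_ignored(errors, ignored_fragment="Hyperlink target"):
    """Stringify each error and drop those containing the ignored fragment."""
    messages = [str(e) for e in errors]
    return [s for s in messages if ignored_fragment not in s]


# Made-up messages, only to exercise the filter.
sample_errors = [
    'Hyperlink target "creating_a_cas" is not referenced.',
    "Unexpected indentation.",
]

remaining = drop_ignored(sample_errors)

# Same assertion shape as in the test: fail with all remaining messages joined.
assert len(remaining) == 1, "; ".join(remaining)
```

The substring match is deliberately broad: any message mentioning a hyperlink target is ignored, so a genuine problem that uses the same wording would be hidden as well.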