Translation Alignments: Port translation alignment endpoints from Flask app to ATLAS #4

Merged
merged 37 commits into from
Apr 3, 2020

jacobwegner
Collaborator

@jacobwegner jacobwegner commented Jul 5, 2019

Models translation alignments in ATLAS and provides hooks to import alignment data and query TextAlignmentChunk data by reference.

Each TextAlignmentChunk has both start and end Line FKs as well as a contains property that can be used to retrieve all TextPart instances within the range.

importers.alignments imports the data for the Homeric Iliad and Odyssey, aligned at the sentence level to the A. T. Murray translation.

On each TextAlignmentChunk instance, the actual alignment data is stored in an items JSON field with the following format:

[
  [
    // the Greek line reference, e.g. 1.1-1.7 or 1.8
    "<bookPosition>.<linePosition>",
    // the Greek text content
    "<content>",
    // a valid value enumerated by the `LINE_KIND_<foo>` constants
    // that indicates if the line is a new line, is continued by
    // the next line, or is a continuation of the previous line
    "<continuation-data>"
  ],
  [
    // the English line reference, e.g. 1.1-1.7 or 1.8
    "<bookPosition>.<linePosition>",
    // the English text content
    "<content>",
    // the value enumerated as LINE_KIND_UNKNOWN (since the
    // Greek lines are being aligned to the English sentences)
    null
  ]
]
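
A minimal Python sketch of how a client might unpack one items value, reading the format above literally as a pair of three-element entries (Greek first, English second). `AlignedLine`, `unpack_items` and the sample values are hypothetical names and placeholders for illustration; they are not part of ATLAS.

```python
from typing import NamedTuple, Optional


class AlignedLine(NamedTuple):
    """One side of an alignment entry (hypothetical helper type)."""
    reference: str        # "<bookPosition>.<linePosition>", or a range such as "1.1-1.7"
    content: str          # the text content
    kind: Optional[str]   # a LINE_KIND_<foo> value; None on the English side


def unpack_items(items):
    """Split an `items` value into (greek, english) AlignedLine tuples."""
    greek, english = items
    return AlignedLine(*greek), AlignedLine(*english)


# Placeholder data in the documented shape, not real alignment content.
sample_items = [
    ["1.1-1.7", "<greek content>", "<continuation-data>"],
    ["1.1-1.7", "<english content>", None],
]

greek, english = unpack_items(sample_items)
print(greek.reference, english.kind)
```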

@jacobwegner
Collaborator Author

This is being ported from the examples provided at:

scaife-viewer/readhomer#25

and implemented within:

https://github.com/scaife-viewer/readhomer/blob/d9f8b5cc36af2c7fa99f2d703aa3d9a46b99317a/src/homer/api.js#L21

@jacobwegner
Collaborator Author

I've pushed, deployed and loaded all alignment data. Still working out the best way to support a "reference" based query for alignment chunks, as was done in scaife-viewer/readhomer#25

@jtauber
Member

jtauber commented Jul 9, 2019

Still working out the best way to support a "reference" based query for alignment chunks

I think this is an example of a more general pattern where you want to retrieve As but you want to specify the range of As in terms of some other referencing scheme based on Bs.

Other examples might include "give me the sentences in John 3.1–3.11" or "give me the Beowulf fitts for lines 1000–1500".

I think the basic algorithm is: (1) work out the (first) fitt containing line 1000; (2) work out the (last) fitt containing line 1500; (3) use those two fitts as the "new" range returned. We might want to annotate the results with the fact that it was actually lines 1000 to 1500 that were requested (and that the two chunking systems aren't perfectly aligned).

Obviously that means we need to index to support (1) and (2).
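
The three steps could be sketched roughly as follows, assuming the chunks are sorted, non-overlapping, and stored as inclusive (start_line, end_line) pairs, so the sorted list of start lines serves as the index for (1) and (2). The function names and the fitt boundaries are invented for illustration; they are not real Beowulf fitt boundaries.

```python
from bisect import bisect_right


def chunk_containing(chunks, starts, line):
    """Return the index of the chunk whose inclusive range contains `line`."""
    i = bisect_right(starts, line) - 1
    if i < 0 or line > chunks[i][1]:
        raise LookupError(f"no chunk contains line {line}")
    return i


def resolve_range(chunks, requested_start, requested_end):
    """Map a requested line range onto whole chunks and record both ranges."""
    starts = [start for start, _ in chunks]                  # the "index"
    first = chunk_containing(chunks, starts, requested_start)  # step (1)
    last = chunk_containing(chunks, starts, requested_end)     # step (2)
    return {                                                   # step (3)
        "requested": (requested_start, requested_end),
        "resolved": (chunks[first][0], chunks[last][1]),
        "chunks": chunks[first:last + 1],
    }


# Invented boundaries, purely for illustration.
fitts = [(1, 400), (401, 990), (991, 1250), (1251, 1472), (1473, 1650)]
result = resolve_range(fitts, 1000, 1500)
print(result["requested"], result["resolved"])
```

Returning both "requested" and "resolved" is one way to carry the annotation mentioned above; if chunks could overlap, "first" and "last" would need a different lookup than a single bisect.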

Anyway, IMO all this applies to translation units containing other referencing schemes too.

@jacobwegner
Collaborator Author

jacobwegner commented Jul 9, 2019

@jtauber I agree; my original comment could have been clearer: I was trying to figure out how to support a reference-based query in GraphQL / graphene-django.

I'd already set up a contains relation and have begun implementing a custom filter in 42cecdc:

Sample query

{
  alignmentChunks(version_Urn: "urn:cts:greekLit:tlg0012.tlg001.perseus-grc2", reference: "1.1-1.6",) {
    edges {
      cursor
      node {
        id
        citation
        items
      }
    }
  }
}
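
For client code, the same query could be sent with variables instead of inlined literals. This is only a sketch: the `String!` variable types are assumptions about the schema, `build_payload` and `nodes_from_response` are hypothetical helpers, and the response below is made up to match the shape of the selection set. No endpoint is named because none is given here.

```python
import json

# Variable types (String!) are assumptions; check the schema before relying on them.
QUERY = """
query AlignmentChunks($urn: String!, $reference: String!) {
  alignmentChunks(version_Urn: $urn, reference: $reference) {
    edges {
      cursor
      node {
        id
        citation
        items
      }
    }
  }
}
"""


def build_payload(urn, reference):
    """Return the JSON body for a GraphQL-over-HTTP POST request."""
    return json.dumps(
        {"query": QUERY, "variables": {"urn": urn, "reference": reference}}
    )


def nodes_from_response(response):
    """Flatten a Relay-style connection response into a list of nodes."""
    return [edge["node"] for edge in response["data"]["alignmentChunks"]["edges"]]


payload = build_payload("urn:cts:greekLit:tlg0012.tlg001.perseus-grc2", "1.1-1.6")

# Made-up response with the same shape as the selection set above.
fake_response = {"data": {"alignmentChunks": {"edges": [
    {"cursor": "c1", "node": {"id": "n1", "citation": "1.1-1.7", "items": []}},
]}}}
nodes = nodes_from_response(fake_response)
print(nodes[0]["citation"])
```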

Will still want to work out the "annotation" approach for GraphQL as well. The Flask-based API had a metadata entry for reference_resolved:

https://readhomer-dev-api.herokuapp.com/urn:cts:greekLit:tlg0012.tlg001.perseus-grc2/alignment/eng/1.1-1.6/

@jacobwegner jacobwegner changed the base branch from feature/importers to master July 16, 2019 18:38
fixes an issue where we had to define filter_fields even when an explicit filterset_class
was in use
# Conflicts:
#	readhomer_atlas/library/admin.py
#	readhomer_atlas/library/schema.py
#	requirements.txt
@jacobwegner
Collaborator Author

I've updated this branch based on some things we've worked on with KITAB's ATLAS server.

LineReferenceFilterMixin is used to implement a "range" query (rather than relying on a contains relation).
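
The mixin's code isn't shown in this thread, so the following is not its implementation. It is a plain-Python sketch of one common way to express a "range" query: parse the reference into (book, line) positions and keep every chunk whose own start/end range overlaps the requested one. `parse_reference`, `chunks_in_range`, the dict layout and the sample citations are all hypothetical.

```python
def parse_reference(reference):
    """Parse '1.1-1.6' or '1.8' into a pair of (book, line) tuples."""
    start, _, end = reference.partition("-")

    def to_position(ref):
        book, line = ref.split(".")
        return int(book), int(line)

    # A single reference such as '1.8' is treated as a one-line range.
    return to_position(start), to_position(end or start)


def chunks_in_range(chunks, reference):
    """Keep chunks whose inclusive [start, end] overlaps the requested range."""
    requested_start, requested_end = parse_reference(reference)
    return [
        chunk for chunk in chunks
        if chunk["start"] <= requested_end and chunk["end"] >= requested_start
    ]


# Illustrative chunks only; the citations are not taken from the real data.
chunks = [
    {"citation": "1.1-1.7", "start": (1, 1), "end": (1, 7)},
    {"citation": "1.8-1.10", "start": (1, 8), "end": (1, 10)},
    {"citation": "1.11-1.16", "start": (1, 11), "end": (1, 16)},
]

matches = chunks_in_range(chunks, "1.1-1.6")
print([chunk["citation"] for chunk in matches])
```

In a Django queryset the same overlap test would presumably become a pair of lte/gte lookups on whatever positions the start and end Line FKs expose, which avoids materialising a per-chunk "contains" relation.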

@jacobwegner jacobwegner force-pushed the feature/translation-alignments branch from 6469882 to 6ee0bd2 Compare November 14, 2019 23:09
@jacobwegner jacobwegner changed the title WIP: Port translation alignment endpoints from Flask app Port translation alignment endpoints from Flask app Nov 15, 2019
@jacobwegner jacobwegner requested a review from jtauber November 15, 2019 16:23
# Conflicts:
#	README.md
#	readhomer_atlas/library/admin.py
#	readhomer_atlas/library/importers/versions.py
#	readhomer_atlas/library/migrations/0001_initial.py
#	readhomer_atlas/library/models.py
#	readhomer_atlas/library/schema.py
@jacobwegner jacobwegner had a problem deploying to explorehomer-feature-tr-vlxhmh March 16, 2020 16:30 Failure
@jacobwegner jacobwegner temporarily deployed to explorehomer-feature-tr-vlxhmh March 19, 2020 16:40 Inactive
@jacobwegner jacobwegner changed the base branch from master to develop March 19, 2020 16:40
@jacobwegner jacobwegner temporarily deployed to explorehomer-feature-tr-vlxhmh March 19, 2020 16:44 Inactive
@jacobwegner jacobwegner temporarily deployed to explorehomer-feature-tr-vlxhmh March 19, 2020 17:05 Inactive
@jacobwegner jacobwegner temporarily deployed to explorehomer-ta-stable March 19, 2020 17:24 Inactive
@jacobwegner jacobwegner temporarily deployed to explorehomer-ta-stable March 19, 2020 17:26 Inactive
@jacobwegner jacobwegner temporarily deployed to explorehomer-feature-tr-vlxhmh March 19, 2020 17:34 Inactive
@jacobwegner jacobwegner temporarily deployed to explorehomer-feature-tr-vlxhmh March 19, 2020 17:44 Inactive
we had a "contains" relation ("lines") that isn't performant:

345b87e

took ingestion from ~7s to ~45s

this allows us to do contains queries (and would likely allow further metadata
such as "resolved")
@jacobwegner jacobwegner temporarily deployed to explorehomer-feature-tr-vlxhmh March 23, 2020 18:04 Inactive
@jacobwegner jacobwegner temporarily deployed to explorehomer-feature-tr-vlxhmh April 3, 2020 20:12 Inactive
@jacobwegner jacobwegner changed the title Port translation alignment endpoints from Flask app Translation Alignments: Port translation alignment endpoints from Flask app to ATLAS Apr 3, 2020
@jacobwegner jacobwegner merged commit 040e7d2 into develop Apr 3, 2020
@jacobwegner jacobwegner deleted the feature/translation-alignments branch April 24, 2020 18:51