Translation Alignments: Port translation alignment endpoints from Flask app to ATLAS #4

jacobwegner · 2019-07-05T19:56:32Z

Models translation alignments in ATLAS and provides hooks to import alignment data and query TextAlignmentChunk data by reference.

Each TextAlignmentChunk has both start and end Line FKs as well as a contains property that can be used to retrieve all TextPart instances within the range.

importers.alignments imports the alignment data for the Homeric Iliad and Odyssey, aligned at the sentence level to the A. T. Murray translation.

On each retrieved TextAlignmentChunk instance, the actual alignment data is stored in an items JSON field with the following format:

[
  [
    // the Greek line reference, e.g. 1.1-1.7 or 1.8
    "<bookPosition>.<linePosition>",
    // the Greek text content
    "<content>",
    // a valid value enumerated by the `LINE_KIND_<foo>` constants
    // that indicates if the line is a new line, is continued by
    // the next line, or is a continuation of the previous line
    "<continuation-data>"
  ],
  [
    // the English line reference, e.g. 1.1-1.7 or 1.8
    "<bookPosition>.<linePosition>",
    // the English text content
    "<content>",
    // the value enumerated as LINE_KIND_UNKNOWN (since the
    // Greek lines are being aligned to the English sentences)
    null
  ]
]
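As a rough illustration of consuming that shape, here is a minimal Python sketch that unpacks one items value into named records. The AlignedLine type and unpack_items helper are hypothetical names, not part of ATLAS, and the sample values are placeholders rather than real alignment data.

```python
from typing import NamedTuple, Optional


class AlignedLine(NamedTuple):
    """One side of an alignment (hypothetical helper type, not from ATLAS)."""

    reference: str               # e.g. "1.1-1.7" or "1.8"
    content: str                 # the text content for that reference
    continuation: Optional[str]  # a LINE_KIND_* value; None on the English side


def unpack_items(items):
    """Split an `items` value into (greek, english) AlignedLine records.

    Assumes exactly two entries, each a [reference, content, continuation]
    triple, as in the format described above.
    """
    greek, english = (AlignedLine(*entry) for entry in items)
    return greek, english


# Placeholder data shaped like the documented format
sample = [
    ["1.1-1.7", "<greek content>", "<continuation-data>"],
    ["1.1-1.7", "<english content>", None],
]
greek, english = unpack_items(sample)
print(greek.reference, english.continuation)
```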

jacobwegner · 2019-07-05T20:11:41Z

This is being ported from the examples provided at:

scaife-viewer/readhomer#25

and implemented within:

https://github.com/scaife-viewer/readhomer/blob/d9f8b5cc36af2c7fa99f2d703aa3d9a46b99317a/src/homer/api.js#L21

jacobwegner · 2019-07-08T22:37:17Z

I've pushed, deployed and loaded all alignment data. Still working out the best way to support a "reference" based query for alignment chunks, as was done in scaife-viewer/readhomer#25

jtauber · 2019-07-09T12:33:51Z

> Still working out the best way to support a "reference" based query for alignment chunks

I think this is an example of a more general pattern where you want to retrieve As but you want to specify the range of As in terms of some other referencing scheme based on Bs.

Other examples might include "give me the sentences in John 3.1–3.11" or "give me the Beowulf fitts for lines 1000–1500".

I think the basic algorithm is: (1) work out the (first) fitt containing line 1000; (2) work out the (last) fitt containing line 1500; (3) use those two fitts as the "new" range returned. We might want to annotate the results with the fact that it was actually lines 1000 to 1500 that were requested (and that the two chunking systems aren't perfectly aligned).

Obviously that means we need to index to support (1) and (2).

Anyway, IMO all this applies to translation units containing other referencing schemes too.
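The three steps above can be sketched in Python roughly as follows. This is only an illustration under stated assumptions: the chunk boundaries are invented (they are not real Beowulf fitt boundaries), the chunks are assumed to be sorted, contiguous and non-overlapping, and CHUNKS, chunk_index_containing and resolve_range are hypothetical names. A sorted list of start lines stands in for the index needed by steps (1) and (2).

```python
from bisect import bisect_right

# Hypothetical chunk index: (first_line, last_line) per chunk, sorted and
# non-overlapping. The numbers are made up for illustration.
CHUNKS = [(1, 400), (401, 900), (901, 1200), (1201, 1600), (1601, 2000)]
STARTS = [first for first, _ in CHUNKS]


def chunk_index_containing(line):
    """Return the index of the chunk whose span contains `line`."""
    i = bisect_right(STARTS, line) - 1
    if i < 0 or line > CHUNKS[i][1]:
        raise LookupError(f"no chunk contains line {line}")
    return i


def resolve_range(start_line, end_line):
    """Map a line range onto the chunks that cover it."""
    first = chunk_index_containing(start_line)  # step (1)
    last = chunk_index_containing(end_line)     # step (2)
    chunks = CHUNKS[first:last + 1]             # step (3)
    return {
        # annotate the result with what was actually asked for
        "requested": (start_line, end_line),
        "resolved": (chunks[0][0], chunks[-1][1]),
        "chunks": chunks,
    }


result = resolve_range(1000, 1500)
print(result["requested"], result["resolved"])
```

Returning both the requested and the resolved range keeps the mismatch between the two chunking systems visible to the caller.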

jacobwegner · 2019-07-09T17:52:05Z

@jtauber I agree; my original comment could have been clearer in that I was trying to figure out how to support a reference-based query in GraphQL / graphene-django.

I'd already set up a contains relation and have begun implementing a custom filter in 42cecdc:

Sample query

{
  alignmentChunks(version_Urn: "urn:cts:greekLit:tlg0012.tlg001.perseus-grc2", reference: "1.1-1.6") {
    edges {
      cursor
      node {
        id
        citation
        items
      }
    }
  }
}
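For illustration only, here is a small Python sketch of how a reference argument such as "1.1-1.6" might be turned into comparable positions before filtering. It is not the filter from 42cecdc. It assumes references always follow the `<bookPosition>.<linePosition>` form with an optional `-` range, and parse_reference and overlaps are hypothetical helpers.

```python
def parse_reference(reference):
    """Parse "1.1-1.6" or "1.8" into ((book, line), (book, line)).

    A single reference such as "1.8" is treated as a one-line range.
    """
    start, _, end = reference.partition("-")
    end = end or start

    def to_position(value):
        book, line = value.split(".")
        return int(book), int(line)

    return to_position(start), to_position(end)


def overlaps(chunk_start, chunk_end, ref_start, ref_end):
    """True if the chunk's span shares at least one line with the reference."""
    return chunk_start <= ref_end and chunk_end >= ref_start


ref_start, ref_end = parse_reference("1.1-1.6")
print(ref_start, ref_end)
```

Comparing (book, line) tuples relies on Python's element-wise tuple ordering, so a range can cross a book boundary without special handling.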

Will still want to work out the "annotation" approach for GraphQL as well. The Flask-based API had a metadata entry for reference_resolved:

https://readhomer-dev-api.herokuapp.com/urn:cts:greekLit:tlg0012.tlg001.perseus-grc2/alignment/eng/1.1-1.6/


jacobwegner · 2019-11-14T23:06:39Z

I've updated this branch based on some things we've worked on with KITAB's ATLAS server.

LineReferenceFilterMixin is used to implement a "range" query (rather than relying on a contains relation).

# Conflicts:
#   README.md
#   readhomer_atlas/library/admin.py
#   readhomer_atlas/library/importers/versions.py
#   readhomer_atlas/library/migrations/0001_initial.py
#   readhomer_atlas/library/models.py
#   readhomer_atlas/library/schema.py


jacobwegner added 7 commits July 5, 2019 11:11

improve admin

2028e6b

first pass at a VersionAlignment model

ca306b4

first pass at alignment chunk

7ef593a

update schema for referencing alignments from Line

14b661b

add alignment chunks to schema; add sample queries

f0df5b6

update fixture

aad9d8a

fix fixture

0eb39b4

port alignment chunk loading from the Flask app

b7c63c0

jacobwegner added 2 commits July 9, 2019 12:24

first pass at implementing a reference-based filter

42cecdc

ensure distinct

1b75b71

quick pass at reference filter for lines endpoint

cd1c64a

jacobwegner changed the base branch from feature/importers to master July 16, 2019 18:38

jacobwegner added 4 commits August 20, 2019 09:34

upgrade to latest graphene-django

d13a172

fixes an issue where we had to define filter_fields even when an explicit filterset_class was in use

Merge branch 'master' into feature/translation-alignments

c37b1d6

# Conflicts:
#   readhomer_atlas/library/admin.py
#   readhomer_atlas/library/schema.py
#   requirements.txt

update for SQLite

da55a61

update db prep script

0cd2604

jacobwegner added 4 commits November 14, 2019 17:08

drop contains relation in favor of a computed property

345b87e

switch to bulk create

14ea562

set batch_size

6c448c9

factor out common functionality as LineReferenceFilterMixin

6ee0bd2

jacobwegner force-pushed the feature/translation-alignments branch from 6469882 to 6ee0bd2 Compare November 14, 2019 23:09

jacobwegner added 2 commits November 15, 2019 10:21

update README for database changes and ref filter

fef8bc9

update comments

e906234

jacobwegner changed the title ~~WIP: Port translation alignment endpoints from Flask app~~ Port translation alignment endpoints from Flask app Nov 15, 2019

jacobwegner requested a review from jtauber November 15, 2019 16:23

jacobwegner added 5 commits March 16, 2020 08:50

reset migrations

7cc4efc

fix importer

3e80aec

fix alignments importer

6e1a8b7

restore alignment chunk filtering

b554f8a

jacobwegner had a problem deploying to explorehomer-feature-tr-vlxhmh March 16, 2020 16:30 Failure

Merge branch 'develop' into feature/translation-alignments

c5874b3

jacobwegner temporarily deployed to explorehomer-feature-tr-vlxhmh March 19, 2020 16:40 Inactive

jacobwegner changed the base branch from master to develop March 19, 2020 16:40

jacobwegner added 2 commits March 19, 2020 11:43

fix failing tests

e62a5f6

ignore .coverage

e48a816

jacobwegner temporarily deployed to explorehomer-feature-tr-vlxhmh March 19, 2020 16:44 Inactive

fix up migrations

3460898

jacobwegner temporarily deployed to explorehomer-feature-tr-vlxhmh March 19, 2020 17:05 Inactive

jacobwegner temporarily deployed to explorehomer-ta-stable March 19, 2020 17:24 Inactive

jacobwegner temporarily deployed to explorehomer-ta-stable March 19, 2020 17:26 Inactive

fix AlignmentChunk.contains

2c9e604

jacobwegner temporarily deployed to explorehomer-feature-tr-vlxhmh March 19, 2020 17:34 Inactive

jacobwegner added 2 commits March 19, 2020 12:41

prefer TextAlignment and TextAlignmentChunk

bf36d5d

don't assume text part chunk kind

aaafcf6

jacobwegner temporarily deployed to explorehomer-feature-tr-vlxhmh March 19, 2020 17:44 Inactive

filter chunks that contain a reference

8439315

we had a "contains" relation ("lines") that isn't performant: 345b87e took ingestion from ~7s to ~45s. This allows us to do contains queries (and would likely allow further metadata such as "resolved").

jacobwegner temporarily deployed to explorehomer-feature-tr-vlxhmh March 23, 2020 18:04 Inactive

Merge branch 'develop' into feature/translation-alignments

69d2846

jacobwegner temporarily deployed to explorehomer-feature-tr-vlxhmh April 3, 2020 20:12 Inactive

jacobwegner changed the title ~~Port translation alignment endpoints from Flask app~~ Translation Alignments: Port translation alignment endpoints from Flask app to ATLAS Apr 3, 2020

jacobwegner merged commit 040e7d2 into develop Apr 3, 2020

jacobwegner mentioned this pull request Apr 23, 2020

Backport translation web annotations from hmt-cite-atlas #30

Merged

jacobwegner deleted the feature/translation-alignments branch April 24, 2020 18:51

jacobwegner mentioned this pull request Jun 6, 2020

Prepare for v2020-06-06-001 release #50

Merged
