Added transcription helpers for extracting text from a canvas #15

Merged
stephenwf merged 36 commits into main from feature/transcription-helpers
May 27, 2024

Conversation


@stephenwf stephenwf commented Apr 23, 2024

Transcription helper.

The helper finds transcriptions from the following sources:

  • VTT as rendering on canvas
  • Embedded Annotation page
  • External Annotation page
  • ALTO annotations (FUTURE)

Cookbook:

Plaintext rendering on canvas:

"rendering": [
  {
    "id": "https://fixtures.iiif.io/video/indiana/volleyball/volleyball.txt",
    "type": "Text",
    "label": {
      "en": [
        "Transcript"
      ]
    },
    "format": "text/plain"
  }
]
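
As a rough illustration of how a client might detect this case, the sketch below looks for a `text/plain` rendering on a canvas. The `CanvasLike` and `RenderingResource` shapes and the `findPlaintextRendering` name are assumptions made for this example, not part of the helper library, and the canvas ID is a placeholder.

```typescript
// Minimal shapes for this sketch; real IIIF resources carry more properties.
interface RenderingResource {
  id: string;
  type: string;
  format?: string;
  label?: Record<string, string[]>;
}

interface CanvasLike {
  id: string;
  rendering?: RenderingResource[];
}

// Hypothetical helper: returns the first text/plain rendering, if any.
function findPlaintextRendering(canvas: CanvasLike): RenderingResource | undefined {
  const matches = (canvas.rendering ?? []).filter(
    (resource) => resource.type === "Text" && resource.format === "text/plain"
  );
  return matches.length > 0 ? matches[0] : undefined;
}

const exampleCanvas: CanvasLike = {
  id: "https://example.org/canvas/1", // placeholder ID for illustration
  rendering: [
    {
      id: "https://fixtures.iiif.io/video/indiana/volleyball/volleyball.txt",
      type: "Text",
      label: { en: ["Transcript"] },
      format: "text/plain",
    },
  ],
};

const plaintextRendering = findPlaintextRendering(exampleCanvas);
```

No network request is needed for this check; fetching the text itself would be a separate step.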

VTT annotation body on AV canvases:

"annotations": [
  {
    "id": "https://iiif.io/api/cookbook/recipe/0219-using-caption-file/canvas/page2",
    "type": "AnnotationPage",
    "items": [
      {
        "id": "https://iiif.io/api/cookbook/recipe/0219-using-caption-file/canvas/page2/a1",
        "type": "Annotation",
        "motivation": "supplementing",
        "body": {
          "id": "https://fixtures.iiif.io/video/indiana/lunchroom_manners/lunchroom_manners.vtt",
          "type": "Text",
          "format": "text/vtt",
          "label": {
            "en": [
              "Captions in WebVTT format"
            ]
          },
          "language": "en"
        },
        "target": "https://iiif.io/api/cookbook/recipe/0219-using-caption-file/canvas"
      }
    ]
  }
]
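
A minimal sketch of how the VTT case could be detected from embedded annotation pages follows. It assumes the pages are already loaded and that `motivation` and `body` may each be a single value or an array; the type names and `findVttBodies` are hypothetical and only illustrate the lookup.

```typescript
// Simplified shapes for this sketch only.
interface CaptionBodyLike {
  id?: string;
  type: string;
  format?: string;
  language?: string;
}

interface AnnotationLike {
  id: string;
  type: string;
  motivation?: string | string[];
  body?: CaptionBodyLike | CaptionBodyLike[];
}

interface AnnotationPageLike {
  id: string;
  type: string;
  items?: AnnotationLike[];
}

// Normalise "one value or many" into an array.
function toArray<T>(value: T | T[] | undefined): T[] {
  if (value === undefined) return [];
  return Array.isArray(value) ? value : [value];
}

// Hypothetical helper: collect text/vtt bodies of supplementing annotations.
function findVttBodies(pages: AnnotationPageLike[]): CaptionBodyLike[] {
  const found: CaptionBodyLike[] = [];
  for (const page of pages) {
    for (const annotation of page.items ?? []) {
      const motivations = toArray<string>(annotation.motivation);
      if (motivations.indexOf("supplementing") === -1) continue;
      for (const body of toArray<CaptionBodyLike>(annotation.body)) {
        if (body.format === "text/vtt") found.push(body);
      }
    }
  }
  return found;
}

const captionPages: AnnotationPageLike[] = [
  {
    id: "https://iiif.io/api/cookbook/recipe/0219-using-caption-file/canvas/page2",
    type: "AnnotationPage",
    items: [
      {
        id: "https://iiif.io/api/cookbook/recipe/0219-using-caption-file/canvas/page2/a1",
        type: "Annotation",
        motivation: "supplementing",
        body: {
          id: "https://fixtures.iiif.io/video/indiana/lunchroom_manners/lunchroom_manners.vtt",
          type: "Text",
          format: "text/vtt",
          language: "en",
        },
      },
    ],
  },
];

const vttBodies = findVttBodies(captionPages);
```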

OCR annotations, each with:

  • a motivation of supplementing,
  • the URI of the OCR file in the id property of the Annotation body, and
  • the target set to the applicable Canvas.

{
  "id": "https://iiif.io/api/cookbook/recipe/0068-newspaper/newspaper_issue_1-anno_p1.json-1",
  "type": "Annotation",
  "motivation": "supplementing",
  "body": {
    "type": "TextualBody",
    "format": "text/plain",
    "language": "de",
    "value": "I. 54. Jahrgang"
  },
  "target": {
    "type": "SpecificResource",
    "source": {
      "id": "https://iiif.io/api/cookbook/recipe/0068-newspaper/canvas/p1",
      "type": "Canvas",
      "partOf": [
        {
          "id": "https://iiif.io/api/cookbook/recipe/0068-newspaper/newspaper_issue_1-manifest.json",
          "type": "Manifest"
        }
      ]
    },
    "selector": {
      "type": "FragmentSelector",
      "conformsTo": "http://www.w3.org/TR/media-frags/",
      "value": "xywh=0,376,399,53"
    }
  }
}
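
The `xywh=` fragment in the selector above is what carries the positional information. A small sketch of parsing it into a box is shown here; `SpatialBox` and `parseXywh` are illustrative names, and only the pixel form of the media fragment is handled (no `percent:` prefix).

```typescript
interface SpatialBox {
  x: number;
  y: number;
  width: number;
  height: number;
}

// Parses "xywh=x,y,w,h" (optionally "xywh=pixel:x,y,w,h") into numbers.
// Returns null for anything else, including temporal fragments.
function parseXywh(value: string): SpatialBox | null {
  const match = /^xywh=(?:pixel:)?(\d+),(\d+),(\d+),(\d+)$/.exec(value);
  if (!match) return null;
  return {
    x: Number(match[1]),
    y: Number(match[2]),
    width: Number(match[3]),
    height: Number(match[4]),
  };
}

const ocrBox = parseXywh("xywh=0,376,399,53");
```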

Or, linking directly to an ALTO file (future, not implemented):

"rendering": [
  {
    "id": "https://iiif.io/api/cookbook/recipe/0068-newspaper/newspaper_issue_1-alto_p2.xml",
    "type": "Text",
    "format": "application/xml",
    "profile": "http://www.loc.gov/standards/alto/",
    "label": {
      "en": [
        "ALTO XML"
      ]
    }
  }
],

The helper produces one standard format for both temporal transcriptions and plaintext or positional plaintext, including selectors.

interface Transcription {
  id: string;
  source: any;
  plaintext: string;
  segments: Array<{
    text: string;
    textRaw: string;
    granularity?: 'word' | 'line' | 'paragraph' | 'block' | 'page';
    language?: string;
    selector?: ParsedSelector;
    startRaw?: string;
    endRaw?: string;
  }>;
}

ParsedSelector includes spatial and temporal information, taken either from an annotation or from VTT (the VTT parsing is very simple at the moment, because external libraries for it are heavy). If there is only plaintext, there are no segments.
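
To give a feel for what "very simple" VTT parsing can look like, here is a sketch that turns cue blocks into segments with `text`, `textRaw`, `startRaw` and `endRaw`. It is not the code in this PR: `CueSegment`, `vttTimeToSeconds` and `parseVttSketch` are assumed names, the numeric `start` and `end` fields stand in for a parsed temporal selector, and cue settings, `NOTE` blocks and styling are ignored.

```typescript
interface CueSegment {
  text: string;     // cue text with tags such as <v Name> stripped
  textRaw: string;  // cue text exactly as it appears in the file
  startRaw: string;
  endRaw: string;
  start: number;    // seconds
  end: number;      // seconds
}

// Accepts "mm:ss.ttt" or "hh:mm:ss.ttt".
function vttTimeToSeconds(raw: string): number {
  let seconds = 0;
  for (const part of raw.split(":")) {
    seconds = seconds * 60 + Number(part);
  }
  return seconds;
}

function parseVttSketch(vtt: string): CueSegment[] {
  const segments: CueSegment[] = [];
  // Cues are separated by one or more blank lines.
  const blocks = vtt.replace(/\r\n?/g, "\n").split(/\n{2,}/);
  for (const block of blocks) {
    let timingLine: string | undefined;
    const textLines: string[] = [];
    for (const line of block.split("\n")) {
      if (timingLine === undefined) {
        // Lines before the timing line are cue identifiers or headers.
        if (line.indexOf("-->") !== -1) timingLine = line.trim();
      } else if (line.trim() !== "") {
        textLines.push(line);
      }
    }
    if (timingLine === undefined) continue; // e.g. the "WEBVTT" header
    const timing = /^(\S+)\s+-->\s+(\S+)/.exec(timingLine);
    if (!timing) continue;
    const textRaw = textLines.join("\n");
    segments.push({
      text: textRaw.replace(/<[^>]*>/g, "").trim(),
      textRaw,
      startRaw: timing[1],
      endRaw: timing[2],
      start: vttTimeToSeconds(timing[1]),
      end: vttTimeToSeconds(timing[2]),
    });
  }
  return segments;
}

const sampleVtt = [
  "WEBVTT",
  "",
  "1",
  "00:00:01.000 --> 00:00:03.500",
  "<v Teacher>Good morning.",
  "",
  "00:00:04.000 --> 00:00:06.000",
  "Line one",
  "line two",
].join("\n");

const cueSegments = parseVttSketch(sampleVtt);
```

Joining the `text` of each segment with newlines would give a plausible `plaintext` value for the same transcription.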

A viewer could start by showing just the plaintext and implement the optional segments later.
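
That progressive approach might look something like the sketch below, which uses a cut-down copy of the `Transcription` shape. `toDisplayLines` and the `useSegments` flag are hypothetical; the only point is that the plaintext path works on its own and the segment path can be switched on later.

```typescript
// Cut-down versions of the shapes above, for this sketch only.
interface SegmentSketch {
  text: string;
  startRaw?: string;
  endRaw?: string;
}

interface TranscriptionSketch {
  id: string;
  plaintext: string;
  segments: SegmentSketch[];
}

// Returns the lines a viewer would render.
// Falls back to plaintext when segments are disabled or absent.
function toDisplayLines(transcription: TranscriptionSketch, useSegments: boolean): string[] {
  if (!useSegments || transcription.segments.length === 0) {
    return transcription.plaintext.split("\n");
  }
  return transcription.segments.map((segment) =>
    segment.startRaw ? `[${segment.startRaw}] ${segment.text}` : segment.text
  );
}

const exampleTranscription: TranscriptionSketch = {
  id: "https://example.org/transcription/1", // placeholder ID
  plaintext: "Good morning.\nPlease sit down.",
  segments: [
    { text: "Good morning.", startRaw: "00:00:01.000", endRaw: "00:00:03.500" },
    { text: "Please sit down.", startRaw: "00:00:04.000", endRaw: "00:00:06.000" },
  ],
};

const plainLines = toDisplayLines(exampleTranscription, false);
const timedLines = toDisplayLines(exampleTranscription, true);
```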

Some new helpers too:

  • canvasHasTranscriptionSync() - checks whether a canvas has a transcription without making any network requests
  • canvasLoadExternalAnnotationPages() - loads and waits for external Annotation Pages
  • annotationPageToTranscription() - does the actual work of fetching the transcription, and will also fetch all annotation pages. Recommended for use with Vault (to avoid multiple requests).


codesandbox-ci bot commented Apr 23, 2024

This pull request is automatically built and testable in CodeSandbox.

To see build info of the built libraries, click here or the icon next to each commit SHA.

@stephenwf
Member Author

At the moment, we are losing track of the Annotation target when parsing. It will very likely be the Canvas, but it could be any of:

  • Canvas ID
  • Media ID (complex timeline)
  • Choice ID (indicating it works with all choices)

Clients that provide navigation using the selector might need to check that the selector applies to the right target.
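
A client-side check could look roughly like this, assuming the target is kept alongside the selector. `TargetLike`, `getTargetId` and `targetsResource` are made-up names for the sketch; it handles a plain string target (with or without a fragment) and a SpecificResource whose `source` is a string or an object.

```typescript
// A target is either a URI string or a resource that may wrap a source.
type TargetLike =
  | string
  | {
      id?: string;
      type?: string;
      source?: string | { id: string; type?: string };
    };

// Resolves a target to the ID of the resource it points at.
function getTargetId(target: TargetLike): string | undefined {
  if (typeof target === "string") {
    return target.split("#")[0]; // drop any "#xywh=..." or "#t=..." fragment
  }
  if (target.source !== undefined) {
    return typeof target.source === "string" ? target.source : target.source.id;
  }
  return target.id;
}

// True when the annotation target points at the given Canvas, media or Choice ID.
function targetsResource(target: TargetLike, resourceId: string): boolean {
  return getTargetId(target) === resourceId;
}

const ocrTarget: TargetLike = {
  type: "SpecificResource",
  source: {
    id: "https://iiif.io/api/cookbook/recipe/0068-newspaper/canvas/p1",
    type: "Canvas",
  },
};

const isForFirstPage = targetsResource(
  ocrTarget,
  "https://iiif.io/api/cookbook/recipe/0068-newspaper/canvas/p1"
);
```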

@stephenwf
Member Author

We also need to pass in a language, so that the transcription helper can check for choices structured like this:
https://iiif.io/api/cookbook/recipe/0074-multiple-language-captions/
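
One possible shape for that language check is sketched below. It assumes the annotation body is a `Choice` whose items each carry a `language`, which is how that recipe appears to be structured; `pickBodyForLanguage`, the type names and the example.org URLs are placeholders. Falling back to the first item when nothing matches is a choice made for this sketch, not necessarily what the helper should do.

```typescript
interface LanguageBody {
  id: string;
  type: string;
  format?: string;
  language?: string;
}

interface ChoiceBody {
  type: "Choice";
  items: LanguageBody[];
}

// Picks the body matching the requested language.
// A non-Choice body is returned unchanged.
function pickBodyForLanguage(
  body: LanguageBody | ChoiceBody,
  language: string
): LanguageBody | undefined {
  if (!("items" in body)) return body;
  const matches = body.items.filter((item) => item.language === language);
  // Assumption: fall back to the first choice when no language matches.
  return matches.length > 0 ? matches[0] : body.items[0];
}

const captionChoice: ChoiceBody = {
  type: "Choice",
  items: [
    { id: "https://example.org/captions/en.vtt", type: "Text", format: "text/vtt", language: "en" },
    { id: "https://example.org/captions/it.vtt", type: "Text", format: "text/vtt", language: "it" },
  ],
};

const italianCaptions = pickBodyForLanguage(captionChoice, "it");
```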

@stephenwf
Member Author

This still needs more testing, so I will leave it open.

@stephenwf stephenwf marked this pull request as ready for review May 27, 2024 21:54
@stephenwf stephenwf merged commit d7dee09 into main May 27, 2024
3 checks passed
@stephenwf stephenwf deleted the feature/transcription-helpers branch May 27, 2024 21:57