Alt text not clearly distinguishable from body text #1983

Open
Minnemann-dzblesen opened this issue Aug 24, 2023 · 5 comments

Comments

@Minnemann-dzblesen

Minnemann-dzblesen commented Aug 24, 2023

During our manual accessibility testing of an EPUB, we encountered the following problem:
The alt text of an image is announced correctly by Thorium's integrated read-aloud feature. However, it does NOT announce that the content is an image. This could be quite confusing for readers with impairments: a blind reader should be able to recognize that the text being read is the description of an image.

When NVDA or JAWS is used together with Thorium (which is certainly not the intended setup), "image" (or "image end") is announced before or after the alt text in each case. The screen reader built into Windows, and VoiceOver with Apple Books, also announce that it is an image. We tested with Thorium 2.2.0.

Or are we doing something wrong? Thanks.

@danielweck
Member

Hello, Thorium doesn't inject additional prompts; the TTS read-aloud experience is based purely on authored alternate text / accessible descriptions.
If Thorium injected speech prompts at runtime, would they need to match the language of the user, of the publication metadata, or of the text content itself? I assume the user locale, so there could be a discrepancy with the authored language.
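
To make the language question concrete, here is a minimal sketch of how a prompt locale could be chosen and how a mismatch with the authored language could be detected. Everything in it is hypothetical: `LocaleContext`, `pickPromptLocale`, and `hasLanguageMismatch` are illustrative names, not Thorium APIs, and defaulting to the user locale simply mirrors the assumption stated above.

```typescript
// Hypothetical sketch: none of these names exist in Thorium.
type PromptLocaleSource = "user" | "publication" | "content";

interface LocaleContext {
  userLocale: string;          // language of the reading app's user
  publicationLocale?: string;  // language declared in the publication metadata
  contentLocale?: string;      // language authored on the text content itself
}

// Pick the locale for injected prompts; fall back to the user locale
// when the preferred source is not available.
function pickPromptLocale(
  ctx: LocaleContext,
  prefer: PromptLocaleSource = "user",
): string {
  const candidates: Record<PromptLocaleSource, string | undefined> = {
    user: ctx.userLocale,
    publication: ctx.publicationLocale,
    content: ctx.contentLocale,
  };
  return candidates[prefer] ?? ctx.userLocale;
}

// Compare only the primary language subtag ("en-US" and "en-GB" both give "en").
function hasLanguageMismatch(promptLocale: string, contentLocale?: string): boolean {
  if (!contentLocale) return false;
  const primary = (tag: string): string => tag.toLowerCase().split("-")[0] ?? "";
  return primary(promptLocale) !== primary(contentLocale);
}
```

With a German user locale and English content, `pickPromptLocale` returns the user locale by default and `hasLanguageMismatch` flags the discrepancy, which is the situation described above.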

@gautierchomel
Member

Team discussions so far on the subject:

  1. Some users will want additional semantics, while others will prefer a lighter reading experience, so an implementation should provide a verbosity setting (as assistive technologies do).
  2. If we start with images, we will also have to discuss notes, asides, emphasis, etc.
  3. Different levels of semantics will be needed, HTML and ARIA at least (we must be able to differentiate images by role).
  4. Translations will be needed.

The unanswered question behind all this is: "Shall we consider the Thorium TTS feature an AT (assistive technology)?"

On the positive side, it would strongly highlight semantics (giving producers a reason to add them) and would certainly serve a lot of users.

Still, the effort to design, add, and maintain this must be considered, and dedicated funding would have to be found.
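
As an illustration of points 1, 2, and 4 above, a verbosity-aware announcement could be sketched as follows. This is only a sketch under stated assumptions: the `Verbosity` levels, the `announce` function, and the English and German prompt strings are all hypothetical and do not reflect Thorium's code or any agreed wording.

```typescript
// Hypothetical sketch: names, levels, and prompt strings are illustrative only.
type Verbosity = "off" | "brief" | "verbose";
type SemanticKind = "image" | "note" | "aside";

interface Prompts {
  start: string;
  end: string;
}

// Illustrative prompt tables; real translations would need review.
const EN: Record<SemanticKind, Prompts> = {
  image: { start: "Image", end: "End of image" },
  note: { start: "Note", end: "End of note" },
  aside: { start: "Aside", end: "End of aside" },
};

const DE: Record<SemanticKind, Prompts> = {
  image: { start: "Bild", end: "Ende des Bildes" },
  note: { start: "Anmerkung", end: "Ende der Anmerkung" },
  aside: { start: "Randbemerkung", end: "Ende der Randbemerkung" },
};

const TABLES: Record<string, Record<SemanticKind, Prompts>> = { en: EN, de: DE };

// Wrap the authored text with spoken prompts according to the verbosity level.
function announce(
  kind: SemanticKind,
  text: string,
  verbosity: Verbosity,
  locale: string,
): string {
  if (verbosity === "off") return text; // authored text only, no injected prompts
  const [lang = ""] = locale.toLowerCase().split("-");
  const table = TABLES[lang] ?? EN; // fall back to English when untranslated
  const prompts = table[kind];
  return verbosity === "brief"
    ? `${prompts.start}: ${text}`
    : `${prompts.start}: ${text}. ${prompts.end}.`;
}
```

With `"off"`, the output is the authored alt text alone, matching the behavior reported in this issue; `"brief"` and `"verbose"` add a start prompt, or start and end prompts, similar to what was observed with screen readers.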

@gautierchomel
Member

Page numbers should also be considered :) See related issue #1974 and discussions #1951 and #1799.

@tedvandertogt
Contributor

I remember from narrating books as a volunteer that we had a whole set of rules, including how to handle special elements (such as authors, page numbers, images, and footnotes). A user who relies on audio alone will easily get confused without those prompts. Depending on the type of user, the desirable UX may differ. I suggest drafting user requirements for several user types and deciding how to address them without introducing too much complexity.

@Minnemann-dzblesen
Author

Many thanks for the numerous replies, the good discussion, and the thoughts on this, and, even if there is no immediate solution, for the momentum this is gaining. We, as a German institution, along with many other European institutions, often and primarily recommend Thorium to publishers (to check their EPUBs in view of the approaching EAA), so a comprehensible, high-quality read-aloud experience is important to us, and I am confident it will come out of this process.
