Description
VTT Metadata Cue format is ambiguous; some metadata may be unintentionally presented to the user in a context outside HTML.
Consider clarifying that metadata cues SHOULD or MUST be formatted as one or more unambiguous patterns. JSON is the obvious one, to retain backwards compatibility with the JSON usage documented in the VTT spec, but there may be others.
WEBVTT

1
00:00:10.123 --> 00:00:15.432
{
"key": "value"
}
Background
§ 4.2.1. WebVTT metadata text (Normative) defines metadata text as:
WebVTT metadata text consists of any sequence of zero or more characters other than U+000A LINE FEED (LF) characters and U+000D CARRIAGE RETURN (CR) characters, each optionally separated from the next by a WebVTT line terminator. (In other words, any text that does not have two consecutive WebVTT line terminators and does not start or end with a WebVTT line terminator.)
WebVTT metadata text cues are only useful for scripted applications (e.g. using the metadata text track kind in a HTML text track).
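To illustrate how permissive this definition is, here is a minimal sketch of a check for it in Python. The function name `is_valid_metadata_text` is hypothetical and not part of any WebVTT library; the sketch only encodes the parenthetical from the spec text quoted above (no two consecutive line terminators, no leading or trailing line terminator), treating CRLF, LF, and CR each as a single line terminator.

```python
def is_valid_metadata_text(text: str) -> bool:
    """Return True if `text` fits the quoted definition of WebVTT metadata text.

    Hypothetical helper for illustration only. A WebVTT line terminator is
    CRLF, LF, or CR, so all three are normalized to LF before checking.
    """
    # CRLF must be collapsed first so it counts as one terminator, not two.
    normalized = text.replace("\r\n", "\n").replace("\r", "\n")

    # Must not start or end with a line terminator.
    if normalized.startswith("\n") or normalized.endswith("\n"):
        return False

    # Must not contain two consecutive line terminators (a blank line).
    return "\n\n" not in normalized


# Nearly any text passes, JSON-like or not.
print(is_valid_metadata_text('{\n"key": "value"\n}'))  # True
print(is_valid_metadata_text("Just a sentence."))      # True
print(is_valid_metadata_text("a\n\nb"))                # False
```

Because a plain sentence passes just as readily as a JSON object, the definition alone gives a consumer no way to tell metadata apart from caption text.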
§ 1.7. Metadata example (Informative) clarifies:
A WebVTT file can consist of time-aligned metadata.
Metadata can be any string and is often provided as a JSON construct.
Problem
"Metadata can be any string" results in a format that is ambiguous, and therefore may be presented to the user unintentionally.
In an HTML <video> element, this ambiguity is resolved by the author providing a kind="metadata" attribute on the text track.
But there isn't a logical place to duplicate this disambiguation in some other VTT contexts, including when they are embedded in some media container formats.
Proposed Solution
Consider clarifying that metadata cues SHOULD or MUST be formatted as one or more unambiguous patterns. JSON is the obvious one, to retain backwards compatibility with the JSON usage documented in the VTT spec, but there may be others.
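As a rough sketch of what an "unambiguous pattern" check could look like on the consumer side, the Python below treats a cue payload as metadata only if it parses as a JSON object or array. The function name `looks_like_json_metadata` and the restriction to objects and arrays are assumptions made for illustration, not something the spec or this proposal defines. Bare JSON scalars such as `42` are excluded here because they cannot be told apart from ordinary caption text.

```python
import json


def looks_like_json_metadata(payload: str) -> bool:
    """Heuristic: True if the cue payload parses as a JSON object or array.

    Hypothetical helper; the object/array restriction is an assumption of
    this sketch, chosen because scalars like 42 also read as caption text.
    """
    try:
        value = json.loads(payload)
    except json.JSONDecodeError:
        return False
    return isinstance(value, (dict, list))


print(looks_like_json_metadata('{\n"key": "value"\n}'))  # True
print(looks_like_json_metadata("Hello, world"))          # False
print(looks_like_json_metadata("42"))                    # False
```

A consumer in a non-HTML context could use a check along these lines to decide whether to suppress a cue from display, which is only reliable if the spec constrains metadata cues to such a pattern.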
Additional context for this change is available in the following issue.