Renditions List proposal #11
base: main
Conversation
Just realized this is missing recommendations on how it should get populated from HLS and DASH content, like the other proposals do.
readonly attribute unsigned long height;
readonly attribute unsigned long bitrate;
readonly attribute unsigned long frameRate;
readonly attribute unsigned long codec;
Should this be only the video codec? How should we handle differing audio tracks for otherwise similar renditions?
Also, this should be a DOMString or something more relevant rather than a long.
The easier path forward would be to simply omit any audio codec. The more robust would be to identify whether or not the audio track (in the container sense) is in-mux with the video track (in the container sense). If yes, we list both. If no, (e.g. with AUDIO groups for HLS or separate audio AdaptationSets for MPEG-DASH) we don't include the audio codec. I'm cool with either approach here.
Related: Maybe this should be the MIME codec instead? That would make it consistent with https://developer.mozilla.org/en-US/docs/Web/API/MediaSource/addSourceBuffer#parameters and https://developer.mozilla.org/en-US/docs/Web/HTML/Element/source#attributes
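For illustration, a rough sketch (not part of the proposal; the rendition shape here is assumed) of what carrying a full MIME type string would buy, since the same string works with existing MSE APIs:

const rendition = {
  height: 720,
  bitrate: 2500000,
  frameRate: 30,
  // In-mux audio + video, so both codecs appear in the one type string.
  type: 'video/mp4; codecs="avc1.64001f, mp4a.40.2"',
};

// The same string can be handed straight to existing MSE APIs:
if (MediaSource.isTypeSupported(rendition.type)) {
  // e.g. mediaSource.addSourceBuffer(rendition.type);
}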
Yeah, maybe it should be type and match the source element type attribute and the addSourceBuffer param. Consistency is a plus, though it does make things a bit more annoying, since users would then need to parse the codecs out of the type.
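As a sketch of that parsing cost, a consumer-side helper (hypothetical, not part of any spec) might look like:

// Pull the codec list out of a full MIME type string, e.g.
// 'video/mp4; codecs="avc1.64001f, mp4a.40.2"' -> ['avc1.64001f', 'mp4a.40.2']
function codecsFromType(type) {
  const match = /codecs="([^"]*)"/.exec(type);
  return match ? match[1].split(',').map((c) => c.trim()) : [];
}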
Still planning on a more thorough review of this. Tl;dr: broad strokes this is 👌, and everything else is getting into the weeds for final approval (copy edits, clear definitions, implementation recommendations for HLS/DASH, etc.).
@@ -0,0 +1,172 @@
- Feature Name: renditions_list
nitpick(blocking): We've been using "human readable" names here. Consider:
- Feature Name: renditions_list
- Feature Name: Renditions List
Sorry for the late review; I'd scanned it over before, but this all looks great to me! I don't see any issues that should block starting development on this.
readonly attribute unsigned long bitrate;
readonly attribute unsigned long frameRate;
readonly attribute unsigned long codec;
attribute boolean enabled;
This makes more sense to be in line with the track API and be named selected. enabled is used in AudioTrack, where multiple audio tracks can be enabled; selected is used when only one track can be selected, like a VideoTrack. When a rendition is chosen for a video track, the old rendition is deselected, hence the name selected.

Scratch that, enabled makes sense now. If a manual rendition is selected, all others will be disabled, so by default most will be enabled in the typical use case. But we'll want to know which one the ABR algorithm has activated if all renditions are enabled. We might need both enabled and selected, or some property that represents the active state.
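To make that concrete, here's a hypothetical sketch of how an author-facing enabled toggle could coexist with a read-only "active" signal. The renditions property, the active attribute, and the change event name below are all assumptions for illustration, not part of the proposal:

const renditions = videoTrack.renditions; // assumed property on a VideoTrack

// Typical default: everything enabled, so the ABR algorithm is free to choose.
for (const rendition of renditions) { // assumes the list is iterable
  rendition.enabled = true;
}

// A hypothetical read-only flag ("active") reports which rendition ABR picked.
renditions.addEventListener('change', () => {
  const active = Array.prototype.find.call(renditions, (r) => r.active);
  if (active) {
    console.log(`ABR is now playing the ${active.height}p rendition`);
  }
});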
Great v1 of this @gkatsev. 👍
I'm interested to debate the need for enabling/disabling video tracks in this API. See related comment.
After seeing @luwes's implementation of this, having the renditions attached to tracks leads to some pretty complex stacking of event listeners: first listening for track changes, then rendition changes on top of that. I want to explore surfacing videoEl.videoRenditions more. I see the issue of potentially creating two paths to the same concept, but I worry that the complexity of the current version might actually keep it from getting adopted more widely.
I sketched this up after a conversation with Wes, as what would feel like an easier world. Essentially, videoRenditions would only list the renditions of the currently selected video track, and would not be configurable.
Building a quality selector menu
const renditionList = video.videoRenditions;

renditionList.addEventListener('addrendition', () => {
  // ...empty the <option> list to rebuild it?
  // Reordering will probably be needed
  // when adding/removing renditions...
  // ...add "auto" <option>...
  if (renditionList.selectedIndex == -1) {
    // ...display "auto" <option> as selected...
  }
  const options = Array.prototype.map.call(renditionList, (rendition, index) => {
    // ...build <option>...
    const option = document.createElement('option');
    option.renditionHeight = rendition.height;
    option.value = index;
    if (index == renditionList.selectedIndex) {
      option.selected = true;
    }
    return option;
  });
  const sortedOptions = options.sort((a, b) => {
    return a.renditionHeight - b.renditionHeight;
  });
  // ...add sortedOptions to the DOM...
});
Selecting a specific rendition
video.videoRenditions.selectedIndex = option.value;
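As a usage sketch, wiring that into an actual menu's change handler (selectEl here is hypothetical) might look like:

selectEl.addEventListener('change', () => {
  // Assuming an "auto" <option> with value -1, meaning: let ABR choose.
  video.videoRenditions.selectedIndex = Number(selectEl.value);
});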
The main addition in this proposal is a `VideoRenditionList` which augments the [`VideoTrack`](https://html.spec.whatwg.org/multipage/media.html#videotrack) object API. This `VideoRenditionList` is a list of `VideoRendition`s, each of which includes the minimum set of information that's required by HAS formats. In addition, it includes a way to turn each individual rendition on and off.
For example, if I wanted to limit a video to only play back SD content, I could loop through the `VideoRenditionList` and enable only the renditions that have a height less than 720.
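A minimal sketch of that loop, assuming the proposal's shape of renditions hanging off the selected VideoTrack (the renditions property name is assumed here):

const track = video.videoTracks[video.videoTracks.selectedIndex];

for (const rendition of track.renditions) { // assumed VideoRenditionList
  // Keep only sub-720p renditions available to the ABR algorithm.
  rendition.enabled = rendition.height < 720;
}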
Is this solving a specific problem related to building the UI? It adds some complexity to the API, both on the UI side and for player developers building support for this. This API would essentially dictate how player developers expose configurability of adaptivity, and I'm not sure we should do that. For example, <hls-video> could alternatively do <hls-video max-resolution-height="720">, which I'd expect to limit the rendition set before it shows up in the renditionList. That may not be exactly how it's done, but it makes me want to keep this API more "read only" when it comes to what renditions are available.
Reading further into your notes below, it sounds like HLS.js wouldn't support this out of the box?
It is only possible to set a level cap
I've had a few conversations IRL on this, including philosophical ones about this project's purpose in general. Starting with the tl;dr:
The result then being a simple

On to the broader conversations...

There's one related topic about the sometimes competing goals of "solve APIs only for UI purposes (intentionally avoiding the deeper complexities of video)" and "create APIs that the W3C might some day adopt", because sometimes the video element APIs support more than what the UI might require. In this case, building on videoTracks is intuitive because of the natural data structure of video tracks and renditions; however, quality selection UIs don't need that level of complexity. The UI doesn't need to know the available renditions of a non-active video track, or even what track they belong to. It just needs to know what renditions the viewer should be able to choose from, and what's currently selected. If it made no difference to the design, great! But the stacking of events/listeners on videoTracks and videoRenditions makes the existing shape complicated to build on and implement.

After this conversation I think W3C adoption of these APIs still needs to be a consideration, but only a secondary concern. Otherwise it can distract from the UI problems we're specifically trying to solve for, and open the APIs back up to the video technology complexity we're trying to avoid, ultimately meaning these specs take a lot longer to ship and [I think] putting them at risk of not actually being adopted by other video player projects.

The other related topic was about where in the stack this API lives. Is the list a direct representation of the renditions available in the manifest, or does the list only represent what renditions are available after configuration decisions have been applied? I.e., is this API seeing what HLS.js sees and in a place to tell HLS.js what renditions it should choose from, or is HLS.js telling this API what to list after it's imposed a resolution cap (for example)?

This is related to the feature of enabling/disabling renditions within the list, changing whether a rendition can be chosen. This is another situation where the UI doesn't need the feature of enabling/disabling tracks, so it's going deeper than the first goal and only there for the second goal. The main issue I see with this feature is that it imposes a configuration interface on players that they may not be able to support easily. For that reason alone, I think we shouldn't include the feature in the first version of this spec at least. But I also think configuration of which renditions are available should just be up to players to provide.

Feel free to push back on any of that. But the general conclusions I'm coming to are:
EDIT: Apologies, I know this isn't the place for this conversation. Happy to delete and move where appropriate.
I'm completely with you on this one, Steve. W3C adoption is terribly slow (with good reason), but it gets in the way of innovation and exploration. I also think it's hard to standardize APIs we haven't explored practically; there are things we just won't know without real-world experimentation and user feedback. A simple process in my mind would be: come up with a list of media player UI/UX requirements which we can have conversations around, design APIs solely around that with a backing spec covering the intricate/important video details, and then ship solutions to these problems in some package that we can all install and start using (e.g., import { VideoRenditionList, ... } from 'media-extensions').
You're also on point about the goal difference. At the UI layer we just want what gets the job done and leads to the best UX; we don't care about intricate details at that level. It's extremely likely we won't expose the same primitives to our users. Browser APIs are generally shaped in the form of authoring, and as library authors we adapt them for consuming; there's always a gap. If we can first figure out what library authors need to build great UI/UX and ship the code to start experimenting, we can then work backwards to primitives that might make sense internally. I think W3C should be the last step in the proposal process/stages.
Agreed. My personal feedback is that technical video conversations/specs are amazing and clearly required, as they form the basis for why certain higher-level API decisions were made, but ultimately the main deliverable should be simplified and answer: "What player UI/UX challenge is this solving, and how?" Preferably with code, docs, and examples. The priority is getting something we can start experimenting with out the door fast. I'm personally okay with mistakes; semver and proposal stages can iron those out.
Agreed. Top-level conversations should be centered around UI/UX. Easier for people to chime in, and easier/faster adoption.
Thanks for weighing in @mihar-22! And good reframing of the situation. I like the idea of W3C adoption being the last step, with real-world implementation coming first, rather than projecting ahead. We have a PR in Media Chrome exploring an implementation of this. Want to try it out in Vidstack and see if it works in that context?
Closes #1