
Sort formats [720p codec, viewing bitrate ~2 Mbit/s] #60

Open
wants to merge 4 commits into
base: master

Conversation

deldesir
Collaborator

@deldesir deldesir commented Dec 7, 2023

This is based on this discussion and is designed to download the best available format that is free (e.g. WebM), has a resolution of 720p, and has a total (audio plus video) bitrate close to 2000 kbps. It allows for flexibility by considering various formats while still prioritizing quality aspects such as resolution and bitrate.

--prefer-free-formats: Prefer video formats with free containers over non-free ones of same quality

--format='best': This option specifies the initial format to consider. In this case, it's set to 'best', which tells yt-dlp to select the best-quality single format that contains both video and audio.

--format-sort='res:720,tbr~2000': This option instructs yt-dlp to sort the available formats, preferring the largest resolution that does not exceed 720p (res:720) and then the total average bitrate (audio plus video) closest to 2000 kbps (tbr~2000).

--prefer-free-formats: Prefer video formats with free containers over non-free ones of same quality

--format: Video format code

--format-sort: Sort the formats by the fields given, see https://github.com/yt-dlp/yt-dlp#sorting-formats for more details
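To make the intended ordering concrete, here is a toy Python model of how "res:720,tbr~2000" reads according to the Sorting Formats documentation. It is only a sketch and not yt-dlp's actual implementation: the `sort_key` and `pick` helpers and the sample format dicts are made up for illustration.

```python
# Toy model of the documented meaning of "res:720,tbr~2000".
# NOT yt-dlp's implementation; all format entries below are invented samples.

def sort_key(fmt, max_res=720, target_tbr=2000):
    res, tbr = fmt["res"], fmt["tbr"]
    if res <= max_res:
        # Within the limit: a larger resolution ranks first.
        res_rank = (0, -res)
    else:
        # Over the limit: ranked after everything within it, smallest first.
        res_rank = (1, res)
    # "~" means closest to the target value, from either side.
    return (res_rank, abs(tbr - target_tbr))


def pick(formats):
    # Sorting only ranks formats; it never filters any of them out.
    return min(formats, key=sort_key)


formats_a = [
    {"id": "a480", "res": 480, "tbr": 1100},
    {"id": "a720lo", "res": 720, "tbr": 1500},
    {"id": "a720hi", "res": 720, "tbr": 2600},
    {"id": "a1080", "res": 1080, "tbr": 4500},
]
formats_b = [  # nothing at or below 720p, everything above 2000 kbps
    {"id": "b1080", "res": 1080, "tbr": 4500},
    {"id": "b1440", "res": 1440, "tbr": 9000},
]

print(pick(formats_a)["id"])
print(pick(formats_b)["id"])
```

In this model the second list still yields a result, because a sort order expresses a preference rather than a hard limit. Whether real downloads behave the same way is something the testing should confirm.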
@deldesir deldesir added the documentation and enhancement labels Dec 7, 2023
@deldesir deldesir requested a review from holta December 7, 2023 03:30
@deldesir deldesir self-assigned this Dec 7, 2023
@deldesir
Collaborator Author

deldesir commented Dec 7, 2023

Thanks, @chapmanjacobd, for recommending sorting formats! Would appreciate your thoughts – do you think I'm on the right track?

@holta
Member

holta commented Dec 7, 2023

Looks amazing! Building on:

FYI @deldesir recommends merging this first:

@holta holta changed the title from "Sort formats" to "Sort formats [720p codec, target ~2000 Mbit" Dec 7, 2023
@holta holta changed the title from "Sort formats [720p codec, target ~2000 Mbit" to "Sort formats [720p codec, viewing bitrate ~2 Mbit/s]" Dec 7, 2023
@deldesir
Collaborator Author

deldesir commented Dec 7, 2023

  • Is this tested?

Yes, on Ubuntu 24.04 (10.8.0.26)

  • What additional testing would be most useful from volunteers who can assist?

Perform two separate downloads: one without incorporating this pull request and another with it. Conduct a thorough comparison of the results and repeat the process with various other video URLs to effectively battle test it.

@holta
Member

holta commented Dec 7, 2023

prefers formats with a resolution of 720p (res:720)

What exactly does "prefers" mean in this situation?

Will it occasionally download 480p or 1080p possibly?

target bitrate up to 2000 kbps (tbr~2000).

If a video on YouTube/Vimeo is offered only at bitrates just above 2000 kbit/s (2001+ kbit/s), what will be downloaded?

(Nothing at all?)

@chapmanjacobd

chapmanjacobd commented Dec 7, 2023

Thanks, @chapmanjacobd ... do you think I'm on the right track?

Yeah, I agree with holta. I think --prefer-free-formats is an ideal choice, though its interaction with --format-sort is unknown to me.

But I do know that using --format-sort instead of changing --format from the default bestvideo*+bestaudio/best is likely the right way to go. Although... the more that I read the documentation the less clear it is to me how the default -f interacts with -S. I think some experimentation is in order.

It would be helpful to have a couple different test videos across multiple video platforms which are most likely to be used in your use cases. Because, for example, YouTube offers many different video/audio formats but the formats are not uniform across all videos.

Some data that you should track:

  • the output of yt-dlp -F $url
  • the output of yt-dlp -S ??? --print format $url
  • the output of yt-dlp -f ??? --print format $url
  • the output of yt-dlp --prefer-free-formats -S ??? --print format $url
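As a rough sketch, the four commands in the list above could be generated per URL with a small Python helper. The `tracking_commands` name, the example URL, and the sort/format values passed in at the bottom are placeholders (the values are taken from this PR's description), not a recommendation for what should replace the `???`.

```python
import shlex


def tracking_commands(url, sort_spec, format_spec):
    """Build the four yt-dlp invocations listed above as argv lists."""
    return [
        ["yt-dlp", "-F", url],
        ["yt-dlp", "-S", sort_spec, "--print", "format", url],
        ["yt-dlp", "-f", format_spec, "--print", "format", url],
        ["yt-dlp", "--prefer-free-formats", "-S", sort_spec,
         "--print", "format", url],
    ]


# Placeholder inputs: a dummy URL plus the values from this PR's description.
for argv in tracking_commands("https://example.com/video",
                              "res:720,tbr~2000", "best"):
    # shlex.join quotes each argument so the line can be pasted into a shell.
    print(shlex.join(argv))
```

Each argv list can also be passed directly to `subprocess.run` to capture the output for later comparison.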

to effectively battle test it.

I think this part is the key :-)

Currently, I'm using -f bestvideo[height<=576][filesize<2G]+bestaudio/best[height<=576][filesize<2G]/bestvideo[height<=576]+bestaudio/best[height<=576]/best and this works for me. I haven't noticed any downloads failing but that is because of the /best escape hatch at the end--though I think this is less than ideal.
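For anyone reading that selector for the first time: `/` separates fallback alternatives, which yt-dlp tries from left to right. The Python snippet below just splits the string to show the order. It is a reading aid, not a real parser for the selector syntax, and the naive split only works because this particular string has no `/` inside its brackets.

```python
selector = (
    "bestvideo[height<=576][filesize<2G]+bestaudio"
    "/best[height<=576][filesize<2G]"
    "/bestvideo[height<=576]+bestaudio"
    "/best[height<=576]"
    "/best"
)

# Naive split: fine here, but not a general parser for format selectors.
alternatives = selector.split("/")

for position, alternative in enumerate(alternatives, start=1):
    print(position, alternative)
```

The last alternative, plain `best`, has no filters, which is why it acts as the escape hatch mentioned above.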

I'm interested in upstreaming any insights learned here into lb dl --small (thanks for pinging me). On some websites the file size is very large--but this has only been a minor nuisance for me so I haven't taken the time to investigate this. It could simply be that no smaller formats were offered on the specific site.

I don't think this will be required, but this MCDA library is quite good if you need to shave some yaks: https://gitlab.com/shekhand/mcda

What I would do is create a spreadsheet with the different options that you think are important and write a script that makes it easy to do a Cartesian product over all the combinations and test across 3+ different video links per platform. Collectively go through the list of available formats (yt-dlp -F) and decide which one is ideal, then see which combination of options selects that ideal format most often.
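A minimal sketch of such a script in Python, assuming the option values below as starting points (they come from this PR and are not a vetted list). The `option_grid`, `option_combinations`, and `rank_combinations` names are hypothetical, and the selected format IDs have to be collected separately, e.g. from `--print format` runs.

```python
from collections import Counter
from itertools import product

# Each dimension lists alternative argument groups; [] means "flag omitted".
option_grid = {
    "free": [[], ["--prefer-free-formats"]],
    "sort": [[], ["-S", "res:720"], ["-S", "res:720,tbr~2000"]],
    "format": [[], ["-f", "best"]],
}


def option_combinations(grid):
    """Yield every combination of option groups as one flat tuple of args."""
    for parts in product(*grid.values()):
        yield tuple(arg for part in parts for arg in part)


def rank_combinations(chosen, ideal):
    """Count how often each combination picked the agreed ideal format.

    chosen: {(combo, url): format_id actually selected}
    ideal:  {url: format_id the group decided is ideal}
    """
    hits = Counter()
    for (combo, url), format_id in chosen.items():
        hits[combo] += int(format_id == ideal[url])
    return hits.most_common()


for combo in option_combinations(option_grid):
    print("yt-dlp", *combo, "--print", "format", "$url")
```

The printed lines can be pasted into the spreadsheet as column headers, and `rank_combinations` gives the final tally once the results are in.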

@deldesir
Collaborator Author

deldesir commented Dec 7, 2023

Thank you, @chapmanjacobd, for your detailed insights and recommendations. I'll follow your suggestion to experiment with --prefer-free-formats and --format-sort and conduct thorough testing across various video platforms. Your provided checklist for data tracking will be invaluable. I appreciate your sharing of the current working format and the mention of lb dl --small, and I'll certainly consider your input when creating the spreadsheet for testing combinations. I'll keep you posted on the progress and findings.

@holta, I guess I'll need to come back to you with solid answers considering all of this. Thanks for your thoughtful questions.

@holta
Member

holta commented Dec 7, 2023

@chapmanjacobd your work (xklb) is an Absolute Godsend:

We will ask those using this in schools in Haiti & Beyond to recognize you for all of their lives, Thank You 🙏

@tim-moody

I struggled with matching the desired video format to an audio format in https://github.com/tim-moody/scraper-tools/blob/master/generic/basicspider/sp_lib.py. Not sure if any of it is useful to you. I think I tried to drive off of the size or resolution that I wanted. See select_480p_format, select_audio_format, select_vid_format_by_res, and select_vid_format_by_width.

@holta
Member

holta commented Dec 7, 2023

4 quick excerpts from the output of yt-dlp --help in case it helps us (@EMG70 too?) incrementally test various options!

--prefer-free-formats
Prefer video formats with free containers over non-free ones of same quality. Use with "-S ext" to strictly prefer free containers

-S, --format-sort SORTORDER
Sort the formats by the fields given, see "Sorting Formats" ( https://github.com/yt-dlp/yt-dlp#sorting-formats ) for more details

-F, --list-formats
List available formats of each video. Simulate unless --no-simulate is used

-f, --format FORMAT
Video format code, see "FORMAT SELECTION" ( https://github.com/yt-dlp/yt-dlp#format-selection ) for more details

@holta
Member

holta commented Dec 7, 2023

I struggled with matching the desired video format to an audio format in https://github.com/tim-moody/scraper-tools/blob/master/generic/basicspider/sp_lib.py. Not sure if any of it is useful to you. I think I tried to drive off of the size or resolution that I wanted. See select_480p_format, select_audio_format, select_vid_format_by_res, and select_vid_format_by_width.

@tim-moody can you say a bit more about select_480p_format and select_vid_format_by_res if possible?

(@deldesir asked me if we're possibly looking in the wrong place! e.g. is there another link we should be looking at, alongside sp_lib.py above?)

@holta
Member

holta commented Dec 7, 2023

4 quick excerpts from the output of yt-dlp --help in case it helps us (@EMG70 too?) incrementally test various options!

A couple more suggestions — thanks to @deldesir:

--check-formats
Make sure formats are selected only from those that are actually downloadable

--check-all-formats
Check all formats for whether they are actually downloadable

@tim-moody

The functions I mentioned are all in the library I mentioned, but of course are called elsewhere. I went down the road of trying to find a syntax that gave me the right format, but I found it always had some problem with mixing video and audio. So my strategy was to get info on the video in question and then try to compute a good combination and ask for it specifically by format number. You have selected 720p, whereas I was searching for 480p, but the approach could be the same.

@tim-moody

The trick is to scan the available formats for the one that comes closest to the width or resolution you want, with rules to decide whether to go bigger or smaller if not found, and then find a good audio format that is also available and fits in the video container.
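The idea described above might look roughly like the following Python sketch. It is not the code from sp_lib.py: the function names, the "go smaller first, otherwise take the smallest larger one" rule, and the container compatibility table are all assumptions made for illustration. The dict keys follow the format entries that youtube_dl returns, and the sample entries are invented.

```python
# Assumed (simplified) table: which audio extensions fit which video container.
AUDIO_EXTS_FOR = {"mp4": ("m4a",), "webm": ("webm",)}


def pick_video(formats, target_height):
    videos = [f for f in formats
              if f.get("vcodec", "none") != "none" and f.get("height")]
    if not videos:
        return None
    at_or_below = [f for f in videos if f["height"] <= target_height]
    if at_or_below:
        # Assumed rule: prefer the tallest format that does not exceed the target.
        return max(at_or_below, key=lambda f: f["height"])
    # Otherwise go bigger, but as little as possible.
    return min(videos, key=lambda f: f["height"])


def pick_audio(formats, video):
    allowed = AUDIO_EXTS_FOR.get(video["ext"], ())
    audios = [f for f in formats
              if f.get("vcodec") == "none"
              and f.get("acodec", "none") != "none"
              and f["ext"] in allowed]
    if not audios:
        return None
    return max(audios, key=lambda f: f.get("abr") or 0)


def pick_pair(formats, target_height=720):
    """Return a format string such as '247+251', or None if nothing fits."""
    video = pick_video(formats, target_height)
    if video is None:
        return None
    if video.get("acodec", "none") != "none":
        return video["format_id"]  # already contains audio
    audio = pick_audio(formats, video)
    if audio is None:
        return video["format_id"]
    return f'{video["format_id"]}+{audio["format_id"]}'


sample_formats = [  # invented entries for illustration only
    {"format_id": "140", "ext": "m4a", "vcodec": "none", "acodec": "mp4a.40.2", "abr": 128},
    {"format_id": "251", "ext": "webm", "vcodec": "none", "acodec": "opus", "abr": 160},
    {"format_id": "135", "ext": "mp4", "height": 480, "vcodec": "avc1", "acodec": "none"},
    {"format_id": "247", "ext": "webm", "height": 720, "vcodec": "vp9", "acodec": "none"},
    {"format_id": "137", "ext": "mp4", "height": 1080, "vcodec": "avc1", "acodec": "none"},
]

print(pick_pair(sample_formats, 720))
print(pick_pair(sample_formats, 480))
```

The resulting string can then be requested explicitly by format number, which avoids relying on the selector syntax to pair video and audio.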

@holta
Member

holta commented Dec 7, 2023

The functions I mentioned are all in the library I mentioned

Which library?

Are @deldesir and I misunderstanding and looking in the wrong place? Thanks @tim-moody if you can clarify.

(select_480p_format and select_vid_format_by_res look potentially interesting!)

@holta
Member

holta commented Dec 7, 2023

https://github.com/tim-moody/scraper-tools/blob/master/generic/basicspider/sp_lib.py

If possible, can you clarify which line numbers on the above page are relevant to select_480p_format, select_vid_format_by_res, etc.?

(Apologies, @deldesir and I are not yet quite understanding what you are saying.)

@tim-moody

I guess I should mention that I import youtube_dl rather than using a CLI tool.

@tim-moody

Start reading at line 268.

@tim-moody

Sorry, I just realized I am not talking about the main branch, but 0.2.1-wip.

@tim-moody

tim-moody commented Dec 7, 2023

It's been a while since I worked on this. And there's code you write for yourself and code you write for others; this is the former.

@holta
Member

holta commented Dec 7, 2023

sorry. I just realized I am not talking about the main branch, but 0.2.1-wip

Got it! Here's a better link for @deldesir to look over if he can!

@holta
Member

holta commented Dec 11, 2023

@deldesir deldesir closed this Jul 1, 2024
@deldesir deldesir deleted the deldesir-sort-formats branch July 1, 2024 21:42
@deldesir deldesir restored the deldesir-sort-formats branch July 1, 2024 22:02
@deldesir deldesir reopened this Jul 1, 2024
Labels
documentation Improvements or additions to documentation enhancement New feature or request
4 participants