0.5.0 alpha bugfixes (#9)
Bugfix of Nvidia GPU lookup causing failure on systems w/o an Nvidia GPU or driver, plus verbose & error logging for ffmpeg failures
Proryanator authored Mar 7, 2023
1 parent 57628f5 commit 6f602a5
Showing 8 changed files with 118 additions and 35 deletions.
68 changes: 50 additions & 18 deletions README.md
@@ -23,11 +23,13 @@ can:
 - help you identify the maximum possible achievable quality at a given bitrate, resolution, and fps for your hardware
 - identify maximum capabilities to be applied to OBS Studio, or author's
 suggested [Game Streaming Software](#streaming-host--client-software-suggestions) for streaming games anywhere
-- identify optimal encoder settings that allow you to squeeze the most quality out of a bitrate-limited streaming environment, such as streaming to Twitch or YouTube at low bitrates
+- identify optimal encoder settings that allow you to squeeze the most quality out of a bitrate-limited streaming
+environment, such as streaming to Twitch or YouTube at low bitrates

 ### The Two Tools

-- **benchmark** - one-click pre-configured encoding benchmark that runs on your chosen encoder, useful for a quick check of your GPU's performance at various resolutions/framerates
+- **benchmark** - one-click pre-configured encoding benchmark that runs on your chosen encoder, useful for a quick check
+of your GPU's performance at various resolutions/framerates
 - **permutor-cli** - command-line tool to iterate over all possible encoder settings and bitrates to find
 encoder limitations, in both performance and quality

@@ -107,9 +109,14 @@ Assuming you have followed the [Installation Setup Requirements](#installation--
 benchmark is as simple as:

 1) Opening the **benchmark** executable as you would any other program (double-click)
-2) Follow the on-screen instructions: select your GPU (if you have more than 1, otherwise it auto-selects your only
-card), select your encoder, and whether to run it on all resolutions or just a specific
-one
+2) Follow the on-screen instructions:
+
+- select your GPU (if you have more than 1, otherwise it auto-selects your only
+card)
+- select your encoder
+- select whether to run the benchmark on all resolutions or just a specific one
+- select whether you want to run it in verbose mode for extra logging (useful for error debugging)
+
 3) Wait for the benchmark to finish, which should not take that long

 ![img.png](docs/benchmark.png)
@@ -228,44 +235,67 @@ your cards for you.

## Applying your Findings

-This section details how to use the knowledge you've gained from this tool in software like Sunshine, Moonlight, OBS Studio, and many more.
+This section details how to use the knowledge you've gained from this tool in software like Sunshine, Moonlight, OBS
+Studio, and many more.

### Updating Encoder Settings in Sunshine

-We'll first discuss how to change encoder settings in Sunshine. Bitrate cannot be set in Sunshine itself, but you can change it in the Moonlight app on your computer or other streaming device.
+We'll first discuss how to change encoder settings in Sunshine. Bitrate cannot be set in Sunshine itself, but you can
+change it in the Moonlight app on your computer or other streaming device.

-As of February 2023, Nvidia is ending support for its own home GameStream service bundled with GeForce Experience. Enter <a href='https://github.com/LizardByte/Sunshine/releases'>Sunshine</a>, the open-source alternative that runs on your gaming rig and encodes your gameplay footage to be streamed to other devices, like another computer or even your phone. Sunshine, unlike other streaming programs such as Nvidia's GameStream, allows you to customize some encoding settings, and the right settings can often outperform Nvidia's GameStream program.
+As of February 2023, Nvidia is ending support for its own home GameStream service bundled with GeForce Experience.
+Enter <a href='https://github.com/LizardByte/Sunshine/releases'>Sunshine</a>, the open-source alternative that
+runs on your gaming rig and encodes your gameplay footage to be streamed to other devices, like another computer or
+even your phone. Sunshine, unlike other streaming programs such as Nvidia's GameStream, allows you to customize some
+encoding settings, and the right settings can often outperform Nvidia's GameStream program.

-Note: we'll assume that you already have a Sunshine server set up and that you have attached at least one client device. Sunshine sets some encoder settings by default; at the time of writing, the default preset for Nvidia encoders is `p4`. You can view the currently used encoder settings by going to `youripaddress:47990 -> Web UI -> Configuration -> NVIDIA NVENC Encoder / Intel QuickSync Encoder / AMD AMF Encoder`.
+Note: we'll assume that you already have a Sunshine server set up and that you have attached at least one client device.
+Sunshine sets some encoder settings by default; at the time of writing, the default preset for Nvidia encoders
+is `p4`. You can view the currently used encoder settings by going
+to `youripaddress:47990 -> Web UI -> Configuration -> NVIDIA NVENC Encoder / Intel QuickSync Encoder / AMD AMF Encoder`.

-Let's say that using the tools in this project, you identified that of all the possible encoder settings for NVENC_H264 on your 3080, the settings that allowed you to encode 4K@120 were:
+Let's say that using the tools in this project, you identified that of all the possible encoder settings for NVENC_H264
+on your 3080, the settings that allowed you to encode 4K@120 were:

`-preset p1 -tune ll -profile:v high -rc cbr`

-To apply these settings in Sunshine (for Nvidia), go to `Web UI -> Configuration -> NVIDIA NVENC Encoder` and change to the following values in the dropdowns:
+To apply these settings in Sunshine (for Nvidia), go to `Web UI -> Configuration -> NVIDIA NVENC Encoder` and change to
+the following values in the dropdowns:

```
NVENC Preset: p1 -- fastest (lowest quality)
NVENC Tune: ll -- low latency
NVENC Rate Control: cbr -- constant bitrate
```
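
As a rough illustration of how the ffmpeg-style flags above line up with Sunshine's dropdowns, the sketch below splits a settings string into flag/value pairs and marks which ones the walkthrough sets in Sunshine. The helper `parse_encoder_flags` is hypothetical and not part of this project, and it assumes every flag takes exactly one value.

```rust
/// Splits an ffmpeg-style settings string such as "-preset p1 -tune ll"
/// into (flag, value) pairs. Hypothetical helper for illustration only;
/// it assumes each flag is followed by exactly one value.
fn parse_encoder_flags(settings: &str) -> Vec<(String, String)> {
    let tokens: Vec<&str> = settings.split_whitespace().collect();
    tokens
        .chunks(2)
        .filter(|pair| pair.len() == 2 && pair[0].starts_with('-'))
        .map(|pair| {
            (
                pair[0].trim_start_matches('-').to_string(),
                pair[1].to_string(),
            )
        })
        .collect()
}

fn main() {
    let flags = parse_encoder_flags("-preset p1 -tune ll -profile:v high -rc cbr");
    for (flag, value) in &flags {
        // Per the walkthrough, preset, tune, and rate control have dropdowns
        // in Sunshine; the profile does not.
        let exposed = matches!(flag.as_str(), "preset" | "tune" | "rc");
        println!("{flag} = {value} (dropdown in Sunshine: {exposed})");
    }
}
```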

-You may have noticed that you could not set the profile for the encoder in Sunshine. Sunshine does not expose _all_ encoder settings, but exposes the ones that have the most impact on your encode (most likely Sunshine defaults the profile to high). Perhaps a future update will let you specify more settings, but for now you may be limited.
+You may have noticed that you could not set the profile for the encoder in Sunshine. Sunshine does not expose _all_
+encoder settings, but exposes the ones that have the most impact on your encode (most likely Sunshine defaults the
+profile to high). Perhaps a future update will let you specify more settings, but for now you may be limited.

-Once you've saved these settings, Sunshine will encode your game using your specific settings, enabling you to stream at potentially higher framerates, or framerates with higher 1% lows than before. (The author was not able to get higher than 4K@90 with default settings in Sunshine and Nvidia's GameStream service, but with the findings from this tool, is able to get a stable 4K@120.)
+Once you've saved these settings, Sunshine will encode your game using your specific settings, enabling you to
+stream at potentially higher framerates, or framerates with higher 1% lows than before. (The author was not able to get
+higher than 4K@90 with default settings in Sunshine and Nvidia's GameStream service, but with the findings from this
+tool, is able to get a stable 4K@120.)

### Applying Bitrate Knowledge in Moonlight App

When using Moonlight as your game streaming client, it auto-recommends a bitrate for you to stream at. Most of the time
this is pretty accurate for lower resolutions, however depending on your hardware's capabilities you might be able to
-get away with less bitrate than it suggests. What's more, some AMD GPUs need far more bitrate than Nvidia cards, so you'll want to know whether you'll need much higher bitrates.
+get away with less bitrate than it suggests. What's more, some AMD GPUs need far more bitrate than Nvidia cards, so
+you'll want to know whether you'll need much higher bitrates.

For example: Moonlight auto-selects `80Mb/s` for streaming 4K@60 game content. However from our testing, you really only
-need `50Mb/s` when encoding using H264_NVENC. Note that this applies to _nvenc_ encoders on Nvidia GPUs, and may or may not apply to other vendors' GPUs, even when using the same H264 algorithm.
+need `50Mb/s` when encoding using H264_NVENC. Note that this applies to _nvenc_ encoders on Nvidia GPUs, and may or
+may not apply to other vendors' GPUs, even when using the same H264 algorithm.

-After running the tool on a 4K@60 input file, we know we can get a visually lossless streaming experience with just 50Mb/s on our Nvidia GPU. We also know that, if we are attempting to stream our games outside our home network, our cellular or Wi-Fi connection speeds should be at least 50Mb/s to get a clean 4K@60. In addition to this, our gaming rig (and home network upload speeds) should also be capable of 50Mb/s.
+After running the tool on a 4K@60 input file, we know we can get a visually lossless streaming experience with just
+50Mb/s on our Nvidia GPU. We also know that, if we are attempting to stream our games outside our home network, our
+cellular or Wi-Fi connection speeds should be at least 50Mb/s to get a clean 4K@60. In addition to this,
+our gaming rig (and home network upload speeds) should also be capable of 50Mb/s.
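
To make the 50Mb/s figure concrete, here is a small sketch of the arithmetic: how much data a stream at a given bitrate transfers per hour, and whether both ends of the connection can carry it. The function names are hypothetical, and the link speeds in `main` are made-up example values, not measurements.

```rust
/// Data transferred in one hour at `mbps` megabits per second, in gigabytes
/// (decimal units: 8 bits per byte, 1000 MB per GB).
fn gigabytes_per_hour(mbps: f64) -> f64 {
    mbps * 3600.0 / 8.0 / 1000.0
}

/// A stream is only feasible if the host's upload speed and the client's
/// download speed both meet the required bitrate.
fn can_stream(required_mbps: f64, host_upload_mbps: f64, client_download_mbps: f64) -> bool {
    host_upload_mbps >= required_mbps && client_download_mbps >= required_mbps
}

fn main() {
    let required = 50.0; // bitrate found for 4K@60 in the example above
    println!("{} GB per hour", gigabytes_per_hour(required));

    // Example link speeds, chosen only for illustration.
    println!("home upload 60, cellular 55: {}", can_stream(required, 60.0, 55.0));
    println!("home upload 40, cellular 100: {}", can_stream(required, 40.0, 100.0));
}
```

The per-hour figure matters mostly on metered cellular plans, where a sustained 4K@60 stream can use up a data allowance quickly.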

-The tools here enable you to know whether you can actually stream to where you are, or whether you are bitrate-limited, encoder-hardware-limited, or somewhere in between. It's easier to know if you can stream games to your phone while on cellular data, and to know what resolution & framerate to set your stream to given that you are bitrate-limited.
+The tools here enable you to know whether you can actually stream to where you are, or whether you are bitrate-limited,
+encoder-hardware-limited, or somewhere in between. It's easier to know if you can stream games to your phone while on
+cellular data, and to know what resolution & framerate to set your stream to given that you are bitrate-limited.

### Streaming with OBS Studio

@@ -276,7 +306,9 @@ TBD
## Author's Research Findings and Discussion

A _lot_ of research has gone into the development of this tool, and some decisions were made along the way that might
-not make it obvious why some conclusions were drawn. This section is also for you if you are interested in some nitty-gritty details of video encoding, from SSD I/O read speed limitations to framerate statistics and some design decisions of the tool made by the author during development.
+not make it obvious why some conclusions were drawn. This section is also for you if you are interested in some
+nitty-gritty details of video encoding, from SSD I/O read speed limitations to framerate statistics and some design
+decisions of the tool made by the author during development.

### Streaming Host & Client Software Suggestions

15 changes: 15 additions & 0 deletions benchmark/src/main.rs
@@ -54,6 +54,7 @@ fn main() {

permutation.bitrate = bitrate;
permutation.encoder_settings = settings;
+permutation.verbose = cli.verbose;
engine.add(permutation);
}

@@ -139,6 +140,20 @@ fn read_user_input(cli: &mut BenchmarkCli, gpus: Vec<String>) {
}
}

+loop {
+print!("\nRun with verbose mode? [y/n]: ");
+let full: String = read!("{}");
+if full != "n" && full != "y" {
+println!("Invalid input, try again...");
+} else {
+if full == "y" {
+cli.verbose = true;
+}
+
+break;
+}
+}
+
println!();
}
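
The prompt loop added in this hunk can be sketched with only the standard library, which also makes the parsing step testable on its own. This is an illustrative variant, not the project's code: `parse_yes_no` and `prompt_yes_no` are hypothetical names, and unlike the diff (which compares the `read!` result against exactly `"y"` or `"n"`), this version trims whitespace and ignores case.

```rust
use std::io::{self, BufRead, Write};

/// Maps a raw answer to Some(true) for "y", Some(false) for "n",
/// and None for anything else.
fn parse_yes_no(input: &str) -> Option<bool> {
    match input.trim().to_ascii_lowercase().as_str() {
        "y" => Some(true),
        "n" => Some(false),
        _ => None,
    }
}

/// Re-prompts until a valid answer is read. Returns false if the input
/// ends or cannot be read, so the loop cannot spin forever.
fn prompt_yes_no(prompt: &str, reader: &mut impl BufRead) -> bool {
    let mut line = String::new();
    loop {
        print!("{prompt} [y/n]: ");
        io::stdout().flush().ok();
        line.clear();
        match reader.read_line(&mut line) {
            Ok(0) | Err(_) => return false, // end of input or read error
            Ok(_) => {}
        }
        match parse_yes_no(&line) {
            Some(answer) => return answer,
            None => println!("Invalid input, try again..."),
        }
    }
}

fn main() {
    let stdin = io::stdin();
    let verbose = prompt_yes_no("Run with verbose mode?", &mut stdin.lock());
    println!("verbose = {verbose}");
}
```

Taking the reader as a parameter is a design choice for the sketch: it lets the prompt be driven from an in-memory buffer instead of a real terminal.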

2 changes: 1 addition & 1 deletion engine/src/benchmark_engine.rs
@@ -31,7 +31,7 @@ impl BenchmarkEngine {
let permutation = self.permutations[i].clone();
// benchmark will not log ETA since every encode will be different
log_benchmark_header(i, &self.permutations, calc_time);
-self.results.push(run_encode(permutation, &ctrl_channel));
+self.results.push(run_encode(permutation.clone(), &ctrl_channel));
calc_time = Option::from(permutation_start_time.elapsed().unwrap());
}

41 changes: 32 additions & 9 deletions engine/src/engine.rs
@@ -74,25 +74,48 @@ fn log_header(index: usize, permutations: &Vec<Permutation>, calc_time: Option<D
println!("[{}]", permutation.encoder_settings);
}

-pub fn spawn_ffmpeg_child(ffmpeg_args: &FfmpegArgs) -> Child {
-return Command::new("ffmpeg")
-.args(ffmpeg_args.to_vec())
-.stdout(Stdio::null())
-.stderr(Stdio::null())
-.spawn().expect("Failed to start instance of ffmpeg");
+pub fn spawn_ffmpeg_child(ffmpeg_args: &FfmpegArgs, verbose: bool, log_error_output: Option<bool>) -> Child {
+// log the full ffmpeg command to be spawned
+if verbose {
+println!("V: ffmpeg args: {:?}", ffmpeg_args.encoder_args);
+let mut cloned = ffmpeg_args.clone();
+cloned.set_no_output_for_error();
+println!("V: ffmpeg args no network calls (copy this and run locally, minus the quotes): {:?}", cloned.encoder_args);
+}
+
+let mut effective_ffmpeg_args = ffmpeg_args.clone();
+if log_error_output.is_some() && log_error_output.unwrap() {
+effective_ffmpeg_args.set_no_output_for_error();
+}
+
+let mut command = Command::new("ffmpeg");
+let child = command.args(effective_ffmpeg_args.to_vec());
+
+if log_error_output.is_some() && log_error_output.unwrap() {
+child.stdout(Stdio::inherit())
+.stderr(Stdio::inherit());
+} else {
+child.stdout(Stdio::null())
+.stderr(Stdio::null());
+}
+
+return child.spawn().expect("Failed to start instance of ffmpeg");
}

fn run_overload_benchmark(metadata: &MetaData, ffmpeg_args: &FfmpegArgs, verbose: bool, detect_overload: bool, ctrl_channel: &Result<Receiver<()>, Error>) -> TrialResult {
-let mut child = spawn_ffmpeg_child(ffmpeg_args);
+let mut child = spawn_ffmpeg_child(ffmpeg_args, verbose, None);
if verbose {
-println!("Successfully spawned encoding child")
+println!("V: Successfully spawned encoding child");
}

let trial_result = progressbar::watch_encode_progress(metadata.frames, detect_overload, metadata.fps, verbose, ffmpeg_args.stats_period, ctrl_channel);

if trial_result.ffmpeg_error && !was_ctrl_c_received(&ctrl_channel) {
let _ = child.kill();
-println!("Ffmpeg encountered an error when attempting to run, double-check that your environment is setup correctly. If so, open an issue in github!");
+eprintln!("Ffmpeg encountered an error when attempting to run, double-check that your environment is setup correctly. If so, open an issue in github!");
+// spawn the ffmpeg command, with output logged so we can troubleshoot better
+// modifying the command just a little bit so that it fails immediately
+spawn_ffmpeg_child(&ffmpeg_args, verbose, Option::from(true));
} else if trial_result.was_overloaded && !was_ctrl_c_received(&ctrl_channel) {
let _ = child.kill();
println!("Encoder was overloaded and could not encode the video file in realtime, stopping...");
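
The conditional output handling in `spawn_ffmpeg_child` can be sketched in isolation. The version below is a minimal illustration, not the project's implementation: `build_command` is a hypothetical helper that builds the command without spawning it, and it collapses the repeated `is_some() && unwrap()` check from the diff into a single `unwrap_or(false)`.

```rust
use std::process::{Command, Stdio};

/// Builds (but does not spawn) a command whose stdout/stderr are either
/// inherited from the parent, so errors are visible, or discarded.
/// Hypothetical helper; `program` and `args` are placeholders here,
/// whereas the diff always runs "ffmpeg".
fn build_command(program: &str, args: &[&str], log_error_output: Option<bool>) -> Command {
    // None and Some(false) both mean "stay quiet".
    let show_output = log_error_output.unwrap_or(false);

    let mut command = Command::new(program);
    command.args(args);
    if show_output {
        command.stdout(Stdio::inherit()).stderr(Stdio::inherit());
    } else {
        command.stdout(Stdio::null()).stderr(Stdio::null());
    }
    command
}

fn main() {
    // Only prints the command that would run; nothing is spawned.
    let command = build_command("ffmpeg", &["-version"], Some(true));
    println!("{:?}", command);
}
```

Returning the unspawned `Command` keeps the sketch runnable on machines without ffmpeg; a caller would still need to call `spawn()` and, for a diagnostic rerun, probably `wait()` on the child so its output is complete before the program continues.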
12 changes: 6 additions & 6 deletions engine/src/permutation_engine.rs
@@ -65,7 +65,7 @@ impl PermutationEngine {

if !result.was_overloaded && permutation.check_quality.clone() {
let vmaf_start_time = SystemTime::now();
-result.vmaf_score = check_encode_quality(permutation.clone(), &ctrl_channel);
+result.vmaf_score = check_encode_quality(permutation.clone(), &ctrl_channel, permutation.verbose);
result.vmaf_calculation_time = vmaf_start_time.elapsed().unwrap().as_secs();

// if this is higher than the target quality, stop at this bitrate during benchmark
@@ -122,7 +122,7 @@ impl PermutationEngine {
}
}

-fn check_encode_quality(mut p: Permutation, ctrl_channel: &Result<Receiver<()>, Error>) -> c_float {
+fn check_encode_quality(mut p: Permutation, ctrl_channel: &Result<Receiver<()>, Error>, verbose: bool) -> c_float {
let ffmpeg_args = FfmpegArgs::build_ffmpeg_args(p.video_file.clone(), p.encoder.clone(), &p.encoder_settings, p.bitrate.clone());

println!("Calculating vmaf score; might take longer than original encode depending on your CPU...");
@@ -131,21 +131,21 @@ fn check_encode_quality(mut p: Permutation, ctrl_channel: &Result<Receiver<()>,
// first spawn the ffmpeg instance to listen for incoming encode
let vmaf_args = ffmpeg_args.map_to_vmaf(metadata.fps);
if p.verbose {
-println!("Vmaf args calculating quality: {}", vmaf_args.to_string());
+println!("V: Vmaf args calculating quality: {}", vmaf_args.to_string());
}

-let mut vmaf_child = spawn_ffmpeg_child(&vmaf_args);
+let mut vmaf_child = spawn_ffmpeg_child(&vmaf_args, verbose, None);

// then spawn the ffmpeg instance to perform the encoding
let mut encoder_args = ffmpeg_args.clone();

encoder_args.output_args = String::from(insert_format_from(TCP_OUTPUT, &ffmpeg_args.encoder));

if p.verbose {
-println!("Encoder ffmpeg args sending to vmaf: {}", encoder_args.to_string());
+println!("V: Encoder ffmpeg args sending to vmaf: {}", encoder_args.to_string());
}

-spawn_ffmpeg_child(&encoder_args);
+spawn_ffmpeg_child(&encoder_args, verbose, None);

// not the cleanest way to do this but oh well
progressbar::watch_encode_progress(metadata.frames, false, metadata.fps, false, ffmpeg_args.stats_period, ctrl_channel);
5 changes: 5 additions & 0 deletions ffmpeg/src/args.rs
@@ -105,6 +105,11 @@ impl FfmpegArgs {
return output;
}

+pub fn set_no_output_for_error(&mut self) {
+self.output_args = NO_OUTPUT.to_string();
+self.send_progress = false;
+}
+
pub fn to_vec(&self) -> Vec<String> {
return self.to_string().split(" ").map(|s| s.to_string()).collect();
}
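
A side note on the `to_vec` context lines above: splitting on a single space behaves differently from splitting on runs of whitespace. The sketch below shows the difference on a made-up argument string. Whether `FfmpegArgs::to_string` can ever produce consecutive spaces is not visible in this diff, so treat this as a general Rust illustration and not as a confirmed bug.

```rust
/// Splits on every single space, as `to_vec` in the hunk above does.
fn split_on_space(command: &str) -> Vec<String> {
    command.split(" ").map(|s| s.to_string()).collect()
}

/// Splits on runs of whitespace, so no empty arguments are produced.
fn split_on_whitespace(command: &str) -> Vec<String> {
    command.split_whitespace().map(|s| s.to_string()).collect()
}

fn main() {
    // Made-up argument string; note the double space after "-i".
    let command = "-i  input.mp4 -b:v 50M";
    println!("{:?}", split_on_space(command));
    println!("{:?}", split_on_whitespace(command));
}
```

Neither approach handles arguments that themselves contain spaces, such as file paths; those would need the arguments kept as a list from the start.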
9 changes: 8 additions & 1 deletion gpus/src/lib.rs
@@ -3,7 +3,14 @@ use nvml_wrapper::Nvml;
pub mod device;

pub fn get_gpus() -> Vec<String> {
-let nvml = Nvml::init().unwrap();
+let nvml = match Nvml::init() {
+Ok(nvml) => { nvml }
+Err(_) => {
+println!("Warning: Unable to auto-detect multiple GPU's, falling back to using first GPU or provided one via '-gpu' option if specified");
+return Vec::new();
+}
+};
+
let device_count = nvml.device_count().unwrap();

let mut list = Vec::new();
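
The fix in this hunk replaces a panicking `unwrap()` with a match that falls back to an empty list. The same pattern can be sketched without the NVML dependency; `query_driver` below is a hypothetical stand-in for a driver call that may fail and is not part of `nvml_wrapper`.

```rust
/// Stand-in for a driver query that may fail. Hypothetical; it only
/// simulates the presence or absence of a GPU driver.
fn query_driver(driver_present: bool) -> Result<Vec<String>, String> {
    if driver_present {
        Ok(vec![String::from("GPU 0")])
    } else {
        Err(String::from("driver not found"))
    }
}

/// Returns the detected GPUs, or an empty list (with a warning on stderr)
/// when detection fails, instead of panicking.
fn list_gpus(driver_present: bool) -> Vec<String> {
    match query_driver(driver_present) {
        Ok(gpus) => gpus,
        Err(reason) => {
            eprintln!("Warning: unable to auto-detect GPUs ({reason}), continuing with an empty list");
            Vec::new()
        }
    }
}

fn main() {
    println!("{:?}", list_gpus(true));
    println!("{:?}", list_gpus(false));
}
```

Note that in the hunk itself, `nvml.device_count().unwrap()` on the following context line can still panic if that call fails after a successful init; the same match-and-fall-back approach would apply there.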
1 change: 1 addition & 0 deletions permutor-cli/src/main.rs
@@ -75,6 +75,7 @@ fn build_nvenc_setting_permutations(engine: &mut PermutationEngine, cli: &Permut
permutation.verbose = cli.verbose;
permutation.detect_overload = cli.detect_overload;
permutation.allow_duplicates = cli.allow_duplicate_scores;
+permutation.verbose = cli.verbose;
engine.add(permutation);

// break out early here to just make 1 permutation

0 comments on commit 6f602a5
