Extremely slow for large archives #61
Comments
I've tried with the Linux codebase: it's ~2 GiB (528 MiB packed) with ~100k files. Extracting that rar file with the library takes over a minute for me. Extracting with This is obviously unacceptable but also impossible to fix from my side. Maybe there is a regression in a recent version, so my next approach would be to check out older versions and see if they yield the same results. After that, there is not much I can do except mail the DLL authors or RIIR^TM, but this is just a fun side project that I don't even have a use for anymore, so investing too much time is not really an option. I'll report back. |
The first version of ouch that used this library was v0.5.0, which uses v0.5.2 of unrar.rs. I've tried this version of ouch and the same slow decompression speed exists, so if there was a regression, it wasn't recent. |
I found out why

    const NUM_THREADS: u32 = 16;

    fn main() -> Result<(), Box<dyn std::error::Error>> {
        // Archive path comes from the first CLI argument, with a default.
        let file = std::env::args()
            .nth(1)
            .unwrap_or_else(|| "archive.rar".to_owned());
        let mut handles = Vec::with_capacity(NUM_THREADS as usize);
        for i in 0..NUM_THREADS {
            let file = file.clone();
            // Every thread opens its own handle and walks all headers.
            let handle = std::thread::spawn(move || {
                let mut archive = unrar::Archive::new(&file).open_for_processing()?;
                while let Some(header) = archive.read_header()? {
                    // Extract only the entries assigned to this thread; skip the rest.
                    if header.entry().file_crc % NUM_THREADS == i {
                        archive = header.extract()?;
                    } else {
                        archive = header.skip()?;
                    }
                }
                anyhow::Ok(())
            });
            handles.push(handle);
        }
        for handle in handles {
            handle.join().unwrap()?;
        }
        Ok(())
    }

This would have to be done at the application level. It's a bit harder to design a concept for this in the library. |
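The partitioning idea in the snippet above can be looked at in isolation. The following is only a minimal sketch using the standard library, with no archive involved: `worker_for` and `shard` are hypothetical helper names, and the CRC values are made up. It shows that assigning an entry to worker `crc % num_workers` gives every entry exactly one owner, which is what lets the threads work without coordinating.

```rust
use std::thread;

/// Index of the worker that owns an entry, given the entry's CRC32.
/// Hypothetical helper; panics if `num_workers` is 0.
fn worker_for(crc: u32, num_workers: u32) -> u32 {
    crc % num_workers
}

/// Spawns `num_workers` threads. Each one scans the full list of CRCs
/// (as each thread scans every header in the real code) and keeps only
/// the entries it owns. Returns one list per worker, in worker order.
fn shard(crcs: &[u32], num_workers: u32) -> Vec<Vec<u32>> {
    let handles: Vec<_> = (0..num_workers)
        .map(|i| {
            let crcs = crcs.to_vec();
            thread::spawn(move || {
                crcs.into_iter()
                    .filter(|&crc| worker_for(crc, num_workers) == i)
                    .collect::<Vec<u32>>()
            })
        })
        .collect();
    handles
        .into_iter()
        .map(|h| h.join().expect("worker panicked"))
        .collect()
}

fn main() {
    // Made-up CRC values standing in for archive entries.
    let crcs = [0xDEAD_BEEF, 0x0000_0001, 0x1234_5678, 0xFFFF_FFFF, 0x0BAD_F00D];
    let shards = shard(&crcs, 4);
    for (i, s) in shards.iter().enumerate() {
        println!("worker {i} would extract {} entries", s.len());
    }
}
```

Note that every worker still visits every entry, so the gain comes only from spreading the extraction work, and the balance depends on how the CRC values happen to be distributed.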
Nice! I'm confused by the fact that the library doesn't allow for multithreading, though. Doesn't the |
The binary doesn't use the (extern C) DLL functions that we have to use, it directly interacts with the C++ objects so I'm assuming it can do more there, even though I haven't looked at exactly how it achieves multithreading. |
@AntoniosBarotsis is there anything you feel has to be done on the library side? Otherwise we can close this, right? |
I haven't had the time to properly look into this but I guess not. Though keeping it open for anyone that stumbles on the same issue is also an option, up to you |
I do not think WinRAR supports multi-threaded decompression. It only supports multi-threaded compression.

    ./rar | grep threads
      mt<threads>   Set the number of threads

rar is for both compression and decompression, so it has the switch.

    ./unrar | grep threads

unrar does not.
See also https://www.winrar-france.fr/winrar_instructions_for_use/source/html/HELPGeneralSettings.htm |
@ttys3 thanks for investigating. Either way, unrar uses multiple threads even if it is not advertised. To verify this, I manually compiled unrar with print statements, and for example here, it is setting this value to 11: https://github.com/muja/unrar.rs/blob/master/unrar_sys/vendor/unrar/cmddata.cpp#L689 |
Thanks, that's very helpful. But I see Line 83 in 1d9e413
Does this mean that, in unrar.rs, multi-threaded decompression is always disabled? |
|
Oops, my bad. I did not check the doc of it only enables multithreading when

    #ifdef RAR_SMP
          case 'T':
            Threads=atoiw(Switch+2);
            if (Threads>MaxPoolThreads || Threads<1)
              BadSwitch(Switch);
            else
            {
            }
            break;
    #endif

|
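For readers more at home in Rust, the range check in the C++ fragment above can be sketched as follows. This is an illustration only, not part of unrar or unrar.rs: `parse_threads` is a hypothetical function, the `-mt` prefix handling is simplified, and `max_pool_threads` is passed in as a parameter because the actual value of `MaxPoolThreads` is not shown in this thread.

```rust
/// Parses a hypothetical `-mt<threads>` switch and applies the same
/// range check as the C++ fragment: the value must be at least 1 and
/// at most `max_pool_threads`. Returns `None` for anything else, which
/// stands in for the `BadSwitch` call.
fn parse_threads(arg: &str, max_pool_threads: u32) -> Option<u32> {
    let threads: u32 = arg.strip_prefix("-mt")?.parse().ok()?;
    if threads < 1 || threads > max_pool_threads {
        None
    } else {
        Some(threads)
    }
}

fn main() {
    // 64 is an arbitrary limit chosen for this example.
    for arg in ["-mt8", "-mt0", "-mt100", "-x"] {
        println!("{arg} -> {:?}", parse_threads(arg, 64));
    }
}
```

One difference from the original: a non-numeric value is rejected at the parsing step here, rather than being converted to a number first and rejected by the range check.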
But the multithreading feature of the unrar C++ code is indeed not particularly efficient. My CPU is basically idle rather than being fully utilized by it.
As the title mentions, when I try to decompress a large archive (~11 GB, 800k+ files according to Windows File Explorer), it takes a lot of time. After letting it run for just under an hour, I cancelled it. In contrast, the unrar binary that comes with an installation of WinRAR took 12 minutes for the same archive. Considering this crate is a wrapper, I should be able to get nearly identical performance in both cases. Note that this was run in release as well as the following profile:
It is very possible that the code I used is wildly suboptimal, I can't tell.
This is coming from ouch-org/ouch#714 after I did my own testing and arrived at the numbers I mentioned above. You can find the code here. It reads in a test.rar archive and extracts all files to a test directory while keeping track of the number of files extracted.