VST3: State performance issue #478

Open
sevenc-nanashi opened this issue Dec 8, 2024 · 1 comment
sevenc-nanashi commented Dec 8, 2024

My plugin has ~50MiB of state.
It looks like DPF reallocates the buffer for every 512 bytes it reads, and this causes a significant performance issue (the total amount of copying might be $O(N^2)$ in the state size):

Simulation Source
def simulate(source_length)
  buffer_length = 512
  read = 0

  allocated = 0
  copied = 0

  while read < source_length
    read += buffer_length # read from source
    allocated += 1 # realloc is called
    copied += read # memory is copied by realloc
  end

  puts "== Source length: #{source_length} bytes"
  puts "Allocated: #{allocated} times"
  puts "Copied: #{copied} bytes (#{(copied / source_length.to_f * 100).round(2)}%)"
  puts "Unused buffer: #{read - source_length} bytes"
  puts
end

simulate(1 * 1024) # 1KiB
simulate(10 * 1024) # 10KiB
simulate(50 * 1024 * 1024) # 50MiB

Output:
== Source length: 1024 bytes
Allocated: 2 times
Copied: 1536 bytes (150.0%)
Unused buffer: 0 bytes

== Source length: 10240 bytes
Allocated: 20 times
Copied: 107520 bytes (1050.0%)
Unused buffer: 0 bytes

== Source length: 52428800 bytes
Allocated: 102400 times
Copied: 2684380774400 bytes (5120050.0%)
Unused buffer: 0 bytes
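
The totals above follow a closed form, which makes the quadratic growth explicit. As a sketch, write $N$ for the state size and $n = \lceil N / 512 \rceil$ for the number of reads (both symbols are introduced here for illustration), and assume, as the simulation does, that every realloc copies everything read so far:

$$
\text{copied} = \sum_{k=1}^{n} 512\,k = 512 \cdot \frac{n(n+1)}{2} \approx \frac{N^2}{1024}
$$

For the 50MiB case, $n = 102400$ gives $256 \cdot 102400 \cdot 102401 = 2684380774400$ bytes, which matches the simulated output.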

Looks like there's room for improvement.

Example: double the buffer size on every second allocation once more than 512*8 bytes have been read:

Details
def simulate(source_length)
  buffer_length = 512
  read = 0

  allocated = 0
  copied = 0

  extended = 0
  while read < source_length
    if allocated >= 8 && allocated % 2 == 0
      extended += 1
      buffer_length *= 2
    end
    read += buffer_length # read from source
    allocated += 1 # realloc is called
    copied += read # memory is copied by realloc
  end

  puts "== Source length: #{source_length} bytes"
  puts "Allocated: #{allocated} times"
  puts "Extended: #{extended} times (final buffer length: #{buffer_length} bytes)"
  puts "Copied: #{copied} bytes (#{(copied / source_length.to_f * 100).round(2)}%)"
  puts "Unused buffer: #{read - source_length} bytes"
  puts
end

simulate(1 * 1024) # 1KiB
simulate(10 * 1024) # 10KiB
simulate(50 * 1024 * 1024) # 50MiB

Output:
== Source length: 1024 bytes
Allocated: 2 times
Extended: 0 times (final buffer length: 512 bytes)
Copied: 1536 bytes (150.0%)
Unused buffer: 0 bytes

== Source length: 10240 bytes
Allocated: 12 times
Extended: 2 times (final buffer length: 2048 bytes)
Copied: 48128 bytes (470.0%)
Unused buffer: 0 bytes

== Source length: 52428800 bytes
Allocated: 38 times
Extended: 15 times (final buffer length: 16777216 bytes)
Copied: 234953728 bytes (448.14%)
Unused buffer: 14682112 bytes
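
For comparison, here is a sketch of a different strategy from the one simulated above: plain geometric growth of the destination buffer, the way a typical dynamic array grows. It assumes the reads stay at 512 bytes and that data is copied only when the capacity doubles. The name `simulate_capacity_doubling` and its `chunk` parameter are made up for this illustration, and this is not how DPF currently behaves.

```ruby
# Hypothetical model, not DPF's actual code: reads stay at a fixed chunk size,
# while the destination buffer doubles its capacity only when it is full.
def simulate_capacity_doubling(source_length, chunk: 512)
  capacity = chunk # initial allocation
  size = 0         # bytes stored so far
  allocated = 1
  copied = 0

  while size < source_length
    if size + chunk > capacity
      capacity *= 2
      allocated += 1
      copied += size # growing moves the existing contents once
    end
    size += chunk # read from source
  end

  { allocated: allocated, copied: copied, unused: capacity - source_length }
end

[1 * 1024, 10 * 1024, 50 * 1024 * 1024].each do |n|
  r = simulate_capacity_doubling(n)
  puts "== Source length: #{n} bytes"
  puts "Allocated: #{r[:allocated]} times"
  puts "Copied: #{r[:copied]} bytes (#{(r[:copied] / n.to_f * 100).round(2)}%)"
  puts "Unused buffer: #{r[:unused]} bytes"
  puts
end
```

Under this model the 50MiB case needs 18 allocations and copies 67108352 bytes (about 128% of the source), because the copied total stays below twice the final capacity. The trade-off is the unused tail of the last allocation.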

sevenc-nanashi commented Dec 8, 2024

Update:
I implemented that optimization locally (doubling the buffer size on every second allocation once more than 512*8 bytes have been read, capped at 1MiB):

Details
diff --git a/distrho/src/DistrhoPluginVST3.cpp b/distrho/src/DistrhoPluginVST3.cpp
index a9b99416..e96580c9 100644
--- a/distrho/src/DistrhoPluginVST3.cpp
+++ b/distrho/src/DistrhoPluginVST3.cpp
@@ -954,14 +954,27 @@ public:
         bool fillingKey = true; // if filling key or value
         char queryingType = 'i'; // can be 'n', 's' or 'p' (none, states, parameters)
 
-        char buffer[512], orig;
-        buffer[sizeof(buffer)-1] = '\xff';
+        char orig;
+        //char buffer[512];
+        //buffer[sizeof(buffer)-1] = '\xff';
+        int bufferSize = 512;
+        char* buffer = new char[bufferSize];
+        buffer[bufferSize - 1] = '\xff';
+        int allocationCount = 0;
         v3_result res;
 
         for (int32_t terminated = 0, read; terminated == 0;)
         {
+            allocationCount += 1;
+            if (allocationCount > 8 && allocationCount % 2 == 0 && bufferSize < 1024 * 1024)
+            {
+                bufferSize *= 2;
+                char* newBuffer = new char[bufferSize];
+                delete[] buffer;
+                buffer = newBuffer;
+            }
             read = -1;
-            res = v3_cpp_obj(stream)->read(stream, buffer, sizeof(buffer)-1, &read);
+            res = v3_cpp_obj(stream)->read(stream, buffer, bufferSize - 1, &read);
             DISTRHO_SAFE_ASSERT_INT_RETURN(res == V3_OK, res, res);
             DISTRHO_SAFE_ASSERT_INT_RETURN(read > 0, read, V3_INTERNAL_ERR);
 

It improved the performance a lot:

  • Without this optimization: ~70s
  • With this optimization: ~5s
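
The patch differs slightly from the earlier simulation: the counter is incremented before the check, the condition is `> 8` rather than `>= 8`, and growth stops at 1MiB. A rough sketch of that schedule, under the same cost model as the simulations above, might look like this. The name `simulate_patch` and its parameters are made up for illustration. The model ignores that the patch reads `bufferSize - 1` bytes per call and that a read may return fewer bytes than requested.

```ruby
# Hypothetical model of the read schedule in the patch above. It ignores that
# the patch reads bufferSize - 1 bytes per call and that reads may come back short.
def simulate_patch(source_length, initial: 512, cap: 1024 * 1024)
  buffer_length = initial
  read = 0
  iterations = 0
  copied = 0

  while read < source_length
    iterations += 1 # mirrors allocationCount += 1 at the top of the loop
    if iterations > 8 && iterations.even? && buffer_length < cap
      buffer_length *= 2
    end
    read += buffer_length # read from source
    copied += read        # same cost model as the simulations above
  end

  { iterations: iterations, buffer_length: buffer_length, copied: copied }
end

p simulate_patch(50 * 1024 * 1024)
```

Under these assumptions the 50MiB case takes 77 reads instead of 102400. Because the chunk size stops growing at 1MiB, the copied total in this model grows quadratically again for much larger states, only with a far smaller constant. The sketch covers the read schedule only. In the hunk as shown, the `'\xff'` sentinel is written only to the initial buffer and no `delete[]` for the final buffer is visible, so those would need checking separately.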
