Compressed_Buffer_Improvements #119

ecbadeaux · 2024-12-16T04:34:10Z

Intro:

Hi, my name is Everett, and I wanted to make some improvements to the spectatord project. I firmly believe I can really contribute to this team and project. I recently applied for the L4 role.

I took it upon myself to look at the spectatord repo and tried to find an area for improvement. I stumbled upon the compressed_buffer implementation, which I assume is called quite frequently, so I made some changes and will describe them below.

Improvement 1: Simplifying Function Overloads

The first change I made was function unification. The compressed_buffer had several overloads of the Append() function. The internal implementations of the functions were also nearly identical. The only difference being the parameter types, argument count, and amount of calls to push_back().

auto Append(std::string_view s);
auto Append(uint8_t b1);
auto Append(uint8_t b1, uint8_t b2);
auto Append(uint8_t b1, uint8_t b2, uint8_t b3);
auto Append(uint8_t b1, uint8_t b2, uint8_t b3, uint8_t b4);
auto Append(uint8_t b1, uint8_t b2, uint8_t b3, uint8_t b4, uint8_t b5);
auto Append(uint8_t b1, uint8_t b2, uint8_t b3, uint8_t b4, uint8_t b5, uint8_t b6);

Rather than the style above, which also included multiple implementations, I consolidated this into the two functions below. I created a templated variadic function called Append() that serves as a public wrapper to a private method AppendImpl(). AppendImpl() will then execute the original code depending on the arguments and their types.

I have included a snapshot of these functions below for reference:

// Public Wrapper
template <typename... Args>
auto Append(Args&&... args) -> void 
{
    static_assert(
        ((std::is_convertible_v<std::decay_t<Args>, std::string_view> || ::is_same_v<std::decay_t<Args>, uint8_t>) && ...),
        "Invalid argument types. Only std::string, std::string_view, or uint8_t are allowed.");
    
    assert(init_);
    AppendImpl(std::forward<Args>(args)...);
    maybe_compress();
}

// Private Implamentation
template <typename... Args>
auto AppendImpl(Args&&... args) -> void
{
    if constexpr ((std::is_convertible_v<std::decay_t<Args>, std::string_view> && ...))
    {
        static_assert(sizeof...(args) == 1, "Only one string-like argument is allowed.");
        
        (cur_.insert(cur_.end(), std::forward<Args>(args).begin(), std::forward<Args>(args).end()), ...);
    } 
    else if constexpr ((std::is_same_v<std::decay_t<Args>, uint8_t> && ...)) 
    {
        static_assert(sizeof...(args) >= 1 && sizeof...(args) <= 6, "The number of uint8_t arguments must be between 1 and 6.");
        
        (cur_.push_back(std::forward<Args>(args)), ...);
    }
}

Lets talk about `Append()`:

The new public Append() function examines the parameter packs types at compile time and asserts that every argument provided to Append() is either implicitly convertible to string_view or is strictly a uint8_t. If you try to call this function with arguments outside of those types, it will fail at compile time. You can test this by uncommenting the unit tests I added on lines 51–69. This assert will now force you to cast the types you are trying to append so implicit conversion does not occur. Implicit conversion can result in possible loss of precision or data truncation.

On line 89, I had to now static_cast the value being added because it was causing an implicit narrowing conversion. This line should probably be examined for a potential bit manipulation error because mid & 0x7FU results in the creation of an unsigned int. With the new templated function, you will be informed at compile time if this occurs in other locations. You can also enable your compiler to catch these errors by adding the following flags to your CMakeLists.txt file on line 10: -Wconversion -Wsign-conversion .

Once the types are confirmed to be valid (all params implicitly convertible to string_view, or all params are strictly uint8_t), the function proceeds with the original code, but now AppendImpl() handles writing to the private vector cur_

    assert(init_);
    AppendImpl(std::forward<Args>(args)...);
    maybe_compress();

Lets talk about Private `AppendImpl()`:

The private AppendImpl() function also contains asserts to match the original behavior. Your initial implementation only allowed for one string_view, or one uint8_t, or multiple uint8_t values in the inclusive range 1 through 6. The new AppendImpl() checks these conditions during compile time

The compile time assert logic simply says if every parameter in the parameter pack is of the type string_view lets also make sure that there is only one parameter in that pack. If this fails it jumps to the next section which says if every parameter is of the type uint8_t lets make sure we have at least one parameter in the pack and at most six.

Here is a brief example of what happens at compile time but you could just take a look at my unit tests

.Append("Hello");                   // valid
.Append("Hello", "World");          // compile time failure AppendImpl() only allows 1 string param
.Append(7);                         // compile time failure 7 defaults to int
.Append(7.5);                       // compile time failure 7.5 defaults to double
.Append(static_cast<uint8_t> (7))   // valid 
.Append(static_cast<uint8_t> (7),  static_cast<uint8_t>(66))   // valid arguments in range 1-6

// compile time failure AppendImpl only allows a pack size in the range(1-6) for uint8_t... 7 were provided
.Append(static_cast<uint8_t>(65), static_cast<uint8_t>(66), static_cast<uint8_t>(67),
            static_cast<uint8_t>(68), static_cast<uint8_t>(69), static_cast<uint8_t>(70),
        static_cast<uint8_t>(71));

Improvement 2: Performance

I also made a performance fix:

For implicit string_view types, you were previously using std::copy(s.begin(), s.end(), std::back_inserter(cur_));. I’ve changed this to -> (cur_.insert(cur_.end(), std::forward<Args>(args).begin(), std::forward<Args>(args).end()), ...);

This is more efficient because std::back_inserter() defaults to calling push_back(). So for every byte in your string glibCC will do the following:
- 1: Check the internal size of the vector
- 2: If the internal size allows for another byte we will add it and return
- 3: Otherwise ask the Heap Manager aka GlibCC for memory; (If the heap manager does not have the memory in its memory pool, your process will have to switch to kernel mode and map in new memory this is slow and its especially slow if it calls sbrk)
Step 3 will not happen becuase you called Reserve() for cur_ on construction, but step 1 and step 2 are still occurring for each byte in the string. By using insert() we will just check the internal size once and then insert all the bytes if possible.
For the sequential push_back() calls that were occurring in the original Append() when the types were uint8_t, I replaced them with a fold expression:
```
(cur_.push_back(std::forward<Args>(args)), ...);
```
Ideally, this should be done with an initializer list. You could make sure cur has the internal capacity and then insert all the elements at once. This again would avoid the repetitive nature of going through glibcc push_back() call multiple times.

Here are some profiling images, as we increase the amount of bytes we want to copy the benefits of insert() show.

In the first image the trade off is negligible because the amount of bytes you are copying is 5 which is a small amount. The GLibC Insert() implementation has a bit more initial work to do because of the iterators passed in, but as the copy amount increases, the initial performance cost will be offset.

Add Another Unit Test CompressedBufferImprovements

copperlight · 2024-12-16T17:40:02Z

Hello, Everett, my name is Matthew, and I am maintaining this project. 👋🏻

There are a number of things that I like about this contribution:

Demonstrates initiative and good working knowledge of C++.
You found a place to make a useful and targeted improvement in a project with ~10K LOC, which reduces code duplication and offers a performance improvement, as demonstrated with benchmarking.
Your explanation of the change demonstrates excellent communication skills.

Thank you!

Compressed_Buffer_Improvements

f0bcea1

Add Another Unit Test CompressedBufferImprovements

copperlight merged commit 35f2dce into Netflix-Skunkworks:main Dec 16, 2024
1 check passed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Compressed_Buffer_Improvements #119

Compressed_Buffer_Improvements #119

ecbadeaux commented Dec 16, 2024 •

edited

Loading

copperlight commented Dec 16, 2024

Compressed_Buffer_Improvements #119

Compressed_Buffer_Improvements #119

Conversation

ecbadeaux commented Dec 16, 2024 • edited Loading

Intro:

Improvement 1: Simplifying Function Overloads

Lets talk about Append():

Lets talk about Private AppendImpl():

Improvement 2: Performance

copperlight commented Dec 16, 2024

ecbadeaux commented Dec 16, 2024 •

edited

Loading

Lets talk about `Append()`:

Lets talk about Private `AppendImpl()`: