Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

WIP fix: convert strings to boost:flyweight #1169

Draft
wants to merge 2 commits into
base: master
Choose a base branch
from

Conversation

vsoch
Copy link
Member

@vsoch vsoch commented Apr 12, 2024

Problem: string comparison is immensely inefficient, taking 6-10% of resources (reported by Tom) for a trace in the scoring module..
Solution: start with the root of the issue in the scoring module and work backwards to convert string types to boost:: flyweight. I tried to have a minimal footprint here but it spread really quickly

Background: boost::flyweight is a cool type that essentially maps a string to an internal hashmap. @trws chose it because it still works with a lot of native string stuff. The large change was converting the subsystem_t to use it, which bubbles into many places. Design decisions we will need to make:

  • Changes to top level interfaces vs. parsing the strings as flyweights (as I do now)
  • Creating classes for groups of flyweights (e.g., subsystem, relation) unlike the flattened versions I have
  • When to do a conversion vs. a .get()

There is a lot of room for improvement, but I'm really new to C++ so I'm trying to keep this as simple as possible to start, and first getting something working, and then make it better. The build currently works, but tests fail.

Opening this as a draft because I need help with debugging tests - we have segfaults that need gdb-fu and I've only used -disas - ping @trws when you have time (or others that might be interested - I mostly need to learn how to do it).

Also this is how I feel about C++ strings... 😆 (sorry still laughing about Monk / this entire find)! Gotta make the painful things fun, I think.

@vsoch
Copy link
Member Author

vsoch commented Apr 12, 2024

Some tiny progress! Thanks to @milroy for seeing this. Here is the first failure to build (this is for the focal build)

 /usr/include/boost/flyweight/detail/recursive_lw_mutex.hpp:57: undefined reference to `pthread_mutexattr_init'

We are missing Threads! 🧵 No, not the Facebook chat app - nobody wants that kind of threads. :P

image

So I very simply added it to the CMakeLists.txt, and specifically to check that it was supported in the top level one, and then as a library that is needed for linking for the resource public library. That seemed to get through that first failure (this is the same focal build), which now gets through that, albeit the tests still fail:

image

So probably I need to figure out how to use valgrind to debug test failures next.

@trws
Copy link
Member

trws commented Apr 12, 2024

Ah, it must default to using the mutex protection for flyweights. If we only use them in the resource module, we can use the nolock one to not need that, but if we use them in more places it will be useful.

Problem: string comparison is immensely inefficient, taking 6-10%
of resources (reported by Tom) for a trace.
Solution: start with the root of the issue in the scoring
module and work backwards to convert string types to boost::
flyweight. I tried to have a minimal footprint here but it
spread really quickly

Signed-off-by: vsoch <[email protected]>
@vsoch
Copy link
Member Author

vsoch commented Apr 13, 2024

I wonder if the equality check is just failing because we are still using strings (and comparing different things) in the tests?

infra6->x_checker = planner_new (0, INT64_MAX, resource_total, resource_type);
infra6->subplans["containment"] =
planner_multi_new (0, INT64_MAX, resource_totals, resource_types, len);
infra6->subplans["network"] =
planner_multi_new (0, INT64_MAX, resource_totals, resource_types, len);
bo = (bo || (*infra6 == *infra1));
ok (!bo, "pool_infra_t initialized with different colors not equal to original pool_infra_t");
infra7 = new pool_infra_t ();
infra7->tags[0] = 0;
infra7->x_spans[0] = 0;
infra7->job2span[0] = 0;
infra7->colors["containment"] = 0;
infra7->colors["network"] = 1;
infra7->x_checker = planner_new (0, INT64_MAX, resource_total, resource_type);
infra7->subplans["containment"] =
Will try some things after dinner!

The only difference between test 5 (passes) and 6 (doesn't passes) is that one has colors for network (5) and the other for storage (6)

infra5->colors["network"] = 1;
infra6->colors["storage"] = 1;

and the one being compared to matches the first:

infra1->colors["network"] = 1;

so if the previous test passing meant that when the colors indices are different the objects should be considered the same (passing) that suggests something about the comparison now, using flyweight, is determining they are different. This is what is failing:

bo = (bo || (*infra6 == *infra1));
ok (!bo, "pool_infra_t initialized with different colors not equal to original pool_infra_t");

so both the conditions (bo || (*infra6 == *infra1)) must evaluate to false to deem !bo as true and trigger the error (if I'm reading that right) so we need to understand how flyweights (the string in there) are being compared, and (I'm wondering) why they were not before (given different strings, storage vs network). Oh actually @trws showed me this yesterday! This is how you write a custom comparison function for ==...

bool operator== (const pool_infra_t &o) const;
and that's being checked with == here...

hmm if under the hood we are comparing a map with subsystem_t and before it was string, now is flyweight, I guess I'm not sure why having a map with two different strings (network or storage) should be equal in the first place. Maybe I'm misreading the test.

@trws
Copy link
Member

trws commented Apr 13, 2024 via email

@vsoch
Copy link
Member Author

vsoch commented Apr 13, 2024

Yeah I think I'm reading the test wrong too, I think "ok" is from this tap thing:

#define ok(...) ok_at_loc(__FILE__, __LINE__, __VA_ARGS__, NULL)

@vsoch
Copy link
Member Author

vsoch commented Apr 13, 2024

That’s one of the tests that doesn’t run mod main. So it’s comparing uninitialized flyweights.

It's strange that doesn't throw an error.

What about if we try an init() somewhere? I'm reading here: https://live.boost.org/doc/libs/1_74_0/libs/flyweight/doc/tutorial/technical.html

static flyweight<std::string>::initializer  fwinit;

@vsoch
Copy link
Member Author

vsoch commented Apr 13, 2024

Albeit the design needs some work, I wanted to test that doing an init for flywheel in the "one off" places (resource query utiliity and tests) would resolve those tests (and I think it did).
image

@vsoch
Copy link
Member Author

vsoch commented Apr 13, 2024

I don't know how to debug this more because my local run doesn't reproduce what happens here:

image

I also have jammy, so not sure why it would be different. But I can't compare apples to oranges.

@vsoch
Copy link
Member Author

vsoch commented Apr 13, 2024

There is also something called a key value flyweight, I wonder if we need to use that for some of the subsystem maps? https://www.boost.org/doc/libs/1_79_0/libs/flyweight/example/key_value.cpp. I also don't know the difference between when they show:

boost::flyweights::flyweight

vs. what we are doing, which is missing the middle level:

boost::flyweight

It could be the first is just an old design - I don't see it here: https://www.boost.org/doc/libs/1_79_0/libs/flyweight/doc/reference/index.html. They also have a note about the threads library there:

In order to use the serialization capabilities of Boost.Flyweight, the appropriate Boost.Serialization library module must be linked. Other than that, Boost.Flyweight is a header-only library, requiring no additional object modules. The default simple_locking locking policy introduces a dependency on the Pthreads library on those POSIX-compliant environments where the Boost.Config macro BOOST_HAS_PTHREADS is set.

Maybe we want a holder too?

intermodule_holder_class maintains a C instance which is unique even across different dynamically linked modules of the program using this same type. In general, this guarantee is not provided by static_holder_class, as most C++ implementations are not able to merge duplicates of static variables stored in different dynamic modules of a program.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants