
Improve ValuePlug::getValue() performance #5362

Merged — 8 commits, Jul 11, 2023

Conversation

johnhaddon (Member)

I'll soon be opening a PR to fix #1971, but it comes at a small runtime cost that shows up as a few percent overhead in ValuePlugTest.testCacheOverhead. This PR provides some substantial performance improvements targeted at that test so that overall we will still come out well on top. I'm seeing the following runtime reductions :

  • ValuePlugTest.testCacheOverhead : ~23%
  • ValuePlugTest.testContentionForOneItem : ~50%

I did also investigate reducing the number of thread-local lookups by smooshing a bunch of disparate thread locals into one, as we had discussed. While it had a small benefit performance-wise it was a mess from a maintenance perspective, and didn't produce anything near the improvements from this PR. I don't intend to revisit that until we're scraping the bottom of the barrel.

I should follow the example of d608cb0 and inline the getValue() methods for the other plug types, but the question there is "where to put the inlined methods?". We already have .inl files for things like TypedPlug, but they're only intended for inclusion from select .cpp files. I think they should probably be renamed with some other suffix, but what?

@johnhaddon self-assigned this Jun 27, 2023
@danieldresser-ie (Contributor) left a comment

Those are some very nice wins.

> I should follow the example of d608cb0 and inline the getValue() methods for the other plug types, but the question there is "where to put the inlined methods?". We already have .inl files for things like TypedPlug, but they're only intended for inclusion from select .cpp files. I think they should probably be renamed with some other suffix, but what?

The wording here was initially ambiguous to me, but looking at it: the standard convention is that *.inl is included in the corresponding .h ( about 40 of 50 current examples ). So, the outliers are .inl that aren't intended to be included in the .h, and they're what needs renaming? I guess the closest to our other conventions would be to call it Private/TypedPlug.inl, but maybe it would be confusing to have both TypedPlug.inl and Private/TypedPlug.inl?

```cpp
	);
	IECore::msg( IECore::Msg::Error, "ValuePlug::getObjectValue", error );
	throw IECore::Exception( error );
}
```
danieldresser-ie (Contributor)

It's been quite a while, and I don't clearly remember all this stuff - but it sounds like this warning was put here quite explicitly to help debug something nasty ... removing it doesn't seem related to this PR, but maybe you're aware of some reason why this logic no longer applies?

johnhaddon (Member Author)

I traced this back to the original PR, but it still wasn't completely clear to me why this was done in the first place. One thing was clear - there was a segfault in the unexpected value == nullptr case that was being fixed in addition to the message being added. I've kept that using the ternary in the format call. The other thing that was clear in the original PR was that the cause of the value == nullptr has been fixed elsewhere and has its own test coverage (at the TaskMutex level). So I don't think this message is any more likely to be useful than a belt-and-braces message anywhere else we throw an exception, and since we don't do that as a matter of course, I don't really see a reason to keep it here.

danieldresser-ie (Contributor)

OK, I don't feel too strongly about it.

@johnhaddon (Member Author)

> So, the outliers are .inl that aren't intended to be included in the .h, and they're what needs renaming? I guess the closest to our other conventions would be to call it Private/TypedPlug.inl, but maybe it would be confusing to have both TypedPlug.inl and Private/TypedPlug.inl?

Yes. The convention is that .inl contains stuff that we want to be inlineable, but don't want to clutter up the .h with. But the TypedPlug.inl isn't used like that - it's just included in a single compilation unit (TypedPlug.cpp etc) where we instantiate the template explicitly. It's not really private, because the reason it's in a header rather than just the .cpp is so that other modules can instantiate the template for their own types (for instance, GafferImage::FormatPlug). So I think maybe a different extension might make sense - I'm not sure there's any established convention for this though - we could use .impl maybe?

johnhaddon added a commit to johnhaddon/gaffer that referenced this pull request Jun 28, 2023
…ionScopes

This means calling `ValuePlug::dirty()` before the scope closes, so that the hash cache is invalidated before the compute is performed.

This was first requested way back in 2017 in issue GafferHQ#1971, where the use case was edit/compute loops in the PythonEditor. It has become a high priority more recently though, as it turned out we were actually interleaving edits and computes in the TransformTools, and noticed incorrect results while working on the LightTool in GafferHQ#5218. The problem was as follows, and is demonstrated in `testMultipleSelectionWithEditScope` :

1. We push a dirty propagation scope using UndoScope.
2. We create a transform edit for `/cube1`, which dirties some plugs, incrementing their dirty count.
3. We create a transform edit for `/cube`, but as part of doing that we need to compute the _current_ transform. This "bakes" that transform into the (hash) cache, keyed by the current dirty count.
4. We propagate dirtiness for this new transform edit, but don't increment the dirty count again, because we've already visited those plugs in step 2.
5. After closing the scope, we query the transform for `/cube` and end up with the cached value from step 3. This is incorrect, because it isn't accounting for the edits made subsequently.

The problem is demonstrated even more simply by `ComputeNodeTest.testInterleavedEditsAndComputes`.

There is an overhead to calling `flushDirtyPropagationScope()` for each `getValue()` call, and it is measurable in `ValuePlugTest.testCacheOverhead()`. But taking into account the performance improvements made in GafferHQ#5362, we still come out with an 18% reduction in runtime even with the overhead.

Fixes GafferHQ#1971
johnhaddon added a commit to johnhaddon/gaffer that referenced this pull request Jun 28, 2023
@johnhaddon (Member Author)

> Those are some very nice wins.

I think there's another reference counting win that might be a little bit more faffy, but could be worthwhile. In getValue() and getObjectValue() we need to own a reference to values that come out of the compute cache, because they could be evicted at any moment. But static values that come from ValuePlug::m_value are guaranteed to stay alive as long as we need them to, so we could completely avoid incrementing their reference count. We'd need getObjectValue() to return some sort of const T *, T::ConstPtr pair, with the second item possibly being null, but I think the wins are probably worth it. I'm thinking cases where lots of threads are computing the hash on a node that largely has static input values (shader networks especially). Might be worth a look at some point?

@danieldresser-ie (Contributor)

Since it's a fairly special case, I guess another option would be rather than using a new file extension, to just call it TypedPlugImplementation.h

But I don't mind ".impl" if that's what you want.

And yeah, skipping reference counting for static values does sound interesting.

@johnhaddon (Member Author) commented Jul 3, 2023

> I guess another option would be rather than using a new file extension, to just call it TypedPlugImplementation.h

That's much better - that's what I've gone with in 83f78d8, allowing me to inline getValue() in 6c5e585.

@danieldresser-ie (Contributor) left a comment

LGTM, aside from two typos.

SConstruct (Outdated)

```diff
@@ -1090,10 +1090,10 @@ libraries = {

 	"GafferSceneUI" : {
 		"envAppends" : {
-			"LIBS" : [ "Gaffer", "GafferUI", "GafferImage", "GafferImageUI", "GafferScene", "Iex$IMATH_LIB_SUFFIX", "IECoreGL$CORTEX_LIB_SUFFIX", "IECoreImage$CORTEX_LIB_SUFFIX", "IECoreScene$CORTEX_LIB_SUFFIX" ],
+			"LIBS" : [ "Gaffer", "GafferUI", "GafferImage", "GafferImageUI", "GafferScene", "GafferImage", "Iex$IMATH_LIB_SUFFIX", "IECoreGL$CORTEX_LIB_SUFFIX", "IECoreImage$CORTEX_LIB_SUFFIX", "IECoreScene$CORTEX_LIB_SUFFIX" ],
```
danieldresser-ie (Contributor)

"GafferImage" was already listed here - looks like it's here twice now?

johnhaddon (Member Author), Jul 11, 2023

Fixed in 0945ec9.

```cpp
template<>
IECore::MurmurHash GafferImage::AtomicFormatPlug::hash() const;

} // namespace Gaffer
```
danieldresser-ie (Contributor)

GitHub is now flagging a missing newline at EOF here - I don't think we should be doing that, though the whitespace checker isn't complaining about it.

johnhaddon (Member Author), Jul 11, 2023

Also fixed in 0945ec9.

This produces the following reductions in runtime :

- ValuePlugTest.testCacheOverhead : 7%
- ValuePlugTest.testContentionForOneItem : 25%

The common case is that a plug has no input, so it's worth avoiding the overhead of calling `typeId()`. This knocks ~2% off the runtime of `ValuePlugTest.testCacheOverhead`.

This produces the following reductions in runtime :

- ValuePlugTest.testCacheOverhead : 5%
- ValuePlugTest.testContentionForOneItem : 13%

This produces the following reductions in runtime :

- ValuePlugTest.testCacheOverhead : 7%
- ValuePlugTest.testContentionForOneItem : 20%

This produces the following reductions in runtime :

- ValuePlugTest.testCacheOverhead : 5%
- ValuePlugTest.testContentionForOneItem : 1%

We use `.inl` files to store inline implementation that we want the compiler to see, but which we don't want to clutter up the header with, so the headers are more human-readable. These files don't match that pattern - they're to be included in a single `.cpp` file and used to explicitly instantiate the template.

This also means declaring AtomicFormatPlug's specialisations publicly so that we don't get the publicly visible default `getValue()` implementation being compiled instead. And the part I don't understand : we need to link `libGafferImage` to far more libraries now, to avoid undefined symbols related to AtomicFormatPlug with MSVC. It seems that the mere existence of the inlined `getValue()` - even though it is superseded by the specialisation - is enough to make MSVC greedily do a bunch of implicit template instantiations that it wasn't doing before, and which ultimately depend on `GafferImage::Format`.
@johnhaddon merged commit 8e909c8 into GafferHQ:main Jul 11, 2023
4 checks passed
@johnhaddon deleted the getValueOptimisationPR branch July 11, 2023 09:56
johnhaddon added a commit to johnhaddon/gaffer that referenced this pull request Aug 15, 2023
Successfully merging this pull request may close these issues: Can't interleave multiple edits/computes within an UndoContext (#1971)