serialization performance improvements #361

reicheratwork · 2023-01-10T16:53:13Z

These improvements could solve part of issue #358

Further performance improvements by:

Expanding the buffer in place while writing
Not checking the completeness of the written struct (as it is already complete as far as we know)

@wjbbupt : could you give this a look?

zijian-teng · 2023-02-10T01:51:21Z

Tested on my Ubuntu 18.04 in VirtualBox( 6 cores of Intel(R) Core(TM) i7-10750H CPU @ 2.60GHz, 12G/32G RAM):
Testing with the CXX throughput example ./cxxThroughputPublisher 68 0 1 0, and received with C throughput example ./ThroughputSubscriber

2 commits regarding removing completeness check did improve the throughput about 10%
72a1afc has almost no impact on the throughput
af5e2af decrease the perfomance by ~10% unfortunately on my environment :(

reicheratwork · 2023-02-17T16:20:03Z

Tested on my Ubuntu 18.04 in VirtualBox( 6 cores of Intel(R) Core(TM) i7-10750H CPU @ 2.60GHz, 12G/32G RAM): Testing with the CXX throughput example ./cxxThroughputPublisher 68 0 1 0, and received with C throughput example ./ThroughputSubscriber

2 commits regarding removing completeness check did improve the throughput about 10%

72a1afc has almost no impact on the throughput

af5e2af decrease the perfomance by ~10% unfortunately on my environment :(

@tez1szh Sorry to hear that these changes did not result in the performance gains we had hoped for.

All changes were made by running the throughput example in valgrind with the callgrind option set, and then examining the points in the serialization functions which were of most influence on the performance
[72a1afc] is a more efficient copy into a sequence of simple types (char, long, ...) as this skips the in place default construction of these types and immediately copies the data into the memory, if this has no negative effect on performance, i would say, leave it, as it will have an effect for large sequences of simple types
[af5e2af] was an attempt to see what the effect of skipping the step of determining the serialized size of the entity and replacing it with dynamically increasing the serialization buffer size, while this may perform worse in your case, in other cases, such as deeply nested structures, the penalty of copying the buffer on an increase may be smaller than basically doing 2 serializations, where 1 just skips the actual data copying step, maybe this is something that could be set in a QoS setting, or on a DataWriter or something?

I will have a further look at this

Removed the completeness checks on structs being written, as this is nonsensical for writing, as the compiler will ensure that all members that are mandatory (such as key members) are present in the struct passed for writing For the reading step, this validation will still need to be done as the data here does not originate from the structs being read into Signed-off-by: Martijn Reicher <[email protected]>

Because std::vector::resize adds an unnecessary initialization step this performs worse than using assign Signed-off-by: Martijn Reicher <[email protected]>

In read operations, the completeness checking can be skipped for final extensibility structs Signed-off-by: Martijn Reicher <[email protected]>

reicheratwork · 2023-05-04T07:38:40Z

Did some rebasing and reorganising of the commits, I think it is ready for review now.

eboasson

It makes sense to me, but I do have a few questions, likely out of ignorance ...

eboasson · 2023-05-12T08:46:08Z

src/ddscxx/include/org/eclipse/cyclonedds/core/cdr/cdr_stream.hpp

+     * Depending on the implementation and mode header length fields may be completed.
+     * This function can be overridden in cdr streaming implementations.
+     *
+     * @param[in] props Properties of the member to finish.


So presumably the _unchecked refers to it not outputting a set of members that were completed successfully (the member_ids) in the original one. I presume that for both functions the return value is false iff some failure occurred, and that the nice bit of the non-unchecked version is that you get more detailed information on what actually happened.

Ok, that makes sense.

What I do wonder is if it would not make more sense to overload them? If the only difference is getting some additional information out, then it seems like that would more elegant and introduce less risk of head-scratching.

Would you agree?

The _unchecked versions of the finish_member and finish_struct functions do not check the member or struct for completeness (whether all members are present) when doing a streaming operation.
These functions do the streaming steps when finishing a struct or a member, like reading/inserting headers etc.
The check for completeness is not necessary, unless reading a non-final extensibility struct, since there we have no control over the set of members present.
This check uses std::set as a container of the member ids that were read, and this was a significant performance drain, and therefore was split out for those datatype+streaming operation combinations that needed it.

Overloading is indeed an option, i will look in to this, this should also simplify the code generator.
Don't know why I overlooked this when writing this, probably went into this with a little too much C code attitude.

eboasson · 2023-05-12T08:47:42Z

src/ddscxx/include/org/eclipse/cyclonedds/core/cdr/cdr_stream.hpp

+     *
+     * @return Whether the struct is complete and correct.
+     */
+    virtual bool finish_struct_unchecked(const entity_properties_t &props) {(void)props; return true;}


Similar reasoning to the finish_member_unchecked applies here (perhaps there are other places as well, I won't repeat this if there are, that'd be silly).

I am cool with being wrong, mind 🙂

Yep, the explanation is included in the previous comment.

eboasson · 2023-05-12T08:57:07Z

src/ddscxx/src/org/eclipse/cyclonedds/core/cdr/extended_cdr_v1_ser.cpp

@@ -55,7 +55,7 @@ bool xcdr_v1_stream::start_member(const entity_properties_t &prop, bool is_set)
  return cdr_stream::start_member(prop, is_set);
 }

-bool xcdr_v1_stream::finish_member(const entity_properties_t &prop, member_id_set &member_ids, bool is_set)


Am I correct that finish_member is no longer needed because cdr_stream::finish_member takes care of it?

Yes, the functionality which was duplicated here is now absorbed into the base class's version of the function.

eboasson · 2023-05-12T08:59:13Z

src/ddscxx/src/org/eclipse/cyclonedds/core/cdr/extended_cdr_v1_ser.cpp

@@ -300,11 +300,8 @@ bool xcdr_v1_stream::finish_struct(const entity_properties_t &props, const membe
      if (list_necessary(props))
        return move_final_list_entry();
      break;
-    case stream_mode::read:
-      return check_struct_completeness(props, member_ids);


I can't yet figure out how check_struct_completeness role is now being fulfilled (also applies to xcdr_v2_stream, obviously).

reicheratwork mentioned this pull request Jan 10, 2023

c++ performance vs c performance #358

Open

reicheratwork force-pushed the performance_improvements branch from 797af1d to de4843d Compare January 26, 2023 09:12

reicheratwork added 3 commits May 3, 2023 11:48

Using std::vector::assign in stead of resize + memcpy

976f9a3

Because std::vector::resize adds an unnecessary initialization step this performs worse than using assign Signed-off-by: Martijn Reicher <[email protected]>

Remove struct completeness checking for final extensibility

1c27b67

In read operations, the completeness checking can be skipped for final extensibility structs Signed-off-by: Martijn Reicher <[email protected]>

reicheratwork force-pushed the performance_improvements branch from de4843d to 1c27b67 Compare May 3, 2023 15:23

reicheratwork marked this pull request as ready for review May 4, 2023 07:38

eboasson reviewed May 12, 2023

View reviewed changes

eboasson mentioned this pull request Jun 9, 2023

Throughput example receive many out of order samples #421

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

serialization performance improvements #361

serialization performance improvements #361

reicheratwork commented Jan 10, 2023

zijian-teng commented Feb 10, 2023

reicheratwork commented Feb 17, 2023

reicheratwork commented May 4, 2023

eboasson left a comment

eboasson May 12, 2023

reicheratwork May 12, 2023

eboasson May 12, 2023

reicheratwork May 12, 2023

eboasson May 12, 2023

reicheratwork May 12, 2023

eboasson May 12, 2023

serialization performance improvements #361

Are you sure you want to change the base?

serialization performance improvements #361

Conversation

reicheratwork commented Jan 10, 2023

zijian-teng commented Feb 10, 2023

reicheratwork commented Feb 17, 2023

reicheratwork commented May 4, 2023

eboasson left a comment

Choose a reason for hiding this comment

eboasson May 12, 2023

Choose a reason for hiding this comment

reicheratwork May 12, 2023

Choose a reason for hiding this comment

eboasson May 12, 2023

Choose a reason for hiding this comment

reicheratwork May 12, 2023

Choose a reason for hiding this comment

eboasson May 12, 2023

Choose a reason for hiding this comment

reicheratwork May 12, 2023

Choose a reason for hiding this comment

eboasson May 12, 2023

Choose a reason for hiding this comment