Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Optimize some memory copies away. #187

Merged
merged 1 commit into from
Feb 28, 2024
Merged

Conversation

plietar
Copy link
Member

@plietar plietar commented Feb 27, 2024

Couple of hotspots found while running malariasimulation with larger population size. This PR gives a 10% speedup when using population size of 1M.

The most suprising change comes from a modification to the signature of Rcpp-exported functions. Some of them used to accept a const std::vector as an argument. However when doing so Rcpp creates an extra unexpected copy in the wrapping code. We end up with two copies, once from R memory to an std::vector in the generated code, and a second one when the method is called. If argument is non-const, Rcpp does not create the intermediate object and the copy is avoided.

This behaviour comes from the difference between the InputParameter and ConstInputParameter classes in Rcpp's InputParameter.h file.

Other changes include:

  • Return a constant reference from NumericVariable::get_values, instead of a copy.
  • Use std::move inside NumericVariable::queue_update instead of a copy.
  • Use reserve and push_back on a vector instead of filling it with zeros to avoid an unnecessary memset.

With this, most operations that work with values require only one copy, where they used to require two or three.

There are a couple more places that could be optimised further by working directly with R vectors instead of std::vector, but they are more intrusive changes and don't appear as significantly in malariasimulation profiles anyway. I've left these out for now.

Couple of hotspots found while running malariasimulation with larger
population size. This PR gives a 10% speedup when using population size
of 1M.

The most suprising change comes from a modification to the signature of
Rcpp-exported functions. Some of them used to accept a `const
std::vector` as an argument. However when doing so Rcpp creates an extra
unexpected copy in the wrapping code. We end up with two copies, once
from R memory to an std::vector in the generated code, and a second one
when the method is called. If argument is non-const, Rcpp does not
create the intermediate object and the copy is avoided.

This behaviour comes from the difference between the `InputParameter`
and `ConstInputParameter` classes in Rcpp's [InputParameter.h] file.

Other changes include:
- Return a constant reference from `NumericVariable::get_values`,
  instead of a copy.
- Use `std::move` inside `NumericVariable::queue_update` instead of a
  copy.
- Use `reserve` and `push_back` on a vector instead of filling it with
  zeros to avoid an unnecessary memset.

With this, most operations that work with values require only one copy,
where they used to require two or three.

There are a couple more places that could be optimised further by
working directly with R vectors instead of `std::vector`, but they are
more intrusive changes and don't appear as significantly in
malariasimulation profiles anyway. I've left these out for now.

[InputParameter.h]: https://github.com/RcppCore/Rcpp/blob/c63bae6dea4e3994e6f334612c7837fa18ac1e06/inst/include/Rcpp/InputParameter.h
Copy link

codecov bot commented Feb 27, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 96.26%. Comparing base (c9cdee3) to head (f3a8d62).
Report is 14 commits behind head on dev.

Additional details and impacted files
@@            Coverage Diff             @@
##              dev     #187      +/-   ##
==========================================
- Coverage   96.28%   96.26%   -0.02%     
==========================================
  Files          36       36              
  Lines        1722     1823     +101     
==========================================
+ Hits         1658     1755      +97     
- Misses         64       68       +4     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

Copy link
Member

@giovannic giovannic left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Very surprising!

If Rcpp becomes a big bottleneck, it may be worth considering cpp11. I haven't seen any execution benchmarks, but suspect the modern implementation may not have the same issues.

@plietar plietar merged commit 721837f into mrc-ide:dev Feb 28, 2024
8 checks passed
plietar added a commit that referenced this pull request Sep 6, 2024
On multiple occasions, I have made changes to individual that
inadvertently broke the ABI of the native code. Unfortunately, after the
new version of individual was released and users installed it, the R
tooling wasn't able to detect this situation and did not automatically
force malariasimulation to be re-installed or re-compiled. As a result,
users were using a malariasimulation shared library that was built
against individual 0.X together with an individual shared library at
version 0.(X+1). This causes very subtle and hard to diagnose bugs,
where one piece of code might read from the wrong place.

For example, #187 changed a return type of
`NumericVariable::get_values` from a prvalue to a const-reference,
avoiding unnecessary copies. For most intents and purposes, this is a
API-compatible change, but the ABI of the method is different. After the
update, if a user was still using a installation of malariasimulation
that expected the old signature, the code would be compiled expecting to
receive an `std::vector` by value (ie. three words representing the
pointer to the data, the length and the capacity) instead received a
pointer to such a vector.

Another example is #201, which removed the `num_bits`
field and made it a compile-time constant. Code which used `num_bits`
was now reading from an unintialized piece of memory instead. While the
`num_bits` field was private, header-file code from the individual
package that reads the field can make its way into the malariasimulation
shared library and embed the assumptions about the layout of the class.
Given the heavy use of templates in the package, almost all of it is
re-instantiated and included in the downstream shared library.

In all these cases, recompiling malariasimulation is sufficient to
ensure it uses the new ABI of the individual package. If the user uses
`devtools`, then this simply requires them to call
`devtools::clean_dll`. The issue is that identifying the problem is near
impossible for a user. Nothing in the behaviour of malariasimulation
points to an incompatibility, instead the package only misbehaves in
very suprising ways.

The goal of this change is to export the version of the individual
package in its C++ headers. The constant may then be included and
compiled into the malariasimulation shared object. At load-time,
malariasimulation can compare this version with the result of
`packageVersion("individual")` and show a warning or an error if they
don't match. The user will be directed to re-install malariasimulation.

Some alternatives I considered instead:
1. Don't make ABI breaking changes to individual: I think this is a
   non-starter. We do not want to restrict our abilities to improve and
   optimise the package. Moreover identifying what is or isn't a breaking
   change is quite challenging.
2. Stop exposing a C++ API, and require users to call the R functions
   instead: allowing users to write high-performant processes or utility
   functions for the hot path of their models is quite an important
   aspect of individual.
3. Break the ABI but announce it in release notes so users know to be on
   the look out and to re-compile their copy of malariasimulation: no one
   reads release notes, especially for a package they only use
   indirectly. Moreover it still requires identifying ABI breaking changes.
4. Implement the suggested scheme, but use an ABI version instead of the
   package version. This will avoid false positives, where the package's
   version number is increased even though the ABI hasn't changed. I
   think the rate of releases and the cost of recompiling malariasimulation
   are low enough that it is better to err on the safe side and treat
   every release as if it could be ABI breaking.
plietar added a commit that referenced this pull request Sep 6, 2024
On multiple occasions, I have made changes to individual that
inadvertently broke the ABI of the native code. Unfortunately, after the
new version of individual was released and users installed it, the R
tooling wasn't able to detect this situation and did not automatically
force malariasimulation to be re-installed or re-compiled. As a result,
users were using a malariasimulation shared library that was built
against individual 0.X together with an individual shared library at
version 0.(X+1). This causes very subtle and hard to diagnose bugs,
where one piece of code might read from the wrong place.

For example, #187 changed a return type of
`NumericVariable::get_values` from a prvalue to a const-reference,
avoiding unnecessary copies. For most intents and purposes, this is a
API-compatible change, but the ABI of the method is different. After the
update, if a user was still using a installation of malariasimulation
that expected the old signature, the code would be compiled expecting to
receive an `std::vector` by value (ie. three words representing the
pointer to the data, the length and the capacity) instead received a
pointer to such a vector.

Another example is #201, which removed the `num_bits`
field and made it a compile-time constant. Code which used `num_bits`
was now reading from an unintialized piece of memory instead. While the
`num_bits` field was private, header-file code from the individual
package that reads the field can make its way into the malariasimulation
shared library and embed the assumptions about the layout of the class.
Given the heavy use of templates in the package, almost all of it is
re-instantiated and included in the downstream shared library.

In all these cases, recompiling malariasimulation is sufficient to
ensure it uses the new ABI of the individual package. If the user uses
`devtools`, then this simply requires them to call
`devtools::clean_dll`. The issue is that identifying the problem is near
impossible for a user. Nothing in the behaviour of malariasimulation
points to an incompatibility, instead the package only misbehaves in
very suprising ways.

The goal of this change is to export the version of the individual
package in its C++ headers. The constant may then be included and
compiled into the malariasimulation shared object. At load-time,
malariasimulation can compare this version with the result of
`packageVersion("individual")` and show a warning or an error if they
don't match. The user will be directed to re-install malariasimulation.

I could not figure out a way of exporting the version from `DESCRIPTION`
into header files. It's possibly something `Rcpp::compileAttributes()`
could do, but it doesn't. Instead the version number is duplicated and a
CI check ensures that they are kept in sync.

Some alternatives I considered instead:
1. Don't make ABI breaking changes to individual: I think this is a
   non-starter. We do not want to restrict our abilities to improve and
   optimise the package. Moreover identifying what is or isn't a breaking
   change is quite challenging.
2. Stop exposing a C++ API, and require users to call the R functions
   instead: allowing users to write high-performant processes or utility
   functions for the hot path of their models is quite an important
   aspect of individual.
3. Break the ABI but announce it in release notes so users know to be on
   the look out and to re-compile their copy of malariasimulation: no one
   reads release notes, especially for a package they only use
   indirectly. Moreover it still requires identifying ABI breaking changes.
4. Implement the suggested scheme, but use an ABI version instead of the
   package version. This will avoid false positives, where the package's
   version number is increased even though the ABI hasn't changed. I
   think the rate of releases and the cost of recompiling malariasimulation
   are low enough that it is better to err on the safe side and treat
   every release as if it could be ABI breaking.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants