Fix double std::move on objects in 'lenses::getset' on set call #165

dimula73 · 2022-11-21T16:53:10Z

This is the final version of the patch that fixes both, double-move problem and extra getter call on set(lens,...) operation.

The patch ensures the following requirements:

When view(lens, whole) is called:
- the setter lambda is neither called nor constructed. It avoids unnecessary copies of the 'whole' object.
- the whole object is directly forwarded into the getter lambda, without any copies.
When set(lens, whole, part) is called:
- no getter is called at the toplevel lens (which value was previously discarded).
- on the lower lenses, the getter receives an 'lvalue' reference to the whole object, and setter receives an 'rvalue' reference to the whole object. It ensures that there is no double-move problem.

Fixes #160

Basically, we shouldn't calculate the getter when we know that the value will be dropped. The patch partially fixes arximboldi#160, because now the getter is not called at all during the set operation. But the double-move can still happen when both, getter and setter calls are performed.

error: 'auto' not allowed in function prototype

codecov-commenter · 2022-11-22T07:46:00Z

Codecov Report

Merging #165 (0b6ab3e) into master (24887ac) will increase coverage by 0.19%.
The diff coverage is 96.68%.

@@            Coverage Diff             @@
##           master     #165      +/-   ##
==========================================
+ Coverage   94.30%   94.50%   +0.19%     
==========================================
  Files          80       80              
  Lines        2616     2910     +294     
==========================================
+ Hits         2467     2750     +283     
- Misses        149      160      +11

Impacted Files	Coverage Δ
test/lenses.cpp	`98.31% <96.37%> (-1.69%)`	⬇️
lager/lenses.hpp	`100.00% <100.00%> (ø)`
lager/lenses/attr.hpp	`100.00% <100.00%> (ø)`
lager/lenses/unbox.hpp	`100.00% <100.00%> (ø)`
lager/context.hpp	`88.23% <0.00%> (-2.68%)`	⬇️
lager/lenses/optional.hpp	`92.30% <0.00%> (-1.03%)`	⬇️
lager/lenses/at.hpp	`100.00% <0.00%> (ø)`
lager/lenses/tuple.hpp	`100.00% <0.00%> (ø)`
test/type_erased_lens.cpp	`100.00% <0.00%> (ø)`

Help us with your feedback. Take ten seconds to tell us how you rate us. Have a feature suggestion? Share it here.

dimula73 · 2022-12-22T12:56:23Z

Hi, @arximboldi!

The patch should now be ready for the final revew and merge. It fixes all the known issues with use-after-move and adds a comprehensive set of tests for that :)

dimula73 · 2022-12-22T12:59:26Z

lager/lenses.hpp

@@ -103,15 +224,10 @@ namespace lenses {
 //! @{

 template <typename Getter, typename Setter>
-auto getset(Getter&& getter, Setter&& setter)
+auto getset(const Getter& getter, const Setter& setter)


Just a note: the change to const Getter& getter is not a regression, because they were previously bound to a const reference inside a const (immutable) lambda.

The patch ensures the following requirements: 1) When `view(lens, whole)` is called: * the setter lambda is neither called nor constructed. It avoid unnecessary copies of the 'whole' object. * the `whole` object is directly forwarded into the getter lambda, without any copies. 3) When `set(lens, whole, part)` is called: * no getter is called at the toplevel lens (which value was previously discarded). * on the lower lenses, the getter receives an 'lvalue' reference to the `whole` object, and setter receives an 'rvalue' reference to the `whole` object. It ensures that there is no double-move problem. Fixes arximboldi#160

When the moved-from object is assigned, it becomes valid again. None of our code seems to use it though.

Such lens causes: 1) Compilaion failure due to the missing const functor in getset_t (also fixed in the patch) 2) A crash caused by double-move in the attr implementation (cannot be fixed without rewriting attr into getset)

The previous implementation cased severe issues: 1) Double-move of `p` argument when passing a prvalue into a nested chain of two attr lenses, which usually causes a crash. 2) `f = LAGER_FWD(f)` creates a `const decltype(f) f` object inside the lambda, which prevents the value to be perfect- forwarded with `identity_functor` and `const_functor` when setting/fetching it.

dimula73 · 2022-12-23T11:57:19Z

lager/lenses/attr.hpp

-            });
-        };
-    });
+    return getset(


Hi, @arximboldi!

I had to rewrite attr into (the fixed version of) getset, because otherwise it also caused double-move when nested and passed with a prvalue. Basically, a code like that is enough to crash it:

auto birthday_lens = attr(&debug_person::birthday); auto person_lens = attr(&debug_freelancer::person); auto freelancer_birthday_lens = person_lens | birthday_lens; auto birthday = set(freelancer_birthday_lens, debug_freelancer(freelancer_copy_info, person_copy_info, birthday_copy_info), debug_yearday(birthday_copy_info));

I have a weird feeling like all the lenses in lager have to be rewritten like that... :( Not only they do double-move, capturing functors into immutable lambdas prevents proper forwarding of the identity functors in view and set.

The functor should be forwarded into the lens.

arximboldi · 2022-12-25T16:57:00Z

Hi @dimula73 !

Thank you a lot for looking into this... this is definitelly not an easy problem... Oh, and before we dive right in, happy holidays!!

I have some questions, firstly, you mention the view and set cases, what about the over case? In that case it sounds to me, the getter needs to be called.

Also, with regards to double moves... I can imagine some cases where double-moves are saves, for example, somewhere where what happens is something like this:

struct S { std::string a; };

...
auto x = S{"..."};
auto b = std::move(s).a;
auto y = std::move(x);
y.a = std::move(b);
return y;

But I get that what is happening in the following invalid scenario, correct?

...
auto x = S{"..."};
auto&& b = std::move(s).a;
auto y = std::move(x);
y.a = std::move(b);
return y;

There is however something that makes me a bit unease about just trying to apply all sorts of heuristics in that getset implementation... It looks very fiddly, and how can other lenses implementors then provide their own correct lenses in Van-Laarhoven style? Do you expect simply users to implement all their lenses in terms of getset instead of in Van-Laarhoven style?

I guess that's fair enough... at the end of the day, lenses implemented with getset are more intelligible for the general public. It has the problem though that then you need to capture the context twice instead of once (in the getter and setter)...

Which makes me wonder... should we just ditch the Van-Laarhoven representation, and use a simpler lenses definition of the form:

struct name_lens 
{
  decltype(auto) get(auto&& x) { 
    return x.name; 
  }
  auto set(auto x, auto&& y) { 
    x.name = FWD(y);
    return x; 
  }
};

The problem, of course, is lens composability: we could then not use zug::comp to compose them, but we would need to provide our own composition. Not very hard though with something like:

template <class L1, class L2>
struct composed_lens 
{
  L1 l1;
  L2 l2.
  decltype(auto) get(auto&& x) { 
    return l1.get(l2.get(FWD(x))); 
  }
  auto set(auto x, auto&& y) { 
    l1.set(x, l2.set(l1.get(x), FWD(y))));
    return x; 
  }
};

There are still some inefficiencies for some cases because basically we do not know which parts of are consumed by the nested lenses. Sometimes we could do moves that we are not doing if we know that the nested lenses is consuming things indepdendly to some of the other stuff. This can also be said about some of the cases for the Van-Laarhoven lens. Also in this case over may incur extra cost, but maybe in practice we use over less than the other lenses...

Anyway... long story to say that I'm not sure what to do about this. There is some elegance about the Van-Laarhoven lens that maybe it would be nice to preserve. I wonder if we could also improve things by changing the way we call into the functors instead... E.g. instead of passing the value directly into the functor on the "getter bit", we can pass a lambda that returns the value, basically we simulate lazy-evaluation by explicitly passing thunks, for example:

template <typename Member>
auto attr(Member member)
{
    return zug::comp([member](auto&& f) {
        return [&, f = LAGER_FWD(f)](auto&& p) {
            return f([&p] { return LAGER_FWD(p).*member; )})([&](auto&& x) {
                auto r    = LAGER_FWD(p);
                r.*member = LAGER_FWD(x);
                return r;
            });
        };
    });
}

...
// const_functor would not change, but identity_functor would:
template <typename T>
struct identity_functor
{
    T value;

    template <typename Fn>
    auto operator()(Fn&& f) &&
    {
        return make_identity_functor(
            std::forward<Fn>(f)() // extra () here!
              (std::forward<T>(value)));
    }
};

That solves the problem of not calling the getter too often when we are only interested in the setter part.

There is still the problem of being safe for moves without introducing too many intermediate copies. I guess to really do that safely in a general performant way we need to somehow annotate the lens to know wheter it's "independent" or not. E.g whether this is safe or not:

auto lens = ...;
auto whole =  ...;
auto part = view(lens, std::move(whole));
auto whole2 = set(lens, std::move(whole), std::move(part));

For most lenses that just "zoom in" into an individual part, that holds. But for some others that's not generally safe...

Anyway... lots of food for thought... But I agree we should find a solution at some point...

dimula73 · 2022-12-26T08:39:26Z

Hi, @arximboldi!

Happy holidays to you too! :)

I have some questions, firstly, you mention the view and set cases, what about the over case?

Speaking truly, I haven't much tested it. When I looked into it originally, it seemed to me that it doesn't have the extra-get-call issue. Though I'm not sure anymore. It would be better to test it.

But I get that what is happening in the following invalid scenario, correct?

I'm not sure I understand what you mean by s variable in your example. What happens there is:

struct S { std::string a; };
struct T { S s; };

auto lens = attr(&T::s) | attr(&S::a);

// in scenario when we build a lens for a prvalue, T&& binds to 
// it directly and becomes an lvalue
S toplevel_getter(T t) {
    return T.s;
}

// same thing happens in the setter, because it is 
// passed with a prvalue
S toplevel_setter(S s, T t) {
    s.t = t;
    return s;
}

T set_the_top_lens(T t, std::string a) {
    // std::forward translates into std::move for a prvalue
    S subordinate = toplevel_getter(std::move(t));

    // ask the lower level lens
    subordinate = lowerlevel_setter(std::move(subordinate ), a);   

    // std::forward translates into std::move again, so it
    // moves `t` again :(
    return toplevel_setter(std::move(t), subordinate );
}

Do you expect simply users to implement all their lenses in terms of getset instead of in Van-Laarhoven style?

Yes...

...at the end of the day, lenses implemented with getset are more intelligible for the general public. It has the problem though that then you need to capture the context twice instead of once (in the getter and setter)...

Could you elaborate a bit on that topic? As far as I can see, even in the current implementation we do capture the context for the setter even when it is not called. Or you mean something else?

There is some elegance about the Van-Laarhoven lens that maybe it would be nice to preserve

Speaking truly, I don't much find it elegant :) Even after spending really lots of hours with it this year [1] I still cannot reason about it without the help of the debugger. I kind of understand how to write a lens and how to use it, but I cannot reason about internals of it. I just don't know how to visualize it on a piece of paper, which is usually not a good sign :)

E.g. instead of passing the value directly into the functor on the "getter bit", we can pass a lambda that returns the value, basically we simulate lazy-evaluation by explicitly passing thunks

It solves only one(!) issue of the three(!). Namely, it solves the issue of making an extra get() call in the toplevel lens. It neither solves the double-move problem in nested lenses, nor extra copy-constructions in identity_functor and const_functor when passing the value to and fro.

template <typename Member>
auto attr(Member member)
{
    return zug::comp([member](auto&& f) {
        return [&, f = LAGER_FWD(f)](auto&& p) {
//                 ^
//                 + - problem number one: here we create
//                     `const identity_functor<T> f` whose
//                     constness prevents moves of the value;
//                     the value is **copied** at every step
//                     of the call chain

            return f([&p] { return LAGER_FWD(p).*member; )})
//                    ^                      ^
//                    + -------------------- + - the first move happens 
// somewhere here, I'm not sure where

                ([&](auto&& x) {

                auto r    = LAGER_FWD(p);
//                                    ^
//                                    + - second move
                r.*member = LAGER_FWD(x);
                return r;
            });
        };
    });
}

Now when thinking about it, I think that Van-Laarhoven lenses have two requirements that we break when trying to use them in C++:

Van-Laarhoven lens expects founctions to be pure, which is not true in C++, where a function can move-consume its arguments, hence have a side-effect;
Van-Laarhoven lens expects a guarantee that a pure function is not called when its result is known to be discarded. [2]
C++'s tools (perfect forwarding, RVO and NRVO) are supposed to optimize function return values, but Van-Laarhoven lens returns the value using lambda-capture and class-members, which are not optimized.

Theoretically, if we make all our functions inside lenses work with const-ref only, we will make them somewhat "pure". It might solve all these three issues. But...

It will require Lager to drop this requirement:

auto name = attr(&person::name);
auto birthday_month = attr(&person::birthday) | attr(&yearday::month);

CHECK(&view(name, p1) == &p1.name);
int& x = view(birthday_month, p1);

This requirement doesn't seem to be useful for the Lager's cursors framework, but it might be a nice feature of lenses in general.

We could actually have two sets of lenses: const and non-const, but it would be really painful to implement with lambdas anyway...

Which makes me wonder... should we just ditch the Van-Laarhoven representation, and use a simpler lenses definition of the form

As you might have guessed already, I would support this decision ;) Actually, my first attempt to fix the issue was to implement a separate subset of lenses without the use of Van-Laarhoven representation. But I wanted them to be interchangeable and composable with stuff declared with zug::comp, which appeared to be impossible.

PS:
Even though I haven't really found the Van-Laarhoven lenses elegant or nice, I think transduces are a nice thing. I've been actually thinking about pairing them with SYCL engine. It could potentially make some of our algorithms much simpler and maintainable. Though now, when I found out about these capture/forward issues, I have some doubts... do they cause the same move/copy issues?

[1] - I spent almost 11 months on porting Krita's brush editor into lager. In case you are interested in our usage of lager, here is the working branch and the documentation.

[2] - I'm not familiar with Haskel, does it have such a guarantee?

It fixes a critical bug, so we'd better use this one. The upstream MR: arximboldi/lager#165

arximboldi · 2023-04-21T21:39:13Z

Sorry @dimula73 for having left for so long. Your comment on #175 about how this is a must for you to use makes me think that we should probably find a workable solution for this.

Pinging @Siapran as she has ended up digging way deeper than me into lenses. What do you think @Siapran about this PR? Or should we go one step further and completely ditch the Van-Laarhoven lenses?

dimula73 · 2023-05-30T12:56:09Z

Hi, @arximboldi and @Siapran!

Is there any update on the topic? We plan to make the first public (beta-) release of Krita on July, 17th, so it would be nice to have at least some version of Lager released by then (to make Linux distributions happy).

Our final release deadline set on the 1st of September, so distributions will start running like crazy after lager on that date, unless we invent anything.

On the topic of the MR: Krita can technically work without this MR because all our lenses use const-refs for the reading part, so it should be safe. The only worry might be that lager may change API a bit between different released versions. Which, I guess, is fine if the first released version is something like 0.1.0 :)

arximboldi · 2023-06-26T09:05:35Z

Hi @dimula73 !

I have been meditating on this and I have decided that I am cool with moving away from Van-Laarhoven lenses. That's a big revamp but if you wanna to take on that job I will be happy to accept that PR. We would need to provide a get/setter based representation supported by our own definition of comp and operator|. To be fair, I think everything would become so much simpler then. The only downside is that sometimes there would be an extra capture in memory (one for the getter and one for the setter) when using lambdas to define the lens, which is one of the reasons I thought initially Van-Laarhoven was also interesting from a performance POV. In many cases we can work around that by defining an object for the lens though.

arximboldi · 2023-06-26T09:07:18Z

The question remains, what to do about this one? The implementation is complicated and I haven't even managed to work through all the details. However the unit testing is very rigurous, so I would be happy to merge it in a leap of faith, if you think this helps as an interim step before we move away from Van-Laarhoven lenses.

dimula73 · 2023-06-26T13:12:12Z

The question remains, what to do about this one?

I guess the project should be split into two independent steps: one to port all the lenses to getset() and the other one to implement getset() via some different implementation.

What do you think? Are there any lenses that can be implemented easier without getset(), but via some base API?

arximboldi · 2023-06-26T15:06:51Z

Hmmm... I think some lenses could be implemented more efficiently without getset (one closure vs two) however moving from a getset based implementation to a direct one should be easy (it would still be two functions, but sharing the closure). This is what I mean specifically:

A getset based implementation for attr

template <typename Member>
auto attr(Member member)
{
  return getset(
    [member] (auto&& x) -> decltype(auto) {
       return LAGER_FWD(x).*member;
    },
    [member] (auto x, auto&& v){
       x.*member = LAGER_FWD(v);
       return x;
    })
}

Vs. a "direct" one:

template <typename Member>
struct attr_t
{
  Member member;

   template <typename T>
   decltype(auto) get (T&& x) {
      return LAGER_FWD(x).*member;
   }

   template <typename T, typename U>
   T set(T x, U&& v) {
      x.*member = LAGER_FWD(v);
      return x;
   }
}

template <typename Member> auto attr(Member m) { return attr_t<Member>{m}; }

They are very similar they just change in how we close over member in order to share the data between getter and setter.

Am I making sense?

dimula73 · 2023-06-27T11:08:06Z

Yes, this case sounds sensible

dimula73 added 2 commits November 21, 2022 19:35

Fix copilation with gcc9

596983f

error: 'auto' not allowed in function prototype

dimula73 changed the title ~~WIP: Fix extra call to 'getter' in 'lenses::getset' on set call~~ Fix double std::move on objects in 'lenses::getset' on set call Dec 22, 2022

dimula73 commented Dec 22, 2022

View reviewed changes

dimula73 force-pushed the kazakov/fix-double-move-v2 branch from bc73682 to 51e3c0e Compare December 22, 2022 13:00

dimula73 mentioned this pull request Dec 23, 2022

Test implementation of deferred calculation of the 'get()' part of le… #161

Closed

dimula73 added 3 commits December 23, 2022 14:40

Fix the move-from verification code

3ea51d9

When the moved-from object is assigned, it becomes valid again. None of our code seems to use it though.

Add tests for nested attr|getset lens

af07b69

Such lens causes: 1) Compilaion failure due to the missing const functor in getset_t (also fixed in the patch) 2) A crash caused by double-move in the attr implementation (cannot be fixed without rewriting attr into getset)

dimula73 commented Dec 23, 2022

View reviewed changes

Fix compilation of lenses/unbox.hpp

0b6ab3e

The functor should be forwarded into the lens.

kdesysadmin pushed a commit to KDE/krita that referenced this pull request Mar 11, 2023

Switch the version of lager to a forked one

7f786f2

It fixes a critical bug, so we'd better use this one. The upstream MR: arximboldi/lager#165

dimula73 mentioned this pull request Apr 3, 2023

Lager library has no official releases #175

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Fix double std::move on objects in 'lenses::getset' on set call #165

Fix double std::move on objects in 'lenses::getset' on set call #165

dimula73 commented Nov 21, 2022 •

edited

Loading

codecov-commenter commented Nov 22, 2022 •

edited

Loading

dimula73 commented Dec 22, 2022

dimula73 Dec 22, 2022

dimula73 Dec 23, 2022

arximboldi commented Dec 25, 2022

dimula73 commented Dec 26, 2022

arximboldi commented Apr 21, 2023

dimula73 commented May 30, 2023 •

edited

Loading

arximboldi commented Jun 26, 2023 •

edited

Loading

arximboldi commented Jun 26, 2023

dimula73 commented Jun 26, 2023

arximboldi commented Jun 26, 2023

dimula73 commented Jun 27, 2023

Fix double std::move on objects in 'lenses::getset' on set call #165

Are you sure you want to change the base?

Fix double std::move on objects in 'lenses::getset' on set call #165

Conversation

dimula73 commented Nov 21, 2022 • edited Loading

codecov-commenter commented Nov 22, 2022 • edited Loading

Codecov Report

dimula73 commented Dec 22, 2022

dimula73 Dec 22, 2022

Choose a reason for hiding this comment

dimula73 Dec 23, 2022

Choose a reason for hiding this comment

arximboldi commented Dec 25, 2022

dimula73 commented Dec 26, 2022

arximboldi commented Apr 21, 2023

dimula73 commented May 30, 2023 • edited Loading

arximboldi commented Jun 26, 2023 • edited Loading

arximboldi commented Jun 26, 2023

dimula73 commented Jun 26, 2023

arximboldi commented Jun 26, 2023

dimula73 commented Jun 27, 2023

dimula73 commented Nov 21, 2022 •

edited

Loading

codecov-commenter commented Nov 22, 2022 •

edited

Loading

dimula73 commented May 30, 2023 •

edited

Loading

arximboldi commented Jun 26, 2023 •

edited

Loading