Skip to content

Closures in generic code can create duplicate monomorphizations #83010

Closed
@tgnottingham

Description

@tgnottingham

Summary

When a closure is defined in a generic context like a generic function or a method of a generic class, its type includes, in some sense or another, the parent's generic parameters. If the closure doesn't depend on some of those generic parameters, it can lead to the instantiation of mono items which are functionally identical, but distinct by virtue of false dependencies on unused generic parameters of the parent. I'll call these mono items "duplicates".

These duplicates create needless work and code bloat. At the very least, they go through the codegen and optimization pipeline, and if not optimized away, are included in the binary. I imagine there is other overhead, such as creation of more distinct types.

Example 1

fn foo<T: std::fmt::Display>(t: T) {
    let _ = vec!["hello".to_string()].iter().map(|x| x);
    println!("{}", t);
}

fn main() {
    foo(0i32);
    foo(0i64);
}

The map call is passed a closure |x| x which does not depend the parent function's generic parameter T. The code for that closure is identical in both foo::<i32> and foo::<i64>, so we should be able to only generate one map instantiation and Map type (the return type of map, which is generic over the closure that map was passed), and use it in both foo instances. Yet this creates two map instantiations, and two Map::new instantiations, because the closure type is polluted by the parent's T parameter, which, again, the closure doesn't use in any way.

Excerpt from -Z print-mono-items=lazy:

MONO_ITEM fn <std::slice::Iter<std::string::String> as std::iter::Iterator>::map::<&std::string::String, [closure@src/main.rs:2:50: 2:55]> @@ lzylnwmetevawv6[External]
MONO_ITEM fn <std::slice::Iter<std::string::String> as std::iter::Iterator>::map::<&std::string::String, [closure@src/main.rs:2:50: 2:55]> @@ lzylnwmetevawv6[External]

MONO_ITEM fn std::iter::Map::<std::slice::Iter<std::string::String>, [closure@src/main.rs:2:50: 2:55]>::new @@ 290s40pyalavf195[External]
MONO_ITEM fn std::iter::Map::<std::slice::Iter<std::string::String>, [closure@src/main.rs:2:50: 2:55]>::new @@ 290s40pyalavf195[External]

(I don't know why, but there weren't any mono items in the output for the closure itself in this example.)

If we actually did anything interesting with the Maps, it would create yet more duplicate instantiations. And if the closure captured anything by value that needed dropping, it would create duplicate drop_in_place instantiations for the closure and Map types.

Example 2

This isn't a rare event. Here's an example from the standard library which was quickly and easily found. This case is different, because the parent generics come from the struct for which the function is implemented:

impl<K, V, S> PartialEq for HashMap<K, V, S>
where
    K: Eq + Hash,
    V: PartialEq,
    S: BuildHasher,
{
    fn eq(&self, other: &HashMap<K, V, S>) -> bool {
        if self.len() != other.len() {
            return false;
        }

        self.iter().all(|(key, value)| other.get(key).map_or(false, |v| *value == *v))
    }
}

The closure passed to map_or really only depends on V, yet it has false dependencies on K and S. Creating two HashMaps with the same value type will cause duplicate mono items to be created for this closure and map_or:

use std::collections::HashMap;

fn main() {
    let m1 = HashMap::<String, u8>::new();
    assert_eq!(m1, m1);

    let m2 = HashMap::<i32, u8>::new();
    assert_eq!(m2, m2);
}

Excerpt from -Z print-mono-items=lazy:

MONO_ITEM fn <std::collections::HashMap<i32, u8> as std::cmp::PartialEq>::eq::{closure#0}::{closure#0} @@ i6hou2d2a3r4xaf[External]
MONO_ITEM fn <std::collections::HashMap<std::string::String, u8> as std::cmp::PartialEq>::eq::{closure#0}::{closure#0} @@ i6hou2d2a3r4xaf[External]

MONO_ITEM fn std::option::Option::<&u8>::map_or::<bool, [closure@<std::collections::HashMap<i32, u8> as std::cmp::PartialEq>::eq::{closure#0}::{closure#0}]> @@ 583fmf0zweyhx5vp[External]
MONO_ITEM fn std::option::Option::<&u8>::map_or::<bool, [closure@<std::collections::HashMap<std::string::String, u8> as std::cmp::PartialEq>::eq::{closure#0}::{closure#0}]> @@ 583fmf0zweyhx5vp[External]

Impact

Okay, so there are some duplicate mono items. Maybe even a lot in a real codebase. But is it impactful? I haven't measured, but I suspect it is. Take a look at the Rust and assembly for Option::map_or (for the above closure) in debug mode, and consider the above example:

    pub fn map_or<U, F: FnOnce(T) -> U>(self, default: U, f: F) -> U {
        match self {
            Some(t) => f(t),
            None => default,
        }
    }
Assembly
<<core::option::Option<&u8>>::map_or::<bool, <std::collections::hash::map::HashMap<alloc::string::String, u8> as core::cmp::PartialEq>::eq::{closure#0}::{closure#0}>>:
	                sub    $0x58,%rsp
	                mov    %rdi,0x18(%rsp)
	                mov    %sil,%al
	                and    $0x1,%al
	                mov    %al,0x37(%rsp)
	                mov    %rdx,0x38(%rsp)
	                movb   $0x0,0x36(%rsp)
	                movb   $0x0,0x35(%rsp)
	                movb   $0x1,0x36(%rsp)
	                movb   $0x1,0x35(%rsp)
	                mov    0x18(%rsp),%rcx
	                test   %rcx,%rcx
	                setne  %al
	                movzbl %al,%r8d
	                mov    %r8d,%ecx
	                mov    %rdx,0x10(%rsp)
	                mov    %sil,0xf(%rsp)
	         /----- je     <<core::option::Option<&u8>>::map_or::<bool, <std::collections::hash::map::HashMap<alloc::string::String, u8> as core::cmp::PartialEq>::eq::{closure#0}::{closure#0}>+0x4d>
	         |  /-- jmp    <<core::option::Option<&u8>>::map_or::<bool, <std::collections::hash::map::HashMap<alloc::string::String, u8> as core::cmp::PartialEq>::eq::{closure#0}::{closure#0}>+0x4b>
	      /--|--\-X jmp    <<core::option::Option<&u8>>::map_or::<bool, <std::collections::hash::map::HashMap<alloc::string::String, u8> as core::cmp::PartialEq>::eq::{closure#0}::{closure#0}>+0x60>
	      |  \----> movb   $0x0,0x36(%rsp)
	      |         mov    0xf(%rsp),%al
	      |         and    $0x1,%al
	      |         mov    %al,0x27(%rsp)
	/-----|-------- jmp    <<core::option::Option<&u8>>::map_or::<bool, <std::collections::hash::map::HashMap<alloc::string::String, u8> as core::cmp::PartialEq>::eq::{closure#0}::{closure#0}>+0xac>
	|     |         ud2    
	|     \-------> mov    0x18(%rsp),%rax
	|               mov    %rax,0x40(%rsp)
	|               movb   $0x0,0x35(%rsp)
	|               mov    %rax,0x28(%rsp)
	|               mov    0x28(%rsp),%rsi
	|               mov    0x10(%rsp),%rdi
	|               callq  <<std::collections::hash::map::HashMap<alloc::string::String, u8> as core::cmp::PartialEq>::eq::{closure#0}::{closure#0}>
	|               mov    %al,0xe(%rsp)
	|           /-- jmp    <<core::option::Option<&u8>>::map_or::<bool, <std::collections::hash::map::HashMap<alloc::string::String, u8> as core::cmp::PartialEq>::eq::{closure#0}::{closure#0}>+0x89>
	|           \-> mov    0xe(%rsp),%al
	|               and    $0x1,%al
	|               mov    %al,0x27(%rsp)
	+-------------- jmp    <<core::option::Option<&u8>>::map_or::<bool, <std::collections::hash::map::HashMap<alloc::string::String, u8> as core::cmp::PartialEq>::eq::{closure#0}::{closure#0}>+0xac>
	|  /----------> testb  $0x1,0x36(%rsp)
	|  |  /-------- jne    <<core::option::Option<&u8>>::map_or::<bool, <std::collections::hash::map::HashMap<alloc::string::String, u8> as core::cmp::PartialEq>::eq::{closure#0}::{closure#0}>+0xb5>
	|  |  |  /----> mov    0x27(%rsp),%al
	|  |  |  |      and    $0x1,%al
	|  |  |  |      movzbl %al,%eax
	|  |  |  |      add    $0x58,%rsp
	|  |  |  |      retq   
	|  +--|--|--/-X jmp    <<core::option::Option<&u8>>::map_or::<bool, <std::collections::hash::map::HashMap<alloc::string::String, u8> as core::cmp::PartialEq>::eq::{closure#0}::{closure#0}>+0x95>
	\--|--|--|--|-> testb  $0x1,0x35(%rsp)
	   |  |  |  \-- jne    <<core::option::Option<&u8>>::map_or::<bool, <std::collections::hash::map::HashMap<alloc::string::String, u8> as core::cmp::PartialEq>::eq::{closure#0}::{closure#0}>+0xaa>
	   \--|--|----- jmp    <<core::option::Option<&u8>>::map_or::<bool, <std::collections::hash::map::HashMap<alloc::string::String, u8> as core::cmp::PartialEq>::eq::{closure#0}::{closure#0}>+0x95>
	      \--\----X jmp    <<core::option::Option<&u8>>::map_or::<bool, <std::collections::hash::map::HashMap<alloc::string::String, u8> as core::cmp::PartialEq>::eq::{closure#0}::{closure#0}>+0x9c>
	         /--/-X jmp    <<core::option::Option<&u8>>::map_or::<bool, <std::collections::hash::map::HashMap<alloc::string::String, u8> as core::cmp::PartialEq>::eq::{closure#0}::{closure#0}>+0xc0>
	      /--|--|-> testb  $0x1,0x36(%rsp)
	      |  |  \-- jne    <<core::option::Option<&u8>>::map_or::<bool, <std::collections::hash::map::HashMap<alloc::string::String, u8> as core::cmp::PartialEq>::eq::{closure#0}::{closure#0}>+0xb7>
	      |  \----> mov    0x48(%rsp),%rdi
	      |         callq  <_Unwind_Resume@plt>
	      |         ud2    
	      |         mov    %rax,0x48(%rsp)
	      |         mov    %edx,0x50(%rsp)
	      \-------- jmp    <<core::option::Option<&u8>>::map_or::<bool, <std::collections::hash::map::HashMap<alloc::string::String, u8> as core::cmp::PartialEq>::eq::{closure#0}::{closure#0}>+0xb9>
	                nopw   0x0(%rax,%rax,1)

(Wow, that's surprisingly complicated.)

In debug mode, this code is included in the binary twice, once for each closure. In release mode, the code may be optimized out, but it's still processed for some part of compilation. This was a trivial example. Imagine if map_or was a complicated function!

If there are hundreds or thousands of duplicate mono items in moderate to large codebases, which seems quite possible, then this may have a noticeable impact on compile times and binary sizes.

Solution

If possible, we should make closure types depend only on the parent generics which they actually use.

The ClosureSubsts comment explains why parent generics are included:

/// All right, you say, but why include the type parameters from the
/// original function then? The answer is that codegen may need them
/// when monomorphizing, and they may not appear in the upvars. A
/// closure could capture no variables but still make use of some
/// in-scope type parameter with a bound (e.g., if our example above
/// had an extra `U: Default`, and the closure called `U::default()`).
///
/// There is another reason. This design (implicitly) prohibits
/// closures from capturing themselves (except via a trait
/// object). This simplifies closure inference considerably, since it
/// means that when we infer the kind of a closure or its upvars, we
/// don't have to handle cycles where the decisions we make for
/// closure C wind up influencing the decisions we ought to make for
/// closure C (which would then require fixed point iteration to
/// handle). Plus it fixes an ICE. :P

See #27087, where this was introduced. #27086 may also be relevant. Here and here are FIXMEs for that issue, later removed.

I have a feeling the case where the parent generics come from a generic struct on which the function including the closure is defined (example 2) may be harder to solve.

@rustbot label T-compiler A-typesystem I-compiletime I-heavy

Metadata

Metadata

Assignees

No one assigned

    Labels

    A-type-systemArea: Type systemI-compiletimeIssue: Problems and improvements with respect to compile times.I-heavyIssue: Problems and improvements with respect to binary size of generated code.T-compilerRelevant to the compiler team, which will review and decide on the PR/issue.

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions