What about: distributed slices (linkme) #545

Open
CAD97 opened this issue Nov 8, 2024 · 3 comments

Comments

@CAD97

CAD97 commented Nov 8, 2024

I don't think we have an issue tracking this yet. linkme implements distributed slices with linker shenanigans such that it's possible to write #[distributed_slice] pub static ITEMS: [Item]; and use that slice to access any number of #[distributed_slice(ITEMS)] static ITEM: Item = /* … */; safely.

This is implemented by expanding to, very roughly (omitting non-linux support, item type validation, and guards against name clashes):

pub static ITEMS: &[Item] = unsafe {
    #[used(linker)]
    #[link_section = "linkme_ITEMS"]
    static mut __LINKME: [Item; 0] = [];
    extern "Rust" {
        #[link_name = "__start_linkme_ITEMS"]
        static __START: Item;
        #[link_name = "__stop_linkme_ITEMS"]
        static __STOP: Item;
    }

    assert!(size_of::<Item>() > 0);
    slice::from_ptr_range(
        &raw const __START,
        &raw const __STOP,
    )
};

#[used(linker)]
#[link_section = "linkme_ITEMS"]
static ITEM: Item = /* … */;
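
The slice-construction step of the expansion above can be tried without any linker involvement. The following is a minimal sketch under stated assumptions, not linkme's actual code: an ordinary array `SECTION` stands in for the linker section, `slice_between` and `items` are hypothetical names, and `offset_from` plus `slice::from_raw_parts` are used in place of `slice::from_ptr_range`. Because both boundary pointers come from a single static here, the aliasing concern raised in this issue does not arise in the sketch.

```rust
use std::mem::size_of;
use std::slice;

#[derive(Debug, PartialEq)]
pub struct Item {
    pub id: u32,
}

// Stand-in for the linker section. In the real scheme the elements are
// separate statics that the linker places next to each other.
static SECTION: [Item; 3] = [Item { id: 1 }, Item { id: 2 }, Item { id: 3 }];

/// Builds a slice covering `[start, stop)`.
///
/// # Safety
/// Both pointers must be derived from the same allocated object (or be
/// equal), with `start <= stop`, and every element in between must be a
/// valid, initialized `T`.
pub unsafe fn slice_between<'a, T>(start: *const T, stop: *const T) -> &'a [T] {
    // Mirrors the `assert!` in the expansion: a pointer distance is
    // meaningless for zero-sized element types.
    assert!(size_of::<T>() > 0);
    // SAFETY: guaranteed by the caller.
    unsafe {
        let len = stop.offset_from(start) as usize;
        slice::from_raw_parts(start, len)
    }
}

/// Hypothetical accessor playing the role of `ITEMS`.
pub fn items() -> &'static [Item] {
    let bounds = SECTION.as_ptr_range();
    // SAFETY: both pointers come from `SECTION`, so they share one
    // allocated object and delimit exactly its three elements.
    unsafe { slice_between(bounds.start, bounds.end) }
}
```

In the real expansion the two boundary pointers come from the `extern` statics `__START` and `__STOP`, while the elements are separately declared statics. That difference is what the rest of this issue is about.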

Unfortunately, as currently written, I think this should be considered unsound (a case of deliberate UB):

under no circumstances is it fine to access the same underlying global memory with pointers derived from different static or extern static declarations (except, probably, those with the same link_name).

Originally posted by @RalfJung in rust-lang/reference#1657 (comment)

Namely, because the static ITEM is accessible through both ITEM and the slice ITEMS. Writing this in a sound manner (the static item is only accessible through a single static name) may be possible, but isn't particularly nice and introduces additional indirection, very roughly:

static ITEM: &Item = unsafe {
    #[used(linker)]
    #[link_section = "linkme_ITEMS"]
    static __LINKME: Item = /* … */;
    extern "Rust" {
        #[link_name = "__start_linkme_ITEMS"]
        static __START: Item;
    }

    (&raw const __START)
        .with_addr_of(&raw const __LINKME)
        .as_ref().unwrap_unchecked()
};
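
The pointer re-derivation in the snippet above can be illustrated in isolation. This is a minimal sketch, assuming that `with_addr_of` means "take the address of one pointer and the provenance of another"; the helper below is a hypothetical stand-in written with plain pointer arithmetic, and `SECTION` and `item_via_section_start` are illustrative names. Both pointers come from one array here, so the question of whether `__START` really has provenance over the whole section does not come up.

```rust
#[derive(Debug, PartialEq)]
pub struct Item {
    pub id: u32,
}

// Stand-in for the whole `linkme_ITEMS` section.
static SECTION: [Item; 3] = [Item { id: 10 }, Item { id: 20 }, Item { id: 30 }];

/// Hypothetical stand-in for the `with_addr_of` call in the snippet above:
/// returns a pointer with the address of `target` and the provenance of `base`.
pub fn with_addr_of<T>(base: *const T, target: *const T) -> *const T {
    // Byte distance from `base` to `target`, computed on plain integers.
    let delta = (target as usize).wrapping_sub(base as usize);
    // `wrapping_add` keeps the provenance of `base` and only moves the address.
    base.cast::<u8>().wrapping_add(delta).cast::<T>()
}

/// Returns element `index` through a pointer derived from the section start.
pub fn item_via_section_start(index: usize) -> &'static Item {
    // Provenance over all of `SECTION`.
    let start: *const Item = SECTION.as_ptr();
    // Only used for its address.
    let target: *const Item = &SECTION[index];
    let rederived = with_addr_of(start, target);
    // SAFETY: `rederived` carries the provenance of `SECTION` and points at
    // an initialized, in-bounds element of it.
    unsafe { &*rederived }
}
```

The returned reference has the same address as `&SECTION[index]`, but it was reached only through the pointer to the start of the region.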

…however, on Windows, the situation is even more squirrelly, because __START/__STOP aren't extern static, because Windows doesn't have magic symbols for the boundary of link sections like unixes do. Instead, [Item; 0] statics are created in the right place utilizing section ordering.

@digama0

digama0 commented Nov 8, 2024

I'm dubious that we actually want this rule that you can't jump statics with the same pointer. For statics with specific link flags, the addresses are public and possibly chosen to line up with something else, and so I think it should be possible to access these allocations using no-provenance pointers (transmuted integers, or at least integers passed through some from_external_alloc function), and they should all either have no provenance or the same "external" provenance used for allocations shared with the outside world.
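
For readers unfamiliar with the idea, here is a minimal sketch of accessing a static through a pointer rebuilt from a bare integer. It is an illustration only: `from_external_alloc` in the comment above is not an existing function, so the sketch uses ordinary `as` casts (which expose and then pick up provenance), and `VALUE`, `value_addr`, and `read_u32_at` are hypothetical names. Whether statics with link attributes should follow such a model is the open question in this thread.

```rust
static VALUE: u32 = 7;

/// Address of `VALUE` as a plain integer. The `as usize` cast also marks
/// the provenance of the pointer as exposed.
pub fn value_addr() -> usize {
    &VALUE as *const u32 as usize
}

/// Reads a `u32` through a pointer rebuilt from a bare integer address.
///
/// # Safety
/// `addr` must be the address of a live, aligned `u32` whose provenance
/// has previously been exposed.
pub unsafe fn read_u32_at(addr: usize) -> u32 {
    // Integer-to-pointer cast: the pointer has no statically known origin.
    let ptr = addr as *const u32;
    // SAFETY: guaranteed by the caller.
    unsafe { ptr.read() }
}
```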

@CAD97
Author

CAD97 commented Nov 8, 2024

The most straightforward resolution would probably be to extend the "same link_name" exception to also consider #[used] to enable "shenanigans mode." But this still results in a static Rust allocated object being inside the accessed memory region through a differently named static.

A more direct encoding of behavior would be to say that all statics with the same link_section MAY[1] actually name sub-places within a larger allocated object accessible in a target-dependent manner. Each static item still only has subsliced provenance to that single static, but the items are not guaranteed to be distinct allocated objects from each other.


Additionally, the intermittent discussion of the compiler exposing a similar feature has usually assumed that the compiler doing this kind of aliasing of static items into a shared slice is acceptable behavior. I don't know the specific implementation details of static place linking to know whether the compiler can add/remove hidden linkage indirection to impact whether this is necessarily UB.

(Whether the language should expose such a feature, how it should work, and complications from dynamic linking are not relevant here.)

Footnotes

  1. The exact behavior depends on the target platform and linker, obviously.

@RalfJung
Member

RalfJung commented Nov 9, 2024

Orthogonal to what was discussed above, __START and __STOP need to use zero-sized types; the current scheme seems unsound for the case where the slice is empty.
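
One reading of this remark, stated as an assumption rather than as the commenter's own explanation: declaring `static __START: Item` claims that a whole `Item` lives at that address, which is not the case when no element was placed in the section. A zero-sized type such as `[Item; 0]` makes no such claim while keeping the alignment of `Item`. The sketch below only demonstrates those two properties with a local static; `START_MARKER` and the function names are hypothetical.

```rust
use std::mem::{align_of, size_of};
use std::slice;

pub struct Item {
    pub id: u64,
}

// Zero-sized marker: occupies no bytes but is aligned like `Item`.
static START_MARKER: [Item; 0] = [];

/// Size in bytes of the marker type.
pub fn marker_size() -> usize {
    size_of::<[Item; 0]>()
}

/// Alignment in bytes of the marker type.
pub fn marker_align() -> usize {
    align_of::<[Item; 0]>()
}

/// An empty slice anchored at the marker, as in the "no elements" case.
pub fn empty_items() -> &'static [Item] {
    let start: *const Item = START_MARKER.as_ptr();
    // SAFETY: a zero-length slice only requires a non-null, aligned pointer.
    unsafe { slice::from_raw_parts(start, 0) }
}
```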

> Namely, because the static ITEM is accessible through both ITEM and the slice ITEMS. Writing this in a sound manner (the static item is only accessible through a single static name) may be possible, but isn't particularly nice and introduces additional indirection, very roughly:

That's just one static pointing to another, isn't it? I don't see the problem with that.

I am using the LLVM definition of "derived from". *ptr is not "derived from" ptr.
