Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve Allocator interface so client can avoid needless copying #4431

Closed
BarabasGitHub opened this issue Feb 11, 2020 · 19 comments
Closed

Improve Allocator interface so client can avoid needless copying #4431

BarabasGitHub opened this issue Feb 11, 2020 · 19 comments
Labels
accepted This proposal is planned. proposal This issue suggests modifications. If it also has the "accepted" label then it is planned. standard library This issue involves writing Zig code for the standard library.
Milestone

Comments

@BarabasGitHub
Copy link
Contributor

In my opinion the current Allocator interface is suboptimal. Mainly because it does more than just allocate memory, it will also copy data for you. I think it's important to get something like this right, as it will be used pretty much everywhere. I'm giving my opinion because I'd like to see it done right, especially in Zig, which seems to get so many things right already.

Currently the Allocator interface is as follows (I left out some documentation to keep it shorter):

    /// Realloc is used to modify the size or alignment of an existing allocation,
    /// as well as to provide the allocator with an opportunity to move an allocation
    /// to a better location.
    reallocFn: fn (
        self: *Allocator,
        old_mem: []u8,
        old_alignment: u29,
        new_byte_count: usize,
        new_alignment: u29,
    ) Error![]u8,

    /// This function deallocates memory. It must succeed.
    shrinkFn: fn (
        self: *Allocator,
        old_mem: []u8,
        old_alignment: u29,
        new_byte_count: usize,
        new_alignment: u29,
    ) []u8,

I have issues with the semantics of reallocFn (shrinkFn is fine). It is essentially the same as the C library realloc function in that it can both modify (grow/shrink) the existing memory area, or allocate a totally new area and copy the data to there. The last part is where I think this is suboptimal. There are many cases where copying 'the data' is not what the caller wants.

  1. Consider inserting an element in the middle of an array. In this case copying 'the data' includes all elements while actually only the elements up until the insertion point need to be copied, the rest needs to be moved up one element. Using the current reallocFn semantics, all elements after the insertion point will end up being copied twice.
  2. Similar to 1, but worse is when a matrix is stored in one memory block row-by-row and a column needs to be appended (or inserted). The current reallocFn will copy all elements, while actually the rows need to be copied separately to slightly different locations. In addition to having to move the rows, the resulting code will be more complicated, because now the new memory comes already filled with the rows and the old memory block is no longer available. This means that the elements in the rows have to be copied backwards (starting from the end).
  3. Consider an array which has capacity for 100 elements, but only has a few in use. Appending 100 new elements will require reallocation and reallocFn will happily copy all the 100 elements to the new memory, while actually only a few need to be copied and the rest is undefined data.
  4. Though probably less likely, but if there is a data structure which refers to other nodes but stored in one block of memory the copying/moving done by reallocFn is completely useless and will actually destroy the data-structure.

What I would like to see is in the basis an allocate and free, with on top a realloc which only tries to modify the existing memory block and fails if that's not possible. In no case should it try to be smart and move data around, that's better left to the user, who has more knowledge and can move the data to it's correct position.

Please let me know what you think.

@daurnimator daurnimator added the standard library This issue involves writing Zig code for the standard library. label Feb 11, 2020
@ikskuh
Copy link
Contributor

ikskuh commented Feb 11, 2020

I think you have a misunderstanding on what realloc is actually doing.

The data copy is only necessary if the allocator cannot expand the current memory object. Then it is forced to relocate that memory object (move it to a completly different address, not sharing a single byte of memory with the previous location) where there is enough space to store the requested data size.

This also means that you are required to copy all bytes from the previous location to the new one as the new location has no defined memory content.

Your description of points 1. … 4. are all insert operations that operate on a fixed memory block. If you actually need to resize the data and the allocator is required to relocate the memory block, you alway have to copy all bytes, otherwise you result with invalid data.

@BarabasGitHub
Copy link
Contributor Author

BarabasGitHub commented Feb 11, 2020

No, I understand that perfectly and I believe that's not the job of the allocator to do that for the reasons I stated above. There are many situation where it's not just a blind copy and you actually want to have the data in the new memory positioned slightly differently.

The data copy is only necessary if the allocator cannot expand the current memory object. Then it is forced to relocate that memory object (move it to a completly different address, not sharing a single byte of memory with the previous location) where there is enough space to store the requested data size.

No. I am saying it should fail in that case, because copying the data can be expensive and blindly copying everything might not actually be what you want.

Your description of points 1. … 4. are all insert operations that operate on a fixed memory block.

3 and 4 aren't. But yes if you're just appending one by one to a list/array then it just so happens that it's doing the right thing. In all the other cases probably not.

If you actually need to resize the data and the allocator is required to relocate the memory block, you alway have to copy all bytes, otherwise you result with invalid data.

Copy all the bytes, yes probably. But not necessarily in the exact same layout. And sometimes not even all the bytes, because some are unused or maybe some need to change.

@mikdusan
Copy link
Member

What I would like to see is in the basis an allocate and free, with on top a realloc which only tries to modify the existing memory block and fails if that's not possible. In no case should it try to be smart and move data around, that's better left to the user, who has more knowledge and can move the data to it's correct position.

Just to be clear, do you want a realloc that never changes "existing" ptr absolute value but gives the caller a blessing that says "yup, it's been resized now" or "no, can't resize it", presumably the latter only if larger is not possible given ptr is anchored. ?

@andrewrk andrewrk added this to the 0.7.0 milestone Feb 11, 2020
@andrewrk andrewrk changed the title Allocator interface Improve Allocator interface so client can avoid needlee copying Feb 11, 2020
@andrewrk andrewrk added the proposal This issue suggests modifications. If it also has the "accepted" label then it is planned. label Feb 11, 2020
@andrewrk andrewrk changed the title Improve Allocator interface so client can avoid needlee copying Improve Allocator interface so client can avoid needless copying Feb 11, 2020
@BarabasGitHub
Copy link
Contributor Author

BarabasGitHub commented Feb 12, 2020

Yes, that's what I meant.

presumably the latter only if larger is not possible given ptr is anchored. ?

Not entirely sure what you mean here, but I think yes. If it can't change the allocated size (and maybe smaller is always possible?) it should tell the caller "no can't resize it".

@marler8997
Copy link
Contributor

marler8997 commented Feb 12, 2020

I ran into this exact same situation in my alternative standard library for D (https://github.com/dragon-lang/mar). It's pretty easy to solve though. Just have the underlying reallocFn function not perform any copy. The caller will know whether the memory was moved based on the return value. This small change allows both the current behavior and for other caller's to implement their own "copy on move" mechanisms.

To implement the current behavior is as simple as creating a function that takes an allocator, calls realloc, then performs the copy if it sees the address was changed. This follows the "separation of concerns" principle where the realloc is concerned with reserving space, and another function can be concerned with preserving data, and both can be swapped out independently for different behavior.

@ikskuh
Copy link
Contributor

ikskuh commented Feb 12, 2020

Just have the underlying reallocFn function not perform any copy.

This will not work as a successful realloc means that the original memory will be released for future allocations and accessing it is undefined behaviour. So checking the pointers/slices afterwards will not work, as previous pointer/slice is not valid anymore.

The moving must happen before the allocator releases the old memory, but after allocating the new memory. This could probably be solved by passing an optional function to realloc that will perform a specialized "resize" operation

@marler8997
Copy link
Contributor

marler8997 commented Feb 12, 2020

The moving must happen before the allocator releases the old memory, but after allocating the new memory. This could probably be solved by passing an optional function to realloc that will perform a specialized "resize" operation

Oh that's right. I'm starting to remember it wasn't quite that simple :)

Looking at my code now what I did was have the underlying function implement a reallocInPlace that either works or returns an error. This way, the reallocInPlace functionality isn't coupled to a particular way of copying data from the old array, as there are many possible ways a caller could want to copy data from the old array. Maybe they only need a small section of the original array, or multiple disjoint sections. This allows the caller to use reallocate in place when possible, but use any mechanism they like to preserve data when it can't be extended in place. With this API, the current behavior and other variations on it can be implemented with simple functions that try reallocInPlace, then handle the default alloc/copy behavior if it fails.

@gereeter
Copy link
Contributor

See also Rust's Alloc trait with grow_in_place and shrink_in_place methods (discussion: 1, 2) and jemalloc's xallocx function. jemalloc and TCMalloc both provide a nallocx function that can be used for the simplest in-place reallocation cases.

(Side note: jemalloc and TCMalloc also both provide a sized deallocation function sdallocx, a completely separate optimization inaccessible from Zig's current allocator interface.)


Personally, I'm a big fan of having in-place reallocation functions.

From an allocator implementation standpoint, the only case I've heard of for implementing realloc as anything more than reallocInPlace + memcpy on failure is to be able to use MREMAP_MAYMOVE on large allocations. Some allocators (e.g. jemalloc) try to avoid MREMAP_MAYMOVE anyway for fragmentation reasons.

From a user perspective, I'd argue that realloc (allowing moves) is just an implementation detail of dynamic arrays and useful for little else. If you want to resize something more complex like a hash table or a ring bugger, you could copy over to the new allocation and then shuffle things around, but it would be more efficient to do the shuffling as part of the copy; you'd only want to do the shuffling in-place if the reallocation itself was in-place.

@BarabasGitHub
Copy link
Contributor Author

Cool, thanks for the links with examples. I did know about those.

@andrewrk andrewrk added the accepted This proposal is planned. label Apr 14, 2020
@andrewrk
Copy link
Member

Here's my proposal:

  • Keep reallocFn with the same semantics. Can fail. Can return a different address, which involves copying data. (Note that reallocFn with previous size of 0 is equivalent to the concept of "alloc")
  • Keep shrinkFn with the same semantics. Cannot fail. Can return a different address, which involves copying data. Make it optional (Allocator interface: make shrinkFn optional #2292 and there is open PR Make shrinkFn optional #4739 to implement it)
  • Add resizeFn to the interface, which is an optional function that the interface need not provide. Can fail. Cannot return a different address. Guaranteed to never copy data. If this function is not provided, Allocator provides a default implementation which always returns error.OutOfMemory, indicating that the resize failed. Note that resizeFn can fail even when shrinking, same as reallocFn, which indicates to the client that the Allocator is not able to recover any bytes from this operation, and the client should keep the bytes. The client will use shrinkFn if it wishes to shrink without fail.

This proposal has synergy with #3803 / #3804.

As example usage, ArrayList will switch to call resizeFn for ensureCapacity, followed by reallocFn with previous length of 0 if that fails, and do its own copying, avoiding copying the undefined memory, as noted in @BarabasGitHub's example, followed by shrinkFn to 0 on the old memory.

@marler8997
Copy link
Contributor

marler8997 commented Apr 14, 2020

With resizeFn added, wouldn't every reallocFn be covered by this implementation?

// note: I'm assuming resizeFn returns error{OutOfMemory}!void

pub fn reallocFn(...) Error![]u8 {
    self.resizeFn(...) catch  {
        if (newSize < originalSize) return Error.OutOfMemory;
        var newBuf = try self.alloc(...);
        mem.cpy(newBuf, oldBuf, oldBuf.len);
        self.free(oldBuf);
        return newBuf;
    };
    return oldBuf; // resizeFn worked, return original buffer
}

@ikskuh
Copy link
Contributor

ikskuh commented Apr 15, 2020

No, as resizeFn is used to allocate (self.alloc calls reallocFn).

As @Tetralux noted on Discord:
It's not immediatly clear what those functions do and how they differ.
I propose that resizeFn should be called resizeInPlaceFn to make the difference clearer how it is distinct from reallocFn

@marler8997
Copy link
Contributor

marler8997 commented Apr 15, 2020

For reallocFn and resizeFn I see 3 operations:

  1. allocate new memory
  2. resize existing memory in place
  3. move data from existing memory to a larger allocated space and free original memory

reallocFn implements all 3 of these operations. The newly proposed interface by @andrewrk allows an allocator to split out operation 2 (resize existing memory in place) so that the caller can invoke that operation without also having to invoke operation 3.

I would propose that the "Allocator interface" (just the function pointers) limit each function to one operation and that we use "composition" to combine operations. Instead of having the one function that performs all 3 operations (reallocFn), I would define one function for the first operation, one function for the 2nd operation, and the combination of 1 2 3 could then be a common member function on Allocator (this is what I mean by "composition"). So we would have:

allocFn: allocates new memory
resizeFn: resizes existing memory in place

And realloc would just be a member function on Allocator that calls a combination of these.

Note that resizeFn would still be optional. Also note that with this proposal, the presence of resizeFn would indicate to the caller whether or not memory could be resized in place, whereas with the reallocFn version, the only way to know that would be to see that resizeFn is not set AND that reallocFn doesn't implement resize in place either.

@andrewrk
Copy link
Member

@marler8997 I like your amendment. Two questions for you:

  • If resizeFn is optional, how does freeing work? Is there freeFn?
  • Are there alignment parameters in resizeFn If so, how do those work?

@marler8997
Copy link
Contributor

Your questions show that my proposal was incomplete as I didn't clarify how shrink and free would work.

Just like allocation, I see that "de-allocation" has 3 analogous operations:

  1. free memory (completely)
  2. shrink memory in place
  3. move data from existing memory to a smaller allocated space and free original memory

Operation 2 (shrink memory in place) would be handled by resizeFn. If resizeFn is not set, then the allocator does not support in-place memory resizing, so to shrink your buffer you would need to move it to a new smaller buffer (operation 3).

Since resizing in place is handled by resizeFn, our de-allocation function has less responsibility.
It only needs to implement freeing memory completely. So the name freeFn makes sense here (rather than shinkFn).

In order to get today's semantics of shrinkFn, we can create a member function that composes the operations implemented by the allocator interface functions. In the case of shrinkFn, we may want to support that in one realloc member function that implements both enlarging and shrinking memory.

So to summarize the full Allocator interface:

/// allocate a new buffer
allocFn: fn(self: *Allocator, len: usize, alignment: u29) Error![]u8,

/// frees buffer returned by allocFn, must succeed
freeFn: fn(self: *Allocator, mem: []u8) void,

/// optional function that takes a buffer returned by allocFn and attempts to resize it in place
resizeFn: fn(self: *Allocator, mem: []u8, new_len: usize) Error!void,

/// resize the buffer holding the data in `old_mem`.  If the buffer is aligned, realloc
/// will attempt to resize the buffer in place.  If that fails, it will create a new buffer,
/// copy the data and free the original.
pub fn realloc(self: *Allocator, old_mem: []u8, old_alignment: u29, new_byte_count: usize, new_alignment: u29) {
    // TODO: implement with a combination of resizeFn/allocFn/freeFn
}

So to answer your questions:

If resizeFn is optional, how does freeing work? Is there freeFn?

With the full interface, yes, it looks like there is a freeFn.

Are there alignment parameters in resizeFn If so, how do those work?

Since the resizeFn only implements "resize existing memory in place", an alignment parameter isn't needed as it doesn't move memory around. However the allocFn function which allocates the memory in the first place would take alignment parameters.

@ikskuh
Copy link
Contributor

ikskuh commented Apr 15, 2020

@marler8997 this would also mean that something like Allocator.resizeInPlace could return two errors: error.OufOfMemory and error.NotSupported so the caller can react to when the allocator doesn't even support in-place resizing (which means, some copy optimizations cannot be done instead of there's no memory for your operation)
whereas Allocator.resize will either succeed or fail with error.OutOfMemory.

It also makes writing an allocator easier as you only have to provide both allocFn and freeFn, which both have a smaller API surface then reallocFn before.

I like this!

@BarabasGitHub
Copy link
Contributor Author

Good to see the discussion and I'm happy with the proposal. Cool. 👍

@marnix
Copy link

marnix commented Sep 2, 2020

What work would still be necessary for this issue, given that #5064 (and some subsequent fixes) have been merged?

@marler8997
Copy link
Contributor

@marnix, this issue was done with #5064. Applications now have the ability to request allocation resizes without falling back to full move/copy. I would close but I don't have permission.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
accepted This proposal is planned. proposal This issue suggests modifications. If it also has the "accepted" label then it is planned. standard library This issue involves writing Zig code for the standard library.
Projects
None yet
Development

No branches or pull requests

8 participants