@pciolkosz
Contributor

We store a stream in async_buffer to express the dependencies the buffer must respect when its destructor runs. Sometimes, though, it would be useful to say that a buffer no longer has any dependencies. For example:

      auto buf = cudax::make_async_buffer(stream, resource, ...);
      // use the buffer for some work in stream

      stream.sync();
      // if buffer is not used for other stream-ordered work beyond this point,
      // there is no need to keep stream as its dependency

In the above case, stream might be destroyed before buf, or unrelated work might be inserted into the stream, which would unnecessarily delay the deallocation.
For these cases, this PR changes the type stored in the buffer from stream_ref to cuda::std::optional<stream_ref> and updates the getter and setter to match. In the example above, the user can then call buf.set_stream(cuda::std::nullopt), and the buffer destructor will call resource.deallocate_sync() instead of resource.deallocate().
The side effect of this change is that whenever a user queries the stream from a buffer, they need to remember to unpack it from an optional, but I think that's fine.
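
To make the destructor behavior described above concrete, here is a minimal host-only sketch. It is not the cudax implementation: mock_stream_ref, mock_resource, mock_async_buffer, and run_demo are hypothetical stand-ins, std::optional is used in place of cuda::std::optional, and the deallocate signatures are simplified assumptions.

```cpp
#include <cassert>
#include <cstddef>
#include <new>
#include <optional>
#include <string>
#include <vector>

// Hypothetical stand-in for cuda::stream_ref; it only carries an id for logging.
struct mock_stream_ref
{
  int id;
};

// Hypothetical resource that records which deallocation path was taken.
struct mock_resource
{
  std::vector<std::string> log;

  void* allocate(mock_stream_ref, std::size_t bytes)
  {
    return ::operator new(bytes);
  }
  // Stream-ordered path: a real resource would enqueue the free on the stream.
  void deallocate(mock_stream_ref stream, void* ptr, std::size_t)
  {
    log.push_back("deallocate on stream " + std::to_string(stream.id));
    ::operator delete(ptr);
  }
  // Synchronous path: taken when the buffer has no stream dependency.
  void deallocate_sync(void* ptr, std::size_t)
  {
    log.push_back("deallocate_sync");
    ::operator delete(ptr);
  }
};

// Hypothetical buffer that stores an optional stream, as the PR proposes.
class mock_async_buffer
{
  mock_resource* resource_;
  std::optional<mock_stream_ref> stream_;
  void* ptr_;
  std::size_t bytes_;

public:
  mock_async_buffer(mock_resource& resource, mock_stream_ref stream, std::size_t bytes)
      : resource_(&resource), stream_(stream), ptr_(resource.allocate(stream, bytes)), bytes_(bytes)
  {}
  mock_async_buffer(const mock_async_buffer&)            = delete;
  mock_async_buffer& operator=(const mock_async_buffer&) = delete;

  ~mock_async_buffer()
  {
    if (stream_)
    {
      resource_->deallocate(*stream_, ptr_, bytes_);
    }
    else
    {
      resource_->deallocate_sync(ptr_, bytes_);
    }
  }

  // Callers must unpack the optional, which is the side effect noted above.
  std::optional<mock_stream_ref> stream() const
  {
    return stream_;
  }
  void set_stream(std::optional<mock_stream_ref> new_stream)
  {
    stream_ = new_stream;
  }
};

// Destroys one buffer with a stream and one without, then returns the log.
inline std::vector<std::string> run_demo()
{
  mock_resource resource;
  {
    mock_async_buffer keeps_stream(resource, mock_stream_ref{1}, 64);
  }
  {
    mock_async_buffer drops_stream(resource, mock_stream_ref{2}, 64);
    drops_stream.set_stream(std::nullopt);
    assert(!drops_stream.stream().has_value());
  }
  return resource.log;
}
```

In this sketch the first buffer is released through the stream-ordered path and the second through the synchronous one, purely because of the state of the optional at destruction time.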

Currently, all ways to create an async_buffer take a stream that gets stored in the buffer. We could provide a buffer construction API with a no_init argument that does not take a stream, so that the buffer starts with its stream set to nullopt, but this needs some thought first.

@pciolkosz pciolkosz requested a review from a team as a code owner October 31, 2025 02:38
@pciolkosz pciolkosz requested a review from fbusato October 31, 2025 02:38
@github-project-automation github-project-automation bot moved this to Todo in CCCL Oct 31, 2025
@cccl-authenticator-app cccl-authenticator-app bot moved this from Todo to In Review in CCCL Oct 31, 2025

Comment on lines 161 to 162
_CCCL_HIDE_FROM_ABI async_buffer(const async_buffer& __other)
: __buf_(__other.memory_resource(), __other.stream(), __other.size())
Contributor

Suggested change
- _CCCL_HIDE_FROM_ABI async_buffer(const async_buffer& __other)
- : __buf_(__other.memory_resource(), __other.stream(), __other.size())
+ _CCCL_HIDE_FROM_ABI explicit async_buffer(const async_buffer& __other)
+ : __buf_(__other.memory_resource(), __other.stream(), __other.size())

I would like to make this constructor explicit to avoid accidental copies.
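
As a rough illustration of what an explicit copy constructor buys, the standalone sketch below uses two hypothetical types, plain_buffer and guarded_buffer (neither is part of CCCL). It relies only on standard C++ rules: an explicit copy constructor still allows direct-initialization but rejects copy-initialization, which is what passing by value or writing `T b = a;` uses.

```cpp
#include <type_traits>

// Hypothetical type with an ordinary (implicit) copy constructor.
struct plain_buffer
{
  plain_buffer() = default;
  plain_buffer(const plain_buffer&) {}
};

// Hypothetical type whose copy constructor is explicit.
struct guarded_buffer
{
  guarded_buffer() = default;
  explicit guarded_buffer(const guarded_buffer&) {}
};

// Direct-initialization, e.g. `guarded_buffer b(a);`, works for both types.
static_assert(std::is_constructible<plain_buffer, const plain_buffer&>::value, "");
static_assert(std::is_constructible<guarded_buffer, const guarded_buffer&>::value, "");

// Copy-initialization, e.g. `T b = a;` or passing `a` to a by-value parameter,
// only works when the copy constructor is not explicit.
static_assert(std::is_convertible<const plain_buffer&, plain_buffer>::value, "");
static_assert(!std::is_convertible<const guarded_buffer&, guarded_buffer>::value, "");
```

The trade-off is that every intentional copy has to be spelled as a direct-initialization, and generic code that expects implicit copyability will no longer accept the type.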

- _CCCL_HIDE_FROM_ABI constexpr void set_stream(stream_ref __new_stream)
+ //! @warning This does not synchronize between \p __new_stream and the current stream. It is the user's responsibility
+ //! to ensure proper stream order going forward
+ _CCCL_HIDE_FROM_ABI constexpr void set_stream(const ::cuda::std::optional<::cuda::stream_ref>& __new_stream)
Contributor

What's the point of taking the stream as an optional?

Collaborator

I had the same question as @davebayer here. Do we even need to literally use an optional<T> here at all? Can't we just use a sentinel value of cudaStream_t{0}?

Contributor Author

The problem is that cudaStream_t{0} is a valid stream. We have ~0 cast to cudaStream_t, which is recognized as an invalid stream, but I think something like buffer.set_stream(cuda::invalid_stream) is more convoluted than an optional.
For me, buffer.set_stream(cuda::std::nullopt) was the most intuitive way to say that the buffer should have no stream dependency, especially since we already return an optional from buffer.stream(). Maybe there is a better way to express it, but I don't think it should be a special cudaStream_t value or a special stream.
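
The two options under discussion can be compared side by side in a small host-only sketch. Everything here is a hypothetical stand-in: mock_stream_t plays the role of cudaStream_t, mock_default_stream() models the handle value 0, mock_invalid_stream() models the all-ones handle mentioned above, and std::optional replaces cuda::std::optional.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>

struct mock_stream_opaque;                 // hypothetical opaque stream type
using mock_stream_t = mock_stream_opaque*; // stand-in for cudaStream_t

// Handle value 0: a valid stream, so it cannot double as "no stream".
inline mock_stream_t mock_default_stream()
{
  return nullptr;
}

// All-ones handle: the assumed "invalid stream" sentinel.
inline mock_stream_t mock_invalid_stream()
{
  return reinterpret_cast<mock_stream_t>(~std::uintptr_t{0});
}

// Sentinel approach: "no dependency" is encoded inside the handle's value range.
inline bool has_dependency_sentinel(mock_stream_t stream)
{
  return stream != mock_invalid_stream();
}

// Optional approach: "no dependency" is encoded outside the handle's value range.
inline bool has_dependency_optional(const std::optional<mock_stream_t>& stream)
{
  return stream.has_value();
}

// Exercises both encodings with the default stream and with "no stream".
inline void check_encodings()
{
  assert(has_dependency_sentinel(mock_default_stream()));
  assert(!has_dependency_sentinel(mock_invalid_stream()));
  assert(has_dependency_optional(mock_default_stream()));
  assert(!has_dependency_optional(std::nullopt));
}
```

Both encodings keep the default stream usable as a real dependency. The difference is where the "no stream" state lives: in a reserved handle value that getters return unchanged, or in a wrapper that callers have to unpack.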

@github-actions
Contributor

github-actions bot commented Nov 1, 2025

🥳 CI Workflow Results

🟩 Finished in 59m 07s: Pass: 100%/42 | Total: 4h 03m | Max: 27m 46s | Hits: 96%/20774

@pciolkosz
Contributor Author

After some discussion, I am closing this one. If we still want similar functionality in the future, we will use invalid_stream instead.

@pciolkosz pciolkosz closed this Nov 4, 2025
@github-project-automation github-project-automation bot moved this from In Review to Done in CCCL Nov 4, 2025