HadoopStreams to support ByteBufferPositionedReadable input streams #3080

Open
steveloughran opened this issue Nov 26, 2024 · 1 comment · May be fixed by #3096

Comments

@steveloughran
Contributor

Describe the enhancement requested

If a stream declares in its StreamCapabilities that it supports
ByteBufferPositionedReadable, then use it for readFully(ByteBuffer).
All streams in Hadoop 3.0.0+ do declare this.
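
A minimal sketch of that probe, under stated assumptions: the two interfaces below are local stand-ins modelled on Hadoop's `StreamCapabilities` and `ByteBufferPositionedReadable` so the example compiles without Hadoop on the classpath, the capability string is assumed to match Hadoop's `StreamCapabilities.PREADBYTEBUFFER`, and `tryReadFully` and `DemoStream` are hypothetical names, not the code in HadoopStreams.

```java
import java.io.ByteArrayInputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;
import java.nio.ByteBuffer;

final class PositionedReadProbeSketch {

  /** Local stand-in for org.apache.hadoop.fs.StreamCapabilities. */
  interface StreamCapabilities {
    boolean hasCapability(String capability);
  }

  /** Local stand-in for org.apache.hadoop.fs.ByteBufferPositionedReadable. */
  interface ByteBufferPositionedReadable {
    void readFully(long position, ByteBuffer buf) throws IOException;
  }

  /** Assumed to match Hadoop's StreamCapabilities.PREADBYTEBUFFER. */
  static final String PREADBYTEBUFFER = "in:preadbytebuffer";

  /** True only if the stream implements the API and declares the capability. */
  static boolean supportsPositionedByteBufferRead(InputStream in) {
    return in instanceof ByteBufferPositionedReadable
        && in instanceof StreamCapabilities
        && ((StreamCapabilities) in).hasCapability(PREADBYTEBUFFER);
  }

  /** Hypothetical helper: uses the positioned read if declared, else reports false. */
  static boolean tryReadFully(InputStream in, long position, ByteBuffer dst)
      throws IOException {
    if (!supportsPositionedByteBufferRead(in)) {
      return false; // caller falls back to its existing read path
    }
    ((ByteBufferPositionedReadable) in).readFully(position, dst);
    return true;
  }

  /** Hypothetical in-memory stream whose declared capability can be switched off. */
  static final class DemoStream extends ByteArrayInputStream
      implements StreamCapabilities, ByteBufferPositionedReadable {
    private final boolean declared;

    DemoStream(byte[] data, boolean declared) {
      super(data);
      this.declared = declared;
    }

    @Override
    public boolean hasCapability(String capability) {
      return declared && PREADBYTEBUFFER.equals(capability);
    }

    @Override
    public void readFully(long position, ByteBuffer dst) throws IOException {
      int length = dst.remaining();
      // Check the range before writing, so a failed read leaves dst unchanged.
      if (position < 0 || position + length > count) {
        throw new EOFException("read out of range");
      }
      dst.put(buf, (int) position, length);
    }
  }

  public static void main(String[] args) throws IOException {
    ByteBuffer dst = ByteBuffer.allocate(3);
    boolean used = tryReadFully(new DemoStream(new byte[] {10, 20, 30, 40, 50}, true), 1, dst);
    System.out.println(used + " " + dst.get(0) + "," + dst.get(1) + "," + dst.get(2));
  }
}
```

Requiring both the interface and the declared capability is a deliberate choice in this sketch: a wrapper class may implement the interface unconditionally while the stream it wraps does not support the call.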

  • Use StreamCapabilities to look for ByteBufferReadable.

For detecting ByteBufferReadable, use this probe, falling back to the recursive scan.
All streams in the Hadoop codebase will report this via StreamCapabilities, but there
may be some third-party streams which do not.
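
The probe-then-fallback order described above could look roughly like the following. This is a sketch under assumptions, not the HadoopStreams implementation: the interfaces are local stand-ins modelled on Hadoop's `StreamCapabilities` and `ByteBufferReadable`, the capability string is assumed to match Hadoop's `StreamCapabilities.READBYTEBUFFER`, and `WrappingStream` is a hypothetical stand-in for a wrapper such as `FSDataInputStream` that exposes its inner stream.

```java
import java.io.ByteArrayInputStream;
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.ByteBuffer;

final class ByteBufferReadableProbeSketch {

  /** Local stand-in for org.apache.hadoop.fs.StreamCapabilities. */
  interface StreamCapabilities {
    boolean hasCapability(String capability);
  }

  /** Local stand-in for org.apache.hadoop.fs.ByteBufferReadable. */
  interface ByteBufferReadable {
    int read(ByteBuffer buf) throws IOException;
  }

  /** Assumed to match Hadoop's StreamCapabilities.READBYTEBUFFER. */
  static final String READBYTEBUFFER = "in:readbytebuffer";

  /** Hypothetical wrapper that exposes the stream it wraps. */
  static class WrappingStream extends FilterInputStream {
    WrappingStream(InputStream in) {
      super(in);
    }

    InputStream getWrappedStream() {
      return in;
    }
  }

  /** Probe the declared capability first; otherwise scan the wrapped streams. */
  static boolean isByteBufferReadable(InputStream stream) {
    if (stream instanceof StreamCapabilities
        && ((StreamCapabilities) stream).hasCapability(READBYTEBUFFER)) {
      return true;
    }
    return scanWrapped(stream);
  }

  /** Fallback for streams that do not report capabilities: unwrap recursively. */
  private static boolean scanWrapped(InputStream stream) {
    if (stream instanceof WrappingStream) {
      return scanWrapped(((WrappingStream) stream).getWrappedStream());
    }
    return stream instanceof ByteBufferReadable;
  }

  /** Hypothetical third-party stream: implements the API but declares nothing. */
  static class SilentStream extends ByteArrayInputStream implements ByteBufferReadable {
    SilentStream(byte[] data) {
      super(data);
    }

    @Override
    public int read(ByteBuffer dst) {
      if (!dst.hasRemaining()) {
        return 0;
      }
      int n = Math.min(dst.remaining(), count - pos);
      if (n <= 0) {
        return -1; // end of stream
      }
      dst.put(buf, pos, n);
      pos += n;
      return n;
    }
  }

  /** Hypothetical stream that also reports the capability. */
  static final class DeclaringStream extends SilentStream implements StreamCapabilities {
    DeclaringStream(byte[] data) {
      super(data);
    }

    @Override
    public boolean hasCapability(String capability) {
      return READBYTEBUFFER.equals(capability);
    }
  }

  public static void main(String[] args) {
    byte[] data = {1, 2, 3};
    System.out.println(isByteBufferReadable(new DeclaringStream(data)));
    System.out.println(isByteBufferReadable(new WrappingStream(new SilentStream(data))));
    System.out.println(isByteBufferReadable(new WrappingStream(new ByteArrayInputStream(data))));
  }
}
```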

Component(s)

No response

@steveloughran
Contributor Author

I'm implementing this, with tests.

steveloughran added a commit to steveloughran/parquet-mr that referenced this issue Nov 27, 2024
Based on the H2 stream test suite but
* parameterized for on/off heap
* expect no changes in buffer contents on out of range reads.

Still one test failure.
steveloughran added a commit to steveloughran/parquet-mr that referenced this issue Dec 3, 2024
steveloughran added a commit to steveloughran/parquet-mr that referenced this issue Dec 3, 2024
Based on the H2 stream test suite but
* parameterized for on/off heap
* expect no changes in buffer contents on out of range reads.

Still one test failure.
steveloughran added a commit to steveloughran/parquet-mr that referenced this issue Dec 3, 2024
* changing how stream capabilities are set up and queried,
  makes it easy to generate streams with different declared
  behaviours.
* pull out common assertions
* lots of javadoc of what each test case is trying to do.

+ all the tests are happy.