GH-39419: [C++][Parquet] Style: Using arrow::Buffer data_as api rather than reinterpret_cast #39420

Merged: 3 commits, Jan 5, 2024

mapleFU (Member) commented Jan 2, 2024

Rationale for this change

This patch uses the {mutable_}data_as<T>() API to replace reinterpret_cast<{const} T*>. It is purely a style fix.

What changes are included in this PR?

Only API replacements for ::arrow::Buffer:

  • reinterpret_cast<T*> -> mutable_data_as<T>()
  • reinterpret_cast<const T*> -> data_as<T>()
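
As an illustration of the two replacements above, here is a minimal sketch that does not depend on Arrow. `MiniBuffer` is a hypothetical stand-in for `::arrow::Buffer` that models only `data()`, `mutable_data()`, `data_as<T>()` and `mutable_data_as<T>()`, with the typed accessors assumed to be thin wrappers around the cast. The helper functions `FillWithCast`, `FillWithAccessor` and `Sum` are likewise invented for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for ::arrow::Buffer. Only the accessors relevant to
// this change are modeled; the real class has many more members.
class MiniBuffer {
 public:
  explicit MiniBuffer(std::size_t size_in_bytes) : bytes_(size_in_bytes) {}

  const std::uint8_t* data() const { return bytes_.data(); }
  std::uint8_t* mutable_data() { return bytes_.data(); }

  // Read-only typed view: wraps reinterpret_cast<const T*>(data()).
  template <typename T>
  const T* data_as() const {
    return reinterpret_cast<const T*>(data());
  }

  // Writable typed view: wraps reinterpret_cast<T*>(mutable_data()).
  template <typename T>
  T* mutable_data_as() {
    return reinterpret_cast<T*>(mutable_data());
  }

 private:
  std::vector<std::uint8_t> bytes_;
};

// Old style: the cast is written out at the call site.
inline void FillWithCast(MiniBuffer& buf, int count) {
  auto values = reinterpret_cast<std::int32_t*>(buf.mutable_data());
  for (int i = 0; i < count; ++i) values[i] = i * 10;
}

// New style: the typed accessor hides the cast.
inline void FillWithAccessor(MiniBuffer& buf, int count) {
  auto* values = buf.mutable_data_as<std::int32_t>();
  for (int i = 0; i < count; ++i) values[i] = i * 10;
}

// New style, read-only side.
inline std::int64_t Sum(const MiniBuffer& buf, int count) {
  const auto* values = buf.data_as<std::int32_t>();
  std::int64_t total = 0;
  for (int i = 0; i < count; ++i) total += values[i];
  return total;
}
```

Both fill functions write the same bytes; the only difference is where the cast is spelled out.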

Also, for auto {variable_name} = reinterpret_cast<{const} T*>( ... ), I changed the declaration to:

  1. const auto* for data_as<T>().
  2. auto* for mutable_data_as<T>()

This does not change the semantics, but it makes the code more readable.
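
The claim that spelling out `const auto*` or `auto*` leaves the deduced type unchanged can be checked at compile time. The sketch below uses two hypothetical functions, `ReadOnly` and `Writable`, whose return types are assumed to mirror those of `data_as<T>()` and `mutable_data_as<T>()`; `Value` and `DeductionDemo` are also invented for this example.

```cpp
#include <cassert>
#include <type_traits>

struct Value { int id; };  // hypothetical element type

// Hypothetical accessors: same return types as data_as<T>() and
// mutable_data_as<T>() are assumed to have.
inline const Value* ReadOnly(const Value* p) { return p; }
inline Value* Writable(Value* p) { return p; }

inline int DeductionDemo() {
  Value storage[2] = {{1}, {2}};

  auto a = ReadOnly(storage);         // deduced as const Value*
  const auto* b = ReadOnly(storage);  // same type, but the pointer is visible
  static_assert(std::is_same<decltype(a), decltype(b)>::value,
                "plain auto and const auto* deduce the same type here");
  static_assert(std::is_same<decltype(a), const Value*>::value,
                "the pointee stays const in both spellings");

  auto c = Writable(storage);   // deduced as Value*
  auto* d = Writable(storage);  // same type, pointer made explicit
  static_assert(std::is_same<decltype(c), decltype(d)>::value,
                "plain auto and auto* deduce the same type here");

  d[1].id = 7;                // writes through the mutable pointer
  return a[0].id + b[1].id;   // reads through the read-only pointers
}
```

The equivalence holds because the accessor already returns a pointer to const; applying `const auto*` to the result of `Writable` would instead add a `const` that plain `auto` does not.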

Are these changes tested?

Not needed; this is a style-only change.

Are there any user-facing changes?

No.

mapleFU requested a review from wgtmac as a code owner January 2, 2024 05:13

github-actions bot commented Jan 2, 2024

⚠️ GitHub issue #39419 has been automatically assigned in GitHub to PR creator.

mapleFU (Member, Author) commented Jan 2, 2024

@github-actions crossbow submit -g cpp

github-actions bot commented Jan 2, 2024

Revision: f769b1c

Submitted crossbow builds: ursacomputing/crossbow @ actions-53d0242ebe

Task Status
test-alpine-linux-cpp GitHub Actions
test-build-cpp-fuzz GitHub Actions
test-conda-cpp GitHub Actions
test-conda-cpp-valgrind Azure
test-cuda-cpp GitHub Actions
test-debian-11-cpp-amd64 GitHub Actions
test-debian-11-cpp-i386 GitHub Actions
test-fedora-38-cpp GitHub Actions
test-ubuntu-20.04-cpp GitHub Actions
test-ubuntu-20.04-cpp-bundled GitHub Actions
test-ubuntu-20.04-cpp-minimal-with-formats GitHub Actions
test-ubuntu-20.04-cpp-thread-sanitizer GitHub Actions
test-ubuntu-22.04-cpp GitHub Actions
test-ubuntu-22.04-cpp-20 GitHub Actions
test-ubuntu-22.04-cpp-no-threading GitHub Actions

mapleFU (Member, Author) commented Jan 2, 2024

cc @felipecrv would you mind taking a look?

kou (Member) left a comment

+1

cpp/src/parquet/encoding.cc
@@ -2007,7 +2001,7 @@ class DictByteArrayDecoderImpl : public DictDecoderImpl<ByteArrayType>,
     // space for binary data.
     RETURN_NOT_OK(helper.Prepare());

-    auto dict_values = reinterpret_cast<const ByteArray*>(dictionary_->data());
+    const auto* dict_values = dictionary_->data_as<ByteArray>();
Member

Do we need to specify const and * explicitly here?

Member Author

No, const auto* v = ... is equivalent to auto v = ... here. It is only a matter of style.

Let me update it.

Contributor

I like auto* because it communicates that the assignment is just a pointer assignment and not some expensive copy/move operation. It also creates a nice symmetry with auto&, which requires the & because otherwise the value is copied.
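
To illustrate the symmetry described in this comment, here is a small self-contained sketch. `Table`, `Rows` and `CopyVersusReferenceDemo` are hypothetical names invented for the example; nothing here comes from the Arrow code base.

```cpp
#include <cassert>
#include <vector>

struct Table { std::vector<int> rows; };  // hypothetical container

inline std::vector<int>& Rows(Table& t) { return t.rows; }

inline bool CopyVersusReferenceDemo() {
  Table t{{1, 2, 3}};

  auto copy = Rows(t);   // no &: deduces std::vector<int> and copies it
  auto& ref = Rows(t);   // the & is required to bind to the original
  auto* ptr = &Rows(t);  // the * documents that only a pointer is stored

  copy.push_back(4);  // changes the copy only
  ref.push_back(5);   // changes t.rows
  ptr->push_back(6);  // changes t.rows

  return t.rows.size() == 5 && copy.size() == 4;
}
```

With `auto&` the marker is mandatory, while with `auto*` it is optional for a pointer-returning call; writing it in both cases keeps cheap aliases visually distinct from copies.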

github-actions bot added the "awaiting merge" label and removed the "awaiting review" label Jan 2, 2024
mapleFU (Member, Author) commented Jan 2, 2024

Updated: I changed auto to auto* and const auto*. I think there is no difference between them, and the * is more readable to me.

I will wait a few days to see whether others like this style; I am glad to change it if not.

felipecrv (Contributor) left a comment

LGTM, including the const auto* and all.

mapleFU (Member, Author) commented Jan 3, 2024

I will merge it before Friday if there are no objections.

mapleFU merged commit 01deb94 into apache:main Jan 5, 2024
29 checks passed
mapleFU removed the "awaiting merge" label Jan 5, 2024
mapleFU deleted the parquet-using-buffer-as-api branch January 5, 2024 15:45

After merging your PR, Conbench analyzed the 6 benchmarking runs that have been run so far on merge-commit 01deb94.

There were 4 benchmark results indicating a performance regression.

The full Conbench report has more details. It also includes information about 2 possible false positives for unstable benchmarks that are known to sometimes produce them.

clayburn pushed a commit to clayburn/arrow that referenced this pull request Jan 23, 2024
… rather than reinterpret_cast (apache#39420)

* Closes: apache#39419

Authored-by: mwish
Signed-off-by: mwish
dgreiss pushed a commit to dgreiss/arrow that referenced this pull request Feb 19, 2024
… rather than reinterpret_cast (apache#39420)

zanmato1984 pushed a commit to zanmato1984/arrow that referenced this pull request Feb 28, 2024
… rather than reinterpret_cast (apache#39420)

Linked issue: [C++][Parquet] Minor: Using arrow::Buffer "data_as" api to replace the reinterpret_cast