Skip to content

feat: metadata access support for table #111

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 23 commits into from
Jun 27, 2025
Merged

Conversation

lishuxu
Copy link
Contributor

@lishuxu lishuxu commented May 27, 2025

No description provided.

@lishuxu lishuxu closed this May 27, 2025
@lishuxu lishuxu reopened this May 27, 2025
@lishuxu lishuxu force-pushed the feature/my-feature branch from fc2ca57 to 6a26f30 Compare May 29, 2025 02:07
/// \brief Return the schema for this table
virtual const std::shared_ptr<Schema>& schema() const = 0;
/// \brief Return the schema for this table, return NotFoundError if not found
virtual Result<std::shared_ptr<Schema>> schema() const = 0;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
virtual Result<std::shared_ptr<Schema>> schema() const = 0;
virtual Result<std::shared_ptr<Schema>> Schema() const = 0;

These functions are no longer trivial accessors. We need to rename them with capitalized initial.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The method name schema will conflicts with the Schema object name, and since this is actually a lazy getter, I think it can be treated as an accessor.

/// \brief Return the partition spec for this table
virtual const std::shared_ptr<PartitionSpec>& spec() const = 0;
/// \brief Return the partition spec for this table, return NotFoundError if not found
virtual Result<std::shared_ptr<PartitionSpec>> spec() const = 0;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
virtual Result<std::shared_ptr<PartitionSpec>> spec() const = 0;
virtual Result<std::shared_ptr<PartitionSpec>> Spec() const = 0;

/// \brief Return the sort order for this table
virtual const std::shared_ptr<SortOrder>& sort_order() const = 0;
/// \brief Return the sort order for this table, return NotFoundError if not found
virtual Result<std::shared_ptr<SortOrder>> sort_order() const = 0;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
virtual Result<std::shared_ptr<SortOrder>> sort_order() const = 0;
virtual Result<std::shared_ptr<SortOrder>> SortOrder() const = 0;

/// \brief Return the table's current snapshot
virtual const std::shared_ptr<Snapshot>& current_snapshot() const = 0;
/// \brief Return the table's current snapshot, or NotFoundError if not found
virtual Result<std::shared_ptr<Snapshot>> current_snapshot() const = 0;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
virtual Result<std::shared_ptr<Snapshot>> current_snapshot() const = 0;
virtual Result<std::shared_ptr<Snapshot>> CurrentSnapshot() const = 0;

Copy link
Member

@wgtmac wgtmac left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think there are still issues to address. Please also rebase this PR on the latest main branch to resolve the conflict.

BaseTable::BaseTable(std::string name, std::shared_ptr<TableMetadata> metadata)
: name_(std::move(name)), metadata_(std::move(metadata)) {
if (!metadata_) {
throw std::invalid_argument("Table metadata cannot be null");
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is not resolved yet.

void BaseTable::InitSchema() const {
for (const auto& schema : metadata_->schemas) {
if (schema->schema_id()) {
schemas_map_.emplace(schema->schema_id().value(), schema);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think these changes are still required

@lishuxu lishuxu changed the title static table metadata access support feat: static table metadata access support Jun 3, 2025
@lishuxu lishuxu force-pushed the feature/my-feature branch 2 times, most recently from 50c93a4 to 6aef4b4 Compare June 3, 2025 02:50
@lishuxu lishuxu force-pushed the feature/my-feature branch from 5304adf to 345355a Compare June 3, 2025 03:47
@lishuxu lishuxu changed the title feat: static table metadata access support feat: metadata access support for table Jun 18, 2025
}

const std::shared_ptr<PartitionSpec>& Table::spec() const {
std::call_once(init_partition_spec_once_, [this]() {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd suggest to follow similar implementation as Table::schema(). In this function, we can cache and return PartitionSpec::Unpartitioned() when missing.

/// \brief Return the table's current snapshot
virtual const std::shared_ptr<Snapshot>& current_snapshot() const = 0;
/// \brief Return the table's current snapshot, return null if not found
std::shared_ptr<Snapshot> current_snapshot() const;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
std::shared_ptr<Snapshot> current_snapshot() const;
const std::shared_ptr<Snapshot>& current_snapshot() const;

Every table should have at least one snapshot. I think we can safely return a reference.

Copy link
Member

@wgtmac wgtmac left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1


const std::shared_ptr<std::unordered_map<int32_t, std::shared_ptr<Schema>>>&
Table::schemas() const {
if (!schemas_map_) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess there can be race conditions here and the same pattern below, no?

@@ -19,13 +19,14 @@

#pragma once

#include <memory>
#include <mutex>
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't see any usage of mutex, what's the intent here?

@Fokko
Copy link
Contributor

Fokko commented Jun 27, 2025

Let's move this in, thanks @lishuxu for working on this, and thanks @lidavidm, @zhjwpku, @wgtmac and @gty404 for the review 🚀

@Fokko Fokko merged commit bfb19f6 into apache:main Jun 27, 2025
7 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants