Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Standalone 8/N: Implement and test ThreadPool, S3Connection{Pool}. #300

Merged
merged 69 commits into from
Oct 1, 2024

Conversation

aliddell
Copy link
Member

Depends on #299.

Copy link
Collaborator

@shlomnissan shlomnissan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have no blocking comments here. However, there is some room for improvement.

#include <string_view>

namespace zarr {
const char*
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If this function is only called within a C++ context, consider returning a C++ string instead.

Comment on lines 17 to 20
uint8_t clevel;
uint8_t shuffle;

BloscCompressionParams();
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Consider removing the trivial constructor from the implementation file.

Suggested change
uint8_t clevel;
uint8_t shuffle;
BloscCompressionParams();
uint8_t clevel {1};
uint8_t shuffle {1};
BloscCompressionParams() = default;

I also suggest initializing the codec_id member to an empty string or making it an optional value.

class S3Connection
{
public:
explicit S3Connection(const std::string& endpoint,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Implicit conversions are not a risk in the usual sense because the constructor requires multiple arguments. Therefore, the explicit keyword is not strictly necessary here.

Copy link
Member Author

@aliddell aliddell Sep 30, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

explicit here is probably vestigial, from an earlier version of this class

@@ -0,0 +1,141 @@
#pragma once

#include <miniocpp/client.h>
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This class is a good candidate for the PIMPL idiom. Ideally, we should hide minio from the public interface.

* @returns True if the bucket exists, otherwise false.
* @throws std::runtime_error if the bucket name is empty.
*/
bool bucket_exists(std::string_view bucket_name);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Both bucket_exists and object_exists could benefit from the [[nodiscard]] attribute in their signatures. Alternatively, consider renaming them to is_bucket_exists and is_object_exists for improved clarity.

std::condition_variable cv_;
std::queue<JobT> jobs_;

std::atomic<bool> is_accepting_jobs_;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The same comment applies regarding the preference for inline initialization over the constructor's initializer list.


std::optional<ThreadPool::JobT> pop_from_job_queue_() noexcept;
[[nodiscard]] bool should_stop_() const noexcept;
void thread_worker_();
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This function name is confusing. I suggest renaming it to process_jobs() if that accurately reflects its implementation.


zarr::ThreadPool::ThreadPool(unsigned int n_threads,
std::function<void(const std::string&)> err)
: error_handler_{ err }
Copy link
Collaborator

@shlomnissan shlomnissan Sep 25, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

std::function can be costly in terms of performance. Moreover, it's common practice to use inline lambdas for scenarios like error callbacks. I recommend shifting to function objects to address these situations more efficiently.

: error_handler_ { std::move(err) }

Copy link
Member Author

@aliddell aliddell Sep 30, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Something like this?

class ErrorCallerbacker {
public:
    ErrorCallbacker(std::unique_ptr<ZarrStream_s> stream) : stream_(stream) {}
    int operator()(const std::string& s) const { stream_->set_error(s); }
private:
    int stream_;
};

The thread pool class would need to be templated I guess to handle different function objects.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ignore the last part of my comment. My primary point was moving the object instead of copying it.

Comment on lines 8 to 9
n_threads = std::clamp(
n_threads, 1u, std::max(std::thread::hardware_concurrency(), 1u));
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A clearer way to write this might be:

const auto max_threads = std::max(std::thread::hardware_concurrency(), 1u);
n_threads = std::clamp(n_threads, 1u, max_threads);

bool
zarr::ThreadPool::push_to_job_queue(JobT&& job)
{
std::unique_lock lock(jobs_mutex_);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A similar question arises about the need to mix different locking mechanisms. I'm not an expert when it comes to the locking API, so there might be a compelling reason for this approach that I'm overlooking.

@aliddell aliddell deleted the branch main September 27, 2024 17:47
@aliddell aliddell closed this Sep 27, 2024
@aliddell aliddell reopened this Sep 27, 2024
@aliddell aliddell changed the base branch from standalone-sequence-7 to main September 27, 2024 17:48
@aliddell aliddell merged commit 8b20b10 into main Oct 1, 2024
3 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants