feat(functions): Add support for REST based remote functions #10911

Open: wants to merge 1 commit into main

Joe-Abraham (Contributor) commented Sep 2, 2024

Fixes #11036

@facebook-github-bot added the "CLA Signed" label (managed by the Facebook bot; authors need to sign the CLA before a PR can be reviewed) on Sep 2, 2024
@Joe-Abraham changed the title from "Add support for REST based remote functions" to "[WIP] Add support for REST based remote functions" on Sep 2, 2024

netlify bot commented Sep 2, 2024

Deploy Preview for meta-velox canceled.

🔨 Latest commit: 97483aa
🔍 Latest deploy log: https://app.netlify.com/sites/meta-velox/deploys/675bf65783bdd90008633603

@Joe-Abraham force-pushed the udf branch 12 times, most recently from b88d136 to b85e0e6 on September 4, 2024 09:32
@Yuhta requested review from pedroerp and mbasmanova on September 4, 2024 18:28
@Joe-Abraham force-pushed the udf branch 6 times, most recently from 0cd4510 to 74023dc on September 9, 2024 08:10

pedroerp (Contributor) commented Sep 9, 2024

Pretty cool! I see the PR is still a draft, but I can help review when it's ready. It would also be nice to add some documentation on how to use it, the config parameters, etc.

@Joe-Abraham force-pushed the udf branch 3 times, most recently from abe87e1 to 6c1606e on September 13, 2024 05:06
@Joe-Abraham force-pushed the udf branch 3 times, most recently from 05115f4 to 2ffec26 on September 20, 2024 05:18
@Joe-Abraham changed the title from "[WIP] Add support for REST based remote functions" to "Add support for REST based remote functions" on Nov 4, 2024
@Joe-Abraham force-pushed the udf branch 5 times, most recently from a541614 to 8609347 on November 12, 2024 07:18
@Joe-Abraham force-pushed the udf branch 7 times, most recently from 4e0627d to b589f78 on November 19, 2024 08:06
@czentgr changed the title from "Add support for REST based remote functions" to "feat(functions): Add support for REST based remote functions" on Nov 19, 2024

czentgr (Collaborator) commented Nov 19, 2024

@Joe-Abraham There are build errors. Can you please take a look?
Also please note that the PR title needs adjusting to the new format. I updated it.

czentgr (Collaborator) left a comment

Initial review.

@@ -16,11 +16,23 @@ velox_add_library(velox_functions_remote_thrift_client ThriftClient.cpp)
velox_link_libraries(velox_functions_remote_thrift_client
PUBLIC remote_function_thrift FBThrift::thriftcpp2)

find_package(CURL REQUIRED)

Collaborator

Curl comes in through a separate dependency, but you can reuse this:

set(curl_SOURCE BUNDLED)
velox_resolve_dependency(curl)

This should set up Curl to be used as a dependency for use in velox_link_libraries.
You can look at other dependencies that are introduced like this in the main CMake.

Contributor Author

Updated the code.

#include "velox/vector/VectorStream.h"

#include <fmt/format.h>

Collaborator

System includes need to go before velox dependencies in alphabetical order (except for the header on line 17 that is for this cpp file).

Contributor Author

Updated the code.

#include <sstream>
#include <string>

#include "RestClient.h"

Collaborator

This needs to be a path relative to the velox root directory, and it should then go into the alphabetical list of velox includes.

Contributor Author

Updated the code.

@@ -24,3 +24,18 @@ add_executable(velox_functions_remote_server_main RemoteFunctionServiceMain.cpp)
target_link_libraries(
velox_functions_remote_server_main velox_functions_remote_server
velox_functions_prestosql)

add_library(velox_functions_remote_server_rest RemoteFunctionRestService.cpp)

Collaborator

Ok, so the idea is that a user can implement Velox functions, add them to this REST server, and serve them up for execution through a remote Velox engine?

Contributor Author

Yes, that's the idea. It is also used by the test file, similar to the ThriftServer.

Joe-Abraham (Contributor Author)

@aditi-pandit Can you please review the changes?

- /// (non-remote) function registered with the same name. The `overwrite` flag
- /// controls whether to overwrite in these cases.
+ /// (non-remote) function registered with the same name. The `overwrite`
+ /// flagwrite controls whether to overwrite in these cases.

Collaborator

Nit: wording. The "write" in "flagwrite" is probably not needed here.

#include <folly/io/async/EventBase.h>
#include <sstream>
#include <string>
#include "velox/common/memory/ByteStream.h"

Collaborator

Add empty line between the system and velox includes.

Contributor Author

Added a new line.

#include "velox/vector/VectorStream.h"

#include "velox/functions/remote/client/RestClient.h"

Collaborator

Move this include to the correct alphabetical order in the previous velox includes.

Contributor Author

Corrected the include list.

- /// Network address of the servr to communicate with. Note that this can hold
- /// a network location (ip/port pair) or a unix domain socket path (see
+ /// URL of the HTTP/REST server for remote function.
+ /// Or Network address of the servr to communicate with. Note that this can

Collaborator

Nit: spelling, "server".

Contributor Author

Corrected it

}
size_t writeCallback(char* ptr, size_t size, size_t nmemb, void* userdata) {
auto* outputBuf = static_cast<IOBufQueue*>(userdata);
size_t total_size = size * nmemb;

Collaborator

Nit: use camelCase naming -> totalSize

Contributor Author

Corrected it

return totalCopied;
}
size_t writeCallback(char* ptr, size_t size, size_t nmemb, void* userdata) {
auto* outputBuf = static_cast<IOBufQueue*>(userdata);

Collaborator

Use camel case "userData"

Contributor Author

Corrected it
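
For reference, the two naming nits above apply to a callback with the shape libcurl expects for CURLOPT_WRITEFUNCTION. Below is a minimal sketch with camelCase names. It is not the PR's actual code: it assumes a std::string as the output buffer instead of folly::IOBufQueue so that it compiles on its own.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical stand-in for the PR's writeCallback. The parameter list
// mirrors libcurl's CURLOPT_WRITEFUNCTION contract; the std::string sink is
// an assumption made to keep the example self-contained.
std::size_t writeCallback(
    char* ptr,
    std::size_t size,
    std::size_t nmemb,
    void* userData) {
  assert(userData != nullptr);
  auto* outputBuf = static_cast<std::string*>(userData);
  // libcurl delivers size * nmemb bytes per invocation.
  const std::size_t totalSize = size * nmemb;
  outputBuf->append(ptr, totalSize);
  // Returning the number of bytes consumed tells libcurl that all of the
  // data was handled.
  return totalSize;
}
```

libcurl treats a return value different from size * nmemb as an error, which is why the sketch returns totalSize only after appending every byte.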

using namespace folly;
namespace facebook::velox::functions {
namespace {
size_t readCallback(char* dest, size_t size, size_t nmemb, void* userp) {

Collaborator

Please write comments explaining the signature and the parameters.

Contributor Author

Added the documentation
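
As an illustration of the kind of comment requested above, here is a hedged sketch of a documented read callback that follows the CURLOPT_READFUNCTION contract. ReadState is a hypothetical helper introduced only for this example; the state type and the documentation actually added in the PR may differ.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>

// Hypothetical cursor over the request payload (not a type from the PR).
struct ReadState {
  std::string payload;
  std::size_t offset = 0;
};

/// Supplies request body bytes to libcurl (CURLOPT_READFUNCTION shape).
/// @param dest  Buffer owned by libcurl that receives the next chunk.
/// @param size  Size in bytes of one element.
/// @param nmemb Number of elements; dest holds at most size * nmemb bytes.
/// @param userp Opaque pointer set by the caller, here a ReadState*.
/// @return Number of bytes copied into dest; 0 signals end of data.
std::size_t readCallback(
    char* dest,
    std::size_t size,
    std::size_t nmemb,
    void* userp) {
  assert(userp != nullptr);
  auto* state = static_cast<ReadState*>(userp);
  const std::size_t capacity = size * nmemb;
  const std::size_t remaining = state->payload.size() - state->offset;
  const std::size_t toCopy = std::min(capacity, remaining);
  if (toCopy > 0) {
    std::memcpy(dest, state->payload.data() + state->offset, toCopy);
    state->offset += toCopy;
  }
  return toCopy;
}
```

Keeping the cursor in a separate struct lets the same payload be sent in several chunks when libcurl's buffer is smaller than the request body.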


class RestClient : public HttpClient {
public:
std::unique_ptr<folly::IOBuf> performCurlRequest(

Collaborator

Please write the documentation for this API. What is it for? What do the parameters mean?

Contributor Author

Added the documentation

memory::memoryManager()->addLeafPool()};
};

class listener : public std::enable_shared_from_this<listener> {

Collaborator

Can you add some documentation about these classes and what they are for?

Contributor Author

Added documentation

// called to use the functions mentioned in this map
};

TypePtr deserializeType(const std::string& input) {

Collaborator

There is a lot of repetition between this code and https://github.com/facebookincubator/velox/blob/main/velox/functions/remote/server/RemoteFunctionService.cpp. Can you please refactor?

Contributor Author

Introduced RemoteFunctionHelper.h and moved the duplicated code there.

5 participants