
feat: Add support for registering components dynamically #11439

Closed
wants to merge 1 commit into from


soumiiow
Contributor

@soumiiow soumiiow commented Nov 5, 2024

Allow users to dynamically register Velox components. Clients such as Presto can use this feature to dynamically load User Defined Functions, connectors, and types.

Based off: https://github.com/facebookincubator/velox/pull/1005/files

@facebook-github-bot facebook-github-bot added the CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed. label Nov 5, 2024

netlify bot commented Nov 5, 2024

Deploy Preview for meta-velox canceled.

Name Link
🔨 Latest commit 542dd28
🔍 Latest deploy log https://app.netlify.com/sites/meta-velox/deploys/67ae9129d5bcfe000887fb9c

@Yuhta Yuhta requested a review from pedroerp November 5, 2024 15:40
@pedroerp
Contributor

pedroerp commented Nov 5, 2024

@soumiiow thanks for looking into this. Out of curiosity, why doesn't this work on macOS?

@@ -15,6 +15,7 @@ add_subdirectory(base)
add_subdirectory(caching)
add_subdirectory(compression)
add_subdirectory(config)
add_subdirectory(dynamicRegistry)
Contributor

We use snake case for directory names, e.g. "dynamic_registry".

Contributor

@pedroerp pedroerp left a comment

Very cool! A few small comments, but overall it looks good.

#include <dlfcn.h>
#include <iostream>
#include "velox/common/base/Exceptions.h"
namespace facebook::velox {
Contributor

nit: new line before namespace definition.

VELOX_USER_FAIL("Couldn't find Velox registry symbol: {}", error);
}
registryItem();
std::cout << "LOADED DYLLIB 1" << std::endl;
Contributor

for consistency, could you use LOG(INFO) and print the file name / path of the library loaded?


static constexpr const char* kSymbolName = "registry";

void loadDynamicLibraryFunctions(const char* fileName) {
Contributor

We can probably omit "Functions" from the name, since this can be used to load really anything, as long as you provide the registration functions. Let's name it loadDynamicLibrary().
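
To illustrate why the generic name fits, here is a minimal sketch of such a loader. It is not the PR's implementation: `std::runtime_error` stands in for `VELOX_USER_FAIL`, and logging is omitted. The loader only knows that the library exports a `registry` entry point; what that entry point registers is up to the library.

```cpp
#include <dlfcn.h>

#include <stdexcept>
#include <string>

// Name of the entry point every library is expected to export (from the PR).
static constexpr const char* kSymbolName = "registry";

// Sketch only: std::runtime_error is an assumed stand-in for VELOX_USER_FAIL.
void loadDynamicLibrary(const char* fileName) {
  // Open the shared library and resolve all of its symbols immediately.
  void* handle = dlopen(fileName, RTLD_NOW);
  if (handle == nullptr) {
    const char* error = dlerror();
    throw std::runtime_error(
        std::string("Error while loading shared library: ") +
        (error != nullptr ? error : "unknown error"));
  }

  // Look up the entry point by its unmangled name.
  void* symbol = dlsym(handle, kSymbolName);
  if (symbol == nullptr) {
    dlclose(handle);
    throw std::runtime_error(
        std::string("Couldn't find Velox registry symbol: ") + kSymbolName);
  }

  // Invoke the entry point. The handle is intentionally left open so that
  // whatever the library registered stays mapped in the process.
  auto entryPoint = reinterpret_cast<void (*)()>(symbol);
  entryPoint();
}
```

Calling `loadDynamicLibrary()` with a path that does not exist throws instead of crashing, which is the behavior a caller scanning a plugin directory would rely on.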

### 1. Create a cpp file for your dynamic library
For dynamically loaded function registration, the format followed is mirrored of that of built-in function registration with some noted differences. Using [MyDynamicTestFunction.cpp](tests/MyDynamicTestFunction.cpp) as an example, the function uses the extern "C" keyword to protect against name mangling. A registry() function call is also necessary here.

### 2. Register functions dynamically by creating .dylib or .so shared libraries and dropping them in a plugin directory
Contributor

nit: the titles are too long; maybe just add the docs as a regular numbered list?

Contributor Author

I pushed it out without the title formatting but does this look a bit cluttered now?
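
For readers following the README steps under discussion, the shape of such a library source file can be sketched as below. This is an illustration under assumptions, not the contents of `MyDynamicTestFunction.cpp`: `hostRegistry()`, `UnaryFn`, and `dynamic_plus_five` are hypothetical stand-ins for Velox's real registration APIs. The parts taken from the README are the `extern "C"` linkage and the `registry()` entry point.

```cpp
#include <cstdint>
#include <functional>
#include <map>
#include <string>

// Hypothetical stand-in for the host's function registry; a real library
// would call Velox's registration APIs instead.
using UnaryFn = std::function<int64_t(int64_t)>;

std::map<std::string, UnaryFn>& hostRegistry() {
  static std::map<std::string, UnaryFn> functions;
  return functions;
}

namespace {
// The user-defined function this library contributes.
int64_t plusFive(int64_t x) {
  return x + 5;
}
} // namespace

extern "C" {
// extern "C" keeps the name unmangled, so the loader can find it with
// dlsym(handle, "registry").
void registry() {
  hostRegistry()["dynamic_plus_five"] = plusFive;
}
}
```

Without `extern "C"`, the exported symbol would carry a compiler-specific mangled name and a lookup of the plain string `registry` would fail.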

auto signaturesBefore = getFunctionSignatures().size();

// Function does not exist yet.
EXPECT_THROW(dynamicFunction(0), VeloxUserError);
Contributor

can you use VELOX_ASSERT_THROW() instead to validate the right exception is being thrown?
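
The point of the suggestion is that a type-only check passes for any error of that type, while a message check pins down which error was raised. A minimal sketch of the idea, without GoogleTest or Velox: `throwsWithMessage` and `callUnregistered` are hypothetical helpers, and the error text is made up for illustration.

```cpp
#include <functional>
#include <stdexcept>
#include <string>

// Returns true only if `fn` throws std::runtime_error and the message
// contains `expected`. A type-only check would skip the second condition.
bool throwsWithMessage(
    const std::function<void()>& fn,
    const std::string& expected) {
  try {
    fn();
  } catch (const std::runtime_error& e) {
    return std::string(e.what()).find(expected) != std::string::npos;
  }
  return false; // Nothing was thrown.
}

// Hypothetical call that fails the way an unregistered function might.
void callUnregistered() {
  throw std::runtime_error("Scalar function doesn't exist: dynamic");
}
```

With this helper, `throwsWithMessage(callUnregistered, "doesn't exist")` holds, while the same call with an unrelated message does not, even though an exception of the right type was thrown in both cases.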

# `MyDynamicFunction.cpp` as a small .so library, and use the
# MY_DYNAMIC_FUNCTION_LIBRARY_PATH macro to locate the .so binary.
add_compile_definitions(
MY_DYNAMIC_FUNCTION_LIBRARY_PATH="${CMAKE_CURRENT_BINARY_DIR}")
Contributor

please vendor the macro. Maybe something like VELOX_TEST_DYNAMIC_LIBRARY_PATH

Comment on lines 22 to 23
${GMock}
${GTEST_BOTH_LIBRARIES})
Collaborator

please use the GTest:: targets

# To test functions being added by dynamically linked libraries, we compile
# `MyDynamicFunction.cpp` as a small .so library, and use the
# VELOX_TEST_DYNAMIC_LIBRARY_PATH macro to locate the .so binary.
add_compile_definitions(
Collaborator

Please use target_compile_definitions() on the relevant target instead.
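
On the C++ side, the test would consume the definition roughly as sketched here. The macro name comes from the review above; the fallback value, `kLibrarySuffix`, `testLibraryPath`, and the library base name are assumptions added so the sketch compiles on its own.

```cpp
#include <string>

// Normally injected by the build system as a compile definition on the test
// target. The fallback below exists only to keep this sketch self-contained.
#ifndef VELOX_TEST_DYNAMIC_LIBRARY_PATH
#define VELOX_TEST_DYNAMIC_LIBRARY_PATH "."
#endif

// Shared libraries use a different extension on macOS than on Linux.
#ifdef __APPLE__
static constexpr const char* kLibrarySuffix = ".dylib";
#else
static constexpr const char* kLibrarySuffix = ".so";
#endif

// `libraryName` is a hypothetical base name, e.g. "libmy_dynamic_function".
std::string testLibraryPath(const std::string& libraryName) {
  return std::string(VELOX_TEST_DYNAMIC_LIBRARY_PATH) + "/" + libraryName +
      kLibrarySuffix;
}
```

Scoping the definition to one target keeps the macro out of every other translation unit in the directory, which is the motivation for the request.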

if(${VELOX_BUILD_TESTING})
add_subdirectory(tests)
endif()
velox_add_library(velox_dynamic_function_loader DynamicLibraryLoader.cpp)
Collaborator

Suggested change
- velox_add_library(velox_dynamic_function_loader DynamicLibraryLoader.cpp)
+ velox_add_library(velox_dynamic_function_loader DynamicLibraryLoader.cpp)
+ velox_link_libraries(velox_dynamic_function_loader PRIVATE velox_exception)

Collaborator

@aditi-pandit aditi-pandit left a comment

Thanks @soumiiow. I had a bunch of minor comments, plus a bigger one around testing.


// Lookup the symbol.
void* registrySymbol = dlsym(handler, kSymbolName);
auto registryItem = reinterpret_cast<void (*)()>(registrySymbol);
Collaborator

Nit: rename to registryFunction

Contributor Author

Hey! I got some previous feedback to stay away from "registryFunction" in the naming, so that it doesn't seem like this library is to be used exclusively for functions, and to move away from our initial design, which was made with only function loading in mind. Would there be a better name for this variable than the word "item"? I can only really think of registryItem or registryPtr, but would love to hear your suggestions too.

Collaborator

@soumiiow: To me this is almost like the "main" function in an executable program. How about "loadLibrary", "loadUserLibrary", or "enterUserLibrary"? There could be code beyond registration here as well.

if (error != nullptr) {
VELOX_USER_FAIL("Couldn't find Velox registry symbol: {}", error);
}
registryItem();
Collaborator

Add a comment "Invoke the registry function"

target_link_libraries(name_of_dynamic_fn PRIVATE xsimd fmt::fmt velox_expression)
```

3. In the Prestissimo worker's config.properties file, set the plugin.dir property
Collaborator

This is not relevant in Velox. Also, since it's not used anywhere in the current code, it's hard to picture how this fits in.

@@ -0,0 +1,22 @@
# Velox: Dynamically Loading Registry Libraries in C++

This library adds the ability to load User Defined Functions (UDFs), connectors, or types without having to fork and build Prestissimo, through the use of shared libraries that a Prestissimo worker can access. These are to be loaded on launch of the Presto server. The Presto server searches for any .so or .dylib files and loads them using this library.
Collaborator

It might be good to not talk about Prestissimo in this README.

This is a generic utility for dynamically loading a "registry" function from a library. It's sufficient to just say that this is for "Extensibility" features that add custom user code, which could include new Velox types, functions, operators and connectors.

Collaborator

@aditi-pandit aditi-pandit left a comment

@soumiiow : Thanks for updating your tests. Have a bunch of questions.

@mohsaka
Contributor

mohsaka commented Jan 23, 2025

@soumiiow Addressed all of the unaddressed comments and made all of the changes for compilation on mac in this PR
#12111

Specifically this commit,
ac18115

@soumiiow soumiiow changed the title from "Dynamically Linked Library in CPP" to "feat: Dynamically Linked Library in CPP" Jan 27, 2025
@mohsaka mohsaka force-pushed the velox-dylib branch 4 times, most recently from e06974a to 3bbd8db Compare February 12, 2025 00:13
Contributor

@pedroerp pedroerp left a comment

That's awesome. Thanks for adding support for this.

Made a few small comments, but overall it looks good to me. Feel free to tag "ready-to-merge" when they are addressed and I'll get it merged.

///
/// Loading a library twice can cause a components to be registered twice.
/// This can fail for certain Velox components.

Contributor

nit: remove this extra line
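
The caveat in the doc comment above can be pictured with a toy model. This is not Velox code: `StrictRegistry`, `registryEntryPoint`, and `my_connector` are hypothetical, and the sketch only assumes that some registries reject a name that is already present.

```cpp
#include <cstddef>
#include <set>
#include <stdexcept>
#include <string>

// Hypothetical strict registry: rejects a name that is already registered.
class StrictRegistry {
 public:
  void add(const std::string& name) {
    // std::set::insert reports whether the element was newly inserted.
    if (!names_.insert(name).second) {
      throw std::runtime_error("Already registered: " + name);
    }
  }

  std::size_t size() const {
    return names_.size();
  }

 private:
  std::set<std::string> names_;
};

// Stand-in for a library's entry point. Loading the same library twice
// would run this twice against the same registry.
void registryEntryPoint(StrictRegistry& registry) {
  registry.add("my_connector");
}
```

Running `registryEntryPoint` a second time on the same `StrictRegistry` throws, which mirrors the failure mode the comment warns about for components that do not tolerate duplicate registration.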

VELOX_USER_FAIL("Error while loading shared library: {}", dlerror());
}

LOG(INFO) << fmt::format(
Contributor

LOG(INFO) << "Loaded library " << fileName << ". Searching registry symbol " << kSymbolName;

"Loaded library {}. Finding registry symbol.", fileName);
// Lookup the symbol.
void* registrySymbol = dlsym(handler, kSymbolName);
auto loadUserLibrary = reinterpret_cast<void (*)()>(registrySymbol);
Contributor

do we also need to check if loadUserLibrary is nullptr?

Contributor

Added, as well as a short comment explaining how it can be nullptr, and that we can't actually use it when it's nullptr.
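
The distinction being discussed follows from how POSIX defines `dlsym`: a null return is not by itself an error, so failure is detected through `dlerror()`, and a symbol that resolves to null still cannot be called. A minimal sketch of that lookup, assuming `std::runtime_error` in place of `VELOX_USER_FAIL`; `lookupEntryPoint` and `EntryPoint` are hypothetical names, not the PR's.

```cpp
#include <dlfcn.h>

#include <stdexcept>
#include <string>

using EntryPoint = void (*)();

EntryPoint lookupEntryPoint(void* handle, const char* symbolName) {
  // Clear any stale error so the next dlerror() reflects only this lookup.
  dlerror();
  void* symbol = dlsym(handle, symbolName);

  // A non-null dlerror() means the lookup itself failed.
  if (const char* error = dlerror()) {
    throw std::runtime_error(std::string("Couldn't find symbol: ") + error);
  }

  // The lookup succeeded, but the symbol's value is null. Such a symbol
  // exists and yet cannot be invoked, so treat it as an error too.
  if (symbol == nullptr) {
    throw std::runtime_error(
        std::string("Symbol resolved to nullptr: ") + symbolName);
  }
  return reinterpret_cast<EntryPoint>(symbol);
}
```

Keeping the two checks separate lets the error message say whether the symbol was missing or merely unusable.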

Dynamic Loading of Velox Extensions
***********************************

This generic utility adds extensibility features to load User Defined Functions (UDFs) without having to fork and build Velox, through the use of shared libraries. Support for connectors and types will be added in the future.
Contributor

Can we also clarify that this only works if the builds are guaranteed to be based on the same Velox version and hence maintain ABI compatibility? This might not be immediately clear to users.

Contributor

Added. Thanks!

@mohsaka mohsaka force-pushed the velox-dylib branch 7 times, most recently from 2dbe57e to 2ccbceb Compare February 12, 2025 16:54
@majetideepak majetideepak added the ready-to-merge PR that have been reviewed and are ready for merging. PRs with this tag notify the Velox Meta oncall label Feb 12, 2025
@majetideepak
Collaborator

CI is green. @pedroerp can you please help import and merge? Thank you.

@majetideepak majetideepak changed the title from "feat: Dynamically Linked Library in CPP" to "feat: Add support for loading shared libraries" Feb 12, 2025
@majetideepak majetideepak changed the title from "feat: Add support for loading shared libraries" to "feat: Add support for registering components dynamically" Feb 12, 2025
…uccessful mac & linux compilation

Co-authored-by:  mohsaka <[email protected]>
@facebook-github-bot
Contributor

@pedroerp has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

@czentgr
Collaborator

czentgr commented Feb 14, 2025

@pedroerp the facebook builds&tools failed with a warning and the linter also reported a testing failure. Is this something that needs to be addressed in this PR? Thanks!

@mohsaka
Contributor

mohsaka commented Feb 14, 2025

> @pedroerp the facebook builds&tools failed with a warning and the linter also reported a testing failure. Is this something that needs to be addressed in this PR? Thanks!

@czentgr It's been fixed on the Meta side already. The README needed a newline at the end of the file. If we need the change here, I can make it. But I believe they merge over there and close the PR here, merging back to main here.

I don't want to trigger a full pipeline rerun due to this.

@facebook-github-bot
Contributor

@pedroerp merged this pull request in be9de86.


8 participants