Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support multiple interpreter instance? #302

Open
Gnimuc opened this issue Jun 2, 2024 · 8 comments
Open

Support multiple interpreter instance? #302

Gnimuc opened this issue Jun 2, 2024 · 8 comments

Comments

@Gnimuc
Copy link
Contributor

Gnimuc commented Jun 2, 2024

static std::unique_ptr<compat::Interpreter> sInterpreter;
// Valgrind complains about __cxa_pure_virtual called when deleting
// llvm::SectionMemoryManager::~SectionMemoryManager as part of the dtor chain
// of the Interpreter.
// This might fix the issue https://reviews.llvm.org/D107087
// FIXME: For now we just leak the Interpreter.
struct InterpDeleter {
~InterpDeleter() { sInterpreter.release(); }
} Deleter;
static compat::Interpreter& getInterp() {
assert(sInterpreter.get() && "Must be set before calling this!");
return *sInterpreter.get();
}
static clang::Sema& getSema() { return getInterp().getCI()->getSema(); }
static clang::ASTContext& getASTContext() { return getSema().getASTContext(); }

The interpreter is global and unique at the moment. However, clients do not always agree on the same compiler configuration. I think we need to add more APIs that take compat::Interpreter as argument. For example,

const char* GetResourceDirFromInterpreter(const compat::Interpreter* interp) {
    return interp->getCI()->getHeaderSearchOpts().ResourceDir.c_str();
}
@vgvassilev
Copy link
Contributor

vgvassilev commented Jun 2, 2024

We have thought about that and initially this was part of the design. However, this makes the interfaces quite ugly and we moved away from that approach.

Perhaps, we could have some list of interpreters and we could change the "active" interpreter. Would that work for your use case?

@Gnimuc
Copy link
Contributor Author

Gnimuc commented Jun 2, 2024

That could be a solution as well. At least, we need a setInterp() API to allow clients to switch the current interpreter. But it's tricky to do it right in a multi-threading context...

@vgvassilev
Copy link
Contributor

I agree... Do you have an implementation in mind?

@vgvassilev
Copy link
Contributor

I suspect we want to replace the static interpreter instance with an user-provided callback or something like that where the user is responsible for maintaining the top of the interpreter stack?

@Gnimuc
Copy link
Contributor Author

Gnimuc commented Jun 3, 2024

I agree... Do you have an implementation in mind?

No. I'm still experimenting with the library. I'm wondering how cppyy solves this problem.

Another way is to expose a C wrapper over the internal CppInterOp Interpreter, so users can manage those instances on their own. We only need to maintain a stable C API. The CppInterOpInterpreter.h can still be internal.

The following C++ APIs are just helper functions or wrappers over CppInterOpInterpreter.h's public methods. They are simple and short, so the C API version(with the ugly interpreter argument) can be easy to reimplement.

void AddSearchPath(const char *dir, bool isUser, bool prepend);
const char* GetResourceDir();
void AddIncludePath(const char *dir);
int Declare(const char* code, bool silent);
int Process(const char *code);
intptr_t Evaluate(const char *code, bool *HadError/*=nullptr*/);
std::string LookupLibrary(const char* lib_name);
bool LoadLibrary(const char* lib_stem, bool lookup);
void UnloadLibrary(const char* lib_stem);
std::string SearchLibrariesForSymbol(const char* mangled_name, bool search_system /*true*/);
std::string ObjToString(const char *type, void *obj);
void Destruct(TCppObject_t This, TCppScope_t scope, bool withFree /*=true*/);
TCppFuncAddr_t GetFunctionAddress(const char* mangled_name);
CPPINTEROP_API JitCall MakeFunctionCallable(TCppConstFunction_t func);

However, these two APIs are not trivial to reimplement.

intptr_t GetVariableOffset(TCppScope_t var); 
bool InsertOrReplaceJitSymbol(const char* linker_mangled_name, uint64_t address);

I think we could do what MakeFunctionCallable did here:

static JitCall::DestructorCall make_dtor_wrapper(compat::Interpreter& interp,

In this way, the C++ interfaces can be kept concise and clean. The new C interfaces are born to be ugly. :)

@vgvassilev
Copy link
Contributor

I agree... Do you have an implementation in mind?

No. I'm still experimenting with the library. I'm wondering how cppyy solves this problem.

Cppyy is not know to have good interfaces to communicate with the underlying infrastructure. It wraps them into multiple layers via the CPyCppyy and the cling-backend. Let's ping @wlav for a better take on this.

Another way is to expose a C wrapper over the internal CppInterOp Interpreter, so users can manage those instances on their own. We only need to maintain a stable C API. The CppInterOpInterpreter.h can still be internal.

Yes, right now CppInterOp is not in a good place in that respect. The intent has been that these interfaces to be C compatible and to not change a lot. However, it is really challenging to provide C-compatible interfaces when we deal with collections (std::vectors) or provide storage.

The following C++ APIs are just helper functions or wrappers over CppInterOpInterpreter.h's public methods. They are simple and short, so the C API version(with the ugly interpreter argument) can be easy to reimplement.

void AddSearchPath(const char *dir, bool isUser, bool prepend);
const char* GetResourceDir();
void AddIncludePath(const char *dir);
int Declare(const char* code, bool silent);
int Process(const char *code);
intptr_t Evaluate(const char *code, bool *HadError/*=nullptr*/);
std::string LookupLibrary(const char* lib_name);
bool LoadLibrary(const char* lib_stem, bool lookup);
void UnloadLibrary(const char* lib_stem);
std::string SearchLibrariesForSymbol(const char* mangled_name, bool search_system /*true*/);
std::string ObjToString(const char *type, void *obj);
void Destruct(TCppObject_t This, TCppScope_t scope, bool withFree /*=true*/);
TCppFuncAddr_t GetFunctionAddress(const char* mangled_name);
CPPINTEROP_API JitCall MakeFunctionCallable(TCppConstFunction_t func);

I think it is really better to provide a C interface. That's why we will be safe.

However, these two APIs are not trivial to reimplement.

intptr_t GetVariableOffset(TCppScope_t var); 
bool InsertOrReplaceJitSymbol(const char* linker_mangled_name, uint64_t address);

I think we could do what MakeFunctionCallable did here:

static JitCall::DestructorCall make_dtor_wrapper(compat::Interpreter& interp,

In this way, the C++ interfaces can be kept concise and clean. The new C interfaces are born to be ugly. :)

I agree. Once you are beyond the experimental part of this we can discuss about what are the needs. One thing I'd like to point out is that in an ideal world a lot of this code will land in llvm-project/clang/lib/Interpreter and perhaps there is where we could use C style interfaces, too...

@wlav
Copy link

wlav commented Jun 3, 2024

Preface with a note that the multi-interpreter case is a common one, especially if they could be spun up quickly, or "forked" from the state of a parent interpreter. The latter would be awesome for writing C++ unit tests.

Cppyy is not know to have good interfaces to communicate with the underlying infrastructure.

Well, it worked for 3 applications so far. :) CPyCppyy, PyPy/_cppyy, and the D language bindings. There have been a few others that I'm aware of, e.g. an NLP application, but AFAIK, these were one-offs grad student projects.

Just that I have misgivings about it b/c it's been grown organically and even as that hasn't limited functionality, there are several inefficiencies that I don't think are necessary.

It wraps them into multiple layers via the CPyCppyy and the cling-backend. Let's ping @wlav for a better take on this.

No, CPyCppyy isn't part of it. The layerings are clingwrapper and ROOT/meta. For the former, the above message applies, the latter, well ... :}

We only need to maintain a stable C API.
[..]
it is really challenging to provide C-compatible interfaces when we deal
with collections (std::vectors) or provide storage.

Right, that's how clingwrapper started as well: to bootstrap C++ bindings into PyPy, we had to start with C bindings and that's where several of the inefficiencies stem from. Eg., as you say, it's hard to share a container with C (esp. since the C binding in PyPy is ABI-based), so that's in the current implementation, there are several places where it loops with an
external index over an internal data structure.

In this way, the C++ interfaces can be kept concise and clean. The new C interfaces are born to be ugly. :)

I did the same later: re-implemented the C interfaces on top of the C++ ones. However, it's still mostly 1-to-1, the C wrappers taking care of some amount of memory handling.

@wlav
Copy link

wlav commented Jun 3, 2024

Preface with a note that the multi-interpreter case is a common one, especially if they could be spun up quickly, or "forked" from the state of a parent interpreter. The latter would be awesome for writing C++ unit tests.

Cppyy is not know to have good interfaces to communicate with the underlying infrastructure.

Well, it worked for 3 applications so far. :) CPyCppyy, PyPy/_cppyy, and the D language bindings. There have been a few others that I'm aware of, e.g. an NLP application, but AFAIK, these were one-offs grad student projects.

Just that I have misgivings about it b/c it's been grown organically and even as that hasn't limited functionality, there are several inefficiencies that I don't think are necessary.

It wraps them into multiple layers via the CPyCppyy and the cling-backend. Let's ping @wlav for a better take on this.

No, CPyCppyy isn't part of it. The layerings are clingwrapper and ROOT/meta. For the former, the above message applies, the latter, well ... :}

    We only need to maintain a stable C API.
    [..]
    it is really challenging to provide C-compatible interfaces when we deal
    with collections (std::vectors) or provide storage.

Right, that's how clingwrapper started as well: to bootstrap C++ bindings into PyPy, we had to start with C bindings and that's where several of the inefficiencies stem from. Eg., as you say, it's hard to share a container with C (esp. since the C binding in PyPy is ABI-based), so that's in the current implementation, there are several places where it loops with an
external index over an internal data structure.

    In this way, the C++ interfaces can be kept concise and clean. The new C interfaces are born to be ugly. :)

I did the same later: re-implemented the C interfaces on top of the C++ ones. However, it's still mostly 1-to-1, the C wrappers taking care of some amount of memory handling.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants