Playing with CPP VTABLE from Nim
Imagine you are trying to communicate with a C++ application from Nim, and the C++ interface provides only a set of classes made up of pure virtual functions, plus a pointer to a class instance. A few extern "C" functions initialize the communication; everything else must go through that pointer. What will you do?
A Notepad++ plugin is an ordinary DLL with an ordinary interface. A standard Notepad++ plugin uses the cdecl calling convention for its binary interface, which is not a big problem: the Nim FFI handles that perfectly. Things get more interesting when we want to write an external lexer for Scintilla, the editor engine of Notepad++.
Although the external lexer must reside in the same DLL as our Notepad++ plugin, it uses a different calling convention: Scintilla requires stdcall for every function it calls. Again, this is not a problem for Nim; just use the {.stdcall.} pragma. There is one more requirement for the plugin: exported function names must not carry the usual stdcall decoration, so we need to pass the -Wl,--kill-at switch when compiling the plugin project, and the C/C++ toolchain will take care of the rest.
OK, so far we have not met any C++ features. But wait: that is not the real interface to Scintilla, only the entry point to a more complicated one.
class IDocument {
public:
    virtual int SCI_METHOD Version() const = 0;
    virtual void SCI_METHOD SetErrorStatus(int status) = 0;
    virtual Sci_Position SCI_METHOD Length() const = 0;
    virtual void SCI_METHOD GetCharRange(char *buffer, Sci_Position position, Sci_Position lengthRetrieve) const = 0;
    virtual char SCI_METHOD StyleAt(Sci_Position position) const = 0;
    virtual Sci_Position SCI_METHOD LineFromPosition(Sci_Position position) const = 0;
    virtual Sci_Position SCI_METHOD LineStart(Sci_Position line) const = 0;
    virtual int SCI_METHOD GetLevel(Sci_Position line) const = 0;
    virtual int SCI_METHOD SetLevel(Sci_Position line, int level) = 0;
    virtual int SCI_METHOD GetLineState(Sci_Position line) const = 0;
    virtual int SCI_METHOD SetLineState(Sci_Position line, int state) = 0;
    virtual void SCI_METHOD StartStyling(Sci_Position position, char mask) = 0;
    virtual bool SCI_METHOD SetStyleFor(Sci_Position length, char style) = 0;
    virtual bool SCI_METHOD SetStyles(Sci_Position length, const char *styles) = 0;
    virtual void SCI_METHOD DecorationSetCurrentIndicator(int indicator) = 0;
    virtual void SCI_METHOD DecorationFillRange(Sci_Position position, int value, Sci_Position fillLength) = 0;
    virtual void SCI_METHOD ChangeLexerState(Sci_Position start, Sci_Position end) = 0;
    virtual int SCI_METHOD CodePage() const = 0;
    virtual bool SCI_METHOD IsDBCSLeadByte(char ch) const = 0;
    virtual const char * SCI_METHOD BufferPointer() = 0;
    virtual int SCI_METHOD GetLineIndentation(Sci_Position line) = 0;
};

class ILexer {
public:
    virtual int SCI_METHOD Version() const = 0;
    virtual void SCI_METHOD Release() = 0;
    virtual const char * SCI_METHOD PropertyNames() = 0;
    virtual int SCI_METHOD PropertyType(const char *name) = 0;
    virtual const char * SCI_METHOD DescribeProperty(const char *name) = 0;
    virtual Sci_Position SCI_METHOD PropertySet(const char *key, const char *val) = 0;
    virtual const char * SCI_METHOD DescribeWordListSets() = 0;
    virtual Sci_Position SCI_METHOD WordListSet(int n, const char *wl) = 0;
    virtual void SCI_METHOD Lex(Sci_PositionU startPos, Sci_Position lengthDoc, int initStyle, IDocument *pAccess) = 0;
    virtual void SCI_METHOD Fold(Sci_PositionU startPos, Sci_Position lengthDoc, int initStyle, IDocument *pAccess) = 0;
    virtual void * SCI_METHOD PrivateCall(int operation, void *pointer) = 0;
};
ILexer is the real interface of the external lexer that our DLL must provide. IDocument is the interface we use to communicate with Scintilla. Both are classes consisting only of pure virtual functions. Does Nim really not support C++ classes or virtual functions? Yes: there is no equivalent of a C++ class in Nim, and no virtual functions. That is why this wiki page was written: to overcome this kind of problem.
First, let's look at how Scintilla obtains a pointer to the ILexer instance from our DLL.
type
  VTABLE* = array[0..25, pointer]
  ILexer* {.pure, final.} = object
    vTable*: ptr VTABLE
  LexerFactoryProc* = proc(): ptr ILexer {.stdcall.}

# cint, not int: Nim's int is pointer-sized, but the C++ caller passes a C integer index.
proc GetLexerFactory(idx: cint): LexerFactoryProc {.stdcall, exportc, dynlib.} =
  if idx == 0:
    result = lexFactory
  else:
    result = nil
Scintilla calls our GetLexerFactory, and the factory it gets back is a function that returns a pointer to an ILexer. This shows how we can model ILexer in Nim and then export a correctly shaped instance for Scintilla to call.
From the snippet above, we can see that C++ virtual functions are implemented via a pointer to an array of pointers. That is the basic mechanism by which common C++ compilers implement classes with virtual functions.
Now let's write the Nim code that emulates this behavior:
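To make the layout concrete, here is a minimal C++ sketch that builds such an object by hand, without the `virtual` keyword: a struct whose first field points to an array of function pointers, with dispatch done by slot index. The names `FakeLexer`, `kFakeSlots`, `MakeFakeLexer`, `CallVersion`, and `CallRelease` are hypothetical and exist only for this illustration; a real compiler generates the equivalent table for you.

```cpp
#include <cassert>

using Slot = void (*)();           // type-erased function pointer, one per table entry

struct FakeLexer {                 // hypothetical stand-in for a class instance
    const Slot *vtable;            // first field: pointer to the array of slots
    int version;                   // ordinary instance data follows the pointer
};

// "Methods" are plain functions that take the instance as an explicit first argument.
static int FakeVersion(FakeLexer *self) { return self->version; }
static void FakeRelease(FakeLexer *self) { self->version = 0; }

// The table itself: slot 0 is Version, slot 1 is Release.
static const Slot kFakeSlots[] = {
    reinterpret_cast<Slot>(&FakeVersion),
    reinterpret_cast<Slot>(&FakeRelease),
};

// Every instance of the same "class" shares one table.
FakeLexer MakeFakeLexer(int version) {
    return FakeLexer{kFakeSlots, version};
}

// Dispatch by index: fetch the slot, cast it back to its real type, pass the instance.
int CallVersion(FakeLexer *obj) {
    assert(obj != nullptr && obj->vtable != nullptr);
    using VersionFn = int (*)(FakeLexer *);
    return reinterpret_cast<VersionFn>(obj->vtable[0])(obj);
}

void CallRelease(FakeLexer *obj) {
    assert(obj != nullptr && obj->vtable != nullptr);
    using ReleaseFn = void (*)(FakeLexer *);
    reinterpret_cast<ReleaseFn>(obj->vtable[1])(obj);
}
```

Because each slot is cast back to exactly the type it was stored with, this version is portable C++; it only imitates what a compiler's vtable looks like and does not depend on any particular ABI.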
# cint matches C++ int; Nim's int is pointer-sized, which is wrong for a C++ int on 64-bit targets.
proc Version(x: ptr ILexer): cint {.stdcall.} = cint(lvOriginal)
proc Release(x: ptr ILexer) {.stdcall.} = discard
proc PropertyNames(x: ptr ILexer): cstring {.stdcall.} = nil
proc PropertyType(x: ptr ILexer, name: cstring): cint {.stdcall.} = -1
proc DescribeProperty(x: ptr ILexer, name: cstring): cstring {.stdcall.} = nil
proc PropertySet(x: ptr ILexer, key, val: cstring): int {.stdcall.} = -1
proc DescribeWordListSets(x: ptr ILexer): cstring {.stdcall.} = nil
proc WordListSet(x: ptr ILexer, n: cint, wl: cstring): int {.stdcall.} = -1
proc Lex(x: ptr ILexer, startPos, docLen: int, initStyle: cint, pAccess: IDocument) {.stdcall.} = discard
proc Fold(x: ptr ILexer, startPos, docLen: int, initStyle: cint, pAccess: IDocument) {.stdcall.} = discard
proc PrivateCall(x: ptr ILexer, operation: cint, ud: pointer): pointer {.stdcall.} = nil

var lex: ILexer
var vTable: VTABLE

proc lexFactory(): ptr ILexer {.stdcall.} =
  # Slot order must match the declaration order of ILexer's virtual functions.
  vTable[0] = Version
  vTable[1] = Release
  vTable[2] = PropertyNames
  vTable[3] = PropertyType
  vTable[4] = DescribeProperty
  vTable[5] = PropertySet
  vTable[6] = DescribeWordListSets
  vTable[7] = WordListSet
  vTable[8] = Lex
  vTable[9] = Fold
  vTable[10] = PrivateCall
  lex.vTable = vTable.addr
  result = lex.addr
As you can see, a VTABLE is just an array of pointers to functions, and Nim can handle that easily.
We now know how to provide an ILexer implementation from our DLL, but what about IDocument? The IDocument implementation is provided by Scintilla, so how can we call its member functions from Nim?
Armed with the knowledge that a C++ virtual function table is no more than an array of pointers, we can do something like this:
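Seen from the C++ side, the object returned by lexFactory is used like any other interface pointer. The sketch below imitates that inside a single C++ file: a hand-built object is passed off as a pointer to a hypothetical two-method interface, `IMiniLexer`, and then called with ordinary virtual-call syntax. All names here are made up for the illustration. It assumes the common layout in which the object starts with the vtable pointer and the slots follow declaration order, and a target (such as x86-64) where member functions and free functions pass their arguments the same way. The C++ standard guarantees none of this and the cast is formally undefined behavior, so treat it as a demonstration, not a recipe. The SCI_METHOD calling-convention macro is left out for brevity.

```cpp
#include <cassert>

// Hypothetical two-method interface standing in for ILexer.
class IMiniLexer {
public:
    virtual int Version() const = 0;
    virtual int PropertyType(const char *name) = 0;
};

using Slot = void (*)();           // type-erased table entry

// What the provider (Nim, in this article) actually builds: one pointer to a table.
struct RawLexer {
    const Slot *vtable;
};

// Cheap sanity check for this sketch: the interface holds nothing but the vtable pointer.
static_assert(sizeof(IMiniLexer) == sizeof(RawLexer), "unexpected object layout");

// Implementations are free functions with the instance as an explicit first argument.
static int RawVersion(const RawLexer *) { return 1; }
static int RawPropertyType(RawLexer *, const char *) { return -1; }

// Slots must follow the declaration order of the virtual functions.
static const Slot kRawSlots[] = {
    reinterpret_cast<Slot>(&RawVersion),
    reinterpret_cast<Slot>(&RawPropertyType),
};

static RawLexer gRawLexer = {kRawSlots};

// Counterpart of lexFactory: hand the raw object out as an interface pointer.
// ABI assumption (not guaranteed by the standard): layouts match as described above.
IMiniLexer *MiniLexerFactory() {
    return reinterpret_cast<IMiniLexer *>(&gRawLexer);
}

// The host's view: an ordinary virtual call that lands in RawVersion.
int HostQueryVersion() {
    IMiniLexer *lexer = MiniLexerFactory();
    assert(lexer != nullptr);
    return lexer->Version();
}
```

The host never learns that no C++ class stands behind the pointer; it only loads the first word of the object, indexes the table, and calls.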
type
  IDocument* {.pure, final.} = ptr object
    vTable: ptr VTABLE

# Inside each wrapper, cint stands for C++ int; Nim's int is pointer-sized.
proc nvVersion*(dv: IDocument): int =
  type dvt = proc(x: IDocument): cint {.stdcall.}
  result = cast[dvt](dv.vTable[0])(dv)

proc nvSetErrorStatus*(dv: IDocument, status: int) =
  type dvt = proc(x: IDocument, status: cint) {.stdcall.}
  cast[dvt](dv.vTable[1])(dv, cint(status))

proc nvLength*(dv: IDocument): int =
  type dvt = proc(x: IDocument): int {.stdcall.}
  result = cast[dvt](dv.vTable[2])(dv)

proc nvGetCharRange*(dv: IDocument, buf: cstring, pos, len: int) =
  type dvt = proc(x: IDocument, buf: cstring, pos, len: int) {.stdcall.}
  cast[dvt](dv.vTable[3])(dv, buf, pos, len)

#...and so on
Again, this shows how a C++ compiler translates the -> operator when it calls a virtual function: it indexes the vtable with the right slot and calls the function through that pointer. We do the same in Nim: cast the untyped pointer to a proc type and call it. Don't forget that the first argument is the pointer to the class instance.
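The same lookup can be spelled out in C++ itself. The following sketch mirrors nvLength: it takes a real polymorphic object, reads the hidden first field to find the table, and calls slot 1 with the instance as the first argument. `IMiniDocument`, `MiniDocument`, and `LengthViaVtable` are hypothetical names. The code relies on the same layout assumptions as the rest of this article (vtable pointer at offset 0, slots in declaration order, `this` passed like an ordinary first argument, as on x86-64), which the C++ standard does not guarantee.

```cpp
#include <cassert>
#include <cstring>

// Hypothetical two-method interface standing in for IDocument.
class IMiniDocument {
public:
    virtual int Version() const = 0;
    virtual long Length() const = 0;
};

// Host-side implementation (Scintilla's role in this article).
class MiniDocument final : public IMiniDocument {
public:
    explicit MiniDocument(long length) : length_(length) {}
    int Version() const override { return 0; }
    long Length() const override { return length_; }
private:
    long length_;
};

using Slot = void (*)();           // type-erased table entry

// Hand-written equivalent of doc->Length().
long LengthViaVtable(const IMiniDocument *doc) {
    assert(doc != nullptr);
    // ABI assumption: the object begins with a pointer to the slot array.
    const Slot *vtable = nullptr;
    std::memcpy(&vtable, doc, sizeof vtable);
    // Slot 0 is Version, slot 1 is Length; the instance goes in as the first argument.
    using LengthFn = long (*)(const IMiniDocument *);
    return reinterpret_cast<LengthFn>(vtable[1])(doc);
}
```

Picking the wrong index or the wrong signature here compiles without complaint and fails at run time, which is why the slot numbers in the Nim wrappers must track the C++ declaration order exactly.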
That's it. Not too difficult, eh? It is not too difficult because we are not dealing with C++ name mangling here, only with pointers and calling conventions. Nim handles that perfectly, although a bit more verbosely than C++.
What about ordinary (non-virtual) C++ member functions, constructors, and destructors? Well, that's another story. C++ name mangling can be a nightmare if you try to solve it in Nim, because every C++ compiler has its own flavor of name mangling. But if you know the exact rules, please write another wiki page!
This method has been tested to be interoperable between MSVC and GCC (MinGW) on Windows, and should be compatible with LLVM/Clang too. For other C++ compilers there is no guarantee that it will work, so use it with caution. In general, it rests on an unsafe assumption about how C++ compilers work. Don't rely on this explanation unless you are sure what really happens under the hood.