Nathan Jensen edited this page Jul 14, 2015 · 19 revisions

Basics

Jep uses JNI and the CPython API to start a top (initial) Python interpreter inside the JVM. This top interpreter is never used except to initialize and shut down Python. When you create a Jep instance in Java, a sub-interpreter is created for that Jep instance and remains in memory until the instance is closed with jep.close(). The top interpreter remains in the JVM until the JVM exits.

Sandboxed interpreters

Each Jep instance's interpreter is sandboxed apart from the other interpreters. This means a change to the imported modules, global variables, etc. in one interpreter will not be reflected in other interpreters. However, this rule does not apply to CPython extensions. There is no way to strictly enforce that a CPython extension is implemented in a way that supports sandboxing. A simple example is a CPython extension library with a global static variable that is used throughout the library. A change to that static variable in one sub-interpreter would affect the other sub-interpreters, since they all refer to the same location in memory. Note that the same rule applies to Java static variables and singletons. Since only one exists in the JVM, a change to that static variable will be reflected in all Python sub-interpreters.
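The difference between sandboxed globals and shared extension state can be illustrated with a rough analogy in plain Python. This is only a sketch: the two dictionaries passed to exec stand in for two sub-interpreters, and `_EXTENSION_STATIC`, `set_mode`, and `get_mode` are hypothetical names standing in for a C-level static variable and the extension functions that touch it. Real sub-interpreters are created by Jep, not by exec.

```python
# Stand-in for a C-level static variable inside an extension library.
# In a real extension this would live in native memory, not in Python.
_EXTENSION_STATIC = {"mode": "default"}


def set_mode(value):
    """Hypothetical extension function that mutates the shared static."""
    _EXTENSION_STATIC["mode"] = value


def get_mode():
    """Hypothetical extension function that reads the shared static."""
    return _EXTENSION_STATIC["mode"]


def new_namespace():
    """Create an isolated set of globals, loosely like one sub-interpreter."""
    return {"set_mode": set_mode, "get_mode": get_mode}


interp_a = new_namespace()
interp_b = new_namespace()

# Ordinary globals stay private to the namespace that defined them.
exec("x = 1", interp_a)
x_visible_in_b = "x" in interp_b

# A change made through the shared "extension" is seen by the other namespace.
exec("set_mode('changed by A')", interp_a)
exec("seen = get_mode()", interp_b)
mode_seen_by_b = interp_b["seen"]
```

The variable `x` never appears in the second namespace, while the mode set from the first namespace is what the second one reads back.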

Threading complications

Due to complications and limitations of JNI, the thread that creates a Jep instance must be reused for all method calls to that Jep instance. Jep enforces this by throwing exceptions that mention invalid thread access. (In the future we hope to simplify thread management or provide utilities for it.)
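The thread-affinity rule can be modelled without Jep itself. The sketch below is an illustration under stated assumptions, not Jep's implementation: `ThreadBoundResource` is a hypothetical stand-in that remembers its creating thread and rejects calls from any other thread, and a single-worker executor is one possible way to make sure every call lands on the owning thread.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


class ThreadBoundResource:
    """Hypothetical stand-in for an object tied to its creating thread."""

    def __init__(self):
        self._owner = threading.get_ident()
        self._values = {}

    def _check_thread(self):
        if threading.get_ident() != self._owner:
            raise RuntimeError("invalid thread access")

    def set(self, name, value):
        self._check_thread()
        self._values[name] = value

    def get(self, name):
        self._check_thread()
        return self._values[name]


# A single worker thread owns the resource; every call is submitted to it.
executor = ThreadPoolExecutor(max_workers=1)
resource = executor.submit(ThreadBoundResource).result()
executor.submit(resource.set, "x", 9).result()
value_from_owner = executor.submit(resource.get, "x").result()

# Calling from any other thread (here, the calling thread) is rejected.
try:
    resource.get("x")
    wrong_thread_rejected = False
except RuntimeError:
    wrong_thread_rejected = True

executor.shutdown()
```

In a Java application the same idea would typically be expressed with a single-threaded executor that creates, uses, and closes the Jep instance.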

More than one Jep instance should not be run on the same thread at the same time. While this is technically allowed, it can corrupt the thread state and lead to deadlock in the Python interpreter. In the future, this will probably be changed to throw an exception.

Objects

Jep will automatically convert Java primitives, Strings, and jep.NDArrays sent into the Python sub-interpreter into Python primitives, strings, and numpy.ndarrays respectively. The Python versions of these objects have no reference to their original Java counterparts; they are entirely new objects that exist solely in Python's system memory.

A Java object that does not match one of the types listed above will automatically be wrapped as a PyJobject (or one of its related classes). A PyJobject wraps the reference to the original Java object and presents the Python sub-interpreter with an interface for treating the object as a Python object. From the point of view of the Python sub-interpreter, a PyJobject is just another Python object with a select set of attributes (fields and methods) on it.

Jep's support for manipulating Python objects from Java code is currently not as strong. Python strings and primitives are automatically converted to their Java equivalents when passed or returned to Java. For some common types Jep will attempt an automatic conversion to a Java type, but in general it is unsafe to pass a pure Python object into Java. (We aim to improve this in a future release.) For these scenarios, it is best either to convert the Python object to a Java object manually or to manipulate the Python object by using Jep.eval() and methods defined in Python. For example, jep.eval("x = foo(y)");.
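One way to follow this advice is to keep the rich Python object inside the interpreter and expose small helper functions that accept and return only strings and numbers, which the text above says are converted automatically. The sketch below shows only the Python side; `Inventory`, `add_item`, `total_count`, and `summary` are hypothetical names invented for illustration, and the Java side would invoke them through eval strings.

```python
class Inventory:
    """A pure Python object that should stay inside the interpreter."""

    def __init__(self):
        self.items = {}

    def add(self, name, count):
        self.items[name] = self.items.get(name, 0) + count


# Module-level handle that eval'd statements can refer to by name.
inventory = Inventory()


def add_item(name, count):
    """Mutate the Python object; accepts only a string and an int."""
    inventory.add(name, count)


def total_count():
    """Return a plain int rather than the Inventory object itself."""
    return sum(inventory.items.values())


def summary():
    """Return a plain string describing the inventory."""
    pairs = sorted(inventory.items.items())
    return ", ".join(f"{name}={count}" for name, count in pairs)


# Statements like these could be passed to the interpreter as eval strings.
add_item("bolt", 3)
add_item("nut", 5)
add_item("bolt", 2)
total = total_count()
text = summary()
```

Only `total` and `text` would need to cross into Java, so the `Inventory` instance never has to be converted.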

Memory usage

Jep uses both Java heap memory and native (aka direct or system) memory. Java objects use heap memory as usual, while Python objects use native memory as usual. Wrapper objects such as PyJobject use both: heap memory for the Java object, and native memory for the associated pointers and metadata of the PyJobject.

When Jep wraps a Java object as a PyJobject, it notifies the JVM that it holds a reference to that object, ensuring that the JVM will not garbage collect it. When the Python garbage collector detects that there are no more references to that PyJobject (in Python at least), it will garbage collect the PyJobject wrapper. An example of this is a variable that is defined in a method scope and goes out of scope when the method returns or exits. When Python garbage collects the wrapper object, Jep releases the associated Python memory of the PyJobject and notifies the JVM that it no longer holds a reference to the object. This then enables the JVM to garbage collect the underlying Java object if there are no more references to it.
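The scope-based release described above can be observed in plain CPython. This is a minimal sketch rather than Jep's actual mechanism: `Wrapper` is a hypothetical stand-in for a wrapper object such as a PyJobject, and the `weakref.finalize` callback plays the role of notifying the JVM that the reference has been dropped. It relies on CPython's reference counting, which frees an object as soon as its last reference disappears.

```python
import weakref

released = []  # names of wrappers whose release callback has run


class Wrapper:
    """Hypothetical stand-in for a wrapper object such as a PyJobject."""

    def __init__(self, name):
        self.name = name


def make_wrapper(name):
    wrapper = Wrapper(name)
    # The callback stands in for "tell the JVM the reference is gone".
    weakref.finalize(wrapper, released.append, name)
    return wrapper


def use_locally():
    local = make_wrapper("local")
    return local.name.upper()  # the wrapper itself does not escape


result = use_locally()
released_after_call = list(released)

kept = make_wrapper("kept")  # still referenced at module level
released_while_kept = list(released)

del kept  # dropping the last reference releases it
released_after_del = list(released)
```

The wrapper created inside the function is released as soon as the function returns, while the module-level one is released only after it is explicitly deleted.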

Another way to explain the memory management of Jep is to view the JVM as delegating to Python until Python is done with the object. The Java garbage collector defers collecting a Java object inside a Python sub-interpreter until the Python garbage collector collects it, at which point the Java garbage collector then treats it as just another Java object.

If an application shows high overall memory usage but low heap usage, it is possible that the application is leaking Python objects. Jep.set(String, Object) sets a global variable in the sub-interpreter, as do evals such as jep.eval("x = 4 + 5");. Such globals remain referenced until they are deleted, rebound, or the Jep instance is closed. Always make sure to close Jep instances when finished with them, and use local scope (e.g. variables existing only in methods) to try to ensure automatic cleanup.
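The way a global keeps an object alive can be shown with a short sketch. It makes some assumptions: the `global_ns` dictionary stands in for the sub-interpreter's global namespace, `Payload` is a hypothetical placeholder for a large object, and the behaviour shown is that of CPython's reference counting.

```python
import weakref


class Payload:
    """Hypothetical placeholder for an object that occupies native memory."""


# Stand-in for the interpreter's global namespace.
global_ns = {"Payload": Payload}

# Similar in spirit to an eval of "x = Payload()": the name x becomes a global.
exec("x = Payload()", global_ns)
probe = weakref.ref(global_ns["x"])
alive_after_eval = probe() is not None

# Nothing frees x automatically; it lives until it is deleted or rebound.
exec("del x", global_ns)
alive_after_del = probe() is not None
```

Deleting (or rebinding) globals that are no longer needed is one way to release their memory before the instance is closed.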
