@staticlibs
Collaborator

This PR is a continuation of PR #447. It allows using duckdb_jdbc-x.x.x.x-nolib.jar along with a JNI native library that is loaded directly from the file system.

It extends the idea from #421 (and supersedes it) implementing the following logic:

  1. if the driver JAR has a bundled native library (for the current JVM os/arch), then this library will be unpacked to the temporary directory and loaded from there. If the library cannot be unpacked or loaded, there is no fallback to other methods (it is expected that the -nolib JAR is used for other loading methods)

  2. if the driver JAR does not have a native library bundled inside it, then it will check whether a JNI native library with the DuckDB internal naming (like libduckdb_java.so_linux_amd64) exists in the file system next to the driver JAR (in the same directory). If the library file is found there, the driver will attempt to load it. If the file is found in the file system, it is expected that it can be loaded, and there is no fallback to loading by name.

  3. if the native lib is not found in the same directory, then, like in #421, the driver tries to load it using the duckdb_java name (which will be translated by the JVM to a platform-specific name like libduckdb_java.so).
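
For illustration, the decision order in the three steps above can be sketched as a small pure function. This is a minimal sketch, not the driver's actual code: the class name `LoadOrderSketch`, the `Source` enum and the `choose` method are all hypothetical, and the sketch only decides where the library would come from without loading anything.

```java
import java.nio.file.Files;
import java.nio.file.Path;

// Illustrative only: all names here are hypothetical, not taken from the driver.
class LoadOrderSketch {
    enum Source { BUNDLED_IN_JAR, NEXT_TO_JAR, BY_NAME }

    /**
     * Decides where the native library should come from.
     *
     * @param bundledInJar whether the JAR carries a library for the current os/arch
     * @param jarDir       directory containing the driver JAR, or null if unknown
     * @param internalName DuckDB internal file name, e.g. "libduckdb_java.so_linux_amd64"
     */
    static Source choose(boolean bundledInJar, Path jarDir, String internalName) {
        if (bundledInJar) {
            // Step 1: unpack and load; no fallback if that fails.
            return Source.BUNDLED_IN_JAR;
        }
        if (jarDir != null && Files.isRegularFile(jarDir.resolve(internalName))) {
            // Step 2: the file exists next to the JAR; no fallback if loading fails.
            return Source.NEXT_TO_JAR;
        }
        // Step 3: fall through to System.loadLibrary("duckdb_java").
        return Source.BY_NAME;
    }
}
```

Keeping the decision separate from the actual `System.load` / `System.loadLibrary` calls makes the ordering easy to cover in a unit test without a real native library.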

Testing: new test added that covers loading from the same dir and loading by name.

Fixes: #444

}

// There is no native library next to the JAR file, so we try to load it by name.
System.loadLibrary("duckdb_java");
@JaroslavTulach commented Nov 8, 2025

Can this PR try System.loadLibrary first and only if that fails "try harder"?

Collaborator Author

@JaroslavTulach

Thanks for the review!

A few questions first to understand the requirements better:

1] With #447, for every platform there are 3 JARs available:

  1. duckdb_jdbc-1.4.2.0.jar: contains multiple native libs with internal naming
  2. duckdb_jdbc-1.4.2.0-linux_amd64.jar: contains single native lib with internal naming
  3. duckdb_jdbc-1.4.2.0-nolib.jar: does not contain a native lib

In your case, when you are providing a native lib from the java.library.path with System.loadLibrary (not directly from the FS with System.load), is it acceptable to consume the nolib JAR:

<dependency>
   <groupId>org.duckdb</groupId>
   <artifactId>duckdb_jdbc</artifactId>
   <version>1.4.2.0</version>
   <classifier>nolib</classifier>
</dependency>

or is there a scenario when the JAR with a native lib inside will be consumed, but it still must use the external native lib from java.library.path?

2] About the undesired FS access in your case: is this only about writing to the FS (when unpacking the native lib to /tmp), or can the read-only attempt to find the native lib next to the JAR (which basically does readdir + stat) also cause problems? If the latter, is it possible to get more details on what problems it can cause?

The justification for direct FS loading was to let users consume the nolib JAR more easily (without having to rename the native lib or adjust LD_LIBRARY_PATH), for example in BI tool plugin setups where a bunch of JARs must be copied to a plugins dir and selected ones are loaded on request.

With a normal JVM run, a few FS metadata reads are not expected to cause problems (the JVM will do a lot of them when loading SOs and JARs anyway). In the case of a native-image (or any other possible "not a JAR from FS" scenario) it was expected that currentJarDir will fail promptly and the logic will proceed to System.loadLibrary after that.
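
To make the "fail promptly" expectation concrete, here is a minimal sketch of how a JAR directory lookup could bail out early when the class does not come from a file on disk. It is an assumption about the general approach, not the actual currentJarDir implementation; the class `JarDirSketch` and the method `jarDirOf` are hypothetical names.

```java
import java.net.URL;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.security.CodeSource;
import java.util.Optional;

// Illustrative only: not the driver's currentJarDir, just the same general idea.
class JarDirSketch {
    /** Returns the directory containing the code source of cls, or empty if it is not a file. */
    static Optional<Path> jarDirOf(Class<?> cls) {
        try {
            CodeSource source = cls.getProtectionDomain().getCodeSource();
            if (source == null || source.getLocation() == null) {
                // No code source at all (e.g. classes loaded by the bootstrap loader).
                return Optional.empty();
            }
            URL location = source.getLocation();
            if (!"file".equals(location.getProtocol())) {
                // Not a plain file on disk, so there is no "next to the JAR" directory.
                return Optional.empty();
            }
            return Optional.ofNullable(Paths.get(location.toURI()).getParent());
        } catch (Exception e) {
            // Any failure means: skip the sibling lookup and go on to loading by name.
            return Optional.empty();
        }
    }
}
```

An empty result would send the loader straight to System.loadLibrary without touching the file system.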

The main concern with making System.loadLibrary the priority in all cases (like in #421) is the possible version skew (on updates) between the Java and JNI parts (the JNI part is going to contain multiple SOs in the future), while the set of JNI calls (so the whole ABI for the JAR) can well be stable across versions. When the same JAR has an easy way to inadvertently load an "outdated" native lib from an externally set LD_LIBRARY_PATH (for example, in large-scale JAR-plugin environments like Trino or Metabase), this can make troubleshooting difficult. Perhaps it is better to solve this problem with some kind of direct version check on load instead.
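
As a rough idea of what such a version check on load could look like, here is a minimal sketch. Everything in it is hypothetical: the class `VersionCheckSketch`, the method `requireSameVersion`, and the assumption that the native library would expose its version through some JNI call that is passed in here as a plain string.

```java
// Illustrative only: the driver has no such class; nativeSide stands in for a
// value that a hypothetical JNI function would report after the library is loaded.
class VersionCheckSketch {
    /** Fails fast when the Java and native parts report different versions. */
    static void requireSameVersion(String javaSide, String nativeSide) {
        if (!javaSide.equals(nativeSide)) {
            throw new IllegalStateException(
                "JDBC driver version " + javaSide
                + " does not match native library version " + nativeSide
                + "; check java.library.path / LD_LIBRARY_PATH for a stale library");
        }
    }
}
```

A check like this would turn a silent mismatch into an explicit error at load time, regardless of which loading method picked up the library.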

PS: the 1.4.2 update was pushed forward to Nov 12, so there are a couple more days to get this part right.

Thanks for the reply. First and foremost: the state of this PR seems OK for Enso purposes (i.e. pragmatically we can make it work), it just doesn't feel right. What follows, then, is a call for a rationalistic (more correct, more beautiful) solution...

In your case, when you are providing a native lib from the java.library.path with System.loadLibrary (not directly from the FS with System.load), is it acceptable to consume the nolib JAR:

  • actually Enso is not using java.library.path! Java offers better ways to control library loading.
  • we have our own classloader and we override findLibrary
    • we perform Enso specific search inside our polyglot/lib directory
    • we are not influenced by global settings like LD_LIBRARY_PATH, etc.
  • this approach models what has been used in NetBeans
  • and is available in any other Java runtime container (OSGi - e.g. Felix, Equinox, ...)
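
For readers unfamiliar with this mechanism, a minimal sketch of a class loader that resolves native libraries from its own directory could look like the following. It is not Enso's actual loader: the class name `LibDirClassLoader` and the single-directory lookup are assumptions made for illustration. `ClassLoader.findLibrary` and `System.mapLibraryName` are standard JDK APIs.

```java
import java.nio.file.Files;
import java.nio.file.Path;

// Illustrative only: a hypothetical loader, not Enso's real implementation.
class LibDirClassLoader extends ClassLoader {
    private final Path libDir;

    LibDirClassLoader(Path libDir, ClassLoader parent) {
        super(parent);
        this.libDir = libDir;
    }

    /**
     * Called by the JVM when a class defined by this loader invokes System.loadLibrary.
     * Visibility is widened to public only so the sketch can be exercised directly.
     */
    @Override
    public String findLibrary(String libname) {
        // Translate "duckdb_java" to the platform file name, e.g. libduckdb_java.so.
        Path candidate = libDir.resolve(System.mapLibraryName(libname));
        // Returning null lets the JVM continue with its default java.library.path search.
        return Files.isRegularFile(candidate) ? candidate.toAbsolutePath().toString() : null;
    }
}
```

Note that `findLibrary` is consulted only for classes defined by this loader, which is why this approach works with a plain System.loadLibrary call in the driver.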

The main concern with making System.loadLibrary the priority in all cases (like in #421) is the possible versioning skew (on updates) between Java and JNI (which is going to contain multiple SOs in future) parts.

  • I see. I am not surprised you want to control the library loading when running from flat -cp
  • Making things easy to use in Maven/Gradle environment is important for adoption of any technology
  • duckdb_jdbc-1.4.2.0.jar: contains multiple native libs with internal naming
  • duckdb_jdbc-1.4.2.0-linux_amd64.jar: contains single native lib with internal naming
  • duckdb_jdbc-1.4.2.0-nolib.jar: does not contain a native lib
  • I didn't know about this split in Maven arch-specific and nolib artifacts #447
  • With the split, shouldn't behavior be:
    • if there is only duckdb_jdbc-1.4.2.0-nolib.jar then you rely only on System.loadLibrary("duckdb") and it is the responsibility of the integrator (like Enso) to make one available to ClassLoader.findLibrary
    • if there is (also) duckdb_jdbc-1.4.2.0-linux_amd64.jar on classpath next to duckdb_jdbc-1.4.2.0-nolib.jar then the library search is altered and tries to load a library as offered by duckdb_jdbc-1.4.2.0-linux_amd64.jar first
    • the same for duckdb_jdbc-1.4.2.0.jar
  • with such a system Enso would only make duckdb_jdbc-1.4.2.0-nolib.jar available to JVM
  • there would be no poking around and just a call to System.loadLibrary
  • Enso would have to find the right library in our HostClassLoader.findLibrary somehow

@giftick
giftick commented Nov 8, 2025

Personally, I share the general preference of @JaroslavTulach and don't like java libs writing to, or even unnecessarily reading from, the file system; unless that is their primary function and intuitively expected by the app developer - which is not the case with the DuckDBNative class loading the native lib.

Especially including native libs in a jar and copying them into the file system is IMHO an ugly hack and not the right way, for the reasons I stated in my own issue #444.

But I tend to agree mostly with the 3-stage loading order suggested by @staticlibs (see the beginning of this thread) for the following reasons:

Regarding Stage 1: A developer/app-packager who opts for a jar that includes native libs (e.g. duckdb_jdbc-x.x.x.x.jar or duckdb_jdbc-1.4.2.0-linux_amd64.jar) instead of duckdb_jdbc-x.x.x.x-nolib.jar probably wants the guarantee that the native lib from that jar is loaded, and not any other that may be present in the java.library.path when the app is launched. (Maybe they have no control over the java.library.path, although I have a hard time imagining that.) That's certainly not me, but as long as duckdb_jdbc-x.x.x.x-nolib.jar exists and loads the native lib from java.library.path without wasting time with other options, I am fine. Hence the suggested stage 1 makes perfect sense for jars that do include native libs. The duckdb_jdbc-x.x.x.x-nolib.jar should skip stage 1 to avoid wasting time with a futile attempt to locate the native lib as a resource.

Regarding Stages 2+3: A check whether a JNI native library with the DuckDB internal naming (like libduckdb_java.so_linux_amd64) exists in the file system next to the driver JAR (in the same directory) would be fast and would not hurt much if DuckDBNative knew that directory immediately, without searching both the modulepath and the classpath in the worst case. In that case it might even be faster than the call System.loadLibrary("duckdb_java"), which must search the java.library.path if that path includes more than one directory, or module directories with subdirs that must be searched recursively. But I believe that the suggested stage 2 would need to search both the modulepath and the classpath in the worst case, and hence would be slower than System.loadLibrary("duckdb_java") as long as the java.library.path is reasonably short, which it usually is.
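
To give a feel for the cost of the by-name lookup, here is a minimal sketch that lists the files a java.library.path search would have to consider for a given library name. It is a simplification for illustration only: the class `LibraryPathSketch` and the method `candidates` are hypothetical, and the real JVM lookup also consults `ClassLoader.findLibrary` and system library locations first.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

// Illustrative only: approximates which files a by-name lookup may probe.
class LibraryPathSketch {
    /** Lists one candidate file per non-empty directory entry of libraryPath. */
    static List<String> candidates(String libraryPath, String libname) {
        // e.g. "duckdb_java" becomes libduckdb_java.so on Linux.
        String fileName = System.mapLibraryName(libname);
        List<String> result = new ArrayList<>();
        for (String dir : libraryPath.split(Pattern.quote(File.pathSeparator))) {
            if (!dir.isEmpty()) {
                result.add(new File(dir, fileName).getPath());
            }
        }
        return result;
    }
}
```

Calling `LibraryPathSketch.candidates(System.getProperty("java.library.path"), "duckdb_java")` shows that the number of probes grows with the number of path entries, so a short path keeps the by-name lookup cheap.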

Also, a developer who maintains the java.library.path and specifies it to the JVM with -Djava.library.path=.. is in full control of the native libs therein, and most probably expects that native libs in that path will be used and not in some other path. But that developer would probably also not have placed the native lib in the same directory as the jar, hence I don't see much potential for a bad surprise here.

Personally I don't care much for the suggested stage 3 and would never use it, because I do maintain my java.library.path, but if others like it as a kind of last resort then I won't complain.

In summary, I suggest the following changes to the suggested 3-stage loading process:

Stage 1: The duckdb_jdbc-x.x.x.x-nolib.jar should skip stage 1 to avoid wasting time with a futile attempt to locate the native lib as a resource.

Stages 2+3 should be swapped: First call System.loadLibrary("duckdb_java"). Only if that fails, try loading the native lib from the directory in which the duckdb_jdbc-*.jar resides (if there is any interest in this option at all).
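
The swapped order suggested here could be sketched as follows. This is a minimal sketch and not proposed driver code: the class `ByNameFirstSketch`, the method `load` and its parameters are hypothetical, and the caller is assumed to have determined the JAR directory already (or to pass null when it is unknown).

```java
import java.nio.file.Files;
import java.nio.file.Path;

// Illustrative only: hypothetical names, showing "by name first, then next to the JAR".
class ByNameFirstSketch {
    /**
     * Tries System.loadLibrary first and falls back to a file next to the JAR.
     *
     * @return "by-name" or "next-to-jar", depending on which attempt succeeded
     * @throws UnsatisfiedLinkError if neither attempt can load the library
     */
    static String load(String libname, Path jarDir, String internalName) {
        try {
            System.loadLibrary(libname);
            return "by-name";
        } catch (UnsatisfiedLinkError byNameError) {
            Path sibling = jarDir == null ? null : jarDir.resolve(internalName);
            if (sibling == null || !Files.isRegularFile(sibling)) {
                // Nothing next to the JAR either: report the original failure.
                throw byNameError;
            }
            // System.load requires an absolute path.
            System.load(sibling.toAbsolutePath().toString());
            return "next-to-jar";
        }
    }
}
```

With this order, an integrator that provides the library through java.library.path or a custom ClassLoader.findLibrary never triggers the file system lookup.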

Thanks and best regards !

Development

Successfully merging this pull request may close these issues.

DuckDBNative.java should call System.loadLibrary("duckdb_java"), starting faster if native lib is in -Djava.library.path
