Merge pull request #1049 from quarkiverse/llama3-java-docs
Add basic docs for Llama3.java
= Llama3.java

include::./includes/attributes.adoc[]

https://github.com/mukel/llama3.java[Llama3.java] provides a way to run large language models (LLMs) locally, in pure Java, embedded in your Quarkus application.
You can run various https://huggingface.co/mukel[models], such as Llama3 and Mistral, on your machine.

[#_prerequisites]
== Prerequisites

Llama3.java requires Java 21 or later, because it uses the new https://openjdk.org/jeps/448[Vector API] for faster inference.
Since the Vector API is still a preview feature in Java 21, and up to the latest Java 23, it must be enabled explicitly by launching the JVM with the following flags:

[source]
----
--enable-preview --enable-native-access=ALL-UNNAMED --add-modules jdk.incubator.vector
----
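
For example, a packaged application could be launched with these flags as follows. This is an illustrative sketch: the jar path assumes the default Quarkus fast-jar layout under `target/quarkus-app`, which may differ in your build.

[source,shell]
----
# Hypothetical launch command; the jar path assumes the default
# Quarkus fast-jar packaging.
java --enable-preview \
     --enable-native-access=ALL-UNNAMED \
     --add-modules jdk.incubator.vector \
     -jar target/quarkus-app/quarkus-run.jar
----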

Equivalently, you can configure the quarkus-maven-plugin in your project's pom.xml as follows:

[source,xml,subs=attributes+]
----
<configuration>
  <jvmArgs>--enable-preview --enable-native-access=ALL-UNNAMED</jvmArgs>
  <modules>
    <module>jdk.incubator.vector</module>
  </modules>
</configuration>
----

=== Dev Mode

Quarkus LangChain4j automatically handles pulling the models configured by the application, so there is no need for users to do so manually.

WARNING: When running Quarkus in dev mode, C2 compilation is not enabled, which can make Llama3.java excessively slow. This limitation will be fixed with Quarkus 3.17, when `<forceC2>true</forceC2>` is set.

WARNING: Models are huge, so make sure you have enough disk space.

NOTE: Due to their large size, pulling models can take time.

== Using Llama3.java

To let Llama3.java run inference on your models, add the following dependency to your project:

[source,xml,subs=attributes+]
----
<dependency>
    <groupId>io.quarkiverse.langchain4j</groupId>
    <artifactId>quarkus-langchain4j-llama3-java</artifactId>
    <version>{project-version}</version>
</dependency>
----

If no other LLM extension is installed, link:../ai-services.adoc[AI Services] will automatically use the configured Llama3.java model.

By default, the extension uses the https://huggingface.co/mukel/Llama-3.2-1B-Instruct-GGUF[`mukel/Llama-3.2-1B-Instruct-GGUF`] model.
You can change it by setting the `quarkus.langchain4j.llama3.chat-model.model-name` property in the `application.properties` file:

[source,properties,subs=attributes+]
----
quarkus.langchain4j.llama3.chat-model.model-name=mukel/Llama-3.2-3B-Instruct-GGUF
----

=== Configuration

Several configuration properties are available:

include::includes/quarkus-langchain4j-llama3-java.adoc[leveloffset=+1,opts=optional]