Merge pull request #1049 from quarkiverse/llama3-java-docs
Add basic docs for Llama3.java
geoand authored Nov 5, 2024
2 parents 56d62d8 + ab6580e commit 974e59e
= Llama3.java

include::./includes/attributes.adoc[]

https://github.com/mukel/llama3.java[Llama3.java] provides a way to run large language models (LLMs) locally, in pure Java, embedded in your Quarkus application.
You can run various https://huggingface.co/mukel[models], such as Llama3 and Mistral, on your machine.

[#_prerequisites]
== Prerequisites
Llama3.java requires Java 21 or later because it uses the new https://openjdk.org/jeps/448[Vector API] for faster inference.
Since the Vector API is still a preview feature in Java 21, and up to the latest Java 23, it must be explicitly enabled by launching the JVM with the following flags:
[source]
----
--enable-preview --enable-native-access=ALL-UNNAMED --add-modules jdk.incubator.vector
----
or, equivalently, by configuring the quarkus-maven-plugin in your project's pom.xml as follows:
[source,xml,subs=attributes+]
----
<configuration>
<jvmArgs>--enable-preview --enable-native-access=ALL-UNNAMED</jvmArgs>
<modules>
<module>jdk.incubator.vector</module>
</modules>
</configuration>
----
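For a packaged application, the same flags can be passed to the JVM directly. A minimal sketch, assuming the default fast-jar layout produced by `mvn package` (the jar path is illustrative):

```shell
# Run the packaged Quarkus application with the Vector API enabled.
# target/quarkus-app/quarkus-run.jar is the default fast-jar entry point.
java --enable-preview --enable-native-access=ALL-UNNAMED \
     --add-modules jdk.incubator.vector \
     -jar target/quarkus-app/quarkus-run.jar
```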
=== Dev Mode
Quarkus LangChain4j automatically pulls the models configured by the application, so there is no need for users to do so manually.

WARNING: When running Quarkus in dev mode, C2 compilation is not enabled, which can make Llama3.java excessively slow. This limitation will be addressed in Quarkus 3.17, which will allow forcing C2 compilation by setting `<forceC2>true</forceC2>`.

WARNING: Models are huge, so make sure you have enough disk space.

NOTE: Due to the models' large size, pulling them can take time.

== Using Llama3.java
To let Llama3.java run inference on your models, add the following dependency to your project:
[source,xml,subs=attributes+]
----
<dependency>
<groupId>io.quarkiverse.langchain4j</groupId>
<artifactId>quarkus-langchain4j-llama3-java</artifactId>
<version>{project-version}</version>
</dependency>
----
If no other LLM extension is installed, link:../ai-services.adoc[AI Services] will automatically use the configured Llama3.java model.
By default, the extension uses the https://huggingface.co/mukel/Llama-3.2-1B-Instruct-GGUF[`mukel/Llama-3.2-1B-Instruct-GGUF`] model.
You can change it by setting the `quarkus.langchain4j.llama3.chat-model.model-name` property in the `application.properties` file:
[source,properties,subs=attributes+]
----
quarkus.langchain4j.llama3.chat-model.model-name=mukel/Llama-3.2-3B-Instruct-GGUF
----
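Once the extension is configured, the model is consumed through a declarative AI Service. A minimal sketch, assuming the standard quarkus-langchain4j annotations; the interface name, method, and prompt text are illustrative:

```java
package org.acme;

import dev.langchain4j.service.UserMessage;
import io.quarkiverse.langchain4j.RegisterAiService;

// With the Llama3.java extension installed and no other LLM extension
// present, this AI Service is backed by the locally running model.
@RegisterAiService
public interface Assistant {

    // The placeholder {text} is filled with the method argument at call time.
    @UserMessage("Summarize the following text in one sentence: {text}")
    String summarize(String text);
}
```

The interface can then be injected (for example into a REST resource with `@Inject Assistant assistant;`) and called like any other CDI bean.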
=== Configuration
Several configuration properties are available:
include::includes/quarkus-langchain4j-llama3-java.adoc[leveloffset=+1,opts=optional]
