Skip to content

Scicrop/javaSentenceBertEmbedding

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

9 Commits
 
 
 
 
 
 
 
 
 
 

Repository files navigation

Java License Twitter

☕ Java Sentence-Bert Embedding & RAG Engine

image

Welcome to the Java ONNX Embedding & Retrieval-Augmented Generation (RAG) Engine! This project showcases how to integrate modern AI models with legacy Java systems using ONNX. While most AI development today happens in Python, many enterprises still rely heavily on Java ecosystems. This solution bridges that gap, allowing seamless embedding generation and document retrieval using popular transformer models. This code/library was first developed for the InfiniteStack of SciCrop and is now open-sourced as a SciCrop Academy initiative.


🔄 Project Overview

  • Language: Java 👷‍♂️
  • AI Framework: DJL (Deep Java Library) with ONNX Runtime support
  • Purpose: Generate text embeddings using transformer models (BERT, Sentence-BERT) and build a RAG system in Java
  • Why Java? Enterprises have vast Java codebases. Integrating AI in Java reduces friction, leverages existing infrastructure, and avoids rewriting core services.

📂 Features

  1. ONNX Model Support: Load and run transformer models (e.g., BERT, Sentence-BERT) converted to ONNX format.
  2. Embedding Generation: Create vector representations of text for downstream NLP tasks.
  3. Semantic Search: Compare embeddings and rank documents by similarity using cosine similarity.
  4. Model Agnostic Design: Easily extend to other ONNX models like RoBERTa, DistilBERT, or custom fine-tuned models.
  5. Modular Code Structure:
    • BertEmbeddingEngine: Handles BERT-based models.
    • MpnetEmbeddingEngine: Supports Sentence-BERT models.
    • EmbeddingChecker: Ranks documents based on query similarity.
    • QueryEngine: Provides CLI interface to run queries.

🔧 Setup Instructions

1. 📦 Prerequisites

  • Java 11 or higher
  • Maven for dependency management
  • ONNX Models (BERT, Sentence-BERT, etc.)

2. 📚 Cloning the Repository

# Clone the project
$ git clone https://github.com/Scicrop/javaSentenceBertEmbedding
$ cd javaSentenceBertEmbedding

3. 🔢 Installing Dependencies

Ensure Maven is installed and run:

# Build the project and download dependencies
$ mvn clean install

4. 🌐 Download & Convert Models

  • BERT Base (Uncased)

    1. Download from Hugging Face: bert-base-uncased
    2. Convert to ONNX:
      python3 -m optimum.exporters.onnx --model bert-base-uncased ./onnx_bert/
    3. Place model.onnx, vocab.txt, and config.json in /opt/infinitestack/onnx_bert/
  • Sentence-BERT (all-mpnet-base-v2)

    1. Download from Hugging Face: all-mpnet-base-v2
    2. Convert to ONNX:
      python3 -m optimum.exporters.onnx --model all-mpnet-base-v2 ./onnx_mpnet/
    3. Place files in /opt/infinitestack/onnx_mpnet/

🔍 Usage Instructions

1. 📄 Generate Embeddings for Documents

Place your text files in a directory (e.g., /tmp/sources/). Then run:

# For BERT embeddings
$ java -jar target/BertDataEmbedd.jar /tmp/sources /tmp/embeddings/ /opt/infinitestack/onnx_bert/

# For Sentence-BERT embeddings
$ java -jar target/MpnetDataEmbedd.jar /tmp/sources /tmp/embeddings/ /opt/infinitestack/onnx_mpnet/

In the above commands embeddings will be saved as .json files in /tmp/embeddings/.


2. 📃 Query the Embeddings

Run a semantic search query against your embeddings:

$ java -jar target/QueryEngine.jar "how was the computer invented?" /tmp/embeddings/

This will output the most relevant document and its similarity score.


📅 Project Evolution

🌐 From BERT to Sentence-BERT

  • We started with BERT to demonstrate basic embedding capabilities.
  • Later, we migrated to Sentence-BERT (MPNet) for better semantic understanding and performance in RAG systems.

🔼 Future Directions

  1. Support for Additional Models: RoBERTa, DistilBERT, custom fine-tuned models.
  2. Vector Database Integration: Connect to Pinecone, FAISS, or Milvus for scalable retrieval.
  3. REST API: Expose embedding generation and query features as APIs.
  4. Spring Boot Integration: Embed this system in existing Java enterprise applications.
  5. Fine-Tuning Support: Integrate with ONNX Runtime Training for on-the-fly model updates.

🌟 Why This Matters

In the world of AI-driven solutions, Python dominates. But many large enterprises have mission-critical applications written in Java. Rewriting these in Python is often not feasible due to cost, security, or compliance concerns.

By enabling RAG systems and semantic search in Java:

  • We bring state-of-the-art AI to legacy enterprise systems.
  • Ensure robust performance using the ONNX Runtime.
  • Provide a pathway for future AI integrations without disrupting existing Java codebases.

🙏 Contributing

Contributions are welcome! Feel free to open issues or submit PRs to enhance functionality, support more models, or improve performance.


🛡️ License

This project is licensed under the Apache License.


🚀 Let's Bring AI to Java!

Stay ahead in the AI revolution while leveraging your trusted Java infrastructure. Let's build intelligent, scalable, and enterprise-ready solutions together! 💡💪

Releases

No releases published

Packages

No packages published

Languages