Embeddings Advanced

This project demonstrates how to use the LangChain library to process and embed text from PDF documents using OpenAI's GPT-4 model.

Prerequisites

Node.js
pnpm (or npm/yarn)
OpenAI API Key
Docker (for running the Chroma database)

Installation

Clone the repository:

git clone <repository-url>
cd embeddings-advanced

Install dependencies:
```
pnpm install
```
Create a .env file in the root directory and add your OpenAI API key:
```
OPENAI_API_KEY=your_openai_api_key
```
Run docker-compose to start the Chroma database:
```
docker-compose up -d
```

Usage

To run the script in development mode, use the following command:

pnpm run dev

Project Structure

src/index.ts: Main script that loads a PDF, splits its text, and processes it using LangChain and OpenAI. package.json: Contains project metadata and dependencies. .env: Environment variables (not included in the repository, create your own based on .env.example).

Dependencies

@langchain/community: Community utilities for LangChain. @langchain/core: Core utilities for LangChain. @langchain/openai: OpenAI integration for LangChain. chromadb: Chroma database for vector storage. pdf-parse: Library to parse PDF documents.

License

This project is licensed under the ISC License.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

readme.md

readme.md

Embeddings Advanced

Prerequisites

Installation

Usage

Project Structure

Dependencies

License

Files

readme.md

Latest commit

History

readme.md

File metadata and controls

Embeddings Advanced

Prerequisites

Installation

Usage

Project Structure

Dependencies

License