Merge branch 'main' into main
ucev authored Jan 1, 2025
2 parents d33c54d + 7f729c9 commit ed2aca0
Showing 47 changed files with 2,032 additions and 162 deletions.
1 change: 1 addition & 0 deletions docs/api_refs/blacklisted-entrypoints.json
@@ -15,6 +15,7 @@
"../../langchain/src/embeddings/tensorflow.ts",
"../../langchain/src/embeddings/hf.ts",
"../../langchain/src/embeddings/hf_transformers.ts",
"../../langchain/src/embeddings/huggingface_transformers.ts",
"../../langchain/src/embeddings/googlevertexai.ts",
"../../langchain/src/embeddings/googlepalm.ts",
"../../langchain/src/embeddings/minimax.ts",
@@ -4,17 +4,38 @@ hide_table_of_contents: true

# Docx files

This example goes over how to load data from docx files.
The `DocxLoader` allows you to extract text data from Microsoft Word documents. It supports both the modern `.docx` format and the legacy `.doc` format. Depending on the file type, additional dependencies are required.

# Setup
---

## Setup

To use `DocxLoader`, you'll need the `@langchain/community` integration along with either the `mammoth` or the `word-extractor` package:

- **`mammoth`**: For processing `.docx` files.
- **`word-extractor`**: For handling `.doc` files.

### Installation

#### For `.docx` Files

```bash npm2yarn
npm install @langchain/community @langchain/core mammoth
```

# Usage
#### For `.doc` Files

```bash npm2yarn
npm install @langchain/community @langchain/core word-extractor
```

## Usage

### Loading `.docx` Files

```typescript
For `.docx` files, there is no need to explicitly specify any parameters when initializing the loader:

```javascript
import { DocxLoader } from "@langchain/community/document_loaders/fs/docx";

const loader = new DocxLoader(
@@ -23,3 +44,20 @@ const loader = new DocxLoader(

const docs = await loader.load();
```

### Loading `.doc` Files

For `.doc` files, you must explicitly specify the `type` as `doc` when initializing the loader:

```javascript
import { DocxLoader } from "@langchain/community/document_loaders/fs/docx";

const loader = new DocxLoader(
"src/document_loaders/tests/example_data/attention.doc",
{
type: "doc",
}
);

const docs = await loader.load();
```
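
Because the two formats need different options, one way to avoid hard-coding them is to derive the options from the file extension. The sketch below is an assumption, not part of `@langchain/community`: `docxLoaderOptionsFor` is a hypothetical helper that only mirrors the rule described above (`type: "doc"` for `.doc` files, no options for `.docx` files).

```typescript
// Hypothetical helper (not part of @langchain/community): derives the
// DocxLoader options described above from a file path's extension.
type DocxLoaderOptions = { type: "doc" } | undefined;

function docxLoaderOptionsFor(filePath: string): DocxLoaderOptions {
  const lower = filePath.toLowerCase();
  if (lower.endsWith(".doc")) {
    // Legacy Word format: the type must be given explicitly.
    return { type: "doc" };
  }
  if (lower.endsWith(".docx")) {
    // Modern format: no options are needed.
    return undefined;
  }
  throw new Error(`Unsupported file extension: ${filePath}`);
}
```

The result could then be passed as the second constructor argument, for example `new DocxLoader(path, docxLoaderOptionsFor(path))`, assuming the constructor treats an `undefined` second argument the same as an omitted one, as the `.docx` example suggests.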
@@ -0,0 +1,23 @@
---
sidebar_class_name: node-only
---

# Jira

:::tip Compatibility
Only available on Node.js.
:::

This covers how to load document objects from issues in a Jira project.

## Credentials

- You'll need to set up an access token and provide it along with your Jira username to authenticate the request.
- You'll also need the project key and host URL for the project containing the issues to load as documents.
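
A minimal sketch of collecting these four values before constructing the loader is shown here. It is an illustration under stated assumptions: `readJiraConfig` and the `JiraConfig` interface are hypothetical names, the `JIRA_*` variable names follow the `.env.example` entries added in this commit, and the field names follow the options passed to `JiraProjectLoader` in the example file.

```typescript
// Hypothetical helper: collects the Jira settings listed above from an
// environment-like object and reports which ones are missing.
interface JiraConfig {
  host: string;
  username: string;
  accessToken: string;
  projectKey: string;
}

type Env = Record<string, string | undefined>;

function readJiraConfig(env: Env): JiraConfig {
  const required = [
    "JIRA_HOST",
    "JIRA_USERNAME",
    "JIRA_ACCESS_TOKEN",
    "JIRA_PROJECT_KEY",
  ];
  // Treat unset and empty values the same way.
  const missing = required.filter((name) => !env[name]);
  if (missing.length > 0) {
    throw new Error(`Missing Jira settings: ${missing.join(", ")}`);
  }
  return {
    host: env["JIRA_HOST"] as string,
    username: env["JIRA_USERNAME"] as string,
    accessToken: env["JIRA_ACCESS_TOKEN"] as string,
    projectKey: env["JIRA_PROJECT_KEY"] as string,
  };
}
```

In Node.js, `readJiraConfig(process.env)` would fail early with a list of the unset variables instead of sending an unauthenticated request.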

## Usage

import CodeBlock from "@theme/CodeBlock";
import Example from "@examples/document_loaders/jira.ts";

<CodeBlock language="typescript">{Example}</CodeBlock>
@@ -8,10 +8,15 @@ It runs locally and even works directly in the browser, allowing you to create w

## Setup

You'll need to install the [@xenova/transformers](https://www.npmjs.com/package/@xenova/transformers) package as a peer dependency:
You'll need to install the [@huggingface/transformers](https://www.npmjs.com/package/@huggingface/transformers) package as a peer dependency:

:::tip Compatibility
If you are using a version of `@langchain/community` older than 0.3.21, install the older `@xenova/transformers` package and
import the embeddings from `"@langchain/community/embeddings/hf_transformers"` below.
:::

```bash npm2yarn
npm install @xenova/transformers
npm install @huggingface/transformers
```
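
The compatibility note above amounts to a version check on `@langchain/community`. The following sketch spells that rule out; `embeddingsEntrypointFor` and `parseVersion` are hypothetical helpers written for illustration, and they assume a plain `major.minor.patch` version string.

```typescript
// Hypothetical helpers: pick the embeddings entrypoint for a given
// @langchain/community version, following the compatibility note above.
const LEGACY_ENTRYPOINT = "@langchain/community/embeddings/hf_transformers";
const CURRENT_ENTRYPOINT =
  "@langchain/community/embeddings/huggingface_transformers";

// Parses "major.minor.patch" into numbers; pre-release tags are not
// handled specially.
function parseVersion(version: string): [number, number, number] {
  const parts = version.split(".").map((part) => Number.parseInt(part, 10));
  if (parts.length !== 3 || parts.some((n) => Number.isNaN(n))) {
    throw new Error(`Expected "major.minor.patch", got "${version}"`);
  }
  const [major = 0, minor = 0, patch = 0] = parts;
  return [major, minor, patch];
}

function embeddingsEntrypointFor(communityVersion: string): string {
  const [major, minor, patch] = parseVersion(communityVersion);
  // Compare against 0.3.21, most significant component first.
  let older: boolean;
  if (major !== 0) {
    older = major < 0;
  } else if (minor !== 3) {
    older = minor < 3;
  } else {
    older = patch < 21;
  }
  return older ? LEGACY_ENTRYPOINT : CURRENT_ENTRYPOINT;
}
```

Versions older than 0.3.21 map to the `hf_transformers` entrypoint (paired with `@xenova/transformers`); 0.3.21 and later map to `huggingface_transformers` (paired with `@huggingface/transformers`).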

import IntegrationInstallTooltip from "@mdx_components/integration_install_tooltip.mdx";
2 changes: 1 addition & 1 deletion environment_tests/test-exports-cjs/src/import.js
@@ -3,7 +3,7 @@ async function test() {
const { OpenAI } = await import("@langchain/openai");
const { LLMChain } = await import("langchain/chains");
const { ChatPromptTemplate } = await import("@langchain/core/prompts");
const { HuggingFaceTransformersEmbeddings } = await import("@langchain/community/embeddings/hf_transformers");
const { HuggingFaceTransformersEmbeddings } = await import("@langchain/community/embeddings/huggingface_transformers");
const { Document } = await import("@langchain/core/documents");
const { MemoryVectorStore } = await import("langchain/vectorstores/memory");

2 changes: 1 addition & 1 deletion environment_tests/test-exports-cjs/src/index.mjs
@@ -3,7 +3,7 @@ import { OpenAI } from "@langchain/openai";
import { LLMChain } from "langchain/chains";
import { MemoryVectorStore } from "langchain/vectorstores/memory";
import { ChatPromptTemplate } from "@langchain/core/prompts";
import { HuggingFaceTransformersEmbeddings } from "@langchain/community/embeddings/hf_transformers";
import { HuggingFaceTransformersEmbeddings } from "@langchain/community/embeddings/huggingface_transformers";
import { Document } from "@langchain/core/documents";

// Test exports
6 changes: 4 additions & 2 deletions environment_tests/test-exports-cjs/src/index.ts
@@ -3,7 +3,7 @@ import { OpenAI } from "@langchain/openai";
import { LLMChain } from "langchain/chains";
import { ChatPromptTemplate } from "@langchain/core/prompts";
import { MemoryVectorStore } from "langchain/vectorstores/memory";
import { HuggingFaceTransformersEmbeddings } from "@langchain/community/embeddings/hf_transformers";
import { HuggingFaceTransformersEmbeddings } from "@langchain/community/embeddings/huggingface_transformers";
import { Document } from "@langchain/core/documents";

async function test(useAzure: boolean = false) {
@@ -25,7 +25,9 @@ async function test(useAzure: boolean = false) {
openAIApiKey: "sk-XXXX",
};

const vs = new MemoryVectorStore(new HuggingFaceTransformersEmbeddings({ model: "Xenova/all-MiniLM-L6-v2" }));
const vs = new MemoryVectorStore(
new HuggingFaceTransformersEmbeddings({ model: "Xenova/all-MiniLM-L6-v2" })
);

await vs.addVectors(
[
2 changes: 1 addition & 1 deletion environment_tests/test-exports-cjs/src/require.js
@@ -3,7 +3,7 @@ const { OpenAI } = require("@langchain/openai");
const { LLMChain } = require("langchain/chains");
const { ChatPromptTemplate } = require("@langchain/core/prompts");
const { MemoryVectorStore } = require("langchain/vectorstores/memory");
const { HuggingFaceTransformersEmbeddings } = require("@langchain/community/embeddings/hf_transformers");
const { HuggingFaceTransformersEmbeddings } = require("@langchain/community/embeddings/huggingface_transformers");
const { Document } = require("@langchain/core/documents");

async function test() {
2 changes: 1 addition & 1 deletion environment_tests/test-exports-esm/src/import.cjs
@@ -4,7 +4,7 @@ async function test() {
const { LLMChain } = await import("langchain/chains");
const { ChatPromptTemplate } = await import("@langchain/core/prompts");
const { MemoryVectorStore } = await import("langchain/vectorstores/memory");
const { HuggingFaceTransformersEmbeddings } = await import("@langchain/community/embeddings/hf_transformers");
const { HuggingFaceTransformersEmbeddings } = await import("@langchain/community/embeddings/huggingface_transformers");
const { Document } = await import("@langchain/core/documents");

// Test exports
2 changes: 1 addition & 1 deletion environment_tests/test-exports-esm/src/index.js
@@ -3,7 +3,7 @@ import { OpenAI } from "@langchain/openai";
import { LLMChain } from "langchain/chains";
import { ChatPromptTemplate } from "@langchain/core/prompts";
import { MemoryVectorStore } from "langchain/vectorstores/memory";
import { HuggingFaceTransformersEmbeddings } from "@langchain/community/embeddings/hf_transformers";
import { HuggingFaceTransformersEmbeddings } from "@langchain/community/embeddings/huggingface_transformers";
import { Document } from "@langchain/core/documents";
import { CallbackManager } from "@langchain/core/callbacks/manager";

6 changes: 4 additions & 2 deletions environment_tests/test-exports-esm/src/index.ts
@@ -3,7 +3,7 @@ import { OpenAI } from "@langchain/openai";
import { LLMChain } from "langchain/chains";
import { ChatPromptTemplate } from "@langchain/core/prompts";
import { MemoryVectorStore } from "langchain/vectorstores/memory";
import { HuggingFaceTransformersEmbeddings } from "@langchain/community/embeddings/hf_transformers";
import { HuggingFaceTransformersEmbeddings } from "@langchain/community/embeddings/huggingface_transformers";
import { Document } from "@langchain/core/documents";

async function test(useAzure: boolean = false) {
@@ -24,7 +24,9 @@ async function test(useAzure: boolean = false) {
openAIApiKey: "sk-XXXX",
};

const vs = new MemoryVectorStore(new HuggingFaceTransformersEmbeddings({ model: "Xenova/all-MiniLM-L6-v2", }));
const vs = new MemoryVectorStore(
new HuggingFaceTransformersEmbeddings({ model: "Xenova/all-MiniLM-L6-v2" })
);

await vs.addVectors(
[
2 changes: 1 addition & 1 deletion environment_tests/test-exports-esm/src/require.cjs
@@ -3,7 +3,7 @@ const { OpenAI } = require("@langchain/openai");
const { LLMChain } = require("langchain/chains");
const { ChatPromptTemplate } = require("@langchain/core/prompts");
const { MemoryVectorStore } = require("langchain/vectorstores/memory");
const { HuggingFaceTransformersEmbeddings } = require("@langchain/community/embeddings/hf_transformers");
const { HuggingFaceTransformersEmbeddings } = require("@langchain/community/embeddings/huggingface_transformers");
const { Document } = require("@langchain/core/documents");

async function test() {
6 changes: 5 additions & 1 deletion examples/.env.example
@@ -84,4 +84,8 @@ HANA_HOST=HANA_DB_ADDRESS
HANA_PORT=HANA_DB_PORT
HANA_UID=HANA_DB_USER
HANA_PWD=HANA_DB_PASSWORD
ARK_API_KEY=ADD_YOURS_HERE # https://console.volcengine.com/
ARK_API_KEY=ADD_YOURS_HERE # https://console.volcengine.com/
JIRA_HOST=ADD_YOURS_HERE
JIRA_USERNAME=ADD_YOURS_HERE
JIRA_ACCESS_TOKEN=ADD_YOURS_HERE
JIRA_PROJECT_KEY=ADD_YOURS_HERE
26 changes: 26 additions & 0 deletions examples/src/document_loaders/jira.ts
@@ -0,0 +1,26 @@
import { JiraProjectLoader } from "@langchain/community/document_loaders/web/jira";

const host = process.env.JIRA_HOST || "https://jira.example.com";
const username = process.env.JIRA_USERNAME;
const accessToken = process.env.JIRA_ACCESS_TOKEN;
const projectKey = process.env.JIRA_PROJECT_KEY || "PROJ";

if (username && accessToken) {
// Created within last 30 days
const createdAfter = new Date();
createdAfter.setDate(createdAfter.getDate() - 30);
const loader = new JiraProjectLoader({
host,
projectKey,
username,
accessToken,
createdAfter,
});

const documents = await loader.load();
console.log(`Loaded ${documents.length} Jira document(s)`);
} else {
console.log(
"You must provide a username and access token to run this example."
);
}
2 changes: 1 addition & 1 deletion examples/src/models/embeddings/hf_transformers.ts
@@ -1,4 +1,4 @@
import { HuggingFaceTransformersEmbeddings } from "@langchain/community/embeddings/hf_transformers";
import { HuggingFaceTransformersEmbeddings } from "@langchain/community/embeddings/huggingface_transformers";

const model = new HuggingFaceTransformersEmbeddings({
model: "Xenova/all-MiniLM-L6-v2",
2 changes: 1 addition & 1 deletion examples/src/use_cases/local_retrieval_qa/chain.ts
@@ -2,7 +2,7 @@ import { CheerioWebBaseLoader } from "@langchain/community/document_loaders/web/
import { RecursiveCharacterTextSplitter } from "@langchain/textsplitters";
import { HNSWLib } from "@langchain/community/vectorstores/hnswlib";
import { Ollama } from "@langchain/community/llms/ollama";
import { HuggingFaceTransformersEmbeddings } from "@langchain/community/embeddings/hf_transformers";
import { HuggingFaceTransformersEmbeddings } from "@langchain/community/embeddings/huggingface_transformers";
import { formatDocumentsAsString } from "langchain/util/document";
import { PromptTemplate } from "@langchain/core/prompts";
import {
@@ -1,7 +1,7 @@
import { CheerioWebBaseLoader } from "@langchain/community/document_loaders/web/cheerio";
import { RecursiveCharacterTextSplitter } from "@langchain/textsplitters";
import { HNSWLib } from "@langchain/community/vectorstores/hnswlib";
import { HuggingFaceTransformersEmbeddings } from "@langchain/community/embeddings/hf_transformers";
import { HuggingFaceTransformersEmbeddings } from "@langchain/community/embeddings/huggingface_transformers";

const loader = new CheerioWebBaseLoader(
"https://lilianweng.github.io/posts/2023-06-23-agent/"
2 changes: 1 addition & 1 deletion examples/src/use_cases/local_retrieval_qa/qa_chain.ts
@@ -3,7 +3,7 @@ import { CheerioWebBaseLoader } from "@langchain/community/document_loaders/web/
import { RecursiveCharacterTextSplitter } from "@langchain/textsplitters";
import { HNSWLib } from "@langchain/community/vectorstores/hnswlib";
import { Ollama } from "@langchain/community/llms/ollama";
import { HuggingFaceTransformersEmbeddings } from "@langchain/community/embeddings/hf_transformers";
import { HuggingFaceTransformersEmbeddings } from "@langchain/community/embeddings/huggingface_transformers";
import { PromptTemplate } from "@langchain/core/prompts";

const loader = new CheerioWebBaseLoader(
2 changes: 1 addition & 1 deletion langchain-core/package.json
@@ -1,6 +1,6 @@
{
"name": "@langchain/core",
"version": "0.3.26",
"version": "0.3.27",
"description": "Core LangChain.js abstractions and schemas",
"type": "module",
"engines": {
2 changes: 1 addition & 1 deletion libs/langchain-aws/package.json
@@ -1,6 +1,6 @@
{
"name": "@langchain/aws",
"version": "0.1.2",
"version": "0.1.3",
"description": "LangChain AWS integration",
"type": "module",
"engines": {
4 changes: 4 additions & 0 deletions libs/langchain-community/.env.example
@@ -0,0 +1,4 @@
JIRA_HOST=ADD_YOURS_HERE
JIRA_USERNAME=ADD_YOURS_HERE
JIRA_ACCESS_TOKEN=ADD_YOURS_HERE
JIRA_PROJECT_KEY=ADD_YOURS_HERE
8 changes: 8 additions & 0 deletions libs/langchain-community/.gitignore
@@ -178,6 +178,10 @@ embeddings/hf_transformers.cjs
embeddings/hf_transformers.js
embeddings/hf_transformers.d.ts
embeddings/hf_transformers.d.cts
embeddings/huggingface_transformers.cjs
embeddings/huggingface_transformers.js
embeddings/huggingface_transformers.d.ts
embeddings/huggingface_transformers.d.cts
embeddings/ibm.cjs
embeddings/ibm.js
embeddings/ibm.d.ts
@@ -930,6 +934,10 @@ document_loaders/web/imsdb.cjs
document_loaders/web/imsdb.js
document_loaders/web/imsdb.d.ts
document_loaders/web/imsdb.d.cts
document_loaders/web/jira.cjs
document_loaders/web/jira.js
document_loaders/web/jira.d.ts
document_loaders/web/jira.d.cts
document_loaders/web/figma.cjs
document_loaders/web/figma.js
document_loaders/web/figma.d.ts
3 changes: 3 additions & 0 deletions libs/langchain-community/langchain.config.js
@@ -80,6 +80,7 @@ export const config = {
"embeddings/gradient_ai": "embeddings/gradient_ai",
"embeddings/hf": "embeddings/hf",
"embeddings/hf_transformers": "embeddings/hf_transformers",
"embeddings/huggingface_transformers": "embeddings/huggingface_transformers",
"embeddings/ibm": "embeddings/ibm",
"embeddings/jina": "embeddings/jina",
"embeddings/llama_cpp": "embeddings/llama_cpp",
@@ -288,6 +289,7 @@ export const config = {
"document_loaders/web/gitbook": "document_loaders/web/gitbook",
"document_loaders/web/hn": "document_loaders/web/hn",
"document_loaders/web/imsdb": "document_loaders/web/imsdb",
"document_loaders/web/jira": "document_loaders/web/jira",
"document_loaders/web/figma": "document_loaders/web/figma",
"document_loaders/web/firecrawl": "document_loaders/web/firecrawl",
"document_loaders/web/github": "document_loaders/web/github",
@@ -356,6 +358,7 @@ export const config = {
"embeddings/tensorflow",
"embeddings/hf",
"embeddings/hf_transformers",
"embeddings/huggingface_transformers",
"embeddings/ibm",
"embeddings/jina",
"embeddings/llama_cpp",