[1.x] Backport [2.1.0, 2.2.0) (#379)

* Update README, docs (#347)

* Copy edits for brevity, adds large screenshot

* Adds screenshot, adds principles to contributor docs

* Uses screenshot class for home page image

* Fix reference to screenshot

* Copy edits

* Update index.md

Styles magic command

* Copy edits based on @dlqqq's feedback

* fix newline typo in improve_code (#364)

* Remove frontend js unit tests as not planned (#371)

* Added alias for bedrock titan model (#368)

* Added alias for bedrock titan model

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

---------

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>

* relax pinning on importlib_metadata, typing_extensions (#363)

== pinning is almost always to be avoided in packages,
and tight pinning on backports like importlib_metadata is extremely unlikely to be desirable

* Loads vector store index lazily (#374)

* Upgrades LangChain to 0.0.277

* Pinned Pydantic version, updated pydantic references

* add .yarn to .gitignore in 1.x for local dev

---------

Co-authored-by: michaelchia <[email protected]>
Co-authored-by: Andrii Ieroshenko <[email protected]>
Co-authored-by: Piyush Jain <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Min RK <[email protected]>
6 people authored Sep 5, 2023
1 parent 22d5896 commit 93fee0a
Showing 13 changed files with 111 additions and 66 deletions.
1 change: 1 addition & 0 deletions .gitignore
@@ -130,3 +130,4 @@ dev.sh
.vscode

.jupyter_ystore.db
.yarn
41 changes: 32 additions & 9 deletions README.md
@@ -1,17 +1,45 @@
# Jupyter AI

Welcome to Jupyter AI, which brings generative AI to Jupyter. Jupyter AI provides a user-friendly
Jupyter AI connects generative AI with Jupyter notebooks. Jupyter AI provides a user-friendly
and powerful way to explore generative AI models in notebooks and improve your productivity
in JupyterLab and the Jupyter Notebook. More specifically, Jupyter AI offers:

* An `%%ai` magic that turns the Jupyter notebook into a reproducible generative AI playground.
This works anywhere the IPython kernel runs (JupyterLab, Jupyter Notebook, Google Colab, VSCode, etc.).
* A native chat UI in JupyterLab that enables you to work with generative AI as a conversational assistant.
* Support for a wide range of generative model providers and models
(AI21, Anthropic, Cohere, Hugging Face, OpenAI, SageMaker, etc.).
* Support for a wide range of generative model providers, including AI21, Anthropic, AWS, Cohere,
Hugging Face, and OpenAI.

Documentation is available on [ReadTheDocs](https://jupyter-ai.readthedocs.io/en/latest/).

![A screenshot of Jupyter AI showing the chat interface and the magic commands](docs/source/_static/jupyter-ai-screenshot.png)

## Requirements

You will need to have installed the following software to use Jupyter AI:

- Python 3.8 - 3.11
- JupyterLab 4

In addition, you will need access to at least one model provider.

## Setting Up Model Providers in a Notebook

To use an AI model provider within a notebook, you'll need the appropriate credentials, such as API keys.

Obtain the necessary credentials (e.g., API keys) from your model provider's platform.

You can set your keys using environment variables or in a code cell in your notebook.
In a code cell, you can use the `%env` magic command to set the credentials as follows:

```python
# NOTE: Replace 'PROVIDER_API_KEY' with the credential key's name,
# and replace 'YOUR_API_KEY_HERE' with the key.
%env PROVIDER_API_KEY=YOUR_API_KEY_HERE
```
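A code cell can also set the credential through plain Python. Here is a minimal sketch (not part of this commit; the variable name is a placeholder, and `getpass` simply avoids pasting the secret directly into the notebook):

```python
# Alternative to %env: set the credential from Python without hard-coding it.
import os
from getpass import getpass

# Replace 'PROVIDER_API_KEY' with the variable name your provider expects.
os.environ["PROVIDER_API_KEY"] = getpass("Enter your API key: ")
```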

For more specific instructions for each model provider, refer to [the model providers documentation](https://jupyter-ai.readthedocs.io/en/latest/users/index.html#model-providers).

## Installation

You can use `conda` or `pip` to install Jupyter AI. If you're using macOS on an Apple Silicon-based Mac (M1, M1 Pro, M2, etc.), we strongly recommend using `conda`.
@@ -41,14 +69,9 @@ and create an environment that uses Python 3.11:
$ conda activate jupyter-ai
$ pip install jupyter_ai

If you are not using JupyterLab and you only want to install the Jupyter AI `%%ai` magic, skip the `pip install jupyter_ai` step above, and instead, run:

$ pip install jupyter_ai_magics


## The `%%ai` magic command

The `%%ai` magic works anywhere the IPython kernel runs (JupyterLab, Jupyter Notebook, Google Colab, Visual Studio Code, etc.).
The `%%ai` magic works anywhere the IPython kernel runs, including JupyterLab, Jupyter Notebook, Google Colab, and Visual Studio Code.

Once you have installed the `%%ai` magic, you can enable it in any notebook or the IPython shell by running:

Binary file added docs/source/_static/jupyter-ai-screenshot.png
12 changes: 12 additions & 0 deletions docs/source/contributors/index.md
@@ -2,6 +2,18 @@

This page is intended for people interested in building new or modified functionality for Jupyter AI.

## Design principles

Maintainers of Jupyter AI have adopted principles that contributors should also follow. These principles, which build on top of [the Zen of Python](https://peps.python.org/pep-0020/), are intended to earn users' trust by keeping their data under their control. The following list is non-exhaustive; maintainers have discretion to interpret and revise these principles.

1. Jupyter AI is **vendor-agnostic**. Jupyter AI does not discriminate between available models, and gives users a choice of model providers. A feature in Jupyter AI may be specific to one model or model provider only if it cannot be used with other models or providers.
2. Jupyter AI **only responds to an explicit prompt**; it does not watch files and it does not send prompts automatically. Any change that watches user files must be opt-in only.
3. Jupyter AI is **transparent** with its chat prompts. The chat interface and magic commands use system messages and prompt templates that are open source, so that users know what gets sent to language models.
4. Jupyter AI is **traceable**; users know when it has been used to generate content. When Jupyter AI generates a notebook, the notebook says that it was generated by Jupyter AI. When a user runs a Jupyter AI magic command in a notebook, output cells say, in their metadata, that they were generated by Jupyter AI.
5. Jupyter AI uses a **human-centered design**. The chat interface should look and feel like chat applications that are generally available. The magic commands should look and work like other IPython magic commands. Settings screens should be used minimally, and wherever they are used, they should be readable and understandable, even for users not fluent in the user interface language.

Issues and pull requests that violate the above principles may be declined. If you are unsure about whether your idea is a good fit for Jupyter AI, please [open an issue](https://github.com/jupyterlab/jupyter-ai/issues/new/choose) so that our maintainers can discuss it with you.

## Prerequisites

You can develop Jupyter AI on any system that can run a supported Python version up to and including 3.11, including recent Windows, macOS, and Linux versions.
15 changes: 15 additions & 0 deletions docs/source/index.md
@@ -10,6 +10,21 @@ in JupyterLab and the Jupyter Notebook. More specifically, Jupyter AI offers:
* Support for a wide range of generative model providers and models
(AI21, Anthropic, Cohere, Hugging Face, OpenAI, SageMaker, etc.).

<img src="_static/jupyter-ai-screenshot.png"
alt='A screenshot of Jupyter AI showing the chat interface and the magic commands'
class="screenshot" />

## JupyterLab support

**Each major version of Jupyter AI supports *only one* major version of JupyterLab.** Jupyter AI 1.x supports
JupyterLab 3.x, and Jupyter AI 2.x supports JupyterLab 4.x. The feature sets of versions 1.0.0 and 2.0.0
are the same. We will maintain support for JupyterLab 3 for as long as it remains maintained.

The `main` branch of Jupyter AI targets the newest supported major version of JupyterLab. All new features and most bug fixes will be
committed to this branch. Features and bug fixes will be backported
to work on JupyterLab 3 only if developers determine that they will add sufficient value.
**We recommend that JupyterLab users who want the most advanced Jupyter AI functionality upgrade to JupyterLab 4.**

## Contents

```{toctree}
1 change: 1 addition & 0 deletions packages/jupyter-ai-magics/jupyter_ai_magics/magics.py
@@ -32,6 +32,7 @@
"gpt3": "openai:text-davinci-003",
"chatgpt": "openai-chat:gpt-3.5-turbo",
"gpt4": "openai-chat:gpt-4",
"titan": "bedrock:amazon.titan-tg1-large",
}


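As a hedged usage sketch (not part of the diff): an entry in this alias table lets notebook users refer to a full `provider:model` string by a short name, so the new entry should make `titan` usable wherever the magic accepts a model ID, assuming AWS credentials for Bedrock are already configured in the environment:

```python
# Cell 1: load the magics extension (IPython syntax, not plain Python).
%load_ext jupyter_ai_magics

# Cell 2: the cell magic must start its own cell.
# "titan" expands to "bedrock:amazon.titan-tg1-large" via the alias table above.
%%ai titan
Explain what a vector store is in one paragraph.
```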
12 changes: 6 additions & 6 deletions packages/jupyter-ai-magics/pyproject.toml
@@ -22,17 +22,17 @@ dynamic = ["version", "description", "authors", "urls", "keywords"]

dependencies = [
"ipython",
"pydantic",
"importlib_metadata~=5.2.0",
"langchain==0.0.223",
"typing_extensions==4.5.0",
"pydantic~=1.0",
"importlib_metadata>=5.2.0",
"langchain==0.0.277",
"typing_extensions>=4.5.0",
"click~=8.0",
"jsonpath-ng~=1.5.3",
"jsonpath-ng>=1.5.3,<2",
]

[project.optional-dependencies]
dev = [
"pre-commit~=3.3.3"
"pre-commit>=3.3.3,<4"
]

test = [
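To illustrate the pinning rationale quoted in the commit message above, here is a standalone sketch (not code from this repository) using the third-party `packaging` library to compare the old compatible-release pin with the new lower-bound-only pin:

```python
# How the two specifiers resolve candidate versions of importlib_metadata.
from packaging.specifiers import SpecifierSet

old_pin = SpecifierSet("~=5.2.0")  # shorthand for >=5.2.0, ==5.2.*
new_pin = SpecifierSet(">=5.2.0")

print("5.2.1" in old_pin, "5.2.1" in new_pin)  # True True
print("6.8.0" in old_pin, "6.8.0" in new_pin)  # False True -- the old pin blocks newer releases
```

Downstream environments can still pin exact versions themselves; a library that declares only lower bounds leaves that choice to them.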
11 changes: 0 additions & 11 deletions packages/jupyter-ai/README.md
@@ -122,17 +122,6 @@ To execute them, run:
pytest -vv -r ap --cov jupyter_ai
```

#### Frontend tests

This extension is using [Jest](https://jestjs.io/) for JavaScript code testing.

To execute them, execute:

```sh
jlpm
jlpm test
```

#### Integration tests

This extension uses [Playwright](https://playwright.dev/docs/intro/) for the integration tests (aka user level tests).
2 changes: 1 addition & 1 deletion packages/jupyter-ai/jupyter_ai/chat_handlers/generate.py
@@ -144,7 +144,7 @@ async def improve_code(code, llm=None, verbose=False):
chain = CodeImproverChain.from_llm(llm=llm, verbose=verbose)
improved_code = await chain.apredict(code=code)
improved_code = "\n".join(
[line for line in improved_code.split("/n") if not line.startswith("```")]
[line for line in improved_code.split("\n") if not line.startswith("```")]
)
return improved_code

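A standalone illustration (not part of the diff) of what this one-character fix changes: `"/n"` is a literal two-character string, so the old code never split the model output into lines, and any response beginning with a code fence was filtered away entirely:

```python
# Hypothetical model response wrapped in Markdown code fences.
response = "```python\nprint('hello')\n```"

def strip_fences(text: str, sep: str) -> str:
    return "\n".join(line for line in text.split(sep) if not line.startswith("```"))

print(repr(strip_fences(response, "/n")))  # ''               -- whole response dropped
print(repr(strip_fences(response, "\n")))  # "print('hello')" -- fences removed, code kept
```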
57 changes: 35 additions & 22 deletions packages/jupyter-ai/jupyter_ai/chat_handlers/learn.py
@@ -1,9 +1,10 @@
import argparse
import json
import os
from typing import Any, Awaitable, Coroutine, List
from typing import Any, Awaitable, Coroutine, List, Optional, Tuple

from dask.distributed import Client as DaskClient
from jupyter_ai.config_manager import ConfigManager
from jupyter_ai.document_loaders.directory import get_embeddings, split
from jupyter_ai.document_loaders.splitter import ExtensionSplitter, NotebookSplitter
from jupyter_ai.models import (
@@ -29,7 +30,7 @@
METADATA_SAVE_PATH = os.path.join(INDEX_SAVE_DIR, "metadata.json")


class LearnChatHandler(BaseChatHandler, BaseRetriever):
class LearnChatHandler(BaseChatHandler):
def __init__(
self, root_dir: str, dask_client_future: Awaitable[DaskClient], *args, **kwargs
):
@@ -59,10 +60,10 @@ def __init__(
if not os.path.exists(INDEX_SAVE_DIR):
os.makedirs(INDEX_SAVE_DIR)

self._load_or_create()
self._load()

def _load_or_create(self):
"""Loads the vector store and creates a new one if none exists."""
def _load(self):
"""Loads the vector store."""
embeddings = self.get_embedding_model()
if not embeddings:
return
@@ -73,14 +74,12 @@ def _load_or_create(self):
)
self.load_metadata()
except Exception as e:
self.create()
self.log.error("Could not load vector index from disk.")

async def _process_message(self, message: HumanChatMessage):
if not self.index:
self._load_or_create()

# If index is not still there, embeddings are not present
if not self.index:
# If no embedding provider has been selected
em_provider_cls, em_provider_args = self.get_embedding_provider()
if not em_provider_cls:
self.reply(
"Sorry, please select an embedding provider before using the `/learn` command."
)
@@ -153,7 +152,11 @@ async def learn_dir(self, path: str, chunk_size: int, chunk_overlap: int):
em_provider_cls, em_provider_args = self.get_embedding_provider()
delayed = get_embeddings(doc_chunks, em_provider_cls, em_provider_args)
embedding_records = await dask_client.compute(delayed)
self.index.add_embeddings(*embedding_records)
if self.index:
self.index.add_embeddings(*embedding_records)
else:
self.create(*embedding_records)

self._add_dir_to_metadata(path, chunk_size, chunk_overlap)
self.prev_em_id = em_provider_cls.id + ":" + em_provider_args["model_id"]

@@ -212,7 +215,6 @@ def delete(self):
for path in paths:
if os.path.isfile(path):
os.remove(path)
self.create()

async def relearn(self, metadata: IndexMetadata):
# Index all dirs in the metadata
@@ -234,15 +236,16 @@ async def relearn(self, metadata: IndexMetadata):
You can ask questions about these docs by prefixing your message with **/ask**."""
self.reply(message)

def create(self):
def create(
self,
embedding_records: List[Tuple[str, List[float]]],
metadatas: Optional[List[dict]] = None,
):
embeddings = self.get_embedding_model()
if not embeddings:
return
self.index = FAISS.from_texts(
[
"Jupyternaut knows about your filesystem, to ask questions first use the /learn command."
],
embeddings,
self.index = FAISS.from_embeddings(
text_embeddings=embedding_records, embedding=embeddings, metadatas=metadatas
)
self.save()

@@ -264,9 +267,6 @@ def load_metadata(self):
j = json.loads(f.read())
self.metadata = IndexMetadata(**j)

def get_relevant_documents(self, query: str) -> List[Document]:
raise NotImplementedError()

async def aget_relevant_documents(
self, query: str
) -> Coroutine[Any, Any, List[Document]]:
@@ -289,3 +289,16 @@ def get_embedding_model(self):
return None

return em_provider_cls(**em_provider_args)


class Retriever(BaseRetriever):
learn_chat_handler: LearnChatHandler = None

def _get_relevant_documents(self, query: str) -> List[Document]:
raise NotImplementedError()

async def _aget_relevant_documents(
self, query: str
) -> Coroutine[Any, Any, List[Document]]:
docs = await self.learn_chat_handler.aget_relevant_documents(query)
return docs
6 changes: 3 additions & 3 deletions packages/jupyter-ai/jupyter_ai/extension.py
@@ -1,6 +1,7 @@
import time

from dask.distributed import Client as DaskClient
from jupyter_ai.chat_handlers.learn import Retriever
from jupyter_ai_magics.utils import get_em_providers, get_lm_providers
from jupyter_server.extension.application import ExtensionApp

@@ -93,9 +94,8 @@ def initialize_settings(self):
dask_client_future=dask_client_future,
)
help_chat_handler = HelpChatHandler(**chat_handler_kwargs)
ask_chat_handler = AskChatHandler(
**chat_handler_kwargs, retriever=learn_chat_handler
)
retriever = Retriever(learn_chat_handler=learn_chat_handler)
ask_chat_handler = AskChatHandler(**chat_handler_kwargs, retriever=retriever)
self.settings["jai_chat_handlers"] = {
"default": default_chat_handler,
"/ask": ask_chat_handler,
10 changes: 5 additions & 5 deletions packages/jupyter-ai/pyproject.toml
@@ -24,16 +24,16 @@ classifiers = [
dependencies = [
"jupyter_server>=1.6,<3",
"jupyterlab>=3.5,<4",
"pydantic",
"pydantic~=1.0",
"openai~=0.26",
"aiosqlite~=0.18",
"importlib_metadata~=5.2.0",
"langchain==0.0.223",
"aiosqlite>=0.18",
"importlib_metadata>=5.2.0",
"langchain==0.0.277",
"tiktoken", # required for OpenAIEmbeddings
"jupyter_ai_magics",
"dask[distributed]",
"faiss-cpu", # Not distributed by official repo
"typing_extensions==4.5.0"
"typing_extensions>=4.5.0",
]

dynamic = ["version", "description", "authors", "urls", "keywords"]
9 changes: 0 additions & 9 deletions packages/jupyter-ai/src/__tests__/jupyter_gai.spec.ts

This file was deleted.
