feat: count prompt tokens
FlorentLvr committed Sep 27, 2023
1 parent 00c9277 commit 00234df
Showing 1 changed file with 254 additions and 0 deletions.
254 changes: 254 additions & 0 deletions Naas Chat Plugin/Naas_Chat_Plugin_Count_prompt_tokens.ipynb
@@ -0,0 +1,254 @@
{
"cells": [
{
"cell_type": "markdown",
"id": "a92ad770-55bd-450c-8e6e-de64c0347cdd",
"metadata": {
"papermill": {},
"tags": []
},
"source": [
"<img width=\"10%\" alt=\"Naas\" src=\"https://landen.imgix.net/jtci2pxwjczr/assets/5ice39g4.png?w=160\"/>"
]
},
{
"cell_type": "markdown",
"id": "a5b2c509-2c29-49e8-af91-4f3f1e386da3",
"metadata": {
"papermill": {},
"tags": []
},
"source": [
"# Naas Chat Plugin - Count prompt tokens"
]
},
{
"cell_type": "markdown",
"id": "d77fe283-4edd-42d3-a909-8e207d4b842f",
"metadata": {
"papermill": {},
"tags": []
},
"source": [
"**Tags:** #naaschatplugin #naas #naas_driver #chat #plugin #ai #tokens"
]
},
{
"cell_type": "markdown",
"id": "b90f2d91-c886-4e36-8265-b09d06bb1c7f",
"metadata": {
"papermill": {},
"tags": []
},
"source": [
"**Author:** [Florent Ravenel](https://www.linkedin.com/in/florent-ravenel)"
]
},
{
"cell_type": "markdown",
"id": "6bbf7807-dda4-4b8a-b016-fe258a0fa33f",
"metadata": {
"papermill": {},
"tags": []
},
"source": [
"**Last update:** 2023-09-25 (Created: 2023-09-25)"
]
},
{
"cell_type": "markdown",
"id": "214749fe-7f0b-4755-b7ea-1d200c234cc6",
"metadata": {
"papermill": {},
"tags": []
},
"source": [
"**Description:** This notebook counts the number of tokens in a prompt for a given encoding, using the Naas Chat Plugin driver."
]
},
{
"cell_type": "markdown",
"id": "a1329bea-ad80-4981-874c-8776b11f89a8",
"metadata": {
"papermill": {},
"tags": []
},
"source": [
"**References:**\n",
"- [How to count tokens with tiktoken](https://github.com/openai/openai-cookbook/blob/main/examples/How_to_count_tokens_with_tiktoken.ipynb)\n",
"- [Naas Chat Plugin driver](https://github.com/jupyter-naas/drivers/blob/main/naas_drivers/tools/naas_chat_plugin.py)"
]
},
{
"cell_type": "markdown",
"id": "cff349c8-2816-4ae4-9229-027c068eeb51",
"metadata": {
"papermill": {},
"tags": []
},
"source": [
"## Input"
]
},
{
"cell_type": "markdown",
"id": "80266a9e-fe54-4f3e-aeb2-01483bbc53f9",
"metadata": {
"papermill": {},
"tags": []
},
"source": [
"### Import libraries"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "cc038b10-2679-42bc-909e-09a298339df4",
"metadata": {
"papermill": {},
"tags": []
},
"outputs": [],
"source": [
"from naas_drivers import naas_chat_plugin"
]
},
{
"cell_type": "markdown",
"id": "e930da4d-39a2-424e-8808-77fd0a3829bf",
"metadata": {
"papermill": {},
"tags": []
},
"source": [
"### Setup variables\n",
"- `prompt`: The input string whose tokens will be counted.\n",
"- `encoding_name`: The name of the encoding used for tokenization (here `cl100k_base`)."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "6cf9bd9b-ce83-45a5-ab0f-15c978458b42",
"metadata": {
"papermill": {},
"tags": []
},
"outputs": [],
"source": [
"prompt = \"As a school teacher, you will help correct my text to ensure there are no mistakes. Please introduce yourself, then review and correct the text every time I write something to you.\"\n",
"encoding_name = 'cl100k_base'"
]
},
{
"cell_type": "markdown",
"id": "9070993a-eec9-4f78-a7f9-2e18c743a20d",
"metadata": {
"papermill": {},
"tags": []
},
"source": [
"## Model"
]
},
{
"cell_type": "markdown",
"id": "2d959298-e04a-459f-85d9-1b12bb5754c9",
"metadata": {
"papermill": {},
"tags": []
},
"source": [
"### Count number of tokens in prompt"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "ac0af791-9f6d-4da8-82d0-1defed0bc59f",
"metadata": {
"papermill": {},
"tags": []
},
"outputs": [],
"source": [
"num_tokens = naas_chat_plugin.num_tokens_from_string(\n",
"    prompt,\n",
"    encoding_name\n",
")"
]
},
{
"cell_type": "markdown",
"id": "81ce4521-731f-42cb-8ace-0e7611c750b3",
"metadata": {
"papermill": {},
"tags": []
},
"source": [
"## Output"
]
},
{
"cell_type": "markdown",
"id": "34b625a0-f39b-4c7e-82f5-cc58a11ec902",
"metadata": {
"papermill": {},
"tags": []
},
"source": [
"### Display result"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "39437f23-bcf8-40df-9538-32b3cac1c8fb",
"metadata": {
"papermill": {},
"tags": []
},
"outputs": [],
"source": [
"print(\"Tokens:\", num_tokens)"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "b3ef64bd-8c6b-4096-bc0b-eff3efc174d3",
"metadata": {},
"outputs": [],
"source": []
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.9.6"
},
"widgets": {
"application/vnd.jupyter.widget-state+json": {
"state": {},
"version_major": 2,
"version_minor": 0
}
}
},
"nbformat": 4,
"nbformat_minor": 5
}
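
For readers who want to see what the counting step might look like without the driver, here is a minimal sketch modelled on the tiktoken cookbook linked in the notebook's references. It assumes `naas_chat_plugin.num_tokens_from_string` behaves like the cookbook function of the same name; the driver's actual implementation is not shown in this commit. `fits_token_budget`, `max_tokens` and `reserved_for_reply` are hypothetical names introduced only for illustration.

```python
def num_tokens_from_string(string: str, encoding_name: str = "cl100k_base") -> int:
    """Return the number of tokens in `string` under the named tiktoken encoding.

    Assumption: this mirrors the OpenAI cookbook helper, not necessarily the
    Naas driver's own code.
    """
    # Imported lazily so the budget helper below works without tiktoken installed.
    import tiktoken  # third-party: pip install tiktoken

    encoding = tiktoken.get_encoding(encoding_name)
    return len(encoding.encode(string))


def fits_token_budget(num_tokens: int, max_tokens: int, reserved_for_reply: int = 0) -> bool:
    """Hypothetical helper: True if the prompt still leaves room for the reply."""
    if num_tokens < 0 or max_tokens < 0 or reserved_for_reply < 0:
        raise ValueError("token counts must be non-negative")
    return num_tokens + reserved_for_reply <= max_tokens
```

A possible use is to call `num_tokens_from_string(prompt, encoding_name)` with the notebook's variables and pass the result to `fits_token_budget` along with the context size of the target model, which this notebook does not specify.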